Converting your retention schedule into a model of Microsoft 365 retention labels

Rob Bath
Dec 15, 2023

Updated: Apr 22, 2024

Very often, my clients will present their retention schedule and ask me to help re-create it inside Microsoft 365. While in theory it might be possible to translate a retention schedule into a model of retention labels entirely, I’ve found that pretty much every organisation needs to accept that changes will need to be made.

These changes are often required simply to adapt to the technology. At other times these changes are introduced in order to make the resultant system simple to use and maintain. While the process is different for every organisation, this article highlights some of the common challenges that many face when translating their retention schedule into a model of retention labels.

Complex retention labels

Retention schedules vary significantly in complexity. While I’m occasionally surprised by an organisation with a really pragmatic retention schedule, I’ve certainly encountered schedules containing in excess of 1,000 rows. There is something to be said for having very specific guidelines for each type of information asset. However, I struggle to imagine that many organisations successfully adhere to a schedule that is overly complicated. There’s a reason why keeping things simple is a maxim that many aspire towards.

Certainly, while Microsoft 365 can cope with a very large number of retention labels (within Microsoft’s documented limits), I find that, unfortunately, users often can’t. If you present users with too many options, most will select something that feels ‘about right’ or, failing that, will pick the top label from the list. Few have the time to find the correct label for each record when there are too many possibilities to pick from.

One way we can mitigate this is to make sure that only relevant labels are available in each SharePoint Site / Microsoft 365 Group in our architecture. Rather than making all of our retention labels available in all of our sites, we can instead only associate each site with a relevant group of labels. This of course requires an understanding of the purpose of each site, so that a relevant group of labels can be chosen – something I feel is best done as part of a controlled provisioning process. My rule of thumb is to try to ensure that each SharePoint Site has a maximum of c. 8 retention labels available for staff to pick from – too many more and I feel the accuracy of their classification will start to be reduced.
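As a rough illustration of the provisioning check described above, the following Python sketch looks up the label group for a new site and refuses to continue if the group is larger than the rule of thumb. The purposes, label names and the `labels_for_site` helper are all hypothetical, and nothing here calls Microsoft 365; it only models the decision.

```python
# Rule of thumb from the article, not a Microsoft 365 limit.
MAX_LABELS_PER_SITE = 8

# Hypothetical label groups, keyed by the purpose recorded at provisioning.
LABEL_GROUPS = {
    "finance": ["Invoices", "Budgets", "Audit Reports"],
    "hr": ["Employee Records", "Recruitment", "Training Records"],
}


def labels_for_site(purpose, groups=LABEL_GROUPS, limit=MAX_LABELS_PER_SITE):
    """Return the labels to publish to a new site with the given purpose."""
    if purpose not in groups:
        raise KeyError(f"No label group is defined for purpose {purpose!r}")
    labels = groups[purpose]
    if len(labels) > limit:
        raise ValueError(
            f"{purpose!r} has {len(labels)} labels; the rule of thumb is {limit}"
        )
    return list(labels)


print(labels_for_site("finance"))
```

Failing loudly when a purpose has no group, or when a group has grown too large, keeps the decision inside the controlled provisioning process rather than leaving it to whoever creates the site.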

We can also help to mitigate this issue by reducing the need to apply labels manually. Microsoft provides a wealth of automated ways to ensure that your content receives an appropriate label, including Trainable Classifiers, Sensitive Information Types and keyword queries, but by far the most powerful approach is to set an appropriate default retention label on each library in your architecture (and if needed on some folders too). This approach means that contextual retention is applied by default to your files with no additional effort for users – in other words, staff only need to manually tag by exception. This approach typically results in the best coverage of accurate labels while keeping manual effort to a minimum.
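The idea of ‘tagging by exception’ can be sketched as a simple precedence rule. This is a simplified conceptual model in Python, assuming that a manually applied label takes priority over a folder default, which in turn takes priority over the library default; the `Item` class and `effective_label` function are hypothetical and do not represent any Microsoft API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Item:
    name: str
    manual_label: Optional[str] = None    # applied by a user, by exception
    folder_default: Optional[str] = None  # default on the containing folder, if any


def effective_label(item, library_default):
    """Most specific label wins: manual, then folder default, then library default."""
    return item.manual_label or item.folder_default or library_default


items = [
    Item("minutes.docx"),
    Item("contract.pdf", manual_label="Contracts"),
    Item("invoice.xlsx", folder_default="Invoices"),
]
for item in items:
    print(item.name, "->", effective_label(item, "Team Working Files"))
```

In this model only `contract.pdf` needed any action from a user; everything else inherited a label from where it was stored.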

To some extent, most organisations will need to simplify their retention schedule when creating a successful model of retention labels. This often means grouping functionally similar rows of the retention schedule together to reduce the number of labels a user needs to pick from. Inevitably, this introduces some amount of compromise – perhaps keeping certain types of record for longer than we might need to, or triggering the start of a retention period from the date a file was last modified, rather than from the end of the financial year. As with everything else in records management, this is all about assessing risk – you need to ask yourself whether consolidating a specific group of rows in your retention schedule into a single label is likely to lead to enhanced or reduced governance. Certainly, once they’ve considered the relative risks, many of the largest and most complex organisations I’ve worked with have opted for very pragmatic models of retention labels.
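To make the compromise visible, here is a minimal Python sketch of one possible consolidation rule: group schedule rows by function and give each resulting label the longest retention period in its group. The rows, the grouping key and the `consolidate` function are hypothetical, and choosing the longest period is an assumption made for illustration, not a recommendation for every schedule.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ScheduleRow:
    function: str
    record_type: str
    retention_years: int


def consolidate(rows):
    """Build one label per function, keeping content for the longest period in the group."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row.function].append(row)

    labels = {}
    for function, members in grouped.items():
        longest = max(row.retention_years for row in members)
        # Record types that will now be kept longer than the schedule asks for.
        over_retention = {
            row.record_type: longest - row.retention_years
            for row in members
            if row.retention_years < longest
        }
        labels[function] = {
            "retention_years": longest,
            "over_retention_years": over_retention,
        }
    return labels


rows = [
    ScheduleRow("Finance", "Invoices", 6),
    ScheduleRow("Finance", "Expense claims", 3),
    ScheduleRow("HR", "Training records", 2),
]
print(consolidate(rows))
```

The `over_retention_years` entry is the part worth reviewing with the business, since it lists exactly what is being kept for longer and by how much.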

Challenging triggers

Microsoft provides four different triggers for starting retention periods: ‘from date created’, ‘from date modified’, ‘from date labelled’, and ‘event-based’ (see https://www.robbath.co.uk/post/triggering-retention-in-microsoft-365 for more information). However, in the real world, retention schedules simply don’t restrict themselves to these four triggers. In pretty much every retention schedule I’ve appraised, there are triggers such as ‘From the date a member of staff leaves’ or ‘When a property is sold’ that simply don’t have direct technical equivalents.

Each time one of these triggers needs to be translated into a retention label, the most appropriate of the four technical triggers provided by Microsoft needs to be selected. However, deciding which of these technical triggers works best for each label is more of an art than a science.
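One way to keep these judgement calls explicit is to record each decision in a lookup. The Python sketch below lists the four technical triggers and maps a few schedule triggers onto them. The mapping is entirely hypothetical (your own decisions may well differ), and unknown triggers raise an error so that nobody silently falls back to a default.

```python
from enum import Enum


class Trigger(Enum):
    CREATED = "from date created"
    MODIFIED = "from date modified"
    LABELLED = "from date labelled"
    EVENT = "event-based"


# Hypothetical decisions, recorded once and reviewed by the records team.
DECISIONS = {
    "from the date a member of staff leaves": Trigger.EVENT,
    "when a property is sold": Trigger.EVENT,
    "end of financial year": Trigger.MODIFIED,  # a deliberate compromise
}


def technical_trigger(schedule_trigger):
    """Return the agreed technical trigger for a trigger named in the schedule."""
    key = schedule_trigger.strip().lower()
    if key not in DECISIONS:
        raise LookupError(f"No decision recorded yet for {schedule_trigger!r}")
    return DECISIONS[key]


print(technical_trigger("When a property is sold").value)
```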

Let’s take ‘Until Superseded’ as an example. This is a trigger that I’ve seen used in dozens of retention schedules, typically when one file is periodically replaced by another. There are several different ways that this trigger could be realised in a model of retention labels, each of which has advantages and disadvantages to consider. Let’s take a closer look at some of these potential approaches for creating retention labels where the retention schedule asks for an ‘Until Superseded’ trigger:

Event-based

When a file is superseded, an ‘event’ could be triggered that starts the retention period for the now superseded file. However, there are various issues with this approach. Not only is this architecture quite complex to implement, it also isn’t usually all that practical to apply as part of business-as-usual processes.

Just to be clear, I love event-based retention and have used it to great effect in many architectures. It’s superb at handling scenarios such as triggering retention when a project closes, when an employee leaves, or at the end of the current year. However, I’m not convinced that it’s all that useful in the ‘Until Superseded’ scenario.

The reason I feel that using an event-based label for ‘Until Superseded’ files might be complex to implement is that event-based retention needs files to have a unique ID. For example, if you are closing a project, then all files in that project need to be tagged with an ID that is unique to that project before the event can be triggered. This of course means that if we want to trigger an event when a file is superseded by another, then we are likely to need a unique ID for each such file. While this is certainly possible to implement in Microsoft 365, I personally don’t think it’s going to prove to be a viable process to go through every time a file is superseded.

The other drawback of this approach is that the retention doesn’t apply (and therefore the records are not protected) until the event is triggered. This means that content could easily be deleted before it is superseded, which might not be appropriate depending on the specific scenario you face.
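The mechanics behind both drawbacks can be shown with a small conceptual model in Python. It is not Microsoft’s implementation: the `LabelledFile` class, the `asset_id` field and the `trigger_event` function are hypothetical names used only to show that an event starts retention for files sharing an ID, and that nothing is retained before then.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class LabelledFile:
    name: str
    asset_id: str
    retention_start: Optional[date] = None  # stays None until an event fires


def trigger_event(files, asset_id, event_date):
    """Start retention for every file tagged with asset_id; return how many matched."""
    matched = 0
    for f in files:
        if f.asset_id == asset_id and f.retention_start is None:
            f.retention_start = event_date
            matched += 1
    return matched


files = [
    LabelledFile("plan.docx", "PROJECT-001"),
    LabelledFile("budget.xlsx", "PROJECT-001"),
    LabelledFile("policy-v1.docx", "POLICY-017"),
]

# Closing a project: one event covers every file sharing the project ID.
print(trigger_event(files, "PROJECT-001", date(2024, 4, 22)))

# 'Until Superseded': each file needs its own ID and its own event.
print(trigger_event(files, "POLICY-017", date(2024, 4, 22)))
```

The project event covers many files with one ID, whereas the superseded policy needed an ID and an event all of its own, which is where the overhead comes from.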

Quite simply, while I’ve used event-based retention labels successfully in many architectures, I don’t think that it’s really feasible to use them in ‘Until Superseded’ scenarios.

Manual

This approach sees a user manually apply a label to the file when it has been superseded. This label can be configured with the ‘from date labelled’ trigger, meaning that the retention period starts from the date the label is applied.

There are two main benefits of this approach. Firstly, it is much easier for a user to apply a label to a file than it is to trigger an event every time a file is superseded. Secondly, a different retention label can be applied to the file beforehand, preventing it from being deleted before it is superseded.

However, this approach relies on a user remembering to manually change the retention label. In my experience, this sort of reliance on manual classification tends to be inconsistent at best and as such, I question whether this approach will be successful in most scenarios.

Automated

Very often automation is the solution. For example, when a project is closed, or when a financial year ends, you could easily have an automated process that runs to apply appropriate retention to the files. This approach is something I’ve successfully rolled out in multiple organisations – and it really helps to streamline the way records are managed.

The downside is that it requires development effort. This investment makes sense when the automation is applied to key processes in your organisation such as managing the lifecycle of the employee record, or procurement exercises. However, this sort of investment is far harder to justify in the ‘Until Superseded’ scenario, where often we are talking about a handful of records with infrequent triggers. To me it doesn’t feel like automation is justified in this scenario.

Adapt retention duration

The problem with many of the previous approaches is that they involve manual steps when the file is superseded. Another option is to consider whether you can be really pragmatic with the way you translate your retention schedule into retention labels. You might, for example, consider whether you could simply configure your ‘Until Superseded’ retention label to keep the content for longer than you might otherwise wish to. For instance, if you know that a file is kept for up to 10 years before it is superseded, but could be superseded at any time before this date, then you might consider applying a label that ends the retention period 10 years after the file was first created.

This approach will naturally see some superseded files being kept for longer than the retention schedule requires, but has the significant advantage of not requiring any manual steps. Whether this is viable or not will depend upon the risk presented by keeping this type of information asset for longer than the period in the retention schedule – and as such is a decision that you might find acceptable in some situations but not others.
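The trade-off is easy to quantify. The following Python sketch, using hypothetical dates and a hypothetical `over_retention_days` helper, works out when a ‘10 years from creation’ label would expire and how long a file superseded early would be kept beyond the point of supersession.

```python
from datetime import date


def add_years(d, years):
    """Return the same calendar day `years` later (29 Feb falls back to 28 Feb)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def over_retention_days(created, superseded, max_years=10):
    """Days a superseded file is kept beyond the date it was superseded."""
    expiry = add_years(created, max_years)
    return (expiry - superseded).days


created = date(2020, 1, 1)
superseded = date(2024, 1, 1)
print(add_years(created, 10))                    # 2030-01-01
print(over_retention_days(created, superseded))  # 2192
```

In this example a file superseded after four years is kept for roughly six more, which is the figure to weigh against the risk of holding that type of information.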

Record versioning [Update 8th Jan 2024]

Following the initial publication of this blog post, I was contacted by Rob Aaldijk from Xillio, who rightly pointed out that I should have included record versioning as another approach to consider for the ‘Until Superseded’ scenario.

If you’re not familiar with record versioning, this capability only applies to record labels (and not to standard retention labels). When record versioning is enabled, your users can lock and unlock a record. When it is unlocked the record can be modified, but whenever it is locked it becomes immutable. Unlocking a record causes a copy of the latest version of the record to be stored in the Preservation Hold Library. This means that with record versioning enabled you would store multiple versions of a single record, with each of these versions being subject to their own retention.

In the ‘Until Superseded’ scenario, record versioning could allow you to maintain a single file and unlock it whenever the file needs to be superseded. This is certainly very intuitive for users, who simply need to unlock and modify a record whenever it needs to be updated.

Personally, I’m not all that comfortable with this approach, as it requires you to use the Preservation Hold Library. As I’ve mentioned in previous blog posts, the Preservation Hold Library works superbly for legal holds, but I don’t feel it’s acceptable for records management. My concern stems from the fact that once a record enters the Preservation Hold Library you cannot remove it; the record will reside there for the remainder of its retention period. The problem is that if you ever need to dispose of a record (for example if you are required to do so by GDPR) and that record is in the Preservation Hold Library, then you simply have no way of deleting it. In other words, you might be forced to keep content that you’re legally obliged to dispose of.

Regardless, Rob Aaldijk is right: this is certainly another option that you should consider for the ‘Until Superseded’ scenario.

So, what’s the solution for ‘Until Superseded’ records? Well, the answer is that it depends on your specific situation. If for example the records are being managed by a team who understand the importance of maintaining high quality information governance (something I often encounter in highly regulated sectors such as legal, pharmaceutical and finance), then the manual approach is often the most appropriate. If this label needs to be applied to larger volumes of content, then perhaps automation might work best for you. If you are willing to compromise on the accuracy of your retention schedule, adapting the retention duration could be the ideal solution.

In other words, in situations like ‘Until Superseded’ you really need to evaluate the available options and consider which technical approach is likely to provide you with the best outcome.

Too much to review

Another common issue that I frequently see overlooked when translating a retention schedule relates to the practicalities of running the future system as part of business as usual. One of the key aspects of any good information architecture is to evaluate whether the design will be practical to use. When defining an architecture in Microsoft 365, I often find myself considering subjects including:

the likely future rate of content growth
ongoing storage costs (especially if cheaper (cold) storage is required)
the impact on staff time

This last point is especially important when considering disposition reviews. Often driven by a fear of deletion, some organisations tend to err on the side of caution and configure most of their labels to result in a disposition review. This sounds great on paper, but I’ve often found myself pointing out that in a few years’ time the reviewers could be inundated with tens or even hundreds of thousands of records that need review.

There are various options that can help mitigate this, including delegating reviews to different groups of people across your organisation, and considering whether to accept the risk of automated deletion in more situations than your retention schedule might ideally like. Larger organisations should also consider technical limitations: for example, there is a maximum number of reviewers in your tenant and a maximum number of files that can be subject to review at any given time (see https://learn.microsoft.com/en-us/purview/retention-limits for more information).

You need to be especially careful around these technical and capacity limitations when you plan to migrate content into your tenant. If you don’t plan in advance, millions of legacy files could be migrated into Microsoft 365, only to all be subject to your retention rules and bubble up for review or disposal at the same time. In other words, an architecture that might be well designed to support your business-as-usual activities, might overwhelm you if an exceptional volume of content is migrated into the platform (without sufficient planning).
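A back-of-the-envelope forecast can expose this kind of spike before it happens. The Python sketch below totals the files that would reach disposition review in each year and flags the years that exceed what reviewers can handle. Every figure, including the yearly capacity, is hypothetical, and the model assumes each batch’s retention period starts in a single known year.

```python
from collections import Counter


def reviews_due_by_year(batches):
    """Total files reaching review per year.

    batches: iterable of (start_year, retention_years, file_count) tuples
    for content whose label ends in a disposition review.
    """
    due = Counter()
    for start_year, retention_years, file_count in batches:
        due[start_year + retention_years] += file_count
    return dict(sorted(due.items()))


def overloaded_years(due, yearly_capacity):
    """Years in which more files fall due than reviewers can process."""
    return [year for year, count in due.items() if count > yearly_capacity]


batches = [
    (2021, 7, 2_000),      # business as usual
    (2022, 7, 2_500),
    (2023, 7, 3_000),
    (2024, 7, 1_000_000),  # legacy content migrated and labelled in one go
]
due = reviews_due_by_year(batches)
print(due)
print(overloaded_years(due, yearly_capacity=10_000))
```

Staggering the retention start dates for migrated content, or routing part of it to automated deletion, would show up here as a flatter forecast.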

The point I’m trying to make here is that it’s critical that the architecture you define is feasible to operate. It’s worth spending time upfront considering the likely usage of the tenant, to ensure that the way you define your Sites, Groups and Labels will be easy to support and maintain in the future.

Summary of quick tips

Where possible consolidate your retention schedule into a concise model of retention labels
Ensure that your retention labels have functional names that staff find easy to understand
Make sure your triggers are practical to apply
Avoid having too many labels resulting in disposition review
Plan which groups of staff will be undertaking disposition reviews
Identify whether the purpose of each label is to keep or delete
Plan to set the most appropriate default retention label on every library in your tenant (including libraries underpinning Microsoft Teams workspaces)
Ideally aim to have no more than 8 retention labels available in each Site/Group
Carefully plan your architecture to ensure that technical limitations aren’t reached and that user-capacity demands are viable
Be especially careful to plan the impact a migration might have on your retention architecture
Above all – you need to be pragmatic!

Overall, the best tip I can give you is to make sure you have a comprehensive knowledge of how Microsoft’s retention capabilities work. This will empower you to make decisions around the most appropriate configuration of the platform to meet your needs.

If you would like best practice guidance, training or assistance with designing or implementing an architecture in Microsoft 365 that best meets your records management needs, why not get in touch to find out how I can help?
