Across all sectors, organizations are steadily publishing more and more content online.
This trend has accelerated since the pandemic, when much of the world moved online to work around the physical limitations imposed by lockdowns.
Websites are used to sell products and services to clients, to publish and share sales and marketing materials, and – from small businesses to state governments – as a primary channel of communication with members of the public.
The proliferation of digital content presents a new set of challenges. Financial firms, for instance, must fulfill a range of compliance obligations to use digital channels (including websites) to advertise to clients. Elsewhere, organizations may need to keep records of their website content for dispute resolution purposes, and others might wish to preserve website pages simply for their cultural or historical significance, as a matter of public record, or as a point of reference for future marketing campaigns.
Website archiving is the only way of preserving website content in a form that lets it be revisited as it was at a particular point in time. It’s the only means of creating and maintaining a stable, time-stamped, verifiably authentic, and independent version of web content. As the archives are independent, they are separate from the original website architecture and will only include the elements that were live at the time of archive, creating an iteration as close to its original form as possible.
There are many reasons an organization should archive its website, but in all cases, they must ensure their archives are complete, secure and legally admissible.
In this guide, we look at some of the specific challenges around website archiving in three different sectors – financial services, the public sector and brands.
The Role of Website Archiving
Website archiving allows companies in the financial, public and retail sectors to store immutable records of their web pages. This helps ensure the following:
Compliance
Regulated companies and firms worldwide must record and retain all electronic communications under MiFID II (EU), FCA (UK), SEC (US), ASIC (AU) and FINRA (US) rules. MiFID II, for instance, states that the recorded electronic communications must be:
- Complete – An organization must be aware of all types of electronic communications that are used and by whom. In addition to this, they must have a system and processes in place designed to capture and retain all records of those communications.
- Accurate – An organization must be fully confident in the content and metadata of its recorded electronic communications, which should demonstrate the exact dates and times at which each communication took place.
- High Quality – An organization should be able to reproduce records of electronic communications in as close to their “original form” as possible.
The leash is tightening globally. As of November 2022, the SEC’s Marketing Rule expanded the definition of what constitutes an ‘advertisement’ to include website content, which must now be captured and archived in its entirety. Meanwhile, the Australian Securities and Investments Commission (ASIC) has begun requesting records of historical website price promises from insurance firms, and administering fines to those that are unable to provide them.
Legal admissibility
Companies could be required to provide authenticated evidence of electronic data in court. The records must demonstrate both that the data has been stored in an unalterable format and when it was archived. These requirements are covered in the following rules and regulations:
- Federal Rules of Evidence (rule 901) – the requirement of authenticating or identifying an item of evidence.
- The Code of Practice on Evidential Weight and Legal Admissibility of Electronic Information (BS 10008:2014) – ensuring the authenticity and integrity of electronic information.
- SEC Rule 17a-4 – which requires firms to archive electronic business communications in a non-rewritable, non-erasable (WORM) format.
- eDiscovery Requests – archiving website data for dispute resolution and eDiscovery purposes can help ensure that the records are non-refutable and are a true reproduction of the content at that time.
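To make the idea of an unalterable, time-stamped record more concrete, here is a minimal Python sketch of one common building block: storing a cryptographic digest and a capture time alongside each archived page. The function names, the record fields and the example URL are hypothetical, and the approach is an illustration rather than anything prescribed by the rules above. A digest only reveals that content has changed; it does not, on its own, provide WORM storage or legal admissibility.

```python
import hashlib
from datetime import datetime, timezone


def make_capture_record(url: str, body: bytes) -> dict:
    """Build a capture record with a SHA-256 digest and a UTC timestamp."""
    return {
        "url": url,
        "captured_at": datetime.now(timezone.utc).isoformat(),
        "sha256": hashlib.sha256(body).hexdigest(),
    }


def is_unaltered(record: dict, body: bytes) -> bool:
    """Return True if the stored bytes still match the recorded digest."""
    return hashlib.sha256(body).hexdigest() == record["sha256"]


# Hypothetical page content, used only for illustration.
page = b"<html><body>Rate: 4.5% APR</body></html>"
record = make_capture_record("https://example.com/rates", page)

print(is_unaltered(record, page))                           # unchanged bytes
print(is_unaltered(record, page.replace(b"4.5", b"3.9")))   # edited bytes
```

Any edit to the stored bytes, however small, produces a different digest, which is why the second check fails.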
Protection of IP and brand assets
Brands have a clear incentive to keep a long-term record of their activity to inform future campaigns. However, as more business activity moves online, they are continuously creating and publishing large amounts of digital content at speed, which can be difficult to keep track of. Website archiving can be carried out on a regular basis and, with unlimited cloud storage and the ability to archive large data sets, it can help ensure that nothing of value is lost.
Preservation of records of cultural and historical significance
Public-sector organizations and archivists may need to preserve culturally important website content as part of the historical public record, with instant access when it is needed. Website archiving is well suited to preserving these large amounts of data and storing them in an unalterable WORM format.
Who Benefits From Website Archiving?
Financial Services
The financial services industry is under constant pressure to adopt new online channels if it is to keep up with the evolving digital landscape. However, its use of these channels needs to be balanced against stringent compliance regulation from the likes of the SEC, ASIC, FINRA and more.
Any solution must be able to demonstrate compliance with such regulations as well as with any potential data sovereignty and GDPR requirements.
Public Sector
Numerous national archives, libraries, governments, and universities now archive website data to preserve all records of cultural and historical significance. This is mostly driven by legislation such as the UK Public Records Act 1958 and, more recently, the Freedom of Information Act 2000.
As the public sector undertakes more activity online, organizations are looking for ways to evolve their website archiving provisions in order to take advantage of new technologies such as:
- The cloud – To allow for efficient and flexible storage of the large data sets.
- Indexing and search – To make the data useful to researchers, civil servants, students and members of the public (including public-facing portals such as The UK National Archives).
- The WARC file format – To move from the traditional ARC file format to the ISO standard WARC file format, which can help to store born-digital or digitized materials.
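For readers unfamiliar with WARC, a record is essentially a block of named header fields followed by the captured content. The Python sketch below serializes one simplified "resource" record to show that layout. The function name and example URL are hypothetical, and the record is deliberately minimal: production archives generally rely on established WARC tooling and include further fields, such as payload digests, that are omitted here.

```python
import uuid
from datetime import datetime, timezone


def build_warc_resource_record(target_uri: str, payload: bytes,
                               content_type: str = "text/html") -> bytes:
    """Serialize one simplified WARC 'resource' record as bytes."""
    warc_date = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    headers = [
        "WARC/1.0",                                    # version line
        "WARC-Type: resource",
        f"WARC-Record-ID: <urn:uuid:{uuid.uuid4()}>",  # unique per record
        f"WARC-Date: {warc_date}",                     # capture time in UTC
        f"WARC-Target-URI: {target_uri}",
        f"Content-Type: {content_type}",
        f"Content-Length: {len(payload)}",             # size of the block in bytes
    ]
    # Header lines are CRLF-separated; a blank line precedes the content block.
    head = "\r\n".join(headers).encode("utf-8") + b"\r\n\r\n"
    # Each record ends with two CRLF pairs after the content block.
    return head + payload + b"\r\n\r\n"


record_bytes = build_warc_resource_record("https://example.com/", b"<html></html>")
print(record_bytes.decode("utf-8"))
```

Because the capture date, source URL and content travel together in one record, the archive stays independent of the live website it was taken from.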
Brands
FMCG brands are creating more and more online content in addition to traditional brand assets. This content can easily be altered, corrupted or lost without planning and foresight.
Keeping a record of online brand activity and customer communications can help…
- Inform future brand direction (through monitoring of performance)
- Inspire future campaigns
- Ensure legally admissible records of all communications are kept, for cases of dispute resolution.
How Do You Archive a Website?
There are several ways to archive your website content. Free online tools like the Wayback Machine are an option, but require users to manually save every page individually. This simply isn’t feasible for most firms given the frequency with which captures must take place to satisfy regulators, namely every time a change is made.
While some businesses rely on Content Management System (CMS) backups for record-keeping purposes, there are some major differences between a backup and an archive.
- Digital signatures and metadata: Most importantly, data taken from a CMS backup won’t have a digital signature, and therefore won’t be authenticatable, or admissible in court. Further, CMS backups don’t allow legal teams to easily export a record with all of its crucial metadata.
- Full-text search: CMS backups will not provide a full-text search feature. Auditors can request information urgently and at a moment’s notice, and without search the sheer volume of captured data can make it difficult to locate a specific record quickly.
- Compliant Data Storage: For regulated industries with specific recordkeeping rules (such as the public sector and financial services), a CMS backup does not meet requirements.
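The full-text search point is easier to appreciate with a small example. The Python sketch below builds a basic inverted index over a handful of captures and returns the capture dates on which a given phrase appeared. The sample captures, URLs and function names are all hypothetical, and a real archiving service would use a dedicated search engine rather than an in-memory dictionary.

```python
import re
from collections import defaultdict

# Hypothetical captures: (url, capture timestamp, extracted page text).
captures = [
    ("https://example.com/rates", "2023-01-10T09:00:00Z", "Fixed rate of 4.5% APR on savings"),
    ("https://example.com/rates", "2023-03-02T09:00:00Z", "Fixed rate of 3.9% APR on savings"),
    ("https://example.com/about", "2023-03-02T09:00:00Z", "About our firm and our history"),
]

TOKEN = re.compile(r"[a-z0-9.%]+")


def build_index(items):
    """Map each lowercase token to the positions of the captures containing it."""
    index = defaultdict(set)
    for position, (_url, _timestamp, text) in enumerate(items):
        for word in TOKEN.findall(text.lower()):
            index[word].add(position)
    return index


def search(index, items, query):
    """Return (url, timestamp) for captures containing every token in the query."""
    words = TOKEN.findall(query.lower())
    if not words:
        return []
    hits = set.intersection(*(index.get(word, set()) for word in words))
    return [items[i][:2] for i in sorted(hits)]


index = build_index(captures)
print(search(index, captures, "rate 3.9%"))
```

Here the query matches only the March capture of the rates page, which is the kind of answer an auditor or legal team typically needs: what the site said, and when.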
Alternatively, an automated website archiving service allows businesses to keep a complete record of their website content, while relieving the manual burden and ensuring legal admissibility.
Weighing Your Options
As the digital landscape becomes more expansive, the level of regulation is following suit. Organizations like the SEC and ASIC have recently made significant alterations to their rules to reflect our increasing reliance on online platforms, and how easily valuable information can be lost, given the swathes of it we encounter daily.
Website archiving fulfills so many functions for so many different types of organization that it has become a fundamental requirement for a growing proportion of modern businesses. Whether satisfying regulators, retaining your brand’s accumulation of digital output, or preserving information of cultural and historical significance, it applies to industries from the public sector to financial services, retail brands, and many in between.
As archiving demands have increased, certain features have become indispensable, and these are almost always provided by third-party compliance vendors. Automated website capture allows companies to crawl their entire site at regular, pre-set intervals and capture an accurate version of the site as it existed at that point, each time. Full-text search further reduces the manual burden to a manageable level, should a specific page be requested for litigation purposes. Finally, if this data is not stored in a legally admissible format, the entire process is worthless, from a compliance perspective at least.
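As a rough illustration of automated capture, the Python sketch below runs a single crawl pass over a list of URLs and stores a new time-stamped capture whenever a page’s content differs from the last stored version. Everything here is an assumption made for the example: the function names, the in-memory "site" standing in for real HTTP requests, and the choice to skip unchanged pages, which is a storage optimisation rather than something the text above requires. A vendor’s crawler would also handle scheduling, page rendering and compliant storage.

```python
import hashlib
from datetime import datetime, timezone
from typing import Callable, Dict, List


def capture_if_changed(urls: List[str],
                       fetch: Callable[[str], bytes],
                       last_digest: Dict[str, str],
                       archive: List[dict]) -> int:
    """Run one crawl pass; store a capture only when a page's content changed."""
    stored = 0
    for url in urls:
        body = fetch(url)
        digest = hashlib.sha256(body).hexdigest()
        if last_digest.get(url) != digest:
            archive.append({
                "url": url,
                "captured_at": datetime.now(timezone.utc).isoformat(),
                "sha256": digest,
                "body": body,
            })
            last_digest[url] = digest
            stored += 1
    return stored


# Stand-in for a real HTTP fetch, so the sketch runs without network access.
site = {"https://example.com/": b"<h1>Welcome</h1>"}
digests: Dict[str, str] = {}
archive: List[dict] = []


def fetch(url: str) -> bytes:
    return site[url]


print(capture_if_changed(list(site), fetch, digests, archive))  # first pass
print(capture_if_changed(list(site), fetch, digests, archive))  # nothing changed
site["https://example.com/"] = b"<h1>New offer</h1>"
print(capture_if_changed(list(site), fetch, digests, archive))  # after an edit
```

Running a pass like this on a fixed schedule gives a record of every version of a page that was live at a crawl time, without anyone needing to remember to save it manually.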
Before starting the archiving process, it is important to understand the objective behind it, and whether your team has the tools to handle it themselves. Only then can you begin to work out the correct approach to take from a resource perspective, and who is equipped to help you achieve those goals. And if your objectives are legal obligations, you’d be advised to err on the side of caution in an increasingly brutal regulatory climate.
Harriet Christie, Chief Operating Officer – Harriet graduated from the University of Sheffield in 2010, with a BA in Management Accounting, Entrepreneurship, Business Law, BSR, HR. She entered the Tourism space, starting as an Accounts Executive at LateRooms.com, and earning the title of Global Accounts Manager within 3 years. She occupied this role for a further 5 years as the business continued to evolve and flourish, before taking up her role as a Key Account Manager with MirrorWeb, a communications archiving solution based in Manchester.
Harriet was appointed Chief Operating Officer in 2020. Since then, she has helped oversee the evolution of the MirrorWeb product and service offering, as well as the business’ impressive growth since she took on the role.
Website archiving stock image by Postmodern Studio/Shutterstock