
British Library

Sophisticated internet technologies from System Associates power www.collectbritain.co.uk


Following a substantial grant from the New Opportunities Fund, the British Library embarked on an ambitious online programme called Collect Britain, involving the digitisation of more than 90,000 images and sounds relating to Britain's heritage. The project is part of the British Library's ongoing strategy of widening access to its resources, making them available as a worldwide public resource to lifelong learners. A consortium of information organisations worked with the British Library, contributing to the design and content of the site, including Edinburgh Data and Information Access at the University of Edinburgh (EDINA), the University of Portsmouth and the Ordnance Survey.

The result is an eclectic showcase of 19 themed collections, depicting diverse areas of British life as far back as 800AD. They include drawings and illustrations, paintings, stamps, regional dialects, wildlife sounds, songs from the Victorian Music Hall, maps and plans, photographs and more than 50,000 pages of newspapers from the 19th century.

Digitising and describing 90,000 items was a major task involving a team of 20 in-house professionals, including curators, project managers, photographers, metadata creators and editorial consultants. The whole project was supported by digital media software and a content management system from the internet technology firm System Associates. This technology allowed the effective labelling, description and management of the vast quantity of information that was to be made available online.

System Associates was one of several organisations contracted to develop the site and was responsible for provision, integration and implementation of the content management software, g-Serve, as well as a workflow-driven digital media library system, g-Media. It was also responsible for ensuring compliance with a number of data and e-Government standards and provided training and support for the system administrators.

Going digital

Timescales to implement the technology were short and priority was given to the digitisation and labelling of material. Russell Watkins, Web Manager for the Collect Britain project, comments: "The digitisation of material was the area of the project that would require the greatest time and so we worked with System Associates to get a usable system up and running in less than two months. This allowed our in-house team to make a start on digitisation, while the content management system and additional functionality were being developed."

g-Media, the digital media library (DML) system, is workflow-driven, so that much of the process is automated. It allows the British Library team to govern the digitisation of an image from its creation and classification through to the application of metadata and quality assurance. Metadata tagging is intuitive, so that users of the system do not need to key metadata manually, although this option is still available. The DML system is also scalable and flexible: new data fields can be created, edited, duplicated and deleted as the collection continues to evolve.

A built-in user registration facility ensures that no content can be posted to the website before a series of pre-determined checks has been carried out by authorised users.
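The workflow described above can be pictured as a simple sequence of stages with a permission check before publication. The following is only a minimal sketch under stated assumptions, not g-Media's actual design: the stage names, the Item class and the advance function are hypothetical and were chosen purely for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class Stage(Enum):
    # Hypothetical stages, loosely following the steps named in the text.
    CREATED = 1
    CLASSIFIED = 2
    METADATA_APPLIED = 3
    QUALITY_ASSURED = 4
    PUBLISHED = 5


@dataclass
class Item:
    title: str
    stage: Stage = Stage.CREATED
    history: list = field(default_factory=list)


def advance(item: Item, user: str, authorised: set) -> Item:
    """Move an item to its next stage; only authorised users may publish."""
    if item.stage is Stage.PUBLISHED:
        raise ValueError("item is already published")
    next_stage = Stage(item.stage.value + 1)
    if next_stage is Stage.PUBLISHED and user not in authorised:
        raise PermissionError(f"{user} may not publish")
    # Keep an audit trail of who moved the item between which stages.
    item.history.append((item.stage.name, next_stage.name, user))
    item.stage = next_stage
    return item
```

In this sketch any registered user can move an item through classification, metadata and quality assurance, but the final step to the website fails unless the user is in the authorised set.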

Managing content

While digitisation began, System Associates engineered a tailor-made content management system (CMS), based on its core content management software, g-Serve. g-Serve is designed specifically for the public sector, with built-in compliance with e-GIF, e-GMS, W3C and various metadata standards, including Dublin Core. Russell Watkins comments: "The NOF-digitise programme also had its own set of guidelines that we needed to adhere to and System Associates quickly engineered these into the content management system." This automatic compliance means that content conforms to all the necessary standards, regardless of the compliance knowledge and expertise of the system users. The system is also easily updated as new regulations and compliance issues come into force.

Content is administered through a browser-based interface and g-Serve supports most file formats.

The CMS was configured to integrate several unique features, including an automated conversion system, the first in the world to be deployed using Adobe's Graphics Server software. The software allows original images to be used in a variety of media and can automatically generate variations based on colour, size, resolution and file type, so that visitors to the site can extract information in their most suitable format.
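As a rough illustration of how variations might be derived from a single master image, the sketch below works out the pixel dimensions for a set of smaller versions while preserving the aspect ratio. It does not use or represent Adobe's Graphics Server software; the function names, the size presets and the format list are all assumptions made for this example.

```python
def fit_within(width: int, height: int, max_side: int) -> tuple:
    """Scale (width, height) so the longest side is at most max_side.

    The aspect ratio is preserved and images are never enlarged.
    """
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))


def derivative_specs(width, height,
                     max_sides=(150, 600, 1200),   # hypothetical presets
                     formats=("jpeg", "png")):     # hypothetical formats
    """List one derivative description per size preset and file format."""
    return [
        {"format": fmt, "size": fit_within(width, height, side)}
        for side in max_sides
        for fmt in formats
    ]


if __name__ == "__main__":
    for spec in derivative_specs(4000, 3000):
        print(spec)
```

A real conversion service would also handle colour and resolution changes. This sketch covers only the sizing step.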

A pan-zoom facility is also extensively featured on the site. "We wanted users of the Collect Britain site to be able to scrutinise the images in as much detail as possible," comments Russell Watkins. "System Associates implemented an innovative pan-zoom facility which has allowed us to achieve this. By zooming into an image of an 18th century map of London, for example, users can see small areas in great detail, in many cases going from an overview of the whole map to seeing the actual road names and landmarks on individual streets."

Taking the strain

With up to 20 personnel involved in the digitisation of material, any system implemented had to be able to support them simultaneously, with minimal loss of performance, regardless of the number of items being processed. System Associates provided stress testing analysis demonstrating that the system could process 2,000 objects of 40MB each over a six-hour period and could handle 1.5 million items. These stress test results surpassed British Library requirements and ensure a high level of future-proofing for the project.
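To put the stress test figures in perspective, the short calculation below converts them into sustained rates. It uses only the numbers quoted above (2,000 objects, 40MB each, six hours) and assumes the load was spread evenly over the period, which the text does not state. The function name is hypothetical.

```python
def throughput(objects: int, size_mb: float, hours: float) -> dict:
    """Convert a batch stress-test result into average sustained rates."""
    total_mb = objects * size_mb
    seconds = hours * 3600
    return {
        "total_mb": total_mb,
        "mb_per_second": total_mb / seconds,
        "objects_per_minute": objects / (hours * 60),
        "seconds_per_object": seconds / objects,
    }


if __name__ == "__main__":
    rates = throughput(objects=2000, size_mb=40, hours=6)
    for name, value in rates.items():
        print(f"{name}: {value:.2f}")
```

On these assumptions the test amounts to 80,000MB in total, or roughly 3.7MB per second, which is one 40MB object about every 10.8 seconds.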

The hardware behind the Collect Britain site currently manages a 1.5TB store of data and is hosting several thousand sound files and more than 50,000 searchable pages of newspapers from the 19th century.

Navigation and touring

Navigating this many items requires a highly effective search mechanism to retrieve relevant information. As part of the content management system, System Associates provided an advanced search engine that utilises GIS technology and the Ordnance Survey Land Ranger database. A user can type in a place name or postcode and search the site for any information relating to the area, including references to that place in newspapers, maps, plans and illustrations.

Prior to the implementation of the project, user profiles were developed. Primarily, users of the site are lifelong learners with a strong interest in British history or in the history of specialist areas such as gardening or stamp collecting. As well as using specific search options, visitors can browse the site by subject area via a 'Virtual Exhibition' or a 'Themed Tour'. A Virtual Exhibition allows visitors with a specific interest to view material relevant to their learning. Alternatively, a Themed Tour offers a lesson approach to online learning, guiding visitors sequentially through a theme.

Timescales

There have been three phases in the development of the Collect Britain website. Phase I, a pre-release version of the site, went live in May 2003. Summer 2004 saw Phase II, a considerably improved version of the site with added content and functionality.

Spring 2005 sees the final phase as the project reaches its completion, with further new collections, virtual exhibitions and themed tours as well as improved search functionality. This phase is currently being supported with a national public awareness campaign to encourage usage of the site. Already, the dialects collection has been launched to the public and has seen a significant increase in traffic since the campaign began. Adrian Arthur, British Library Head of Web Services, comments: "Typically, before the awareness campaign, we would see 12,000-20,000 visitors a month. Since the awareness campaign, this has increased to approximately 25,000 a month. We anticipate this will continue to grow as we raise awareness of other collections."

While the project is now considered complete, the Collect Britain website will continue to evolve within the Library's wider digitisation strategy. Plans for the next few months include a new virtual exhibition based on material from the Caribbean and an innovative map search facility.

View the Collect Britain website