SAA 2015: Session 701, But Where Is It? Access Tools for Born-Digital Records

In advance of the 2015 Annual Meeting, we invited SNAP members to contribute summaries of panels, roundtable and section meetings, forums, and pop-up sessions. Summaries represent the opinions of their individual authors; they are not necessarily endorsed by SNAP, members of the SNAP Steering Committee, or SAA.

Session 701 looked at how five different repositories – one federal, four college and university – provided patrons with access to born-digital collections. Moderator Tammi Kim, Assistant Librarian for the University of Delaware Library, began the session by introducing the panel, then opened the floor to the panelists.

Julia Corrin, University Archivist for Carnegie Mellon University, discussed the challenges of providing access to born-digital materials as a lone arranger. In a good week, she is able to spend about five hours on born-digital materials. Corrin pointed out that staff from small archives can find a conference like SAA intimidating: they come hoping for ideas, but it can feel as though everyone else has far more resources. She offered an analogy to cats: cat palaces are wonderful, but most cats are more than happy with a plain old cardboard box. The same is true of researchers. As long as they have access to the materials, they worry less about the delivery method. It’s better to do something than nothing, so at CMU, the goal is to “provide as much access as we can to as much as we can however we can.”

CMU keeps a preservation copy, then creates access copies of items not already in the digital repository as researchers need them. Corrin uses free tools like Dropbox and Box. The digital repository, which holds roughly 400,000 files, is a great resource, but it only accepts PDFs. Some items, like photographs, aren’t difficult to convert, but others are. Items that are difficult to convert must be put on a thumb drive and taken to the reading room for patrons. A question came up at the end of the session about whether the archive was worried about theft of items. According to Corrin, nothing restricted is provided from the born-digital materials; these are items that patrons could copy if they were in print form. The next steps for CMU will be to build more finding aids that link to digital collections.

Adriane Hanson, Digital Curation and Processing Archivist at the Richard B. Russell Library at the University of Georgia, discussed how the library staff there wanted to integrate the born-digital workflow with the workflow for paper in order to keep things simpler for both staff and researchers. They are currently building their own access tool, but it is several years away from implementation, so they needed something to provide access to their born-digital materials in the interim. They wanted a web-based tool to broaden access, but also needed to be able to lock down the files to ensure privacy and confidentiality, so they ended up using Google Drive. The Russell Library describes digital files in the finding aids alongside paper files, so patrons request digital files the same way they request paper files. Aeon can route requests for born-digital materials to Hanson through a special key, and she checks the queue once a day to ensure requests are filled in a timely manner.

When requests come in, she must upload the requested files to Google Drive, as they are not stored on a web-based server. One patron, for example, requested 2,000 files, which took an afternoon to upload. Each request is put into a folder labeled with the patron’s name. Because of space constraints, the files are available for two-week periods. So far, there have been only four requests for born-digital files. The current system worked really well for three of the four patrons, but one researcher wanted the ability to download the files, which is currently not allowed. However, since the Russell Library has just changed its policy regarding use of digital cameras, it might also change the policy on downloading born-digital materials. There is not much difference between allowing a patron to photograph a file and providing a PDF. The endgame for the Russell Library will be a Fedora/Hydra web-based access tool, which will allow patrons not only to see the files, but also to analyze them in a way the current system cannot support.
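The two-week availability window described above is easy to track with a small script. The following is only a minimal sketch of the bookkeeping, not the Russell Library’s actual process: the `PatronRequest` class, the `folders_to_remove` helper, and the sample names and dates are all hypothetical, and the script does not talk to Google Drive at all.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Two-week availability window, as described in the session summary.
ACCESS_WINDOW = timedelta(weeks=2)


@dataclass
class PatronRequest:
    """One request folder (hypothetical record; field names are assumptions)."""
    patron_name: str   # the folder is labeled with the patron's name
    uploaded_on: date  # the day the files were uploaded

    @property
    def expires_on(self) -> date:
        # Last day the folder should remain available.
        return self.uploaded_on + ACCESS_WINDOW

    def is_expired(self, today: date) -> bool:
        # A folder is expired once today is past its last available day.
        return today > self.expires_on


def folders_to_remove(requests, today):
    """Return the names of folders whose two-week window has passed."""
    return [r.patron_name for r in requests if r.is_expired(today)]


# Illustrative data only.
requests = [
    PatronRequest("Patron A", date(2015, 8, 1)),
    PatronRequest("Patron B", date(2015, 8, 20)),
]
expired = folders_to_remove(requests, today=date(2015, 8, 22))
```

With a handful of requests this is hardly necessary, but a list like this could tell the archivist which folders to clear when space runs short.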

John LeGloahec, Archives Specialist with the National Archives and Records Administration (NARA)’s Electronic Records Division, has been with that unit for eight years. During that time, NARA has worked to increase the amount of material it offers digitally – especially through its Access to Archival Databases (AAD) – to provide access to as many of its 11.7 billion born-digital materials as it can. Half of those items, LeGloahec noted, are PDFs that were not truly born-digital, but came to NARA in digital format. Patrons can access AAD from anywhere, and about 9,000 individuals use it daily. One of the biggest users is the National Personnel Records Center (NPRC), which assists families who are trying to prove a relative’s military service. There are two other options for accessing NARA’s born-digital materials. For items that are not in AAD, patrons can request access and then download the materials free of charge from NARA’s catalog. For items that are not accessible online, patrons can request the files from the Electronic Records Division for a charge. This charge increases if redaction is necessary.

Rossy Mendez, Public Services Project Archivist at the Seeley G. Mudd Library at Princeton University, said their access to born-digital materials was a work in progress. Upon acquisition, born-digital files are put through the library’s Forensic Recovery of Evidence Device (FRED), a forensic workstation with write-blocking capabilities, and then run through a series of steps in the BitCurator environment, which include analyzing the records and packaging the Archival Information Packages (AIPs). Currently, preservation copies are kept in dark storage, and access copies are provided through WebSpace, a data management tool that allows control over who can access the records – which is important because the Mudd Library holds university records. In the repository’s finding aids, the link to the WebSpace collection is embedded in the EAD. Users see a “view content” button, and from there, they can access the collections on or off campus.
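For readers unfamiliar with how a link can live inside an EAD finding aid, here is a minimal sketch. It assumes the link is carried in a `<dao>` (digital archival object) element with an `xlink:href` attribute, as in the EAD 2002 schema; the session did not say which element the Mudd Library uses. The sample finding aid is a trimmed fragment rather than a schema-valid document, and its titles and `example.org` URL are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Namespaces used by the EAD 2002 schema and by XLink.
EAD_NS = "urn:isbn:1-931666-22-9"
XLINK_NS = "http://www.w3.org/1999/xlink"

# Hypothetical, heavily trimmed finding aid (not schema-valid).
SAMPLE_EAD = f"""<ead xmlns="{EAD_NS}" xmlns:xlink="{XLINK_NS}">
  <archdesc level="collection">
    <dsc>
      <c level="file">
        <did><unittitle>Committee minutes (born-digital)</unittitle></did>
        <dao xlink:href="https://example.org/webspace/committee-minutes"
             xlink:title="View content"/>
      </c>
      <c level="file">
        <did><unittitle>Correspondence (paper)</unittitle></did>
      </c>
    </dsc>
  </archdesc>
</ead>"""


def digital_links(ead_xml: str):
    """Return (title, URL) pairs for components that carry a <dao> link."""
    root = ET.fromstring(ead_xml)
    links = []
    for comp in root.iter(f"{{{EAD_NS}}}c"):
        title = comp.findtext(
            f"{{{EAD_NS}}}did/{{{EAD_NS}}}unittitle", default=""
        )
        # Only look at <dao> elements that are direct children of this component.
        for dao in comp.findall(f"{{{EAD_NS}}}dao"):
            links.append((title, dao.get(f"{{{XLINK_NS}}}href")))
    return links


links = digital_links(SAMPLE_EAD)
```

A finding aid display can render each such link as a button next to the component it describes, while components without a `<dao>` element, like the paper file in the sample, get no button.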

This works well at the Mudd Library because there is no need for dedicated terminals in the reading room and archivists are free to respond to other patron requests. Mendez noted three downsides: patrons don’t know the difference between born-digital and digitized collections; with older collections, it can be difficult to tell where the digital material is, or whether there is any; and born-digital material is available only at the folder level, so patrons might miss things they want as they scroll through pages. Mendez is working to standardize the EAD descriptions to make them more precise. Further, the Mudd Library is a Hydra partner, so staff are exploring ways to increase access through that program.

Audra Eagle Yun, Head of Special Collections & Archives, University of California, Irvine, began by saying that her repository is participating in the ePADD initiative and is coordinating web archiving through Archive-It. Eventually, the repository will integrate these efforts into the model it uses today for access to born-digital materials. For now, however, she is trying to pull collections together by provenance, not by format; the goal is to be format-neutral. Currently, there are several groupings of collections in a DSpace instance. Migration and appraisal processes are done according to established standards, and the standard catalog records for library collections link out to the Online Archive of California (OAC), which feeds into WorldCat. For archival collections, the feed is to ArchiveGrid. Currently, the finding aids show digital collections as a separate series. Finding aids in digital collections are linked in Wikipedia.

In 2010, UC-Irvine developed its virtual reading room, which functions like the physical reading room at the archive. The sign-in process to access restricted collections is similar for both. Access to the restricted digital collections is granted within five business days. The archive’s agreement states that researchers accessing these collections will use the materials only for teaching, research, or private study. Because the collections are currently in DSpace, there is no way to set up analytics. However, the OAC does provide statistics on its collections, so UC-Irvine knows that the majority of researchers find its digital collections through Google. Within the next two years, the digital collections will move over to Calisphere, which will make more analytic data available.

One audience member asked if any of the repositories had experimented with vaporware, which would delete items from patrons’ computers after a certain timeframe. None of them had used it or looked into the possibility of using it. Another audience member asked whether full disk images were ever offered to patrons; none of the institutions had provided them. Finally, an audience member asked if there was any concern about using products like Dropbox or Google Drive, which require trusting items to a third party. Given the scope of the materials provided through these services, none of the panelists thought the cost outweighed the benefit.

