I'd like to share an alternative approach that we're pursuing here at UVa. It
doesn't speak directly to operating on finding aids by themselves, without
regard to representing the described collection on-line. It applies more to
situations where you attempt a full digital surrogate for a collection, using
repository machinery. I hope, though, that it might be useful to hear about.
We started from a few principles, as follows. (All of them have exceptions, of
course. {grin})
1) EAD is a wonderful markup language, but not always an optimal metadata
standard.
2) XML is for serializing, not for storage.
3) Solr is a fantastic indexing tool, but it's neither a datastore nor a
database.
4) Collections do not have an absolutely correct structure. Archivists and
scholars disagree sometimes.
5) The best ways to describe an individual entity are not necessarily the best
ways to describe the relationships between entities.
We assemble digital surrogates for archival collections as assemblages of
Fedora objects linked together by RDF. When we start with a finding aid, we
disassemble the EAD to develop a graph of documents, containers, series, etc.
in Fedora, with RDF predicates along the lines of "isConstituentOf",
"hasCollectionMember", etc. When we haven't got a finding aid, we build up the
graph from annotations on the physical objects (boxes, folders, etc.) as they
are processed for scanning. Obviously, we get a much simpler graph that way,
because no claims have been made by archivists about the structure of the
collection. Descriptive and other metadata is stored with each object in MODS
and other good -metadata- formats. A document object has metadata that pertain
only to the document (along with any data that permits us to represent the
document on-line, e.g. a scanned image or TEI text), a folder object has
metadata for that folder, etc. Since we want to offer EAD for a collection (or
any piece thereof), we supply a Fedora behavior (dissemination) against any
object, which behavior assembles a collection structure as "seen" from that
object (by following the RDF graph), then recursively assembles the appropriate
metadata and transforms it to produce EAD.
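To make the shape of that dissemination a little more concrete, here is a
minimal sketch in Python. It is not our Fedora code: the identifiers, the
in-memory triple list, the METADATA table (standing in for per-object MODS),
and the function names are all assumptions made for illustration. It only shows
the idea of following "isConstituentOf" links downward from whichever object
you start at and nesting the results as EAD-style components.

```python
import xml.etree.ElementTree as ET

# Hypothetical (subject, predicate, object) triples. The predicate name comes
# from the description above; the identifiers are invented for this sketch.
TRIPLES = [
    ("series-1", "isConstituentOf", "coll-1"),
    ("folder-1", "isConstituentOf", "series-1"),
    ("doc-1", "isConstituentOf", "folder-1"),
    ("doc-2", "isConstituentOf", "folder-1"),
]

# Stand-in for the descriptive metadata stored with each object (assumed shape).
METADATA = {
    "coll-1": {"level": "collection", "title": "Example Papers"},
    "series-1": {"level": "series", "title": "Correspondence"},
    "folder-1": {"level": "file", "title": "Folder 1"},
    "doc-1": {"level": "item", "title": "Letter, first"},
    "doc-2": {"level": "item", "title": "Letter, second"},
}


def children_of(pid, triples=TRIPLES):
    """Return the objects that claim to be constituents of pid."""
    return [s for (s, p, o) in triples if p == "isConstituentOf" and o == pid]


def assemble(pid, top=True):
    """Build an EAD-like element for pid and, recursively, its constituents.

    The starting object becomes <archdesc>; everything below it becomes <c>.
    """
    meta = METADATA[pid]
    elem = ET.Element("archdesc" if top else "c",
                      {"level": meta["level"], "id": pid})
    did = ET.SubElement(elem, "did")
    ET.SubElement(did, "unittitle").text = meta["title"]
    kids = children_of(pid)
    if kids:
        # At the top level, components are wrapped in <dsc>; below it they nest.
        holder = ET.SubElement(elem, "dsc") if top else elem
        for kid in kids:
            holder.append(assemble(kid, top=False))
    return elem


if __name__ == "__main__":
    # The same function works from the collection or from any piece of it.
    print(ET.tostring(assemble("coll-1"), encoding="unicode"))
    print(ET.tostring(assemble("folder-1"), encoding="unicode"))
```

Because the structure lives in the links rather than in any one document,
asking for the view from "folder-1" instead of "coll-1" needs no extra code.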
We like this approach because it offers a great deal of extensibility (we could
imagine using more sophisticated RDF to account for different opinions about a
collection, or offering a METS or other structured view as well) and it keeps
the repository contents "idiomatic". We haven't yet figured out entirely how
we'll bring this kind of content to Blacklight, but we'll be aided by the fact that
we have appropriately-attached metadata for anything that should appear as a
record in our indexes.
We're bringing the first part of this scheme (the assembly of object graphs) to
production in the next fortnight or so. We've got the code ready and tested and
are now enjoying the really fun stuff-- moving servers around and tinkering
with clustering and the like. The second part (producing EAD "live") can't go
to production until our cataloging dep't, who have assigned some staff to the
job, finish polishing up the mappings involved. We have very simple mappings in
place now, but not ones good enough to publish. They're working away,
and we hope to see something in production later this fall.
As for how we provide discoverability, we'll start simply by indexing all these
objects into our local Blacklight instance. There's no need to consider how to
index highly-structured XML because we're not storing it. We can move on to providing
special views for records with awareness of the relationships that Fedora has
recorded on those objects and tools for discovering, visualizing, and following
them. Unfortunately, our one Blacklight developer
has plenty on her plate already, so I don't know how quickly we'll be able to
look at that. In the meanwhile, we can simply style out the
dynamically-constructed EAD as part of a Blacklight view for a given record,
which isn't particularly exciting, but is useful.
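As a rough illustration of why indexing gets simpler this way, here is a small
Python sketch that turns each object into one flat record for an index. The
field names (title_t, level_s, parent_id_s, ancestor_ids_ss) are hypothetical,
as are the sample identifiers and the shape of the metadata; this is not our
actual indexing code or Blacklight configuration.

```python
# Hypothetical sample data: constituent links and per-object metadata.
TRIPLES = [
    ("series-1", "isConstituentOf", "coll-1"),
    ("folder-1", "isConstituentOf", "series-1"),
    ("doc-1", "isConstituentOf", "folder-1"),
]

METADATA = {
    "coll-1": {"level": "collection", "title": "Example Papers"},
    "series-1": {"level": "series", "title": "Correspondence"},
    "folder-1": {"level": "file", "title": "Folder 1"},
    "doc-1": {"level": "item", "title": "Letter, first"},
}


def ancestors(pid, parent):
    """Walk upward through the parent map (assumes the links contain no cycles)."""
    chain = []
    while pid in parent:
        pid = parent[pid]
        chain.append(pid)
    return chain


def index_docs(metadata, triples):
    """Produce one flat record per object; no nested XML has to be indexed."""
    parent = {s: o for (s, p, o) in triples if p == "isConstituentOf"}
    docs = []
    for pid, meta in metadata.items():
        docs.append({
            "id": pid,
            "title_t": meta["title"],        # assumed field name
            "level_s": meta["level"],        # assumed field name
            "parent_id_s": parent.get(pid),  # None for the top of the graph
            "ancestor_ids_ss": ancestors(pid, parent),
        })
    return docs


if __name__ == "__main__":
    for doc in index_docs(METADATA, TRIPLES):
        print(doc)
```

Keeping the parent and ancestor identifiers on each record is one possible way
to let a view follow relationships later without re-reading the whole graph.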
---
A. Soroka
Digital Research and Scholarship R & D
the University of Virginia Library