Re: [CODE4LIB] EAD in Blacklight (was: Re: [CODE4LIB] Batch loading in fedora)

[Your Name] Sun, 08 Aug 2010 08:21:11 -0700

I'd like to share an alternative approach that we're pursuing here at UVa. It 
doesn't speak directly to work on finding aids alone, where representing the 
described collection on-line isn't a concern, but more to situations where you 
attempt a full digital surrogate for a collection, using repository machinery. 
I hope, though, that it might be useful to hear about. We started from the 
following principles. (All of them have exceptions, of course. {grin})


1) EAD is a wonderful markup language, but not always an optimal metadata 
standard. 

2) XML is for serializing, not for storage.

3) Solr is a fantastic indexing tool, but it's neither a datastore nor a 
database.

4) Collections do not have an absolutely correct structure. Archivists and 
scholars disagree sometimes.

5) The best ways to describe an individual entity are not necessarily the best 
ways to describe the relationships between entities.

We build digital surrogates for archival collections as assemblages of 
Fedora objects linked together by RDF. When we start with a finding aid, we 
disassemble the EAD to develop a graph of documents, containers, series, etc. 
in Fedora, with RDF predicates along the lines of "isConstituentOf", 
"hasCollectionMember", etc. When we haven't got a finding aid, we build up the 
graph from annotations on the physical objects (boxes, folders, etc.) as they 
are processed for scanning. Obviously, we get a much simpler graph that way, 
because no claims have been made by archivists about the structure of the 
collection. Descriptive and other metadata is stored with each object in MODS 
and other good -metadata- formats. A document object has metadata that pertain 
only to the document (along with any data that permits us to represent the 
document on-line, e.g. a scanned image or TEI text), a folder object has 
metadata for that folder, etc. Since we want to offer EAD for a collection (or 
any piece thereof), we supply a Fedora behavior (dissemination) against any 
object, which behavior assembles a collection structure as "seen" from that 
object (by following the RDF graph), then recursively assembles the appropriate 
metadata and transforms it to produce EAD.
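
To make the shape of that dissemination a little more concrete, here is a 
minimal sketch in Python. It is not our production code. It assumes the RDF 
has already been pulled out as plain (subject, predicate, object) triples, 
that per-object metadata has been reduced to a small dictionary (standing in 
for MODS), and that "seen from that object" means walking downward only. The 
PIDs, titles, and function names are all hypothetical; only the predicate name 
"isConstituentOf" and the EAD element names c, did, unitid, and unittitle are 
taken from the text above and from EAD itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical triples (subject, predicate, object); the PIDs are invented.
TRIPLES = [
    ("demo:series1", "isConstituentOf", "demo:collection"),
    ("demo:folder1", "isConstituentOf", "demo:series1"),
    ("demo:doc1", "isConstituentOf", "demo:folder1"),
    ("demo:doc2", "isConstituentOf", "demo:folder1"),
]

# Stand-in for per-object descriptive metadata (MODS in a real repository).
METADATA = {
    "demo:collection": {"level": "collection", "title": "Example Papers"},
    "demo:series1": {"level": "series", "title": "Correspondence"},
    "demo:folder1": {"level": "file", "title": "Folder 1"},
    "demo:doc1": {"level": "item", "title": "Letter, 1901"},
    "demo:doc2": {"level": "item", "title": "Letter, 1902"},
}


def constituents_of(pid, triples):
    """Return the objects that declare themselves constituents of pid."""
    return [s for (s, p, o) in triples if p == "isConstituentOf" and o == pid]


def assemble(pid, triples, metadata, path=()):
    """Recursively build a simplified EAD <c> element for pid and its constituents."""
    if pid in path:  # the same object appearing among its own ancestors
        raise ValueError(f"cycle detected at {pid}")
    record = metadata[pid]
    component = ET.Element("c", {"level": record["level"]})
    did = ET.SubElement(component, "did")
    ET.SubElement(did, "unitid").text = pid
    ET.SubElement(did, "unittitle").text = record["title"]
    # Sorting by PID is an arbitrary ordering chosen only for this sketch.
    for child in sorted(constituents_of(pid, triples)):
        component.append(assemble(child, triples, metadata, path + (pid,)))
    return component


def disseminate(pid):
    """Serialize the structure below pid as a simplified EAD fragment."""
    return ET.tostring(assemble(pid, TRIPLES, METADATA), encoding="unicode")


if __name__ == "__main__":
    print(disseminate("demo:folder1"))
```

Calling disseminate() on the collection object yields the whole tree, while 
calling it on a folder yields only that folder and its documents, which is the 
"any piece thereof" idea. A real transform would also need the EAD header and 
far richer mappings than a title and a level.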

We like this approach because it offers a great deal of extensibility (we could 
imagine using more sophisticated RDF to account for different opinions about a 
collection, or offering a METS or other structured view as well) and it keeps 
the repository contents "idiomatic". We haven't yet figured out entirely how we 
bring this kind of content to Blacklight, but we'll be aided by the fact that 
we have appropriately-attached metadata for anything that should appear as a 
record in our indexes.

We're bringing the first part of this scheme (the assembly of object graphs) to 
production in the next fortnight or so. We've got the code ready and tested and 
are now enjoying the really fun stuff-- moving servers around and tinkering 
with clustering and the like. The second part (producing EAD "live") can't go 
to production until our cataloging department, who have assigned some staff to 
the job, finish polishing the mappings involved. We have very simple mappings 
in place now, but not ones good enough to publish. They're working away, and 
we hope to see something in production later this fall. As for how we 
provide discoverability, we'll start simply by indexing all these objects into 
our local Blacklight instance. There's no need to consider how to index 
highly-structured XML because we're not storing it. We can move on to providing 
special views for records with awareness of the relationships that Fedora has 
recorded on those objects and tools for discovering, visualizing, and following 
them. Unfortunately, our one Blacklight developer 
has plenty on her plate already, so I don't know how quickly we'll be able to 
look at that. In the meanwhile, we can simply style out the 
dynamically-constructed EAD as part of a Blacklight view for a given record, 
which isn't particularly exciting, but is useful.
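
As a rough illustration of why the indexing side stays simple, here is a 
small Python sketch that turns each object into one flat record of the kind 
an index like Solr accepts. It is only a sketch under assumptions: the field 
names (title_t, level_s, is_constituent_of_s), the PIDs, and the function name 
are hypothetical, the metadata dictionary stands in for MODS, and nothing here 
talks to Solr or Blacklight.

```python
import json

# Hypothetical relationships and metadata; the PIDs and titles are invented.
TRIPLES = [
    ("demo:folder1", "isConstituentOf", "demo:collection"),
    ("demo:doc1", "isConstituentOf", "demo:folder1"),
]

METADATA = {
    "demo:collection": {"level": "collection", "title": "Example Papers"},
    "demo:folder1": {"level": "file", "title": "Folder 1"},
    "demo:doc1": {"level": "item", "title": "Letter, 1901"},
}


def index_record(pid, metadata, triples):
    """Build one flat index record from an object's own metadata and links."""
    record = metadata[pid]
    parents = [o for (s, p, o) in triples if s == pid and p == "isConstituentOf"]
    return {
        "id": pid,
        "title_t": record["title"],        # hypothetical field name
        "level_s": record["level"],        # hypothetical field name
        "is_constituent_of_s": parents,    # lets a view follow the link upward
    }


def all_index_records(metadata, triples):
    """One record per object; no nested XML ever reaches the index."""
    return [index_record(pid, metadata, triples) for pid in sorted(metadata)]


if __name__ == "__main__":
    print(json.dumps(all_index_records(METADATA, TRIPLES), indent=2))
```

Each record carries only what belongs to its own object plus the identifiers 
of the objects it points to, so any relationship-aware view can be built later 
without reindexing anything as structured XML.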

---
A. Soroka
Digital Research and Scholarship R & D
the University of Virginia Library
