Hi Eric,

I’ve worked with EEBO as part of the Jisc Historical Texts 
(https://historicaltexts.jisc.ac.uk/home) platform, which provides access to 
EEBO and other collections for UK universities. My work covered the metadata, 
search across metadata and full text, and the display of results. I was mainly 
looking at metadata, but did some digging into the TEI files to see how the 
markup could be used to extract metadata (e.g. the presence of illustrations in 
the text).
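
As a rough illustration of that kind of check, here is a minimal Python sketch. 
It assumes the files are well-formed XML (the SGML versions would need 
converting first) and that illustrations are marked with an element named 
"figure", in any case and with or without a namespace. The element name, the 
count_figures helper and the sample document are assumptions for illustration, 
not a description of how the platform does it.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any XML namespace prefix and lower-case the element name."""
    return tag.rsplit("}", 1)[-1].lower()


def count_figures(xml_text):
    """Count elements whose local name is 'figure'.

    Assumption: a 'figure' element marks an illustration in the transcription.
    """
    root = ET.fromstring(xml_text)
    return sum(
        1
        for el in root.iter()
        if isinstance(el.tag, str) and local_name(el.tag) == "figure"
    )


# Hypothetical, heavily simplified TEI-like document for demonstration only.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <div><p>Some text.</p><figure><figDesc>woodcut</figDesc></figure></div>
    <div><p>More text.</p></div>
  </body></text>
</TEI>"""

# A simple yes/no flag that could be stored alongside the other metadata.
has_illustrations = count_figures(SAMPLE) > 0
```

The resulting flag could then be indexed as a search facet next to the 
catalogue metadata.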

I was lucky (?!) enough to have access to the MARC records, but I did also do 
some work looking at the metadata included in the TEI files.

If there is anything I can help with, I’d be happy to.

The people who worked with the files in detail were a UK software development 
company, Knowledge Integration (http://www.k-int.com/). I can give you a 
contact there if that would be helpful.

Owen

Owen Stephens
Owen Stephens Consulting
Web: http://www.ostephens.com
Email: [email protected]
Telephone: 0121 288 6936

> On 5 Jun 2015, at 13:10, Eric Lease Morgan <[email protected]> wrote:
> 
> Does anybody here have experience reading the SGML/XML files representing the 
> content of EEBO? 
> 
> I’ve gotten my hands on approximately 24 GB of SGML/XML files representing 
> the content of EEBO (Early English Books Online). This data does not include 
> page images. Instead it includes metadata of various ilks as well as the 
> transcribed full text. I desire to reverse engineer the SGML/XML in order to: 
> 1) provide an alternative search/browse interface to the collection, and 2) 
> support various types of text mining services. 
> 
> While I am making progress against the data, it would be nice to learn of 
> other people’s experience so I do not re-invent the wheel (too many 
> times). Got ideas?
> 
> —
> Eric Lease Morgan
> University Of Notre Dame
