Hi Stuart,

I've done RDF/DC to MARC conversion for Project Gutenberg. It requires a lot of cleanup, especially for subject heading strings: LCSH headings often appear in a DC element as a single string and need to be parsed into MARC subfields. It is tedious, and in the case of Project Gutenberg human intervention is required.
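To illustrate the kind of subject-string parsing involved, here is a rough Python sketch, not the workflow actually used for the Gutenberg records. It assumes headings arrive as "--"-delimited strings, and the function names and the two small lookup sets are hypothetical. The heuristics will misclassify many subdivisions, which is exactly where the human review comes in.

```python
import re

# Hypothetical, deliberately tiny lookup sets for illustration only.
# A real workflow would rely on authority files plus human review.
FORM_SUBDIVISIONS = {"Fiction", "Poetry", "Drama", "Juvenile fiction", "Periodicals"}
PLACES = {"England", "France", "United States", "London"}


def lcsh_to_subfields(heading):
    """Split a '--'-delimited LCSH string into (code, value) MARC subfields."""
    parts = [p.strip() for p in re.split(r"\s*--\s*", heading) if p.strip()]
    subfields = []
    for i, part in enumerate(parts):
        if i == 0:
            code = "a"  # main heading
        elif part in FORM_SUBDIVISIONS:
            code = "v"  # form subdivision
        elif part in PLACES:
            code = "z"  # geographic subdivision
        elif re.search(r"\d{3,4}|century", part, re.IGNORECASE):
            code = "y"  # chronological subdivision (guessed from dates)
        else:
            code = "x"  # general subdivision (fallback guess)
        subfields.append((code, part))
    return subfields


def format_field(subfields, tag="650"):
    """Render subfields as one display line; '_' stands for a blank
    first indicator and second indicator 0 marks the heading as LCSH."""
    return f"{tag} _0 " + " ".join(f"${code} {value}" for code, value in subfields)


print(format_field(lcsh_to_subfields("Young women -- England -- Fiction")))
```

The sketch always defaults to tag 650 (topical term). Choosing between 650, 651, 600 and so on depends on what the main heading actually is, which the code does not attempt to decide.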
I'm close to finishing the editing of about 4,000 records harvested in late December 2014, about 16 months after an initial harvest of about 40,000. The RDF/DC had changed somewhat, but there seemed to be significantly fewer subject headings. I decided to examine virtually every item and to find better records at the Library of Congress or, more frequently, the Internet Archive [ archive.org/details/texts ].

I fully agree about how important this is, but I don't think I'll do it again since it consumes all my free time. Maybe if others could volunteer to do that, I could continue harvesting. Only a download of the complete collection is possible, but I use XSL to select records based on date added.

The collections you mention are worthy of being included in library systems. Metadata quality is a limiting factor.

regards,
dana

On Mon, Aug 18, 2014 at 5:04 PM, Stuart Yeates <stuart.yea...@vuw.ac.nz> wrote:

> There are a stack of great free ebook repositories available on the web,
> things like https://unglue.it/ http://www.gutenberg.org/
> https://en.wikibooks.org/wiki/Main_Page http://www.gutenberg.net.au/
> https://www.smashwords.com/books/category/1/newest/0/free/any etc, etc
>
> What there doesn't appear to be, is high-quality AACR2 / RDA records
> available for these. There are things like
> https://ebooks.adelaide.edu.au/meta/pg/
> which are elaborate dublin core to MARC converters, but these
> lack standardisation of names, authority control (people, entities, places,
> etc), interlinking, etc.
>
> It seems to me that quality metadata would greatly increase the value /
> findability / use of these projects and thus their visibility and available
> sources.
>
> Are there any projects working in this space already? Are there suitable
> tools available?
>
> cheers
> stuart

-- 
Dana Pearson
dbpearsonmlis.com
Metadata and Bibliographic Services for Libraries