--- Tim Shearer <[EMAIL PROTECTED]> wrote:

> Hi Folks,
>
> I'm looking into tapping the texts in the Open Content Alliance.
>
> A few questions...
>
> As near as I can tell, they don't expose (perhaps even store?) any common
> unique identifiers (oclc number, issn, isbn, loc number).

I poked around in this world a few months ago in my previous job at
California Digital Library, also an OCA partner.

The unique key seems to be a text string identifier (one that appears to be
completely different from the text string identifier in Open Library).
Apparently there was talk at the last partner meeting about moving to ISBNs:
http://dilettantes.code4lib.org/2007/10/22/tales-from-the-open-content-alliance/

To obtain identifiers in bulk, I think the recommended approach is the
OAI-PMH interface, which seems to have become more reliable in recent months:

http://www.archive.org/services/oai.php?verb=Identify

http://www.archive.org/services/oai.php?verb=ListIdentifiers&metadataPrefix=oai_dc&set=collection:cdl

etc.
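
The ListIdentifiers request above can be scripted. Here is a minimal Python
sketch, assuming the endpoint behaves like a standard OAI-PMH 2.0 repository
(namespaced XML, paging via resumptionToken). The function names are my own
and hypothetical, and I have not verified it against the live service.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Endpoint taken from the URLs above.
OAI_BASE = "http://www.archive.org/services/oai.php"
# Standard OAI-PMH 2.0 XML namespace (assumed to be what the server emits).
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def list_identifiers_url(set_spec=None, prefix="oai_dc", token=None):
    """Build a ListIdentifiers URL (hypothetical helper)."""
    if token:
        # In OAI-PMH, resumptionToken is an exclusive argument:
        # it must not be combined with metadataPrefix or set.
        params = {"verb": "ListIdentifiers", "resumptionToken": token}
    else:
        params = {"verb": "ListIdentifiers", "metadataPrefix": prefix}
        if set_spec:
            params["set"] = set_spec
    return OAI_BASE + "?" + urllib.parse.urlencode(params)


def parse_identifiers(xml_text):
    """Return (identifiers, resumption_token_or_None) from one response."""
    root = ET.fromstring(xml_text)
    ids = [e.text.strip() for e in root.iter(OAI_NS + "identifier") if e.text]
    tok = root.find(".//" + OAI_NS + "resumptionToken")
    token = tok.text.strip() if tok is not None and tok.text else None
    return ids, token


def harvest(set_spec):
    """Yield every identifier in a set, following resumption tokens."""
    token = None
    while True:
        url = list_identifiers_url(set_spec, token=token)
        with urllib.request.urlopen(url) as resp:
            ids, token = parse_identifiers(resp.read())
        yield from ids
        if not token:
            break
```

Calling harvest("collection:cdl") would then walk the same set as the URL
above, one page of identifiers at a time.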


Some additional instructions, in case you want to grab the content files.

From any book's metadata page (e.g.,
http://www.archive.org/details/chemicallecturee00newtrich),
click through on the "Usage Rights: See Terms" link; the rights are shown in
a pane on the left-hand side.

Once you know the identifier, you can grab the content files, using this syntax:
    http://www.archive.org/details/$ID
Like so:
    http://www.archive.org/details/chemicallecturee00newtrich

And then sniff the page to find the FTP link:
    ftp://ia340915.us.archive.org/2/items/chemicallecturee00newtrich

But I think they prefer that you use HTTP rather than FTP for these, so switch this to:
    http://ia340915.us.archive.org/2/items/chemicallecturee00newtrich
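
The steps above (build the details URL, sniff the page for the FTP link,
switch the scheme to HTTP) can be sketched in Python as follows. The regular
expression is only my guess at how the FTP link appears in the page markup,
and the helper names are hypothetical; adjust to what the page actually
contains.

```python
import re


def details_url(identifier):
    """Metadata page for an item, per the syntax above."""
    return "http://www.archive.org/details/" + identifier


# Assumption: the FTP link appears somewhere in the page as a plain
# ftp://<host>.archive.org/... URL, ending at a quote, bracket, or space.
FTP_LINK = re.compile(r"ftp://[\w.-]+\.archive\.org/[^\s\"'<>]+")


def find_ftp_link(html):
    """Return the first archive.org FTP URL found in the page, or None."""
    match = FTP_LINK.search(html)
    return match.group(0) if match else None


def ftp_to_http(url):
    """Swap the ftp:// scheme for http://, leaving host and path alone."""
    if url.startswith("ftp://"):
        return "http://" + url[len("ftp://"):]
    return url
```

Fetching the page itself is left out here; anything that returns the HTML as
a string (curl, urllib, etc.) can feed find_ftp_link().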

Hope this helps!

  --SET


> We're a contributor so I can use curl to grab our records via http (and
> regexp my way to our local catalog identifiers, which they do
> store/expose).
>
> I've played a bit with the z39.50 interface at indexdata
> (http://www.indexdata.dk/opencontent/), but I'm not confident about the
> content behind it.  I get very limited results, for instance I can't find
> any UNC records and we're fairly new to the game.
>
> Again, I'm looking for unique identifiers in what I can get back and it's
> slim pickings.
>
> Anyone cracked this nut?  Got any life lessons for me?
>
> Thanks!
> Tim
>
> +++++++++++++++++++++++++++++++++++++++++++
> Tim Shearer
>
> Web Development Coordinator
> The University Library
> University of North Carolina at Chapel Hill
> [EMAIL PROTECTED]
> 919-962-1288
> +++++++++++++++++++++++++++++++++++++++++++
>
