We're looking at an infrastructure based on MarkLogic running on Amazon EC2, so the scale of data to be indexed shouldn't actually be that big of an issue. Also, as I said to Jonathan, I only see myself indexing a handful of highly relevant resources, so we're talking millions, rather than 100s of millions, of records.
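To put rough numbers on that claim, here's a back-of-envelope sizing sketch. The record count, average metadata size, and index-overhead multiplier are all illustrative assumptions, not figures from anyone's actual setup:

```python
# Back-of-envelope index sizing: millions of records, not 100s of millions.
# All three figures below are assumptions for illustration only.
records = 5_000_000            # "a handful of highly relevant resources"
avg_record_bytes = 2 * 1024    # ~2 KB of metadata per record (assumption)
index_overhead = 3             # rough multiplier for index structures (assumption)

raw_gb = records * avg_record_bytes / 1024**3
total_gb = raw_gb * index_overhead
print(f"raw metadata: {raw_gb:.1f} GB, with index overhead: {total_gb:.1f} GB")
```

Even with generous assumptions, that lands in the tens of gigabytes — comfortably within a single EC2 node, which is a very different problem from the 100s-of-millions case David describes below.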

On 6/30/2010 4:22 PM, Walker, David wrote:
You might also need to factor in an extra server or three (in the cloud or 
otherwise) into that equation, given that we're talking 100s of millions of 
records that will need to be indexed.

companies like iii and Ex Libris are the only ones with
enough clout to negotiate access
I don't think III is doing any kind of aggregated indexing, hence their
decision to try to leverage APIs.  I could be wrong.

--Dave

==================
David Walker
Library Web Services Manager
California State University
http://xerxes.calstate.edu
________________________________________
From: Code for Libraries [code4...@listserv.nd.edu] On Behalf Of Jonathan 
Rochkind [rochk...@jhu.edu]
Sent: Wednesday, June 30, 2010 1:15 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] DIY aggregate index

Cory Rockliff wrote:
Do libraries opt for these commercial 'pre-indexed' services simply
because they're a good value proposition compared to all the work of
indexing multiple resources from multiple vendors into one local index,
or is it that companies like iii and Ex Libris are the only ones with
enough clout to negotiate access to otherwise-unavailable database
vendors' content?

A little bit of both, I think. A library probably _could_ negotiate
access to that content... but it would be a heck of a lot of work. When
the staff time for those negotiations is factored in, the commercial
service becomes a good value proposition, regardless of how much the
licensing itself would cost you.  And then there's the staff time to
actually ingest, normalize, and troubleshoot data flows for all that
stuff on a regular basis -- I've heard stories of libraries that tried
to do that in the early 90s, and it was nightmarish.

So, actually, I guess I've arrived at convincing myself it's mostly
"good value proposition," in that a library probably can't afford to do
that on its own, with or without licensing issues.

But I'd really love to see you try anyway, maybe I'm wrong. :)

Can I assume that if a database vendor has exposed their content to me
as a subscriber, whether via z39.50 or a web service or whatever, that
I'm free to cache and index all that metadata locally if I so choose? Is
this something to be negotiated on a vendor-by-vendor basis, or is it an
impossibility?

I doubt you can assume that.  I don't think it's an impossibility.
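Supposing a vendor did permit local caching, the mechanics are not the hard part. Here's a minimal, hypothetical sketch of caching harvested metadata records and building a naive inverted index over them; the record IDs, vendor names, and titles are made up, and a real system would of course use a proper search engine rather than a Python dict:

```python
# Hypothetical sketch: cache vendor metadata locally (e.g. records fetched
# via Z39.50 or a web service) and build a simple inverted index over it.
# Assumes the license actually permits local caching -- see the caveat above.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id TEXT PRIMARY KEY, vendor TEXT, title TEXT)")

def cache_record(rec_id, vendor, title):
    """Store one harvested metadata record in the local cache."""
    conn.execute("INSERT OR REPLACE INTO records VALUES (?, ?, ?)",
                 (rec_id, vendor, title))

def build_index():
    """Naive inverted index: lower-cased title word -> set of record ids."""
    index = {}
    for rec_id, _, title in conn.execute("SELECT id, vendor, title FROM records"):
        for word in title.lower().split():
            index.setdefault(word, set()).add(rec_id)
    return index

cache_record("v1:001", "VendorA", "Decorative Arts of the Gilded Age")
cache_record("v2:042", "VendorB", "Material Culture and Design History")
index = build_index()
print(sorted(index["design"]))  # -> ['v2:042']
```

The point is that the engineering is straightforward; whether you're *allowed* to do this is the vendor-by-vendor question.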

Jonathan
---
[This E-mail scanned for viruses by Declude Virus]

--
Cory Rockliff
Technical Services Librarian
Bard Graduate Center: Decorative Arts, Design History, Material Culture
18 West 86th Street
New York, NY 10024
T: (212) 501-3037
rockl...@bgc.bard.edu
