Re: [CODE4LIB] DIY aggregate index

Blake, Miriam E
Thu, 01 Jul 2010 09:50:04 -0700

On 7/1/10 9:44 AM, "Jonathan Rochkind" <[email protected]> wrote:


> the technical issues of maintaining the regular flow of updates from
> dozens of content providers, and normalizing all data to go in the same
> index, are non-trivial, I think now.

This is very much one of the hardest parts, Jonathan.
Also, thinking about the kinds of services that users want from this data, we've
found the biggest need is to focus on citation references, if you can get them
(e.g. from ISI).
And if you think the bibliographic metadata is poor quality, try
matching on brief reference metadata (the kind that doesn't contain unique
identifiers, of course).
It takes complex fuzzy string matching, and the results still are never really
great.
(This is part of the problem with cite counts being all over the map in the
apps out there!)
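
To make the matching problem concrete, here is a minimal sketch in Python
using only the standard library's difflib. The function names (normalize,
similarity, best_match), the sample strings, and the 0.6 threshold are all
assumptions made up for illustration; this is not what any production system
actually runs.

```python
import re
from difflib import SequenceMatcher


def normalize(ref: str) -> str:
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    ref = ref.lower()
    ref = re.sub(r"[^a-z0-9 ]+", " ", ref)
    return " ".join(ref.split())


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] between two normalized strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def best_match(reference: str, records: list, threshold: float = 0.6):
    """Return (record, score) for the highest-scoring record.

    Returns (None, score) when the best score is below the threshold.
    The 0.6 default is an arbitrary, illustrative cutoff.
    """
    best, best_score = None, 0.0
    for rec in records:
        score = similarity(reference, rec)
        if score > best_score:
            best, best_score = rec, score
    if best_score < threshold:
        return None, best_score
    return best, best_score


if __name__ == "__main__":
    # Hypothetical records and a hypothetical brief reference string.
    records = [
        "Smith, J. Physical Review Letters, 87, 123 (2001)",
        "Jones, A. Journal of Cheese Studies, 12, 45 (1999)",
    ]
    brief = "Smith J, Phys Rev Lett 87 (2001) 123"
    print(best_match(brief, records, threshold=0.0))
```

Even on a toy case like this, abbreviated journal titles and reordered fields
drag the score down, which is why a single string ratio is never really enough
and real matching tends to compare parsed fields (author, year, volume, page)
separately.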

My words to the wise are NOT to do local loading unless you have a lot of time
and money.
Vendors who are doing it have economies of scale. Individual institutions
typically do not. If the community were to agree on centralized management
at a few institutions for this kind of "open" dataset, then maybe. But, as
someone noted, the middlemen ("value add" A&I producers: Thomson, EBSCO, etc.)
are not going to love this idea.

Miriam Blake
Los Alamos National Laboratory Research Library
