[CODE4LIB] Solr/Lucene consultant
We are considering the possibility of hiring a consultant to work with us on performance tuning and new feature development for our Solr-based collections search solution. I have a couple of leads on this already, but was curious if anyone on the list has had experiences (good or bad) with anyone providing this type of service. Replies on or off the list are welcome.

Thanks,

David Dwiggins
Systems Librarian/Archivist
Historic New England
ddwiggins at historicnewengland dot org
617-994-5948
http://www.historicnewengland.org

Historic New England is celebrating its centennial. Discover all that's happening across the region this year at http://www.HistoricNewEngland.org/Centennial
[CODE4LIB] PastPerfect->MARC
Has anyone had any experience mapping book data from PastPerfect 4 into MARC format for import to a library system? We have about 550 book records from an old version of PastPerfect that we are no longer using, and want to import them into our MARC-based library database.

It appears that the vendor used to have an add-on called ezMARC that would do this, but they are no longer making it in version 5, and I'm not sure whether it is still available for version 4. I also don't know how effective it is -- I would like a testimonial from someone before we spend money on add-ons for a product we're no longer using!

I have all the data and could obviously map it manually, but I would prefer to have a ready-made filter as a starting point. Ideas?

-David

__
David Dwiggins
Systems Librarian/Archivist, Historic New England
141 Cambridge Street, Boston, MA 02114
(617) 994-5948
ddwigg...@historicnewengland.org
http://www.historicnewengland.org
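[Editor's note: lacking a ready-made filter, one starting point is a small field-mapping script. The sketch below is an assumption-laden illustration, not ezMARC or any actual PastPerfect export format: the CSV column names (Title, Author, Publisher, PubDate, CallNo) are invented and would need to be adjusted to the real export. It maps each row to a dict of MARC tags; a library such as pymarc could then serialize the result to binary MARC for import.]

```python
import csv
import io

def row_to_marc(row):
    """Map one PastPerfect CSV row to {MARC tag: [(subfield, value), ...]}.

    Column names are hypothetical -- adjust to the actual PastPerfect 4 export.
    """
    record = {}
    if row.get("Author"):
        record["100"] = [("a", row["Author"])]          # main entry, personal name
    if row.get("Title"):
        record["245"] = [("a", row["Title"])]           # title statement
    pub = []
    if row.get("Publisher"):
        pub.append(("b", row["Publisher"]))
    if row.get("PubDate"):
        pub.append(("c", row["PubDate"]))
    if pub:
        record["260"] = pub                             # publication info
    if row.get("CallNo"):
        record["090"] = [("a", row["CallNo"])]          # local call number
    return record

# Tiny fake export to show the mapping in action.
sample = io.StringIO(
    "Title,Author,Publisher,PubDate,CallNo\n"
    "Old Boston Houses,Smith; Jane,Example Press,1910,F73.37 .S6\n"
)
records = [row_to_marc(r) for r in csv.DictReader(sample)]
print(records[0]["245"])
```

With only ~550 records, spot-checking the output against a handful of known titles before a bulk import would catch most mapping mistakes.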
[CODE4LIB] Working with Getty vocabularies
Is there anyone out there with experience processing the raw data files for the Getty vocabularies (particularly TGN)? We're adopting AAT and TGN as the primary vocabularies for our new shared cataloging system for our museum, library and archival collections. I'm presently trying to come up with some scripts to automate matching of places in existing databases to places in the TGN taxonomy. But I'm finding that the Getty data files are very complex, and I haven't yet figured out a foolproof method to do this. I'm curious if anyone else has traveled this road before, and if so whether you might be able to share some tips or code snippets.

Since most of our place names are going to be in the US, my gut feeling has been to first try to extract a list of places in the US and dump things like state, county, etc. into discrete database fields that I can match against. But I find myself a bit flummoxed by the polyhierarchical nature of the data (where one place can belong to multiple higher-level places).

Another issue is the wide variety of place types in use in the taxonomy. England, for example, is a "country," but the United States is a "nation." This makes sense to a degree, but it also makes it a bit hard to figure out which term to match when you're trying to automate matching against data where the creators were less discerning about this sort of fine distinction.

I feel like I'm surely not the first person to tackle this, and would love to exchange notes...

-David Dwiggins

__
David Dwiggins
Systems Librarian/Archivist, Historic New England
141 Cambridge Street, Boston, MA 02114
(617) 227-3956 x 242
ddwigg...@historicnewengland.org
http://www.historicnewengland.org

Visit http://www.LymanEstate.org for information on renting the historic Lyman Estate for your next event - a very special place for very special occasions.
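[Editor's note: the polyhierarchy and place-type problems David describes can be sketched in a few lines. The toy table below is invented for illustration; the real TGN data files use their own relational layout (subject, term, place-type, and parent-key tables), so this only shows the shape of the approach: follow every parent branch when collecting ancestors, and collapse near-synonymous place types such as "nation"/"country" before comparing.]

```python
# Each place: id -> (preferred name, place type, list of parent ids).
# A place may have MULTIPLE parents -- that is the polyhierarchy.
PLACES = {
    1: ("World", "facet", []),
    2: ("United States", "nation", [1]),
    3: ("Massachusetts", "state", [2]),
    4: ("Suffolk", "county", [3]),
    5: ("Boston", "inhabited place", [4, 3]),  # listed under both county and state
}

def norm_type(place_type):
    """Collapse near-synonymous place types (e.g. 'nation' vs 'country')."""
    return {"nation": "country"}.get(place_type, place_type)

def ancestors(place_id):
    """All ancestor ids, following every parent branch of the polyhierarchy."""
    seen = set()
    stack = list(PLACES[place_id][2])
    while stack:
        pid = stack.pop()
        if pid not in seen:
            seen.add(pid)
            stack.extend(PLACES[pid][2])
    return seen

def match(name, state=None):
    """Ids whose name matches; optionally require a given state among ancestors."""
    hits = []
    for pid, (pname, _ptype, _parents) in PLACES.items():
        if pname.lower() != name.lower():
            continue
        if state is not None:
            anc_states = {PLACES[a][0] for a in ancestors(pid)
                          if PLACES[a][1] == "state"}
            if state not in anc_states:
                continue
        hits.append(pid)
    return hits

print(match("Boston", state="Massachusetts"))  # [5]
```

Disambiguating by ancestor state rather than by direct parent sidesteps the multiple-parent problem: it doesn't matter which branch (county or state) a candidate hangs from, as long as the right state appears somewhere above it.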
Re: [CODE4LIB] Working with Getty vocabularies
Michael -- Thanks for the code snippet. I will take a stab at putting it into practice when I have a few minutes.

Ed -- The TGN can be searched for free on a one-off basis by going to http://www.getty.edu/research/conducting_research/vocabularies/tgn/ If you want access to the raw data files (such as to load into a cataloging system, provide search capabilities online, or do computerized matching), this requires purchasing a license. But, at least in the case of our organization, the terms were quite reasonable, and the initial license allows us to get updates for five years and then renew at a reduced rate.

-David Dwiggins

__
David Dwiggins
Systems Librarian/Archivist, Historic New England
141 Cambridge Street, Boston, MA 02114
(617) 227-3956 x 242
ddwigg...@historicnewengland.org
http://www.historicnewengland.org

>>> Ed Summers 2/27/2009 11:06 AM >>>
The TGN is still behind a pay-firewall, right? Not that that means it isn't legit conversation on here (because it is) -- just curious what the current state is.

//Ed
Re: [CODE4LIB] digital storage
I've been pondering this a lot lately. We're starting from the ground up on a concerted digital asset management effort after years of one-off solutions. When I arrived, I inherited piles of CDs and DVDs, things stashed on servers all over the place, etc. I am now implementing a digital asset management system (ResourceSpace) to start ordering all this, which will tie into our new collections management system and new web content management system.

For the moment, I have written a script to copy the resource and preview assets from ResourceSpace to a bucket on S3. (To save bandwidth/time, I also used the batch load capability to ship them a hard drive with about 500 GB of data a few weeks ago.) So I now have two copies of all images: one protected by RAID on our iSCSI storage box, and one theoretically spread across multiple data centers at Amazon. Ideally I'd like to have one other copy at one of our remote offices (either online or offline), but that's for the future.

I'm not sure we've entirely come to terms with the long-term cost of preserving the material. We're buying enough local storage to get through our grant-funded ramp-up. After that, replacing/adding drives and servers is going to have to be considered as much of a preservation/conservation expense as replacing a leaky roof. But it's a relatively new expense (or at least orders of magnitude bigger than it has been for other data systems), so it's something we're going to have to educate people on.

-David Dwiggins
Historic New England

__
David Dwiggins
Systems Librarian/Archivist, Historic New England
141 Cambridge Street, Boston, MA 02114
(617) 227-3956 x 242
ddwiggins [at] historicnewengland.org
http://www.historicnewengland.org

>>> Jimmy Ghaphery 8/27/2009 1:37 PM >>>
We have a historic idea of what it means to maintain space for analog collections. For many institutions a lot of that initial funding has come from capital building funds. While the technological solutions are not clear to me at this point (and I'm benefiting from this thread on that), I am not sure this won't turn into more of a long-term business problem. Has anyone been able to give a projection to their management on what the total cost per TB is for preservation over even a short horizon of 10 years?

--Jimmy

--
Jimmy Ghaphery
Head, Library Information Systems
VCU Libraries
http://www.library.vcu.edu
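[Editor's note: the copy-to-S3 step David mentions could be sketched along these lines. This is a minimal, hypothetical sketch, not his actual script: the bucket name and file paths are invented, and the real upload path assumes the third-party boto3 library plus AWS credentials. The decision about which files still need uploading is kept in a pure helper so it can be tested without touching the network.]

```python
import os

def missing_keys(local_files, remote_keys):
    """Return local paths whose S3 object keys are not yet in the bucket listing."""
    return sorted(f for f in local_files
                  if f.replace(os.sep, "/") not in remote_keys)

def sync_to_s3(local_files, bucket_name):
    """Upload any files not already in the bucket (requires boto3 + AWS creds)."""
    import boto3  # third-party; assumed available where the real script runs
    s3 = boto3.client("s3")
    existing = set()
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket_name):
        existing.update(obj["Key"] for obj in page.get("Contents", []))
    for path in missing_keys(local_files, existing):
        s3.upload_file(path, bucket_name, path.replace(os.sep, "/"))

# Dry run of the decision logic only (no network or credentials needed):
local = ["resources/0001.tif", "previews/0001.jpg"]
already_uploaded = {"previews/0001.jpg"}
print(missing_keys(local, already_uploaded))
```

Skipping keys that already exist makes the script safe to re-run nightly from cron, which matters once the asset count grows past what a single full upload can finish in one window.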