On 1/4/06, Ed Summers <[EMAIL PROTECTED]> wrote:
> I would opt for #2. There is a new version of MARC::Charset available
> which should ease the marc8 <-> utf8 charset translation. Shortly
> there will be a new MARC::File::XML that uses the latest MARC::Charset
>
> You might be interested in taking a look at how Evergreen is storing
> MARC data. I know that they are using a modded version of
> MARC::File::XML in some capacity. Mike Rylander is on this list, so
> maybe he'll chime in--or otherwise you could drop into
> irc://irc.freenode.net/code4lib and ask him (he's usually there).
>

Chime!

We store all MARC as MARCXML and use the LoC MODS XSLT to extract
displayable (and hence indexable) data. Of course, that particular
stylesheet is only useful for MARC21 data, and other MARC variants
would require their own stylesheets, I believe.

As of today, we're using the yaz-marcdump utility from the Yaz toolkit
to transform MARC-8 encoded iso2709 into utf-8 encoded MARCXML, but it
has some ... issues ... that I believe the new MARC::Record and
MARC::File::XML will fix. Our "import via z39.50" facility uses
MARC::Record and our locally modified MARC::File::XML, which still
relies on the old MARC::Charset. It works OK, but I think the new
MARC::Charset is going to be a real boon to us.

In any case, I second Ed's suggestion that you move to MARCXML as much
as possible. XML is /so/ much more flexible (read: there are many more
tools for it) than iso2709, not to mention that it creates a much lower
barrier to entry for those wanting to hack on the guts of the system.

Also, unless I'm mistaken, you can use Yaz Proxy to transform MARCXML
into iso2709 were you to use that as your external Z39.50 server.

--
Mike Rylander
[EMAIL PROTECTED]
GPLS -- PINES Development
Database Developer
http://open-ils.org
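
To illustrate the "many more tools" point about MARCXML: the following is
a minimal sketch, not Evergreen's code, of pulling a displayable title out
of a MARCXML record using only Python's standard library. The sample
record and the helper name `subfields` are made up for illustration; the
only thing taken as given is the standard MARCXML namespace.

```python
import xml.etree.ElementTree as ET

# Standard MARCXML ("MARC21 slim") namespace.
MARCXML_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Hypothetical sample record, invented for this example only.
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nam a2200000 a 4500</leader>
  <controlfield tag="001">example-0001</controlfield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">An example title :</subfield>
    <subfield code="b">with a subtitle.</subfield>
  </datafield>
</record>"""


def subfields(record, tag, codes):
    """Return the text of the requested subfield codes, in document
    order, for every datafield in the record that carries this tag."""
    values = []
    for field in record.findall(f"marc:datafield[@tag='{tag}']", MARCXML_NS):
        for sub in field.findall("marc:subfield", MARCXML_NS):
            if sub.get("code") in codes:
                values.append(sub.text or "")
    return values


record = ET.fromstring(SAMPLE)
# 245 $a and $b together make up the title statement in MARC21.
title = " ".join(subfields(record, "245", "ab"))
print(title)
```

Doing the same against raw iso2709 would mean parsing the leader and
directory by hand (or pulling in a MARC-specific library), which is the
barrier to entry mentioned above.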