Mea culpa ... read on. :)

On 3/16/06, Mike Rylander <[EMAIL PROTECTED]> wrote:
> I've updated the cvs for MARC::File::XML with what I described below,
> with one caveat. The one difference from what I was planning is that,
> because as_xml() is generated by MARC::Record, I can't give it new
> parameters. To test exporting to XML you'll need to set the record
> format for export either in the use line for the module or using the
> default_record_format() class method. Just call that with 'UNIMARC'
> as the parameter and then export your record as normal using as_xml()
> on the MARC::Record object.

It seems that I am either blind or insane ... I do have access to
as_xml(), and I did in fact add the format option to it. Sorry for the
confusion. :) I'm updating the POD now, and adding a new method to
export XML without a <collection> wrapper.

>
> (new_from_xml() does not suffer from this as that method is defined in
> MARC/File/XML.pm, so it takes both an encoding parameter and a format
> parameter, as explained in the documentation.)
>
> Will some brave soul please test this with some UNIMARC records and
> let me know how it goes?
>
> -----------------------------------
>
> CVS checkout instructions
>   cvs -d:pserver:[EMAIL PROTECTED]:/cvsroot/marcpm login
>   cvs -z3 -d:pserver:[EMAIL PROTECTED]:/cvsroot/marcpm co -P marc-xml
>
> Then,
>   cd marc-xml
>   perl Makefile.PL
>   make
>   make test
>
> And assuming 'make test' succeeds ...
>   make install
>
> -------------------------------
>
> Thanks in advance,
>
> --miker
>
> On 3/16/06, Mike Rylander <[EMAIL PROTECTED]> wrote:
> > I've been attempting to beat the MARC::File::XML stuff into a usable
> > shape as of late, so I'm going to take a stab at fixing this. There
> > will be some limitations (at first) as to what encodings we'll accept
> > for UNIMARC records, but I'll cover the cases that I know about (and
> > understand).
> >
> > Here's the plan:
> >
> > I will add a use flag to set the script-wide default for record format
> >
> >   use MARC::File::XML ( RecordFormat => 'UNIMARC' );
> >
> > that will default to MARC21. There will also be a class method to set
> > this flag
> >
> >   MARC::File::XML->default_record_format( 'UNIMARC' );
> >
> > and, finally, a flag to both as_xml and new_from_xml to tell
> > MARC::File::XML about individual records. I don't think, at this
> > point, we should autodetect based on the existence of a 200 tag, as
> > I'd like to stay away from heuristics if it can be avoided. If others
> > disagree, please make the case!
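
The three levels in the quoted plan (use-line flag, class method, per-record flag) suggest a simple precedence: a per-record flag wins, otherwise the script-wide default applies, and that default starts out as MARC21. Here is a minimal sketch of that precedence only. The package My::RecordFormat and the helper format_for() are hypothetical names made up for illustration; this is not the MARC::File::XML source, and the real module may well do it differently.

```perl
use strict;
use warnings;

# Hypothetical stand-in package, NOT the real MARC::File::XML code.
package My::RecordFormat;

# Script-wide default; MARC21 unless something overrides it.
my $default_format = 'MARC21';

# Runs for "use My::RecordFormat ( RecordFormat => ... )".
sub import {
    my ($class, %args) = @_;
    $default_format = $args{RecordFormat} if defined $args{RecordFormat};
    return;
}

# Class method: optionally set, then return, the script-wide default.
sub default_record_format {
    my ($class, $format) = @_;
    $default_format = $format if defined $format;
    return $default_format;
}

# Hypothetical helper: a per-record flag wins over the default.
sub format_for {
    my ($class, $format) = @_;
    return defined $format ? $format : $default_format;
}

package main;

# The package lives in this file, so call import() by hand;
# a real "use" line would do this at compile time.
My::RecordFormat->import( RecordFormat => 'UNIMARC' );

print My::RecordFormat->format_for(), "\n";           # falls back to the default
print My::RecordFormat->format_for('MARC21'), "\n";   # per-record flag wins
```

With the default set to UNIMARC as above, the first call reports UNIMARC and the second reports MARC21, since the explicit argument overrides the default for that one record.
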
> >
> > When processing a UNIMARC record, I'll look in 100$a for the encoding,
> > and proceed if it's either 01 (iso646 -- nominally compatible with
> > iso8859, though it requires interpretation) or 50 (UNICODE, which will
> > always mean UTF8 in XML produced by MARC::File::XML). If it's
> > anything else an error will be thrown. We can add support for other
> > encodings as the direct need arises.
> >
> > For UNIMARC/UNICODE, the XML is obviously going to be UTF-8 encoded.
> > For UNIMARC/ISO646, the XML will be marked as ISO-8859-1. Yes, it's a
> > bit of a fib, but most XML parsers don't support ISO646, and most do
> > support LATIN1 (8859-1), and the bytes won't get mangled by the parser
> > in that case.
> >
> > Comments?
> >
> > On 3/16/06, Zeno Tajoli <[EMAIL PROTECTED]> wrote:
> > > Hi,
> > >
> > > >PROBLEM :
> > > >* in MARC21, the encoding is defined by position 9 of the leader.
> > > >'a' means UTF-8
> > > >* in UNIMARC, this is an empty position ! the encoding is in
> > > >positions 26-27 and 28-29 of 100$a (<200 are all fixed coded fields
> > > >in unimarc : http://bibliotheque.bgp-fr.com/Unimarc_abrege.pdf, page
> > > >8 for 100$a)
> > > >
> > > >BIG PROBLEM :
> > > >MARC::File::XML only checks for position 9, thinking the XML is
> > > >necessarily a marc21 file.
> > > >
> > > >I think (& joshua agrees) we will have to hack MARC::File::XML to
> > > >solve this problem.
> > > >We have 2 solutions :
> > > >* add a test to define whether we are UNIMARC or MARC21. In UNIMARC,
> > > >title is in 200, while 200 is empty in MARC21.
> > > >* add a parameter to ->new_as_xml($xml,'UTF-8','UNIMARC') to specify
> > > >we are sending the parser an unimarc file.
> > >
> > > As a person who has written a Unimarc -> MARC21 converter, I prefer
> > > the second solution.
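
The 100$a check described in the quoted plan (01 gets labelled ISO-8859-1, 50 gets UTF-8, anything else is an error) could look roughly like the sketch below. The function names xml_encoding_for_unimarc() and dummy_100a() are hypothetical, and the code assumes that "positions 26-27" are zero-based offsets into the subfield; it is an illustration of the rule, not the code that went into MARC::File::XML.

```perl
use strict;
use warnings;

# UNIMARC character-set code => encoding name declared in the XML.
my %XML_ENCODING_FOR = (
    '01' => 'ISO-8859-1',   # ISO 646, labelled Latin-1 so XML parsers accept it
    '50' => 'UTF-8',        # Unicode
);

# Hypothetical helper: takes the raw 100$a string, returns the XML
# encoding name, and dies on any character-set code not listed above.
sub xml_encoding_for_unimarc {
    my ($field_100a) = @_;

    die "100\$a is missing or too short\n"
        unless defined $field_100a && length($field_100a) >= 28;

    # Assumption: positions 26-27 are counted from zero.
    my $code     = substr($field_100a, 26, 2);
    my $encoding = $XML_ENCODING_FOR{$code};

    die "Unsupported UNIMARC character set code '$code'\n"
        unless defined $encoding;

    return $encoding;
}

# Build a dummy 36-character 100$a with the given code at offsets 26-27.
sub dummy_100a {
    my ($code) = @_;
    return ('#' x 26) . $code . ('#' x 8);
}

print xml_encoding_for_unimarc( dummy_100a('50') ), "\n";
print xml_encoding_for_unimarc( dummy_100a('01') ), "\n";
```

Keeping the mapping in a hash means that supporting another character set later is a one-line addition, which fits the "add support as the direct need arises" approach in the plan.
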
> > >
> > > Thanks for all
> > > Bye
> > >
> > > Zeno Tajoli
> > > CILEA - Segrate (MI)
> > > tajoliAT_SPAM_no_prendiATcilea.it
> > > (Address masked against spam; replace everything between the ATs with @)
> > >
> >
> > --
> > Mike Rylander
> > [EMAIL PROTECTED]
> > GPLS -- PINES Development
> > Database Developer
> > http://open-ils.org
> >
>
> --
> Mike Rylander
> [EMAIL PROTECTED]
> GPLS -- PINES Development
> Database Developer
> http://open-ils.org
>

--
Mike Rylander
[EMAIL PROTECTED]
GPLS -- PINES Development
Database Developer
http://open-ils.org