On Wed, Oct 29, 2008 at 5:01 PM, Dan Scott <[EMAIL PROTECTED]> wrote:

> 2008/10/29 Bill Erickson <[EMAIL PROTECTED]>:
> > Hi all,
> >
> > I ran across some gnarly MARC data today, which contained, among other
> > things, subfield codes of "<".  I realized that MARC::File::XML outputs
> > the MARC tags, codes, and indicators without escaping them.  This
> > results, in my case, in invalid XML like:
> >
> > <subfield code="<">France</subfield>
> >
> > It seems reasonable that, regardless of the (horrible) content of the
> > MARC, MARC::File::XML should produce valid XML.
> >
> > Attached is a patch to explicitly escape the values before inserting them
> > into the XML document under construction.  I'm not sure if it's the best
> > approach, but it got me up and running again.
>
> Any chance of attaching a sample (horrible) MARC record to include in
> a test case?
>
> I'm not saying I would build a testcase for MARC::File::XML, but I
> might build one for File_MARC (PHP)... and a nice horrible MARC record
> from the wild would help.
>
>

Attached, including the post-escape XML version.

-b
<?xml version="1.0" encoding="UTF-8"?>
<record
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.loc.gov/MARC21/slim http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd"
    xmlns="http://www.loc.gov/MARC21/slim">

  <leader>00727nam  2200205 a 4500</leader>
  <controlfield tag="001">03-0016458</controlfield>
  <controlfield tag="005">19971103184734.0</controlfield>
  <controlfield tag="008">970701s1997    oru          u000 0 eng u</controlfield>
  <datafield tag="035" ind1=" " ind2=" ">
    <subfield code="a">(Sirsi) a351664</subfield>
  </datafield>
  <datafield tag="050" ind1="0" ind2="0">
    <subfield code="a">ML270.2</subfield>
    <subfield code="b">.A6 1997</subfield>
  </datafield>
  <datafield tag="100" ind1="1" ind2=" ">
    <subfield code="a">Anthony, James R.</subfield>
  </datafield>
  <datafield tag="245" ind1="0" ind2="0">
    <subfield code="a">French baroque music from Beaujoyeulx to Rameau</subfield>
  </datafield>
  <datafield tag="250" ind1=" " ind2=" ">
    <subfield code="a">Rev. and expanded ed.</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="a">Portland, OR :</subfield>
    <subfield code="b">Amadeus Press,</subfield>
    <subfield code="c">1997.</subfield>
  </datafield>
  <datafield tag="300" ind1=" " ind2=" ">
    <subfield code="a">586 p. :</subfield>
    <subfield code="b">music</subfield>
  </datafield>
  <datafield tag="650" ind1=" " ind2="0">
    <subfield code="a">Music</subfield>
    <subfield code="&lt;">France</subfield>
    <subfield code="y">16th century</subfield>
    <subfield code="x">History and criticism.</subfield>
  </datafield>
  <datafield tag="650" ind1=" " ind2="0">
    <subfield code="a">Music</subfield>
    <subfield code="z">France</subfield>
    <subfield code="y">17th century</subfield>
    <subfield code="x">History and criticism.</subfield>
  </datafield>
  <datafield tag="650" ind1=" " ind2="0">
    <subfield code="a">Music</subfield>
    <subfield code="z">France</subfield>
    <subfield code="y">18th century</subfield>
    <subfield code="x">History and criticism.</subfield>
  </datafield>
  <datafield tag="949" ind1=" " ind2=" ">
    <subfield code="a">ML 270.2 A6 1997</subfield>
    <subfield code="w">LC</subfield>
    <subfield code="i">30007006841505</subfield>
    <subfield code="r">Y</subfield>
    <subfield code="t">BOOKS</subfield>
    <subfield code="l">HUNT-CIRC</subfield>
    <subfield code="m">HUNTINGTON</subfield>
  </datafield>
  <datafield tag="596" ind1=" " ind2=" ">
    <subfield code="a">1</subfield>
  </datafield>
</record>
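
For anyone building a test case around this record, the escaping idea discussed in the thread can be sketched as follows. This is only an illustration in Python using the standard library, not the actual Perl patch to MARC::File::XML; the helper name subfield_xml is hypothetical.

```python
from xml.dom import minidom
from xml.sax.saxutils import escape, quoteattr


def subfield_xml(code, value):
    """Build one MARCXML <subfield> element (hypothetical helper).

    quoteattr() escapes &, < and > and wraps the result in quotes;
    escape() does the same escaping for the element's text content.
    """
    return "<subfield code=%s>%s</subfield>" % (quoteattr(code), escape(value))


# The problematic subfield from the record above: a code of "<".
fragment = subfield_xml("<", "France")
print(fragment)

# Round trip: an XML parser should hand back the original code unchanged.
parsed = minidom.parseString(fragment).documentElement
print(parsed.getAttribute("code"))
```

The round trip at the end is the part worth keeping in a test: the escaped output parses cleanly, and the parser returns the original "<" as the subfield code.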

Attachment: test.mrc
Description: Binary data
