Does Xerces handle double-byte characters when it parses a document? I have
UTF-8 specified as the encoding, but when I parse a document containing Kanji
characters in element and attribute values, I get the following error:
Date:        01/09/10  15:14
Class:       com.ibm.emms.cptk.Validator
Method:      validate
  MSG#SAXParseError Xerces reports a parsing problem.
  org.xml.sax.SAXParseException: The element type "source" must be terminated by the matching end-tag "</source>".
     at org.apache.xerces.framework.XMLParser.reportError(XMLParser.java:1196)
     at org.apache.xerces.framework.XMLDocumentScanner.reportFatalXMLError(XMLDocumentScanner.java:635)
     at org.apache.xerces.framework.XMLDocumentScanner.abortMarkup(XMLDocumentScanner.java:684)
     at org.apache.xerces.framework.XMLDocumentScanner$ContentDispatcher.dispatch(XMLDocumentScanner.java:1192)
     at org.apache.xerces.framework.XMLDocumentScanner.parseSome(XMLDocumentScanner.java:381)
     at org.apache.xerces.framework.XMLParser.parse(XMLParser.java:1081)
     at org.apache.xerces.framework.XMLParser.parse(XMLParser.java:1122)

Here is an example of where the Kanji characters appear:
            <title>·‚Ó‚Ÿ‚ª‘?</title>
            <sortTitle>‚½‚Í‚¢‚ ‚³‚ ‚ª</sortTitle>
            
<creatorString>‚ª‚Í‰æ‰Æ‚┃‚¤Œá‚ª‚Í‚ª‚͍‚</creatorString>

            <description>‚Ó‚Ÿ‰æ‰Æ‚Í</description>
            <basicPublishingMetadata>
                <source>ŠGƒ_ƒCƒA‰ß‘ÓÂŒŠ‚ ‚ ‚ç‚™</source>
            </basicPublishingMetadata>
            <itemMetadataList count="1">
                <itemMetadata>
                    <identifier scheme="doi">kj</identifier>
                    <title>All fields in congi chars</title>
                    <sortTitle>‚ ‚Ÿ‚¢‚ ‚â‚ç</sortTitle>
                    <creatorString>‚ ‚Ó‚Ÿ‚ª‚Í</creatorString>
                    
<description>‚瑼’ƒ‚Ó‚¥‚Ó‚Ÿ’ƒ</description>
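
One check that might help narrow this down: an error like this could arise if the bytes in the file are not actually UTF-8 (for example, if the file was saved as Shift_JIS) even though the declaration says UTF-8. That is only an assumption about the cause, not something I have confirmed. The sketch below uses the standard java.nio.charset decoder to test whether a byte sequence is well-formed UTF-8. The class name EncodingCheck and the sample bytes are hypothetical and are not part of Xerces or of our application.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only.
class EncodingCheck {

    // Returns true only if the bytes decode cleanly as UTF-8.
    static boolean isValidUtf8(byte[] bytes) {
        // REPORT makes the decoder throw instead of silently
        // substituting replacement characters for bad input.
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Hiragana "a" (U+3042) as Shift_JIS bytes: not well-formed UTF-8.
        byte[] shiftJisBytes = {(byte) 0x82, (byte) 0xA0};
        // The same character encoded as UTF-8.
        byte[] utf8Bytes = {(byte) 0xE3, (byte) 0x81, (byte) 0x82};

        System.out.println(isValidUtf8(shiftJisBytes)); // false
        System.out.println(isValidUtf8(utf8Bytes));     // true
    }
}
```

To run the same check on the real document, the bytes could be read with java.nio.file.Files.readAllBytes and passed to isValidUtf8. If the result is false, the encoding declaration and the actual file encoding probably disagree.
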
Thanks.
Sally Nemes

EMMS Subsystem Development, IBM Software Group
Internet Mail: [EMAIL PROTECTED]
T/L 975-2872,  External (561) 862-2872
IMAD 4181
8051 Congress Avenue
Boca Raton, Florida 33487
