hi,
reading from the XML spec:
http://www.w3.org/TR/2000/REC-xml-20001006#sec-line-ends
(...) 2.11 End-of-Line Handling (...)
To simplify the tasks of applications, the characters passed to an
application by the XML processor must be as if the XML processor
normalized all line breaks in external parsed entities (including the
document entity) on input, before parsing, by translating both the
two-character sequence #xD #xA and any #xD that is not followed by
#xA to a single #xA character. (...)
i think that, according to this description, the input "#x20 #xA #xD #x20"
should be normalized to "#x20 #xA #xA #x20"; however, it seems that Xerces 2
is incorrectly normalizing it to "#x20 #xA #x20" (or maybe that is correct?)
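to make my reading of the rule concrete, here is a minimal sketch in Python (not Xerces code; the helper names normalize_line_ends and show are made up for this mail). it only assumes the rule quoted above: #xD #xA, and any #xD not followed by #xA, each become a single #xA.

```python
import re

def normalize_line_ends(text: str) -> str:
    """Translate #xD #xA and any lone #xD to a single #xA (my reading of section 2.11)."""
    # "\r\n?" matches CR LF as one unit, or a CR that is not followed by LF.
    return re.sub(r"\r\n?", "\n", text)

def show(text: str) -> str:
    """Render a string in the #xNN notation used in this mail (hypothetical helper)."""
    return " ".join(f"#x{ord(ch):X}" for ch in text)

# the input from the question: #x20 #xA #xD #x20
print(show(normalize_line_ends("\x20\x0A\x0D\x20")))  # prints "#x20 #xA #xA #x20"
```

under this reading the #xD after the #xA is a lone #xD, so it turns into a second #xA instead of disappearing.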
thanks,
alek
ps. for testing i used the attached test2.xml, which contains this:
<t>-#xA-#xD-#xD#xA-#xA#xD-</t>
$ od --format x1 test2.xml
0000000 3c 74 3e 2d 0a 2d 0d 2d 0d 0a 2d 0a 0d 2d 3c 2f
0000020 74 3e
0000022
$ od -c test2.xml
0000000 < t > - \n - \r - \r \n - \n \r - < /
0000020 t >
0000022
so the last sequence, \n\r, should be normalized to \n\n ...
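as a sanity check on the bytes, here is a small Python sketch that rebuilds the file from the hex dump above and applies the rule from section 2.11 to the element content. the regex is my own shorthand for the rule, not anything taken from Xerces, and the variable names are only for illustration.

```python
import re

# hex bytes copied from the "od --format x1" output above
HEX_DUMP = """
3c 74 3e 2d 0a 2d 0d 2d 0d 0a 2d 0a 0d 2d 3c 2f
74 3e
"""

# collapse the dump to single spaces before decoding it
raw = bytes.fromhex(" ".join(HEX_DUMP.split()))

# strip the <t> and </t> tags to get the raw character data
content = raw[len(b"<t>"):-len(b"</t>")]

# CR LF, or a CR not followed by LF, becomes a single LF
expected = re.sub(rb"\r\n?", b"\n", content)

print(content)   # prints b'-\n-\r-\r\n-\n\r-'
print(expected)  # prints b'-\n-\n-\n-\n\n-'
```

so by my reading the parser should report two line feeds before the last "-".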
however, when running the sax.DocumentTracer sample i get this:
>java sax.DocumentTracer test2.xml
setDocumentLocator(locator=org.apache.xerces.parsers.AbstractSAXParser$LocatorProxy@fd13b5)
startDocument()
startElement(uri="",localName="t",qname="t",attributes={})
characters(text="-")
characters(text="\n-")
characters(text="\n-")
characters(text="\n-")
characters(text="\n-")
endElement(uri="",localName="t",qname="t")
endDocument()
the result is the same for xni.DocumentTracer:
>java xni.DocumentTracer test2.xml
startDocument(...)
startElement(element={prefix=null,localpart="t",rawname="t",uri=null},attributes={})
characters(text="-")
characters(text="\n-")
characters(text="\n-")
characters(text="\n-")
characters(text="\n-")
endElement(element={prefix=null,localpart="t",rawname="t",uri=null})
endDocument()
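to pin down where the traces differ from what i expect, here is a short Python sketch that concatenates the reported characters() chunks and compares them with the normalized element content. the chunk list is copied from the traces; the regex is only my reading of section 2.11, and the variable names are illustrative.

```python
import re

# text chunks exactly as reported by both DocumentTracer runs
reported = "".join(["-", "\n-", "\n-", "\n-", "\n-"])

# element content of test2.xml, and what section 2.11 implies for it
raw = "-\n-\r-\r\n-\n\r-"
expected = re.sub(r"\r\n?", "\n", raw)

# number of line feeds seen vs. number expected
print(reported.count("\n"), expected.count("\n"))  # prints "4 5"

# index of the first character where the two strings disagree
first_diff = next(i for i, (a, b) in enumerate(zip(reported, expected)) if a != b)
print(first_diff)  # prints "8"
```

the first mismatch is at the position of the #xD in the final \n\r pair, which looks like that #xD being dropped instead of being translated to #xA.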
<t>-
-
-
-
-</t>