Hi Lars,

When you say both files look the same, have you actually looked at the
byte sequences? If you do, you'll see that there are sequences of 0D 0A
(CR LF) all over the place in JavaScript.java. The XInclude processor is
doing what it's supposed to do: literally including every character in the
document.
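
If you don't have a hex viewer handy, a few lines of Java will show the
difference between the two files. This is only a minimal sketch; the class
and method names are made up for illustration and are not part of Xerces.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper: counts CR LF byte pairs in the files named on the command line.
final class CrLfCheck {

    /** Returns the number of 0x0D 0x0A (CR LF) byte pairs in the data. */
    static int countCrLf(byte[] data) {
        int count = 0;
        for (int i = 0; i + 1 < data.length; i++) {
            if (data[i] == 0x0D && data[i + 1] == 0x0A) {
                count++;
                i++; // skip the LF that belongs to this pair
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        for (String name : args) {
            byte[] data = Files.readAllBytes(Paths.get(name));
            System.out.println(name + ": " + countCrLf(data) + " CR LF pair(s)");
        }
    }
}
```

Running it as "java CrLfCheck.java index.html JavaScript.java" should show
whether one file uses CR LF line endings and the other does not.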

Thanks.

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: [EMAIL PROTECTED]

"Lars Vogel" <[EMAIL PROTECTED]> wrote on 03/12/2008 06:51:28 AM:

> Dear all,
>
> I tested this a little bit more and the behavior is different for
> different input documents. For example, for the file "index.html" the
> result is correct, while for the file "JavaScript.java" the result
> is not correct.
>
> See attachment.
>
> Both files look the same. Can this be a bug? If yes, can someone
> point me to the bug database for Xerces?
>
> Best regards, Lars

> 2008/3/4, Lars Vogel <[EMAIL PROTECTED]>:
> Hi Michael,
>
> so there is no way to avoid this? Best regards, Lars

> 2008/3/4, Michael Glavassevich <[EMAIL PROTECTED]>:
> Lars' example is doing text inclusion:
>
>
> <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" parse="text"
> href="JavaScript.java"/>
>
>
> so the XML 1.0 rules for end-of-line normalization don't apply here. The
> text in "JavaScript.java" is literally included in the document. That
> includes any carriage returns. A serializer will write those as &#13; so
> that they survive the round trip through another parse.
>
> Thanks.
>
> Michael Glavassevich
> XML Parser Development
> IBM Toronto Lab
> E-mail: [EMAIL PROTECTED]
>
> [EMAIL PROTECTED] wrote on 03/03/2008 03:29:34 PM:
>
>
> > &#13; is the carriage return character. Some systems use the &#13;
> > &#10; sequence to break lines (MS systems among others); some just
> > use &#10; (Unix systems, among others), and there are a few rare
> > cases that use something else. XML parsers are able to tolerate any
> > of these on input and will convert them all into &#10;.
> >
> > It is the responsibility of the serializer, when the XML is written
> > back out, to decide which of these representations to use for the
> > generated XML text. In most cases it will use whatever
> > representation is native to that environment -- in our case, we ask
> > Java what the local convention is for line breaks, and we use that
> > unless a special effort is made to use something else.
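
To see which convention Java reports on a given machine, a small sketch like
the following can help. It only reads the standard "line.separator" system
property; the class name is made up, and this is not the serializer's actual
code.

```java
// Hypothetical helper: prints the platform line separator as hex bytes.
final class LineSeparatorInfo {

    /** Formats each character of the string as a two-digit hex value. */
    static String toHex(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String separator = System.getProperty("line.separator");
        System.out.println("line.separator = " + toHex(separator));
    }
}
```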
> >
> > Without more details, I can't tell whether you've got that
> > misconfigured, or if whatever you're passing the generated XML
> > document to isn't handling it properly, or if something else is going
> > on.
> >
> > ______________________________________
> > "... Three things see no end: A loop with exit code done wrong,
> > A semaphore untested, And the change that comes along. ..."
> > -- "Threes" Rev 1.1 - Duane Elms / Leslie Fish
> > (http://www.ovff.org/pegasus/songs/threes-rev-11.html)
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]

>
> [attachment "example.zip" deleted by Michael Glavassevich/Toronto/IBM]

