I am not sure I understand what you said :o(
 
If I use the DOM parser provided with the JDK, the \n characters are not represented in the DOM tree as "#TEXT:" nodes.
 
If I keep the same code but use the Xerces parser, each \n becomes a "#TEXT:" node. That should not be the case. As they are not contained in an element (they sit between tags), I would expect them to be dropped, but the ignore-whitespace setting has no effect on them.
 
Lydie.


From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Wednesday, September 27, 2006 15:13
To: [email protected]
Subject: RE: Additionnal #Text node

> <Root>
> <FirstElement>Value</FirstElement>
> </Root>
>
> My DOM tree will look like:
>
> + ELEMENT: Root
> + #TEXT:
> + ELEMENT: FirstElement
> + #TEXT: Value
> + #TEXT:

That is correct. Xerces, and the DOM specification, make no assumptions regarding whether your particular application considers the whitespace to be meaningful or not -- some do, so the parser must present this node unless you have in some way explicitly told it otherwise (by providing a DTD or schema which says that character data is not considered meaningful at this point, *and* by telling the parser that it may suppress whitespace in the "element content" context).
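As a rough illustration of the two conditions above, here is a minimal sketch using the standard JAXP API. It assumes the example document is given an internal DTD declaring Root as element-only content; that DTD, the class name WhitespaceDemo and the method countRootChildren are made up for this example and are not from the original message.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class WhitespaceDemo {
    // The document from the thread, plus a hypothetical internal DTD saying
    // that Root may contain only a FirstElement child (element-only content).
    static final String XML =
          "<!DOCTYPE Root [\n"
        + "  <!ELEMENT Root (FirstElement)>\n"
        + "  <!ELEMENT FirstElement (#PCDATA)>\n"
        + "]>\n"
        + "<Root>\n"
        + "<FirstElement>Value</FirstElement>\n"
        + "</Root>\n";

    // Parses XML and returns the number of child nodes directly under Root.
    static int countRootChildren(boolean ignoreWhitespace) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // The parser needs the content model, so validation is switched on.
        factory.setValidating(true);
        // Explicitly allow the parser to drop whitespace in element content.
        factory.setIgnoringElementContentWhitespace(ignoreWhitespace);
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(XML)));
        return doc.getDocumentElement().getChildNodes().getLength();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("keep whitespace:   " + countRootChildren(false));
        System.out.println("ignore whitespace: " + countRootChildren(true));
    }
}
```

With the parser bundled in the JDK, the first count should be 3 (text, element, text) and the second 1 (the element only). Without a DTD or schema the parser has no way to know that the newlines are not meaningful, so the text nodes stay.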

(The SAX equivalent is the "ignorable whitespace" concept -- which is a misnomer, but which is intended to convey the same distinction of whitespace-in-element-content.)
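For the SAX side, a small sketch of how that distinction shows up in the callbacks, assuming a validating parser and a hypothetical internal DTD for the example document. The class name IgnorableWhitespaceDemo and the method countCharacters are invented for illustration only.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class IgnorableWhitespaceDemo {
    // The document from the thread, plus a hypothetical internal DTD that
    // declares Root as element-only content.
    static final String XML =
          "<!DOCTYPE Root [\n"
        + "  <!ELEMENT Root (FirstElement)>\n"
        + "  <!ELEMENT FirstElement (#PCDATA)>\n"
        + "]>\n"
        + "<Root>\n"
        + "<FirstElement>Value</FirstElement>\n"
        + "</Root>\n";

    // Returns { chars reported as ignorable whitespace, chars reported as text }.
    static int[] countCharacters() throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true); // the DTD's content model is needed
        SAXParser parser = factory.newSAXParser();
        final int[] counts = new int[2];
        parser.parse(new InputSource(new StringReader(XML)), new DefaultHandler() {
            @Override
            public void ignorableWhitespace(char[] ch, int start, int length) {
                counts[0] += length; // whitespace inside element-only content
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                counts[1] += length; // ordinary character data
            }
        });
        return counts;
    }

    public static void main(String[] args) throws Exception {
        int[] counts = countCharacters();
        System.out.println("ignorable: " + counts[0] + ", characters: " + counts[1]);
    }
}
```

Here the two newlines inside Root should arrive through ignorableWhitespace() and the five characters of "Value" through characters(). The whitespace is still reported, just through a different callback, which is why "ignorable" is a misleading name.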
