It's simply a matter of walking the DOM tree and looking at each node: check
that it's a Node.TEXT_NODE, that it has no content (or whitespace-only
content), and that it's not something you know you want to keep (such as
legitimate whitespace inside an element).

I believe that when the source document is indented, the parsed DOM tree
ends up with whitespace TEXT_NODEs staggered between the ELEMENT_NODEs.
Something like the diagram below:

                 root
                   |
    +------+------+------+------+
    |      |      |      |      |
   text   elem   text   elem   text

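To see that staggering for yourself, here is a minimal sketch that parses a small indented document with the standard JAXP parser and lists the node types under the root. The class name ChildKinds, the method describeChildren and the sample XML are made up for illustration; they are not from the code I mentioned.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, for illustration only.
public class ChildKinds {

    /** Lists the node types of the document element's children, in order. */
    public static String describeChildren(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            NodeList kids = doc.getDocumentElement().getChildNodes();
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < kids.getLength(); i++) {
                if (i > 0) {
                    sb.append(' ');
                }
                short type = kids.item(i).getNodeType();
                if (type == Node.TEXT_NODE) {
                    sb.append("text");
                } else if (type == Node.ELEMENT_NODE) {
                    sb.append("elem");
                } else {
                    sb.append("other");
                }
            }
            return sb.toString();
        } catch (Exception e) {
            // Wrap checked parser exceptions to keep the sketch short.
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // The line breaks and indentation become whitespace-only text nodes.
        String xml = "<root>\n  <a/>\n  <b/>\n</root>";
        System.out.println(describeChildren(xml)); // prints "text elem text elem text"
    }
}
```

The same document written on one line with no indentation would give just "elem elem", which is why the extra nodes surprise people.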
I'm not sure where the code I wrote to remedy this problem is (it was
written quite a while ago). I will take a look, but it's really not that
hard to write your own. As I've said before, traverse the tree starting at
the document element; look at each text node and determine whether it
contains what you consider to be ignorable whitespace; if it does, remove it
(or add it to a list of nodes to be removed once you're done traversing the
tree).
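
In the meantime, the traversal described above can be sketched roughly as follows. This is not my original code, just a minimal version written against the standard org.w3c.dom interfaces. The class and method names are made up, and the rule in isIgnorable (keep whitespace that is the only child of its element) is only my assumption about what counts as "legitimate" whitespace; adjust it to your own documents.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper, for illustration only.
public class WhitespaceStripper {

    /** Removes ignorable whitespace text nodes found anywhere below root. */
    public static void stripIgnorableWhitespace(Node root) {
        List<Node> toRemove = new ArrayList<>();
        collect(root, toRemove);
        // Remove only after the walk, so the sibling links used while
        // traversing are not changed underneath us.
        for (Node text : toRemove) {
            text.getParentNode().removeChild(text);
        }
    }

    private static void collect(Node node, List<Node> toRemove) {
        for (Node child = node.getFirstChild(); child != null;
                child = child.getNextSibling()) {
            if (child.getNodeType() == Node.TEXT_NODE) {
                if (isIgnorable(child)) {
                    toRemove.add(child);
                }
            } else {
                collect(child, toRemove);
            }
        }
    }

    /**
     * ASSUMPTION: a text node is ignorable when it is empty or whitespace-only
     * and has at least one sibling. Whitespace that is the sole content of an
     * element, such as the space in "<sep> </sep>", is kept.
     */
    public static boolean isIgnorable(Node text) {
        String value = text.getNodeValue();
        boolean blank = value == null || value.trim().isEmpty();
        boolean onlyChild = text.getPreviousSibling() == null
                && text.getNextSibling() == null;
        return blank && !onlyChild;
    }

    /** Parses an XML string; checked exceptions are wrapped for brevity. */
    public static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<root>\n  <a>hi</a>\n  <sep> </sep>\n</root>");
        Node root = doc.getDocumentElement();
        System.out.println(root.getChildNodes().getLength()); // prints 5
        stripIgnorableWhitespace(root);
        System.out.println(root.getChildNodes().getLength()); // prints 2
    }
}
```

Collecting the nodes first and removing them afterwards is the "list of nodes to be removed" variant; it avoids having to think about what removing a node does to the traversal in progress.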
Good luck!
Brion Swanson
-----Original Message-----
From: Daniel Pfuhl [mailto:[EMAIL PROTECTED]
Sent: Wednesday, May 16, 2001 8:37 AM
To: [EMAIL PROTECTED]
Subject: RE: problems formatting output
Hi
that is the same problem I have. Can you give me a hint
about removing these ignorable whitespace text nodes? I think
that's also my problem, because I was wondering about nodes
I'd never created in my DOM but which were accessible in
a NodeList in my example code :-(
Maybe you can send me a piece of code for removing them or
tell me how to accomplish that.
thanks in advance
daniel
[EMAIL PROTECTED] wrote on 16.05.01:
> It's not necessarily what you're doing wrong, it's simply that the
> OutputFormat and the DOMWriter are not perfect. They parse in the
> whitespace and send it back out in some odd fashion that I don't quite
> understand.
>
> I've had this problem as well and I tried to fix it by telling Xerces to
> drop ignorableWhitespace, but that did not work for me. In the end, I
> created a small bit of code that traverses the DOM tree looking for
> ignorable whitespace text nodes and then removing them. The output was
> then formatted properly.
------------------------------------------
Daniel Pfuhl
mailto:[EMAIL PROTECTED]
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]