It's simply a matter of 'walking' the DOM tree and looking at each node:
check that it's a Node.TEXT_NODE, that it has no content (or only
whitespace content), and that it's not something you know you want to keep
(such as legitimate whitespace inside an element).

I believe that when the source XML is indented, the parsed DOM tree ends up
with TEXT_NODEs staggered between the ELEMENT_NODEs (one for each run of
line breaks and indentation).  Something like the diagram below:

                root
      +------+----+-------+-------+
    text   elem   text   elem   text

I'm not sure where the code is that I wrote to remedy this problem (it was
written quite a while ago).  I will take a look, but it's really not that
hard to write your own.  As I've said before, traverse the tree starting at
the document element; look at each text node and determine whether it
contains what you consider to be ignorable whitespace; if it does, remove it
(or add it to a list of nodes to be removed after you're done traversing
the tree).
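
In case it helps, here is a rough sketch of that traversal using only the
standard org.w3c.dom and JAXP classes.  It is not the code I originally
wrote; the class name WhitespaceStripper and the methods parse, strip and
collect are made up for this example, and "ignorable" is assumed to mean
"the text node is empty or contains only whitespace".  It collects the nodes
first and removes them after the walk, so the tree is not modified while it
is being traversed.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, written for illustration only.
public class WhitespaceStripper {

    /** Parses an XML string into a DOM Document (demo helper). */
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    /** Removes whitespace-only text nodes below root; returns the count removed. */
    static int strip(Node root) {
        List<Node> doomed = new ArrayList<>();
        collect(root, doomed);
        // Remove after the walk so sibling links stay valid during traversal.
        for (Node n : doomed) {
            n.getParentNode().removeChild(n);
        }
        return doomed.size();
    }

    /** Depth-first walk that records every empty or whitespace-only text node. */
    private static void collect(Node node, List<Node> doomed) {
        for (Node child = node.getFirstChild(); child != null;
                child = child.getNextSibling()) {
            if (child.getNodeType() == Node.TEXT_NODE) {
                String value = child.getNodeValue();
                // Assumption: whitespace-only text is never wanted.
                if (value == null || value.trim().isEmpty()) {
                    doomed.add(child);
                }
            } else {
                collect(child, doomed);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<root>\n  <a>text</a>\n  <b/>\n</root>");
        Node root = doc.getDocumentElement();
        System.out.println("children before: " + root.getChildNodes().getLength());
        System.out.println("removed: " + strip(root));
        System.out.println("children after: " + root.getChildNodes().getLength());
    }
}
```

Call strip(doc.getDocumentElement()) before handing the document to the
serializer.  If you have elements where whitespace-only text is meaningful,
add your own check in collect() before a node goes on the list; this sketch
does not look at xml:space or at any DTD.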

Good luck!

Brion Swanson

-----Original Message-----
From: Daniel Pfuhl [mailto:[EMAIL PROTECTED]
Sent: Wednesday, May 16, 2001 8:37 AM
To: [EMAIL PROTECTED]
Subject: RE: problems formatting output


Hi

that is the same problem I have. Can you give me a hint
about removing these ignorable whitespace text nodes? I think
that's also my problem, because I was wondering about nodes
I'd never created in my DOM but which were accessible in
a NodeList in my example code :-(
Maybe you can send me a piece of code for removing them or
tell me how to accomplish that.

thanks in advance

daniel


[EMAIL PROTECTED] wrote on 16.05.01:
> It's not necessarily what you're doing wrong, it's simply that the
> OutputFormat and the DOMWriter are not perfect.  They parse in the
> whitespace and send it back out in some odd fashion that I don't quite
> understand.
> 
> I've had this problem as well and I tried to fix it by telling Xerces to
> drop ignorableWhitespace, but that did not work for me.  In the end, I
> created a small bit of code that traverses the DOM tree looking for
> ignorable whitespace text nodes and then removing them.  The output
> then was formatted properly.

------------------------------------------
Daniel Pfuhl
mailto:[EMAIL PROTECTED]


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
