Since XML does not allow the Unicode character 0 (U+0000), not even as a character reference, those documents are not really XHTML, and any real parser, validating or not, should reject them. You should complain to your data provider, or pre-process the documents to remove or change any offending characters.
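As a rough illustration of the pre-processing route, here is a minimal Python sketch that drops (or substitutes) numeric character references whose code point falls outside the XML 1.0 Char production. The names is_xml_char and scrub_char_refs are made up for this example and are not part of Xerces or Xalan. It assumes the document is available as a string before it reaches the parser, and it only recognises the reference forms XML itself defines (&#N; and &#xN; with a lowercase x).

```python
import re

# Matches decimal (&#0;) and hexadecimal (&#x0;) character references.
# XML only defines the lowercase "x" form for hexadecimal references.
_CHAR_REF = re.compile(r"&#(?:x([0-9A-Fa-f]+)|([0-9]+));")


def is_xml_char(cp):
    """Return True if code point cp is allowed by the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def scrub_char_refs(text, replacement=""):
    """Replace character references to illegal code points (hypothetical helper).

    References to legal characters are left exactly as they were written.
    """
    def fix(match):
        hex_digits, dec_digits = match.groups()
        cp = int(hex_digits, 16) if hex_digits else int(dec_digits)
        return match.group(0) if is_xml_char(cp) else replacement

    return _CHAR_REF.sub(fix, text)
```

Calling scrub_char_refs(page) before handing the text to the parser removes the offending references; passing replacement="?" substitutes a visible marker instead. Note that this works on the raw text, so it would also alter a matching sequence inside a comment or CDATA section, where it is not actually a character reference.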
Dave
From: Raber Chris <[EMAIL PROTECTED]>
Date: 02/16/2002 12:17 PM
To: [EMAIL PROTECTED]
cc: (bcc: David N Bertoni/Cambridge/IBM)
Subject: Re: Handling invalid character references...
Oops, my message was truncated. What I meant to say
is:
Apparently validating parsers like Xerces are going
to reject invalid XML characters like &#0; (i.e. nul).
Unfortunately I have a situation where my input is
html pages that have been morphed into xhtml, and the
xhtml sometimes contains character references such as
these.
Is there a way to configure Xerces/Xalan to either
ignore these characters, or to morph them to something
else?
TIA,
-Chris.
