Re: Handling invalid character references...

David N Bertoni/Cambridge/IBM 18 Feb 2002 21:58:01 -0000

Since XML does not allow the Unicode character 0, those documents are not
really XHTML, and any real parser, whether it is validating or not, should
reject them.  You should complain to your data provider, or pre-process the
documents to remove or change any offending characters.
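
As a rough illustration of the pre-processing option, here is a minimal
sketch in Python. It is not a Xerces or Xalan feature, and the names
scrub_char_refs and is_xml_char are made up for this example. It scans for
numeric character references and drops (or substitutes) any that fall
outside the XML 1.0 Char production. It works on the raw text, so it would
also rewrite matching strings inside CDATA sections and comments, and it
does nothing about illegal characters that appear literally rather than as
references.

```python
import re

# Matches decimal (&#0;) and hexadecimal (&#x0;) character references.
# XML only allows a lowercase "x" in the hexadecimal form.
_CHAR_REF = re.compile(r"&#(?:([0-9]+)|x([0-9A-Fa-f]+));")


def is_xml_char(cp):
    """Return True if code point cp is allowed by the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def scrub_char_refs(text, replacement=""):
    """Replace character references to non-XML characters with `replacement`.

    Hypothetical helper: valid references are left exactly as written.
    """
    def fix(match):
        if match.group(1) is not None:
            cp = int(match.group(1), 10)
        else:
            cp = int(match.group(2), 16)
        return match.group(0) if is_xml_char(cp) else replacement

    return _CHAR_REF.sub(fix, text)
```

Running the documents through something like scrub_char_refs(source) before
handing them to the parser should get rid of references such as &#0;, but
fixing the data at the source is still the better answer.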


Dave


Raber Chris <[EMAIL PROTECTED]>
To:      [EMAIL PROTECTED]
cc:      (bcc: David N Bertoni/Cambridge/IBM)
Subject: Re: Handling invalid character references...
02/16/2002 12:17 PM

Oops, my message was truncated. What I meant to say
is:

Apparently validating parsers like Xerces are going
to reject invalid XML characters like &#0; (i.e. NUL).

Unfortunately I have a situation where my input is
HTML pages that have been morphed into XHTML, and the
XHTML sometimes contains character references such as
these.

Is there a way to configure Xerces/Xalan to either
ignore these characters, or to morph them to something
else?

TIA,

-Chris.


