Re: How to force SAX parser to ignore encoding problems

2009-08-07 Thread Stefan Behnel
Łukasz wrote:
 I have a problem with my XML parser (built with the libraries from the
 xml.sax package). When the parser finds an invalid character (in a CDATA
 section), for example �, it throws a SAXParseException.
 
 Is there any way to just ignore this kind of problem? Maybe there is a
 way to set up the parser in a less strict mode?
 
 I know that I can catch this exception, check whether it is this
 kind of problem, and then ignore it, but I am asking about a global
 setting.

The parser from libxml2 that lxml provides has a recovery option, i.e. it
can keep parsing regardless of errors and will drop the broken content.
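
As a rough sketch of what that looks like in practice (the sample document
and variable names below are made up for illustration, and exactly how much
content survives depends on the libxml2 version), the recovery option is
passed as recover=True when creating the parser:

```python
from lxml import etree

# Hypothetical sample input: a control character (0x01) inside a CDATA
# section is not legal in XML 1.0.
broken = b"<root><item>ok</item><item><![CDATA[bad \x01 char]]></item></root>"

# The default (strict) parser rejects the document outright.
try:
    etree.fromstring(broken)
    strict_ok = True
except etree.XMLSyntaxError as exc:
    strict_ok = False
    print("strict parser failed:", exc)

# With recover=True, libxml2 keeps going and drops what it cannot parse.
recovering_parser = etree.XMLParser(recover=True)
root = etree.fromstring(broken, parser=recovering_parser)

# The problems that were skipped over are still recorded on the parser.
for entry in recovering_parser.error_log:
    print("recovered from: line", entry.line, entry.message)
```

Checking error_log after parsing is worthwhile, since a non-empty log means
some of the input may have been silently discarded.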

However, it is *always* better to fix the input, if you can get your hands on it.
Broken XML is *not* XML at all. If you can't fix the source, you can never
be sure that the data you received is in any way complete or even usable.
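
If the producer of the data really cannot be fixed, one last-resort
workaround (not something proposed in this thread, and it has the same
caveat: the result may be incomplete) is to strip the characters that XML 1.0
forbids before handing the text to xml.sax. The helper names below are
hypothetical; the character ranges follow the "Char" production of the
XML 1.0 specification.

```python
import re
import xml.sax

# Matches every character that is NOT allowed by the XML 1.0 "Char" production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_INVALID_XML_CHARS = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_invalid_xml_chars(text):
    """Return text with all characters that are illegal in XML 1.0 removed."""
    return _INVALID_XML_CHARS.sub("", text)


class TextCollector(xml.sax.ContentHandler):
    """Minimal handler that just gathers all character data."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def parse_text(xml_text):
    """Parse xml_text strictly with xml.sax and return its character data."""
    handler = TextCollector()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return "".join(handler.chunks)


if __name__ == "__main__":
    broken = "<root><![CDATA[bad \x01 char]]></root>"
    try:
        parse_text(broken)
    except xml.sax.SAXParseException as exc:
        print("strict parse failed:", exc)
    print(repr(parse_text(strip_invalid_xml_chars(broken))))
```

This only removes illegal code points. It does nothing for wrongly declared
encodings or truncated documents, so it is a patch over the symptom rather
than a fix of the source.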

Stefan
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: How to force SAX parser to ignore encoding problems

2009-07-31 Thread Łukasz
On 31 Jul, 09:28, Łukasz lkrzys...@gmail.com wrote:
 Hi,
 I have a problem with my XML parser (built with the libraries from the
 xml.sax package). When the parser finds an invalid character (in a CDATA
 section), for example ,

After sending this message I noticed that the example invalid characters
are not displayed on some platforms :)
