What exactly is invalid about the XML fragment you provided? It seems
to parse correctly with ElementTree:

>>> from xml.etree import ElementTree as ET
>>> e = ET.fromstring("""
... <cities>
...   <city>
...     <name>Tampa</name>
...     <description>A great city ^^ and place to live</description>
...   </city>
...   <city>
...     <name>Clearwater</name>
...     <description>Beautiful beaches</description>
...   </city>
... </cities>
... """)
>>> print ET.tostring(e)
<cities>
  <city>
    <name>Tampa</name>
    <description>A great city ^^ and place to live</description>
  </city>
  <city>
    <name>Clearwater</name>
    <description>Beautiful beaches</description>
  </city>
</cities>
>>>
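For comparison, here is a small sketch of what a genuinely illegal character does to the parser. The `\x1a` below is only my assumption about the kind of character that might be in the real feed (the snippet as posted contains nothing illegal), and the code is written for Python 3 rather than the Python 2 used elsewhere in this thread.

```python
from xml.etree import ElementTree as ET

# Hypothetical stand-in for the vendor data. \x1a is an assumption,
# chosen only because XML 1.0 does not allow it in a document.
bad = "<description>A great city \x1a\x1a and place to live</description>"

try:
    ET.fromstring(bad)
    parsed = True
except ET.ParseError as exc:
    # expat reports the position of the first offending character
    parsed = False
    print("parse failed:", exc)
```

If your real data fails like this, the problem is illegal characters rather than unclosed tags.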
Do you have invalid characters? Unclosed tags? The solution to each of
these problems is different. More info will elicit better solutions.

-Chris

On Fri, Jan 05, 2007 at 01:50:18PM -0800, [EMAIL PROTECTED] wrote:
> I've got an XML feed from a vendor that is not well-formed, and having
> them change it is not an option. I'm trying to figure out how to
> create an error-handler that will ignore the invalid token and continue
> on.
>
> The file is large, so I'd prefer not to put it all in memory or save it
> off and strip out the bad characters before I parse it.
>
> I've included one of the problematic characters in a small XML snippet
> below.
>
> I'm new to Python, and I don't know how to accomplish this. Any help is
> greatly appreciated!
>
> -----------------------------------------------------------------
>
> Here is my code:
>
> from xml.sax import make_parser
> from xml.sax.handler import ContentHandler
> import StringIO
>
> class ErrorHandler:
>     def __init__(self, parser):
>         self.parser = parser
>     def warning(self, msg):
>         print '*** (ErrorHandler.warning) msg:', msg
>     def error(self, msg):
>         print '*** (ErrorHandler.error) msg:', msg
>     def fatalError(self, msg):
>         print msg
>
> class ContentHandler(ContentHandler):
>     def __init__ (self):
>         pass
>     def startElement(self, name, attrs):
>         pass
>     def characters (self, ch):
>         pass
>     def endElement(self, name):
>         pass
>
> xmlstr = """
> <cities>
>   <city>
>     <name>Tampa</name>
>     <description>A great city  and place to live</description>
>   </city>
>   <city>
>     <name>Clearwater</name>
>     <description>Beautiful beaches</description>
>   </city>
> </cities>
> """
> parser = make_parser()
> curHandler = ContentHandler()
> errorHandler = ErrorHandler(parser)
> parser.setContentHandler(curHandler)
> parser.setErrorHandler(errorHandler)
> parser.parse(StringIO.StringIO(xmlstr))
>
> --
> http://mail.python.org/mailman/listinfo/python-list
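If the problem does turn out to be illegal control characters, one possible approach that avoids both loading the whole file and saving a cleaned copy is to filter the data chunk by chunk and feed it to the SAX parser incrementally. As far as I know, expat cannot resume after a fatal error, so filtering before the parser sees the data is likely easier than trying to recover inside an error handler. The following is a rough sketch under stated assumptions, not a tested solution for your feed: it is written for Python 3, the names `ILLEGAL`, `TextCollector`, and `parse_filtered` are made up for illustration, and it assumes the bad characters are C0 control characters.

```python
import io
import re
from xml.sax import make_parser
from xml.sax.handler import ContentHandler

# Characters XML 1.0 does not allow: C0 controls other than tab, LF, CR.
# Assumption: the vendor's bad characters fall in this range.
ILLEGAL = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


class TextCollector(ContentHandler):
    """Hypothetical handler that only gathers character data."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def parse_filtered(stream, handler, chunk_size=64 * 1024):
    """Feed a text stream to a SAX parser in chunks, dropping illegal characters."""
    parser = make_parser()
    parser.setContentHandler(handler)
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        # Only one chunk is held in memory at a time.
        parser.feed(ILLEGAL.sub("", chunk))
    parser.close()


# Small demonstration with an in-memory stream and a tiny chunk size.
handler = TextCollector()
source = io.StringIO(
    "<description>A great city \x1a\x1a and place to live</description>"
)
parse_filtered(source, handler, chunk_size=16)
text = "".join(handler.chunks)
print(repr(text))
```

For a real feed you would pass an open file (or the network response) instead of the `StringIO` object. This does nothing for unclosed tags, which would need a different fix.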