[EMAIL PROTECTED] writes:

>> The problem is that when the sax handler raises an exception,
>> I can't see how to find out why. What I want to do is for
>> DodgyErrorHandler to do something different depending on
>> where we are in the course of parsing. Is there any way
>> to get that information back from xml.sax (or indeed from
>> any other sax handler)?
>
> You can get raw location information, yes. See:
>
> http://www.xml.com/pub/a/2004/11/24/py-xml.html
>
> But I don't think this is enough for you. You also need recovery,
> which you're implementing in crude form.

(If you're referring to the Locator objects), yes, I'm aware that's
possible. But what I want is not my location in the document, but for
the parser to say "this is an error because I am in the middle of a tag
and the document ended", or "I was in the middle of a text section and
the document ended", or "I was in the middle of an attribute value and
the document ended", etc., so that I can then construct a simple end to
the document, inserting quote marks, finishing the tag, and closing all
unclosed tags as appropriate.

I have just realised that I might be able to grab the message that the
exception gives me, look at the expat source code, and work out which
parsing events cause which error messages. That is a bit round the
houses, but I think it ought to work.

> I tend to agree with Magnus that using an SGML parser might be your
> best bet. You might even be able to turn that SGML into XML using a
> tool such as James Clark's SX:
>
> http://www.jclark.com/sp/sx.htm

If I can't get my scheme above to work, I'll have a go. But I was
hoping to do this without requiring additional packages. And in any
case, it doesn't need to be perfectly robust. As long as it handles 99%
of cases, I'll be happy.

--
Dr. Toby White
Dept. of Earth Sciences, Downing Street, Cambridge CB2 3EQ. UK
Email: <[EMAIL PROTECTED]>
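
P.S. A minimal sketch of the "grab the error and work out what it means"
scheme, talking to xml.parsers.expat directly rather than going through
xml.sax. The function name close_truncated is made up for illustration.
It assumes that expat reports "unclosed token" when the input stops
partway through a piece of markup and "no element found" when it stops
in text with elements still open; that matches my reading of expat, but
it is worth checking against the source. As far as I can tell, expat
does not say whether the break was in a tag name or an attribute value,
so this sketch simply drops the partial markup instead of completing it.

```python
import xml.parsers.expat
from xml.parsers.expat import errors


def close_truncated(fragment):
    """Return `fragment` with a crude ending appended if it was cut short."""
    data = fragment.encode("utf-8")
    open_tags = []  # names of elements started but not yet ended

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: open_tags.append(name)
    parser.EndElementHandler = lambda name: open_tags.pop()

    try:
        parser.Parse(data, True)  # True: this is the final chunk of input
        return fragment           # parsed cleanly, nothing to repair
    except xml.parsers.expat.ExpatError as exc:
        message = xml.parsers.expat.ErrorString(exc.code)
        if message == errors.XML_ERROR_UNCLOSED_TOKEN:
            # Assumption: input ended inside markup (tag, attribute value).
            # Discard everything from the start of that partial token.
            data = data[:parser.ErrorByteIndex]
        elif message != errors.XML_ERROR_NO_ELEMENTS:
            # Some other well-formedness error; not a simple truncation.
            raise

    # Close whatever is still open, innermost element first.
    closing = "".join("</%s>" % name for name in reversed(open_tags))
    return data.decode("utf-8") + closing


if __name__ == "__main__":
    print(close_truncated("<doc><p>some text"))
    print(close_truncated('<doc><p class="x'))
```

Any other expat error is re-raised, so input that is broken in some way
other than truncation is not silently "repaired".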