[EMAIL PROTECTED] writes:

>> The problem is that when the sax handler raises an exception,
>> I can't see how to find out why. What I want to do is for
>> DodgyErrorHandler to do something different depending on
>> where we are in the course of parsing. Is there any way
>> to get that information back from xml.sax (or indeed from
>> any other sax handler?)
>
> You can get raw location information, yes.  See:
>
> http://www.xml.com/pub/a/2004/11/24/py-xml.html
>
> But I don't think this is enough for you.  You also need recovery,
> which you're implementing in crude form.

If you're referring to the Locator objects, then yes, I'm aware
that's possible. But what I want is not my location in the
document, but for the parser to say "this is an error because
I am in the middle of a tag & the document ended", or "I
was in the middle of a text section and the document ended", or
"I was in the middle of an attribute value and the document
ended", etc, so that I can then construct a simple end to the
document, inserting quote marks, finishing the tag, and closing 
all unclosed tags as appropriate.

I have just realised that I might be able to grab the message
that the exception gives me, look at the expat source code 
and work out what parsing events cause which error messages.
Which is a bit round the houses, but I think ought to work.
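
A minimal sketch of that idea, assuming the standard xml.sax expat
reader, which passes expat's error string through as the message of
the SAXParseException. The two message strings in GUESSES are the
ones I'd expect expat to report for a truncated document; the
mapping, the OpenElementTracker class and the diagnose function are
hypothetical names for illustration, not part of xml.sax.

```python
import xml.sax
from xml.sax.handler import ContentHandler

# Assumed mapping from expat's message text to a coarse guess about
# where the parser was when the document ended.
GUESSES = {
    "unclosed token": "inside markup (a tag or an attribute value)",
    "no element found": "in text content, with elements still open",
}


class OpenElementTracker(ContentHandler):
    """Keeps a stack of the elements that have been opened but not closed."""

    def __init__(self):
        super().__init__()
        self.open_elements = []

    def startElement(self, name, attrs):
        self.open_elements.append(name)

    def endElement(self, name):
        self.open_elements.pop()


def diagnose(data):
    """Parse bytes; return (message, guess, open elements) for a failure."""
    tracker = OpenElementTracker()
    try:
        xml.sax.parseString(data, tracker)
    except xml.sax.SAXParseException as exc:
        message = exc.getMessage()
        guess = GUESSES.get(message, "unknown")
        return message, guess, list(tracker.open_elements)
    return None, "well-formed", []


if __name__ == "__main__":
    for sample in (b"<a><b attr='x", b"<a><b>text", b"<a><b/></a>"):
        print(sample, diagnose(sample))
```

The stack of open elements is what would be needed to close all
unclosed tags afterwards. Note that "unclosed token" alone does not
distinguish a half-finished tag from a half-finished attribute
value, so the tail of the input would still have to be inspected.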


> I tend to agree with Magnus that using an SGML parser might be your
> best bet.  You might even be able to turn that SGML into XML using a
> tool such as James Clark's SX:
>
> http://www.jclark.com/sp/sx.htm

If I can't get my scheme above to work, I'll have a go. But I was
hoping to do this without requiring additional packages. And in
any case, it doesn't need to be perfectly robust. As long as it
handles 99% of cases, I'll be happy.

-- 
Dr. Toby White 
Dept. of Earth Sciences, Downing Street, Cambridge CB2 3EQ. UK
Email: <[EMAIL PROTECTED]>
-- 
http://mail.python.org/mailman/listinfo/python-list
