On Tue, Oct 6, 2009 at 3:41 PM, Ville M. Vainio <vivai...@gmail.com> wrote:
> > That document had some text I imported from a report file that had
> > form feed characters in each header. I had to delete the Ctrl-L
> > characters and it was then okay.
>
> New versions of leo should strip those characters on write, it was
> probably saved with old version of leo.

Leo already does this. The new, correct, code uses xml.sax.saxutils.escape.
In other words, the problem can only arise when reading .leo files created
by older versions of Leo.

> Cases like this make me lose my faith in xml bit by bit.

Hmm. Clearly, the problem we are discussing arises because old versions of
Leo did not create proper xml files. It's not clear to me that sax (or any
other parser) can reliably recover from unexpected characters in the input.
The problem may be hard, given that unicode is involved :-)

OTOH, it would seem feasible to attempt error recovery in parse_leo_file
when xml.sax.SAXParseException is thrown. We could strip ctrl characters
from the input, then pass the cleaned text back to sax.

I'll create a bug for this, and experiment.

Edward
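
A minimal sketch of the recovery idea described above, not Leo's actual code. It assumes the file contents are available as UTF-8 bytes, so the control characters in question can be removed bytewise without touching multi-byte sequences. The names strip_control_chars, parse_with_recovery, and TextCollector are hypothetical; only xml.sax.parseString, xml.sax.ContentHandler, and xml.sax.SAXParseException come from the standard library. A fresh handler is created for the retry because the first, failed pass may already have delivered partial events.

```python
import re
import xml.sax

# Control characters that XML 1.0 does not allow in a document:
# everything below 0x20 except tab (0x09), LF (0x0A), and CR (0x0D).
# Ctrl-L (form feed) is 0x0C. Assumes UTF-8 input, where these byte
# values never occur inside a multi-byte sequence.
_ILLEGAL_XML_BYTES = re.compile(rb'[\x00-\x08\x0b\x0c\x0e-\x1f]')


def strip_control_chars(data):
    """Return data (bytes) with the illegal control characters removed."""
    return _ILLEGAL_XML_BYTES.sub(b'', data)


def parse_with_recovery(data, make_handler):
    """Parse data with sax; on a parse error, clean the input and retry once.

    make_handler is a zero-argument callable returning a new ContentHandler.
    Returns the handler that saw the successful parse. If the cleaned input
    still fails, the second SAXParseException propagates to the caller.
    """
    handler = make_handler()
    try:
        xml.sax.parseString(data, handler)
    except xml.sax.SAXParseException:
        # Discard the partially filled handler and start over.
        handler = make_handler()
        xml.sax.parseString(strip_control_chars(data), handler)
    return handler


class TextCollector(xml.sax.ContentHandler):
    """Hypothetical handler that collects character data, for illustration."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


if __name__ == '__main__':
    # A body containing a raw form feed, as an old file might.
    old_file = b'<v><t>page one\x0cpage two</t></v>'
    handler = parse_with_recovery(old_file, TextCollector)
    print(''.join(handler.chunks))
```

Retrying only after a failure leaves well-formed files on the normal path. The trade-off is that the stripped characters are silently lost, so a real implementation would probably want to warn the user.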