Re: more issues - losing data in files
On Wed, Oct 7, 2009 at 7:04 PM, Matt Wilkie wrote:

> there is an open source
> library for binary xml which might help with performance on large
> files:

Thanks for this link. Most of my .leo files have nothing but @thin nodes in them, so the actual .leo file is small. I'll keep this in mind, though. I suspect some Leo users have very large .leo files.

Edward

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups "leo-editor" group.
To post to this group, send email to leo-editor@googlegroups.com
To unsubscribe from this group, send email to leo-editor+unsubscr...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/leo-editor?hl=en
-~--~~~~--~~--~--~---
Re: more issues - losing data in files
> New versions of leo should strip those characters on write, it was
> probably saved with old version of leo.
>
> Cases like this make me lose my faith in xml bit by bit.
>
> I have some preliminary sketches in my head for using either sqlite or
> zip files as tnode storage. This would also help small memory systems
> (read: phones), where only visited tnodes would be actually loaded to
> memory. It would also speed up saving of big files, as only changed
> nodes (and the outline xml) would need to be written.

For what it's worth (and it probably would do nothing to prevent choking on embedded control characters), there is an open source library for binary XML which might help with performance on large files:

"a straightforward, open, patent-unencumbered binary-encoding format for XML data that is a stand-alone work-alike drop-in replacement for an XML file that mirrors the XML markup structures in a way that is similar to the in-memory representations of many parser libraries"
-- http://www.cubewerx.com/bxml

cheers,

--
-matt
Re: more issues - losing data in files
On Wed, Oct 7, 2009 at 11:20 AM, Edward K. Ream wrote:

> OTOH, it would seem feasible to attempt error recovery in
> parse_leo_file when xml.sax.SAXParseException is thrown. We could strip
> ctrl characters from the input, then pass the cleaned text back to sax.
>
> I'll create a bug for this.

Done: https://bugs.launchpad.net/leo-editor/+bug/445596

EKR
Re: more issues - losing data in files
On Tue, Oct 6, 2009 at 3:41 PM, Ville M. Vainio wrote:

> > That document had some text I imported from a report file that had
> > form feed characters in each header. I had to delete the Ctrl-L
> > characters and it was then okay.
>
> New versions of leo should strip those characters on write, it was
> probably saved with old version of leo.

Leo already does this. The new, correct, code uses xml.sax.saxutils.escape. In other words, the problem can only arise when reading .leo files created by older versions of Leo.

> Cases like this make me lose my faith in xml bit by bit.

Hmm. Clearly, the problem we are discussing arises because old versions of Leo did not create proper xml files. It's not clear to me that sax (or any other parser) can reliably recover from unexpected characters in the input. The problem may be hard, given that unicode is involved :-)

OTOH, it would seem feasible to attempt error recovery in parse_leo_file when xml.sax.SAXParseException is thrown. We could strip ctrl characters from the input, then pass the cleaned text back to sax.

I'll create a bug for this, and experiment.

Edward
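
A minimal sketch of the recovery idea described above, in Python. It is not Leo's actual code: parse_with_recovery, the make_handler argument, and the TextCollector handler are hypothetical names introduced for illustration. It also assumes the file is UTF-8 (or another ASCII-compatible encoding), so that control bytes can be removed without decoding first.

```python
import re
import xml.sax
from xml.sax.handler import ContentHandler

# Bytes that XML 1.0 does not allow: the C0 control characters,
# except tab (0x09), newline (0x0A) and carriage return (0x0D).
# Assumption: the input is UTF-8, where these byte values never
# appear inside a multi-byte sequence.
_ILLEGAL_XML_BYTES = re.compile(rb"[\x00-\x08\x0b\x0c\x0e-\x1f]")


class TextCollector(ContentHandler):
    """Hypothetical handler that just collects character data."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def parse_with_recovery(data, make_handler):
    """Parse `data` (bytes); on a parse error, strip control bytes and retry once.

    `make_handler` is called to build a fresh handler for each attempt,
    so that state from a failed first pass is not carried over.
    Returns the handler that completed the parse.
    """
    handler = make_handler()
    try:
        xml.sax.parseString(data, handler)
    except xml.sax.SAXParseException:
        cleaned = _ILLEGAL_XML_BYTES.sub(b"", data)
        if cleaned == data:
            # Nothing was stripped, so the error has some other cause.
            raise
        handler = make_handler()
        xml.sax.parseString(cleaned, handler)
    return handler
```

With this sketch, input containing a form feed (Ctrl-L, byte 0x0C) fails on the first pass and succeeds on the second, while a file that is broken for some other reason still raises SAXParseException.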
Re: more issues - losing data in files
On Tue, Oct 6, 2009 at 9:33 PM, Casey (kc) wrote:

> May they all be this easy.
>
> This is one of those RTFM situations where I'm going to have to
> contribute all my pocket change to the kitty.
>
> That document had some text I imported from a report file that had
> form feed characters in each header. I had to delete the Ctrl-L
> characters and it was then okay.

New versions of Leo should strip those characters on write; it was probably saved with an old version of Leo.

Cases like this make me lose my faith in XML bit by bit.

I have some preliminary sketches in my head for using either sqlite or zip files as tnode storage. This would also help small-memory systems (read: phones), where only visited tnodes would actually be loaded into memory. It would also speed up saving of big files, as only changed nodes (and the outline xml) would need to be written.

There are some use cases for this, like slurping in a gigantic source tree in one huge .leo file (possibly as @@auto nodes) for later study on-the-road. Saving and loading the whole tree every time is a big drag.

--
Ville M. Vainio
http://tinyurl.com/vainio
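
A rough sketch of what the sqlite variant of this idea might look like, using Python's standard sqlite3 module. Everything here is an assumption made for illustration: the TnodeStore class, the tnodes table, and the node_id key are hypothetical and do not reflect any existing Leo design. The point is only to show the two properties mentioned above: bodies are loaded on first visit, and a save writes only the nodes that changed.

```python
import sqlite3


class TnodeStore:
    """Hypothetical lazy store for node body text, backed by sqlite."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS tnodes ("
            " node_id TEXT PRIMARY KEY,"
            " body TEXT NOT NULL)"
        )
        self._cache = {}     # node_id -> body, for visited nodes only
        self._dirty = set()  # node_ids changed since the last save

    def get_body(self, node_id):
        """Return the body text, reading it from disk on first visit."""
        if node_id not in self._cache:
            row = self.db.execute(
                "SELECT body FROM tnodes WHERE node_id = ?", (node_id,)
            ).fetchone()
            self._cache[node_id] = row[0] if row else ""
        return self._cache[node_id]

    def set_body(self, node_id, body):
        """Change a body in memory and remember that it needs saving."""
        self._cache[node_id] = body
        self._dirty.add(node_id)

    def save(self):
        """Write only the changed nodes; return how many were written."""
        rows = [(node_id, self._cache[node_id]) for node_id in self._dirty]
        with self.db:  # commits on success, rolls back on error
            self.db.executemany(
                "INSERT OR REPLACE INTO tnodes (node_id, body) VALUES (?, ?)",
                rows,
            )
        self._dirty.clear()
        return len(rows)
```

In this sketch the outline structure itself would still live elsewhere (for example in the outline xml); only the body text moves into the database.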
Re: more issues - losing data in files
May they all be this easy.

This is one of those RTFM situations where I'm going to have to contribute all my pocket change to the kitty.

That document had some text I imported from a report file that had form feed characters in each header. I had to delete the Ctrl-L characters and it was then okay.

Thanks a lot Ed.

--KC

On Oct 6, 3:03 pm, "Edward K. Ream" wrote:
> On Tue, Oct 6, 2009 at 1:52 PM, Casey (kc) wrote:
>
> > In another post I described how I reverted back to 4.5.1 in order to
> > get Leo to run again.
>
> > Now I'm having issues that seem peculiar.
> > SAXParseException saying "not well-formed (invalid token)".
>
> See: http://webpages.charter.net/edreamleo/FAQ.html#trouble-shooting
>
> Scroll down until you see
>
> SAXParseException: :123:25: not well-formed (invalid token)
>
> Follow the directions.
>
> Edward
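
For anyone doing the same cleanup by hand, a small helper along these lines could list where the offending characters are before deleting them. This is only a sketch: find_control_chars is a hypothetical name, and it assumes the file has already been read into a Python string. Lines are counted from 1 and columns from 0 within each line.

```python
import re

# Characters that XML 1.0 does not allow: the C0 controls except
# tab, newline and carriage return. Form feed (Ctrl-L) is 0x0C.
_ILLEGAL_XML_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def find_control_chars(text):
    """Return a list of (line, column, code point) for each illegal character."""
    hits = []
    # Split on "\n" only: str.splitlines() would also break lines at
    # form feeds, which are exactly the characters being looked for.
    for line_no, line in enumerate(text.split("\n"), start=1):
        for match in _ILLEGAL_XML_CHARS.finditer(line):
            hits.append((line_no, match.start(), ord(match.group())))
    return hits
```

A form feed shows up in the result with code point 12.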
Re: more issues - losing data in files
On Tue, Oct 6, 2009 at 1:52 PM, Casey (kc) wrote:

> In another post I described how I reverted back to 4.5.1 in order to
> get Leo to run again.
>
> Now I'm having issues that seem peculiar.
> SAXParseException saying "not well-formed (invalid token)".

See: http://webpages.charter.net/edreamleo/FAQ.html#trouble-shooting

Scroll down until you see

SAXParseException: :123:25: not well-formed (invalid token)

Follow the directions.

Edward