Re: more issues - losing data in files

2009-10-08 Thread Edward K. Ream
On Wed, Oct 7, 2009 at 7:04 PM, Matt Wilkie map...@gmail.com wrote:


 there is an open source
 library for binary xml which might help with performance on large
 files:


Thanks for this link.  Most of my .leo files have nothing but @thin nodes in
them, so the actual .leo file is small.  I'll keep this in mind, though.  I
suspect some Leo users have very large .leo files.

Edward

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
leo-editor group.
To post to this group, send email to leo-editor@googlegroups.com
To unsubscribe from this group, send email to 
leo-editor+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/leo-editor?hl=en
-~--~~~~--~~--~--~---



Re: more issues - losing data in files

2009-10-07 Thread Edward K. Ream
On Tue, Oct 6, 2009 at 3:41 PM, Ville M. Vainio vivai...@gmail.com wrote:


  That document had some text I imported from a report file that had
  form feed characters in each header.  I had to delete the Ctrl-L
  characters and it was then okay.

 New versions of Leo should strip those characters on write; it was
 probably saved with an old version of Leo.


Leo already does this.  The new, correct, code uses
xml.sax.saxutils.escape.  In other words, the problem can only arise when
reading .leo files created by older versions of Leo.


 Cases like this make me lose my faith in xml bit by bit.


Hmm.  Clearly, the problem we are discussing arises because old versions of
Leo did not create proper xml files.  It's not clear to me that sax (or any
other parser) can reliably recover from unexpected characters in the
input.  The problem may be hard, given that unicode is involved :-)

OTOH, it would seem feasible to attempt error recovery in parse_leo_file
when xml.sax.SAXParseException is thrown.  We could strip control
characters from the input, then pass the cleaned text back to sax.
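
A rough sketch of that recovery step, not Leo's actual code: the names
parse_with_recovery, strip_control_chars and TextCollector are made up for
illustration, and the input is assumed to be UTF-8 bytes.  It removes only
the C0 control characters that XML 1.0 forbids (everything below 0x20
except tab, newline and carriage return), then retries the parse once with
a fresh handler.

```python
import re
import xml.sax

# C0 control characters that XML 1.0 does not allow in a document
# (tab, newline and carriage return are legal and are kept).
_ILLEGAL_XML_CHARS = re.compile(r'[\x00-\x08\x0b\x0c\x0e-\x1f]')


def strip_control_chars(text):
    """Return text with the forbidden control characters removed."""
    return _ILLEGAL_XML_CHARS.sub('', text)


def parse_with_recovery(data, make_handler):
    """Parse UTF-8 bytes with sax; on a parse error, clean the input and retry once.

    make_handler is a callable returning a new ContentHandler, so a failed
    first pass cannot leave partial state in the handler that is returned.
    """
    handler = make_handler()
    try:
        xml.sax.parseString(data, handler)
    except xml.sax.SAXParseException:
        cleaned = strip_control_chars(data.decode('utf-8')).encode('utf-8')
        handler = make_handler()
        # A second failure is not caught here: it propagates to the caller.
        xml.sax.parseString(cleaned, handler)
    return handler


class TextCollector(xml.sax.ContentHandler):
    """Toy handler that collects character data, for demonstration only."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


if __name__ == '__main__':
    bad = b'<leo_file><t>head\x0cline</t></leo_file>'  # contains a form feed
    print(''.join(parse_with_recovery(bad, TextCollector).chunks))
```

The retry is deliberately limited to one attempt, so input that is broken
for some other reason still fails loudly.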

I'll create a bug for this, and experiment.

Edward




Re: more issues - losing data in files

2009-10-07 Thread Edward K. Ream
On Wed, Oct 7, 2009 at 11:20 AM, Edward K. Ream edream...@gmail.com wrote:



 OTOH, it would seem feasible to attempt error recovery in parse_leo_file
 when xml.sax.SAXParseException is thrown.  We could strip control
 characters from the input, then pass the cleaned text back to sax.

 I'll create a bug for this.


Done: https://bugs.launchpad.net/leo-editor/+bug/445596

EKR




Re: more issues - losing data in files

2009-10-07 Thread Matt Wilkie

 New versions of Leo should strip those characters on write; it was
 probably saved with an old version of Leo.

 Cases like this make me lose my faith in xml bit by bit.

 I have some preliminary sketches in my head for using either sqlite or
 zip files as tnode storage. This would also help small memory systems
 (read: phones), where only visited tnodes would be actually loaded to
 memory. It would also speed up saving of big files, as only changed
 nodes (and the outline xml) would need to be written.
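
To make the storage idea above concrete, here is a minimal sketch using
Python's sqlite3 module.  Everything in it is an assumption for
illustration (the NodeStore class, the tnodes table, string node ids);
it is not an existing Leo design.  Bodies are read only when a node is
visited, and a save writes only the nodes changed since the last save.

```python
import sqlite3


class NodeStore:
    """Hypothetical on-demand store for node bodies, keyed by node id."""

    def __init__(self, path=':memory:'):
        self.db = sqlite3.connect(path)
        self.db.execute(
            'CREATE TABLE IF NOT EXISTS tnodes (id TEXT PRIMARY KEY, body TEXT)')
        self.cache = {}     # bodies of the nodes visited so far
        self.dirty = set()  # ids changed since the last save

    def get(self, node_id):
        """Return a node body, reading it from disk on first access only."""
        if node_id not in self.cache:
            row = self.db.execute(
                'SELECT body FROM tnodes WHERE id = ?', (node_id,)).fetchone()
            self.cache[node_id] = row[0] if row else ''
        return self.cache[node_id]

    def set(self, node_id, body):
        """Change a node body in memory and mark it for the next save."""
        self.cache[node_id] = body
        self.dirty.add(node_id)

    def save(self):
        """Write only the changed nodes; return how many were written."""
        rows = [(node_id, self.cache[node_id]) for node_id in self.dirty]
        self.db.executemany(
            'INSERT OR REPLACE INTO tnodes (id, body) VALUES (?, ?)', rows)
        self.db.commit()
        self.dirty.clear()
        return len(rows)
```

The outline structure itself would still have to be stored somewhere; the
sketch covers only the body text.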

for what it's worth, and it probably would do nothing to prevent
choking on embedded control characters, there is an open source
library for binary xml which might help with performance on large
files:  a straightforward, open, patent-unencumbered binary-encoding
format for XML data that is a stand-alone work-alike drop-in
replacement for an XML file that mirrors the XML markup structures in
a way that is similar to the in-memory representations of many parser
libraries -- http://www.cubewerx.com/bxml

cheers,
-- 
-matt




Re: more issues - losing data in files

2009-10-06 Thread Edward K. Ream
On Tue, Oct 6, 2009 at 1:52 PM, Casey (kc) kccol...@gmail.com wrote:


 In another post I described how I reverted back to 4.5.1 in order to
 get Leo to run again.

 Now I'm having issues that seem peculiar.




 SAXParseException saying not well-formed (invalid token).


See: http://webpages.charter.net/edreamleo/FAQ.html#trouble-shooting

Scroll down until you see

SAXParseException: unknown:123:25: not well-formed (invalid token)

Follow the directions.

Edward




Re: more issues - losing data in files

2009-10-06 Thread Casey (kc)

May they all be this easy.

This is one of those RTFM situations where I'm going to have to
contribute all my pocket change to the kitty.

That document had some text I imported from a report file that had
form feed characters in each header.  I had to delete the Ctrl-L
characters and it was then okay.
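
For anyone hitting the same error, a small script along these lines may
help locate the offending characters before deleting them by hand.  It is
only a sketch: find_control_chars is a made-up helper, and the line and
column numbers it reports are 1-based and need not match the position
shown in the SAXParseException message exactly.

```python
import re

# C0 control characters that XML 1.0 forbids (form feed, Ctrl-L, is \x0c).
_ILLEGAL_XML_CHARS = re.compile(r'[\x00-\x08\x0b\x0c\x0e-\x1f]')


def find_control_chars(text):
    """Yield (line, column, character) for each forbidden control character."""
    # Split on '\n' only: str.splitlines() would also break lines at form feeds.
    for line_no, line in enumerate(text.split('\n'), start=1):
        for match in _ILLEGAL_XML_CHARS.finditer(line):
            yield line_no, match.start() + 1, match.group()


if __name__ == '__main__':
    sample = 'first line\nheader\x0c text\n'
    for line_no, col, char in find_control_chars(sample):
        print('line %d, column %d: %r' % (line_no, col, char))
```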

Thanks a lot Ed.

--KC

On Oct 6, 3:03 pm, Edward K. Ream edream...@gmail.com wrote:
 On Tue, Oct 6, 2009 at 1:52 PM, Casey (kc) kccol...@gmail.com wrote:

  In another post I described how I reverted back to 4.5.1 in order to
  get Leo to run again.

  Now I'm having issues that seem peculiar.
  SAXParseException saying not well-formed (invalid token).

 See: http://webpages.charter.net/edreamleo/FAQ.html#trouble-shooting

 Scroll down until you see

 SAXParseException: unknown:123:25: not well-formed (invalid token)

 Follow the directions.

 Edward