Re: more issues - losing data in files

2009-10-08 Thread Edward K. Ream
On Wed, Oct 7, 2009 at 7:04 PM, Matt Wilkie  wrote:


> there is an open source
> library for binary xml which might help with performance on large
> files:


Thanks for this link.  Most of my .leo files have nothing but @thin nodes in
them, so the actual .leo file is small.  I'll keep this in mind, though.  I
suspect some Leo users have very large .leo files.

Edward

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"leo-editor" group.
To post to this group, send email to leo-editor@googlegroups.com
To unsubscribe from this group, send email to 
leo-editor+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/leo-editor?hl=en
-~--~~~~--~~--~--~---



Re: more issues - losing data in files

2009-10-07 Thread Matt Wilkie

> New versions of leo should strip those characters on write, it was
> probably saved with old version of leo.
>
> Cases like this make me lose my faith in xml bit by bit.
>
> I have some preliminary sketches in my head for using either sqlite or
> zip files as tnode storage. This would also help small memory systems
> (read: phones), where only visited tnodes would be actually loaded to
> memory. It would also speed up saving of big files, as only changed
> nodes (and the outline xml) would need to be written.

for what it's worth, and it probably would do nothing to prevent
choking on embedded control characters, there is an open source
library for binary xml which might help with performance on large
files:  "a straightforward, open, patent-unencumbered binary-encoding
format for XML data that is a stand-alone work-alike drop-in
replacement for an XML file that mirrors the XML markup structures in
a way that is similar to the in-memory representations of many parser
libraries" -- http://www.cubewerx.com/bxml

cheers,
-- 
-matt




Re: more issues - losing data in files

2009-10-07 Thread Edward K. Ream
On Wed, Oct 7, 2009 at 11:20 AM, Edward K. Ream  wrote:

>
>
> OTOH, it would seem feasible to attempt error recovery in
> parse_leo_file when xml.sax.SAXParseException is thrown.  We could strip
> ctrl characters from the input, then pass the cleaned text back to sax.
>
> I'll create a bug for this.
>

Done: https://bugs.launchpad.net/leo-editor/+bug/445596

EKR




Re: more issues - losing data in files

2009-10-07 Thread Edward K. Ream
On Tue, Oct 6, 2009 at 3:41 PM, Ville M. Vainio  wrote:

>
> > That document had some text I imported from a report file that had
> > form feed characters in each header.  I had to delete the Ctrl-L
> > characters and it was then okay.
>
> New versions of leo should strip those characters on write, it was
> probably saved with old version of leo.
>

Leo already does this.  The new, correct, code uses
xml.sax.saxutils.escape.  In other words, the problem can only arise when
reading .leo files created by older versions of Leo.
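
For reference, a small sketch of what xml.sax.saxutils.escape does by
itself.  This is not Leo's actual write code: the helper name
escape_for_leo and the sample text are made up for illustration.  As far
as the standard library goes, escape only replaces &, < and >, so
dropping a form feed needs the optional entities mapping (or a separate
strip step).

```python
from xml.sax.saxutils import escape


def escape_for_leo(text, strip_form_feed=True):
    """Hypothetical helper: escape text before writing it into XML.

    escape() always replaces &, > and <.  The optional dict adds plain
    string replacements; mapping form feed to "" removes it.
    """
    extra = {"\x0c": ""} if strip_form_feed else {}
    return escape(text, extra)


if __name__ == "__main__":
    sample = "Totals <2009> & notes\x0c"
    print(repr(escape_for_leo(sample)))
    print(repr(escape_for_leo(sample, strip_form_feed=False)))
```

With strip_form_feed=False the form feed passes through unchanged, which
is the kind of output that sax later rejects.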

>
> Cases like this make me lose my faith in xml bit by bit.
>

Hmm.  Clearly, the problem we are discussing arises because old versions of
Leo did not create proper xml files.  It's not clear to me that sax (or any
other parser) can reliably recover from unexpected characters in the
input.  The problem may be hard, given that unicode is involved :-)

OTOH, it would seem feasible to attempt error recovery in
parse_leo_file when xml.sax.SAXParseException is thrown.  We could strip
ctrl characters from the input, then pass the cleaned text back to sax.

I'll create a bug for this, and experiment.
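
A minimal sketch of that recovery idea, not the real parse_leo_file.  The
names parse_with_recovery and TextCollector are hypothetical, and the
sketch assumes the file is UTF-8 (or ASCII), where bytes below 0x20 never
occur inside a multi-byte character, so stripping them at the byte level
is safe.  Other encodings would need to be decoded first.

```python
import re
import xml.sax

# Control bytes that XML 1.0 does not allow (tab, LF and CR are legal).
_CTRL_BYTES = re.compile(rb"[\x00-\x08\x0b\x0c\x0e-\x1f]")


class TextCollector(xml.sax.ContentHandler):
    """Toy content handler that just gathers character data."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def parse_with_recovery(data, make_handler):
    """Parse bytes with sax; on a parse error, strip control bytes and retry.

    make_handler is called again for the retry, so events delivered
    before the failure are not counted twice.
    Returns (handler, recovered).
    """
    handler = make_handler()
    try:
        xml.sax.parseString(data, handler)
        return handler, False
    except xml.sax.SAXParseException:
        cleaned = _CTRL_BYTES.sub(b"", data)
        if cleaned == data:
            raise  # nothing was stripped, so the error has another cause
    handler = make_handler()
    xml.sax.parseString(cleaned, handler)
    return handler, True


if __name__ == "__main__":
    bad = b"<leo><t>page one\x0cpage two</t></leo>"
    handler, recovered = parse_with_recovery(bad, TextCollector)
    print(recovered, "".join(handler.chunks))
```

The retry happens only once, and only if stripping actually changed the
input; any other well-formedness error is still reported to the caller.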

Edward




Re: more issues - losing data in files

2009-10-06 Thread Ville M. Vainio

On Tue, Oct 6, 2009 at 9:33 PM, Casey (kc)  wrote:
>
> May they all be this easy.
>
> This is one of those RTFM situations where I'm going to have to
> contribute all my pocket change to the kitty.
>
> That document had some text I imported from a report file that had
> form feed characters in each header.  I had to delete the Ctrl-L
> characters and it was then okay.

New versions of leo should strip those characters on write, it was
probably saved with old version of leo.

Cases like this make me lose my faith in xml bit by bit.

I have some preliminary sketches in my head for using either sqlite or
zip files as tnode storage. This would also help small memory systems
(read: phones), where only visited tnodes would be actually loaded to
memory. It would also speed up saving of big files, as only changed
nodes (and the outline xml) would need to be written.

There are some use cases for this, like slurping in a gigantic source
tree in one huge .leo file (possibly as @@auto nodes) for later study
on-the-road. Saving and loading the whole tree every time is a big
drag.
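
A rough sketch of the sqlite variant, only to make the idea concrete.
Nothing here is existing Leo code: the NodeStore class, the tnodes table
and the node_id key are all assumptions.  Bodies are loaded the first
time a node is visited, and save() writes only the nodes that changed.

```python
import sqlite3


class NodeStore:
    """Hypothetical per-node body storage backed by sqlite."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS tnodes "
            "(node_id TEXT PRIMARY KEY, body TEXT)"
        )
        self.cache = {}     # bodies loaded or edited so far
        self.dirty = set()  # node ids changed since the last save

    def get_body(self, node_id):
        # Hit the database only the first time a node is visited.
        if node_id not in self.cache:
            row = self.db.execute(
                "SELECT body FROM tnodes WHERE node_id = ?", (node_id,)
            ).fetchone()
            self.cache[node_id] = row[0] if row else ""
        return self.cache[node_id]

    def set_body(self, node_id, body):
        self.cache[node_id] = body
        self.dirty.add(node_id)

    def save(self):
        # Write only the changed nodes; return how many rows were written.
        rows = [(node_id, self.cache[node_id]) for node_id in self.dirty]
        with self.db:  # commits on success, rolls back on error
            self.db.executemany(
                "INSERT OR REPLACE INTO tnodes (node_id, body) VALUES (?, ?)",
                rows,
            )
        self.dirty.clear()
        return len(rows)


if __name__ == "__main__":
    store = NodeStore()
    store.set_body("n1", "first")
    store.set_body("n2", "second")
    print(store.save())  # both nodes are new
    store.set_body("n2", "second, edited")
    print(store.save())  # only n2 is rewritten
```

The outline structure itself would still live in the .leo xml; only the
body text moves into the database in this sketch.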

-- 
Ville M. Vainio
http://tinyurl.com/vainio




Re: more issues - losing data in files

2009-10-06 Thread Casey (kc)

May they all be this easy.

This is one of those RTFM situations where I'm going to have to
contribute all my pocket change to the kitty.

That document had some text I imported from a report file that had
form feed characters in each header.  I had to delete the Ctrl-L
characters and it was then okay.

Thanks a lot Ed.

--KC

On Oct 6, 3:03 pm, "Edward K. Ream"  wrote:
> On Tue, Oct 6, 2009 at 1:52 PM, Casey (kc)  wrote:
>
> > In another post I described how I reverted back to 4.5.1 in order to
> > get Leo to run again.
>
> > Now I'm having issues that seem peculiar.
> > SAXParseException saying "not well-formed (invalid token)".
>
> See: http://webpages.charter.net/edreamleo/FAQ.html#trouble-shooting
>
> Scroll down until you see
>
> SAXParseException: :123:25: not well-formed (invalid token)
>
> Follow the directions.
>
> Edward



Re: more issues - losing data in files

2009-10-06 Thread Edward K. Ream
On Tue, Oct 6, 2009 at 1:52 PM, Casey (kc)  wrote:

>
> In another post I described how I reverted back to 4.5.1 in order to
> get Leo to run again.
>
> Now I'm having issues that seem peculiar.




> SAXParseException saying "not well-formed (invalid token)".
>

See: http://webpages.charter.net/edreamleo/FAQ.html#trouble-shooting

Scroll down until you see

SAXParseException: :123:25: not well-formed (invalid token)

Follow the directions.
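
The two numbers in that message are the line and column where the parser
gave up.  As a hedged illustration (this is not the FAQ's procedure, and
find_bad_token is a made-up name), the same position can be read from the
exception with the standard library alone:

```python
import xml.sax


def find_bad_token(data):
    """Return (line, column, message) for the first parse error, or None.

    data is the raw bytes of the file.  The line and column come from
    the underlying expat parser.
    """
    try:
        xml.sax.parseString(data, xml.sax.ContentHandler())
    except xml.sax.SAXParseException as err:
        return err.getLineNumber(), err.getColumnNumber(), err.getMessage()
    return None


if __name__ == "__main__":
    # A form feed (Ctrl-L) on the second line of a tiny test document.
    sample = b"<leo>\n<t>a\x0cb</t>\n</leo>"
    print(find_bad_token(sample))
```

Opening the file at the reported line in a plain text editor is usually
enough to spot the offending character.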

Edward
