On Thu, Nov 3, 2011 at 10:27 AM, Terry Brown <terry_n_br...@yahoo.com> wrote:
> Probably not relevant to Leo import export, but javascript often
> contains things not valid in HTML, notably < and &

Yikes.

> Which raises the question of
> CDATA http://www.w3schools.com/xml/xml_cdata.asp
> strictly speaking Leo should not parse anything in a CDATA block.

Ok. I'll put this on the list for someday.

> I wasn't suggesting using ElementTree / lxml for parsing, just that
> the .text and .tail attribute model allows complete representation of
> HTML's "pernicious mixed content".
> http://www.thaiopensource.com/relaxng/design.html#section:11

Thanks for this.

In any event, my initial enthusiasm for the scanner-based approach was unfounded. I had forgotten to remove the code that completely ignores whitespace. Once I removed it, the original whitespace failure reappeared! [Sounds of teeth gnashing.]

This is a really nasty problem. Somehow the html code generator must find a way around it. Either that, or pretend that the importer is allowed to insert whitespace in some cases.

In short, the present html importer is being overly persnickety. I don't know how to cure that problem without gutting a significant part of the import check...

Edward
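
For reference, here is a minimal sketch of the .text / .tail model mentioned above, using Python's standard xml.etree.ElementTree. The sample fragment and the helper name describe() are made up for illustration, and this is not Leo's importer code. It only shows that mixed content, including the whitespace between tags, is kept on the elements and survives a round trip, and that a CDATA block comes through as plain character data.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment with mixed content: text and child elements
# interleaved, with a newline and indentation after </b>.
source = "<p>Hello <b>bold</b>\n  and <i>italic</i> world.</p>"
root = ET.fromstring(source)


def describe(elem):
    """Return (tag, text, tail) for elem and its descendants, in document order."""
    return [(e.tag, e.text, e.tail) for e in elem.iter()]


parts = describe(root)
for tag, text, tail in parts:
    # .text is the text before the first child; .tail is the text that
    # follows the element's closing tag, up to the next sibling or parent end.
    print(tag, repr(text), repr(tail))

# Serializing again reproduces the original string, whitespace included.
round_trip = ET.tostring(root, encoding="unicode")

# A CDATA section is delivered as ordinary character data, so < and &
# inside it are never treated as markup.
script = ET.fromstring("<script><![CDATA[if (a < b && c) run();]]></script>")
cdata_text = script.text
```

Because every run of characters lands in either a .text or a .tail slot, nothing between tags is lost, which is the property a whitespace-sensitive round-trip check depends on.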