On 04/02/2012 18:03, Rob Oakes wrote:
Dear eLyXer Users and Developers,

I'm still at work on the import/export module for Microsoft Word documents. I'm 
making pretty good progress. I've got a rough prototype that works pretty well 
and I'm now starting to refine it.

My approach up to now has been to use regular expressions to match portions of 
the document and then use a library to translate those to the corresponding 
Word XML structures. It's working pretty well with my simple test documents.
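
As an illustration of the regex approach described above, here is a minimal sketch, not the actual prototype. It assumes flat `\begin_layout ... \end_layout` blocks, and it uses Python's standard `xml.etree.ElementTree` in place of whatever Word library is really used. The names `LAYOUT_RE`, `STYLE_MAP` and `lyx_to_word_paragraphs` are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# WordprocessingML main namespace (the "w:" prefix in document.xml).
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Matches one flat LyX layout block: \begin_layout <Name> ... \end_layout
LAYOUT_RE = re.compile(r"\\begin_layout (\w+)\n(.*?)\n\\end_layout", re.DOTALL)

# Hypothetical mapping from LyX layout names to Word paragraph style ids.
STYLE_MAP = {"Title": "Title", "Section": "Heading1", "Standard": "Normal"}


def lyx_to_word_paragraphs(lyx_text):
    """Turn each matched layout block into a <w:p> element."""
    paragraphs = []
    for name, body in LAYOUT_RE.findall(lyx_text):
        p = ET.Element(f"{{{W_NS}}}p")
        ppr = ET.SubElement(p, f"{{{W_NS}}}pPr")
        style = ET.SubElement(ppr, f"{{{W_NS}}}pStyle")
        style.set(f"{{{W_NS}}}val", STYLE_MAP.get(name, "Normal"))
        run = ET.SubElement(p, f"{{{W_NS}}}r")
        text = ET.SubElement(run, f"{{{W_NS}}}t")
        # Collapse the hard line wraps LyX stores inside a paragraph.
        text.text = " ".join(body.split())
        paragraphs.append(p)
    return paragraphs


sample = (
    "\\begin_layout Section\nIntroduction\n\\end_layout\n\n"
    "\\begin_layout Standard\nHello\nworld.\n\\end_layout\n"
)
paragraphs = lyx_to_word_paragraphs(sample)
```

A pattern like this stops at the first `\end_layout` it meets, so insets that contain their own layouts (footnotes, tables, floats) would be cut short. That is the kind of case where a real document model helps.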

Before going too far with this approach, though, I wanted to post another 
general query.

In the eLyXer library, there is already a robust set of tools used for 
converting LyX documents to HTML. Does anyone know if the library is written in 
such a way that getting a generic in-memory representation of the document 
would be possible? It would be awesome to reuse as much existing code for the 
Word document export as possible. That would allow me to support a broader 
range of features and give me a framework for working with maths.
Strong suggestion: use LyX proper. I am quite sure you already know this, because I saw some patches from you in this area, but I'll explain anyway: LyX's own HTML export is so good and fast because it works directly on the in-memory representation of the document. You can't be faster or more accurate than that, unless you want to rewrite LyX in Python.

IIUC, you want a single Python module for both import and export, but I don't think this is a valid argument. As for the Word-to-LyX format conversion, if you want to use this epub library, I'm sure there must be a way to use it from C++...

Any thoughts Alex (and others)? I've downloaded the sources and have begun to 
work through them, but before spending hours to days trying to wrap my head 
around them, I thought I would ask.

AFAIK, eLyXer doesn't construct a document model. So you'd better spend this time reading LyX's C++ code for exporting to HTML/XHTML ;-)
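
To make "document model" concrete: the idea is that the document is parsed once into a tree, and every exporter walks that same tree. The sketch below is purely hypothetical and is not taken from LyX or eLyXer; `Node`, `to_html` and `to_plain` are made-up names, and a Word exporter would be a third function of the same shape.

```python
from dataclasses import dataclass, field
from html import escape


@dataclass
class Node:
    """One element of a toy document tree (hypothetical, not LyX's classes)."""
    kind: str
    text: str = ""
    children: list = field(default_factory=list)


# Hypothetical mapping from node kinds to HTML tags.
HTML_TAGS = {"document": "body", "section": "h1", "paragraph": "p"}


def to_html(node):
    """Render the tree as HTML, escaping text content."""
    inner = escape(node.text) + "".join(to_html(c) for c in node.children)
    tag = HTML_TAGS.get(node.kind, "div")
    return f"<{tag}>{inner}</{tag}>"


def to_plain(node):
    """Render the same tree as plain text, one node per line."""
    parts = [node.text] if node.text else []
    parts += [to_plain(c) for c in node.children]
    return "\n".join(parts)


doc = Node("document", children=[
    Node("section", "Intro"),
    Node("paragraph", "a < b"),
])
html_output = to_html(doc)
plain_output = to_plain(doc)
```

Both exporters share the parsing work and only differ in how they visit the nodes.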

Abdel.
