Re: eLyXer for Document Parsing

2012-02-04 Thread slitt
On Sat, 4 Feb 2012 14:00:24 -0700
Rob Oakes  wrote:

> Hi Steve,
[clip]
> > One more question: You sure you want to go in-memory? What happens
> > if a guy has a 1200 page book with 100 chapters each containing 10
> > sections, each containing 10 subsections, and tries to parse it on
> > a machine with 512 MB RAM? 
> 
> I pity this poor man's decision to convert the whole mess to Word,
> rather than splitting it out into individual chapters.
> 
> But, I appreciate the voice of reason, sanity and best
> practice. Short answer: no, I'm not convinced that I want to go in
> memory. My first pass was just to become comfortable with eLyXer
> to see if it might meet my needs. I'm still trying to get comfortable
> with the structure of LyX documents and .docx documents. I've found a
> nice little Python library with support for basic docx features and
> was going to try to refine that into something slightly more usable.
> 
> > You in a heap of trouble son. He'll be swapped half way into the
> > next century. If instead you used an event parser (e.g. SAX) with a
> > few stacks, it will probably be slower, and it will be much
> > harder to write, but for practical purposes there won't be an upper
> > limit on input file size.
> 
> Good points. The Python library makes use of lxml, which supports
> SAX. After I've got a better handle on my constraints, I'll spend the
> time required to design something more robust.

On my lyx2kindle program
(http://www.troubleshooters.com/projects/lyx2kindle/) I used Python's
HTMLParser, an event-driven parser. It was easy, though I think your lxml
idea would be faster with big documents. For my 11K word book "Rules of
the Happiness Highway", conversion took maybe a second. Anyway, my
lyx2kindle.py illustrates the use of HTMLParser and of a stack to keep
track of levels and maintain a poor man's state machine, and another
part of it implements the kludge of the century.
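
For readers who want to see the stack idea in miniature, here is a rough
sketch. It is not taken from lyx2kindle.py; the class name OutlineParser
and the sample HTML are made up for illustration, and it assumes Python 3,
where HTMLParser lives in the html.parser module.

```python
from html.parser import HTMLParser


class OutlineParser(HTMLParser):
    """Collect headings while tracking the open tags on a stack."""

    HEADINGS = {"h1": 1, "h2": 2, "h3": 3}
    VOID = {"br", "hr", "img", "meta", "link", "input"}  # never closed

    def __init__(self):
        super().__init__()
        self.stack = []    # currently open tags, innermost last
        self.outline = []  # (level, title) pairs found so far
        self._text = []    # text pieces of the heading being read

    def handle_starttag(self, tag, attrs):
        if tag in self.VOID:
            return
        self.stack.append(tag)
        if tag in self.HEADINGS:
            self._text = []

    def handle_endtag(self, tag):
        if tag not in self.stack:
            return  # stray closing tag: ignore it
        if tag in self.HEADINGS:
            title = "".join(self._text).strip()
            self.outline.append((self.HEADINGS[tag], title))
        # Pop back to the matching open tag; tolerates sloppy nesting.
        while self.stack.pop() != tag:
            pass

    def handle_data(self, data):
        # The "state" is simply whatever is on the stack right now.
        if any(t in self.HEADINGS for t in self.stack):
            self._text.append(data)


parser = OutlineParser()
parser.feed("<html><body><h1>Intro</h1><p>text<br>more</p>"
            "<h2>Setup <em>now</em></h2></body></html>")
parser.close()
print(parser.outline)
```

Because feed() can be called repeatedly with successive chunks of a file,
the parser never needs the whole document in memory at once.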

SteveT


errors typesetting UserGuide.lyx

2012-02-04 Thread Richard Talley
I had been running MacTeX 2009 and LyX 1.6.8 and using them extensively
for technical documentation and homework assignments.

I upgraded this morning to MacTeX 2011 (using the TeX Live Utility to bring
all the packages up to date) and LyX 2.0.2.

I tried to create a PDF of the UserGuide (located in
/Applications/LyX.app/Contents/Resources/doc/; it opens as read-only).

Four eps files are generating 'Error converting to loadable format' in the
LyX document. The LaTeX error dialog shows:

Package pdftex.def Error: File
`34_Applications_LyX_app_Contents_Resources_doc_clipart_mobius.pdf}

Using draft setting for this image.

Try typing <return> to proceed.

If that doesn't work, type X <return> to quit.



- OR -


Package pdftex.def Error: File
`42_Applications_LyX_app_Contents_Resources_doc_clipart_escher-lsd.pdf}}

\hfill{}\subfloat[\label{f...



So my system doesn't seem to be converting EPS files correctly into PDFs
for LyX to use. Any suggestions for troubleshooting? I've saved the
complete LaTeX log, but it's huge and I'm not sure which parts are relevant.


-- Rich

MacBook Pro/OS X 10.6.8


Re: eLyXer for Document Parsing

2012-02-04 Thread Rob Oakes
Hi Steve,

> Not only possible but easy if you do things the Steve Litt way. eLyXer
> quickly punches out HTML that's clean enough to read with an XML
> parser, I think. So, eLyXer converts to HTML, and then your program's
> DOMbuilder module converts that HTML to in-memory DOM. No muss, no
> fuss, no bother, no picking apart eLyXer code (it's big and not
> immediately obvious, not a single weekend task).

Thanks for the recommendations. I'll need to look into this further. It's 
definitely the easiest way to go, and easy is usually the best. So says the Zen 
of Python (sort of):

If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

I was hoping for a slightly more direct route, though. That would allow me to 
maintain some of the internal data, such as cross-links. But, as I don't have 
months to implement, easy is always better than hard.

> One more question: You sure you want to go in-memory? What happens if a
> guy has a 1200 page book with 100 chapters each containing 10 sections,
> each containing 10 subsections, and tries to parse it on a machine with
> 512 MB RAM?

I pity this poor man's decision to convert the whole mess to Word, rather than 
splitting it out into individual chapters.

But, I appreciate the voice of reason, sanity and best practice. Short
answer: no, I'm not convinced that I want to go in memory. My first pass was
just to become comfortable with eLyXer to see if it might meet my needs. I'm
still trying to get comfortable with the structure of LyX documents and .docx
documents. I've found a nice little Python library with support for basic docx
features and was going to try to refine that into something slightly more usable.

> You in a heap of trouble son. He'll be swapped half way into the next
> century. If instead you used an event parser (e.g. SAX) with a few
> stacks, it will probably be slower, and it will be much harder to
> write, but for practical purposes there won't be an upper limit on
> input file size.

Good points. The Python library makes use of lxml, which supports SAX. After
I've got a better handle on my constraints, I'll spend the time required to
design something more robust.

Cheers,

Rob

Re: eLyXer for Document Parsing

2012-02-04 Thread slitt
On Sat, 4 Feb 2012 10:03:00 -0700
Rob Oakes  wrote:

> Dear eLyXer Users and Developers,
> 
> I'm still at work on the import/export module for Microsoft Word
> documents. I'm making pretty good progress. I've got a rough
> prototype that works pretty well and I'm now starting to refine it.
> 
> My approach up to now has been to use regular expressions to match
> portions of the document and then use a library to translate those to
> the corresponding Word XML structures. It's working pretty well with
> my simple test documents.
> 
> Before going too far with this approach, though, I wanted to post
> another (general) query.
> 
> In the eLyXer library, there is already a robust set of tools used
> for converting LyX documents to HTML. Does anyone know if the library
> is written in such a way that getting a generic in-memory
> representation of the document would be possible? It would be awesome
> to re-use as much existing code for the Word document export as
> possible. That would allow me to support a broader range of
> features, and give me a framework for working with maths.
> 
> Any thoughts Alex (and others)? I've downloaded the sources and have
> begun to work through them, but before spending hours to days trying
> to wrap my head around them, I thought I would ask.


This is obviously an Alex question, so I'll go ahead and answer it :-)

Not only possible but easy if you do things the Steve Litt way. eLyXer
quickly punches out HTML that's clean enough to read with an XML
parser, I think. So, eLyXer converts to HTML, and then your program's
DOMbuilder module converts that HTML to in-memory DOM. No muss, no
fuss, no bother, no picking apart eLyXer code (it's big and not
immediately obvious, not a single weekend task).
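
A minimal sketch of that second step, assuming the HTML coming out of
eLyXer is well-formed enough for a strict XML parser (untested here; a
stray named entity such as &nbsp; would make it fail). The sample markup
and the helper names build_dom and text_of are invented for illustration
and say nothing about what eLyXer actually emits.

```python
from xml.dom import minidom

# Stand-in for a file produced by eLyXer; the markup is hypothetical.
SAMPLE = """<html><body>
<h1>Introduction</h1>
<p>See <a href="#sec-2">the next section</a>.</p>
<h1 id="sec-2">Method <i>and</i> data</h1>
</body></html>"""


def build_dom(xhtml_text):
    """Parse well-formed (X)HTML text into an in-memory DOM tree."""
    return minidom.parseString(xhtml_text)


def text_of(node):
    """Concatenate all text found below a DOM node."""
    parts = []
    for child in node.childNodes:
        if child.nodeType == child.TEXT_NODE:
            parts.append(child.data)
        elif child.nodeType == child.ELEMENT_NODE:
            parts.append(text_of(child))
    return "".join(parts)


dom = build_dom(SAMPLE)
headings = [text_of(h) for h in dom.getElementsByTagName("h1")]
links = [a.getAttribute("href") for a in dom.getElementsByTagName("a")]
print(headings, links)
```

Cross-links survive this route only to the extent that they show up as
href and id attributes in the HTML.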

One more question: You sure you want to go in-memory? What happens if a
guy has a 1200 page book with 100 chapters each containing 10 sections,
each containing 10 subsections, and tries to parse it on a machine with
512 MB RAM? You in a heap of trouble son. He'll be swapped half way into
the next century. If instead you used an event parser (e.g. SAX) with a
few stacks, it will probably be slower, and it will be much harder to
write, but for practical purposes there won't be an upper limit on input
file size.
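
To make the memory argument concrete, here is a small sketch using the
xml.sax module from Python's standard library. The handler keeps only a
stack of open element names and a few counters, so its state grows with
nesting depth rather than with document size. The class name DepthCounter,
the scan helper, and the book/chapter/section sample are assumptions made
up for this example.

```python
import io
import xml.sax


class DepthCounter(xml.sax.ContentHandler):
    """Count elements per nesting depth using a stack, no tree."""

    def __init__(self):
        super().__init__()
        self.stack = []      # names of the currently open elements
        self.per_depth = {}  # depth -> number of elements seen there
        self.max_depth = 0

    def startElement(self, name, attrs):
        self.stack.append(name)
        depth = len(self.stack)
        self.per_depth[depth] = self.per_depth.get(depth, 0) + 1
        self.max_depth = max(self.max_depth, depth)

    def endElement(self, name):
        self.stack.pop()


def scan(stream):
    """Run the handler over a file name or file-like object."""
    handler = DepthCounter()
    xml.sax.parse(stream, handler)
    return handler


SAMPLE = b"<book><chapter><section/><section/></chapter><chapter/></book>"
result = scan(io.BytesIO(SAMPLE))
print(result.per_depth, result.max_depth)
```

Passing a file name instead of the BytesIO object makes the parser read
from disk as it goes.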

SteveT


eLyXer for Document Parsing

2012-02-04 Thread Rob Oakes
Dear eLyXer Users and Developers,

I'm still at work on the import/export module for Microsoft Word documents. I'm 
making pretty good progress. I've got a rough prototype that works pretty well 
and I'm now starting to refine it.

My approach up to now has been to use regular expressions to match portions of 
the document and then use a library to translate those to the corresponding 
Word XML structures. It's working pretty well with my simple test documents.

Before going too far with this approach, though, I wanted to post another
(general) query.

In the eLyXer library, there is already a robust set of tools used for
converting LyX documents to HTML. Does anyone know if the library is written in
such a way that getting a generic in-memory representation of the document
would be possible? It would be awesome to re-use as much existing code for the
Word document export as possible. That would allow me to support a broader
range of features, and give me a framework for working with maths.

Any thoughts Alex (and others)? I've downloaded the sources and have begun to 
work through them, but before spending hours to days trying to wrap my head 
around them, I thought I would ask.

Cheers,

Rob

Re: What is the toolbar icon name for

2012-02-04 Thread Jason Rute
Yaniv writes:

> 
> Julien Rioux writes:
> 
> > 
> > On 10/08/2011 1:55 AM, Jason Rute wrote:
> > > Hello, I added
> > >
> > >   Item "Insert | |" "math-delim | |"
> > >
> > > to my stdtoolbars.inc file.  It works fine, except that I can't figure out
> > > what the corresponding icon file name should be (in
> > > AppData/Roaming/Lyx2.0/images/math).
> ...
> > > Does anyone know?  Is this a bug?
> 
> I just wanted to share that I also looked for this icon filename but could not
> find what it was. Does anyone know the answer?
> 

I figured it out!  It should be bars_bars.png.  I had to look in the source code
(src/frontends/qt4/GuiApplication.cpp).  It is fairly self-explanatory:

QString png_name;
if (it != end) {
    png_name = it->value;
} else {
    png_name = name;
    png_name.replace('_', "underscore");
    png_name.replace(' ', '_');

    // This way we can have "math-delim { }" on the toolbar.
    png_name.replace('(', "lparen");
    png_name.replace(')', "rparen");
    png_name.replace('[', "lbracket");
    png_name.replace(']', "rbracket");
    png_name.replace('{', "lbrace");
    png_name.replace('}', "rbrace");
    png_name.replace('|', "bars");
    png_name.replace(',', "thinspace");
    png_name.replace(':', "mediumspace");
    png_name.replace(';', "thickspace");
    png_name.replace('!', "negthinspace");
}
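
For anyone who wants to work out other icon names without reading the C++,
the same substitutions can be tried in a few lines of Python. This is only
a re-statement of the snippet above, not LyX code: the function name
png_name and the known_names argument (standing in for the lookup that
produces `it`) are made up, and it assumes the name being converted is the
argument part of the command, here "| |".

```python
# Same substitutions, in the same order as the C++ snippet above.
REPLACEMENTS = [
    ("_", "underscore"),
    (" ", "_"),
    ("(", "lparen"),
    (")", "rparen"),
    ("[", "lbracket"),
    ("]", "rbracket"),
    ("{", "lbrace"),
    ("}", "rbrace"),
    ("|", "bars"),
    (",", "thinspace"),
    (":", "mediumspace"),
    (";", "thickspace"),
    ("!", "negthinspace"),
]


def png_name(name, known_names=None):
    """Return the icon base name (without .png) for a command argument.

    known_names is a hypothetical dict standing in for the table that
    the C++ code consults before falling back to the substitutions.
    """
    if known_names and name in known_names:
        return known_names[name]
    for old, new in REPLACEMENTS:
        name = name.replace(old, new)
    return name


print(png_name("| |") + ".png")   # bars_bars.png
print(png_name("{ }") + ".png")   # lbrace_rbrace.png
```

The order matters: literal underscores are rewritten first, so the
underscores that later replace spaces are not themselves expanded.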








Re: alpha-numeric Section Numbering

2012-02-04 Thread stefano franchi
On Sat, Feb 4, 2012 at 4:07 AM, Boris Seincher  wrote:
> Hello!
>
> I want to use LyX for a law paper in German. When I use Part, Chapter,
> Section and so on I get a structure like this:
> I., 1, 1.1 ... However I want something like this: A, I, 1, a), aa), (i),
> (ii).
>
> Is it somehow possible?
>
> Thank you in advance

One way to do it is by redefining the corresponding commands:
\thepart, \thechapter, etc. If I understood your scheme correctly, you
want:

part: Alpha
chapter: Roman
section: arabic
subsection: alpha + )
subsubsection: subsection + alpha + )
paragraph: ( + roman + )

You may try this code in the preamble of your doc:

\renewcommand{\thepart}{\Alph{part}}
\renewcommand{\thechapter}{\Roman{chapter}}
\renewcommand{\thesection}{\arabic{section}}
\renewcommand{\thesubsection}{\alph{subsection})}
\renewcommand{\thesubsubsection}{\alph{subsection}\alph{subsubsection})}
\renewcommand{\theparagraph}{(\roman{paragraph})}

This code may not get you all the way there, though. It only changes
the style of the counters, not the overall styling of the sectioning
commands. For instance, you will still get  "Part A" not just "A".
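
To preview which labels the \renewcommand lines above would print, without
compiling anything, the counter formats can be imitated in Python. This is
only an illustrative sketch: the helper names (alph, roman, labels) are
made up, and it mimics the LaTeX counter formats rather than calling LaTeX.

```python
def alph(n):
    """Lowercase letter for 1..26, like the LaTeX alph counter format."""
    return chr(ord("a") + n - 1)


def roman(n):
    """Lowercase Roman numeral, like the LaTeX roman counter format."""
    pairs = [(1000, "m"), (900, "cm"), (500, "d"), (400, "cd"),
             (100, "c"), (90, "xc"), (50, "l"), (40, "xl"),
             (10, "x"), (9, "ix"), (5, "v"), (4, "iv"), (1, "i")]
    out = []
    for value, symbol in pairs:
        while n >= value:
            out.append(symbol)
            n -= value
    return "".join(out)


def labels(part, chapter, section, subsection, subsubsection, paragraph):
    """Labels as the redefined counter commands would print them."""
    return {
        "part": alph(part).upper(),
        "chapter": roman(chapter).upper(),
        "section": str(section),
        "subsection": alph(subsection) + ")",
        "subsubsection": alph(subsection) + alph(subsubsection) + ")",
        "paragraph": "(" + roman(paragraph) + ")",
    }


result = labels(1, 2, 3, 1, 2, 4)
print(result)
```

With these definitions the second subsubsection under a) comes out as ab).
If the intended scheme is aa), bb), cc) instead, \thesubsubsection would
need to print the subsubsection counter twice.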

A better solution may be to use the titlesec package (or the memoir
class, which has comparable sectioning controls of its own). Take a look
at section 9.2 of the titlesec manual (available on CTAN and probably in
your own installation: http://www.ctan.org/tex-archive/macros/latex/contrib/titlesec/).
Titlesec will give you complete control over the sectioning commands.
In either case, you'll get your desired numbering scheme in the PDF
only. LyX will still show you arabic numerals on screen. You would
have to write a module to fix what you see on screen as well (although
a bug made that impossible a few versions back; things may have
changed).

Cheers,

Stefano

-- 
__
Stefano Franchi
Associate Research Professor
Department of Hispanic Studies            Ph:   +1 (979) 845-2125
Texas A&M University                          Fax:  +1 (979) 845-6421
College Station, Texas, USA

stef...@tamu.edu
http://stefano.cleinias.org


alpha-numeric Section Numbering

2012-02-04 Thread Boris Seincher

Hello!

I want to use LyX for a law paper in German. When I use Part, Chapter,
Section and so on I get a structure like this:
I., 1, 1.1 ... However I want something like this: A, I, 1, a), aa), 
(i), (ii).


Is it somehow possible?

Thank you in advance