Re: Straghtforward XML export?
FWIW, I've put up a github repo with my LyX->xml2rfc tool, though it's still a work in progress: https://github.com/nicowilliams/lyx2rfc BTW, I can't get "lyx -e lyxhtml ..." to work. lyx -e xhtml does work, but then there are some differences from the LyXHTML option in the File->Export menu. The differences appear to be confined to the magicparlabel numbering, so I don't think I care; I mention this only because it seems odd. Thanks for all the help! Nico --
Re: Straghtforward XML export?
Well, thanks lots for your help. I have something that's very close. Close enough that I can now author I-Ds in LyX. I've found one more bug in the LyX XHTML output, and I filed a bug for it (bibitem anchor generation is not working properly), and I can work around it. Cheers! Nico --
Re: Straghtforward XML export?
On Thu, May 10, 2012 at 10:09 AM, Richard Heck wrote: >> I don't know how to create a custom inset that does.. [...] > > Try putting this into Local Layout, under Document>Settings: Excellent, that worked great. > I guess if you want these as metadata, you should also add: > InTitle 1 > to each of them. I want them as metadata in my XSLT stylesheet's output, so it sufficed without the InTitle bit. Thanks! Nico --
Re: Straghtforward XML export?
On Thu, May 10, 2012 at 8:27 PM, Richard Heck wrote:
> On 05/10/2012 04:52 PM, Nico Williams wrote:
>> On Thu, May 10, 2012 at 3:31 PM, Richard Heck wrote:
>> Here's a LyX snippet:
>
> OK, I see the problem. The vertical space gets moved, for reasons
> that probably aren't very interesting. Can you file a bug about this on
> trac? I can fix it, but it will take a little thought about how best to do
> it.

Filed http://www.lyx.org/trac/ticket/8154 Thanks.

>> FYI, right now I'm struggling with how to transform h2, h3, h4
>> elements into nested section elements; [...]
>>
> It could be done in LyX, but I guess I'd suggest pre-processing the
> whole thing with some kind of script. It shouldn't be too hard to do.
> Find h1, write a start tag; when you see another h1, write the end tag
> for the first one; etc.

I've figured out how to handle this with XSLT 2.0. Here's a snippet:

The key is the << operator (here encoded, so &lt;&lt;). The right operand had to be stored in a variable because there's no other way (that I could find!) to refer to the node I wanted to there.

That took a lot of effort to work out. Much more than I'd wanted to. And it requires XSLT 2.0. But it works and it's not terribly inelegant -- more elegant than any robust script to do the same, most likely.

>> [I'm guessing that LyX's XHTML output is not stable, but I can cope,
>> provided I find a way to transform those h elements into nested
>> sections.]
>>
> It's generally stable, but of course under development. Mostly, I want
> it to be as modular and customizable as possible, in which case we can
> all make it do what we want.

Great. Thanks so much for your work and your help!

Nico --
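For readers who want to see the node-order idea outside XSLT, here is a minimal Python sketch. It is not Nico's stylesheet. It only illustrates the same selection rule: a heading owns every following sibling that comes before the next heading of the same or a higher level, which is the test the << comparison expresses. The function names and the `section`/`title` output elements are made up for the example, and it assumes a flat, namespace-free body.

```python
import xml.etree.ElementTree as ET

HEADINGS = {"h1": 1, "h2": 2, "h3": 3, "h4": 4, "h5": 5, "h6": 6}


def level(node):
    """Return the heading level of node, or None if it is not a heading."""
    return HEADINGS.get(node.tag)


def nest(nodes, parent):
    """Append nodes to parent, wrapping each heading and its content in <section>."""
    i = 0
    while i < len(nodes):
        node = nodes[i]
        lvl = level(node)
        if lvl is None:
            parent.append(node)
            i += 1
            continue
        # Position of the next heading at the same or a higher level. Every
        # sibling before that position belongs to this section.
        end = next(
            (j for j in range(i + 1, len(nodes))
             if level(nodes[j]) is not None and level(nodes[j]) <= lvl),
            len(nodes),
        )
        section = ET.SubElement(parent, "section")
        title = ET.SubElement(section, "title")
        title.text = node.text
        nest(nodes[i + 1:end], section)
        i = end


def nest_headings(xhtml_body):
    """Parse a flat <body> string and return a tree with nested sections."""
    body = ET.fromstring(xhtml_body)
    root = ET.Element("body")
    nest(list(body), root)
    return root
```

Real LyXHTML output carries the XHTML namespace, so the tag lookup would have to account for that before this could be used on actual exports.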
Re: Straghtforward XML export?
On 05/10/2012 04:52 PM, Nico Williams wrote:

On Thu, May 10, 2012 at 3:31 PM, Richard Heck wrote:

Actually, it looks like this got fixed a while ago. In a simple text document I get:

I'm running LyX 2.0.0. The vspace I had was in an author inset, FWIW. The output you show is certainly fine.

If you want to post a simple example file that does the wrong thing, please do.

Here's a LyX snippet:

\begin_layout Standard
A paragraph.
\begin_inset VSpace defskip
\end_inset
Text after a vspace.
\end_layout

OK, I see the problem. The vertical space gets moved, for reasons that probably aren't very interesting. Can you file a bug about this on trac? I can fix it, but it will take a little thought about how best to do it.

FYI, right now I'm struggling with how to transform h2, h3, h4 elements into nested section elements; this seems very difficult to do in XSLT 1.0, but I'm still exploring ideas, including XSLT 2.0. (This actually seems like a common problem, some recipes for which I do find online and in books, but no solutions general enough.) Of course, the way LyX represents sections/subsections/subsubsections internally is exactly the same as in its XHTML output, and it'd be asking a lot to ask for LyX to wrap section contents in a div -- if I can do this with XSLT you might be able to incorporate that solution as an option in LyX, say.

It could be done in LyX, but I guess I'd suggest pre-processing the whole thing with some kind of script. It shouldn't be too hard to do. Find h1, write a start tag; when you see another h1, write the end tag for the first one; etc.

[I'm guessing that LyX's XHTML output is not stable, but I can cope, provided I find a way to transform those h elements into nested sections.]

It's generally stable, but of course under development. Mostly, I want it to be as modular and customizable as possible, in which case we can all make it do what we want.

Richard
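The pre-processing script Richard describes (write a start tag at each heading, write the end tag when the next heading at that level shows up) could look roughly like the sketch below. It is only an illustration under stated assumptions: the input is a body fragment rather than a full document, headings are well-formed `<hN>...</hN>` pairs, and the `<section>` wrapper and the function name `wrap_sections` are choices made for the example, not anything LyX emits.

```python
import re

# Matches <h1>..</h1> through <h6>..</h6>, with optional attributes.
HEADING = re.compile(r"<h([1-6])\b[^>]*>(.*?)</h\1>", re.S)


def wrap_sections(xhtml):
    """Insert <section> start and end tags around each heading's content."""
    out = []
    open_levels = []  # stack of heading levels whose <section> is still open
    pos = 0
    for match in HEADING.finditer(xhtml):
        out.append(xhtml[pos:match.start()])
        lvl = int(match.group(1))
        # Close every open section at the same or a deeper level.
        while open_levels and open_levels[-1] >= lvl:
            open_levels.pop()
            out.append("</section>")
        out.append("<section>" + match.group(0))
        open_levels.append(lvl)
        pos = match.end()
    out.append(xhtml[pos:])
    # Close whatever is still open at the end of the input.
    out.extend("</section>" for _ in open_levels)
    return "".join(out)
```

The stack is what makes the "etc." in the description work: a new h2 closes any open h3 and h4 sections before closing the previous h2.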
Re: Straghtforward XML export?
On Thu, May 10, 2012 at 3:31 PM, Richard Heck wrote:
> Actually, it looks like this got fixed a while ago. In a simple text
> document I get:

I'm running LyX 2.0.0. The vspace I had was in an author inset, FWIW. The output you show is certainly fine.

> If you want to post a simple example file that does the wrong thing, please
> do.

Here's a LyX snippet:

\begin_layout Standard
A paragraph.
\begin_inset VSpace defskip
\end_inset
Text after a vspace.
\end_layout

FYI, right now I'm struggling with how to transform h2, h3, h4 elements into nested section elements; this seems very difficult to do in XSLT 1.0, but I'm still exploring ideas, including XSLT 2.0. (This actually seems like a common problem, some recipes for which I do find online and in books, but no solutions general enough.) Of course, the way LyX represents sections/subsections/subsubsections internally is exactly the same as in its XHTML output, and it'd be asking a lot to ask for LyX to wrap section contents in a div -- if I can do this with XSLT you might be able to incorporate that solution as an option in LyX, say.

[I'm guessing that LyX's XHTML output is not stable, but I can cope, provided I find a way to transform those h elements into nested sections.]

Nico --
Re: Straghtforward XML export?
On 05/10/2012 11:52 AM, Nico Williams wrote: On Thu, May 10, 2012 at 10:02 AM, Richard Heck wrote: On 05/09/2012 02:29 AM, Nico Williams wrote: [Actually, I'm noticing one problem with LyXHTML: it doesn't preserve vertical spacing in any way, not even as horizontal spacing! I'm talking about Insert->Formatting->Vertical Space. I suspect that there are other such things that aren't preserved. For now I'll live. Vertical space is useful for multi-paragraph list items, which are very common in RFCs and Internet-Drafts. If need be I suspect I can write a patch and submit it.] I basically didn't know what to do with the vspace stuff, the issue being that HTML in a way just doesn't have that kind of concept. But if you have an idea, please let me know, and I'll be happy to put it in. Ah, good point. Hmmm, could you use? Or maybe an XML entity that gets defined into a newline but with a processor could replace with an element? Actually, it looks like this got fixed a while ago. In a simple text document I get: this that. If you want to post a simple example file that does the wrong thing, please do. Richard
Re: Straghtforward XML export?
On Thu, May 10, 2012 at 10:02 AM, Richard Heck wrote: > On 05/09/2012 02:29 AM, Nico Williams wrote: >>> [Actually, I'm noticing one problem with LyXHTML: it doesn't preserve >>> vertical spacing in any way, not even as horizontal spacing! I'm >>> talking about Insert->Formatting->Vertical Space. I suspect that >>> there are other such things that aren't preserved. For now I'll live. >>> Vertical space is useful for multi-paragraph list items, which are >>> very common in RFCs and Internet-Drafts. If need be I suspect I can >>> write a patch and submit it.] > > I basically didn't know what to do with the vspace stuff, the issue being > that > HTML in a way just doesn't have that kind of concept. But if you have an > idea, > please let me know, and I'll be happy to put it in. Ah, good point. Hmmm, could you use ? Or maybe an XML entity that gets defined into a newline but with a processor could replace with an element? Nico --
Re: Straghtforward XML export?
On 05/09/2012 02:14 AM, Nico Williams wrote:

On Tue, May 8, 2012 at 10:58 PM, Richard Heck wrote:

On 05/08/2012 07:30 PM, Nico Williams wrote:

LyXHTML looks very promising. It certainly preserves everything I have in my [admittedly small] test file. If it preserves custom inset names then I could probably use custom insets to provide the additional metadata I need (I still haven't quite figured out how to create custom insets, but give me time). XSLT can do the rest.

It will do with custom insets whatever you ask it to do. If I remember correctly, it defaults to something like:

or an equivalent span, depending upon whether it's a charstyle or a flex inset.

Excellent. I've got an XSLT stylesheet in the works that does what I want. I don't know how to create a custom inset that does.. nothing much except have a custom inset name. Specifically I need variants of the Author inset to represent the metadata I need (author organization, e-mail address, and postal address). With that I'd be set.

Try putting this into Local Layout, under Document>Settings:

Format 31
InsetLayout Flex:MyInset
  LyXType Custom
End
InsetLayout Flex:MyInsets
  LyXType Custom
  HTMLTag mytag
End

You can specify more if you wish, but that gets you started. (As LaTeX, these export as normal text.)

I guess if you want these as metadata, you should also add:

  InTitle 1

to each of them.

Richard
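Once an inset layout sets HTMLTag, the metadata can be pulled back out of the exported XHTML by element name. The sketch below shows one way to do that in Python, as an alternative to XPath in a stylesheet. It assumes the inset really is written out as a `<mytag>` element, and the sample markup (including the `author` class) is invented for the example rather than copied from real LyX output.

```python
import xml.etree.ElementTree as ET


def collect_inset_text(xhtml, tag):
    """Return the text of every element named tag, in document order."""
    root = ET.fromstring(xhtml)
    found = []
    for node in root.iter():
        # Compare local names so the lookup works with or without a namespace.
        local_name = node.tag.rsplit("}", 1)[-1]
        if local_name == tag:
            found.append("".join(node.itertext()).strip())
    return found


# Hypothetical sample, not actual LyX output.
sample = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
    '<div class="author">Joe Sixpack <mytag>Sixpack Corp.</mytag></div>'
    "</body></html>"
)
print(collect_inset_text(sample, "mytag"))  # prints ['Sixpack Corp.']
```

A real export may include a DOCTYPE and named entities, which this plain XML parser would need help with.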
Re: Straghtforward XML export?
On 05/09/2012 02:29 AM, Nico Williams wrote:

[Actually, I'm noticing one problem with LyXHTML: it doesn't preserve vertical spacing in any way, not even as horizontal spacing! I'm talking about Insert->Formatting->Vertical Space. I suspect that there are other such things that aren't preserved. For now I'll live. Vertical space is useful for multi-paragraph list items, which are very common in RFCs and Internet-Drafts. If need be I suspect I can write a patch and submit it.]

Found a solution to that: a nested list with no bulleting/numbering is rendered as a single withs for the nested list elements, which works out perfectly. No doubt the vspace loss will come up elsewhere, but for now it's fine.

I basically didn't know what to do with the vspace stuff, the issue being that HTML in a way just doesn't have that kind of concept. But if you have an idea, please let me know, and I'll be happy to put it in.

Richard
Re: Straghtforward XML export?
> [Actually, I'm noticing one problem with LyXHTML: it doesn't preserve
> vertical spacing in any way, not even as horizontal spacing! I'm
> talking about Insert->Formatting->Vertical Space. I suspect that
> there are other such things that aren't preserved. For now I'll live.
> Vertical space is useful for multi-paragraph list items, which are
> very common in RFCs and Internet-Drafts. If need be I suspect I can
> write a patch and submit it.]

Found a solution to that: a nested list with no bulleting/numbering is rendered as a single with s for the nested list elements, which works out perfectly. No doubt the vspace loss will come up elsewhere, but for now it's fine.
Re: Straghtforward XML export?
On Tue, May 8, 2012 at 10:58 PM, Richard Heck wrote: > On 05/08/2012 07:30 PM, Nico Williams wrote: >> LyXHTML looks very promising. It certainly preserves everything I >> have in my [admittedly small] test file. If it preserves custom inset >> names then I could probably use custom insets to provide the >> additional metadata I need (I still haven't quite figured out how to >> create custom insets, but give me time). XSLT can do the rest. >> > It will do with custom insets whatever you ask it to do. If I remember > correctly, it defaults to something like: > > or an equivalent span, depending upon whether its a charstyle or a > flex inset. Excellent. I've got an XSLT stylesheet in the works that does what I want. I don't know how to create a custom inset that does.. nothing much except have a custom inset name. Specifically I need variants of the Author inset to represent the metadata I need (author organization, e-mail address, and postal address). With that I'd be set. > In principle, you can also tell the LyXHTML output to use some other > tag than div or span. This is all customized in the layout files, as is > explained in the bits on XHTML in the customization manual. So I'm > guessing that you could get quite a long way towards XML simply in > that sort of way. The divs are fine. I can address them just fine with XPath, so I'm quite happy. If the LyXHTML schema changes radically I'll just have to re-write the XSLT stylesheet I'm writing now, but as long as no metadata is lost I'll be fine. Eventually I'll probably want to develop a layout and class for actually dealing with RFCs directly in LyX. The typesetting rules for RFCs are... trivial in comparison to most other layouts. But I confess knowing nothing about LaTeX, so it will be sometime before I get there. For now I'm just happy -ecstatic even- to just consume LyXHTML with XSLT. 
[Actually, I'm noticing one problem with LyXHTML: it doesn't preserve vertical spacing in any way, not even as horizontal spacing! I'm talking about Insert->Formatting->Vertical Space. I suspect that there are other such things that aren't preserved. For now I'll live. Vertical space is useful for multi-paragraph list items, which are very common in RFCs and Internet-Drafts. If need be I suspect I can write a patch and submit it.] Thanks for your help. Sorry to need so much handholding, I'm out of my element here, Nico --
Re: Straghtforward XML export?
On 05/08/2012 07:30 PM, Nico Williams wrote: On Tue, May 8, 2012 at 12:40 AM, Guenter Milde wrote: So how about XHTML as starting point for your XSLT transformations? LyXHTML looks very promising. It certainly preserves everything I have in my [admittedly small] test file. If it preserves custom inset names then I could probably use custom insets to provide the additional metadata I need (I still haven't quite figured out how to create custom insets, but give me time). XSLT can do the rest. It will do with custom insets whatever you ask it to do. If I remember correctly, it defaults to something like: or an equivalent span, depending upon whether its a charstyle or a flex inset. In principle, you can also tell the LyXHTML output to use some other tag than div or span. This is all customized in the layout files, as is explained in the bits on XHTML in the customization manual. So I'm guessing that you could get quite a long way towards XML simply in that sort of way. Richard Otherwise, you could use the native XHTML formatter as a model for adding "native XML" output. Indeed, I think that would be a good last resort. Ideally there'd be a terribly straightforward LyXML, but LyXHTML looks manageable. Another starting point would be the external "elyxer" tool: a Python package that takes a LyX file and converts it to XHTML. http://elyxer.nongnu.org/ That looks pretty good too. That's a lot of realistic options. Thanks again, Nico --
Re: Straghtforward XML export?
On Tue, May 8, 2012 at 12:40 AM, Guenter Milde wrote: > So how about XHTML as starting point for your XSLT transformations? LyXHTML looks very promising. It certainly preserves everything I have in my [admittedly small] test file. If it preserves custom inset names then I could probably use custom insets to provide the additional metadata I need (I still haven't quite figured out how to create custom insets, but give me time). XSLT can do the rest. > Otherwise, you could use the native XHTML formatter as a model for adding > "native XML" output. Indeed, I think that would be a good last resort. Ideally there'd be a terribly straightforward LyXML, but LyXHTML looks manageable. > Another starting point would be the external "elyxer" tool: a Python > package that takes a LyX file and converts it to XHTML. > http://elyxer.nongnu.org/ That looks pretty good too. That's a lot of realistic options. Thanks again, Nico --
Re: Straghtforward XML export?
On Tue, May 8, 2012 at 12:40 AM, Guenter Milde wrote: > So how about XHTML as starting point for your XSLT transformations? > > Otherwise, you could use the native XHTML formatter as a model for adding > "native XML" output. > > Another starting point would be the external "elyxer" tool: a Python > package that takes a LyX file and converts it to XHTML. > http://elyxer.nongnu.org/ Ah, those are good ideas. I'll take a look. Thanks!
Re: Straghtforward XML export?
On 2012-05-07, Nico Williams wrote:
> No, I got that. I don't actually care for docbook. I want a straightforward
> translation to XML that preserves all data and metadata. If I need a
> specific schema I can always use XSLT to get output in that form.

So how about XHTML as starting point for your XSLT transformations?

Otherwise, you could use the native XHTML formatter as a model for adding "native XML" output.

Another starting point would be the external "elyxer" tool: a Python package that takes a LyX file and converts it to XHTML.
http://elyxer.nongnu.org/

Günter
Re: Straghtforward XML export?
Is there canonical documentation of the LyX file format? I can't find it... I did find this: http://wiki.lyx.org/Devel/LyXFileFormat , but that's just a changelog. There's nothing else obvious in http://wiki.lyx.org/Devel/ ... The development/FORMAT file in the source tree is also a changelog. Nico --
Re: Straghtforward XML export?
On Mon, May 7, 2012 at 12:56 PM, Nico Williams wrote: > Ah, that works. Thanks! I'll take a look and see if the native > DocBook export works for me. Nope, it still doesn't allow more than one author in docbook, though it does merge all the authors listed in the LyX document source.
Re: Straghtforward XML export?
No, I got that. I don't actually care for docbook. I want a straightforward translation to XML that preserves all data and metadata. If I need a specific schema I can always use XSLT to get output in that form. Nico --
Re: Straghtforward XML export?
Nico Williams wrote:
> On Mon, May 7, 2012 at 12:07 PM, Pavel Sanda wrote:
> > Nico Williams wrote:
> >> How does LyX represent documents internally? If it does it in an
> >> objectified form then it should be fairly straightforward to walk the
> >> document tree and emit XML, no? Or, looking at .lyx files, maybe it
> >> should be possible to script a simple LyX->XML conversion has
> >> anyone tried this before?
> >
> > we miss someone who knows docbook/sgml/xml rather well and would like to
> > help
> > to bring lyx output more up-to-date or at least clearly state what needs to
> > be done.
> > http://article.gmane.org/gmane.editors.lyx.devel/119220
>
> Looking at LyX's format, it seems like translating to XML using a
> LyX-specific schema should be utterly straightforward. For example,
> something like this:

heh, you didn't get the point ;) to summarize:
- lyx already produces docbook xml, but in an older format.
- people spend a lot of time writing quite complex web guides on how to set things up and fix issues for the new docbook format, but never share their wisdom with lyx developers, either by contributing to the lyx documentation or by stating what needs to be changed in lyx output.
- no lyx dev seems to be motivated to study docbook xml, so although we think that the upgrade would be simple, until we know what exactly should change, things will stay as they are now :)

pavel
Re: Straghtforward XML export?
On Mon, May 7, 2012 at 12:41 PM, Pavel Sanda wrote: > Nico Williams wrote: >> This I hadn't seen. One thing to note is that the LyX I'm running (on >> Ubuntu) has no option to save as or export to SGML or DocBook. I >> gather from the link you gave me that SGML and Docbook are natively >> supported export formats, so I guess Ubuntu's build must be lacking >> that feature. Is that correct? > > export items depend on software you have installed, in case of docbook > sgml-tools are needed. not using it i can't say much more, but it seems > that your question are answered in the older link. Ah, that works. Thanks! I'll take a look and see if the native DocBook export works for me.
Re: Straghtforward XML export?
On Mon, May 7, 2012 at 12:07 PM, Pavel Sanda wrote:
> Nico Williams wrote:
>> How does LyX represent documents internally? If it does it in an
>> objectified form then it should be fairly straightforward to walk the
>> document tree and emit XML, no? Or, looking at .lyx files, maybe it
>> should be possible to script a simple LyX->XML conversion has
>> anyone tried this before?
>
> we miss someone who knows docbook/sgml/xml rather well and would like to help
> to bring lyx output more up-to-date or at least clearly state what needs to
> be done.
> http://article.gmane.org/gmane.editors.lyx.devel/119220

Looking at LyX's format, it seems like translating to XML using a LyX-specific schema should be utterly straightforward. For example, something like this:

\lyxformat 413
\begin_document
\begin_header
\textclass article
...
\end_header
\begin_body
\begin_layout Title
Some Doc
\end_layout
\begin_layout Author
Joe Sixpack
\begin_inset VSpace defskip
\end_inset
Sixpack Corp.
\end_layout
\begin_layout Abstract
Foo bar baz blah blah.
\end_layout
\begin_layout Abstract
Two paragraph abstract, eh?
\end_layout
...

should translate into:

Some Doc Joe SixpackSixpack Corp. Foo bar baz blah blah. Two paragraph abstract, eh? ...

Translating insets and layouts into XML elements and attributes seems relatively straightforward. Translating directives seems straightforward also.

Now, note that the two paragraph abstract would be translated into two elements, but an XSLT stylesheet could easily translate that into: ..

A straightforward LyX->XML translation seems like the best approach because translation to any other schema can then be done via XSLT.

Nico --
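To make the "walk the file and emit XML" idea concrete, here is a small Python sketch of such a translation. It is a toy under stated assumptions, not an existing LyX tool: it understands only the `\begin_layout`, `\end_layout`, `\begin_inset` and `\end_inset` lines used in the example above, treats every other line as text, and the `layout`/`inset` element names and the `name`/`args` attributes are an invented schema.

```python
import xml.etree.ElementTree as ET


def append_text(node, text):
    """Add text to node, after its last child if it has one."""
    if len(node):
        last = node[-1]
        last.tail = (last.tail + " " if last.tail else "") + text
    else:
        node.text = (node.text + " " if node.text else "") + text


def lyx_body_to_xml(lines):
    """Turn layout and inset blocks from a .lyx body into nested elements."""
    root = ET.Element("body")
    stack = [root]  # innermost open element is last
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(("\\begin_layout", "\\begin_inset")):
            words = line.split()
            kind = words[0][len("\\begin_"):]  # "layout" or "inset"
            node = ET.SubElement(stack[-1], kind)
            if words[1:]:
                node.set("name", words[1])
            if words[2:]:
                node.set("args", " ".join(words[2:]))
            stack.append(node)
        elif line in ("\\end_layout", "\\end_inset"):
            stack.pop()
        else:
            append_text(stack[-1], line)
    return root
```

Fed the Author layout from the example, this yields one `layout` element whose text is interrupted by an empty `inset` element for the VSpace, so the vertical space survives as markup instead of becoming a newline. A real converter would also need to handle the header, inset parameters, and the other backslash directives that appear in .lyx files.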
Re: Straghtforward XML export?
Nico Williams wrote:
> This I hadn't seen. One thing to note is that the LyX I'm running (on
> Ubuntu) has no option to save as or export to SGML or DocBook. I
> gather from the link you gave me that SGML and Docbook are natively
> supported export formats, so I guess Ubuntu's build must be lacking
> that feature. Is that correct?

export items depend on software you have installed, in case of docbook sgml-tools are needed. not using it i can't say much more, but it seems that your questions are answered in the older link.

p
Re: Straghtforward XML export?
On Mon, May 7, 2012 at 12:07 PM, Pavel Sanda wrote: > Nico Williams wrote: >> The LaTeX->XML tools I've tried leave me... sad. They tend to drop >> some things. For example: vertical space, which becomes a simple >> newline in a paragraph's text. It would be better to translate >> vertical space into elements -- that'd be much, much more >> useful in XSLT than embedded newlines! >> >> So I'm wondering: why couldn't LyX export to XML using a native schema >> that preserves as much LyX markup as possible, indeed, if not all of >> it? > > google says: > http://bgu.perso.libertysurf.fr/doc/db4lyx/ I did see that link when I was researching this. It's very out of date. > http://www.neomantic.com/tutorials/lyx-and-docbookXML/ This I hadn't seen. One thing to note is that the LyX I'm running (on Ubuntu) has no option to save as or export to SGML or DocBook. I gather from the link you gave me that SGML and Docbook are natively supported export formats, so I guess Ubuntu's build must be lacking that feature. Is that correct? Nico --
Re: Straghtforward XML export?
Nico Williams wrote: > The LaTeX->XML tools I've tried leave me... sad. They tend to drop > some things. For example: vertical space, which becomes a simple > newline in a paragraph's text. It would be better to translate > vertical space into elements -- that'd be much, much more > useful in XSLT than embedded newlines! > > So I'm wondering: why couldn't LyX export to XML using a native schema > that preserves as much LyX markup as possible, indeed, if not all of > it? google says: http://bgu.perso.libertysurf.fr/doc/db4lyx/ http://www.neomantic.com/tutorials/lyx-and-docbookXML/ > How does LyX represent documents internally? If it does it in an > objectified form then it should be fairly straightforward to walk the > document tree and emit XML, no? Or, looking at .lyx files, maybe it > should be possible to script a simple LyX->XML conversion has > anyone tried this before? we miss someone who knows docbook/sgml/xml rather well and would like to help to bring lyx output more up-to-date or at least clearly state what needs to be done. http://article.gmane.org/gmane.editors.lyx.devel/119220 p