Re: XML stream writer library

2021-01-12 Thread Richard Kimberly Heck
On 1/12/21 6:19 PM, Thibaut Cuvelier wrote:
> On Tue, 12 Jan 2021 at 16:33, Lorenzo Bertini
> <lorenzobertin...@gmail.com> wrote:
>
> > On 08/01/21 03:00, Thibaut Cuvelier wrote:
> > > A tour of some C++ libraries for XML:
> > > - RapidXML: mostly unmaintained since 2013, no support for namespaces
> > > (except in forks: https://github.com/dwd/rapidxml)
> > > - Boost Property Tree: no XML parser, which limits further use (it can
> > > use RapidXML though, see above)
> > > - libstudxml: C++ library, designed for speed, no DOM
> > > - libxml2: C library, designed for features and not speed (also includes
> > > XPath and XSLT, DTD and XML Schema, namespaces), "mature" and barely
> > > evolving anymore
> > > - libxml++: depends on glibmm2
> > > - Xerces-C++: C++ library, designed for features and not speed (also
> > > includes XPath, DTD and XML Schema, namespaces), "mature" and barely
> > > evolving anymore; no XSLT (Xalan could be used, but it only works with
> > > an ancient version of Xerces; XQilla implemented XPath 2, but has not
> > > been developed since 2016)
> > > - Expat: C library, designed for speed, no DOM by default (provided by
> > > https://github.com/kolotsey/expat-dom), with namespaces
> > > - tinyxml2: C++ library, designed for speed only (also includes XPath
> > > through the unmaintained https://github.com/stanthomas/tinyxml2-ex, no
> > > validation, no namespaces), mature and slowly evolving
> > > - pugixml: C++ library, designed for speed with a few features (like
> > > XPath, no validation, no namespaces), mature and evolving
> > > - libroxml: C library, no clear design goal (includes XPath, namespaces,
> > > no validation), evolving
> > > - Saxon-C: C/C++ wrapper of the state-of-the-art Java library, largest
> > > amount of features (XPath and XSLT 3, DTD and XML Schema validation --
> > > extension for RelaxNG: http://www.cfoster.net/saxon-jing/ --,
> > > namespaces), very mature, really evolving (both performance and
> > > features), but it requires a JVM (Excelsior is built-in, even though
> > > it's not been maintained for quite a long time)
> > > - Qt: no, I was joking :). Qt XML is not supported anymore, it's
> > > recommended to switch to QXmlStreamReader and QXmlStreamWriter (which
> > > are only SAX-like). Qt XML Patterns used to have XPath, XSLT, and XML
> > > Schema, but it was deprecated a while ago (Qt 5.13 for the last
> > > wake-up call, but it hasn't been touched since Qt 4, basically)
> > >
> > > If LyX is being really serious about XML (i.e. moving as many things as
> > > possible to XML technologies), Saxon is probably the way to go.
> > > Otherwise, it's going to be too heavy to ship Saxon and a JVM along with
> > > LyX. Instead, pugixml seems to me like a good choice: a few features
> > > (XPath is the most relevant for LyX, and included in the base library,
> > > no need for addons), good performance, still maintained (there is a
> > > chance to have bugs fixed in a newer version, plus security
> > > vulnerabilities taken care of).
> >
> > Was this addressed in the virtual meeting?
>
> As far as I know, it wasn't discussed.

We were pretty focused on planning for 2.4.0.

> > Anyhow, I think that for a start we'd need only the most basic features
> > (tag insertion, indent), as was the purpose of #12055 in the first place
> > (I'm sorry to have opened this Pandora's box), so maybe no harm will
> > come if we start wrapping pugi.
> >
> > Let me know what you think, and if this is not the time for this, as
> > with LyX 2.4 coming out there might be other things that need focus.
>
> It looks like the patches cannot get integrated into the master
> development branch before 2.4 is out (or at least branched). However,
> in the meantime, I think I can create a feature branch and push your
> patches there (https://www.lyx.org/trac/browser/features).

Yes, that would be the way to go.

Riki



-- 
lyx-devel mailing list
lyx-devel@lists.lyx.org
http://lists.lyx.org/mailman/listinfo/lyx-devel


Re: XML stream writer library

2021-01-12 Thread Thibaut Cuvelier
On Tue, 12 Jan 2021 at 16:33, Lorenzo Bertini 
wrote:

> On 08/01/21 03:00, Thibaut Cuvelier wrote:
> > A tour of some C++ libraries for XML:
> > - RapidXML: mostly unmaintained since 2013, no support for namespaces
> > (except in forks: https://github.com/dwd/rapidxml)
> > - Boost Property Tree: no XML parser, which limits further use (it can
> > use RapidXML though, see above)
> > - libstudxml: C++ library, designed for speed, no DOM
> > - libxml2: C library, designed for features and not speed (also includes
> > XPath and XSLT, DTD and XML Schema, namespaces), "mature" and barely
> > evolving anymore
> > - libxml++: depends on glibmm2
> > - Xerces-C++: C++ library, designed for features and not speed (also
> > includes XPath, DTD and XML Schema, namespaces), "mature" and barely
> > evolving anymore; no XSLT (Xalan could be used, but it only works with an
> > ancient version of Xerces; XQilla implemented XPath 2, but has not been
> > developed since 2016)
> > - Expat: C library, designed for speed, no DOM by default (provided by
> > https://github.com/kolotsey/expat-dom), with namespaces
> > - tinyxml2: C++ library, designed for speed only (also includes XPath
> > through the unmaintained https://github.com/stanthomas/tinyxml2-ex, no
> > validation, no namespaces), mature and slowly evolving
> > - pugixml: C++ library, designed for speed with a few features (like
> > XPath, no validation, no namespaces), mature and evolving
> > - libroxml: C library, no clear design goal (includes XPath, namespaces,
> > no validation), evolving
> > - Saxon-C: C/C++ wrapper of the state-of-the-art Java library, largest
> > amount of features (XPath and XSLT 3, DTD and XML Schema validation --
> > extension for RelaxNG: http://www.cfoster.net/saxon-jing/ --, namespaces),
> > very mature,
> > really evolving (both performance and features), but it requires a JVM
> > (Excelsior is built-in, even though it's not been maintained for quite a
> > long time)
> > - Qt: no, I was joking :). Qt XML is not supported anymore, it's
> > recommended to switch to QXmlStreamReader and QXmlStreamWriter (which
> > are only SAX-like). Qt XML Patterns used to have XPath, XSLT, and XML
> > Schema, but it's been deprecated a while ago (Qt 5.13 for the last
> > wake-up call, but it hasn't been touched since Qt 4, basically)
> >
> > If LyX is being really serious about XML (i.e. moving as many things as
> > possible to XML technologies), Saxon is probably the way to go.
> > Otherwise, it's going to be too heavy to ship Saxon and a JVM along with
> > LyX. Instead, pugixml seems to me like a good choice: a few features
> > (XPath is the most relevant for LyX, and included in the base library,
> > no need for addons), good performance, still maintained (there is a
> > chance to have bugs fixed in a newer version, plus security
> > vulnerabilities taken care of).
> Was this addressed in the virtual meeting?


As far as I know, it wasn't discussed.


> Also, since Xerces-C was the most feature-rich and mature after Saxon-C,
> I was curious as to why you didn't mention it.
>

Actually, Xerces-C and Xerces-C++ are the same thing (the official
name being Xerces-C++ and the name of the packages Xerces-C, if I understood
correctly).


> Anyhow, I think that for a start we'd need only the most basic features
> (tag insertion, indent), as was the purpose of #12055 in the first place
> (I'm sorry to have opened this pandora's box), so maybe no harm will
> come if we start wrapping pugi.
>
> Let me know what you think, and if this is not the time for this, as
> with LyX 2.4 coming out there might be other things that need focus.
>

It looks like the patches cannot get integrated into the master development
branch before 2.4 is out (or at least branched). However, in the meantime,
I think I can create a feature branch and push your patches there (
https://www.lyx.org/trac/browser/features).


Re: XML stream writer library

2021-01-12 Thread Lorenzo Bertini

On 08/01/21 03:00, Thibaut Cuvelier wrote:

> A tour of some C++ libraries for XML:
> - RapidXML: mostly unmaintained since 2013, no support for namespaces
> (except in forks: https://github.com/dwd/rapidxml)
> - Boost Property Tree: no XML parser, which limits further use (it can
> use RapidXML though, see above)
> - libstudxml: C++ library, designed for speed, no DOM
> - libxml2: C library, designed for features and not speed (also includes
> XPath and XSLT, DTD and XML Schema, namespaces), "mature" and barely
> evolving anymore
> - libxml++: depends on glibmm2
> - Xerces-C++: C++ library, designed for features and not speed (also
> includes XPath, DTD and XML Schema, namespaces), "mature" and barely
> evolving anymore; no XSLT (Xalan could be used, but it only works with an
> ancient version of Xerces; XQilla implemented XPath 2, but has not been
> developed since 2016)
> - Expat: C library, designed for speed, no DOM by default (provided by
> https://github.com/kolotsey/expat-dom), with namespaces
> - tinyxml2: C++ library, designed for speed only (also includes XPath
> through the unmaintained https://github.com/stanthomas/tinyxml2-ex, no
> validation, no namespaces), mature and slowly evolving
> - pugixml: C++ library, designed for speed with a few features (like
> XPath, no validation, no namespaces), mature and evolving
> - libroxml: C library, no clear design goal (includes XPath, namespaces,
> no validation), evolving
> - Saxon-C: C/C++ wrapper of the state-of-the-art Java library, largest
> amount of features (XPath and XSLT 3, DTD and XML Schema validation --
> extension for RelaxNG: http://www.cfoster.net/saxon-jing/ --, namespaces),
> very mature, really evolving (both performance and features), but it
> requires a JVM (Excelsior is built-in, even though it's not been
> maintained for quite a long time)
> - Qt: no, I was joking :). Qt XML is not supported anymore, it's
> recommended to switch to QXmlStreamReader and QXmlStreamWriter (which
> are only SAX-like). Qt XML Patterns used to have XPath, XSLT, and XML
> Schema, but it was deprecated a while ago (Qt 5.13 for the last
> wake-up call, but it hasn't been touched since Qt 4, basically)
>
> If LyX is being really serious about XML (i.e. moving as many things as
> possible to XML technologies), Saxon is probably the way to go.
> Otherwise, it's going to be too heavy to ship Saxon and a JVM along with
> LyX. Instead, pugixml seems to me like a good choice: a few features
> (XPath is the most relevant for LyX, and included in the base library,
> no need for addons), good performance, still maintained (there is a
> chance to have bugs fixed in a newer version, plus security
> vulnerabilities taken care of).

Was this addressed in the virtual meeting? Also, since Xerces-C was the
most feature-rich and mature after Saxon-C, I was curious as to why you
didn't mention it.


Anyhow, I think that for a start we'd need only the most basic features 
(tag insertion, indent), as was the purpose of #12055 in the first place 
(I'm sorry to have opened this Pandora's box), so maybe no harm will
come if we start wrapping pugi.


Let me know what you think, and if this is not the time for this, as 
with LyX 2.4 coming out there might be other things that need focus.



Re: XML stream writer library

2021-01-07 Thread Thibaut Cuvelier
On Thu, 7 Jan 2021 at 18:23, Thibaut Cuvelier  wrote:

> On Thu, 7 Jan 2021, 12:52 Lorenzo Bertini, 
> wrote:
>
>> I think almost all the options are on the table at this point. For the
>> sake of completeness I think it's worth mentioning DOM library Boost
>> Property Tree, which popped up frequently while searching.
>>
>> I think Thibaut is right when saying that, for the way LyX is structured
>> now, a SAX writer would be more appropriate, because we won't work on
>> xml directly, but convert the LyX file. However most of the libraries
>> have a DOM approach, and also, if someday we'll convert LyX format to
>> something xml-like, we might have to start all of this again.
>>
>> I did a small benchmark with pugixml, both reading and writing an XML
>> document of 2.2 MB, equivalent to ~100-120 pages chock full of math: it
>> takes negligible time to both read and write on my really modest laptop
>> (A10-9600). Peak memory consumption was 14 MB, but since some MathML was
>> corrupted (it has trouble with backslash \) it's possible it might be
>> way less once fixed: LyX's consumption when opening the corresponding LyX
>> file was ~120 MB. The benchmark table in
>> <http://rapidxml.sourceforge.net/manual.html#namespacerapidxml_1performance>
>> seems to indicate that pugixml and RapidXML are only about one order of
>> magnitude slower than strlen, so I don't think parse time will ever be a
>> problem.
>
>
> Thanks for your benchmark. For me, the major difference between the two
> libraries is that pugixml is still maintained, but not really RapidXML. And
> XML parsing is very often a source of security problems (not just XXE).
>
>> I'm unfamiliar with the concept of "wrapping" libraries and "layers": is
>> it when you write your own classes and methods on top of some common
>> stuff those libraries do, so if for whatever reason you have to switch
>> you can "plug" another easily?
>>
>
> Yes, exactly.
>

Below is my take on
https://stackoverflow.com/questions/9387610/what-xml-parser-should-i-use-in-c
and https://github.com/fffaraz/awesome-cpp#xml

XPath would be very useful if LyX switches to an XML representation (easy
queries on an XML document, think of SQL for XML).
XSLT is a way to describe transformations from XML to anything. If LyX
switches to an XML representation, it might be used to replace C++
exporters (but formula conversion will be a pain!). It might lower the
entry bar for new contributors, even though XSLT is not an easy language.
XQuery is a query and scripting language for processing XML.
Outside of Java libraries, only the 1.0 versions of these standards are
implemented; except for XPath, that really limits their use. A
state-of-the-art implementation of the current standards is Saxon, which
has a C binding.

To allow for validation of XML files (i.e. check they respect some
grammar), DTD is the oldest way (inherited from SGML), XML Schema adds many
features over DTD (like types). The best technology nowadays is RelaxNG
(it's not recent: 2005), which is much more powerful than XML Schema.

XInclude is the XML way of specifying includes of other files (not
necessarily XML). Think \input in LaTeX or LyX child documents with a few
more features.

Namespaces are similar to those of C++, and are especially useful when
mixing several standards (like MathML and DocBook).

A tour of some C++ libraries for XML:
- RapidXML: mostly unmaintained since 2013, no support for namespaces
(except in forks: https://github.com/dwd/rapidxml)
- Boost Property Tree: no XML parser, which limits further use (it can use
RapidXML though, see above)
- libstudxml: C++ library, designed for speed, no DOM
- libxml2: C library, designed for features and not speed (also includes
XPath and XSLT, DTD and XML Schema, namespaces), "mature" and barely
evolving anymore
- libxml++: depends on glibmm2
- Xerces-C++: C++ library, designed for features and not speed (also
includes XPath, DTD and XML Schema, namespaces), "mature" and barely
evolving anymore; no XSLT (Xalan could be used, but it only works with an
ancient version of Xerces; XQilla implemented XPath 2, but has not been
developed since 2016)
- Expat: C library, designed for speed, no DOM by default (provided by
https://github.com/kolotsey/expat-dom), with namespaces
- tinyxml2: C++ library, designed for speed only (also includes XPath
through the unmaintained https://github.com/stanthomas/tinyxml2-ex, no
validation, no namespaces), mature and slowly evolving
- pugixml: C++ library, designed for speed with a few features (like XPath,
no validation, no namespaces), mature and evolving
- libroxml: C library, no clear design goal (includes XPath, namespaces, no
validation), evolving
- Saxon-C: C/C++ wrapper of the state-of-the-art Java library, largest
amount of features (XPath and XSLT 3, DTD and XML Schema validation --
extension for RelaxNG: http://www.cfoster.net/saxon-jing/ --, namespaces),
very mature, really evolving (both performance and features), but it
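requires a JVM (Excelsior is built-in, even though it's not been maintained
for quite a long time)
- Qt: no, I was joking :). Qt XML is not supported anymore, it's
recommended to switch to QXmlStreamReader and QXmlStreamWriter (which
are only SAX-like). Qt XML Patterns used to have XPath, XSLT, and XML
Schema, but it was deprecated a while ago (Qt 5.13 for the last
wake-up call, but it hasn't been touched since Qt 4, basically)

If LyX is being really serious about XML (i.e. moving as many things as
possible to XML technologies), Saxon is probably the way to go.
Otherwise, it's going to be too heavy to ship Saxon and a JVM along with
LyX. Instead, pugixml seems to me like a good choice: a few features
(XPath is the most relevant for LyX, and included in the base library,
no need for addons), good performance, still maintained (there is a
chance to have bugs fixed in a newer version, plus security
vulnerabilities taken care of).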

Re: XML stream writer library

2021-01-07 Thread Thibaut Cuvelier
On Thu, 7 Jan 2021, 12:52 Lorenzo Bertini, 
wrote:

> I think almost all the options are on the table at this point. For the
> sake of completeness I think it's worth mentioning DOM library Boost
> Property Tree, which popped up frequently while searching.
>
> I think Thibaut is right when saying that, for the way LyX is structured
> now, a SAX writer would be more appropriate, because we won't work on
> xml directly, but convert the LyX file. However most of the libraries
> have a DOM approach, and also, if someday we'll convert LyX format to
> something xml-like, we might have to start all of this again.
>
> I did a small benchmark with pugixml, both reading and writing an XML
> document of 2.2 MB, equivalent to ~100-120 pages chock full of math: it
> takes negligible time to both read and write on my really modest laptop
> (A10-9600). Peak memory consumption was 14 MB, but since some MathML was
> corrupted (it has trouble with backslash \) it's possible it might be
> way less once fixed: LyX's consumption when opening the corresponding LyX
> file was ~120 MB. The benchmark table in
> <http://rapidxml.sourceforge.net/manual.html#namespacerapidxml_1performance>
> seems to indicate that pugixml and RapidXML are only about one order of
> magnitude slower than strlen, so I don't think parse time will ever be a
> problem.


Thanks for your benchmark. For me, the major difference between the two
libraries is that pugixml is still maintained, but not really RapidXML. And
XML parsing is very often a source of security problems (not just XXE).

> I'm unfamiliar with the concept of "wrapping" libraries and "layers": is
> it when you write your own classes and methods on top of some common
> stuff those libraries do, so if for whatever reason you have to switch
> you can "plug" another easily?
>

Yes, exactly.



Re: XML stream writer library

2021-01-07 Thread Lorenzo Bertini
I think almost all the options are on the table at this point. For the
sake of completeness I think it's worth mentioning the DOM library Boost
Property Tree, which popped up frequently while searching.

I think Thibaut is right when saying that, for the way LyX is structured
now, a SAX writer would be more appropriate, because we won't work on
XML directly, but convert the LyX file. However, most of the libraries
have a DOM approach, and also, if someday we convert the LyX format to
something XML-like, we might have to start all of this again.


I did a small benchmark with pugixml, both reading and writing an XML
document of 2.2 MB, equivalent to ~100-120 pages chock full of math: it
takes negligible time to both read and write on my really modest laptop
(A10-9600). Peak memory consumption was 14 MB, but since some MathML was
corrupted (it has trouble with backslash \) it's possible it might be
way less once fixed: LyX's consumption when opening the corresponding LyX
file was ~120 MB. The benchmark table in
<http://rapidxml.sourceforge.net/manual.html#namespacerapidxml_1performance>
seems to indicate that pugixml and RapidXML are only about one order of
magnitude slower than strlen, so I don't think parse time will ever be a
problem.


I'm unfamiliar with the concept of "wrapping" libraries and "layers": is 
it when you write your own classes and methods on top of some common 
stuff those libraries do, so if for whatever reason you have to switch 
you can "plug" another easily?


Thanks, Lo.


Re: XML stream writer library

2021-01-06 Thread Thibaut Cuvelier
On Tue, 5 Jan 2021 at 10:37, Joel Kulesza  wrote:

> On Tue, Jan 5, 2021 at 1:19 AM Pavel Sanda  wrote:
>
>> On Mon, Jan 04, 2021 at 09:48:42PM +0100, Thibaut Cuvelier wrote:
>> > There are multiple issues here. What is needed to generate HTML and DocBook
>> > is a simple SAX writer, not a parser. I've done plenty of research about
>> > it, there's no XML library that does that. Most of them are using a DOM,
>> > which is a total waste of memory for such an application: it stores a
>> > complete XML tree in memory before serialising it. With SAX, you just need
>> > a string backend, which is much more lightweight (by several factors).
>>
>> After a little more thinking, is using DOM actually that big an issue?
>> I mean, how much does it take: for a document of length n, is it O(n) in
>> space?
>>
>> Sure, it might be cut to constant, but practically speaking, when you have
>> a 100-page document, what is the real time/memory consumption? Timewise,
>> you spend 1s in XML compared to the next 30s converting figures to PDF or
>> whatever format. Spacewise, probably about as much again as what we have
>> already allocated for the document itself.
>>
>> If using a more heavyweight XML lib is not a pain from an API point
>> of view (and I do not know, you are the expert here), then we might
>> actually consider it, given the difficulties in the SAX space?
>>
>
> I had a similar thought and will note that I've had good success on other
> projects with pugixml.
>

It's typical to have a DOM tree that is two to five times larger than the
raw text, which is not always negligible (Xerces is close to 2, Java
implementations anywhere between 2 and 5; I haven't checked pugixml or
TinyXML2 for this specific criterion). But that's not the real issue: for
generating HTML and DocBook, for now, DOM is not so useful from a developer's
point of view. DOM is more suitable for handling or modifying an existing
document, not really for generating one from scratch. A SAX writer is really
the most appropriate, given the way LyX is internally structured:
there is very little need to go backward when generating the file (e.g.,
to add something to the header when encountering some LyX inset).

Using DOM will not really simplify the code (I'm speaking for the DocBook
export, which is highly similar to HTML). However, it might make its logic
easier to understand for a newcomer. Nevertheless, DOM comes with a more
complex syntax: with SAX, you are only appending content to the file, with
only strings; with DOM, you have to indicate where you want to write
something (with methods like InsertEndChild), and you pass around complete
XML nodes (built from the same strings).

More specifically, in SAX (where stream is mostly a large string object
with helper methods):

stream.writeStartTag("tag");

With DOM, taking the example of TinyXML2 (where document is the root of the
DOM tree and node the node in the tree that is being filled):

node->InsertEndChild( document->NewElement("tag") );
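
To make the SAX side concrete, here is a minimal sketch of such a stream,
assuming it is nothing more than a string plus a stack of open tags. Only
writeStartTag comes from the example above; the class name XmlStreamWriter
and the writeText, writeEndTag, str, and buildSample names are hypothetical,
and this is not the API of LyX or of any existing library.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical SAX-style writer: it only ever appends to one string.
class XmlStreamWriter {
public:
    // Open <name> and remember it so that writeEndTag() can close it.
    void writeStartTag(const std::string& name) {
        out_ += "<" + name + ">";
        open_.push_back(name);
    }

    // Append character data, escaping the characters XML reserves.
    void writeText(const std::string& text) {
        for (char c : text) {
            switch (c) {
            case '&': out_ += "&amp;"; break;
            case '<': out_ += "&lt;"; break;
            case '>': out_ += "&gt;"; break;
            default:  out_ += c;
            }
        }
    }

    // Close the most recently opened tag.
    void writeEndTag() {
        assert(!open_.empty() && "writeEndTag() without a matching start tag");
        out_ += "</" + open_.back() + ">";
        open_.pop_back();
    }

    const std::string& str() const { return out_; }

private:
    std::string out_;                // the whole "stream" is one string
    std::vector<std::string> open_;  // names of the tags that are still open
};

// Content is emitted strictly in document order, never revisited.
std::string buildSample() {
    XmlStreamWriter stream;
    stream.writeStartTag("para");
    stream.writeText("a < b");
    stream.writeEndTag();
    return stream.str();
}
```

Calling buildSample() returns <para>a &lt; b</para>. Besides the output
itself, the only state kept is the stack of open tag names, which is why
this approach needs so little memory compared with a full tree.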

Both are perfectly good choices, though. If we write a thin layer on top of
a DOM writer (as Riki suggested, this would decouple LyX from the
actual XML library), we might be able to have a syntax close to that of SAX
while having the extra flexibility of DOM. This way, the LyX code would be
clean, and avoid the current intricacies of outputting things at the right place
(in DocBook, especially the  tag).
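
A rough sketch of what such a thin layer might look like, assuming a toy
Node type standing in for whatever node class the real DOM library provides.
TreeWriter, startTag, text, endTag, serialize, and buildLateHeaderSample are
hypothetical names, the info element is chosen only for illustration, and
escaping is left out for brevity.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Stand-in for a DOM node; a real backend would supply its own node type.
struct Node {
    std::string name;                             // element name
    std::string text;                             // character data, if any
    std::vector<std::unique_ptr<Node>> children;  // child elements, in order
    Node* parent = nullptr;

    // Create a child element at the end of this node and return it.
    Node* append(const std::string& childName) {
        children.push_back(std::make_unique<Node>());
        Node* child = children.back().get();
        child->name = childName;
        child->parent = this;
        return child;
    }
};

// Serialise a node and its subtree (text first, then children).
std::string serialize(const Node& n) {
    std::string out = "<" + n.name + ">" + n.text;
    for (const auto& child : n.children)
        out += serialize(*child);
    return out + "</" + n.name + ">";
}

// SAX-like facade: callers "append", the facade tracks where in the tree.
class TreeWriter {
public:
    explicit TreeWriter(const std::string& rootName) { root_.name = rootName; }

    // Open a new element under the current one; the returned handle can be
    // kept to add content to that element later.
    Node* startTag(const std::string& name) {
        current_ = current_->append(name);
        return current_;
    }
    void text(const std::string& t) { current_->text += t; }
    void endTag() {
        assert(current_->parent && "endTag() would close the root");
        current_ = current_->parent;
    }
    std::string str() const { return serialize(root_); }

private:
    Node root_;
    Node* current_ = &root_;  // insertion point for the next startTag()
};

std::string buildLateHeaderSample() {
    TreeWriter w("article");
    Node* info = w.startTag("info");  // header element, still empty
    w.endTag();
    w.startTag("para");
    w.text("Hello");
    w.endTag();
    // Going backward: something discovered late still lands in the header.
    info->append("title")->text = "Late title";
    return w.str();
}
```

The calling code reads almost like the SAX version, but keeping the handle
returned by startTag makes it possible to fill in an earlier element once
the rest of the document has been visited, at the cost of holding the whole
tree in memory until str() is called.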

More specifically, @Pavel: for DocBook, you spend 0% of your time dealing
with images, as that's supposed to be done by the DocBook processor
afterwards. Any gain in the XML part of LyX will be noticeable by the user
for large documents (book-sized).
(And I won't say that something being O(n) is negligible in this case: I
use exponential-time algorithms daily that work so much faster than
polynomial-time ones…)


Re: XML stream writer library

2021-01-05 Thread Joel Kulesza
On Tue, Jan 5, 2021 at 1:19 AM Pavel Sanda  wrote:

> On Mon, Jan 04, 2021 at 09:48:42PM +0100, Thibaut Cuvelier wrote:
> > There are multiple issues here. What is needed to generate HTML and DocBook
> > is a simple SAX writer, not a parser. I've done plenty of research about
> > it, there's no XML library that does that. Most of them are using a DOM,
> > which is a total waste of memory for such an application: it stores a
> > complete XML tree in memory before serialising it. With SAX, you just need
> > a string backend, which is much more lightweight (by several factors).
>
> After a little more thinking, is using DOM actually that big an issue?
> I mean, how much does it take: for a document of length n, is it O(n) in
> space?
>
> Sure, it might be cut to constant, but practically speaking, when you have
> a 100-page document, what is the real time/memory consumption? Timewise,
> you spend 1s in XML compared to the next 30s converting figures to PDF or
> whatever format. Spacewise, probably about as much again as what we have
> already allocated for the document itself.
>
> If using a more heavyweight XML lib is not a pain from an API point
> of view (and I do not know, you are the expert here), then we might
> actually consider it, given the difficulties in the SAX space?
>

I had a similar thought and will note that I've had good success on other
projects with pugixml.

Regards,
Joel


Re: XML stream writer library

2021-01-05 Thread Pavel Sanda
On Mon, Jan 04, 2021 at 09:48:42PM +0100, Thibaut Cuvelier wrote:
> There are multiple issues here. What is needed to generate HTML and DocBook
> is a simple SAX writer, not a parser. I've done plenty of research about
> it, there's no XML library that does that. Most of them are using a DOM,
> which is a total waste of memory for such an application: it stores a
> complete XML tree in memory before serialising it. With SAX, you just need
> a string backend, which is much more lightweight (by several factors). 

After a little more thinking, is using DOM actually that big an issue?
I mean, how much does it take: for a document of length n, is it O(n) in
space?

Sure, it might be cut to constant, but practically speaking, when you have
a 100-page document, what is the real time/memory consumption? Timewise,
you spend 1s in XML compared to the next 30s converting figures to PDF or
whatever format. Spacewise, probably about as much again as what we have
already allocated for the document itself.

If using a more heavyweight XML lib is not a pain from an API point
of view (and I do not know, you are the expert here), then we might
actually consider it, given the difficulties in the SAX space?

Pavel


Re: XML stream writer library

2021-01-04 Thread Richard Kimberly Heck
On 1/4/21 5:10 PM, Pavel Sanda wrote:
> On Mon, Jan 04, 2021 at 09:48:42PM +0100, Thibaut Cuvelier wrote:
>> My recommendation, based on a quite long study of XML libraries (i.e.
>> several years, but quite far from full-time): either use QXmlStreamWriter
>> (which is mostly a SAX implementation in C++) or write our own.
>> QXmlStreamWriter is almost 4k lines long, but it can be substantially
>> simplified in our case (
>> https://github.com/qt/qtbase/blob/54875be84de059374920e4c0deacd13a41caaa13/src/corelib/serialization/qxmlstream.cpp).
>>
>>
>> TinyXML2 (https://github.com/leethomason/tinyxml2), pugixml (
>> https://github.com/zeux/pugixml), and Xerces-C++ (
>> https://xerces.apache.org/xerces-c/) are only DOM-based. There are quite a
>> few C libraries, like libxml2, that can be SAX-like, but C libraries are
>> horrible to use (http://www.xmlsoft.org/examples/testWriter.c).

I did some searching and, yes, I see the problem. Word is that recent
versions of libxml and libxml2 have dependencies on Gnome libraries that
we don't want.

I'll let you know if I get any answers to my question on the Fedora list.


> I do not dare to make any qualified recommendation between the choices
> above. But thinking aloud -- if there de facto isn't an alternative
> to QXmlStreamWriter, would it be hard to separate that class from
> the rest of Qt, fork and include it as an internal lyx routine?
> We would have full control over that code without unnecessary surprises
> of Qt's development.

I was going to suggest something in this spirit.

If, as our usual policy has been, we confine QXmlStreamWriter to
support/, then what that basically means is writing our own LyX API as a
kind of wrapper around the Qt stuff. (Thibaut, if you haven't already,
you might look at how the FileName class does it. Much of it is a wrapper
around QFile.) Some, even many, of the routines might just directly call the
Qt equivalent (probably after a call to toqstr, from qstring_helpers). This
would be a relatively quick way to get something that worked and was
easy to use, and work on adapting DocBook and HTML export to this code
could proceed.

At that point, we could then write our own XML backend, possibly
adapting it from the Qt code. There are quite a few dependencies there,
but I'll guess some of them we do not need (e.g., the QApplication and
QFile dependencies). Or we build a lightweight library from scratch.
(It does seem like maybe there's a general need for that.) With the
already functioning backend from QXmlStreamWriter, it would be easy to
test our own code and make sure it was producing the same output.
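
A minimal sketch of that last testing step, assuming both backends sit
behind one small interface. XmlBackend, ReferenceBackend (a stand-in for the
Qt-based backend, written without Qt so that the example is self-contained),
OwnBackend, exportSample, and backendsAgree are all hypothetical names;
escaping and attributes are omitted.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// The API the exporters would code against.
class XmlBackend {
public:
    virtual ~XmlBackend() = default;
    virtual void startTag(const std::string& name) = 0;
    virtual void text(const std::string& content) = 0;
    virtual void endTag() = 0;
    virtual std::string result() const = 0;
};

// Stand-in for the existing, trusted backend.
class ReferenceBackend : public XmlBackend {
public:
    void startTag(const std::string& name) override {
        out_ << '<' << name << '>';
        open_.push_back(name);
    }
    void text(const std::string& content) override { out_ << content; }
    void endTag() override {
        assert(!open_.empty());
        out_ << "</" << open_.back() << '>';
        open_.pop_back();
    }
    std::string result() const override { return out_.str(); }

private:
    std::ostringstream out_;
    std::vector<std::string> open_;
};

// Stand-in for the new, home-grown backend under test.
class OwnBackend : public XmlBackend {
public:
    void startTag(const std::string& name) override {
        out_ += '<' + name + '>';
        open_.push_back(name);
    }
    void text(const std::string& content) override { out_ += content; }
    void endTag() override {
        assert(!open_.empty());
        out_ += "</" + open_.back() + '>';
        open_.pop_back();
    }
    std::string result() const override { return out_; }

private:
    std::string out_;
    std::vector<std::string> open_;
};

// Export code sees only the interface, so it can drive either backend.
void exportSample(XmlBackend& xml) {
    xml.startTag("section");
    xml.startTag("title");
    xml.text("Intro");
    xml.endTag();
    xml.endTag();
}

// Run the same export through both backends and compare the output.
bool backendsAgree() {
    ReferenceBackend reference;
    OwnBackend own;
    exportSample(reference);
    exportSample(own);
    return reference.result() == own.result();
}
```

Because exportSample depends only on XmlBackend, swapping the backend is a
one-line change at the call site, and the comparison in backendsAgree could
be run over real documents rather than this toy input.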

Riki




Re: XML stream writer library

2021-01-04 Thread Pavel Sanda
On Mon, Jan 04, 2021 at 09:48:42PM +0100, Thibaut Cuvelier wrote:
> My recommendation, based on a quite long study of XML libraries (i.e.
> several years, but quite far from full-time): either use QXmlStreamWriter
> (which is mostly a SAX implementation in C++) or write our own.
> QXmlStreamWriter is almost 4k lines long, but it can be substantially
> simplified in our case (
> https://github.com/qt/qtbase/blob/54875be84de059374920e4c0deacd13a41caaa13/src/corelib/serialization/qxmlstream.cpp).
> 
> 
> TinyXML2 (https://github.com/leethomason/tinyxml2), pugixml (
> https://github.com/zeux/pugixml), and Xerces-C++ (
> https://xerces.apache.org/xerces-c/) are only DOM-based. There are quite a
> few C libraries, like libxml2, that can be SAX-like, but C libraries are
> horrible to use (http://www.xmlsoft.org/examples/testWriter.c).

I do not dare to make any qualified recommendation between the choices
above. But thinking aloud -- if there de facto isn't an alternative
to QXmlStreamWriter, would it be hard to separate that class from
the rest of Qt, fork and include it as an internal lyx routine?
We would have full control over that code without unnecessary surprises
of Qt's development.

Pavel
-- 
lyx-devel mailing list
lyx-devel@lists.lyx.org
http://lists.lyx.org/mailman/listinfo/lyx-devel


Re: XML stream writer library

2021-01-04 Thread Yuriy Skalko

> TinyXML2 (https://github.com/leethomason/tinyxml2), pugixml (
> https://github.com/zeux/pugixml), and Xerces-C++ (
> https://xerces.apache.org/xerces-c/) are only DOM-based. There are quite a
> few C libraries, like libxml2, that can be SAX-like, but C libraries are
> horrible to use (http://www.xmlsoft.org/examples/testWriter.c).


There are several C++ wrappers for libxml2 on GitHub. Maybe they can be 
useful:


https://github.com/libxmlplusplus/libxmlplusplus
https://github.com/rioki/libxmlmm


Yuriy
--
lyx-devel mailing list
lyx-devel@lists.lyx.org
http://lists.lyx.org/mailman/listinfo/lyx-devel


Re: XML stream writer library

2021-01-04 Thread Thibaut Cuvelier
On Mon, 4 Jan 2021 at 20:30, Richard Kimberly Heck wrote:

> On 1/3/21 3:37 PM, Lorenzo Bertini wrote:
>
>> Hello list,
>>
>> In bug 12055, discussing the merge of some MathMLStream and XmlStream
>> components, we were contemplating the possibility of using an external
>> library to handle XML streams, for example with indentation and tag
>> insertion. One of the candidates was the QXmlStreamWriter class, but
>> with the talk about removing unnecessary Qt components we thought to
>> ask the list.
>>
>> Let us know what you think is the best course, and if you know of
>> other libraries we should look at.
>
> As I mentioned in the bug, I looked over various XML libraries a while ago,
> when I was thinking about the long-standing idea of converting LyX's own
> format to XML. There seemed to be a myriad of options, and I never settled
> upon one. But it looks like there's a general feeling that we don't want to
> get too married to Qt---any more than we already are. That is in part
> because Qt seems to break itself fairly frequently (especially on OSX) and
> partly because they keep changing their attitude towards open source. There
> was something not long ago about how recent updates would only be
> available to paid subscribers right away, or something like that.
>
> So I'd generally suggest searching around for good, well-maintained XML
> libraries, maybe asking on Stack Exchange what people like. I'll send an
> email to the Fedora list and see what suggestions pop up.
>
There are multiple issues here. What is needed to generate HTML and DocBook
is a simple SAX-style writer, not a parser. I've done plenty of research on
this, and I have not found a standalone XML library that does just that.
Most of them use a DOM, which is a total waste of memory for such an
application: it stores a complete XML tree in memory before serialising it.
With a SAX-style writer, you just need a string backend, which is much more
lightweight (by a factor of several). In this case, as the content is
generated without ever looking back, SAX is the best choice.
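
To make the contrast concrete, here is a minimal sketch of what such a
forward-only writer could look like, using only the C++ standard library.
The class name StreamWriter and its methods are hypothetical, chosen for
illustration; this is not LyX code and not the Qt API. The only state kept
is the stack of open element names, so nothing resembling a tree is built.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical minimal stream writer (illustrative names, not LyX or Qt API).
class StreamWriter {
public:
    explicit StreamWriter(std::ostream & os) : os_(os) {}

    // Emit <name> right away and remember it so it can be closed later.
    void startElement(std::string const & name) {
        os_ << '<' << name << '>';
        open_.push_back(name);
    }

    // Emit character data, escaping the characters that are special in text.
    void characters(std::string const & text) {
        for (char c : text) {
            switch (c) {
            case '&': os_ << "&amp;"; break;
            case '<': os_ << "&lt;"; break;
            case '>': os_ << "&gt;"; break;
            default:  os_ << c;
            }
        }
    }

    // Close the most recently opened element.
    void endElement() {
        assert(!open_.empty());
        os_ << "</" << open_.back() << '>';
        open_.pop_back();
    }

    // Number of elements still open: the only state the writer keeps.
    std::size_t depth() const { return open_.size(); }

private:
    std::ostream & os_;
    std::vector<std::string> open_;
};

// Usage sketch: write a small DocBook-like fragment into a string.
std::string writeSample() {
    std::ostringstream out;
    StreamWriter w(out);
    w.startElement("para");
    w.characters("a < b & c");
    w.startElement("emphasis");
    w.characters("done");
    w.endElement();
    w.endElement();
    return out.str();
}
```

Attributes, namespaces and indentation are left out on purpose; the point
is only that output is produced as the calls arrive and memory use is
bounded by the nesting depth.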

You have more choices in the Java world, and the standard library is often
enough (well, the standard extensions javax and JAXP). If you need a good
XML tool, chances are it will be written in Java, especially if it's open
source (Saxon for XSLT or XQuery, eXist or MarkLogic for XML database).

On the other hand, if you want to represent a complete LyX document and
work on it, you'd rather go for DOM, as you will always have the whole
structure in memory: you may want to edit things at any point in the
document. (Unless there is never an operation on the file structures, and
only on the set of insets of the document)
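
As a rough illustration of why a DOM suits the editing case, the sketch
below keeps the whole tree in memory, changes an earlier node after later
ones have been added, and only then serialises. The Node type and helper
names are hypothetical, and the text is assumed to contain no characters
that need escaping.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-memory tree node (illustrative only).
struct Node {
    std::string name;
    std::string text;
    std::vector<std::unique_ptr<Node>> children;
};

// Append a child and return a pointer to it; nodes live on the heap, so the
// pointer stays valid when more children are added later.
Node * addChild(Node & parent, std::string name, std::string text = "") {
    auto child = std::make_unique<Node>();
    child->name = std::move(name);
    child->text = std::move(text);
    parent.children.push_back(std::move(child));
    return parent.children.back().get();
}

// Serialise the subtree rooted at n (no escaping, see the assumption above).
std::string serialize(Node const & n) {
    std::string out = "<" + n.name + ">" + n.text;
    for (auto const & c : n.children)
        out += serialize(*c);
    return out + "</" + n.name + ">";
}

// Build a document, then go back and edit a node created earlier.
std::string editThenSerialize() {
    Node doc;
    doc.name = "document";
    Node * title = addChild(doc, "title", "Draft");
    addChild(doc, "para", "Body");
    title->text = "Final";  // possible only because the tree is still in memory
    return serialize(doc);
}
```

The price is the one described above: every node stays allocated until the
document is written out.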

My recommendation, based on quite a long study of XML libraries (i.e.
several years, but quite far from full-time): either use QXmlStreamWriter
(which is mostly a SAX-style implementation in C++) or write our own.
The QXmlStreamWriter source is almost 4k lines long, but it could be
simplified substantially for our case (
https://github.com/qt/qtbase/blob/54875be84de059374920e4c0deacd13a41caaa13/src/corelib/serialization/qxmlstream.cpp).


TinyXML2 (https://github.com/leethomason/tinyxml2), pugixml (
https://github.com/zeux/pugixml), and Xerces-C++ (
https://xerces.apache.org/xerces-c/) are only DOM-based. There are quite a
few C libraries, like libxml2, that can be SAX-like, but C libraries are
horrible to use (http://www.xmlsoft.org/examples/testWriter.c).


Re: XML stream writer library

2021-01-04 Thread Richard Kimberly Heck
On 1/3/21 3:37 PM, Lorenzo Bertini wrote:
>
> Hello list,
>
> In bug 12055, discussing the merge of some MathMLStream and XmlStream
> components, we were contemplating the possibility of using an external
> library to handle XML streams, for example with indentation and tag
> insertion. One of the candidates was the QXmlStreamWriter class, but
> with the talk about removing unnecessary Qt components we thought to
> ask the list.
>
> Let us know what you think is the best course, and if you know
> of other libraries we should look at.
>
As I mentioned in the bug, I looked over various XML libraries a while
ago, when I was thinking about the long-standing idea of converting
LyX's own format to XML. There seemed to be a myriad of options, and I
never settled upon one. But it looks like there's a general feeling that
we don't want to get too married to Qt---any more than we already are.
That is in part because Qt seems to break itself fairly frequently
(especially on OSX) and partly because they keep changing their attitude
towards open source. There was something not long ago about how recent
updates would only be available to paid subscribers right away, or
something like that.

So I'd generally suggest searching around for good, well-maintained XML
libraries, maybe asking on Stack Exchange what people like. I'll send an
email to the Fedora list and see what suggestions pop up.

Riki




Re: XML Parsing Library [was Re: XML For LyX]

2013-05-12 Thread Richard Heck

On 05/11/2013 07:11 AM, Abdelrazak Younes wrote:
> On Sat, May 11, 2013 at 12:03 PM, Abdelrazak Younes wrote:
>
>> On Sat, May 11, 2013 at 8:40 AM, Pavel Sanda wrote:
>>
>>> Abdelrazak Younes wrote:
>>>> I will discuss that face2face during the meeting.
>>>
>>> You should bring mirror then, no one else in this thread is in
>>> Milano.
>>> Anyway it's too late, Richard already barricaded in underground
>>> garage of his house and won't show until 378 patches implementing
>>> xml is done as I infer from the last testament.
>>
>> I just discussed with Lars. He agrees that using Qt is a good
>> option... what a shock ! :-)
>> Vincent and JMarc don't care what we use.
>>
>> I am talking about QXmlStreamReader and (as a second step)
>> QXmlStreamWriter. Our lexer class can just use QXmlStreamReader
>> internally, we don't have to spread the use of this call all other
>> the place.
>>
>> So Richard, let's use a new "feature" repo for that. This is
>> agreed with Lars, Vincent and JMarc. Then we would create an "xml"
>> branch in that repo.
>>
>> Lars and Vincent are setting this up right now :-)
>
> So now it is set up, look (and check) at the documentation here:
>
> http://wiki.lyx.org/Devel/LyXGit
>
> I have erased all old branches because we want only feature branches
> based on "master".
>
> I just created "xml" branch in there.
>
> Richard, I guess you are still sleeping so I hope you agree with all
> that. My goal is that we collaborate on the XML support using this
> shared branch and repo.


This is all fine with me. I'll look at the feature branch business 
probably tomorrow. Busy today.


My intention was to work on writing a LyX file first, to try to 
stabilize the format, and then work on reading it.


As far as the Lexer goes, is the proposal to add some XML methods that 
will be implemented using QXmlStreamReader? If so, I'm not sure I see 
the advantage of adding them to the Lexer, as opposed to creating a new 
class for reading XML files.


Is the suggestion then also to write some sort of wrapper for 
QXmlStreamWriter rather than to use its methods directly?


While we're at this, I note that QXmlStream* wants a QIODevice on which
to operate, probably a QFile in our case. Any idea about how that should
be handled? What about zipped files?
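
For what it is worth, the wrapper idea could be sketched roughly as
follows. Everything here is hypothetical: XmlSink, StringSink and
writeTitle are illustrative names, not existing LyX classes. The calling
code depends only on a small interface; a Qt-backed implementation would
presumably forward to QXmlStreamWriter's writeStartElement(),
writeCharacters() and writeEndElement(), and the choice of device (plain
file, compressed stream) would stay hidden behind it. The stand-in backend
below just appends to a string and assumes text that needs no escaping.

```cpp
#include <string>

// Hypothetical interface: the rest of the code would depend only on this.
class XmlSink {
public:
    virtual ~XmlSink() = default;
    virtual void startElement(std::string const & name) = 0;
    virtual void characters(std::string const & text) = 0;
    virtual void endElement(std::string const & name) = 0;
};

// Stand-in backend that records output in a string (no escaping done here).
class StringSink : public XmlSink {
public:
    void startElement(std::string const & name) override { buf_ += "<" + name + ">"; }
    void characters(std::string const & text) override { buf_ += text; }
    void endElement(std::string const & name) override { buf_ += "</" + name + ">"; }
    std::string const & str() const { return buf_; }
private:
    std::string buf_;
};

// Caller code sees only XmlSink, never the concrete backend.
void writeTitle(XmlSink & out, std::string const & title) {
    out.startElement("title");
    out.characters(title);
    out.endElement("title");
}

// Usage sketch with the stand-in backend.
std::string demoTitle() {
    StringSink sink;
    writeTitle(sink, "Hello");
    return sink.str();
}
```

Whether the extra layer is worth it depends on how likely the backend is
to change; this sketch takes no position on that.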


Richard



Re: XML Parsing Library [was Re: XML For LyX]

2013-05-11 Thread Abdelrazak Younes
Guys

Qt has a nice XML reader.
Also one with a nice stream-like interface that would fit nicely in our
parser. And it is quite fast too.

I will discuss that face2face during the meeting.

Abdel
On May 9, 2013 11:26 PM, "Richard Heck"  wrote:

> On 05/09/2013 02:25 PM, Pavel Sanda wrote:
>
>> Richard Heck wrote:
>>
>>> On Linux, of course, it is different. One would just expect this library
>>> already to be installed. But things do not work that way on the other
>>> OSs.
>>>
>> I believe we should actually _include_ some lightweight library in our
>> sources, so that it is fixed and we are not exposed to versioning problems
>> or to availability issues on various architectures.
>>
>
> I had the same thought.
>
> Richard
>
>


Re: XML Parsing Library [was Re: XML For LyX]

2013-05-11 Thread Pavel Sanda
Abdelrazak Younes wrote:
> I will discuss that face2face during the meeting.

You should bring a mirror then; no one else in this thread is in Milano.
Anyway, it's too late: Richard has already barricaded himself in the
underground garage of his house and won't show up until 378 patches
implementing XML are done, as I infer from the last testament.

Pavel


Re: XML Parsing Library [was Re: XML For LyX]

2013-05-11 Thread Andrew Parsloe

On 11/05/2013 6:40 p.m., Pavel Sanda wrote:
> Abdelrazak Younes wrote:
>> I will discuss that face2face during the meeting.
>
> You should bring mirror then, no one else in this thread is in Milano.
> Anyway it's too late, Richard already barricaded in underground garage of his
> house and won't show until 378 patches implementing xml is done as I infer
> from the last testament.
>
> Pavel

It is absolutely none of my business, but I do enjoy my daily 'fix' 
eavesdropping on developer messages.


Andrew


Re: XML Parsing Library [was Re: XML For LyX]

2013-05-11 Thread Abdelrazak Younes
On Sat, May 11, 2013 at 8:40 AM, Pavel Sanda  wrote:

> Abdelrazak Younes wrote:
> > I will discuss that face2face during the meeting.
>
> You should bring mirror then, no one else in this thread is in Milano.
> Anyway it's too late, Richard already barricaded in underground garage
> of his house and won't show until 378 patches implementing xml is done
> as I infer from the last testament.
>

I just discussed this with Lars. He agrees that using Qt is a good option...
what a shock! :-)
Vincent and JMarc don't care what we use.

I am talking about QXmlStreamReader and (as a second step)
QXmlStreamWriter. Our lexer class can just use QXmlStreamReader internally;
we don't have to spread the use of this class all over the place.

So Richard, let's use a new "feature" repo for that. This is agreed with
Lars, Vincent and JMarc. Then we would create an "xml" branch in that repo.

Lars and Vincent are setting this up right now :-)

Abdel.


Re: XML Parsing Library [was Re: XML For LyX]

2013-05-11 Thread Abdelrazak Younes
On Sat, May 11, 2013 at 11:11 AM, Andrew Parsloe wrote:

> It is absolutely none of my business, but I do enjoy my daily 'fix'
> eavesdropping on developer messages.

The live show is even more entertaining :-)

Alessandro, do you like it?

Abdel.


Re: XML Library Question Answered?

2013-05-11 Thread Abdelrazak Younes
Hum I just saw this thread...
So we all agree, that's good.

Abdel.


On Fri, May 10, 2013 at 10:09 PM, Nico Williams wrote:

> On Fri, May 10, 2013 at 12:45 PM, Richard Heck  wrote:
> > The only significant worry here concerns stability: Could a Qt update
> > break us? We already depend heavily on Qt, so this is not as large a
> > concern as with depending upon other external libraries. And my sense
> > is that these classes are likely to be pretty stable.
>
> QXmlStreamReader looks perfect for parsing LyX XML.  QXmlStreamWriter
> looks perfect for writing it.
>
> I seriously doubt these will be unstable.  Note that writing [valid]
> XML is much easier than reading it, so you could write your own stream
> writer.  Reading actually isn't that hard either -- it helps to not
> support external entities (and, indeed, QXmlStreamReader doesn't).
> But really, XML itself is stable, and these libraries look
> well-matched up to XML (as one would expect), so I see no reason for
> there to be backwards incompatible API/ABI/semantic changes to them.
> Removal, OTOH, is much harder to foresee, but in that case you can
> just write your own streamers.
>
> Nico
> --
>


Re: XML Parsing Library [was Re: XML For LyX]

2013-05-11 Thread Abdelrazak Younes
On Sat, May 11, 2013 at 12:03 PM, Abdelrazak Younes  wrote:

> On Sat, May 11, 2013 at 8:40 AM, Pavel Sanda  wrote:
>
>> Abdelrazak Younes wrote:
>> > I will discuss that face2face during the meeting.
>>
>> You should bring mirror then, no one else in this thread is in Milano.
>> Anyway it's too late, Richard already barricaded in underground garage
>> of his house and won't show until 378 patches implementing xml is done
>> as I infer from the last testament.
>>
>
> I just discussed with Lars. He agrees that using Qt is a good option...
> what a shock ! :-)
> Vincent and JMarc don't care what we use.
>
> I am talking about QXmlStreamReader and (as a second step)
> QXmlStreamWriter. Our lexer class can just use QXmlStreamReader internally,
> we don't have to spread the use of this call all other the place.
>
> So Richard, let's use a new "feature" repo for that. This is agreed with
> Lars, Vincent and JMarc. Then we would create an "xml" branch in that repo.
>
> Lars and Vincent are setting this up right now :-)
>

So now it is set up, look (and check) at the documentation here:

http://wiki.lyx.org/Devel/LyXGit

I have erased all old branches because we want only feature branches based
on "master".

I just created "xml" branch in there.

Richard, I guess you are still sleeping, so I hope you agree with all that.
My goal is that we collaborate on the XML support using this shared branch
and repo.

Abdel


Re: XML Parsing Library [was Re: XML For LyX]

2013-05-10 Thread Gökçen Eraslan

On 09-05-2013 18:52, Richard Heck wrote:
> On 05/08/2013 06:24 PM, José Matos wrote:
>> On Wednesday 08 May 2013 17:43:41 Richard Heck wrote:
>>> Thinking ahead, however: Should we use some SAX library to read the
>>> XML? Or should we just adapt the Lexer for this purpose?
>>>
>>> Richard
>>
>> Lars had that working for a previous version of lyx with lexer. His
>> branches are still available in git, I think...
>
> I just had a look at those. He had an XML parser here:
> http://www.lyx.org/trac/browser/lyxsvn/lyx-devel/branches/personal/larsbj/xml/src/support/xmlparser.h?rev=19478

There is also iksemel:

https://code.google.com/p/iksemel/

Very nice, lightweight, portable XML library. It also has Python bindings.

> but it appears to be based upon xmlpp, which I cannot get to compile on
> my machine. It's a very old library. An older version uses expat, which
> is pretty heavy duty.
>
> I did some googling and found this page:
> http://lars.ruoff.free.fr/xmlcpp/
> which describes a bunch of free XML libraries and was updated 2/2012.
> Most of what's there is either (a) very large, like Xerces and libxml2,
> or else (b) a DOM-style parser, which is not what we want, I think. The
> best of the options appears to be:
> http://www.fxtech.com/xmlio/
> which is a very lightweight (53KB source) and simple, SAX-like parser.
> LGPL. It is also quite old, but it compiles just fine here. Of course,
> it also writes XML.
>
> It could probably use some updating if we were going to use it, but the
> code is very simple, so this would be easy to do.
>
> Richard




--
Gökcen Eraslan


Re: XML Parsing Library [was Re: XML For LyX]

2013-05-10 Thread Richard Heck

On 05/10/2013 04:46 AM, José Matos wrote:
> On Thursday 09 May 2013 14:21:37 Richard Heck wrote:
>> On Linux, of course, it is different. One would just expect this library
>> already to be installed. But things do not work that way on the other OSs.
>>
>> Richard
>
> From the webpage:
> "Libxml2 is known to be very portable, the library should build and work
> without serious troubles on a variety of systems (Linux, Unix, Windows,
> CygWin, MacOS, MacOS X, RISC Os, OS/2, VMS, QNX, MVS, VxWorks, ...)"
>
> Or are you thinking about any other system that is not included in this
> list? :-)

No, I just meant that we would have to include it in our sources, since
we cannot rely upon libxml being available on actual machines that are
running non-Linux OSs. It's available for those OSs, yes, but it's not
actually going to be installed, unlike on Linux, where it either is
installed or else we can rely upon package managers to handle the
dependency.

> libxml is a proven package that due to its license is widely used, it is
> easy to install and it has a very good record in terms of stability.
>
> What would be the disadvantage of relying on it? I mean what are concerns
> about depending on it?

As Pavel said, if we are using some XML library to read and write LyX
files, then we have to be especially careful that LyX does not get
broken by some external change over which we have no control. (I'll add
to his mention of Python the whole mess over losing fork() on OSX.) This
is yet another reason we need whatever library we use to be included in
our code. And then the concern about libxml is the one I mentioned
previously: Between libxml and libxml++, we're looking at 50+MB of code.
More than all of LyX. That'd be worth it if we were actually going to
use much of what libxml provides. But I doubt that is true.


Richard



Re: Re: Re: XML Parsing Library [was Re: XML For LyX]

2013-05-10 Thread José Matos
On Friday 10 May 2013 02:19:40 Pavel Sanda wrote:
> But jokes aside, you have to rely on arbitrary decision of third party which
> can do whatever is pleased to do so in new versions, if some problem arises
> you can't stick to version known to work, because the other guys have the
> library on command on you linux distro.

libxml is quite stable, so this is not an issue, and I don't see this
changing in new versions.

> I'm not theoretizing, both this is relatively fresh experience with packaging
> problems for 10 different versions of python and impossibility to keep support
> for svn 1.6 (forced bump to 1.7).

As I have said before, the first version of Python 3 was released 5 years
ago, and it is possible to write and use code that is compatible with both
Python 2 and Python 3.

I understand that this is just a single data point in our argument. The point
here was that this was not unexpected in any meaningful sense. :-)

> Any bugs here would be dataloss, so we want to have it under our control as
> much as possible, not just external dep.

That is why I said that libxml has a proven record: this is more than an
external dependency, this is the reference external dependency.

> Pavel

FWIW I am not trying to argue for the sake of it. There are always benefits
and drawbacks with any choice, and I am just trying to stress that not all
cases are equal.

I should also say that I will be glad to go with any path we choose. It is in
cases like this that the shortcomings of email as a communication channel are
most visible. :-)
 
-- 
José Abílio


Re: XML Parsing Library [was Re: XML For LyX]

2013-05-10 Thread Georg Baum
Richard Heck wrote:

> No, I just meant that we would have to include it in our sources, since
> we cannot rely upon libxml to be available on actual machines that are
> running non-Linux OSs. It's available for that OS, yes, but it's not
> actually going to be installed. Unlike on Linux, where it either is
> installed or else we can rely upon package managers to handle the
> dependency.

With the same reasoning you could conclude that we need to ship Qt within
the sources. If there is a bug at the right place in Qt, you can get all
sorts of problems, including severe data loss, as well. But we don't include
Qt, since experience tells us that such bugs do not happen in the versions
included in Linux distros.

I don't see any problem in relying upon libxml2 as an external dependency. I
have had very good experiences with it, and it is widely used. For Windows,
it would just be included in the requirements bundle, and for Linux and OS X
it is safe to rely on the system versions. I agree that including it in the
sources is not an option because of the size. I don't know about the C++
bindings; if these are not as reliable as libxml2, it might be worth it to
write some simple ones and include them in the LyX source code.


Georg



Re: XML Parsing Library [was Re: XML For LyX]

2013-05-10 Thread Pavel Sanda
Georg Baum wrote:
> With the same reasoning you could conclude that we need to ship Qt within
> the sources. If there is a bug at the right place in Qt, you can get all
> sorts of problems including severe data loss as well. But, we don't include

Well, indirectly yes, but still, we don't use Qt for reading/writing files.

> Qt, since experience tells that such bugs do not happen in the versions
> included in linux distros.

I can clearly remember how we needed to bump to a new LyX version because a
new Qt came out and severely damaged the user experience with new crashes in
the UI.

Pavel


Re: XML Library Question Answered?

2013-05-10 Thread Nico Williams
On Fri, May 10, 2013 at 12:45 PM, Richard Heck rgh...@lyx.org wrote:
> The only significant worry here concerns stability: Could a Qt update break
> us? We already depend heavily on Qt, so this is not as large a concern as
> with depending upon other external libraries. And my sense is that these
> classes are likely to be pretty stable.

QXmlStreamReader looks perfect for parsing LyX XML.  QXmlStreamWriter
looks perfect for writing it.

I seriously doubt these will be unstable.  Note that writing [valid]
XML is much easier than reading it, so you could write your own stream
writer.  Reading actually isn't that hard either -- it helps to not
support external entities (and, indeed, QXmlStreamReader doesn't).
But really, XML itself is stable, and these libraries look
well-matched up to XML (as one would expect), so I see no reason for
there to be backwards incompatible API/ABI/semantic changes to them.
Removal, OTOH, is much harder to foresee, but in that case you can
just write your own streamers.
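
To give a sense of scale for the writing side: most of what a hand-written
stream writer has to get right is escaping. A minimal sketch, with
hypothetical helper names and assuming attribute values are always written
in double quotes, could look like this:

```cpp
#include <string>

// Hypothetical helper: escape a string for use as XML character data or,
// when inAttribute is true, inside a double-quoted attribute value.
std::string escapeXml(std::string const & in, bool inAttribute) {
    std::string out;
    out.reserve(in.size());
    for (char c : in) {
        switch (c) {
        case '&': out += "&amp;"; break;
        case '<': out += "&lt;"; break;
        case '>': out += "&gt;"; break;
        case '"':
            // A double quote only needs escaping inside an attribute value.
            if (inAttribute) out += "&quot;";
            else out += c;
            break;
        default: out += c;
        }
    }
    return out;
}

// Build one attribute, e.g. name="value", with the value escaped.
std::string attribute(std::string const & name, std::string const & value) {
    return name + "=\"" + escapeXml(value, true) + "\"";
}
```

This does not check that element and attribute names are legal XML names,
nor that the input contains only characters allowed in XML; a real writer
would need both.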

Nico
--


Re: Re: XML Parsing Library [was Re: XML For LyX]

2013-05-10 Thread José Matos
On Thursday 09 May 2013 14:21:37 Richard Heck wrote:
> On Linux, of course, it is different. One would just expect this library 
> already to be installed. But things do not work that way on the other OSs.
> 
> Richard

From the webpage:
"Libxml2 is known to be very portable, the library should build and work 
without serious troubles on a variety of systems (Linux, Unix, Windows, CygWin, 
MacOS, MacOS X, RISC Os, OS/2, VMS, QNX, MVS, VxWorks, ...)"

Or are you thinking about any other system that is not included in this list? 
:-)

libxml is a proven package that due to its license is widely used; it is easy 
to install and it has a very good record in terms of stability.

What would be the disadvantage of relying on it? I mean, what are the concerns 
about depending on it?

Regards,
-- 
José Abílio


Re: Re: XML Parsing Library [was Re: XML For LyX]

2013-05-10 Thread Pavel Sanda
José Matos wrote:
> Or are you thinking about any other system that is not included in this list? 
> :-)

I don't see Haiku, where we currently compile ;)

> What would be the disadvantage of relying on it? I mean what are concerns 
> about depending on it?

But jokes aside, you have to rely on the arbitrary decisions of a third party,
which can do whatever it pleases in new versions. If some problem arises, you
can't stick to a version known to work, because other people control which
version of the library ships in your Linux distro.

I'm not theorizing; this is relatively fresh experience, both with packaging
problems for 10 different versions of Python and with the impossibility of
keeping support for svn 1.6 (forced bump to 1.7).

Any bug here would mean data loss, so we want to have it under our control as
much as possible, not just as an external dependency.

Pavel


Re: XML Parsing Library [was Re: XML For LyX]

2013-05-10 Thread Gökçen Eraslan

On 09-05-2013 18:52, Richard Heck wrote:
> On 05/08/2013 06:24 PM, José Matos wrote:
>> On Wednesday 08 May 2013 17:43:41 Richard Heck wrote:
>>> Thinking ahead, however: Should we use some SAX library to read the
>>> XML? Or should we just adapt the Lexer for this purpose?
>>>
>>> Richard
>>
>> Lars had that working for a previous version of lyx with lexer. His
>> branches are still available in git, I think...
>
> I just had a look at those. He had an XML parser here:
> http://www.lyx.org/trac/browser/lyxsvn/lyx-devel/branches/personal/larsbj/xml/src/support/xmlparser.h?rev=19478

There is also iksemel:

https://code.google.com/p/iksemel/

Very nice, lightweight, portable XML library. It also has Python bindings.

> but it appears to be based upon xmlpp, which I cannot get to compile on
> my machine. It's a very old library. An older version uses expat, which
> is pretty heavy duty.
>
> I did some googling and found this page:
> http://lars.ruoff.free.fr/xmlcpp/
> which describes a bunch of free XML libraries and was updated 2/2012.
> Most of what's there is either (a) very large, like Xerces and libxml2,
> or else (b) a DOM-style parser, which is not what we want, I think. The
> best of the options appears to be:
> http://www.fxtech.com/xmlio/
> which is a very lightweight (53KB source) and simple, SAX-like parser.
> LGPL. It is also quite old, but it compiles just fine here. Of course,
> it also writes XML.
>
> It could probably use some updating if we were going to use it, but the
> code is very simple, so this would be easy to do.
>
> Richard




--
Gökcen Eraslan


Re: XML Parsing Library [was Re: XML For LyX]

2013-05-10 Thread Richard Heck

On 05/10/2013 04:46 AM, José Matos wrote:
> On Thursday 09 May 2013 14:21:37 Richard Heck wrote:
>> On Linux, of course, it is different. One would just expect this library
>> already to be installed. But things do not work that way on the other OSs.
>>
>> Richard
>
> From the webpage:
> "Libxml2 is known to be very portable, the library should build and work
> without serious troubles on a variety of systems (Linux, Unix, Windows,
> CygWin, MacOS, MacOS X, RISC Os, OS/2, VMS, QNX, MVS, VxWorks, ...)"
>
> Or are you thinking about any other system that is not included in this list?
> :-)


No, I just meant that we would have to include it in our sources, since 
we cannot rely upon libxml to be available on actual machines that are 
running non-Linux OSs. It's available for those OSs, yes, but it's not 
actually going to be installed. Unlike on Linux, where it either is 
installed or else we can rely upon package managers to handle the 
dependency.



> libxml is a proven package that due to its license is widely used, it is easy
> to install and it has a very record in terms of stability.
>
> What would be the disadvantage of relying on it? I mean what are concerns about
> depending on it?


As Pavel said, if we are using some XML library to read and write LyX 
files, then we have to be especially careful that LyX does not get 
broken by some external change over which we have no control. (I'll add 
to his mention of python the whole mess over losing fork() on OSX.) This 
is yet another reason we need whatever library we use to be included in 
our code. And then the concern about libxml is the one I mentioned 
previously: Between libxml and libxml++, we're looking at 50+MB of code. 
More than all of LyX. That'd be worth it if we were actually going to 
use much of what libxml provides. But I doubt that is true.


Richard



Re: Re: Re: XML Parsing Library [was Re: XML For LyX]

2013-05-10 Thread José Matos
On Friday 10 May 2013 02:19:40 Pavel Sanda wrote:
> But jokes aside, you have to rely on arbitrary decision of third party which
> can do whatever is pleased to do so in new versions, if some problem arises
> you can't stick to version known to work, because the other guys have the
> library on command on you linux distro.

libxml is quite stable, so this is not an issue, and I don't see this changing 
in new versions.

> I'm not theoretizing, both this is relatively fresh experience with packaging
> problems for 10 different versions of python and impossibility to keep support
> for svn 1.6 (forced bump to 1.7).

As I have said before, the first version of Python 3 was released 5 years 
ago, and it is possible to write and use code that is compatible with both 
Python 2 and Python 3.

I understand that this is just a single data point in our argument. The point 
here was that this was not unexpected in any meaningful sense. :-)

> Any bugs here would be dataloss, so we want to have it under our control as
> much as possible, not just external dep.

That is why I said that libxml has a proven record; this is more than an 
external dependency, this is the reference external dependency.

> Pavel

FWIW I am not trying to argue for the sake of it. There are always benefits and 
drawbacks with any choice, and I am just trying to stress that not all cases are 
equal.

I should also say that I will be glad to go with any path we choose. It is in 
cases like this where the shortcomings of email as a communication channel are 
more visible. :-)
 
-- 
José Abílio


Re: XML Parsing Library [was Re: XML For LyX]

2013-05-10 Thread Georg Baum
Richard Heck wrote:

> No, I just meant that we would have to include it in our sources, since
> we cannot rely upon libxml to be available on actual machines that are
> running non-Linux OSs. It's available for that OS, yes, but it's not
> actually going to be installed. Unlike on Linux, where it either is
> installed or else we can rely upon package managers to handle the
> dependency.

With the same reasoning you could conclude that we need to ship Qt within 
the sources. If there is a bug at the right place in Qt, you can get all 
sorts of problems including severe data loss as well. But, we don't include 
Qt, since experience tells that such bugs do not happen in the versions 
included in linux distros.

I don't see any problem in relying upon libxml2 as an external dependency. I 
have had very good experiences with it, and it is widely used. For Windows, 
it would just be included in the requirements bundle, and for Linux and OS X 
it is safe to rely on the system versions. I agree that including it in the 
sources is not an option because of the size. I don't know about the C++ 
bindings; if these are not as reliable as libxml2, it might be worth it to 
write some simple ones and include them in the LyX source code.


Georg



Re: XML Parsing Library [was Re: XML For LyX]

2013-05-10 Thread Pavel Sanda
Georg Baum wrote:
> With the same reasoning you could conclude that we need to ship Qt within 
> the sources. If there is a bug at the right place in Qt, you can get all 
> sorts of problems including severe data loss as well. But, we don't include 

Well, indirectly yes, but still, we don't use Qt for reading/writing files.

> Qt, since experience tells that such bugs do not happen in the versions 
> included in linux distros.

I can clearly remember how we needed to bump a new LyX version because a new Qt
came out and severely damaged the user experience due to new crashes in the UI.

Pavel


Re: XML Library Question Answered?

2013-05-10 Thread Nico Williams
On Fri, May 10, 2013 at 12:45 PM, Richard Heck  wrote:
> The only significant worry here concerns stability: Could a Qt update break
> us? We already depend heavily on Qt, so this is not as large a concern as
> with depending upon other external libraries. And my sense is that these
> classes are likely to be pretty stable.

QXmlStreamReader looks perfect for parsing LyX XML.  QXmlStreamWriter
looks perfect for writing it.

I seriously doubt these will be unstable.  Note that writing [valid]
XML is much easier than reading it, so you could write your own stream
writer.  Reading actually isn't that hard either -- it helps to not
support external entities (and, indeed, QXmlStreamReader doesn't).
But really, XML itself is stable, and these libraries look
well-matched up to XML (as one would expect), so I see no reason for
there to be backwards incompatible API/ABI/semantic changes to them.
Removal, OTOH, is much harder to foresee, but in that case you can
just write your own streamers.
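
To make the "write your own stream writer" point concrete, here is a minimal
sketch in Python. It is only an illustration under assumptions: the class name
XmlStreamWriter and the element names are made up, it is not LyX or Qt code,
and it only guarantees well-formed output (escaping plus balanced tags), not
validity against any schema.

```python
import io
from xml.sax.saxutils import escape, quoteattr


class XmlStreamWriter:
    """Tiny streaming XML writer (hypothetical, for illustration only)."""

    def __init__(self, out):
        self.out = out
        self.stack = []  # names of the elements that are currently open

    def start(self, name, **attrs):
        # quoteattr() escapes the value and adds the surrounding quotes
        attr_text = "".join(f" {k}={quoteattr(str(v))}" for k, v in attrs.items())
        self.out.write(f"<{name}{attr_text}>")
        self.stack.append(name)

    def text(self, data):
        # escape() replaces &, < and > in character data
        self.out.write(escape(data))

    def end(self):
        # always closes the innermost open element, so tags stay balanced
        self.out.write(f"</{self.stack.pop()}>")

    def close(self):
        while self.stack:
            self.end()


buf = io.StringIO()
writer = XmlStreamWriter(buf)
writer.start("document", version="1")
writer.start("paragraph")
writer.text("a < b & c")
writer.close()
xml_text = buf.getvalue()
# xml_text == '<document version="1"><paragraph>a &lt; b &amp; c</paragraph></document>'
```

Keeping a stack of open element names is what makes it hard to emit unbalanced
tags; everything else is string escaping.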

Nico
--


Re: XML For LyX

2013-05-09 Thread Alex Vergara Gil
> 
> I have started to think seriously about moving to XML for LyX's native 
> file format. I doubt that we will want to do this for 2.1, as it is too 
> late, really, so I am thinking about doing it for some time early in the 
> 2.2 cycle, which means starting now.
> 
First of all, this is a very old feature request that will be greatly 
appreciated, at least on my part! So if you manage to achieve this, it will be 
a huge improvement, but just a starting point for the rest of the things that 
can be done with a native XML format. There is a thread about this named "Would a 
native LyX XML schema be accepted?".

> My plan is first to write routines that will output a pure XML version 
> of a LyX document and then to worry about the read routines once that is 
> working. I think it will be fairly easy to get that much done, by 
> working off the XHTML stuff. Some of that will prove re-usable.
> 
> Thinking ahead, however: Should we use some SAX library to read the XML? 
> Or should we just adapt the Lexer for this purpose?
> 
> Richard
> 

I think much work has already been done in this direction; please read Nico 
Williams' approach. I think it is the correct way to go.

My 5c

Alex

XML Parsing Library [was Re: XML For LyX]

2013-05-09 Thread Richard Heck

On 05/08/2013 06:24 PM, José Matos wrote:
> On Wednesday 08 May 2013 17:43:41 Richard Heck wrote:
>> Thinking ahead, however: Should we use some SAX library to read the XML? Or
>> should we just adapt the Lexer for this purpose?
>>
>> Richard
>
> Lars had that working for a previous version of lyx with lexer. His branches
> are still available in git, I think...


I just had a look at those. He had an XML parser here:
http://www.lyx.org/trac/browser/lyxsvn/lyx-devel/branches/personal/larsbj/xml/src/support/xmlparser.h?rev=19478
but it appears to be based upon xmlpp, which I cannot get to compile on 
my machine. It's a very old library. An older version uses expat, which 
is pretty heavy duty.


I did some googling and found this page:
http://lars.ruoff.free.fr/xmlcpp/
which describes a bunch of free XML libraries and was updated 2/2012. 
Most of what's there is either (a) very large, like Xerces and libxml2, 
or else (b) a DOM-style parser, which is not what we want, I think. The 
best of the options appears to be:

http://www.fxtech.com/xmlio/
which is a very lightweight (53KB source) and simple, SAX-like parser. 
LGPL. It is also quite old, but it compiles just fine here. Of course, 
it also writes XML.


It could probably use some updating if we were going to use it, but the 
code is very simple, so this would be easy to do.
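
For anyone unfamiliar with the distinction: a SAX-like parser reports elements
as events while it reads, instead of building a whole tree as a DOM parser
does. A minimal sketch of that style, using Python's standard xml.sax module
rather than xmlio (the handler class and the element names are invented for
illustration):

```python
import xml.sax


class EventCollector(xml.sax.ContentHandler):
    """Records parse events in order; a real reader would build LyX
    structures here instead of appending to a list."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name, dict(attrs.items())))

    def characters(self, content):
        # the parser may deliver text in several chunks; skip pure whitespace
        if content.strip():
            self.events.append(("text", content))

    def endElement(self, name):
        self.events.append(("end", name))


handler = EventCollector()
xml.sax.parseString(
    b'<document><paragraph align="left">Hello</paragraph></document>',
    handler,
)
events = handler.events
```

Nothing is kept in memory beyond what the handler chooses to store, which is
the property that makes this style attractive for reading large documents.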


Richard



Re: XML Parsing Library [was Re: XML For LyX]

2013-05-09 Thread Rob Oakes



On Thu, May 9, 2013 at 10:52 AM, Richard Heck wrote:
> I just had a look at those. He had an XML parser here:
> http://www.lyx.org/trac/browser/lyxsvn/lyx-devel/branches/personal/larsbj/xml/src/support/xmlparser.h?rev=19478
> but it appears to be based upon xmlpp, which I cannot get to compile
> on my machine. It's a very old library. An older version uses expat,
> which is pretty heavy duty.
>
> I did some googling and found this page:
> http://lars.ruoff.free.fr/xmlcpp/
> which describes a bunch of free XML libraries and was updated 2/2012.
> Most of what's there is either (a) very large, like Xerces and
> libxml2, or else (b) a DOM-style parser, which is not what we want,
> I think. The best of the options appears to be:
>
> http://www.fxtech.com/xmlio/
> which is a very lightweight (53KB source) and simple, SAX-like
> parser. LGPL. It is also quite old, but it compiles just fine here.
> Of course, it also writes XML.
>
> It could probably use some updating if we were going to use it, but
> the code is very simple, so this would be easy to do.


Is there a reason we would want to avoid libxml? I've found it to offer 
the best feature set and ease of use. It also ships with a set of 
excellent Python bindings, which we could incorporate into the Python 
we ship. Between the two, there is very little that wouldn't be 
possible from an XML processing standpoint.


We might even be able to incorporate some of the XSL processing that 
some of the users have been salivating over.


Re: XML Parsing Library [was Re: XML For LyX]

2013-05-09 Thread Richard Heck

On 05/09/2013 01:39 PM, Rob Oakes wrote:
> On Thu, May 9, 2013 at 10:52 AM, Richard Heck wrote:
>> I just had a look at those. He had an XML parser here:
>> http://www.lyx.org/trac/browser/lyxsvn/lyx-devel/branches/personal/larsbj/xml/src/support/xmlparser.h?rev=19478
>> but it appears to be based upon xmlpp, which I cannot get to compile
>> on my machine. It's a very old library. An older version uses expat,
>> which is pretty heavy duty. I did some googling and found this page:
>> http://lars.ruoff.free.fr/xmlcpp/ which describes a bunch of free XML
>> libraries and was updated 2/2012. Most of what's there is either (a)
>> very large, like Xerces and libxml2, or else (b) a DOM-style parser,
>> which is not what we want, I think. The best of the options appears
>> to be: http://www.fxtech.com/xmlio/ which is a very lightweight (53KB
>> source) and simple, SAX-like parser. LGPL. It is also quite old, but
>> it compiles just fine here. Of course, it also writes XML. It could
>> probably use some updating if we were going to use it, but the code
>> is very simple, so this would be easy to do.
>
> Is there a reason we would want to avoid libxml? I've found it to
> offer the best feature set and ease of use. It also ships with a set
> of excellent Python bindings, which we could incorporate into the
> Python we ship. Between the two, there is very little that wouldn't be
> possible from an XML processing standpoint.


The libxml2 sources, unzipped, are 45MB. The C++ bindings, in libxml++, 
are another 7.1MB. That's my main worry. The entire LyX src/ directory 
is only 11MB. Something that powerful also feels a bit like overkill for 
what we will be doing.


On Linux, of course, it is different. One would just expect this library 
already to be installed. But things do not work that way on the other OSs.


Richard



Re: XML Parsing Library [was Re: XML For LyX]

2013-05-09 Thread Pavel Sanda
Richard Heck wrote:
> On Linux, of course, it is different. One would just expect this library 
> already to be installed. But things do not work that way on the other OSs.

I believe we should actually _include_ some lightweight library in our sources,
so that it is fixed and we do not depend on any versioning problems or on its
availability on various architectures.

Pavel


Re: XML For LyX

2013-05-09 Thread Nico Williams
On Thu, May 9, 2013 at 8:21 AM, Alex Vergara Gil  wrote:
> First of all, This is a very old feature request that will be greatly
> appreciated at least from my part!

Me too.

> I think there are much work done in this sense, please read Nico Williams'
> approach. I think is the correct way to follow.

He has.  The consensus is that XML support needs to be native.

That doesn't settle other issues, like: how much should LyX internals
change to accommodate XML.  The approach I'd take would be to adjust
the document on output to match the strict containership requirements
of XML, just like my script does.  This is minimally invasive.


Re: XML Parsing Library [was Re: XML For LyX]

2013-05-09 Thread Richard Heck

On 05/09/2013 02:25 PM, Pavel Sanda wrote:
> Richard Heck wrote:
>> On Linux, of course, it is different. One would just expect this library
>> already to be installed. But things do not work that way on the other OSs.
>
> I believe we should actually _include_ some lightweight library in our sources
> so it is fixed and we do not rely in any versioning problem or availability on
> various architectures.


I had the same thought.

Richard



Re: XML For LyX

2013-05-09 Thread Richard Heck

On 05/09/2013 03:57 PM, Nico Williams wrote:
> On Thu, May 9, 2013 at 8:21 AM, Alex Vergara Gil wrote:
>> First of all, This is a very old feature request that will be greatly
>> appreciated at least from my part!
>
> Me too.
>
>> I think there are much work done in this sense, please read Nico Williams'
>> approach. I think is the correct way to follow.
>
> He has.  The consensus is that XML support needs to be native.
>
> That doesn't settle other issues, like: how much should LyX internals
> change to accommodate XML.  The approach I'd take would be to adjust
> the document on output to match the strict containership requirements
> of XML, just like my script does.  This is minimally invasive.


The LyX document is internally a (very complex) tree structure, so I 
think this is pretty simple. As José mentioned, Lars had the write side 
of it pretty much done a long time ago. My sense is that it was so long 
ago that it would be as much work to adapt what he did as to rewrite 
it, so I propose to do the latter.


Richard



Re: XML For LyX

2013-05-09 Thread Nico Williams
On Thu, May 9, 2013 at 4:27 PM, Richard Heck  wrote:
> The LyX document is internally a (very complex) tree structure, so I think
> this is pretty simple. As Jose mentioned, Lars has the write side of it
> pretty much done a long time ago. My sense is that it was so long ago that
> it would be as much work to adapt what he did as to re-write it, so I
> propose to do the latter.

I agree.  I wrote my own Python XML output class for my script.  That
was quite easy.  Most of the logic in my script is about fixing things
that need to be fixed, like rewriting a sequence of \series tokens and
text so that they have proper containership.
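
As a rough illustration of what "proper containership" involves (this is a
sketch, not the actual lyx2xml code; the token tuples and tag names are
invented): font switches can overlap in the flat token stream, but XML
elements must nest, so when a switch is turned off while a later one is still
open, the later one has to be closed and then reopened.

```python
from xml.sax.saxutils import escape


def nest(tokens):
    """Turn a flat stream of ("open", tag), ("close", tag) and ("text", s)
    tokens into properly nested markup."""
    out, stack = [], []
    for kind, value in tokens:
        if kind == "text":
            out.append(escape(value))
        elif kind == "open":
            stack.append(value)
            out.append(f"<{value}>")
        elif kind == "close" and value in stack:
            reopen = []
            # close everything opened after `value`, innermost first
            while stack[-1] != value:
                tag = stack.pop()
                out.append(f"</{tag}>")
                reopen.append(tag)
            stack.pop()
            out.append(f"</{value}>")
            # reopen the interrupted elements in their original order
            for tag in reversed(reopen):
                stack.append(tag)
                out.append(f"<{tag}>")
    while stack:  # close whatever is still open at the end
        out.append(f"</{stack.pop()}>")
    return "".join(out)


# bold starts, emphasis starts, bold ends while emphasis is still on
tokens = [
    ("open", "bold"), ("text", "one "),
    ("open", "emph"), ("text", "two "),
    ("close", "bold"), ("text", "three"),
    ("close", "emph"),
]
result = nest(tokens)
# result == "<bold>one <emph>two </emph></bold><emph>three</emph>"
```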

ALSO, I used three XML namespaces for the various sorts of "elements"
that LyX uses; this seemed quite natural.  You should definitely look
at the output of my lyx2xml and see if that works for you; if not I'd
love to hear what you'd do instead w.r.t. tag names and namespaces.
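
To show the general shape of namespaced output, here is a small sketch with
Python's xml.etree.ElementTree. The three namespace URIs, the prefixes and the
element names below are placeholders invented for this example; they are not
the ones lyx2xml actually uses.

```python
import xml.etree.ElementTree as ET

# Placeholder namespaces, one per kind of element (assumed, not lyx2xml's)
NS_DOC = "urn:example:lyx:document"
NS_LAYOUT = "urn:example:lyx:layout"
NS_INSET = "urn:example:lyx:inset"

for prefix, uri in (("doc", NS_DOC), ("layout", NS_LAYOUT), ("inset", NS_INSET)):
    ET.register_namespace(prefix, uri)

# ElementTree names namespaced elements as "{uri}localname"
root = ET.Element(f"{{{NS_DOC}}}document")
para = ET.SubElement(root, f"{{{NS_LAYOUT}}}Standard")
para.text = "Some text "
note = ET.SubElement(para, f"{{{NS_INSET}}}Note")
note.text = "with a note"

xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

Separate namespaces keep, say, a layout called Note from colliding with an
inset called Note, at the cost of more verbose tag names.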

Nico
--


Re: XML For LyX

2013-05-09 Thread Nico Williams
I should add that while *writing* XML is easy enough (valid XML too),
it's reading that's hard, so you can't avoid using a library.


Re: XML For LyX

2013-05-08 Thread José Matos
On Wednesday 08 May 2013 17:43:41 Richard Heck wrote:
> Thinking ahead, however: Should we use some SAX library to read the XML? 
> Or should we just adapt the Lexer for this purpose?
> 
> Richard

Lars had that working for a previous version of lyx with lexer. His branches 
are still available in git, I think...

-- 
José Abílio


Re: XML For LyX

2013-05-08 Thread Nico Williams
Reading will be easier, I think, for the reasons I've described before.
Also, you could use lyx2xml to write so you can test the read path, but I
don't know of an xml2lyx tool you could use for the reverse.

Just my 2c.


Re: XML format status

2010-10-04 Thread Jürgen Spitzmüller
Gour D. wrote:
> btw, is it comparable in the sense of being more complete than LateX2e
> preventing clashes between different packages or just 'another macro
> package' ?

I'm not so much into the internals of latex3 development. I think they try to 
generally overcome some fundamental limitations of LaTeX2e and set up a 
cleaner framework. In that sense, I think the core will support some basic 
things that were delegated to packages in 2e. Also, I think the framework will 
help to prevent package clashes (via namespaces etc.).

> Now I wonder what will happen with XeTeX since LuaTeX is supposed to
> replace or rather become new PDFTeX and will handle Unicode as well?

Who knows? It's not unlikely that they will co-exist. XeTeX is actively
developed and used. Even the creator of pdftex, Hàn Thế Thành, recently made a
significant contribution to XeTeX, namely margin kerning (the one missing
feature of XeTeX wrt pdftex IMHO).

> Sincerely,
> Gour

Jürgen


Re: XML format status

2010-10-04 Thread Jürgen Spitzmüller
Gour D. wrote:
> OT: Can someone explain me what is the aim of latex-3 development in
> the light of LuaTeX or LuaTeX will be just another implementation?

LaTeX3 is a new macro collection (so it aims to replace LaTeX2e and is
comparable to ConTeXt). LuaTeX is a processor (such as XeTeX or PDFTeX or
TeX). I suppose you can use LaTeX3 together with lualatex as well as with
pdflatex.

Jürgen


Re: XML format status

2010-10-04 Thread Gour D.
On Mon, 4 Oct 2010 07:49:26 +0200
>> "Jürgen" == Jürgen Spitzmüller  wrote:

Jürgen> LaTeX3 is a new macro collection (so it aims to replace
Jürgen> LateX2e and is comparable to ConTeXt).

Ahh...I got this one now. Thanks.

btw, is it comparable in the sense of being more complete than LateX2e
preventing clashes between different packages or just 'another macro
package' ?

Jürgen> LuaTeX is a processor (such as XeTeX or PDFTeX or TeX). I
Jürgen> suppose you can use LaTeX3 together with lualatex as well as
Jürgen> with pdflatex.

OK. That was clear enough as well.

Now I wonder what will happen with XeTeX since LuaTeX is supposed to
replace or rather become new PDFTeX and will handle Unicode as well?


Sincerely,
Gour

-- 

Gour  | Hlapicina, Croatia  | GPG key: CDBF17CA





Re: XML format status

2010-10-03 Thread Gour D.
On Wed, 09 Jun 2010 05:27:58 +0200
>> "Peter" == Peter Kümmel  wrote:

Excuse me for dropping in so lately...

I sent a post about LyX/LATeX vs ConTeXt on users list e days ago
stating that I plan to abandon idea to use the latter and 'return'
back to winning team.

Now, I've discovered this thread about XML as LyX's native format...

Peter> I would prefer a more readable format than XML like json, even I
Peter> would use Lua, because it is the future scripting languange in
Peter> LaTeX, but I assume we could never explain the rest of the
Peter> world, why we we don't use beloved XML. So let's use XML. And
Peter> validating a XML with a DTD is really an advantage.

I did two ~500 pages book with LyX back in '99 (who can remember which
version was current at that time) mixing Croatian, English and
Sanskrit diacritics all in a one book. Never had any problem to
require validating XML against some DTD or schema or how they called
it.

Prior to that I also tried using DocBook for authoring (I believe the
editor was Epcedit, a German one) and quickly ran away from all the
XML/XSLT/FOP stuff...

Now, I really do not understand what problem would XML format in LyX
solve? (YAGNI)

Otoh, just enter 'XML sucks' in Google and you will have some nice
reading. :-)

OT: Can someone explain me what is the aim of latex-3 development in
the light of LuaTeX or LuaTeX will be just another implementation?


Sincerely,
Gour

-- 

Gour  | Hlapicina, Croatia  | GPG key: CDBF17CA





Re: XML format status

2010-06-09 Thread Sam Liddicott

On 09/06/10 04:27, Peter Kümmel wrote:
> Am Dienstag, den 08.06.2010, 17:22 -0400 schrieb Richard Heck:
>> On 06/08/2010 03:49 PM, Peter Kümmel wrote:
>>> Am Dienstag, den 08.06.2010, 20:52 +0200 schrieb Andre Poenitz:
>>>> On Tue, Jun 08, 2010 at 04:29:21PM +0200, Abdelrazak Younes wrote:
>>>>> On 06/08/2010 03:27 PM, Vincent van Ravesteijn wrote:
>>>>>>> What is the current status or thinking of the XML format for lyx 2?
>>>>>>
>>>>>> Ideally, LyX 2 would have an XML file format. However, no-one is
>>>>>> actively working on the issue, so we postponed it.
>>>>>>
>>>>>> As far as I know, we didn't really decide when and how to do the transition.
>>>>>
>>>>> I worked recently with JSon format (www.json.org), cleaner to the
>>>>> human, faster to parse and less verbose than XML, quite nice...
>>>>
>>>> This might indeed be a good option.
>>>
>>> http://gitorious.org/JsonQt/
>>> http://gitorious.org/qjson
>>>
>>> But the question remains what is the aim of the new format: is it for
>>> us, or is it for other who wanna generate, manipulate, ... LyX files.
>>
>> My understanding was that the point was to make the LyX format more
>> easily parsable by LyX and, in particular, to provide validation that a
>> file really is in the proper format. So, for us, but without breaking
>> the easy manipulability of LyX files via sed, awk, etc.
>
> I would prefer a more readable format than XML like json, even I would
> use Lua, because it is the future scripting languange in LaTeX, but
> I assume we could never explain the rest of the world, why we we don't
> use beloved XML. So let's use XML. And validating a XML with a DTD is
> really an advantage.


This last point should not be underestimated by those who want to 
transform Lyx documents using sed and awk. A DTD  lets you know that you 
did not break the document.


Because XSLT is such a convenient transforming tool, xml will make 
conversion to and from other (xml-ish) documents much simpler than it is 
for lyx at the moment.


XSLT is grep/sed/awk for xml. If you start with the default identity 
transform you just then add patterns for the paths you want to 
change/exclude. XSLT is sed for structured documents. (I've done some 
nasty 4K long sed scripts that are state-machines for transforming 
structured documents, and XSLT is much nicer).
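
The identity-plus-overrides idea can be illustrated without an XSLT processor. The sketch below is a Python analogue, not XSLT itself: it uses only the standard library's xml.etree.ElementTree, copies every element unchanged by default, and applies two kinds of override (rename a tag, drop an element). The function name `transform` and the element names `doc`, `para`, `emph` and `note` are hypothetical and do not reflect any actual or proposed LyX XML format.

```python
import xml.etree.ElementTree as ET


def transform(node, rename=None, drop=()):
    """Copy `node` recursively (the "identity" part), renaming tags listed
    in `rename` and skipping child elements whose tag is in `drop`.

    Note: the tail text of a dropped element is discarded with it.
    """
    rename = rename or {}
    out = ET.Element(rename.get(node.tag, node.tag), dict(node.attrib))
    out.text, out.tail = node.text, node.tail
    for child in node:
        if child.tag in drop:
            continue  # override: exclude this subtree
        out.append(transform(child, rename, drop))
    return out


# Hypothetical input document, for illustration only.
src = "<doc><para>Hello <emph>world</emph>!</para><note>draft</note></doc>"
root = ET.fromstring(src)

# With no overrides the copy serializes exactly like the input.
identity = ET.tostring(transform(root), encoding="unicode")

# With overrides: rename <emph> to <em> and drop <note>.
changed = ET.tostring(
    transform(root, rename={"emph": "em"}, drop={"note"}),
    encoding="unicode",
)
print(identity)
print(changed)
```

As in the XSLT approach, the default behaviour does all the copying, and only the paths to be changed or excluded need to be named.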


However, I will admit there are rarely 1-liners for xslt; but I did
write a sed pattern for xslt that lets you make 1-liners, by passing the
xpath match pattern and the replacement string on the command line.


JSON is an advantage where there is not an xml parser available, but are 
there any systems that can't provide a DOM tree from a document these days?


lua has xml parsing extensions and will need them as the future 
scripting language of latex, because it will be dealing with xml even if 
not with an xml tex format. (And I prefer an xml tex format because 
currently only tex can interpret tex because the syntax is extensible, 
hence the difficulty that Lyx can have importing tex documents - it 
can't even safely know how to ignore the bits it doesn't understand!).


--
*Sam's signature*


Re: XML format status

2010-06-09 Thread Guenter Milde
On 2010-06-08, Sam Liddicott wrote:
>   On 08/06/10 15:27, Pavel Sanda wrote:
>> Sam Liddicott wrote:
>>> I also still dream about lyx being the first decent docbook editor.
>> are you aware of the fact that lyx already have output routines for docbook?

> Yes, but I recall being told that it wasn't supported and that if it 
> still worked it was pretty much by good luck these days.

> If that  is no longer true, I'd be glad to know!

It is still true.

OTOH, native XHTML output is brand new (LyX 2) and actively worked
on, so you might give it a try.

Günter



Re: XML format status

2010-06-09 Thread Pavel Sanda
Guenter Milde wrote:
> On 2010-06-08, Sam Liddicott wrote:
> 
> >   On 08/06/10 15:27, Pavel Sanda wrote:
> >> Sam Liddicott wrote:
> >>> I also still dream about lyx being the first decent docbook editor.
> >> are you aware of the fact that lyx already have output routines for 
> >> docbook?
> 
> > Yes, but I recall being told that it wasn't supported and that if it 
> > still worked it was pretty much by good luck these days.
> 
> > If that  is no longer true, I'd be glad to know!
> 
> It is still true.

it's not maintained, but it should work. the problem is that it outputs
docbook sgml, version 4.x. if i understand correctly, switching it to
docbook xml is a one-line patch in the lyx sources. what a transformation
to a newer version involves, i'm not sure, but there are already working
tools which do this if it's not easy for us.

as far as the lyx sources are concerned, it looks to me like it would
be quite easy to make your dream come true. what we are missing is somebody
who knows docbook well enough to tell us
what exactly should be changed in the output xml file (hint!).

http://www.mail-archive.com/lyx-devel@lists.lyx.org/msg151947.html
http://www.neomantic.com/tutorials/lyx-and-docbookXML

pavel


Re: XML format status

2010-06-09 Thread Pavel Sanda
Sam Liddicott wrote:
>> I would prefer a more readable format than XML like json, even I would
>> use Lua, because it is the future scripting languange in LaTeX, but
>> I assume we could never explain the rest of the world, why we we don't
>> use beloved XML. So let's use XML. And validating a XML with a DTD is
>> really an advantage.
>
> This last point should not be underestimated by those who want to transform 
> Lyx documents using sed and awk. A DTD  lets you know that you did not 
> break the document.

and not overestimated by those who want it to be readable and editable by
humans, which was starting point of this subthread... ;)

pavel


Re: XML format status

2010-06-09 Thread Sam Liddicott

On 09/06/10 09:10, Pavel Sanda wrote:
> Guenter Milde wrote:
>> On 2010-06-08, Sam Liddicott wrote:
>>> On 08/06/10 15:27, Pavel Sanda wrote:
>>>> Sam Liddicott wrote:
>>>>> I also still dream about lyx being the first decent docbook editor.
>>>> are you aware of the fact that lyx already have output routines for docbook?
>>>
>>> Yes, but I recall being told that it wasn't supported and that if it
>>> still worked it was pretty much by good luck these days.
>>> If that  is no longer true, I'd be glad to know!
>>
>> It is still true.
>
> its not maintained, but it should work. the problem is that it outputs
> docbook sgml, version 4.x. if i understand correctly transforming
> it into docbook xml is oneliner patch in lyx sources. what involves
> transformation into newer version, that i'm not sure but there are
> already working tools which do this if its not easy for us.
>
> to me it looks like that as far lyx sources is concerned it would
> be quite easy to make your dream true. what we miss is somebody
> who knows docbook well enough to provide us the information
> what exactly should be changed in the output xml file (hint!).
>
> http://www.mail-archive.com/lyx-devel@lists.lyx.org/msg151947.html
> http://www.neomantic.com/tutorials/lyx-and-docbookXML


Thanks - you give me good hope.
It is ironic that I want to use Lyx to avoid having to know docbook too 
well, but may have to learn it to fixup lyx!


Ah well!

Sam


Re: XML format status

2010-06-09 Thread Pavel Sanda
Sam Liddicott wrote:
> Thanks - you give me good hope.
> It is ironic that I want to use Lyx to avoid having to know docbook too 
> well, but may have to learn it to fixup lyx!

we are waiting for your mail :)
pavel


Re: XML format status

2010-06-08 Thread Vincent van Ravesteijn
> What is the current status or thinking of the XML format for lyx 2?


Ideally, LyX 2 would have an XML file format. However, no-one is
actively working on the issue, so we postponed it.

As far as I know, we didn't really decide when and how to do the transition.

Are you interested in having an XML format ?


Re: XML format status

2010-06-08 Thread Sam Liddicott

On 08/06/10 14:27, Vincent van Ravesteijn wrote:
>> What is the current status or thinking of the XML format for lyx 2?
>
> Ideally, LyX 2 would have an XML file format. However, no-one is
> actively working on the issue, so we postponed it.
>
> As far as I know, we didn't really decide when and how to do the transition.
>
> Are you interested in having an XML format ?


I am interested  for newfangle literate programming (which is pretty 
complete as it is) http://www.nongnu.org/newfangle/index.shtml but 
further development would be simpler using xslt on lyx's xml instead of 
on the tex output. (Otherwise I'm going to have to start parsing tex in 
awk).


I also still dream about lyx being the first decent docbook editor.

Sam

--
*Sam's signature*


Re: XML format status

2010-06-08 Thread Pavel Sanda
Sam Liddicott wrote:
> I also still dream about lyx being the first decent docbook editor.

are you aware of the fact that lyx already have output routines for docbook?

pavel


Re: XML format status

2010-06-08 Thread Abdelrazak Younes

On 06/08/2010 03:27 PM, Vincent van Ravesteijn wrote:
>> What is the current status or thinking of the XML format for lyx 2?
>
> Ideally, LyX 2 would have an XML file format. However, no-one is
> actively working on the issue, so we postponed it.
>
> As far as I know, we didn't really decide when and how to do the transition.

I worked recently with JSon format (www.json.org), cleaner to the human, 
faster to parse and less verbose than XML, quite nice...


Abdel.



Re: XML format status

2010-06-08 Thread Sam Liddicott

On 08/06/10 15:27, Pavel Sanda wrote:
> Sam Liddicott wrote:
>> I also still dream about lyx being the first decent docbook editor.
>
> are you aware of the fact that lyx already have output routines for docbook?


Yes, but I recall being told that it wasn't supported and that if it 
still worked it was pretty much by good luck these days.


If that  is no longer true, I'd be glad to know!

Sam

--
*Sam's signature*


Re: XML format status

2010-06-08 Thread Sam Liddicott

On 08/06/10 15:29, Abdelrazak Younes wrote:
> On 06/08/2010 03:27 PM, Vincent van Ravesteijn wrote:
>>> What is the current status or thinking of the XML format for lyx 2?
>>
>> Ideally, LyX 2 would have an XML file format. However, no-one is
>> actively working on the issue, so we postponed it.
>>
>> As far as I know, we didn't really decide when and how to do the
>> transition.
>
> I worked recently with JSon format (www.json.org), cleaner to the
> human, faster to parse and less verbose than XML, quite nice...



Dunno about those claims:
https://www.p6r.com/articles/2008/11/02/xsloutput-methodjson/ shows some 
pretty similar looking xml/json.
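
The comparison can be tried on a small piece of document-like content. The sketch below, using only the Python standard library, takes one hypothetical paragraph with inline markup and re-encodes it as JSON. The `tag`/`attrs`/`children` layout is just one possible encoding chosen for illustration (more compact ones exist), and the element names `para` and `emph` are assumptions, not part of any LyX format.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical paragraph with mixed content (text interleaved with markup).
xml_text = '<para align="left">Hello <emph>world</emph>!</para>'


def to_jsonable(node):
    """Encode an element as a dict whose "children" list interleaves
    text strings and nested elements in document order."""
    children = []
    if node.text:
        children.append(node.text)
    for child in node:
        children.append(to_jsonable(child))
        if child.tail:
            children.append(child.tail)
    return {"tag": node.tag, "attrs": dict(node.attrib), "children": children}


as_json = json.dumps(to_jsonable(ET.fromstring(xml_text)))

print(xml_text)
print(as_json)
print(len(xml_text), len(as_json))
```

With this particular encoding the JSON form is the longer of the two, because every element needs explicit keys for what XML expresses through its syntax. For record-like data without mixed content the balance can tip the other way.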


And JSON is a less rich and less mature standard.

XML easily supports embedding of other xml documents of which there are 
many types.

https://www.p6r.com/articles/2010/04/05/xml-to-json-and-back/

xml lets me stick my svg directly in my lyx (if...) document.

In short xml makes use of lots of people spending thousands of hours on 
getting it right. Html is nearly 20 years old now and SGML older than 
that, with origins in document publishing.


json is the result of far fewer hours, spent by people who (seem like they)
are trying to address a small problem related to the exchange of small
quantities of short-lived data.


I know which I prefer for documents!

Sam

--
*Sam's signature*


Re: XML format status

2010-06-08 Thread Andre Poenitz
On Tue, Jun 08, 2010 at 04:29:21PM +0200, Abdelrazak Younes wrote:
> On 06/08/2010 03:27 PM, Vincent van Ravesteijn wrote:
>>> What is the current status or thinking of the XML format for lyx 2?
>>
>> Ideally, LyX 2 would have an XML file format. However, no-one is
>> actively working on the issue, so we postponed it.
>>
>> As far as I know, we didn't really decide when and how to do the transition.
>
> I worked recently with JSon format (www.json.org), cleaner to the
> human, faster to parse and less verbose than XML, quite nice...

This might indeed be a good option.

Andre'


Re: XML format status

2010-06-08 Thread Peter Kümmel
Am Dienstag, den 08.06.2010, 20:52 +0200 schrieb Andre Poenitz:
> On Tue, Jun 08, 2010 at 04:29:21PM +0200, Abdelrazak Younes wrote:
>> On 06/08/2010 03:27 PM, Vincent van Ravesteijn wrote:
>>>> What is the current status or thinking of the XML format for lyx 2?
>>>
>>> Ideally, LyX 2 would have an XML file format. However, no-one is
>>> actively working on the issue, so we postponed it.
>>>
>>> As far as I know, we didn't really decide when and how to do the
>>> transition.
>>
>> I worked recently with JSon format (www.json.org), cleaner to the
>> human, faster to parse and less verbose than XML, quite nice...
>
> This might indeed be a good option.

http://gitorious.org/JsonQt/
http://gitorious.org/qjson


But the question remains what is the aim of the new format: is it for
us, or is it for other who wanna generate, manipulate, ... LyX files.

Peter



Re: XML format status

2010-06-08 Thread Richard Heck

On 06/08/2010 03:49 PM, Peter Kümmel wrote:
> Am Dienstag, den 08.06.2010, 20:52 +0200 schrieb Andre Poenitz:
>> On Tue, Jun 08, 2010 at 04:29:21PM +0200, Abdelrazak Younes wrote:
>>> On 06/08/2010 03:27 PM, Vincent van Ravesteijn wrote:
>>>>> What is the current status or thinking of the XML format for lyx 2?
>>>>
>>>> Ideally, LyX 2 would have an XML file format. However, no-one is
>>>> actively working on the issue, so we postponed it.
>>>>
>>>> As far as I know, we didn't really decide when and how to do the transition.
>>>
>>> I worked recently with JSon format (www.json.org), cleaner to the
>>> human, faster to parse and less verbose than XML, quite nice...
>>
>> This might indeed be a good option.
>
> http://gitorious.org/JsonQt/
> http://gitorious.org/qjson
>
> But the question remains what is the aim of the new format: is it for
> us, or is it for other who wanna generate, manipulate, ... LyX files.

My understanding was that the point was to make the LyX format more
easily parsable by LyX and, in particular, to provide validation that a
file really is in the proper format. So, for us, but without breaking
the easy manipulability of LyX files via sed, awk, etc.

Richard

> Peter


Re: XML format status

2010-06-08 Thread Peter Kümmel
Am Dienstag, den 08.06.2010, 17:22 -0400 schrieb Richard Heck:
> On 06/08/2010 03:49 PM, Peter Kümmel wrote:
>> Am Dienstag, den 08.06.2010, 20:52 +0200 schrieb Andre Poenitz:
>>> On Tue, Jun 08, 2010 at 04:29:21PM +0200, Abdelrazak Younes wrote:
>>>> On 06/08/2010 03:27 PM, Vincent van Ravesteijn wrote:
>>>>>> What is the current status or thinking of the XML format for lyx 2?
>>>>>
>>>>> Ideally, LyX 2 would have an XML file format. However, no-one is
>>>>> actively working on the issue, so we postponed it.
>>>>>
>>>>> As far as I know, we didn't really decide when and how to do the
>>>>> transition.
>>>>
>>>> I worked recently with JSon format (www.json.org), cleaner to the
>>>> human, faster to parse and less verbose than XML, quite nice...
>>>
>>> This might indeed be a good option.
>>
>> http://gitorious.org/JsonQt/
>> http://gitorious.org/qjson
>>
>> But the question remains what is the aim of the new format: is it for
>> us, or is it for other who wanna generate, manipulate, ... LyX files.
>
> My understanding was that the point was to make the LyX format more
> easily parsable by LyX and, in particular, to provide validation that a
> file really is in the proper format. So, for us, but without breaking
> the easy manipulability of LyX files via sed, awk, etc.

I would prefer a more readable format than XML, like JSON; I would even
use Lua, because it is the future scripting language in LaTeX, but
I assume we could never explain to the rest of the world why we don't
use beloved XML. So let's use XML. And validating XML with a DTD is
really an advantage.

Peter


