Thoughts on LibreOffice Math

2013-06-23 Thread Frédéric WANG

Hi all,

My name is Frédéric Wang and as some of you may have noticed, I've recently 
contributed some patches for the LibreOffice Formula Editor. For those 
who don't know me: I work for the MathJax project, I've been 
contributing to the Mozilla MathML project for several years and I 
recently started to do some development on MathML in WebKit too.


Some people on the MathJax user list reported issues with the 
mathematical formulas generated with LibreOffice and it turned out that 
the exported MathML code is currently quite bad. Hence I looked into the 
LibreOffice Math a few days ago, reported bugs and submitted a few 
patches. I'd like to share my thoughts on the situation and future of 
LibreOffice Math. Thomas Lange wrote a message a long time ago about how Math 
could evolve: 
http://www.mail-archive.com/dev@sw.openoffice.org/msg00200.html. First, 
I note the following requirements:


1) Some people like the current Math semi-WYSIWYG interface and are 
familiar with the StarMath syntax. So this interface should be preserved 
anyway.

2) Some people would like a complete WYSIWYG editor.
3) Some people would like a LaTeX input mode to enter mathematical 
formulas (like in Abiword)
4) Some people (at least the MathML & MathJax communities) would like 
MathML import/export and copy and paste, like in Microsoft Word. It is 
also a requirement of the ODT format.
5) Some people would like a high quality rendering (e.g. for printing, 
to export to SVG etc). This is subjective but that would mean at least 
the quality level of documents generated by Microsoft Word or LaTeX.


Currently, LibreOffice Math is centered around its StarMath syntax and 
internal tree structure, which makes 1) possible, and there is already an 
experimental visual editor to do 2). As I read the code, the MathML 
export is not very good but that can be improved. However, importing 
from a more expressive language like LaTeX or MathML seems very hard, 
which makes 3) and 4) unlikely. I didn't look at the rendering code, but 
Khaled Hosny mentioned on bug 32362 that it would have to be rewritten 
from scratch and I suspect one issue is the StarMath internal tree.


I'd like to propose to center LibreOffice Math around the MathML syntax 
(and corresponding DOM structure) instead:


* I hope I can improve the code to get a decent MathML export and thus 
1) and 2) could be preserved. It would still be possible to keep the UI 
to work directly on the MathML DOM.
* For 3), there are many LaTeX to MathML converters like itex2MML (used 
in Abiword), BlahTeX, MathJax etc. that could be integrated in LibreOffice.
* This would obviously make 4) easy. MathML has a <semantics> element 
which is currently used to store the StarMath syntax and could be used 
to store LaTeX too. David Carlisle also has an XSLT stylesheet to 
convert MathML code to LaTeX.
* Microsoft Word uses an XML language for mathematics similar to MathML 
so it should be possible to get 5). Khaled Hosny mentioned a fork of 
GtkMathView that has support for OpenType MATH and can produce a good 
rendering. It takes MathML as input and can export PNG or SVG images.
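
To illustrate the point about storing the source syntax next to the MathML, 
here is a rough Python sketch (not LibreOffice code). It reads the 
<annotation> children of a <semantics> element and adds a second one for 
LaTeX. The sample formula, the function names and the "application/x-tex" 
encoding label are assumptions made for the example; "StarMath 5.0" is the 
label I would expect LibreOffice to write, but treat that as an assumption too.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Hypothetical sample: presentation MathML with the source kept in an annotation.
SAMPLE = """<math xmlns="http://www.w3.org/1998/Math/MathML">
  <semantics>
    <mrow><mi>a</mi><mo>+</mo><mi>b</mi></mrow>
    <annotation encoding="StarMath 5.0">a + b</annotation>
  </semantics>
</math>"""


def read_annotations(mathml_text):
    """Return {encoding: source text} for each <annotation> under <semantics>."""
    root = ET.fromstring(mathml_text)
    found = {}
    for semantics in root.iter("{%s}semantics" % MATHML_NS):
        for annotation in semantics.findall("{%s}annotation" % MATHML_NS):
            found[annotation.get("encoding")] = (annotation.text or "").strip()
    return found


def add_annotation(mathml_text, encoding, source):
    """Append one more <annotation> (e.g. LaTeX source) to the first <semantics>."""
    root = ET.fromstring(mathml_text)
    semantics = root.find("{%s}semantics" % MATHML_NS)
    if semantics is None:
        raise ValueError("no <semantics> element found")
    annotation = ET.SubElement(semantics, "{%s}annotation" % MATHML_NS)
    annotation.set("encoding", encoding)
    annotation.text = source
    return ET.tostring(root, encoding="unicode")
```

Calling read_annotations(add_annotation(SAMPLE, "application/x-tex", "a + b")) 
would then list both the StarMath and the LaTeX source for the same formula.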


--
Frédéric Wang
maths-informatique-jeux.com/blog/frederic
___
LibreOffice mailing list
LibreOffice@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/libreoffice


Re: Thoughts on LibreOffice Math

2013-06-24 Thread Regina Henschel

Hi Frédéric,

Please find my comments inline.

Frédéric WANG wrote:

> Hi all,
>
> My name is Frédéric Wang and as some of you may have noticed, I've recently
> contributed some patches for the LibreOffice Formula Editor. For those
> who don't know me: I work for the MathJax project, I've been
> contributing to the Mozilla MathML project for several years and I
> recently started to do some development on MathML in WebKit too.
>
> Some people on the MathJax user list reported issues with the
> mathematical formulas generated with LibreOffice and it turned out that
> the exported MathML code is currently quite bad.


Can you please describe in more detail what is wrong with the exported 
MathML? LibreOffice uses "Presentation Markup" and not "Content Markup". 
But besides that, what is bad?


> Hence I looked into the
> LibreOffice Math a few days ago, reported bugs and submitted a few
> patches. I'd like to share my thoughts on the situation and future of
> LibreOffice Math. Thomas Lange wrote a message a long time ago about how Math
> could evolve:
> http://www.mail-archive.com/dev@sw.openoffice.org/msg00200.html. First,
> I note the following requirements:
>
> 1) Some people like the current Math semi-WYSIWYG interface and are
> familiar with the StarMath syntax. So this interface should be preserved
> anyway.
> 2) Some people would like a complete WYSIWYG editor.
> 3) Some people would like a LaTeX input mode to enter mathematical
> formulas (like in Abiword)
> 4) Some people (at least the MathML & MathJax communities) would like
> MathML import/export


MathML import and export is already available. What is needed in addition?

> and copy and paste, like in Microsoft Word. It is
> also a requirement of the ODT format.
> 5) Some people would like a high quality rendering (e.g. for printing,
> to export to SVG etc). This is subjective but that would mean at least
> the quality level of documents generated by Microsoft Word or LaTeX.
>
> Currently, LibreOffice Math is centered around its StarMath syntax and
> internal tree structure, which makes 1) possible, and there is already an
> experimental visual editor to do 2). As I read the code, the MathML
> export is not very good but that can be improved. However, importing
> from a more expressive language like LaTeX or MathML seems very hard,
> which makes 3) and 4) unlikely. I didn't look at the rendering code, but
> Khaled Hosny mentioned on bug 32362 that it would have to be rewritten
> from scratch and I suspect one issue is the StarMath internal tree.
>
> I'd like to propose to center LibreOffice Math around the MathML syntax
> (and corresponding DOM structure) instead:
>
> * I hope I can improve the code to get a decent MathML export and thus
> 1) and 2) could be preserved.


Do you want an export to Content Markup?

> It would still be possible to keep the UI
> to work directly on the MathML DOM.
> * For 3), there are many LaTeX to MathML converters like itex2MML (used
> in Abiword), BlahTeX, MathJax etc. that could be integrated in LibreOffice.
> * This would obviously make 4) easy. MathML has a <semantics> element
> which is currently used to store the StarMath syntax


The StarMath code is stored in the <annotation> element.

> and could be used
> to store LaTeX too. David Carlisle also has an XSLT stylesheet to
> convert MathML code to LaTeX.
> * Microsoft Word uses an XML language for mathematics similar to MathML
> so it should be possible to get 5). Khaled Hosny mentioned a fork of
> GtkMathView that has support for OpenType MATH and can produce a good
> rendering. It takes MathML as input and can export PNG or SVG images.



Export to PNG already exists but needs improvement.

Besides the large goals you have outlined, there are many enhancement 
requests that have been open for years, such as arbitrary colors, 
storing formulas in the Gallery, and a better integral sign.


The Math module has received only very small enhancements in recent years. 
It would be really good if you could help in that area.


Kind regards
Regina




Re: Thoughts on LibreOffice Math

2013-06-24 Thread Frédéric WANG

> Can you please describe in more details what is wrong with the exported
> MathML? LibreOffice uses "Presentation Markup" and not "Content Markup".
> But besides that, what is bad?

For example,

- If you type "1 + 2x" then "2x" is interpreted as a single number 
rather than a number times a variable.
- If you add a space like "1 + 2 x", then it is now interpreted as 
"{1+2} times x"
- If you type "a+b+c+d+e+..." you'll get a nested mrow structure like 
{{{{{a+b} + c}+d}+e}+...} which is correct but very inconvenient to 
browse e.g. for accessibility tools. The operators of same priority 
could be grouped in the same row.
- Other commands like wide accents, math symbols or overstrike are lost 
or badly exported to MathML.
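
As a rough illustration of the grouping I have in mind for the third item, 
here is a small Python sketch. It is not the StarMath code: the tree is 
modelled as nested lists of tokens, and the PRIORITY table and function names 
are made up for the example. A left operand is merged into its parent row 
only when both rows use operators of the same priority.

```python
# Hypothetical precedence table; the real StarMath grammar has more operators.
PRIORITY = {"+": 1, "-": 1, "*": 2, "/": 2}


def row_priority(row):
    """Return the priority shared by the operators of a row, or None."""
    priorities = {PRIORITY[item] for item in row
                  if isinstance(item, str) and item in PRIORITY}
    return priorities.pop() if len(priorities) == 1 else None


def flatten(node):
    """Merge left-nested rows whose operators have the same priority."""
    if isinstance(node, str):
        return node
    row = [flatten(child) for child in node]
    priority = row_priority(row)
    merged = []
    for index, child in enumerate(row):
        # Only the leftmost operand is merged: {a+b}+c is a+b+c,
        # but a-{b-c} must keep its inner group.
        if (index == 0 and isinstance(child, list) and priority is not None
                and row_priority(child) == priority):
            merged.extend(child)
        else:
            merged.append(child)
    return merged


def to_mathml(node):
    """Serialize a token tree as <mrow>/<mi>/<mo> markup (identifiers only)."""
    if isinstance(node, str):
        tag = "mo" if node in PRIORITY else "mi"
        return "<%s>%s</%s>" % (tag, node, tag)
    return "<mrow>%s</mrow>" % "".join(to_mathml(child) for child in node)
```

With this, the nested input for "a+b+c+d+e" collapses into a single row, 
while a product inside a sum, or a group on the right of a minus sign, keeps 
its own mrow.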


I don't want to give the details here, but I'm going to open bugs and 
submit patches when I have time:

https://www.libreoffice.org/bugzilla/buglist.cgi?emailreporter1=1&query_format=advanced&email1=fred.wang%40free.fr&component=Formula%20Editor&product=LibreOffice


> MathML import and export is already available. What is needed in addition?

Yes, I didn't mean that it is not supported; I just wanted to list some 
features that I think people would like to implement or keep in any 
future versions of Math. Note that:


- MathML export has some bugs but I hope it won't be too difficult to fix.
- A MathML or LaTeX import is not possible without loss of information, 
because StarMath is less expressive; otherwise one would have to extend 
the StarMath language with many new constructions.
- LibreOffice MathML claims export/import support for "MathML 1.0" but 
the recommendation is now "MathML3".
- As I already reported, HTML5 import/export with embedded MathML would 
be great!
- Perhaps the (X)HTML export should have an option to insert a line of 
code to load MathJax, when the document contains MathML formulas: 
<script src="http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=MML_HTMLorMML"></script>
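
To make the last item concrete, here is a minimal Python sketch of such an 
export option. It is only an assumption about how it could work, not existing 
LibreOffice behaviour: the function name is hypothetical, and it simply 
inserts the script line before </head> when the page contains a <math> 
element and does not already load MathJax.

```python
import re

# The loader line from the item above, kept as a constant.
MATHJAX_TAG = ('<script src="http://cdn.mathjax.org/mathjax/latest/'
               'MathJax.js?config=MML_HTMLorMML"></script>')


def add_mathjax_loader(html):
    """Insert the MathJax loader before </head> if the page contains MathML."""
    has_math = re.search(r"<math[\s>]", html, re.IGNORECASE)
    has_loader = re.search(r"MathJax\.js", html, re.IGNORECASE)
    if not has_math or has_loader:
        return html
    head_end = re.search(r"</head\s*>", html, re.IGNORECASE)
    if head_end is None:
        return html  # no head element: leave the document untouched
    position = head_end.start()
    return html[:position] + MATHJAX_TAG + "\n" + html[position:]
```

Documents without formulas are returned unchanged, so the option would have 
no cost for pages that do not need MathJax.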



> Do you want an export to Content Markup?

No, I'm not interested in a Content MathML export. If I can improve the 
Presentation MathML export to at least produce good markup for the 
StarMath constructions and so that users stop complaining about issues 
with the formulas generated by LibreOffice, then that would be fine with me.

> The StarMath code is stored in the <annotation> element.

Yes, that's what I mean, the <annotation> / <annotation-xml> elements 
are children of <semantics>.



> Export to PNG already exists but needs improvement.

True. That said, I'm not really a big fan of PNG export for mathematical 
formulas ;-)



> There are not only the large goals you have outlined, but a lot of
> enhancement request, which are open for many years, like arbitrary
> colors, storing in Gallery, better integral sign.
>
> The Math module has got very small enhancements in the last years. It
> would be really good, if you can help in that area.

I'd like to help but I'm not sure I'll have time to. In the short term, 
I just plan to improve MathML export so that LibreOffice users can 
publish their math documents on the Web, without having to use PNG images 
or similar hacks.


Thanks,

--
Frédéric Wang
maths-informatique-jeux.com/blog/frederic


Re: Thoughts on LibreOffice Math


> problem with that is that there are actually 2 text editing components,
> Writer and EditEngine (used by Impress/Draw/Calc)... the current
> architecture of Math embedded objects has the advantage that it works
> with both of these and so in all applications.

That is something we will have to consider, but the current way is just
counterintuitive: not only is inserting a formula deeply hidden
behind multiple levels of menus, but editing a formula also switches to a
completely different UI, causing much disruption IMO (and the need to
click a formula to edit it is just annoying). I don't mind keeping Math
as a separate component, but we need more integrated math editing than
what we have now.

BTW, I'm wondering how using MathML (or more generally ODF objects) is 
related to this GSOC idea:


https://wiki.documentfoundation.org/Development/Gsoc/Ideas#ODF_Formulas_in_Writer

I'm not sure I understand which "fields" the proposal refers to, 
but my idea was to use MathML everywhere in LibreOffice Writer.


--
Frédéric Wang
maths-informatique-jeux.com/blog/frederic



Re: Thoughts on LibreOffice Math

> I think enabling the visual editor by default is indeed important. What is
> blocking it from going out of experimental features? I think the GSOC
> student who worked on that mentioned some crashes and I can try to debug
> and fix them. Are there other serious bugs / missing features apart
> from undo/redo?

There is a lightweight to-do list of some things that would be nice
to have here:
http://cgit.freedesktop.org/libreoffice/core/tree/starmath/visual-editor-todo

Some of them might already be fixed or reported as bugs, as others
have mentioned.

In my opinion the biggest problem with the current approach is that it is
unnecessarily complex, because the StarMath syntax tree is very concrete
(rather than being abstract).
So maintainability is at risk with the current approach, but a rewrite also
carries its own risks...
So it's a matter of resources and priority... Just my two cents...


--
Regards Jonas Finnemann Jensen.


On Thu, Jul 4, 2013 at 6:10 PM, Frédéric WANG wrote:

> On 04/07/2013 13:17, Michael Meeks wrote:
>
>> First - thanks so much for your contribution ! :-) it's great to have
>> someone working on and caring about math - it seems to me like you should
>> have commit access if you haven't already for that ( can you poke me with
>> your gerrit account name ;-)
>>
> Thanks, my gerrit account is "Frédéric Wang", attached to the mail address
> I'm writing this mail from. BTW, I'm not sure why, but I cannot comment on
> gerrit commits. Do I need commit access for that?
>
>
>> Which of course makes life hard :-) IMHO it's fine to switch to something
>> more standard; but of course for back-compatibility we need to be able to
>> import (and probably export) (perfectly) StarMath to and from our new
>> representation.
>>
> I think the visual rendering can be preserved when converting from
> StarMath to MathML, but the other direction is not possible since StarMath
> does not have all the MathML features. In general, perfectly preserving the
> StarMath markup is not really possible: for example, "a_b^c" and "a^c_b" will
> be converted to the same MathML markup and it's not possible to convert it
> back. A more serious example is "matrix{A ## B}", "stack{A # B}" and "A
> newline B", which are all exported as a MathML table with two rows A and B...
>
>
>> Having said that - I'd love to see the final small pieces (IIRC. mostly
>> undo/redo which is quite easy) for interactive editing sorted out before
>> doing a deprecate/replace. The skills gained pulling the interactive
>> formula editing out of experimental work will be useful learning for the
>> re-write :-)
>>
> I think enabling the visual editor by default is indeed important. What is
> blocking it from going out of experimental features? I think the GSOC
> student who worked on that mentioned some crashes and I can try to debug
> and fix them. Are there other serious bugs / missing features apart
> from undo/redo?
>
>> There were a number of technology suggestions in the thread too: "just
>> re-use GtkMathView" - that seem to bring significant licensing and
>> dependency issues. In general, that seems deeply problematic to me.
>>
>> Of course, if we can re-use some code from Firefox for a new formula
>> editor (I assume they only render not edit) then that would be really
>> ideal - though, naturally there would need to be some degree of
>> abstracting of rendering etc. That's something I'd love to see. Editing
>> is often quite unpleasant to achieve ;-)
>>
>> Thoughts on that much appreciated :-) how re-usable is the firefox code
>> - is it a tightly coupled, vast chunk of beast tied to dozens of
>> megabytes / mega-lines of existing firefox infrastructure ? or is it
>> something smallish and re-usable ? :-)
>>
>
> Gecko has an editor but it currently does not know anything about MathML
> and so editing it will produce invalid markup (try for example  contenteditable="true"><**msqrt>ab mi> in a HTML page).
>
> The Gecko/WebKit MathML code base itself is small (say about the same size
> as StarMath) but it is strongly dependent on the rest of the Web rendering
> engine (all the CSS properties, table and text layout, DOM etc) so I'm not
> sure it's a really good idea to use it in LibreOffice. The needs of Web
> content are a bit different: Web people want all the CSS features like
> text-shadow on mathematical expressions, or want to mix MathML with other
> languages like SVG, or want to use Javascript/DOM to edit them. I think
> LibreOffice people just want basic math support without too much
> interaction with the other features of the rendering engine (although at
> least integration in the surrounding text is important). I admit that I
> didn't check GtkMathView in detail, but it seemed to be a small piece of
> code that is designed to be integrated in other applications and
> interactive editing is possible. The rendering did not seem so good to me
> (a font issue on my system?) but that would be better if Khaled ad