Re: [tex4ht] Getting correct MathML for multi-character symbols

2018-02-02 Thread Michal Hoftich
Hi Bill,

On Thu, Feb 1, 2018 at 10:04 PM, William F Hammond  wrote:
> The discussion about correct MathML output for numbers brings this old issue
> back to me.  Take, for example, the common math (though not engineering)
> symbol "Hom".  Let's assume we want it to become  -- although some may
> wish for .  What LaTeX markup for Hom should be given to tex4ht (that
> also works for regular LaTeX)?
>

In such cases I think it is best to use a custom command, like \Hom,
which can be configured in tex4ht to produce the desired output.

> There is no way that I think is fully correct unless it involves something
> like
> \DeclareMathOperator, which should be handled, but is too heavy for casual
> use.
>
> For casual use \mathrm{Hom} will generate Hom with tex4ht and, if I
> might add, also with latexml.  The problem with using \mathrm this way is
> that its LaTeX purpose is a zone font operation and its argument for regular
> LaTeX may contain math expressions, not just a single symbol name.  A
> translation to MathML can only safely set \mathrm as  when its content
> is free of operations, otherwise it needs to become, I suppose, .
>

We just found that tex4ht contains some undocumented post-processing
script for XHTML+MathML output, which tries to fix some issues that
are hard to fix on the TeX level, like the  issue. Another thing it
does is `div` -> `div`.

Unfortunately, it produces invalid HTML. I think it will be easier for me
to recreate it using make4ht Lua filters than to try to understand how
these old tools work. Do you have more examples of problematic output
that should be taken into account?

> I've always thought that \mbox{Hom} should be the way to go.  In regular
> LaTeX one cannot set a math expression in an mbox inside math without
> explicitly returning to math mode.  However, for \mbox{Hom} tex4ht uses
>  (as does latexml).  But isn't  supposed to be a
> semantic escape from math expression parsing that a computer algebra system
> can ignore?  In fact, amsmath introduced \text{} for semantic escaping from
> math expression parsing.
>
> By the way, as for \mbox{} found in LaTeX outside of math, I think its
> content should be passed through to html as if the \mbox{} were not present.
> For \mbox{} inside math, internal math content should throw an error for
> translation to MathML.  If the content has no internal white space, it
> should be , but  if there is internal white space.
>

I honestly don't know answers to these questions :/

Best regards,
Michal


Re: [tex4ht] [bug #378] Wrong MathML output for numbers

2018-02-02 Thread Michal Hoftich
Hi Deimi,

> There is a section "Digits  into Numbers"
> in the tex4ht-xhtmml-xtpipes.tex file.
> If I read it correctly, the code deals with
> merging "mn" nodes. Apparently Eitan's approach
> was to postprocess the HTML file with xtpipes (and not to fix the C code).

That's a great observation! I can see that this file tries to solve more
issues, like child  elements in  or some common HTML issues.
It can be requested using the -cxhtmml option for t4ht, like:

make4ht -u filename.tex "mathml" "" "-cxhtmml"

I can see some issues here though: it translates UTF-8 characters to HTML
entities and it inserts end tags for  or  elements, which
results in an invalid HTML file.

Anyway, I think I will make a default filter for make4ht to fix these
issues, as Eitan clearly used post-processing himself.
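
To make the "merging mn nodes" idea concrete, here is a rough sketch of
that kind of post-processing step. It is written in Python purely for
illustration: it is neither Eitan's xtpipes code nor a make4ht Lua
filter, and the names merge_numbers, merge_adjacent_mn, is_plain_mn and
local_name are made up for this example. It assumes the output is
well-formed XML and that runs of adjacent text-only <mn> siblings should
be joined into a single <mn>.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an optional '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def is_plain_mn(el):
    """True for an <mn> element that has text only (no children, no attributes)."""
    return local_name(el.tag) == "mn" and len(el) == 0 and not el.attrib


def merge_adjacent_mn(parent):
    """Recursively merge runs of adjacent plain <mn> siblings into one <mn>."""
    prev = None
    for child in list(parent):  # iterate over a copy, since we remove children
        merge_adjacent_mn(child)
        gap = prev.tail if prev is not None else None
        if (prev is not None and is_plain_mn(prev) and is_plain_mn(child)
                and not (gap or "").strip()):
            # Append the digits to the previous <mn> and drop this one.
            prev.text = (prev.text or "") + (child.text or "")
            prev.tail = child.tail
            parent.remove(child)
        else:
            prev = child


def merge_numbers(xml_text):
    """Parse an XML string, merge adjacent <mn> nodes, and serialize it again."""
    root = ET.fromstring(xml_text)
    merge_adjacent_mn(root)
    return ET.tostring(root, encoding="unicode")


print(merge_numbers("<math><mn>1</mn><mn>2</mn><mo>+</mo><mn>3</mn></math>"))
# prints <math><mn>12</mn><mo>+</mo><mn>3</mn></math>
```

The sketch deliberately leaves <mn> elements with attributes alone and
does not try to handle decimal separators; a real filter would have to
decide both, and would also need to keep the MathML namespace prefix
intact when serializing.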

Best regards,
Michal


Re: [tex4ht] [bug #378] Wrong MathML output for numbers

2018-02-02 Thread Deimantas Galcius
There is a section "Digits  into Numbers"
in the tex4ht-xhtmml-xtpipes.tex file.
If I read it correctly, the code deals with
merging "mn" nodes. Apparently Eitan's approach
was to postprocess the HTML file with xtpipes (and not to fix the C code).
 


On Feb 2, 2018, at 1:19 AM, Karl Berry wrote:

> Follow-up Comment #1, bug #378 (project tex4ht):
> 
> I'm sorry, I don't understand yet. Do you know where the "" string
> actually gets emitted? Because "mn" only occurs as a word in a few places
> anywhere in the *.tex files (and not in tex4ht-c.tex), none of which looks
> especially relevant to the creation of the 1 to me.
> 
> Anyway, in tex4ht-mathltx.tex, there are some \Configure commands for some,
> but not all, of the numerals, starting around line 2245. Only 0..6 are
> configured there, and only 4..5 with the "mathltx-" option (whatever that is).
> What happens with 7..9? Does any of this make sense to you?
> 
> I wonder if Eitan provided an option for determining the behavior of the
> numerals in math mode. So wish he were still here ...