Re: [tex4ht] [bug #274] tex4ht features vs. lwarp vs. ...

2016-03-30 Thread William F Hammond
On Tue, Mar 29, 2016 at 8:26 AM, Michal wrote, quoting CVR:

> > The user needs to tag math as $a_{\mathbf{n}}$ for perfect MathML output
> > without intervention.  Another common example found in documents is
> > $(a$), this would be passed by TeX, but not MathML since the closing
> > parenthesis is outside math. Prof William Hammond has been campaigning
> > for profiled LaTeX for several years now, but many users are hardly
> > bothered since they expect other systems to adapt to their non-standard
> > tagging methods. This can only result in a frustrating experience with
> > tex4ht unfortunately.
> >
>
> We can educate users who actively want to convert their documents;
> they really need to understand the nature of HTML and MathML in order
> to produce valid output. The flexibility of TeX is generally a good
> thing and a feature; only the abusing users are a problem :)
>

Most of us read documentation and instructions as little as possible.
Moreover, there is a low level of language inconsistency in free-ranging
use of TeX markup.

For that reason I think the only way to 'educate' users is to provide a
layer of (1) syntax enforcement and (2) source validation under a suitable
LaTeX profile.

With tex4ht I think the way to do that is first, regardless of final output
format (and like latexml), to make an xml (or sgml for more 'power')
document under a suitable tex4ht LaTeX profile.  Use XSLT or a standard
sgml library (for more power) like sgmlspl (perl) or OpenSP (C++) to
translate the profiled document to whatever end format.

  -- Bill

-- 
William F Hammond
Email: gel...@gmail.com
https://www.facebook.com/william.f.hammond
http://www.albany.edu/~hammond/


Re: [tex4ht] [bug #274] tex4ht features vs. lwarp vs. ...

2016-03-29 Thread Michal Hoftich
On Sat, Mar 26, 2016 at 5:44 AM, Radhakrishnan CV wrote:
> On Tue, Mar 22, 2016 at 9:51 PM, Michal Hoftich wrote:
>>
>> Follow-up Comment #1, bug #274 (project tex4ht):
>>
>> That thread has a lot of flaming potential, which I don't want to fire up :)
>> But we probably should clarify some misunderstandings about tex4ht.
>
>
> It is more of an impractical approach to tex4ht. The \Configure mechanism in
> tex4ht relies on seeding of configure hooks in the original macros of a
> given package used in the document. Here the question is who will do the
> seeding. Eitan had done it extensively for some 400-odd packages
> popular during his time. Every day, several packages get updated, newer
> packages get released, people do use many of these at their will and
> freedom. Hence, in the absence of *.4ht's that have hooks-seeded functions
> of newly released/revised packages, it is obvious that tex4ht will break
> down. I have seen documents that use packages like xstring, breqn,
> stackengine, tabstackengine, siunitx, acronym, etc at work and we are
> expected to generate XML out of these documents.  Many packages like
> siunitx, acronym, breqn are now written in expl3 which is another
> challenge.

expl3 shouldn't be a big challenge (other than its somewhat unusual
syntax). On the contrary, it encourages separation of code and design,
so it should be easier to insert tex4ht hooks, at least in theory. For
example, the xtemplate package seems to have ideas in quite a good direction.
>
> I would consider tex4ht as a backend. Many packages sadly lack the driver
> (*.4ht) for this backend. The best people to write this backend are the
> authors of these packages since others need more time and in-depth knowledge
> of these packages to write backend drivers.

Sure, the authors are the best persons for that. There are also
packages which cause tex4ht to fail compilation once they are included
(most notoriously fontspec).
>

> This being the reality, personally I have chosen to redefine macros from
> different packages, in an add-on configuration to be used for XML/HTML
> generation. I agree that this is not the right way or preferred way, but
> practically that is the only solution when one is expected to handle
> hundreds of documents every day with several funny packages and functions
> profusely used. This way, tex4ht works wonderfully well with minimal effort
> for me and I am sure, tex4ht is the best engine among all that would
> generate another markup from TeX documents.

Sometimes it is the only option, because users are really innovative in
writing custom macros. I've tried to convert some mathematical books
from Project Gutenberg to EPUB 3 and there were macros which abused
\section commands just to write bold and large text. While it produced
the desired output in PDF, the HTML was obviously a total mess.

>
> In view of the above, I suggest that we would request authors to provide the
> backend driver of their packages for tex4ht. This is the only practically
> feasible solution. An example would be hyperref: the primary author of this
> package (Sebastian Rahtz) also wrote its tex4ht driver, since Sebastian was a
> big user and admirer of tex4ht.
>

I agree :)

> The tex4ht team might come up with necessary documentation of how to write
> .4ht for a package that would largely help authors. If each author spends a
> few more minutes to do their bit, usage of tex4ht will be a pleasure then.
> Since HTML is gaining more popularity/usage owing to support of smart
> devices and for its re-flowing ability without losing format features to
> suit the dimensions of device screens (a severe handicap of PDF), authors
> shall invest a bit more energy to provide tex4ht support which is as
> essential as the one provided for outputs like PDF.

I agree as well. The question is how to create that documentation.
I've tried to write a tex4ht tutorial as a wiki on GitHub and I found
Markdown too limited (I really don't understand why it is so
popular nowadays. It is indeed easy to write some basic formatting
with it, which is fine for Stack Exchange answers or a personal note
archive, but it is a real pain as soon as one needs some more advanced
feature). It also doesn't make much sense to use anything other than
TeX for tex4ht documentation :). So the question is where to host the
source code and the generated documentation. Here on Puszcza? Or on
GitHub? GitHub has built-in support for page hosting and is easier for
collaboration.


> Secondly, the permissive nature of TeX/LaTeX. $a_{\bf n}$ will create the
> correct output in pdf, but will not generate the right kind of output in
> MathML which should be like.
>
>   <math>
>     <msub>
>       <mrow>
>         <mi>a</mi>
>       </mrow>
>       <mrow>
>         <mi mathvariant="bold">n</mi>
>       </mrow>
>     </msub>
>   </math>
>
> The user needs to tag math as $a_{\mathbf{n}}$ for perfect MathML output
> without intervention.  Another common example found in documents is $(a
> $), this would be passed by TeX, but 

Re: [tex4ht] [bug #274] tex4ht features vs. lwarp vs. ...

2016-03-25 Thread Radhakrishnan CV
On Tue, Mar 22, 2016 at 9:51 PM, Michal Hoftich wrote:

> Follow-up Comment #1, bug #274 (project tex4ht):
>
> That thread has a lot of flaming potential, which I don't want to fire up :)
> But we probably should clarify some misunderstandings about tex4ht.
>

It is more of an impractical approach to tex4ht. The \Configure mechanism
in tex4ht relies on seeding of configure hooks in the original macros of a
given package used in the document. Here the question is who will do the
seeding. Eitan had done it extensively for some 400-odd packages
popular during his time. Every day, several packages get updated, newer
packages get released, people do use many of these at their will and
freedom. Hence, in the absence of *.4ht's that have hooks-seeded functions
of newly released/revised packages, it is obvious that tex4ht will break
down. I have seen documents that use packages like xstring, breqn,
stackengine, tabstackengine, siunitx, acronym, etc at work and we are
expected to generate XML out of these documents.  Many packages like
siunitx, acronym, breqn are now written in expl3 which is another
challenge.

I would consider tex4ht as a backend. Many packages sadly lack the driver
(*.4ht) for this backend. The best people to write this backend are the
authors of these packages since others need more time and in-depth
knowledge of these packages to write backend drivers.

This being the reality, personally I have chosen to redefine macros from
different packages, in an add-on configuration to be used for XML/HTML
generation. I agree that this is not the right way or preferred way, but
practically that is the only solution when one is expected to handle
hundreds of documents every day with several funny packages and functions
profusely used. This way, tex4ht works wonderfully well with minimal effort
for me and I am sure, tex4ht is the best engine among all that would
generate another markup from TeX documents.

In view of the above, I suggest that we would request authors to provide
the backend driver of their packages for tex4ht. This is the only
practically feasible solution. An example would be hyperref: the primary
author of this package (Sebastian Rahtz) also wrote its tex4ht driver, since
Sebastian was a big user and admirer of tex4ht.

The tex4ht team might come up with necessary documentation of how to write
.4ht for a package that would largely help authors. If each author spends a
few more minutes to do their bit, usage of tex4ht will be a pleasure then.
Since HTML is gaining more popularity/usage owing to support of smart
devices and for its re-flowing ability without losing format features to
suit the dimensions of device screens (a severe handicap of PDF), authors
shall invest a bit more energy to provide tex4ht support which is as
essential as the one provided for outputs like PDF.

Secondly, the permissive nature of TeX/LaTeX. $a_{\bf n}$ will create the
correct output in pdf, but will not generate the right kind of output in
MathML which should be like.

  <math>
    <msub>
      <mrow>
        <mi>a</mi>
      </mrow>
      <mrow>
        <mi mathvariant="bold">n</mi>
      </mrow>
    </msub>
  </math>

The user needs to tag math as $a_{\mathbf{n}}$ for perfect MathML output
without intervention.  Another common example found in documents is
$(a$), this would be passed by TeX, but not MathML since the closing
parenthesis is outside math. Prof William Hammond has been campaigning for
profiled LaTeX for several years now, but many users are hardly bothered
since they expect other systems to adapt to their non-standard tagging
methods. This can only result in a frustrating experience with tex4ht
unfortunately.
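
To make the two tagging problems above concrete, here is a minimal Python sketch of a source check. It is only an illustration and not part of tex4ht or of any existing LaTeX profile: the names lint, INLINE_MATH and OLD_FONT_SWITCH are invented for this example, and it assumes inline math is written with single dollar signs.

```python
import re

# Inline math between single, unescaped dollar signs.
# Assumption of this sketch: \( ... \) and display math are not handled.
INLINE_MATH = re.compile(r"(?<!\\)\$(?!\$)(.+?)(?<!\\)\$", re.DOTALL)

# Old-style font switches such as \bf; \mathbf does not match because
# the backslash must be followed directly by the switch name.
OLD_FONT_SWITCH = re.compile(r"\\(bf|it|rm|sf|tt|cal)(?![A-Za-z])")


def paren_balance_ok(math):
    """Return True if every '(' inside the math body has a matching ')'."""
    depth = 0
    for ch in math:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


def lint(source):
    """Return a list of (line number, message) pairs for suspicious inline math."""
    problems = []
    for match in INLINE_MATH.finditer(source):
        body = match.group(1)
        line = source.count("\n", 0, match.start()) + 1
        for switch in OLD_FONT_SWITCH.finditer(body):
            name = switch.group(1)
            problems.append(
                (line, f"old font switch \\{name} in math; use \\math{name}{{...}}")
            )
        if not paren_balance_ok(body):
            problems.append((line, "unbalanced parenthesis inside math"))
    return problems
```

Calling lint on a source string that contains $a_{\bf n}$ reports the old font switch, and $(a$) is reported as an unbalanced parenthesis, while $a_{\mathbf{n}}$ passes. The parenthesis check is deliberately crude: legitimate notation such as the half-open interval $(0,1]$ would also be flagged, so a real profile would need a proper parser.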


> Regarding bug reports, I've tried to compile the LWARP documentation. It
> needed some fixes: there was a problem with \label commands used inside
> \caption, which totally explodes tex4ht. I am not sure whether
> \caption{caption text\label{some label}} is a legal construct in LaTeX, but
> it should compile without errors.
>

This is legal tagging and works OK for me.



> There is also missing cleveref support, which I thought I reported last
> summer, but apparently didn't. I have some basic cleveref.4ht file, which
> works except for links.
>
> \fbox contents often overflow the border, it is probably just some CSS issue
>
> SVGs produced by TikZ are often invalid, especially when font formatting
> commands are used in diagrams. I personally use TikZ externalization
> instead of the built-in tex4ht support, which doesn't work correctly in
> most cases.
>

Entirely agree with you. Maybe we can provide an extra package for TikZ
(owing to its extensive usage) that writes out TikZ picture sources to an
external TeX file, making it easy to process them separately, generate
PDF and convert to the desired graphic format. The package would also flag
the figures automatically in the HTML output.  All can be done in one go if
the user dares to invoke shell-escape.
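
The extraction step of this idea can be sketched in a few lines of Python. This is a stand-in for illustration only, not the LaTeX package suggested above: the function name externalize, the file prefix tikz-fig and the use of the standalone class are assumptions of the example, and it does not copy the preamble (TikZ libraries, user macros) that real figures would need.

```python
import re

# Non-greedy match of one tikzpicture environment.
# Assumption of this sketch: environments are not nested.
TIKZ_ENV = re.compile(
    r"\\begin\{tikzpicture\}.*?\\end\{tikzpicture\}", re.DOTALL
)

# Wrapper that turns one picture into a compilable document.
STANDALONE_TEMPLATE = (
    "\\documentclass[tikz]{standalone}\n"
    "\\begin{document}\n"
    "%s\n"
    "\\end{document}\n"
)


def externalize(source, prefix="tikz-fig"):
    """Split TikZ pictures out of a LaTeX source string.

    Returns the main source with each picture replaced by an
    \\includegraphics reference, plus a dict that maps hypothetical
    file names to the standalone TeX source of each picture.
    """
    figures = {}

    def replace(match):
        name = f"{prefix}-{len(figures) + 1}"
        figures[name + ".tex"] = STANDALONE_TEMPLATE % match.group(0)
        return "\\includegraphics{" + name + "}"

    main = TIKZ_ENV.sub(replace, source)
    return main, figures
```

Each entry of the returned dict would then be written to disk, compiled separately and converted to the graphic format wanted for the HTML output; those steps are left out here because they depend on the local tool chain.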


> [...]
>
> PS: we really need some more collaborators