Re: [tex4ht] [bug #274] tex4ht features vs. lwarp vs. ...
On Tue, Mar 29, 2016 at 8:26 AM, Michal wrote, quoting CVR:

> > The user needs to tag math as $a_{\mathbf{n}}$ for perfect MathML output
> > without intervention. Another common example found in documents is
> > $(a $), which TeX accepts but MathML does not, since the closing
> > parenthesis is outside math. Prof William Hammond has been campaigning
> > for profiled LaTeX for several years now, but many users are hardly
> > bothered, since they expect other systems to adapt to their non-standard
> > tagging methods. This can only result in a frustrating experience with
> > tex4ht, unfortunately.
>
> We can educate users who actively want to convert their documents;
> they really need to understand the nature of HTML and MathML in order
> to produce valid output. The flexibility of TeX is generally a good thing
> and a feature; only the users who abuse it are a problem :)

Most of us read documentation and instructions as little as possible.
Moreover, there is a low level of language inconsistency in free-ranging
use of TeX markup.

For that reason I think the only way to 'educate' users is to provide a
layer of (1) syntax enforcement and (2) source validation under a suitable
LaTeX profile.

With tex4ht I think the way to do that is first, regardless of the final
output format (and like latexml), to make an XML (or SGML, for more 'power')
document under a suitable tex4ht LaTeX profile. Then use XSLT, or a standard
SGML library (for more power) such as sgmlspl (Perl) or OpenSP (C++), to
translate the profiled document to whatever end format is wanted.

-- Bill

--
William F Hammond
Email: gel...@gmail.com
https://www.facebook.com/william.f.hammond
http://www.albany.edu/~hammond/
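The two tagging problems quoted in this message can be put side by side in a small test file. This is only an illustrative sketch: the document is invented for the purpose, and it shows the source-level difference, not what any particular tex4ht version emits.

```latex
\documentclass{article}
\begin{document}

% Tagging that PDF output tolerates but that is hard to map to MathML:
% \bf is an old-style font switch, and the closing parenthesis
% sits outside the math.
$a_{\bf n}$ and $(a $)

% Tagging that keeps structure explicit:
% \mathbf takes an argument, and both parentheses are inside the math.
$a_{\mathbf{n}}$ and $(a)$

\end{document}
```

Both paragraphs should look essentially the same in PDF; the difference only matters once the source is converted to structured markup.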
Re: [tex4ht] [bug #274] tex4ht features vs. lwarp vs. ...
On Sat, Mar 26, 2016 at 5:44 AM, Radhakrishnan CV wrote:

> On Tue, Mar 22, 2016 at 9:51 PM, Michal Hoftich wrote:
>>
>> Follow-up Comment #1, bug #274 (project tex4ht):
>>
>> That thread has a lot of flaming potential, which I don't want to fire up :)
>> But we probably should clarify some misunderstandings about tex4ht.
>
> It is more of an impractical approach to tex4ht. The \Configure mechanism in
> tex4ht relies on seeding configure hooks into the original macros of a
> given package used in the document. Here the question is who will do the
> seeding. Eitan did it extensively for some 400-odd packages popular in
> his time. Every day, several packages get updated and newer packages get
> released, and people use many of these at their will and freedom. Hence,
> in the absence of *.4ht files with hook-seeded functions for newly
> released or revised packages, it is obvious that tex4ht will break down.
> I have seen documents that use packages like xstring, breqn, stackengine,
> tabstackengine, siunitx, acronym, etc. at work, and we are expected to
> generate XML out of these documents. Many packages like siunitx, acronym
> and breqn are now written in expl3, which is another challenge.

expl3 shouldn't be a big challenge (other than its somewhat unusual syntax).
On the contrary, it encourages separating code from design, so it should be
easier to insert tex4ht hooks, at least in theory. For example, the xtemplate
package seems to have ideas that go in quite a good direction.

> I would consider tex4ht as a backend. Many packages sadly lack the driver
> (*.4ht) for this backend. The best people to write this backend are the
> authors of these packages, since others need more time and in-depth
> knowledge of these packages to write backend drivers.

Sure, the authors are the best persons for that. There are also packages
which cause tex4ht to fail compilation as soon as they are included (most
notoriously fontspec).
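As a rough illustration of what "seeding hooks" means in the discussion above, here is a minimal sketch of a driver file. The package name `mypkg` and its macro `\hello` are hypothetical, and the sketch assumes the usual convention that `:` can be used in macro names inside .4ht files; it is not taken from any existing driver.

```latex
% mypkg.4ht: hypothetical driver for a hypothetical package "mypkg"
% that defines \hello{<text>}.

% Declare a configuration named "hello" with two hooks,
% available as \a:hello and \b:hello.
\NewConfigure{hello}{2}

% Re-define the package macro so that the hooks surround its content.
\renewcommand\hello[1]{\a:hello #1\b:hello}

% Default content of the hooks; users can override it from a .cfg file.
\Configure{hello}
  {\HCode{<span class="hello">}}
  {\HCode{</span>}}

\Hinput{mypkg}
\endinput
```

The point of the indirection is that the markup is not hard-wired into the macro: a different output format only needs a different `\Configure{hello}` call.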
>
> This being the reality, personally I have chosen to redefine macros from
> different packages in an add-on configuration to be used for XML/HTML
> generation. I agree that this is not the right or preferred way, but
> practically it is the only solution when one is expected to handle
> hundreds of documents every day with several funny packages and functions
> profusely used. This way, tex4ht works wonderfully well with minimal effort
> for me, and I am sure tex4ht is the best engine among all that generate
> another markup from TeX documents.

Sometimes it is the only option, because users are really inventive when
writing custom macros. I've tried to convert some mathematical books from
Project Gutenberg to EPUB3, and there were macros which abused \section
commands just to print bold, large text. While this produced the desired
output in PDF, the HTML was obviously a total mess.

> In view of the above, I suggest that we request authors to provide the
> backend driver of their packages for tex4ht. This is the only practically
> feasible solution. An example is hyperref: the primary author of this
> package (Sebastian Rahtz) also wrote its tex4ht driver, since Sebastian
> was a big user and admirer of tex4ht.

I agree :)

> The tex4ht team might come up with the necessary documentation on how to
> write a .4ht file for a package, which would largely help authors. If each
> author spends a few more minutes to do their bit, using tex4ht will be a
> pleasure. Since HTML is gaining popularity owing to its support on smart
> devices and its ability to re-flow to suit the dimensions of device screens
> without losing format features (a severe handicap of PDF), authors should
> invest a bit more energy to provide tex4ht support, which is as essential
> as the support provided for outputs like PDF.

I agree as well. There is a question of how to create that documentation.
I've tried to write a tex4ht tutorial as a wiki on GitHub, and I've found
Markdown too limited. (I really don't understand why it is so popular
nowadays. It is indeed easy to write some basic formatting with it, which is
fine for Stack Exchange answers or a personal note archive, but it is a real
pain as soon as one needs some more advanced feature.) It also doesn't make
much sense to use anything other than TeX for tex4ht documentation :)

So the question is where to host the source code and the generated
documentation. Here on Puszcza? Or GitHub? The latter has built-in support
for page hosting and is easier for collaboration.

> Secondly, the permissive nature of TeX/LaTeX. $a_{\bf n}$ will create the
> correct output in PDF, but will not generate the right kind of output in
> MathML, which should look like:
>
>   <msub>
>     <mi>a</mi>
>     <mi mathvariant="bold">n</mi>
>   </msub>
>
> The user needs to tag math as $a_{\mathbf{n}}$ for perfect MathML output
> without intervention. Another common example found in documents is
> $(a $), this would be passed by TeX, but
Re: [tex4ht] [bug #274] tex4ht features vs. lwarp vs. ...
On Tue, Mar 22, 2016 at 9:51 PM, Michal Hoftich wrote:

> Follow-up Comment #1, bug #274 (project tex4ht):
>
> That thread has a lot of flaming potential, which I don't want to fire up :)
> But we probably should clarify some misunderstandings about tex4ht.

It is more of an impractical approach to tex4ht. The \Configure mechanism in
tex4ht relies on seeding configure hooks into the original macros of a given
package used in the document. Here the question is who will do the seeding.
Eitan did it extensively for some 400-odd packages popular in his time.
Every day, several packages get updated and newer packages get released, and
people use many of these at their will and freedom. Hence, in the absence of
*.4ht files with hook-seeded functions for newly released or revised
packages, it is obvious that tex4ht will break down. I have seen documents
that use packages like xstring, breqn, stackengine, tabstackengine, siunitx,
acronym, etc. at work, and we are expected to generate XML out of these
documents. Many packages like siunitx, acronym and breqn are now written in
expl3, which is another challenge.

I would consider tex4ht as a backend. Many packages sadly lack the driver
(*.4ht) for this backend. The best people to write this backend are the
authors of these packages, since others need more time and in-depth
knowledge of these packages to write backend drivers.

This being the reality, personally I have chosen to redefine macros from
different packages in an add-on configuration to be used for XML/HTML
generation. I agree that this is not the right or preferred way, but
practically it is the only solution when one is expected to handle hundreds
of documents every day with several funny packages and functions profusely
used. This way, tex4ht works wonderfully well with minimal effort for me,
and I am sure tex4ht is the best engine among all that generate another
markup from TeX documents.
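The add-on configuration approach described above might look roughly like the following. This is a sketch under assumptions, not the configuration actually used at work: the file name `myconfig.cfg`, the macro `\bigtitle` and the CSS class are all hypothetical, and the sketch assumes the document itself defines `\bigtitle` with one argument.

```latex
% myconfig.cfg: hypothetical add-on configuration for tex4ht.
\Preamble{xhtml}

% Replace the visual definition from the document with one that
% writes explicit markup around the argument.
\renewcommand\bigtitle[1]{%
  \HCode{<span class="bigtitle">}#1\HCode{</span>}}

% Styling goes to the generated CSS file instead of the macro.
\Css{.bigtitle{font-size:1.5em; font-weight:bold;}}

\begin{document}
\EndPreamble
```

Such a file can be passed to the conversion, for example with `make4ht -c myconfig.cfg document.tex`, so the original source stays untouched and still compiles to PDF as before.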
In view of the above, I suggest that we request authors to provide the
backend driver of their packages for tex4ht. This is the only practically
feasible solution. An example is hyperref: the primary author of this
package (Sebastian Rahtz) also wrote its tex4ht driver, since Sebastian was
a big user and admirer of tex4ht.

The tex4ht team might come up with the necessary documentation on how to
write a .4ht file for a package, which would largely help authors. If each
author spends a few more minutes to do their bit, using tex4ht will be a
pleasure. Since HTML is gaining popularity owing to its support on smart
devices and its ability to re-flow to suit the dimensions of device screens
without losing format features (a severe handicap of PDF), authors should
invest a bit more energy to provide tex4ht support, which is as essential as
the support provided for outputs like PDF.

Secondly, the permissive nature of TeX/LaTeX. $a_{\bf n}$ will create the
correct output in PDF, but will not generate the right kind of output in
MathML, which should look like:

  <msub>
    <mi>a</mi>
    <mi mathvariant="bold">n</mi>
  </msub>

The user needs to tag math as $a_{\mathbf{n}}$ for perfect MathML output
without intervention. Another common example found in documents is $(a $),
which TeX accepts but MathML does not, since the closing parenthesis is
outside math. Prof William Hammond has been campaigning for profiled LaTeX
for several years now, but many users are hardly bothered, since they expect
other systems to adapt to their non-standard tagging methods. This can only
result in a frustrating experience with tex4ht, unfortunately.

> Regarding bug reports, I've tried to compile the lwarp documentation. It
> needed some fixes: there was a problem with \label commands used inside
> \caption, which makes tex4ht explode completely. I am not sure whether
> \caption{caption text\label{some label}} is a legal construct in LaTeX,
> but it should compile without errors.

This is legal tagging and works OK for me.
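For anyone who wants to reproduce the \caption/\label case, a minimal test file along these lines should be enough. The label name and the figure content are invented for the test; the construct itself is the one quoted above.

```latex
\documentclass{article}
\begin{document}

\begin{figure}
  \centering
  \fbox{placeholder for a figure}
  % The \label is placed inside the argument of \caption.
  \caption{Caption text\label{fig:demo}}
\end{figure}

See Figure~\ref{fig:demo}.

\end{document}
```

If this compiles with pdflatex but fails under tex4ht, the problem is more likely in some other package loaded by the real document than in the construct itself.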
> cleveref support is also missing, which I thought I had reported last
> summer, but apparently didn't. I have a basic cleveref.4ht file, which
> works except for links.
>
> \fbox contents often overflow the border; it is probably just a CSS
> issue.
>
> SVGs produced by TikZ are often invalid, especially when font formatting
> commands are used in diagrams. I personally use TikZ externalization
> instead of the built-in tex4ht support, which doesn't work correctly in
> most cases.

Entirely agree with you. Maybe we can provide an extra package for TikZ
(owing to its extensive usage) that writes tikzpicture sources out to an
external TeX file, making it easier to process them separately, generate PDF
and convert to the desired graphic format. The package would also flag the
figures automatically in the HTML output. All of this can be done in one go
if the user dares to invoke shell-escape.

> [...]
>
> PS: we really need some more collaborators
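The externalization mentioned above is already available in TikZ itself through its `external` library. The following is only a sketch of the plain LaTeX side; it does not show the proposed extra package or how the resulting images would be included in the HTML output, which would still need tex4ht configuration.

```latex
\documentclass{article}
\usepackage{tikz}
\usetikzlibrary{external}

% Compile every tikzpicture as a separate job and reuse the result.
% This needs shell escape, e.g.: pdflatex -shell-escape document.tex
\tikzexternalize

\begin{document}

\begin{tikzpicture}
  \draw (0,0) -- (2,1) node[right] {\textbf{label}};
\end{tikzpicture}

\end{document}
```

With the default settings each picture is written to its own PDF file named after the job (something like `document-figure0.pdf`), which can then be converted to SVG or PNG by an external tool.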