Hi,
I plan to apply to this year's Summer of Code and I wanted to share my
idea with you for discussion. I've been using DocBook both for personal
documents (CV, Bachelor's and Master's theses, papers, etc.) and in the
FreeBSD Project. I'm usually satisfied with the output that Apache FOP
creates but I believe the PDF generation should be better supported. The
only usable open source XSL FO renderer is Apache FOP (xmlroff is very
immature and is not actively developed) so we have no alternatives.
Besides, in my experience it crashes on large documents. As a
result, in FreeBSD, we are still using DocBook 4.2/4.5 with DSSSL
stylesheets to render PDF since we are not satisfied with this
situation. The lack of alternatives could mean technology/vendor
lock-in, FOP does not work with some of our big documents, and Java is a
heavy dependency. I've worked with XSL FO and quite like how it works,
but given these factors I am wary of basing a serious
project on XSL FO and FOP. Because of the complexity of XSL FO, it is
not easy to write a renderer so this situation will probably not change
in the near future. LaTeX, by contrast, has been around for a long time;
people use it, are familiar with it, and it is very well supported.
Probably no one would consider it a risk to base a documentation project
on LaTeX. But DocBook is more semantic and I like it much more. Also, the
XHTML and EPUB generation of DocBook is of a really high quality. So I'd
like to combine the advantages of DocBook and LaTeX and create
stylesheets that produce LaTeX output that can be used for printable
formats. I suspect other people feel the same way, and having
this functionality would improve DocBook's acceptance in the industry.
There are two projects, db2latex and dblatex, which provide such
functionality but they do not integrate well with the existing
stylesheets and dblatex also introduces a new dependency, Python. I'd
like to create a new solution that is purely XSLT-based and integrates
with the existing DocBook XSL facilities (titlepage, I18N, etc.). My
idea is to first create an XML serialization of TeX (TeXML or a slightly
revised version of that), which actually has the same abstract syntax as
TeX but is an XML document. Then I'd write the actual stylesheets that
transform DocBook documents into this XML form of TeX, which a second
pass would then turn into real TeX. So the first (more complex)
pass would produce XML and only the second pass would output plain text.
XSLT is not the best tool to produce plain text but this approach would
mitigate this problem and at the same time avoid having to introduce new
dependencies.
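To make the second pass more concrete, here is a minimal sketch of what
it has to do: walk the intermediate XML tree, escape the characters that
are special in TeX, and emit plain text. It is written in Python only
for brevity; the proposal itself would implement this pass in XSLT. The
element names (cmd, parm, env) are loosely modeled on TeXML and are
assumptions not checked against its schema, and the escape table is
illustrative rather than complete.

```python
import xml.etree.ElementTree as ET

# Characters that are special in TeX and must be escaped in text nodes.
# Illustrative table, not a complete one.
TEX_ESCAPES = {
    "\\": r"\textbackslash{}",
    "{": r"\{",
    "}": r"\}",
    "$": r"\$",
    "&": r"\&",
    "#": r"\#",
    "%": r"\%",
    "_": r"\_",
    "^": r"\textasciicircum{}",
    "~": r"\textasciitilde{}",
}


def escape(text):
    """Escape TeX special characters; None is treated as empty text."""
    return "".join(TEX_ESCAPES.get(ch, ch) for ch in (text or ""))


def serialize(node):
    """Second pass: turn the intermediate XML tree into TeX source."""
    inner = escape(node.text) + "".join(
        serialize(child) + escape(child.tail) for child in node
    )
    if node.tag == "cmd":
        # A command without arguments gets "{}" so that the command
        # name is terminated before any following text.
        return "\\" + node.get("name") + (inner if len(node) else "{}")
    if node.tag == "parm":
        return "{" + inner + "}"
    if node.tag == "env":
        name = node.get("name")
        return "\\begin{%s}%s\\end{%s}" % (name, inner, name)
    return inner  # root or unknown element: emit its content only


# Hypothetical output of the first (DocBook to XML TeX) pass.
SAMPLE = (
    '<TeXML>'
    '<cmd name="section"><parm>Costs &amp; benefits</parm></cmd>\n'
    '<env name="itemize">\n'
    '<cmd name="item"/> 100% XSLT\n'
    '</env>\n'
    '</TeXML>'
)

tex = serialize(ET.fromstring(SAMPLE))
print(tex)
```

The point of the split is that all escaping and brace balancing lives in
this one small pass, so the DocBook stylesheets themselves only ever
build well-formed XML.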
This is a big project but I believe that the summer is enough to create
something stable and useful, even if it does not have the same
feature-completeness as the XHTML and XSL FO output formats. Last
summer I ported DocBook Slides to DocBook 5.0 and created stylesheets
for XHTML and XSL FO output. I also have DocBook and XML/XSLT
experience from working on the FreeBSD documentation.
Do you think this would be useful? Any potential mentors? Please share your
thoughts.
Thanks,
Gabor