Hello List,
I would like to propose a project, "canonical DocBook", to the DocBook
TC, and I am interested in the opinion of this mailing list. I hope this
is the right list.
DocBook is a great system for creating technical documents. We use it
successfully for various purposes, which include transformation to
formats other than HTML and PDF. For example, we work with stylesheets
for transformation into ODF and into NISO-STS.
For such transformations, the flexibility of the DocBook schemas is a
problem, because it increases the complexity of the stylesheets. To give
a very simple example, the title of a section is valid both with and
without an enclosing info element. A template that transforms the title
element must account for both possibilities.
Our own stylesheets are therefore divided into at least two phases.
First, the input document is transformed into a uniform structure. This
ensures, for example, that each title element is always contained in an
info element. In a second step, the document is converted into the
target format. The advantage of this method is that the transformation
in the second phase becomes much simpler.
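As a rough illustration of the first phase, here is a minimal Python
sketch (standard library only) that moves every directly placed title
into an info element. It is not taken from our stylesheets or from any
existing stylesheet; the function name wrap_titles_in_info is
hypothetical, and the sketch assumes DocBook 5 input in which every
element that carries a title also allows info.

```python
import xml.etree.ElementTree as ET

DB = "http://docbook.org/ns/docbook"   # DocBook 5 namespace
ET.register_namespace("", DB)

INFO = f"{{{DB}}}info"
TITLE_TAGS = [f"{{{DB}}}{n}" for n in ("title", "titleabbrev", "subtitle")]


def wrap_titles_in_info(root):
    """Move title, titleabbrev and subtitle children into an info child.

    Assumption: info is allowed wherever a title is found; a real
    implementation would have to check this against the schema.
    """
    # Snapshot the tree first, because the loop modifies it.
    for parent in list(root.iter()):
        if parent.tag == INFO:
            continue  # titles inside info are already canonical
        titles = [child for child in parent if child.tag in TITLE_TAGS]
        if not titles:
            continue
        info = parent.find(INFO)
        if info is None:
            # Create info at the position of the first title.
            info = ET.Element(INFO)
            parent.insert(list(parent).index(titles[0]), info)
        for pos, title in enumerate(titles):
            parent.remove(title)
            info.insert(pos, title)
    return root


doc = ET.fromstring(
    f'<section xmlns="{DB}"><title>Intro</title><para>Text</para></section>'
)
wrap_titles_in_info(doc)
print(ET.tostring(doc, encoding="unicode"))
```

After this step, a second-phase template only has to handle info/title.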
As far as I can see, the XSLT 3.0 stylesheets of xslTNG have a similar
structure, although they are certainly much more professional,
comprehensive and systematic in design. So there is a point in these
stylesheets where the input document is in a sort of "canonical
DocBook". However, this canonical format is not documented.
My suggestion is that the DocBook TC standardize and document the
canonical DocBook format. Subsequently, stylesheets for transforming
valid DocBook 5 documents into the canonical format would be published.
Possibly these already exist as part of the xslTNG stylesheets. The
advantage would be that other projects could more easily transform
canonical DocBook to other formats. They would be able to build on a
standard, documented DocBook format of lower complexity.
Besides the simple example of the title elements, canonical DocBook
would have to consider the following aspects, among others:
para/simpara: Canonical DocBook should only support simpara. A para with
block content (tables, lists) must be transformed into a sequence of
simpara elements and other block content.
Tables: In canonical DocBook, each table must have table column
specifications. Default values are replaced by explicit values. Spanspec
elements are converted to corresponding column start and end positions.
Each cell of a table must have information about its position within the
table, so that it is possible to determine at which column it starts and
where it ends without complex calculations. The content of a table cell
must be element-only.
Images: Each image must have at least the attributes for image size and
scaling.
emphasis: Explicit values instead of default values (e.g. role='bold'),
and a list of role values that must be supported (bold, italic,
underline).
Lists: Explicit values instead of default values (e.g. numeration for
orderedlist).
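To make two of the points above more concrete, the following Python
sketch (standard library only) shows how a para with block content could
be split into simpara and block elements, and how the column range of
each CALS table cell could be computed once. It is only a sketch under
assumptions: the names split_para and cell_positions are hypothetical,
the list of block elements is deliberately incomplete, attributes of the
original para are dropped, entrytbl is ignored, and every named column
is assumed to be declared in a colspec.

```python
import copy
import xml.etree.ElementTree as ET

DB = "http://docbook.org/ns/docbook"   # DocBook 5 namespace


def q(name):
    """Qualify a local name with the DocBook namespace."""
    return f"{{{DB}}}{name}"


# Assumption: a deliberately short list of block-level children of para.
BLOCKS = {q(n) for n in ("itemizedlist", "orderedlist", "variablelist",
                         "table", "informaltable", "programlisting")}


def split_para(para):
    """Return the sequence of simpara and block elements replacing para."""
    result = []
    current = ET.Element(q("simpara"))
    current.text = para.text
    for child in para:
        if child.tag in BLOCKS:
            # Close the running simpara unless it is empty.
            if (current.text or "").strip() or len(current):
                result.append(current)
            block = copy.deepcopy(child)
            current = ET.Element(q("simpara"))
            # Text after the block starts the next simpara.
            current.text, block.tail = child.tail, None
            result.append(block)
        else:
            current.append(copy.deepcopy(child))  # inline, tail included
    if (current.text or "").strip() or len(current):
        result.append(current)
    return result


def cell_positions(tgroup):
    """Return (entry, first_column, last_column) per entry, 1-based."""
    colnum, n = {}, 0
    for cs in tgroup.findall(q("colspec")):
        n = int(cs.get("colnum", n + 1))
        if cs.get("colname"):
            colnum[cs.get("colname")] = n
    spans = {s.get("spanname"): (s.get("namest"), s.get("nameend"))
             for s in tgroup.findall(q("spanspec"))}
    positions = []
    for section in tgroup:
        if section.tag not in (q("thead"), q("tbody"), q("tfoot")):
            continue
        blocked = {}  # column -> following rows still covered by morerows
        for row in section.findall(q("row")):
            cursor, started = 1, {}
            for entry in row.findall(q("entry")):
                namest, nameend = spans.get(
                    entry.get("spanname"),
                    (entry.get("namest"), entry.get("nameend")))
                namest = namest or entry.get("colname")
                if namest:
                    start, end = colnum[namest], colnum[nameend or namest]
                else:
                    # Next free column that no cell from above covers.
                    start = cursor
                    while blocked.get(start, 0) > 0:
                        start += 1
                    end = start
                morerows = int(entry.get("morerows", "0"))
                if morerows:
                    for col in range(start, end + 1):
                        started[col] = morerows
                positions.append((entry, start, end))
                cursor = end + 1
            # One row is consumed; then add the spans started in this row.
            blocked = {c: r - 1 for c, r in blocked.items() if r > 1}
            blocked.update(started)
    return positions
```

In a canonical format, the result of such a calculation would be stored
explicitly on each cell, so that downstream stylesheets never have to
repeat it.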
Of course, this task would be exceedingly difficult if we had to start
from scratch. I hope that in practice it will be less difficult if we
take the xslTNG stylesheets as a basis and accept the intermediate
format they produce after simplifying the structure as the starting
point for standardizing canonical DocBook.
I would be very interested in the opinion of the members of this list on
this proposal.
Sincerely
Frank Steimke
P.S. This text was translated from German with www.DeepL.com/Translator
(free version).