27 Nov 2014
RE: HTML output for @itemize and @enumerate commands
VERSION: makeinfo 5.2 (built from source on Fedora 20 x86_64)
BUG: I am not certain whether these are bugs or enhancement requests,
so you decide.
Following up on the previous discussion.
The HTML output for both @itemize and @enumerate is rudimentary. Not
only is some of the Texinfo source formatting lost, but the generated
markup also does not take advantage of the available HTML constructs.
These problems are deep, and not easy to address:
@itemize (<ul>)
1.
@itemize correctly takes advantage of the HTML defaults for:
@itemize (no argument)
@itemize @bullet
2.
For arguments other than @bullet, the generated HTML looks like:
<ul class="no-bullet"><li> ... </li> ... </ul>
o
First, this means that so long as the no-bullet class is
defined, the browser will render the list without bullets.
o
Second, the bullet character (whatever was specified in the
source) is embedded inside the line item. _I fully understand
this design decision_, and I probably would have done it the
same way if I were in a hurry.
o
Third, because the bullet character is embedded in the line
item, the second and subsequent lines of the item do not line
up with the text of the first line.
3.
For '@itemize @minus' lists, the info output uses the Unicode minus
sign (U+2212) for the bullet, while the HTML inserts a plain
hyphen-minus '-' (U+002D). I think it would be better to be
consistent and output &minus; (U+2212) in the HTML as well.
4.
For '@itemize @w{}' lists, the 'info' output generates a space
character where the bullet would have been and does not generate an
extra embedded space. This seems to be the correct implementation.
5.
*Enhancement Possibility*
I feel that full support for the HTML bullet types: [disc | circle |
square | none] is both necessary and convenient. We could either
hard-code a style in the converter, OR reference a class definition
for each type. In order to maintain flexibility and to be consistent
with the current converter design, I recommend the class callout.
The actual names for the new classes are up to you, but the
following are the names used in the current version of the CSS
definition file. Parsing logic (most likely first):
*
if ( argument == none || argument == @bullet )
generate: <ul class="disc-bullet"> ... </ul> OR <ul> ... </ul>
(note: the default bullet type is 'disc')
*
else if ( argument == @w{} || (TABLE OF CONTENTS) )
generate: <ul class="no-bullet"> ... </ul>
*
else if ( argument == @textdegree || argument == @BCIRCLE(U+26AC) )
generate: <ul class="circle-bullet"> ... </ul>
*
else
generate: <ul class="square-bullet"> ... </ul>
(note: for bullet characters not supported by HTML,
default to the third type of HTML bullet)
6.
Note that if you decide to hard-code the bullet style, you should use:
<ul style="list-style-type:xxx;"> because the <ul type="xxx">
construct is deprecated in HTML4 and not supported by HTML5.
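To make the @itemize parsing logic in item 5 concrete, here is a minimal sketch in Python. It is not taken from the makeinfo sources: the function name itemize_ul_tag, the hard_code switch, and the reading of @BCIRCLE as the literal character U+26AC are my own assumptions, and the (TABLE OF CONTENTS) case is left out because it does not arrive as an @itemize argument.

```python
# Hypothetical helper, for illustration only. The class names are the ones
# proposed in this message; nothing here reflects the real converter code.

NO_BULLET_ARGS = {"@w{}"}
CIRCLE_ARGS = {"@textdegree", "\u26ac"}  # assumption: @BCIRCLE means U+26AC


def itemize_ul_tag(argument=None, hard_code=False):
    """Return the opening <ul> tag for the argument given to @itemize."""
    arg = (argument or "").strip()
    if arg in ("", "@bullet"):          # most likely case first
        css_class, style = "disc-bullet", "disc"
    elif arg in NO_BULLET_ARGS:
        css_class, style = "no-bullet", "none"
    elif arg in CIRCLE_ARGS:
        css_class, style = "circle-bullet", "circle"
    else:                               # anything HTML has no bullet for
        css_class, style = "square-bullet", "square"
    if hard_code:
        # style attribute rather than the deprecated type attribute
        return '<ul style="list-style-type:%s;">' % style
    return '<ul class="%s">' % css_class


if __name__ == "__main__":
    for a in (None, "@bullet", "@w{}", "@textdegree", "@minus"):
        print(a, "->", itemize_ul_tag(a), "|", itemize_ul_tag(a, hard_code=True))
```

With the class callout, an unstyled page falls back to the browser default (disc) for every list, which is why I prefer it over hard-coding.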
@enumerate (<ol>)
1.
'@enumerate' correctly takes advantage of the HTML defaults for
decimal (1, 2, 3, ...)
@enumerate (no argument)
@enumerate 1
2.
For non-decimal enumerators, the enumerator specified in the source
is lost.
3.
HTML supports several enumeration types, but not all of them have
Texinfo equivalents.
4.
I think it's important to directly support at least the following in
the converter:
*
@enumerate (default <ol> is ok)
*
@enumerate 1 (default <ol> is ok)
*
@enumerate A
class callout: <ol class="enum-upper-alpha">
hard-coded: <ol style="list-style-type:upper-alpha;">
*
@enumerate a
class callout: <ol class="enum-lower-alpha">
hard-coded: <ol style="list-style-type:lower-alpha;">
5.
Additional enumeration types that would be desirable:
*
lower-case Roman numerals,
class callout: <ol class="enum-lower-roman">
hard-coded: <ol style="list-style-type:lower-roman;">
*
upper-case Roman numerals,
class callout: <ol class="enum-upper-roman">
hard-coded: <ol style="list-style-type:upper-roman;">
*
lower-case Greek letters
class callout: <ol class="enum-lower-greek">
hard-coded: <ol style="list-style-type:lower-greek;">
6.
My idea for supporting additional enumeration types in info and HTML
output would look like this:
@enumerate @xxx{n} where 'xxx' is the name of the enumeration type,
and the _optional_ 'n' would specify the starting value.
HTML expects a decimal start for all types, i.e. <ol
style="list-style-type:lower-roman" start="4"> yields: iv.
Here are the types I would recommend:
*
@enumerate @loweralpha{n}
*
@enumerate @upperalpha{n}
*
@enumerate @lowerroman{n}
*
@enumerate @upperroman{n}
*
@enumerate @lowergreek{n}
*
@enumerate @enum_decimal{n} (for completeness)
*
@enumerate @enum_none (this could be handled by @itemize instead)
*
HTML supports additional types: decimal-leading-zero,
lower-latin, upper-latin, armenian, georgian. These may be too
much, but I have no data on how often these are used in the real
world.
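For the info output, the proposed types in item 6 would need the decimal start value 'n' turned into a label. The following Python sketch shows one way that could work. All function names are hypothetical, the keys of LABELS are just the proposed command names without the '@', and continuing past the end of an alphabet with doubled letters (z, aa, ab, ...) is my assumption, not current makeinfo behaviour.

```python
# Illustration only: render the n-th label for each proposed enumeration type.

LATIN = "abcdefghijklmnopqrstuvwxyz"
# alpha..omega, skipping the final sigma (U+03C2): 24 letters
GREEK = "".join(chr(c) for c in range(0x3B1, 0x3CA) if c != 0x3C2)

ROMAN_PAIRS = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
               (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
               (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I")]


def to_roman(n):
    """Upper-case Roman numeral for 1 <= n <= 3999."""
    if not 1 <= n <= 3999:
        raise ValueError("Roman numerals need 1 <= n <= 3999")
    parts = []
    for value, symbol in ROMAN_PAIRS:
        count, n = divmod(n, value)
        parts.append(symbol * count)
    return "".join(parts)


def to_alphabetic(n, alphabet):
    """1 -> first letter, len(alphabet) + 1 -> first letter doubled."""
    if n < 1:
        raise ValueError("labels start at 1")
    letters = []
    while n > 0:
        n, rem = divmod(n - 1, len(alphabet))
        letters.append(alphabet[rem])
    return "".join(reversed(letters))


LABELS = {
    "loweralpha": lambda n: to_alphabetic(n, LATIN),
    "upperalpha": lambda n: to_alphabetic(n, LATIN).upper(),
    "lowerroman": lambda n: to_roman(n).lower(),
    "upperroman": to_roman,
    "lowergreek": lambda n: to_alphabetic(n, GREEK),
    "enum_decimal": str,
}


def info_labels(kind, start=1, count=3):
    """Labels for `count` consecutive items beginning at decimal `start`."""
    return [LABELS[kind](i) for i in range(start, start + count)]


if __name__ == "__main__":
    print(info_labels("lowerroman", start=4))   # begins at iv, as in the example above
    print(info_labels("lowergreek", count=5))
```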
7.
The currently-available Texinfo @enumerate syntax would remain
unchanged, but '@enumerate a' and '@enumerate A' would generate
HTML as above. The rendering of HTML without a style sheet would
therefore be unchanged, because the class names would be undefined.
8.
Enumeration that begins at an arbitrary point in the sequence would
be difficult to encode in the HTML unless you hard-code the style OR
pass in a variable (which I'm not sure is possible). For instance
'@enumerate 7' is allowed in the info output, but how would you pass
the start value through the HTML converter?
9.
Note that if you decide to hard-code the enumeration type, you
should use:
<ol style="list-style-type:xxx;"> OR
<ol style="list-style-type:xxx;" start="n"> (for starting mid-sequence)
*
Even though HTML5 (according to w3.org) brings back the <ol
type="xxx"> syntax, it's better to consistently use the style's
actual name.
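As a rough answer to the question in item 8, here is a Python sketch of the hard-coded variant: it turns the existing '@enumerate 7' or '@enumerate c' arguments into a decimal start attribute. The function name ol_open_tag is hypothetical, and this is only a guess at how the start value could be carried through, not a description of the converter.

```python
import re


def ol_open_tag(argument=None):
    """Hard-coded <ol> tag for a current-syntax @enumerate argument."""
    arg = (argument or "").strip()
    if arg == "":
        return "<ol>"
    if re.fullmatch(r"[0-9]+", arg):
        start, style = int(arg), None                      # @enumerate 7
    elif re.fullmatch(r"[a-z]", arg):
        start, style = ord(arg) - ord("a") + 1, "lower-alpha"
    elif re.fullmatch(r"[A-Z]", arg):
        start, style = ord(arg) - ord("A") + 1, "upper-alpha"
    else:
        raise ValueError("unsupported @enumerate argument: %r" % arg)

    attrs = []
    if style is not None:
        attrs.append('style="list-style-type:%s;"' % style)
    if start != 1:
        # HTML always takes the start value in decimal, whatever the style
        attrs.append('start="%d"' % start)
    return "<ol" + "".join(" " + a for a in attrs) + ">"


if __name__ == "__main__":
    for a in (None, "1", "7", "a", "c", "A"):
        print(a, "->", ol_open_tag(a))
```

The start attribute also works together with a class, so the same idea would apply to the class callout if the style attribute is not wanted.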
10.
Parsing logic (most likely first)
*
if ( argument == (DECIMAL NUMBER) || argument == (NONE) )
info: 1, 2, 3, 4, 5, ... (or start at specified point)
HTML: <ol> ... </ol>
*
else if ( ( argument >= 'a' && argument <= 'z' ) || argument ==
@loweralpha )
info: 'a'-'z' as currently implemented, @loweralpha as if it
were 'a',
or @loweralpha{n} where 'n' is the start point
HTML: <ol class="enum-lower-alpha"> ... </ol>
*
else if ( ( argument >= 'A' && argument <= 'Z' ) || argument == @upperalpha )
info: 'A'-'Z' as currently implemented, @upperalpha as if it
were 'A'
or @upperalpha{n} where 'n' is the start point
HTML: <ol class="enum-upper-alpha"> ... </ol>
*
else if ( argument == @lowerroman )
info: i, ii, iii, iv, v, ...
or @lowerroman{n} where 'n' is the start point
HTML: <ol class="enum-lower-roman"> ... </ol>
*
else if ( argument == @upperroman )
info: I, II, III, IV, V, ...
or @upperroman{n} where 'n' is the start point
HTML: <ol class="enum-upper-roman"> ... </ol>
*
else if ( argument == @lowergreek )
info: α, β, γ, δ, ε, ...
or @lowergreek{n} where 'n' is the start point
HTML: <ol class="enum-lower-greek"> ... </ol>
*
else // (default to decimal)
info: 1, 2, 3, 4, 5, ...
HTML: <ol> ... </ol>
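The @enumerate parsing logic above, class-callout variant, could be sketched in Python roughly as follows. The function name enumerate_ol_tag and the regular expression are my own, and the @loweralpha ... @lowergreek commands are the ones proposed in this message, not existing Texinfo commands. As in the list above, the start value is parsed but not emitted.

```python
import re

# Proposed command name (without '@') -> proposed class name.
PROPOSED_CLASSES = {
    "loweralpha": "enum-lower-alpha",
    "upperalpha": "enum-upper-alpha",
    "lowerroman": "enum-lower-roman",
    "upperroman": "enum-upper-roman",
    "lowergreek": "enum-lower-greek",
}

# Matches "@name" or "@name{n}" with an optional decimal start value.
COMMAND_RE = re.compile(r"@([a-z_]+)(?:\{([0-9]*)\})?")


def enumerate_ol_tag(argument=None):
    """Return the opening <ol> tag (class callout) for an @enumerate argument."""
    arg = (argument or "").strip()
    if arg == "" or re.fullmatch(r"[0-9]+", arg):
        return "<ol>"                                   # decimal or none
    if re.fullmatch(r"[a-z]", arg):
        return '<ol class="enum-lower-alpha">'
    if re.fullmatch(r"[A-Z]", arg):
        return '<ol class="enum-upper-alpha">'
    match = COMMAND_RE.fullmatch(arg)
    if match and match.group(1) in PROPOSED_CLASSES:
        return '<ol class="%s">' % PROPOSED_CLASSES[match.group(1)]
    return "<ol>"                                       # default to decimal


if __name__ == "__main__":
    for a in (None, "7", "c", "A", "@lowerroman{4}", "@lowergreek", "@enum_decimal{2}"):
        print(a, "->", enumerate_ol_tag(a))
```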
All of the above is of course just suggestion, but some of it seems
necessary and/or highly desirable for the future of @itemize and
@enumerate lists.
PS: I will post an updated version of the CSS definition file to my
website this weekend.
Cheers,
Mahlon
--
Software Sam - software and tools for GNU/Linux
Mahlon Smith,
/The Software Samurai/
On the Web: /http://www.SoftwareSam.us/
<http://www.SoftwareSam.us/home.html>/