Re: [Groff] ubuntu, groff and utf-8

Werner LEMBERG Tue, 08 Mar 2005 10:40:28 -0800

> It seems there is a problem with this script.
> If there is an `Amacron' in the data, the script produces `u0100'.
> But glyphs in groff are named in decomposed form,
> glyph name for `Amacron' is `u0041_0304'.
> You can see this from unicode_decomposed hash in afmtodit
> and uniglyph.cpp & glyphuni.cpp .
> Thus your script has to be made a bit longer by inclusion of 
> unicode_decomposed hash ;)


This is not correct.  You can actually name a glyph `\[u0100]' within
a groff input file, and groff itself decomposes it to the glyph name
`u0041_0304'.  In other words, a glyph name `u0100' isn't allowed only
*within a font file*.
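
To illustrate the decomposed naming, here is a minimal Python sketch
that derives a `uXXXX_YYYY' style name from a single character.  It
assumes that Unicode canonical decomposition (NFD) from Python's
unicodedata module gives the same result as groff's own decomposition
table for the characters of interest.  The function name
groff_glyph_name is made up for this example; it is not groff code.

```python
import unicodedata


def groff_glyph_name(ch: str) -> str:
    """Build a groff-style decomposed glyph name for one character.

    Assumption: NFD matches groff's internal decomposition table for
    this character; groff's real tables live in its own sources.
    """
    decomposed = unicodedata.normalize("NFD", ch)
    # Join the code points as upper-case hex, at least four digits each.
    return "u" + "_".join(f"{ord(c):04X}" for c in decomposed)


print(groff_glyph_name("\u0100"))  # Amacron -> u0041_0304
print(groff_glyph_name("\u0416"))  # Cyrillic ZHE has no decomposition -> u0416
```

A script that writes `\[uXXXX]' escapes into a groff input file does
not need this step; it only matters for names inside font files.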

> And maybe it is a good idea to replace (optionally, where
> possible) unicode glyph names with the (approx. two character) groff
> glyph names, the way it is done in input.cpp, using
> unicode_to_glyph_list and following precedents from latin?.tmac.
> The reason is to make the output more portable and human-readable.

I think this is not worth the trouble.  Consider a UTF-8 document
written in Russian.  *All* Russian glyphs would be \[uXXXX], and this
can't be made more human readable.

> PS. Are you sure that mapping in devutf8 fonts (and other places)
> `la' and `ra' to 0x27E8(MATHEMATICAL LEFT ANGLE BRACKET) and 0x27E9
> is a good idea?

For UTF-8, this is the right solution IMHO.  The other choice, U+2329,
is problematic due to its canonical equivalence to the CJK left angle
bracket, U+3008.  It's easy to override this locally.
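
The canonical equivalence can be checked with a short Python sketch
using the standard unicodedata module.  It only shows what Unicode
normalization does to these code points; it says nothing about how
groff or a given terminal handles them.

```python
import unicodedata

LEFT_ANGLE_2329 = "\u2329"   # LEFT-POINTING ANGLE BRACKET
CJK_LEFT_ANGLE = "\u3008"    # LEFT ANGLE BRACKET (CJK)
MATH_LEFT_ANGLE = "\u27E8"   # MATHEMATICAL LEFT ANGLE BRACKET

# U+2329 has a canonical (singleton) decomposition to U+3008, so any
# normalization form silently turns it into the CJK bracket.
nfc_2329 = unicodedata.normalize("NFC", LEFT_ANGLE_2329)
print(f"U+{ord(nfc_2329):04X}")

# U+27E8 has no decomposition and survives normalization unchanged.
nfc_27e8 = unicodedata.normalize("NFC", MATH_LEFT_ANGLE)
print(f"U+{ord(nfc_27e8):04X}")
```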

> I do not think many fonts have those math symbols, while `la' and
> `ra' are often used in roff files in non-math contexts.

Really?  Can you give an example?


    Werner


