On 1/21/24, Oliver Corff via <groff@gnu.org> wrote:
> Now the question which is not language-specific: In how far can groff
> access these font-internal lookup tables? It appears that the "naive"
> approach does not trigger the ligature mechanism in the font, as
> demonstrated by Tom's and Deri's examples.
>
> Is it possible that every \[u0Fxx] is (perhaps invisibly) isolated, akin
> to putting every character in {f}{f}{l} if you want to make sure in TeX
> that no ligature will spring into action?

It's much simpler than that: groff supports only five specific
ligatures: fi, fl, ff, ffi, and ffl. See section 5.19.8 (Ligatures and
Kerning) of the 1.23 version of the info manual. (Curiously, a more
recent revision of this section downplays the significance of this
limitation by citing two specific ligatures that aren't supported and
calling them "archaic.") There's a feature request open
(http://savannah.gnu.org/bugs/?64344) to remove this limitation, but no
one is currently working on it.

The mildly good news is that groff can access any glyph in a font,
whether or not groff recognizes it as a ligature. For instance, the
Linux Libertine font defines a ligature for "Qu". Groff won't invoke it
automatically, but the font description file shows that this glyph is
named u0051_0075, so groff can access it with the escape
\[u0051_0075]. Some glyphs in the font description file may have no
name (indicated by "---" in the first column of the glyph's entry), but
groff can produce even these with its \N escape.

Groff's .char request can make the syntax less clunky (e.g., for the Qu
ligature cited above, you could say ".char \[Qu] \[u0051_0075]"), but
until groff's native ligature handling is expanded beyond its current
five, you'll still want a custom preprocessor (e.g., to change every
"Qu" in your input text to "\[Qu]" for that .char definition to work).

> Yet instead of producing the letter "f", \[u0066] generates an error
> message: "warning: special character '\f' not defined"
>
> Where is my mistake?

This seems to be a groff bug: I reported it in
http://savannah.gnu.org/bugs/?63334 but it's not a high priority,
because groff does not claim to support representing ASCII characters
in \[u00xx] format. Even so, groff isn't parsing this input correctly,
because there should be no way for the sequence "\[u0066]" to translate
to "\f": the entire string "\[u0066]" should either translate to "f" or
be undefined.
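
Returning to the ligature workaround: the custom preprocessor mentioned
above could be sketched roughly as follows. This is only an
illustration, not anything groff ships. The names LIGATURES,
substitute_ligatures, and preprocess are made up for this example, and
it assumes the document defines the glyph with
".char \[Qu] \[u0051_0075]" and uses the default control characters
"." and "'". It does not try to recognize escapes inside text lines, so
a line that already contains "\[Qu]" would be mangled.

```python
# Hypothetical mapping from input letter sequences to the groff escapes
# that select the corresponding glyphs. The name "Qu" assumes a matching
# definition in the document:  .char \[Qu] \[u0051_0075]
LIGATURES = {"Qu": r"\[Qu]"}


def substitute_ligatures(line, ligatures=LIGATURES):
    """Return line with each listed letter sequence replaced by its escape."""
    # Leave control lines (requests and macro calls) untouched. This
    # assumes the default control characters "." and "'".
    if line.startswith((".", "'")):
        return line
    for sequence, escape in ligatures.items():
        line = line.replace(sequence, escape)
    return line


def preprocess(text, ligatures=LIGATURES):
    """Apply substitute_ligatures to every line, keeping line endings."""
    return "".join(
        substitute_ligatures(line, ligatures)
        for line in text.splitlines(keepends=True)
    )
```

To use it as a filter in front of groff, you would pass the whole input
through preprocess() (for example, reading sys.stdin and writing the
result to sys.stdout) and pipe that into groff as usual. Other
ligatures the font provides could be added to the mapping, each with
its own .char definition.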