Follow-up Comment #34, bug #64155 (group groff):

commit 604361951c9ddad621b3e0786ca7a2946e212842
Author: G. Branden Robinson <g.branden.robin...@gmail.com>
Date:   Fri May 3 00:00:59 2024 -0500

    Revert "[troff]: Validate a font family before using it."

    This reverts commit 39ffa368dc6a1de4c11cf3f4f5b8594d3c974173.

    There were a few problems with this approach; possibly the biggest is that styles can be dynamically constructed, for which this validation process was no help, and could be a misleading hindrance.

    Thanks to Dave Kemper, Deri James, and Peter Schaffter for the discussion.

<https://savannah.gnu.org/bugs/?64155> is already reopened.

[comment #29 comment #29:]
> > Well, when you're prepared to discuss it, it would be good to know if/how Dave's original report in comment #0 was invalid
>
> I think I can answer this - it is certainly not invalid, it just has nothing to do with ZD not being a "proper" family (in your eyes).

This isn't a matter of my subjective opinion; the "family" and "style" features of _groff_ date back to 1990.

https://git.savannah.gnu.org/cgit/groff.git/tree/ChangeLog.115?h=1.23.0#n5535

There is a long AT&T _troff_ hangover of assuming that `R`, `I`, and `B` fonts will exist (or, in _groff_, that abstract styles that work analogously to fonts of those names will). This assumption can extend to `S` as well.

$ git blame font/devhtml/DESC.proto
fe93746e79 (Werner LEMBERG 2001-01-17 14:53:37 +0000  1) res 240
fe93746e79 (Werner LEMBERG 2001-01-17 14:53:37 +0000  2) hor 24
fe93746e79 (Werner LEMBERG 2001-01-17 14:53:37 +0000  3) vert 40
fe93746e79 (Werner LEMBERG 2001-01-17 14:53:37 +0000  4) unitwidth 10
6d5bbb4fbb (Werner LEMBERG 2003-03-31 14:31:21 +0000  5) sizes 1-1000 0
2738e8ceb6 (Werner LEMBERG 2002-08-07 15:01:32 +0000  6) fonts 9 R I B BI CR CI CB CBI S
fe93746e79 (Werner LEMBERG 2001-01-17 14:53:37 +0000  7) tcommand
8fe7c747b7 (Werner LEMBERG 2004-10-08 07:08:08 +0000  8) unscaled_charwidths
fe93746e79 (Werner LEMBERG 2001-01-17 14:53:37 +0000  9) postpro post-grohtml
fe93746e79 (Werner LEMBERG 2001-01-17 14:53:37 +0000 10) prepro pre-grohtml
fe93746e79 (Werner LEMBERG 2001-01-17 14:53:37 +0000 11) use_charnames_in_special
fe93746e79 (Werner LEMBERG 2001-01-17 14:53:37 +0000 12) pass_filenames
4d917850b2 (Werner LEMBERG 2006-02-26 22:21:38 +0000 13) unicode

Eventually, [https://git.savannah.gnu.org/cgit/groff.git/commit/src/devices/grohtml/grohtml.1.man?h=1.23.0&id=b128913074d43c11d0f0eec035fcdb316ad7e397 I documented the foregoing.]

Your critique implies that we should not further entrench assumptions regarding the availability of font or style names. Fair enough.

> If you do -fZD, which has no alphabetic glyphs, any start up macros which contain conditional statements of the form:-
>
> .if '\*[.T]'html' \" etc..
>
> Will produce character not found errors, since, as we know, both sides are formatted in separate environments and compared.

Yes. I find this troublesome. It seems to be an idiom to use the output comparison operator (a term I had to coin, for it had no name) for string comparisons even when there is no intention of formatting either of the comparands as text. But that idiom would seem to port poorly to document environments where the basic Latin alphabet doesn't need to be formatted at all.

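To make the contrast concrete, here is a toy model in C++ of the two kinds of comparison. It is only a sketch under loud assumptions: `ToyFont`, `toy_format`, `toy_output_compare`, and `string_equal` are hypothetical names invented for illustration, none of this is taken from the _groff_ sources, and the model's choice to drop unavailable glyphs after complaining about them is a simplification rather than a claim about what the formatter really does.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a mounted font: only the set of glyphs it offers.
struct ToyFont {
    std::set<char> glyphs;
};

// Toy "formatting" step: every character must be found in the font.
// A missing glyph yields a diagnostic and (an assumption of this model)
// contributes nothing to the formatted result.
std::string toy_format(const std::string& text, const ToyFont& font,
                       std::vector<std::string>& diagnostics) {
    std::string formatted;
    for (char c : text) {
        if (font.glyphs.count(c) != 0) {
            formatted += c;
        } else {
            diagnostics.push_back(std::string("character '") + c +
                                  "' not defined");
        }
    }
    assert(formatted.size() <= text.size());  // formatting never adds glyphs
    return formatted;
}

// Toy output comparison: format both comparands, then compare the results.
bool toy_output_compare(const std::string& a, const std::string& b,
                        const ToyFont& font,
                        std::vector<std::string>& diagnostics) {
    const std::string left = toy_format(a, font, diagnostics);
    const std::string right = toy_format(b, font, diagnostics);
    return left == right;
}

// Plain string equality: no font, no glyph lookup, no diagnostics.
bool string_equal(const std::string& a, const std::string& b) {
    return a == b;
}
```

In this model, comparing "ps" with "html" under a font that offers only `*` emits six diagnostics (one per letter) and wrongly reports the comparands as equal, because both format to nothing. `string_equal` reports them as different whatever font is selected.
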
> The give away in Dave's initial report is that the "character undefined" errors spell out the words "ps/pdf/html" at least can be seen.

You paid better attention than I did. I managed to overlook it, staring more closely at the 1.23.0 output, seeing only the character sequence "psaciltnufhm", and deducing nothing much in particular from that. Good sleuthing!

But also there's no way in hell we should be spewing diagnostics on, or failing to correctly perform comparisons within, control structures that are initializing the state of the formatter. These macro files are profoundly uninterested in formatting any output, and they'd be buggy if they did so, since an empty input document should produce no output no matter what (stock) macro packages or startup macro files you load.

Maybe it is time for a proper string equality operator in our conditional expressions.

> I'm sure this "feature" has come up before. Even with Branden's code which stopped -fZD from working did not address the real problem because if you have four fonts with R, I, B, and BI extension but don't contain alphabetic characters, -f works for the "family" but you will see exactly the same errors as Dave reported.

Fair point.

> One way for a proper fix, is to create a copy of TR as TRSKEL, add the "special" parameter, and change the name parameter to TRSKEL, remove all kernpairs, delete all glyph definitions above 127, and finally alter the DESC file by incrementing the number and adding TRSKEL on the end. This will solve the error occurring if you use -f on fonts which don't contain ascii glyphs, such as some CJK fonts. This can all be done in the devpdf directory (I have done it for testing), but a very similar change can be made to devps as well.

Here, I must disagree. The foregoing strikes me as a workaround rather than a proper fix.

> An alternative fix would be to consider including the GNU UnifontMedium font in groff and using it as a special instead of TRSKEL, this has the advantage of solving any undefined glyphs as well, but there are other issues.

I'm not thrilled with this one either. A document of Chinese poetry or a setting of the Rig Veda need never format any Basic Latin glyphs. At the same time, it might want to issue conditional requests predicated on the identity of the output device. That's a reasonable thing to want. Why drag in a font to get that functionality?

When we look at the macro files that produced the diagnostic messages when the `-fZD` option was given, we should ask ourselves (as you did): what is the macro file trying to do that generates all this noise?

A string equality test. Not a comparison of node lists, which are laden with all sorts of other data.[1] Just a string equality test.

Well, let's give the people what they are asking for. Especially since a lot of the time, "the people" is us.

[1] Now that I've started resurrecting and expanding the formatter's facilities for node dumping, I'm getting a better notion of what all that extra data is. I begin to revise upward my expectations of the performance improvement I can expect from a straight-up simple string equality operator, particularly since the operation can be handed off to the C++ standard library and from there, likely to platform-specific, optimized assembly code. It hasn't escaped me that this might also help, say, string comparisons of PDF bookmark tags, since they too are not output to be formatted.

_______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?64155>

_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/