[bug #64155] [troff] specifying -fZD on command line generates warnings

G. Branden Robinson Thu, 02 May 2024 23:02:05 -0700

Follow-up Comment #34, bug #64155 (group groff):


commit 604361951c9ddad621b3e0786ca7a2946e212842
Author: G. Branden Robinson <g.branden.robin...@gmail.com>
Date:   Fri May 3 00:00:59 2024 -0500

    Revert "[troff]: Validate a font family before using it."
    
    This reverts commit 39ffa368dc6a1de4c11cf3f4f5b8594d3c974173.
    
    There were a few problems with this approach; possibly the biggest is
    that styles can be dynamically constructed, for which this validation
    process was no help, and could be a misleading hindrance.
    
    Thanks to Dave Kemper, Deri James, and Peter Schaffter for the
    discussion.
    
    <https://savannah.gnu.org/bugs/?64155> is already reopened.


[comment #29:]
> > Well, when you're prepared to discuss it, it would be good to know if/how
> > Dave's original report in comment #0 was invalid
> 
> I think I can answer this - it is certainly not invalid, it just has nothing
> to do with ZD not being a "proper" family (in your eyes).

This isn't a matter of my subjective opinion; the "family" and "style"
features of _groff_ date back to 1990.

https://git.savannah.gnu.org/cgit/groff.git/tree/ChangeLog.115?h=1.23.0#n5535

There is a long AT&T _troff_ hangover of assuming that `R`, `I`, and `B` fonts
will exist (or, in _groff_, that abstract styles that work analogously to
fonts of those names will).  This assumption can extend to `S` as well. 


$ git blame font/devhtml/DESC.proto
fe93746e79 (Werner LEMBERG 2001-01-17 14:53:37 +0000  1) res 240
fe93746e79 (Werner LEMBERG 2001-01-17 14:53:37 +0000  2) hor 24
fe93746e79 (Werner LEMBERG 2001-01-17 14:53:37 +0000  3) vert 40
fe93746e79 (Werner LEMBERG 2001-01-17 14:53:37 +0000  4) unitwidth 10
6d5bbb4fbb (Werner LEMBERG 2003-03-31 14:31:21 +0000  5) sizes 1-1000 0
2738e8ceb6 (Werner LEMBERG 2002-08-07 15:01:32 +0000  6) fonts 9 R I B BI CR CI CB CBI S
fe93746e79 (Werner LEMBERG 2001-01-17 14:53:37 +0000  7) tcommand
8fe7c747b7 (Werner LEMBERG 2004-10-08 07:08:08 +0000  8) unscaled_charwidths
fe93746e79 (Werner LEMBERG 2001-01-17 14:53:37 +0000  9) postpro post-grohtml
fe93746e79 (Werner LEMBERG 2001-01-17 14:53:37 +0000 10) prepro  pre-grohtml
fe93746e79 (Werner LEMBERG 2001-01-17 14:53:37 +0000 11) use_charnames_in_special
fe93746e79 (Werner LEMBERG 2001-01-17 14:53:37 +0000 12) pass_filenames
4d917850b2 (Werner LEMBERG 2006-02-26 22:21:38 +0000 13) unicode


Eventually, I documented the foregoing:

https://git.savannah.gnu.org/cgit/groff.git/commit/src/devices/grohtml/grohtml.1.man?h=1.23.0&id=b128913074d43c11d0f0eec035fcdb316ad7e397

Your critique implies that we should not further entrench assumptions
regarding the availability of font or style names.  Fair enough.

> If you do -fZD, which has no alphabetic glyphs, any start up macros which
> contain conditional statements of the form:-
> 
> .if '\*[.T]'html' \" etc..
> 
> Will produce character not found errors, since, as we know, both sides are
> formatted in separate environments and compared.

Yes.  I find this troublesome.

It seems to be an idiom to use the output comparison operator (a term I had to
coin, for it had no name) for string comparisons even when there is no
intention of formatting either of the comparands as text.

But that idiom would seem to port poorly to document environments where the
basic Latin alphabet doesn't need to be formatted at all.
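A toy model may make the problem concrete.  The C++ sketch below is not
_groff_ code; `ToyFont`, `format_comparand`, and `toy_output_compare` are
hypothetical names invented for illustration, and the decision to drop a
missing glyph silently from the node list is an assumption of the model, not
a statement about what the formatter really does.  It shows only the shape of
the trouble: "formatting" each comparand against a font with no Basic Latin
glyphs yields one diagnostic per letter, and the comparison itself stops
being trustworthy.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Toy stand-in for a mounted font: just the set of characters it has
// glyphs for.  (Hypothetical type; a real font description is far richer.)
struct ToyFont {
    std::set<char> glyphs;
};

// Toy "formatting": look up each input character in the font.  A hit
// becomes a node; a miss records a warning and produces no node.
// Dropping the missing glyph is an assumption of this model.
std::vector<char> format_comparand(const std::string &text,
                                   const ToyFont &font,
                                   std::vector<std::string> &warnings)
{
    std::vector<char> nodes;
    for (char c : text) {
        if (font.glyphs.count(c) != 0)
            nodes.push_back(c);
        else
            warnings.push_back(std::string("character '") + c
                               + "' not defined");
    }
    return nodes;
}

// Toy output comparison: format the left side, then the right side,
// then compare the resulting node lists.
bool toy_output_compare(const std::string &left, const std::string &right,
                        const ToyFont &font,
                        std::vector<std::string> &warnings)
{
    const std::vector<char> l = format_comparand(left, font, warnings);
    const std::vector<char> r = format_comparand(right, font, warnings);
    return l == r;
}

// Usage: a font with no Latin letters, compared against a device name.
void example_usage()
{
    ToyFont dingbats{{'*', '#'}};       // no alphabetic glyphs at all
    std::vector<std::string> warnings;
    bool equal = toy_output_compare("ps", "html", dingbats, warnings);
    assert(warnings.size() == 6);       // one per letter of "ps" and "html"
    assert(equal);                      // both node lists ended up empty
}
```

In this model the six warnings spell out the letters of the two comparands,
and "ps" compares equal to "html" because nothing survived formatting on
either side.  The second effect is an artifact of the toy's assumptions, but
it illustrates why a test meant as string equality should not depend on the
current font.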

> The give away in Dave's initial report is that the "character undefined"
> errors spell out the words "ps/pdf/html" at least can be seen.

You paid better attention than I did.  I managed to overlook it, staring more
closely at the 1.23.0 output, seeing only the character sequence
"psaciltnufhm", and deducing nothing much in particular from that.

Good sleuthing!

But also there's no way in hell we should be spewing diagnostics on, or
failing to correctly perform comparisons within, control structures that are
initializing the state of the formatter.  These macro files are profoundly
uninterested in formatting any output, and they'd be buggy if they did so,
since an empty input document should produce no output no matter what (stock)
macro packages or startup macro files you load.

Maybe it is time for a proper string equality operator in our conditional
expressions.

> I'm sure this "feature" has come up before. Even with Branden's code which
> stopped -fZD from working did not address the real problem because if you
> have four fonts with R, I, B, and BI extension but don't contain alphabetic
> characters, -f works for the "family" but you will see exactly the same
> errors as Dave reported.

Fair point.
 
> One way for a proper fix, is to create a copy of TR as TRSKEL, add the
> "special" parameter, and change the name parameter to TRSKEL, remove all
> kernpairs, delete all glyph definitions above 127, and finally alter the
> DESC file by incrementing the number and adding TRSKEL on the end. This will
> solve the error occurring if you use -f on fonts which don't contain ascii
> glyphs, such as some CJK fonts. This can all be done in the devpdf directory
> (I have done it for testing), but a very similar change can be made to devps
> as well.

Here, I must disagree.  The foregoing strikes me as a workaround rather than a
proper fix.

> An alternative fix would be to consider including the GNU UnifontMedium
> font in groff and using it as a special instead of TRSKEL, this has the
> advantage of solving any undefined glyphs as well, but there are other
> issues.

I'm not thrilled with this one either.  A document of Chinese poetry or a
setting of the Rig Veda need never format any Basic Latin glyphs.  At the same
time, it might want to issue conditional requests predicated on the identity
of the output device.  That's a reasonable thing to want.  Why drag in a font
to get that functionality?

When we look at the macro files that produced the diagnostic messages when the
`-fZD` option was given, we should ask ourselves (as you did): what is the
macro file trying to do that generates all this noise?

A string equality test.  Not a comparison of node lists, which are laden with
all sorts of other data.[1]  Just a string equality test.

Well, let's give the people what they are asking for.  Especially since a lot
of the time, "the people" is us.

[1] Now that I've started resurrecting and expanding the formatter's
facilities for node dumping, I'm getting a better notion of what all that
extra data is.  I begin to revise upward my expectations of the performance
improvement I can expect from a straight-up simple string equality operator,
particularly since the operation can be handed off to the C++ standard library
and from there, likely to platform-specific, optimized assembly code.  It
hasn't escaped me that this might also help, say, string comparisons of PDF
bookmark tags, since they too are not output to be formatted.
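As a rough illustration of how little machinery a plain string equality test
needs, here is a minimal C++ sketch.  Nothing in it comes from the _groff_
sources: `split_comparands` and `string_condition` are hypothetical names,
and reusing the delimiter-based `'left'right'` shape of the output comparison
operator is purely an assumption, since no syntax for such an operator has
been settled here.  The point is only that the comparands are compared as
character sequences, with no font, environment, or node list involved.

```cpp
#include <cassert>
#include <string>

// Result of splitting a condition of the form <d>left<d>right<d>, where
// <d> is whatever character opens the expression.  (Hypothetical type.)
struct Comparands {
    bool ok;            // false if the expression is malformed
    std::string left;
    std::string right;
};

// Split the expression on its own first character, as the output
// comparison operator's delimiters work.  No escapes are interpreted;
// this sketch assumes interpolation has already happened.
Comparands split_comparands(const std::string &expr)
{
    Comparands result{false, "", ""};
    if (expr.empty())
        return result;
    const char delim = expr[0];
    const std::string::size_type mid = expr.find(delim, 1);
    if (mid == std::string::npos)
        return result;
    const std::string::size_type end = expr.find(delim, mid + 1);
    if (end == std::string::npos)
        return result;
    result.left = expr.substr(1, mid - 1);
    result.right = expr.substr(mid + 1, end - mid - 1);
    result.ok = true;
    return result;
}

// Plain string equality: std::string's operator== does all the work.
// A malformed expression is treated as false in this sketch.
bool string_condition(const std::string &expr)
{
    const Comparands c = split_comparands(expr);
    return c.ok && c.left == c.right;
}

// Usage: the device-name test from the startup macro files.
void example_usage()
{
    assert(string_condition("'html'html'"));
    assert(!string_condition("'ps'html'"));
}
```

Because no glyph lookup takes place, the result cannot depend on which font
family was selected with `-f`, and there is nothing to diagnose.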


    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?64155>

_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/
