Follow-up Comment #8, bug #68133 (group groff):

[comment #7 comment #7:]
> [comment #6 comment #6:]
>> Have you seen bug #68132?
>>
>> That's an example of a translation that "historically worked"--except in
>> groff for the last 37 years.
>
> Yes, that would be another category.
>
>> Semantic clarity, consistency, and corresponding reductions in the
>> number of corner cases that have to be documented, tested, and
>> maintained...while we trade most of the latter advantages away
>> again for things we support only in compatibility mode, such
>> features are at least firewalled off.
>
> You know the code and test suite better than I do, of course. But on the
> surface, "we have to maintain/test additional corner cases A, B, and C"
> sounds like less work than "we have to maintain/test that corner cases A, B,
> and C have result D under condition E and result F under condition G."

But how do I know, today, how many corner cases I'm going to need? And how are people even going to know that corner cases are being exercised when they're formatting a few hundred pages of legacy Unix documents and nothing _dramatically_ bad happens (like a formatter crash or major document truncation), without carefully going over the formatted output in one hand, the *roff source in the other, and substantial *roff expertise in the brain between?

If I make GNU _troff_ not accept as a `tr` operand anything that doesn't have "charinfo" (see below), then I can have it spew a diagnostic when an oddball is encountered. The corner case can then be studied, and if it's a case of historical substance, we can _then_ write support for it in compatibility mode and unit-test it.

That way _thought_ is given to the corner cases, instead of the absence of thought.

"Are we translating a character to a font selection escape sequence? Does the formatter understand what we're trying to do? Does it comply? Does the result work? Who knows?"

>> Not just under the hood. Our Texinfo manual is at pains to
>> distinguish characters from glyphs
>
> Perhaps my terminology was sloppy. Characters are input and glyphs are
> output, which is not what I was trying to get at.

No, we inescapably hit a trichotomy.

Yes, GNU _troff_ reads 8-bit bytes, and every possible one of the 256 values is accounted for and handled in one way or another.

And yes, GNU _troff_ spits out "glyphs" that an output device looks up in its font descriptions, or accesses by index. [https://man7.org/linux/man-pages/man5/groff_out.5.html The "c", "C", and "N" trout commands handle these.]

But in between--internally to the formatter--ahh, that's a different story. Every unique _character_, whether ordinary, special, or indexed, has an associated [https://cgit.git.savannah.gnu.org/cgit/groff.git/tree/src/roff/troff/charinfo.h?h=1.24.1 charinfo object].

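As a rough illustration of the rule proposed above (refuse any `tr` operand that has no charinfo, and say so), here is a toy model in Python. It is not GNU _troff_'s implementation: `CharInfo`, `parse_operand`, and `check_tr_pair` are hypothetical names, the diagnostic wording is invented, and the operand grammar is deliberately reduced to a handful of forms.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class CharInfo:
    """Toy stand-in for a formatter-internal character record (hypothetical)."""
    name: str
    kind: str  # "ordinary", "special", or "indexed"


def parse_operand(token: str) -> Optional[CharInfo]:
    """Classify one `tr` operand; return None when it names no character."""
    if len(token) == 1 and token != "\\":
        return CharInfo(token, "ordinary")           # e.g. o
    if token.startswith("\\(") and len(token) == 4:
        return CharInfo(token[2:], "special")        # e.g. \(em
    if token.startswith("\\[") and token.endswith("]") and len(token) > 3:
        return CharInfo(token[2:-1], "special")      # e.g. \[em]
    if token.startswith("\\N'") and token.endswith("'") and token[3:-1].isdigit():
        return CharInfo(token[3:-1], "indexed")      # e.g. \N'65'
    # Anything else (\fB, \s+2, \x@2p@, ...) is an escape sequence that
    # does not represent a character in this simplified model.
    return None


def check_tr_pair(source: str, target: str) -> List[str]:
    """Return invented diagnostics for operands that lack a CharInfo."""
    diagnostics = []
    for role, token in (("source", source), ("target", target)):
        if parse_operand(token) is None:
            diagnostics.append(
                f"warning: tr {role} operand '{token}' is not a character"
            )
    return diagnostics


if __name__ == "__main__":
    for pair in (("o", "\\[em]"), ("o", "\\fB"), ("o", "\\x@2p@")):
        print(pair, check_tr_pair(*pair))
```

The point of the sketch is only that a single yes/no question ("does this operand have a character record?") decides whether a translation is accepted, so an oddball operand produces a message instead of silently producing a space, nothing, or a backslash.
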
> My intended point was it's easier for a user to conceptualize "I can
> translate a character to another character represented by an escape sequence"
> than "I can translate a character to some but not to other characters
> represented by an escape sequence."

I'm sure it is, but no _troff_ has ever been **that** simple.

$ printf '.tr o\\fB\nHello, world!\n' | dwb nroff | cat -s
Hell , w rld!

$ printf '.tr o\\fB\nHello, world!\n' | solaris10 nroff | cat -s
Hell , w rld!

$ printf '.tr o\\s+2\nHello, world!\n' | solaris10 nroff | cat -s
Hell , w rld!

$ printf '.tr o\\x@2p@\nHello, world!\n' | solaris10 nroff | cat -s
Hell, wrld!

$ printf '.tr o\\x@2p@\nHello, world!\n' | dwb nroff | cat -s
Hell, wrld!

$ printf '.tr o\\x@2p@\nHello, world!\n' | 9 nroff | cat -s
Hell, wrld!

$ printf '.tr o\\x@2p@\nHello, world!\n' | heirloom nroff | cat -s
Hell\, w\rld!

When you translate a character to an escape sequence that does not represent a character, you might get an unbreakable space, you might get nothing, or you might get a backslash. And things "that are ignored in nroff mode" might not be so ignored after all.

>> Why do you suppose that a user of a typesetting system is likely to be
>> able to achieve a high level of skill in that system without a clear
>> idea of what its definition of a "character" is?
>
> I expect Dennis Ritchie was a pretty skilled troffer and still used a syntax
> you're advocating to disallow.

I expect he was. But I'll bet that Kernighan was even more skilled, especially by the time he'd finished refactoring Ossanna troff into device-independent troff. (See CSTR #97.)

I think some _troff_ features arose not by deliberate design but as deliberate hacks, discovered as users explored the input space. Those hacks that Ritchie employed in *roff documents I expect to support, if doing so is not more than a slight headache (modulated by the impact on rendered output). But only in compatibility mode.
There's no need for the coming generations of *roff users (insert laughter or tears here) to become conversant with old hacks where GNU _troff_, or even AT&T _troff_ itself, offers a workable alternative.

>> If I get this stuff hammered out the way I envision, that use case will
>> be served as follows.
> ...
>> Why is the foregoing insufficient, especially for something that has
>> never worked before?
>
> (for groff values of "never")

Fully conceded.

> The foregoing is sufficient, but seems a little extra hoop-jumpy than
> necessary. Granted, jumping through hoops is a price of admission to many
> aspects of roff, but maybe that price can be reduced sometimes.

I hope I've thrown some light on the conceptual clarity and consistency we can enjoy if we stick to the principle that _features_ are more important for composability than _syntax_.

[1] See footnote 4 of <https://lists.gnu.org/archive/html/groff/2026-03/msg00039.html>.

    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?68133>

_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/
