Follow-up Comment #5, bug #67372 (group groff):

At 2025-07-28T14:04:24-0400, G. Branden Robinson wrote:
> commit ad4fa80a3f2d66ed7e9d4342fbc58d2be07984ea
> Author: G. Branden Robinson <[email protected]>
> Date: Sat Oct 1 04:18:47 2022 -0500
>
> [troff]: Refactor to parallelize logic.
>
> * src/roff/troff/input.cpp: Refactor to parallelize logic in similar
> routines; namely, those handling escape sequences that accept newlines
> as argument delimiters.
There were six escape sequences at issue: \A, \B, \b, \o, \w, and \X.
Of these, only three are portable to all of the _troff_s available to me:
\b, \o, and \w. (\X was a Kernighan _troff_ innovation, not available
in Seventh Edition Unix _troff_.)
So I thought I'd see what happens with an input using this interpolated
delimiter trick with V7, DWB 3.3, and Heirloom Doctools troffs, and
_groff_ 1.22.{3,4}, 1.23.0, and Git HEAD.
Exhibit:
$ cat ATTIC/escape-delimiter-fun.roff
.sp 1i \" space to accommodate bracket-building
.ds D abc
\b\*D+|+\*D
.br
\o\*D+|+\*D
.br
\w\*D+|+\*D
V7 Unix (using a shorter filename because it was 1979 and 14 characters
should be enough for anyone--you can be sure Ken Thompson wasn't going
to ever type anything that long):
$ pdp11 ./v7.simh
PDP-11 simulator V3.8-1
Disabling XQ
@boot
New Boot, known devices are hp ht rk rl rp tm vt
: rl(0,0)rl2unix
mem = 177856
# Restricted rights: Use, duplication, or disclosure
is subject to restrictions stated in your contract with
Western Electric Company, Inc.
Thu Sep 22 23:35:05 EDT 1988
login: dmr
$ cat > escfun.roff
.sp 1i \" space to accommodate bracket-building
.ds D abc
\b\*D+|+\*D
.br
\o\*D+|+\*D
.br
\w\*D+|+\*D
$ nroff escfun.roff | sed '/^$/d'
b
c
+
|
+bc
+bc
120bc
$ sync
$ sync
$ sync
$
login:
Simulation stopped, PC: 002306 (MOV (SP)+,177776)
sim> quit
Goodbye
Next up, DWB 3.3 _troff_, a cousin of Kernighan _troff_:
$ DWBHOME=. ./bin/nroff escape-delimiter-fun.roff | cat -s
b
c
+
|
+bc
+bc
120bc
Next, Heirloom Doctools _troff_, a descendant of DWB 2.0 (whence also
came Solaris _troff_, to the best of my knowledge):
$ ./bin/nroff escape-delimiter-fun.roff | cat -s
b
c
+
|
+bc
+bc
120bc
_groff_ 1.22.3:
$ ~/groff-1.22.3/bin/nroff -ww ATTIC/escape-delimiter-fun.roff | cat -s
ATTIC/escape-delimiter-fun.roff:5: warning: missing closing delimiter
b
c
+bc
c
120bc
_groff_ 1.22.4:
$ ~/groff-1.22.4/bin/nroff -ww ATTIC/escape-delimiter-fun.roff | cat -s
troff: ATTIC/escape-delimiter-fun.roff:5: warning: missing closing delimiter
b
c
+bc
c
120bc
_groff_ 1.23.0:
$ ~/groff-1.23.0/bin/nroff -ww ATTIC/escape-delimiter-fun.roff | cat -s
troff:ATTIC/escape-delimiter-fun.roff:7: warning: missing closing delimiter in
width computation escape sequence (got a newline)
b
c
+
|
+
c
192
c
_groff_ Git HEAD:
$ ~/groff-HEAD/bin/nroff -ww ATTIC/escape-delimiter-fun.roff | cat -s
troff:ATTIC/escape-delimiter-fun.roff:3: warning: missing closing delimiter in
bracket-building escape sequence; expected character 'a', got a newline
troff:ATTIC/escape-delimiter-fun.roff:5: warning: missing closing delimiter in
overstrike escape sequence; expected character 'a', got a newline
troff:ATTIC/escape-delimiter-fun.roff:7: warning: missing closing delimiter in
width computation escape sequence; expected character 'a', got a newline
b
c
+
|
+.br c.br 192
a
b
c
I concede that I've destabilized _groff_ for this input.
I also observe that _groff_ has apparently never been consistent with
AT&T _troff_ in this respect: contrast _groff_ 1.22.{3,4} output with
V7, DWB 3.3, and Heirloom.
But _groff_ has another trick up its sleeve. Recall comment #2:
Normally, GNU 'troff' keeps track of delimited arguments'
interpolation depth. In compatibility mode, it does not.
.ds xx '
\w'abc\*(xxdef'
=> 168 (normal mode on a terminal device)
=> 72def' (compatibility mode on a terminal device)
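A toy model may make the rule in that quotation concrete. The following is a minimal Python sketch, not _groff_'s implementation: the names `Char`, `expand`, and `width_escape` are invented for illustration, only the two-character `\*(xx` string form is handled, and the 24 units per character is an assumption about terminal devices chosen to agree with the figures quoted above. Each character carries the interpolation depth it was read at. In normal mode a closing delimiter counts only at the opening delimiter's depth; in compatibility mode depth is ignored.

```python
from dataclasses import dataclass

# Assumption: every character is 24 basic units wide on a terminal device.
UNITS_PER_CHAR = 24


@dataclass(frozen=True)
class Char:
    ch: str
    depth: int  # 0 = read from the file, 1+ = produced by a string interpolation


def expand(text, strings, depth=0):
    """Expand \\*(xx interpolations, tagging each character with its depth."""
    out = []
    i = 0
    while i < len(text):
        if text.startswith("\\*(", i):
            name = text[i + 3:i + 5]
            out.extend(expand(strings[name], strings, depth + 1))
            i += 5
        else:
            out.append(Char(text[i], depth))
            i += 1
    return out


def width_escape(text, strings, compat):
    """Evaluate input starting with \\w; return the width plus leftover text."""
    assert text.startswith("\\w")
    chars = expand(text[2:], strings)
    opener = chars[0]
    for pos in range(1, len(chars)):
        c = chars[pos]
        same_char = c.ch == opener.ch
        same_depth = c.depth == opener.depth
        # Normal mode demands matching depth; compatibility mode does not.
        if same_char and (compat or same_depth):
            arg = chars[1:pos]
            rest = "".join(x.ch for x in chars[pos + 1:])
            return str(len(arg) * UNITS_PER_CHAR) + rest
    raise ValueError("missing closing delimiter")


strings = {"xx": "'"}
line = "\\w'abc\\*(xxdef'"
print(width_escape(line, strings, compat=False))  # 168
print(width_escape(line, strings, compat=True))   # 72def'
```

In normal mode the interpolated apostrophe sits at depth 1, so it is measured as part of the seven-character argument; in compatibility mode it ends the argument after three characters and the rest is ordinary text.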
What happens if we turn on _groff_'s AT&T compatibility mode?
$ ~/groff-HEAD/bin/nroff -Cww ATTIC/escape-delimiter-fun.roff | cat -s
b
c
+bc
+bc
120bc
We get _almost_ AT&T-compatible behavior. The pipe in the
bracket-building escape sequence has gone missing. _groff_ appears to
have been mislaying it for many years, compatibility mode or no.
But let's check that. Let's travel back in time and see if/how I've
perturbed the treatment of this input in compatibility mode.
$ ~/groff-1.23.0/bin/nroff -Cww ATTIC/escape-delimiter-fun.roff | cat -s
/home/branden/groff-1.23.0/bin/nroff: usage error: invalid option '-Cww'
usage: /home/branden/groff-1.23.0/bin/nroff [-bcCEhikpRStUVz] [-d ctext] [-d
string=text] [-K fallback-encoding] [-m macro-package] [-M macro-directory]
[-n page-number] [-o page-list] [-P postprocessor-argument] [-r
cnumeric-expression] [-r register=numeric-expression] [-T output-device] [-w
warning-category] [-W warning-category] [file ...]
usage: /home/branden/groff-1.23.0/bin/nroff {-v | --version}
usage: /home/branden/groff-1.23.0/bin/nroff --help
Oh, bother.
$ ~/groff-1.23.0/bin/groff -Tascii -Cww ATTIC/escape-delimiter-fun.roff | cat -s
b
c
+bc
+bc
120bc
$ ~/groff-1.22.4/bin/groff -Tascii -Cww ATTIC/escape-delimiter-fun.roff | cat -s
b
c
+bc
+bc
120bc
$ ~/groff-1.22.3/bin/groff -Tascii -Cww ATTIC/escape-delimiter-fun.roff | cat -s
b
c
+bc
+bc
120bc
So compatibility mode has been stable, and close to AT&T behavior, but
not an exact match.
I think three tasks arise from this exploration.
1. I should document that if one wants AT&T-compatible treatment of
interpolated delimiters, one should use compatibility mode. The
language of our documentation should be broadened: _groff_'s concept
of "input level" (or "interpolation depth" as I prefer to term it)
applies not just to the _arguments_ of a delimited escape sequence,
_but to the delimiters themselves_.
2. I should see if I can make _groff_ perfectly AT&T-compatible with
this input, in compatibility mode.
3. I should add test cases that force me to nail down the formatter's
behavior when using string interpolations to construct escape
sequences.
How does that sound?
_______________________________________________________
Reply to this item at:
<https://savannah.gnu.org/bugs/?67372>
