Re: gropdf-ng merge status (was: PDF outline not capturing Cyrillic text)

G. Branden Robinson Tue, 06 Feb 2024 17:31:50 -0800

Hi Deri,

At 2024-02-06T21:35:05+0000, Deri wrote:
> Many thanks for your thoughts on my code. I shall reply in general
> terms since your grasp of some of the issues is rather hazy, as you
> admit.

I generally don't feel I grasp code of nontrivial complexity until I've
documented it and written tests for it, and often not even then.  I'm a
bear of very little brain!

> Huge AGL lookup table
> 
> My least favourite solution, but you made me do it! The most elegant
> and efficient solution was to make a one line amendment to afmtodit
> which added an extra column to the groff font files which would have
> the UTF-16 code for that glyph. This would only affect devpdf and
> devps and I checked the library code groff uses to read its font files
> was not affected by an extra column. I also checked the buffer used
> would not cause an overflow. Despite this, you didn't like this
> solution, without giving a cogent reason,  but suggesting a lookup
> table!

I remember expressing unease with the "new column" approach, not
rejection.  The main reason is the documented format of the lines in
question.

groff_font(5):
     The directive charset starts the character set subsection.  (On
     typesetters, this directive is misnamed since it starts a list of
     glyphs, not characters.)  It precedes a series of glyph
     descriptions, one per line.  Each such glyph description comprises
     a set of fields separated by spaces or tabs and organized as
     follows.

            name metrics type code [entity‐name] [-- comment]

[...]

     The entity‐name field defines an identifier for the glyph that the
     postprocessor uses to print the troff glyph name.  This field is
     optional; it was introduced so that the grohtml output driver could
     encode its character set.  For example, the glyph \[Po] is
     represented by “&pound;” in HTML 4.0.  For efficiency, these data
     are now compiled directly into grohtml.  grops uses the field to
     build sub‐encoding arrays for PostScript fonts containing more than
     256 glyphs.  Anything on the line after the entity‐name field or
     “--” is ignored.

The presence of 2 adjacent optional fields seems to me fairly close to
making the glyph descriptions formally undecidable.  In practice,
they're not, until and unless someone decides to name their "entity"
"--"...  (We don't actually tell anyone they're not allowed to do that.)

As I understand it, this feature is largely a consequence of the
implementation of grohtml 20-25 years ago, where an "entity" in HTML 4
and XHTML 1 was a well-defined thing.  We might do well to tighten the
semantics and format of this optional fifth field a bit more.

More esteemed *roffers than I have stumbled over our documentation's
unfortunate tendency to sometimes toss the term "entity" around,
unmoored from any formal definition in the *roff language.

https://lists.gnu.org/archive/html/groff/2023-04/msg00002.html

While I'm complaining about hazy terminology that exacerbates my hazy
understanding of things, I'll observe that I don't understand what the
verb "to sub-encode" means.  I suspect there are better words to express
what this is trying to get across.  If I understood what grops was
actually doing here, I'd try to find those words.

> As to whether I should embed the table, or read it in, I deferred to
> the more efficient method use by afmtodit, embed it as part of make. I
> still would prefer the extra column solution, then there is no lookup
> at all.

I don't object to the idea, but I think our design decisions should be
documented, and it frequently seems to fall to me to undertake the
documentation.  That means I have to ask a lot of questions, which
programmers sometimes interpret as critique.  (And, to be fair,
sometimes I actually _do_ have critiques of an implementation.)

> use_charnames_in_special
> 
> Probably unnecessary once you complete the work to return .device to
> its 1.23.0 condition, as you have stated.

That seems like a fair prediction.  Almost all of the logic _on the
formatter side_ that employs this parameter seems to be in one function,
`encode_char()`.

https://git.savannah.gnu.org/cgit/groff.git/tree/src/roff/troff/input.cpp?h=1.23.0#n5427

(Last month, I renamed that to `encode_char_for_troff_output()` and I'm
thinking it can be further improved, more like
`encode_char_for_device_control()`...

...there's just one more thing.

There's one other occurrence, in a constructor.

https://git.savannah.gnu.org/cgit/groff.git/tree/src/roff/troff/input.cpp?h=1.23.0#n293

I look forward to someday understanding what that's doing there.)

> pdfmomclean
> 
> Not quite sure how your work on #64484 will affect this, we will have
> to wait and see.

Fair enough.

> Stringhex
> 
> Clearly you are still misunderstanding the issue, because there are
> some incorrect statements.

Okay.

> In any lookup there is a key/value pair.

I'm with ya so far.

> If dealing with a document written in Japanese, both the key and the
> value will arrive as unicode. No problem for the value, but the key
> will be invalid if used as part of a register name.

Yes.  I was trying to say the same thing in the mail to which you're
replying.

> There are two obvious solutions. One is to encode the key into
> something, easily decoded, which is acceptable to be used as part of a
> register name, or do a loop traversal over two arrays, one holding the
> keys and one the values.

Yes.

> I'm pretty sure my 9yr old grandson would come up with a looping
> solution.

In my opinion, your grandson has good instincts if he avoids
implementing things like "an:cln", "sanitize.tmac", or my own
"an*abbreviate-page-topic".

> I really don't understand your opposition to the encoding solution,

If I'm debugging using troff and dump the string/macro list, then I
envision it being disheartening to see something like this.

.pm
PDFLB   9
pdfswitchtopage 32
pdfnote 380
pdf:note-T      57
pdfpause        29
PDFBOOKMARK.VIEW        21
pdf:look(0073007500700065007200630061006c006900660072006100670069006c0069007300740069006300650078007000690061006c00690064006f00630069006f007500732602)
 41
pdfmark 31
pdftransition   58
pdfbackground   40
pdfpagenumbering        37
pdfbookmark     1677

Whereas with the 9 year old youth's solution, I get something more like
this.

.pnr
pdf:bm.nl       0
PDFOUTLINE.FOLDLEVEL    10000
PDFNOTE.WIDTH   252000
PDFNOTE.HEIGHT  144000
PDFHREF.VIEW.LEADING    5000
PDFHREF.LEADING 2000
pdf:look.id!1   1
.pm
PDFLB   9
pdfswitchtopage 32
pdfnote 380
pdf:note-T      57
pdfpause        29
PDFBOOKMARK.VIEW        21
pdf:look.content!1      41
pdfmark 31
pdftransition   58
pdfbackground   40
pdfpagenumbering        37
pdfbookmark     1677
.tm \*[pdf:look.content!1]
supercalifragilisticexpialidocious\[u2602]

I think there are advantages to not having to read, copy and paste,
or--God forbid, type--something like 
pdf:look(0073007500700065007200630061006c006900660072006100670069006c0069007300740069006300650078007000690061006c00690064006f00630069006f007500732602).

Not to mention having to come up with an answer when someone asks me how
to decode that at a shell prompt.

> Ok, I accept you would have done it the childs way with the
> performance hit, but I prefer the more elegant encoding solution.

I agree that O(1) is generally better than O(n).  But there are
important things to measure besides the asymptotic runtime behavior of
this one aspect of the system.  In my experience, systems that are
easier to troubleshoot are more pleasant to use than ones that aren't.

Further, by not measuring the performance impact of encoding and
decoding all those character-string to hexadecimal-character-string
conversions, we're not telling the full story.

> Uniqueness of keys is an issue for either strategy. In mom, a user
> supplied key name is only possible by using the NAMED parameter, and
> if a user uses the same name twice in the document nothing nasty will
> happen, the overview panel will be correct, since each of those is
> tagged with a safe generated name, and if they have used the same name
> for two different places in the document, when they are checking all
> the intra-document links they will find one of them will go to the
> wrong place. Of course this could be improved by warning when the same
> name is provided for a different destination. The man/mdoc macros
> currently have no named destinations, all generated, but this will
> change if the mdoc section referencing is implemented.

Yes, though that's down the road a little way.  First I just want to get
hyperlinks working for `Mt`, `Lk`, and `Xr`.

But it's an obvious concern for man pages.  I'll _have_ to solve the
problem because man page section heading titles are so rigidly
prescribed.  Format multiple man(7) (and mdoc(7)) documents and you'll
have guaranteed collisions.  Prefixing the anchors/bookmarks with the
page identifier seems seems tractable--
  <a name="groff(1)/Options">
--for instance.  Even that's not perfect--with a large enough
agglomeration of pages you could have more than one groff(1) page--but
it might be close enough.  For the pathological cases we can support a
user-specified prefix string, maybe.

> You mention a possible issue if a diversion is passed to stringhex,
> since this is 95% your own code for stringup/down, I'm pretty sure
> that whatever you do to solve the issue in your own code can be
> equally applied to stringhex, so this not an argument you can use to
> prevent its inclusion.

I wasn't making an argument to "prevent its inclusion".  I was and am
studying and making observations, and if those point out oversights in
my own work as well as others', that's a net benefit because it _gets
the information out into the world_.

The point of collaborative software development is to get synergetic
benefits from multiple brains working on the same problem.  I hope you
have found my little brain to have contributed something from time to
time.

> As regards your point 2, this is a non-issue, in 1.23.0 it works fine
> with .device. You ask what does:-
> 
>     \X'pdf: bizzarecmd \[u1234]'
> 
> Mean? Well, assuming you are writing in the ethiopic language and
> wrote:-
> 
>     \X'pdf: bizzarecmd ሴ'
> 
> And gropdf would do a "bizzarecmd" using the CHARACTER given (ETHIOPIC
> SYLLABLE SEE). Which could be setting a window title in the pdf
> viewer, I'm not sure, I have not written a handler for bizzarecmd.

That is the point I am trying to make.  It is totally up to the
postprocessor (or output driver) how to interpret "\[u1234]".

> As you can see not "misleading to a novice" at all, the fact that
> preconv changed it to be a different form and gropdf changed it back
> to a character to use in pdf meta- data is completely transparent to
> the user.

I guess this is a matter of perspective, but the fact that preconv,
troff, and gropdf are all separate programs is significant to me.  In my
opinion, that fact means that we need to document the interfaces between
these components.  The "\[uXXXX]" that a person puts in an input
document can undergo transformation.  (1) preconv might apply a Unicode
normalization form to it; (2) the formatter might honor user
instructions to translate or replace the special character, and, in the
future, it might participate in ligature replacement; (3) the output
device will of course do whatever it does to make a visible grapheme.

> Your work on \X and .device is to put .device back to how it was in
> 1.23.0 and alter \X to be the same, this is what you said would
> happen.

Yup.  That's still my plan.  It's a bit of a slog to get through all of
the corner cases.

In case it is of interest, I'm attaching two pieces of absolutely not
fully baked code from my Git stash and a working copy, respectively, to
illustrate my progress.  Perhaps you would like the opportunity to repay
me in code review coin.  I present my underbelly at its softest!  :-O

Having gotten through all of that, I find myself unable to locate where
you explicitly identified any of my statements about stringhex as
incorrect, which is an expectation you set up.  We agree that one
approach is O(1) and another is O(n).  We agree that the key uniqueness
problem is unsolved by either one.  You think the solution that I lean
toward is childish, and I think the solution you lean toward is
less helpful for people using GNU troff(1) to debug documents or macro
files.  But those are matters of opinion, not correctness of claims.

Please more explicitly flag my mistaken/false statements so that I can
either rebut your characterization, or learn something from you.

> The purpose of my patch was intended to give Robin a robust solution
> to what he wanted to do.

Yes.  That's fine--I simply wanted to take the opportunity to try and
keep things moving forward with the gropdf-ng merge, a long and
difficult process.  And if he's happy, I'm happy--see below regarding my
trust in you to get the problem solved.

I hope you haven't overlooked that I was thrilled with your recent
commits that make the PDF-hyperlinkification of the collected groff man
pages possible.  That was so compelling that I put other things on hold
to get Savannah #61434 done.  <https://savannah.gnu.org/bugs/?61434>

> You wrote in another email:-
> 
> "But tparm(const char *str, long, long, long, long, long, long, long,
> long, long) is one of the worst things I've ever seen in C code.
> 
> As I just got done saying (more or less) to Deri, when you have to
> obfuscate your inputs to cram them into the data structure you're
> using, that's a sign that you're using the wrong data structure."
> 
> I don't appreciate being conflated with "the worst things".

I wasn't attempting to liken anything you've written to the tparm thing.
But even if I were, that would be an assessment of _code_ rather than of
your quality as a person or as a programmer.

I probably sound like a stereotypical political liberal when I claim
that "bad" code comes from bad environments, not bad programmers.  In my
opinion that really is the first order factor.  People are trainable;
programmers improve as they practice.  "Bad" code is often the result of
a poor fit for the tool with the problem, and it's sadly typical that a
programmer does not have the breadth of choice in tools that they would
prefer.  They might not have learned of a well-suited alternative; no
one has mastered everything.  They may be working under a management
mandate to employ certain technology, or operating under constraints
that render "better" tooling unsupportable in the environment in
question.  The pressures of cost minimization can knock things out of
reach in the product phase that were easily sustained during research
and development.

Similarly, the *roff language imposes constraints on us.  I suspect you
and I both would be reaching for Perl-like associative arrays to handle
the bookmark storage and lookup issue if *roff had them to offer us.

We should always be on the lookout for better solutions.  Languages,
tools, and expertise evolve.  This is what refactoring is all about.
Furthermore, we (or at least I) learn more about code by taking a
specimen that works and re-expressing it than by virtually any other
method.  When we have a good automated test suite to establish that we
haven't regressed it, this can be "refactoring" at its best.  If the end
result is smaller, clearer, and/or more performant code, it's a win.

> The structure we were discussing was a simple key/value pair.

Yes, but not so simple in *roff.  *roff identifiers are so "liberal in
what they expect" that people get seduced into thinking you can put
anything into them, which leads to episodes like Savannah #64202.

> I have noticed that you tend to call things obfuscated when you have
> difficulty understanding them.

Once again, I confess that I am a bear of very little brain.

But I also figure that even if I'm roughly as sharp as the average
person on this list--if I may make such a boast even hypothetically--
then plenty of other people will have at least as much difficulty as I.

> Encoding a key into a well known address space (the hex numbers) is
> not obfuscation.

V pbaprqr gung lbh znl or noyr gb ernq n urknqrpvzny rapbqvat bs NFPVV
jvgu sne terngre rnfr guna V pna.  Ohg vs abg, gura V fhttrfg gung lbh
erpbafvqre lbhe pynvz urer.

If you're old enough to have grandchildren I trust that you recognize
that form of non-obfuscation immediately.  And USENET still lives...

> Consider BASE64 in mail systems, is that obfuscation or a valid method
> of protecting an essentially ascii environment,

It was an expedient for transmitting 8-bit data payloads across
non-8-bit clean data communications channels.  Would Base64 or UU
encoding ever have been developed if 8-bit clean channels had been
ubiquitous in the first place?

I think you're inferring value judgments from my observations that
you're not warranted in making.  The bookmark-encoding-in-an-identifier
problem sucks, that's for sure.  I don't blame anyone for doing what
they can to escape its constraints.  But I think it's worth exploring
the space of solutions available to us.

Recall that Base64 has costs in time and space.  And of course, most
people can't just read a Base64-encoded email as-is, that is by, say,
catting or grepping their inbox.  These are downsides we should be
forthright about.

> to my mind its a sensible compromise in systems designed before
> unicode (rings a bell?).

I'm not passionate about Base64 one way or the other.  If the day
arrives when we no longer need it for anything because the problems that
motivated its development disappear, then I wouldn't mourn its demise.

It's an instrument, as stringhex, or some other solution to the same
problem, would be.

> I'm afraid you give the impression that your ideas on how I should do
> my voluntary contribution to groff have more weight than my own, is
> that how you see it?

Not at all.  We've had conflict over this point before, so I will ask
again whether you would prefer that gropdf be in the contrib/ directory.

Peter Schaffter, for instance, has made a simple statement of where he
regards the boundaries of mom to be even within the contrib/mom
directory; he's not terribly concerned with, as I recall, asserting
control over the Automake script or the groff_mom(7) man page, both of
which originated with others in the first place.  So, I pretty much keep
my fingers out of om.tmac.  (I think I've run some one-liner changes, at
about the level of typo fixes, by him on one or two occasions, in
response to Savannah tickets.  I simply don't grasp the package well
enough to do more.)

...but, Peter is also wholly responsible for mom's documentation.

If you leave to others the task of writing the documentation for your
contributions, they are (I am) going to study them as closely as they
(I) need to to try to communicate their function to third parties.
Specifically, when I study them closely enough, I start thinking like an
engineer again instead of just a technical writer.  And I don't think
that's unwarranted since I serve both roles in this project.  Further,
if a code module lives in the non-contrib part of the groff source
distribution, I think the strong implication is that responsibility for
its maintenance falls to the groff team as a whole.

(I'll grant that the distinction is less clear than it could be, because
groff has relatively few active contributors and a lot of the stuff in
the contrib/ directory is maintained either by me or by no one at all,
because the original authors have wandered away over the past 20 years.)

> I welcome people who can find issues with my code, and by issues I
> mean if it produces output other than intended, fails on edge cases I
> have not considered, falls over given valid input.

Certainly.  If you didn't welcome that sort of feedback I think you'd be
neglecting your responsibilities as a software developer.  Fortunately
that scenario is counterfactual.

> I am quite sure there will be "bugs" in my code, it is fairly complex,
> but subjecting it to a "code review" without even running it to see if
> it does what it says on the box, is not helpful.

I think you've pretty badly mistaken my perspective.  One of the reasons
I stick my long nose into your code in this way is because I don't worry
that you won't produce correct results.  You have an established record
of delivering solutions that work as advertised.  (None of us is
perfect, but you do as well as anyone, as far as I can tell.  It
wouldn't surprise me in the least if your defect rate were lower than
mine.  I depend heavily on iterative development processes and automated
testing to protect myself from faceplants, and I still suffer those
occasionally.  Here's a recent example
<https://savannah.gnu.org/bugs/?65225>.  We/I didn't have a regression
test for tbl's '\R' feature, and I got bitten.  I overlooked the forest
for the wood lice of invalid inputs.  We have that regression test now.)

Regards,
Branden

diff --git a/src/roff/groff/tests/device-control-special-character-handling.sh b/src/roff/groff/tests/device-control-special-character-handling.sh
index d5f37c1d9..f5b10a393 100755
--- a/src/roff/groff/tests/device-control-special-character-handling.sh
+++ b/src/roff/groff/tests/device-control-special-character-handling.sh
@@ -19,6 +19,7 @@
 #
 
 groff="${abs_top_builddir:-.}/test-groff"
+troff="${abs_top_builddir:-.}/troff"
 
 fail=
 
@@ -28,69 +29,84 @@ wail () {
 }
 
 input='.nf
-\X#bogus1: esc \%man-beast\[u1F63C]\[u1F00] -\[aq]\[dq]\[ga]\[ha]\[rs]\[ti]#
-.device bogus1: req \%man-beast\[u1F63C]\[u1F00] -'"'"'"`^\\~
-.device "wack1: req \%man-beast\[u1F63C]\[u1F00] -'"'"'"`^\\~
-.device " altwack
-.\" We must break or flush here before changing the escape character.
-.fl
-.ec @
-@X#bogus2: esc @%man-beast@[u1F63C]@[u1F00] -@[aq]@[dq]@[ga]@[ha]@[rs]@[ti]#
-.device bogus2: req @%man-beast@[u1F63C]@[u1F00] -'"'"'"`^\~
-.device "wack2: req @%man-beast@[u1F63C]@[u1F00] -'"'"'"`^\~'
-
-output=$(printf '%s\n' "$input" | "$groff" -T ps -Z 2> /dev/null \
+.device request 1:  !"#$%&'"'"'()*+,-./0123456789:;<=>?
+     \X@ escape 1:  !"#$%&'"'"'()*+,-./0123456789:;<=>?@
+.device request 2: @ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_
+     \X! escape 2: @ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_!
+.device request 3: `abcdefghijklmnopqrstuvwxyZ{|}~
+     \X! escape 3: `abcdefghijklmnopqrstuvwxyZ{|}~!'
+
+output=$(printf '%s\n' "$input" | "$troff" -R -T ps 2> /dev/null \
   | grep '^x X')
-error=$(printf '%s\n' "$input" | "$groff" -T ps -Z 2>&1 > /dev/null)
+error=$(printf '%s\n' "$input" | "$troff" -R -T ps 2>&1 > /dev/null)
+
+echo "checking printable ASCII repertoire" >&2
 
 echo "$output"
 echo "$error"
 
-echo "checking X escape sequence, default escape character" >&2
-# x X bogus1: esc man-beast -'"`^\~
 echo "$output" \
-  | grep -Fqx 'x X bogus1: esc man-beast -'"'"'"`^\~' \
+  | grep -Fqx 'x X request 1:  !"#$%&'"'"'()*+,-./0123456789:;<=>?' \
   || wail
-
-echo "checking device request, default escape character" >&2
-# x X bogus1: req man-beast\[u1F63C]\[u1F00] -'"`^\~
 echo "$output" \
-  | grep -Fqx 'x X bogus1: req man-beast\[u1F63C]\[u1F00] -'"'"'"`^\~' \
+  | grep -Fqx 'x X  escape 1:  !"#$%&'"'"'()*+,-./0123456789:;<=>?' \
   || wail
-
-echo "checking same, but with leading quote" >&2
-# x X bogus1: req man-beast\[u1F63C]\[u1F00] -'"`^\~
 echo "$output" \
-  | grep -Fqx 'x X wack1: req man-beast\[u1F63C]\[u1F00] -'"'"'"`^\~' \
+  | grep -Fqx 'x X request 2: @ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_' \
   || wail
-
-echo "checking leading quote and space behavior in device request" >&2
-# x X altwack
-# 2 spaces between X and altwack
-echo "$output" | grep -Fqx 'x X  altwack' || wail
-
-echo "checking X escape sequence, alternate escape character" >&2
-# x X bogus2: esc man-beast -'"`^\~
 echo "$output" \
-  | grep -Fqx 'x X bogus2: esc man-beast -'"'"'"`^\~' \
+  | grep -Fqx 'x X  escape 2: @ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_' \
   || wail
-
-echo "checking device request, alternate escape character" >&2
-# x X bogus2: req man-beast\[u1F63C]\[u1F00] -'"`^\~
 echo "$output" \
-  | grep -Fqx 'x X bogus2: req man-beast\[u1F63C]\[u1F00] -'"'"'"`^\~' \
+  | grep -Fqx 'x X request 3: `abcdefghijklmnopqrstuvwxyZ{|}~' \
   || wail
-
-echo "checking same, but with leading quote" >&2
-# x X bogus1: req man-beast\[u1F63C]\[u1F00] -'"`^\~
 echo "$output" \
-  | grep -Fqx 'x X wack2: req man-beast\[u1F63C]\[u1F00] -'"'"'"`^\~' \
+  | grep -Fqx 'x X  escape 3: `abcdefghijklmnopqrstuvwxyZ{|}~' \
   || wail
 
-# TODO: Uncomment this when \X learns to read its argument in copy mode.
-#echo "checking that no error diagnostics were thrown" >&2
-#test -z "$error" || wail
+# Confirm translation of a groff special character escape sequence to a
+# basic Latin character when used in a device control command.
+#
+# $1 is the special character escape _without_ the leading backslash.
+# $2 is the expected output character _shell-quoted as necessary_.
+# $3 is a human-readable glyph description for the test log.
+check_char () {
+  sc=$1
+  output=$2
+  format='checking conversion of "\\%s" to "%s" in device control'
+  format="$format escape sequence\n"
+  printf "$format" "$sc" "$output" "$description" >&2
+  if ! printf '\\X#\\%s %s#\n' "$sc" "$desc" | "$troff" -R -T ps \
+    | grep -Fqx 'x X '$output
+  then
+    printf '...FAILED' >&2
+    fail=yes
+  fi
+
+  format='checking conversion of "\\%s" to "%s" in device control'
+  format="$format request\n"
+  printf "$format" "$sc" "$output" "$description" >&2
+  if ! printf '.device %s %s\n' "$sc" "$desc" | "$troff" -R -T ps \
+    | grep -Fqx 'x X '$output
+  then
+    printf '...FAILED' >&2
+    fail=yes
+  fi
+}
+
+check_char ' ' ' '
+#check_char - -
+#check_char '[aq]' "'" "neutral apostrophe"
+#check_char '[dq]' '"' "double quote"
+#check_char '[ga]' '`' "grave accent"
+#check_char '[ha]' ^ "caret/hat"
+#check_char '[rs]' '\' "reverse solidus/backslash"
+#check_char '[ti]' '~' "tilde"
 
 test -z "$fail"
 
+## TODO: Uncomment this when \X learns to read its argument in copy mode.
+##echo "checking that no error diagnostics were thrown" >&2
+##test -z "$error" || wail
+
 # vim:set autoindent expandtab shiftwidth=2 tabstop=2 textwidth=72:
diff --git a/src/roff/groff/tests/some_escapes_accept_newline_delimiters.sh b/src/roff/groff/tests/some_escapes_accept_newline_delimiters.sh
index 96be05532..40ce37da3 100755
--- a/src/roff/groff/tests/some_escapes_accept_newline_delimiters.sh
+++ b/src/roff/groff/tests/some_escapes_accept_newline_delimiters.sh
@@ -1,6 +1,6 @@
 #!/bin/sh
 #
-# Copyright (C) 2022 Free Software Foundation, Inc.
+# Copyright (C) 2022-2024 Free Software Foundation, Inc.
 #
 # This file is part of groff.
 #
@@ -88,19 +88,6 @@ echo "checking correct handling of newline delimiter to 'w' escape" >&2
 output=$(printf "%s\n" "$input" | "$groff" -Tascii -ww)
 test "$output" = "72 D" || wail
 
-input="\X
-tty: link http://example.com
-D
-.pl \n(nlu"
-
-echo "checking that newline is accepted as delimiter to 'X' escape" >&2
-error=$(printf "%s\n" "$input" | "$groff" -Tascii -ww -z 2>&1)
-test -z "$error" || wail
-
-echo "checking correct handling of newline delimiter to 'X' escape" >&2
-output=$(printf "%s\n" "$input" | "$groff" -Tascii -ww -P -c)
-test "$output" = ' D' || wail
-
 input="\Z
 ABC
 D
diff --git a/src/roff/troff/input.cpp b/src/roff/troff/input.cpp
index bef0dcabf..0e8e50750 100644
--- a/src/roff/troff/input.cpp
+++ b/src/roff/troff/input.cpp
@@ -917,10 +917,13 @@ void shift()
   skip_line();
 }
 
-static char get_char_for_escape_parameter(bool allow_space = false)
+static char get_char_for_escape_parameter(bool allow_space = false,
+					  bool gbr_flag = false)
 {
-  int c = get_copy(0 /* nullptr */, false /* is defining */,
+  int c = get_copy(0 /* nullptr */,
+		   false /* is defining */,
 		   true /* handle \E */);
+  debug("GBR: processing '%1' (c=%2)", char(c), c);
   switch (c) {
   case EOF:
     copy_mode_error("end of input in escape sequence");
@@ -929,19 +932,27 @@ static char get_char_for_escape_parameter(bool allow_space = false)
     if (!is_invalid_input_char(c))
       break;
     // fall through
-  case '\n':
-    if (c == '\n')
-      input_stack::push(make_temp_iterator("\n"));
+//case '\n':
+//  input_stack::push(make_temp_iterator("\n"));
+//  // fall through
+  case ESCAPE_BANG:
+    if (gbr_flag) {
+      //debug("GBR: ignoring an escaped '!'");
+      c = ' ';
+    }
     // fall through
   case ' ':
-    if (c == ' ' && allow_space)
+    if (allow_space) {
+      //debug("GBR: we saw a space (or something) but we're allowing it");
       break;
+    }
     // fall through
   case '\t':
   case '\001':
   case '\b':
-    copy_mode_error("%1 is not allowed in an escape sequence parameter",
-		    input_char_description(c));
+      //copy_mode_error("GBR: %1 is not allowed in an escape sequence"
+      //		" parameter", input_char_description(c));
+      /* silently discard */;
     return '\0';
   }
   return c;
@@ -2475,43 +2486,55 @@ int token::operator!=(const token &t)
   return !(*this == t);
 }
 
-// is token a suitable delimiter (like ')?
+// Is the character usable as a delimiter?
+//
+// This is used directly only by `do_device_control()`, because it is
+// the only escape sequence that reads its argument in copy mode (so it
+// doesn't tokenize it) and accepts a user-specified delimiter.
+static bool is_char_usable_as_delimiter(int c)
+{
+  switch(c) {
+  case '0':
+  case '1':
+  case '2':
+  case '3':
+  case '4':
+  case '5':
+  case '6':
+  case '7':
+  case '8':
+  case '9':
+  case '+':
+  case '-':
+  case '/':
+  case '*':
+  case '%':
+  case '<':
+  case '>':
+  case '=':
+  case '&':
+  case ':':
+  case '(':
+  case ')':
+  case '.':
+    return false;
+  default:
+    return true;
+  }
+}
+
+// Is the current token a suitable delimiter (like `'`)?
 
 bool token::is_usable_as_delimiter(bool report_error)
 {
+  bool is_valid = false;
   switch(type) {
   case TOKEN_CHAR:
-    switch(c) {
-    case '0':
-    case '1':
-    case '2':
-    case '3':
-    case '4':
-    case '5':
-    case '6':
-    case '7':
-    case '8':
-    case '9':
-    case '+':
-    case '-':
-    case '/':
-    case '*':
-    case '%':
-    case '<':
-    case '>':
-    case '=':
-    case '&':
-    case ':':
-    case '(':
-    case ')':
-    case '.':
-      if (report_error)
-        error("character '%1' is not allowed as a starting delimiter",
-	      char(c));
-      return false;
-    default:
-      return true;
-    }
+    is_valid = is_char_usable_as_delimiter(c);
+    if (!is_valid && report_error)
+      error("character '%1' is not allowed as a starting delimiter",
+	    char(c));
+    return is_valid;
   case TOKEN_NODE:
     // the user doesn't know what a node is
     if (report_error)
@@ -5629,65 +5652,34 @@ static void encode_char_for_troff_output(macro *mac, const char c)
       mac->append('\\');
   }
   else {
-    if (tok.is_stretchable_space()
-	|| tok.is_unstretchable_space())
-      mac->append(' ');
-    else if (tok.is_special()) {
-      const char *sc;
-      if (font::use_charnames_in_special) {
-	charinfo *ci = tok.get_char(true /* required */);
-	sc = ci->get_symbol()->contents();
-      }
-      else
-	sc = tok.get_char()->get_symbol()->contents();
-      if (strcmp("-", sc) == 0)
-	mac->append('-');
-      else if (strcmp("aq", sc) == 0)
-	mac->append('\'');
-      else if (strcmp("dq", sc) == 0)
-	mac->append('"');
-      else if (strcmp("ga", sc) == 0)
-	mac->append('`');
-      else if (strcmp("ha", sc) == 0)
-	mac->append('^');
-      else if (strcmp("rs", sc) == 0)
-	mac->append('\\');
-      else if (strcmp("ti", sc) == 0)
-	mac->append('~');
-      else {
-	if (font::use_charnames_in_special) {
-	  if (sc[0] != (char)0) {
-	    mac->append('\\');
-	    mac->append('[');
-	    int i = 0;
-	    while (sc[i] != (char)0) {
-	      mac->append(sc[i]);
-	      i++;
-	    }
-	    mac->append(']');
-	  }
-	  else
-	    error("special character '%1' cannot be used within a"
-	          " device control escape sequence", sc);
-	}
-	else
-	  error("special character '%1' cannot be used within a device"
-		" control escape sequence", sc);
-      }
-    }
-    else if ((tok.is_hyphen_indicator()
-	      || tok.is_space()
-	      || tok.is_dummy()
-	      || tok.is_transparent_dummy()
-	      || tok.is_zero_width_break())) {
-      ; // silently discard this escape sequence
-    }
-    else
-      error("%1 is invalid within a device control command",
-	    tok.description());
+    error("char %1 (c=%2) is invalid within a device control command",
+	  char(c), c);
+  }
+}
+
+static node *do_device_control() // \X
+{
+  int c;
+  macro mac;
+  int delim = get_copy(0 /* nullptr */);
+  debug("GBR: delimiter is %1", delim);
+  while (((c = get_copy(0 /* nullptr */)) != delim) && c != EOF && c != '\n') {
+    debug("GBR: processing '%1' (c=%2)", char(c), c);
+    encode_char_for_troff_output(&mac, c);
+    if (0 == c)
+      error("GBR: DAMN, need a node handler");
+  }
+  if (c == EOF || c == '\n')
+    error("unterminated device control escape sequence");
+  if (mac.empty()) {
+    error("device control escape sequence requires an argument");
+    return 0 /* nullptr */;
   }
+  else
+    return new special_node(mac);
 }
 
+#if 0
 static node *do_device_control()
 {
   int start_level = input_stack::get_level();
@@ -5725,6 +5717,7 @@ static node *do_device_control()
   }
   return new special_node(mac);
 }
+#endif
 
 static void device_request()
 {
diff --git a/tmac/an.tmac b/tmac/an.tmac
index a60eb888f..09ad61484 100644
--- a/tmac/an.tmac
+++ b/tmac/an.tmac
@@ -1215,7 +1215,9 @@ .de1 UE
 .de1 MR
 .  if ((\\n[.$] < 2) : (\\n[.$] > 3)) \
 .    an-style-warn .\\$0 expects 2 or 3 arguments, got \\n[.$]
+.  tm GBR1: MR \\$@
 .  ds an*url man:\\$1(\\$2)\" used everywhere but macOS
+.  tm GBR2: an*url \\*[an*url]
 .  if (\\n[an*MR-URL-format] = 2) \
 .    ds an*url x-man-page://\\$2/\\$1\" macOS/Mac OS X since 10.3
 .  if (\\n[an*MR-URL-format] = 3) \
diff --git a/tmac/pspic.tmac b/tmac/pspic.tmac
index 98f9ba4f1..456e32c88 100644
--- a/tmac/pspic.tmac
+++ b/tmac/pspic.tmac
@@ -129,8 +129,7 @@ .de PSPIC
 .        \" between those two escapes
 .        ds ps-invis \X'ps: invis'
 .        ds ps-endinvis \X'ps: endinvis'
-.        ds ps-import \X'ps: import \E$1 \En[llx] \En[lly] \En[urx] \En[ury] \
-                                    \En[ps-deswid] \E*[ps-desht]'
+.        ds ps-import \X'ps: import \\$1 \\n[llx] \\n[lly] \\n[urx] \\n[ury] \\n[ps-deswid] \\*[ps-desht]'
 .      \}
 .      el \{\
 .        ds ps-invis

--- src/roff/troff/input.cpp	2024-02-04 02:14:00.583956798 -0600
+++ ./ATTIC/input.cpp.half-baked	2024-01-23 12:05:55.242368237 -0600
@@ -399,7 +399,7 @@
 : fp(f), lineno(1), filename(fn), popened(po),
   newline_flag(0), seen_escape(0)
 {
-  if ((font::use_charnames_in_special) && (fn != 0 /* nullptr */)) {
+  if ((font::use_charnames_in_special) && (fn != 0)) {
     if (!the_output)
       init_output();
     the_output->put_filename(fn, po);
@@ -917,9 +917,11 @@
   skip_line();
 }
 
-static char get_char_for_escape_parameter(bool allow_space = false)
+static char get_char_for_escape_parameter(bool allow_space = false,
+					  bool want_discard = false)
 {
-  int c = get_copy(0 /* nullptr */, false /* is defining */,
+  int c = get_copy(0 /* nullptr */,
+		   false /* is defining */,
 		   true /* handle \E */);
   switch (c) {
   case EOF:
@@ -940,8 +942,9 @@
   case '\t':
   case '\001':
   case '\b':
-    copy_mode_error("%1 is not allowed in an escape sequence parameter",
-		    input_char_description(c));
+    if (!want_discard)
+      copy_mode_error("%1 is not allowed in an escape sequence"
+		      " parameter", input_char_description(c));
     return '\0';
   }
   return c;
@@ -1279,9 +1282,9 @@
   return false;
 }
 
-non_interpreted_char_node::non_interpreted_char_node(unsigned char cc) : c(cc)
+non_interpreted_char_node::non_interpreted_char_node(unsigned char n) : c(n)
 {
-  assert(cc != 0);
+  assert(n != 0);
 }
 
 node *non_interpreted_char_node::copy()
@@ -1297,7 +1300,7 @@
 
 static void do_width();
 static node *do_non_interpreted();
-static node *do_special();
+static node *do_device_control();
 static node *do_suppress(symbol nm);
 static void do_register();
 
@@ -1879,7 +1882,7 @@
   }
   units x;
   for (;;) {
-    node *n = 0 /* nullptr */;
+    node *n = 0;
     int cc = input_stack::get(&n);
     if (cc != escape_char || escape_char == 0) {
     handle_ordinary_char:
@@ -2039,7 +2042,7 @@
 	return;
       case 0:
 	{
-	  assert(n != 0 /* nullptr */);
+	  assert(n != 0);
 	  token_node *tn = n->get_token_node();
 	  if (tn) {
 	    *this = tn->tk;
@@ -2360,7 +2363,7 @@
 	nd = new extra_size_node(x);
 	return;
       case 'X':
-	nd = do_special();
+	nd = do_device_control();
 	if (!nd)
 	  break;
 	type = TOKEN_NODE;
@@ -2533,7 +2536,7 @@
 
 const char *token::description()
 {
-  const size_t bufsz = sizeof "character 'x'" + 1;
+  const size_t bufsz = strlen("character 'x'") + 1;
   static char buf[bufsz];
   (void) memset(buf, 0, bufsz);
   switch (type) {
@@ -3209,7 +3212,7 @@
     }
     if (!suppress_next)
       tok.next();
-    was_trap_sprung = false;
+    trap_sprung_flag = 0;
   }
 }
 
@@ -3362,33 +3365,33 @@
 
 void node_list::append(node *n)
 {
-  if (head == 0 /* nullptr */) {
-    n->next = 0 /* nullptr */;
+  if (head == 0) {
+    n->next = 0;
     head = tail = n;
   }
   else {
-    n->next = 0 /* nullptr */;
+    n->next = 0;
     tail = tail->next = n;
   }
 }
 
 int node_list::length()
 {
-  int total = 0 /* nullptr */;
-  for (node *n = head; n != 0 /* nullptr */; n = n->next)
+  int total = 0;
+  for (node *n = head; n != 0; n = n->next)
     ++total;
   return total;
 }
 
 node_list::node_list()
 {
-  head = tail = 0 /* nullptr */;
+  head = tail = 0;
 }
 
 node *node_list::extract()
 {
   node *temp = head;
-  head = tail = 0 /* nullptr */;
+  head = tail = 0;
   return temp;
 }
 
@@ -3408,7 +3411,7 @@
 
 macro::~macro()
 {
-  if (p != 0 /* nullptr */ && --(p->count) <= 0)
+  if (p != 0 && --(p->count) <= 0)
     delete p;
 }
 
@@ -3416,12 +3419,12 @@
 : is_a_diversion(0), is_a_string(1)
 {
   if (!input_stack::get_location(1, &filename, &lineno)) {
-    filename = 0 /* nullptr */;
-    lineno = 0 /* nullptr */;
+    filename = 0;
+    lineno = 0;
   }
   len = 0;
   empty_macro = 1;
-  p = 0; /* nullptr */
+  p = 0;
 }
 
 macro::macro(const macro &m)
@@ -3429,7 +3432,7 @@
   empty_macro(m.empty_macro), is_a_diversion(m.is_a_diversion),
   is_a_string(m.is_a_string), p(m.p)
 {
-  if (p != 0 /* nullptr */)
+  if (p != 0)
     p->count++;
 }
 
@@ -3437,13 +3440,13 @@
 : is_a_diversion(is_div)
 {
   if (!input_stack::get_location(1, &filename, &lineno)) {
-    filename = 0 /* nullptr */;
-    lineno = 0 /* nullptr */;
+    filename = 0;
+    lineno = 0;
   }
   len = 0;
   empty_macro = 1;
   is_a_string = 1;
-  p = 0 /* nullptr */;
+  p = 0;
 }
 
 int macro::is_diversion()
@@ -3464,9 +3467,9 @@
 macro &macro::operator=(const macro &m)
 {
   // don't assign object
-  if (m.p != 0 /* nullptr */)
+  if (m.p != 0)
     m.p->count++;
-  if (p != 0 /* nullptr */ && --(p->count) <= 0)
+  if (p != 0 && --(p->count) <= 0)
     delete p;
   p = m.p;
   filename = m.filename;
@@ -3481,7 +3484,7 @@
 void macro::append(unsigned char c)
 {
   assert(c != 0);
-  if (p == 0 /* nullptr */)
+  if (p == 0)
     p = new macro_header;
   if (p->cl.length() != len) {
     macro_header *tem = p->copy(len);
@@ -3497,14 +3500,14 @@
 
 void macro::set(unsigned char c, int offset)
 {
-  assert(p != 0 /* nullptr */);
+  assert(p != 0);
   assert(c != 0);
   p->cl.set(c, offset);
 }
 
 unsigned char macro::get(int offset)
 {
-  assert(p != 0 /* nullptr */);
+  assert(p != 0);
   return p->cl.get(offset);
 }
 
@@ -3527,8 +3530,8 @@
 
 void macro::append(node *n)
 {
-  assert(n != 0 /* nullptr */);
-  if (p == 0 /* nullptr */)
+  assert(n != 0);
+  if (p == 0)
     p = new macro_header;
   if (p->cl.length() != len) {
     macro_header *tem = p->copy(len);
@@ -4229,7 +4232,7 @@
 // point with a composite mapping?  Either the key or value component
 // of an entry in the composite dictionary qualifies.
 //
-// This is an O(n) search, but by default groff defines only 22
+// This is an O(n) search, but by default groff only defines 22
 // composite character mappings ("tmac/composite.tmac").  If this
 // becomes a performance problem, we will need another dictionary
 // mapping the unique values of `composite_dictionary` (which is not
@@ -4272,15 +4275,15 @@
   skip_line();
 }
 
-bool was_trap_sprung = false;
-static bool are_traps_postponed = false;
+int trap_sprung_flag = 0;
+int postpone_traps_flag = 0;
 symbol postponed_trap;
 
 void spring_trap(symbol nm)
 {
   assert(!nm.is_null());
-  was_trap_sprung = true;
-  if (are_traps_postponed) {
+  trap_sprung_flag = 1;
+  if (postpone_traps_flag) {
     postponed_trap = nm;
     return;
   }
@@ -4301,19 +4304,19 @@
 
 void postpone_traps()
 {
-  are_traps_postponed = true;
+  postpone_traps_flag = 1;
 }
 
-bool unpostpone_traps()
+int unpostpone_traps()
 {
-  are_traps_postponed = false;
+  postpone_traps_flag = 0;
   if (!postponed_trap.is_null()) {
     spring_trap(postponed_trap);
     postponed_trap = NULL_SYMBOL;
-    return true;
+    return 1;
   }
   else
-    return false;
+    return 0;
 }
 
 void read_request()
@@ -4398,7 +4401,7 @@
     c = get_copy(&n);
   macro mac;
   request_or_macro *rm = (request_or_macro *)request_dictionary.lookup(nm);
-  macro *mm = rm ? rm->to_macro() : 0 /* nullptr */;
+  macro *mm = rm ? rm->to_macro() : 0;
   if (mode == DEFINE_APPEND && mm)
     mac = *mm;
   if (comp == COMP_DISABLE)
@@ -5616,19 +5619,24 @@
   return new non_interpreted_node(mac);
 }
 
-// In troff output, we translate the escape character to '\', but it is
-// up to the postprocessor to interpret it as such.  (This mostly
-// matters for device control commands.)
 static void encode_char_for_troff_output(macro *mac, const char c)
 {
-  bool is_char_valid = true;
-  const char *sc = 0 /* nullptr */;
-  if ('\0' == c) {
-    if (tok.is_space()
-	|| tok.is_stretchable_space()
+  // All printable characters get encoded as themselves except the
+  // escape character.  The formatter language's choice of escape
+  // character is irrelevant to the device-independent output language.
+  // Encode that as a backslash.
+  if (csprint(c)) {
+    if (c != escape_char)
+      mac->append(c);
+    else
+      mac->append('\\');
+  }
+  else {
+    if (tok.is_stretchable_space()
 	|| tok.is_unstretchable_space())
       mac->append(' ');
     else if (tok.is_special()) {
+      const char *sc;
       if (font::use_charnames_in_special) {
 	charinfo *ci = tok.get_char(true /* required */);
 	sc = ci->get_symbol()->contents();
@@ -5662,38 +5670,151 @@
 	    mac->append(']');
 	  }
 	  else
-	    is_char_valid = false;
+	    error("special character '%1' cannot be used within a"
+	          " device control escape sequence", sc);
 	}
 	else
-	  is_char_valid = false;
+	  error("special character '%1' cannot be used within a device"
+		" control escape sequence", sc);
       }
     }
-    else if (tok.is_hyphen_indicator()
-	       || tok.is_dummy()
-	       || tok.is_transparent_dummy()
-	       || tok.is_zero_width_break())
-      /* silently ignore */;
-    else
-      is_char_valid = false;
-    if (!is_char_valid) {
-      if (sc != 0 /* nullptr */)
-	error("special character '%1' is invalid within a device"
-	      " control command", sc);
-      else
-	error("%1 is invalid within a device control command",
-	      tok.description());
+    else if ((tok.is_hyphen_indicator()
+	      || tok.is_space()
+	      || tok.is_dummy()
+	      || tok.is_transparent_dummy()
+	      || tok.is_zero_width_break())) {
+      ; // silently discard this escape sequence
     }
+    else
+      error("%1 is invalid within a device control command",
+	    tok.description());
+  }
+}
+
+static node *do_device_control()
+{
+  node *n;
+  int c;
+  macro mac;
+  token start_token;
+  start_token.next();
+  if (!start_token.is_usable_as_delimiter(true /* report error */)){
+    do {
+      tok.next();
+    } while (tok != start_token && !tok.is_newline() && !tok.is_eof());
   }
+  if (tok == start_token)
+    error("device control escape sequence requires an argument");
   else {
-    if (c == escape_char) {
-      mac->append('\\');
+    // Get a character, accepting spaces and discarding embedded escape
+    // sequences with no representation outside of the formatter.
+    while ((c = get_char_for_escape_parameter(true, true))
+	   != start_token.ch() && c != EOF && c != '\n') {
+      debug("GBR: encoding '%1' (%2) in device control command", char(c), c);
+//    while ((c = get_copy(&n)) != start_token.ch() && c != EOF && c != '\n') {
+      if ('\0' == c) {
+	debug("GBR: ignoring it");
+	continue;
+      }
+      else
+	encode_char_for_troff_output(&mac, c);
+      // XXX: we can kill everything from here forward?
+      if ((c > 31) && (c < 127)) {
+	;//
+      }
+      // Convert certain tokens back into characters.
+      else if (c == 1) { // same in ISO 646 and CCSID 1047
+	encode_char_for_troff_output(&mac, '\\');
+	encode_char_for_troff_output(&mac, 'a');
+      }
+      else if (c == ESCAPE_BANG) {
+	encode_char_for_troff_output(&mac, '\\');
+	encode_char_for_troff_output(&mac, '!');
+      }
+      else if (c == ESCAPE_SPACE) {
+	encode_char_for_troff_output(&mac, '\\');
+	encode_char_for_troff_output(&mac, ' ');
+      }
+      else if (c == ESCAPE_AMPERSAND) {
+	encode_char_for_troff_output(&mac, '\\');
+	encode_char_for_troff_output(&mac, '&');
+      }
+      else if (c == ESCAPE_COLON) {
+	encode_char_for_troff_output(&mac, '\\');
+	encode_char_for_troff_output(&mac, ':');
+      }
+      else if (c == ESCAPE_CIRCUMFLEX) {
+	encode_char_for_troff_output(&mac, '\\');
+	encode_char_for_troff_output(&mac, '^');
+      }
+      else if (c == ESCAPE_UNDERSCORE) {
+	encode_char_for_troff_output(&mac, '\\');
+	encode_char_for_troff_output(&mac, '_');
+      }
+      else if (c == ESCAPE_LEFT_BRACE) {
+	encode_char_for_troff_output(&mac, '\\');
+	encode_char_for_troff_output(&mac, '{');
+      }
+      else if (c == ESCAPE_RIGHT_BRACE) {
+	encode_char_for_troff_output(&mac, '\\');
+	encode_char_for_troff_output(&mac, '}');
+      }
+      else if (c == ESCAPE_BAR) {
+	encode_char_for_troff_output(&mac, '\\');
+	encode_char_for_troff_output(&mac, '|');
+      }
+      else if (c == ESCAPE_RIGHT_QUOTE)
+	encode_char_for_troff_output(&mac, '\'');
+      else if (c == ESCAPE_HYPHEN)
+	encode_char_for_troff_output(&mac, '-');
+      else if (c == ESCAPE_PERCENT) {
+	encode_char_for_troff_output(&mac, '\\');
+	encode_char_for_troff_output(&mac, '%');
+      }
+      else if (c == ESCAPE_TILDE) {
+	encode_char_for_troff_output(&mac, '\\');
+	encode_char_for_troff_output(&mac, '~');
+      }
+      else if (c == ESCAPE_E) {
+	encode_char_for_troff_output(&mac, '\\');
+	encode_char_for_troff_output(&mac, 'E');
+      }
+      else if (c == ESCAPE_c) {
+	encode_char_for_troff_output(&mac, '\\');
+	encode_char_for_troff_output(&mac, 'c');
+      }
+      else if (c == ESCAPE_e) {
+	encode_char_for_troff_output(&mac, '\\');
+	encode_char_for_troff_output(&mac, 'e');
+      }
+      else if (c == '\t') {
+	encode_char_for_troff_output(&mac, '\\');
+	encode_char_for_troff_output(&mac, 't');
+      }
+      else if (c == INPUT_DELETE)
+	error("a delete character is unsupported in a device control"
+	      " escape sequence");
+      else if (c == 0) {
+	assert(0 == "wuh-oh");
+	//debug("GBR: appending node to device control command");
+	mac.append(n);
+      }
+      else {
+	debug("GBR: unhandled c=%1 (%2)", char(c), c);
+	assert(0 == "escape sequence unsupported");
+      }
     }
-    else
-      mac->append(c);
+    if (c == EOF || c == '\n')
+      error("unterminated device control escape sequence");
   }
+  if (mac.empty())
+    return 0 /* nullptr */;
+  else
+    return new special_node(mac);
 }
 
-static node *do_special()
+#if 0
+static node *do_device_control()
 {
   int start_level = input_stack::get_level();
   token start_token;
@@ -5730,31 +5851,44 @@
   }
   return new special_node(mac);
 }
+#endif
 
 static void device_request()
 {
-  if (!has_arg()) {
-    warning(WARN_MISSING, "device request expects arguments");
-    skip_line();
-    return;
+  // We can't use `has_arg()` here because we want to read in copy mode.
+  int c;
+  node *n = 0 /* nullptr */;
+  for (;;) {
+    c = input_stack::peek();
+    if (' ' == c)
+      (void) get_copy(0 /* nullptr */);
+    else
+      break;
   }
-  if (tok.is_newline() || tok.is_eof()) {
-    warning(WARN_MISSING, "device request expects arguments");
+  if (('\n' == c) || (EOF == c)) {
+    warning(WARN_MISSING, "device control request expects arguments");
     skip_line();
     return;
   }
-  if ('"' == tok.ch()) {
-    tok.next();
-  }
   macro mac;
   for (;;) {
-    if (tok.is_newline() || tok.is_eof())
+    c = get_copy(&n);
+    if ('"' == c) {
+      c = get_copy(0 /* nullptr */);
+      break;
+    }
+    if (c != ' ' && c != '\t')
       break;
-    encode_char_for_troff_output(&mac, tok.ch());
-    tok.next();
+  }
+  while (c != '\n' && c != EOF) {
+    if (c != 0)
+      encode_char_for_troff_output(&mac, c);
+    else
+      mac.append(n);
+    c = get_copy(&n);
   }
   curenv->add_node(new special_node(mac));
-  skip_line();
+  tok.next();
 }
 
 static void device_macro_request()

signature.asc
Description: PGP signature

Re: gropdf-ng merge status (was: PDF outline not capturing Cyrillic text)

Reply via email to