URL:
  <https://savannah.gnu.org/bugs/?68257>

                 Summary: [troff] a closing brace escape sequence `\}` should
terminate any longer token being collected
                   Group: GNU roff
               Submitter: gbranden
               Submitted: Sat 18 Apr 2026 04:55:56 PM UTC
                Category: Core
                Severity: 3 - Normal
              Item Group: Incorrect behaviour
                  Status: None
                 Privacy: Public
             Assigned to: None
             Open/Closed: Open
         Discussion Lock: Unlocked
         Planned Release: None


    _______________________________________________________

Follow-up Comments:


-------------------------------------------------------
Date: Sat 18 Apr 2026 04:55:56 PM UTC By: G. Branden Robinson <gbranden>
In _groff_ 1.24.0 I managed to introduce an arguably new spurious diagnostic
in some circumstances.


$ printf '.nr a 1 \\}\n' | ~/groff-1.24.0/bin/groff 
troff:<standard input>:1: error: ignoring invalid numeric expression starting
with an escaped '}'


The numeric expression expected here is the autoincrement amount.

The assignment does go ahead and take place, and no autoincrement applies.


$ printf '.nr a 1 \\}\n.pnr a\n' | ~/groff-1.24.0/bin/groff 
troff:<standard input>:1: error: ignoring invalid numeric expression starting
with an escaped '}'
a       1 +0 0


Here's another point to observe:


$ printf '.nr a 1 12\\}34\n.pnr a\n' | ~/groff-1.24.0/bin/groff 
a       1 +12 0


...and this points the way to the remedy.

The functions in GNU _troff_ that read "longer" tokens (that is, what parser
theory calls "nonterminal symbols", meaning stuff you have to collect more of
before you can decide what it is) should, **and sometimes do**, terminate that
longer token upon encountering a closing brace escape sequence, the same as
they would when encountering a newline or EOF.

But they don't do so consistently, which means our grammar is, in some places,
inconsistent both with itself and with other _troff_ programs.


$ printf '.nr a 12\\}34\n.tm \\na\n' | nroff
12
$ printf '.nr a 12\\}34\n.tm \\na\n' | 9 nroff
12
$ printf '.nr a 12\\}34\n.tm \\na\n' | dwb nroff
12
$ printf '.nr a 12\\}34\n.tm \\na\n' | heirloom nroff
12


Okay, so that's good, but...


$ printf '.ds a AB\\}CD\n.tm \\*a\n' | nroff
AB\}CD
$ printf '.ds a AB\\}CD\n.tm \\*a\n' | 9 nroff
AB
$ printf '.ds a AB\\}CD\n.tm \\*a\n' | dwb nroff
AB
$ printf '.ds a AB\\}CD\n.tm \\*a\n' | heirloom nroff
AB\}CD


...which is not so good.

Note well that "string contents" are a "longer symbol" of the type I'm
describing, at the time they're populating a string, even if they might
decompose into many different tokens when _interpolated_ (or otherwise
interpreted, or surgically altered with `chop` or `substring`).
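
As a rough illustration of the rule proposed above, here is a minimal Python
model of a "longer token" collector that stops at a newline, at end of input,
or at a `\}` escape sequence. It is not GNU _troff_'s code (which is C++); the
name `collect_token` is hypothetical, and treating every other escape sequence
as exactly two characters is a simplifying assumption of the sketch.

```python
def collect_token(text, start=0):
    """Collect a 'longer token' from text, beginning at index start.

    Collection stops at a newline, at end of input, or at a closing
    brace escape sequence (backslash followed by '}').  Returns the
    token and the index of the terminator, which is left unconsumed.

    Assumption: any other escape sequence is backslash plus exactly one
    character.  Real roff escape sequences can be longer.
    """
    i = start
    n = len(text)
    while i < n:
        ch = text[i]
        if ch == "\n":
            break  # newline terminates the token
        if ch == "\\":
            if text.startswith("}", i + 1):
                break  # closing brace escape terminates the token
            i = min(i + 2, n)  # skip some other two-character escape
            continue
        i += 1
    return text[start:i], i


# The register and string cases from the transcripts above.
print(collect_token("12\\}34\n"))  # ('12', 2)
print(collect_token("AB\\}CD\n"))  # ('AB', 2)
```

Under this model both the register assignment and the string definition end at
the `\}`, which matches the `12` and `AB` results shown above rather than the
`AB\}CD` ones.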

I'll include here a portion of a private email I sent to Clem Cole in response
to a report he submitted to me which became bug #68252.


(3) Four classes of distinct diagnostic remain; two of them repeat many
    times because they arise due to macro programming.  I added GNU
    troff's '-b' (backtrace) option (which didn't work well prior to
    groff 1.23) to track them down.

    a.  troff: backtrace: 'errmacs':111: macro 'NM'
        troff: backtrace: file 'errmess1':2
        troff:errmess1:2: error: ignoring invalid numeric expression
containing an escaped '}'

        Hmmm.  Here's the relevant input.

   109  .de NM
   110  .if '\\$2'' \{
   111  .nr Hs 1 \}

        The foregoing is not erroneous.  I suspect the macro programmer
        forgot to escape the newline on line 110, but even if they
        deliberately wanted to put a word space or line break on the
        output when the `NM` macro was called with a second, non-empty
        argument, that doesn't silence the error on line 111.

        This is a bug I introduced in groff 1.24; sorry.  groff 1.23.0
        doesn't exhibit it.  I'll need to bisect, but I suspect the work
        on this bug:

        https://savannah.gnu.org/bugs/?64240

        ...introduced the problem.  I'll have to go back to the drawing
        board with numeric expression evaluation.  I further suspect the
        involvement of another, older issue:

        https://savannah.gnu.org/bugs/?42675

        ...which has been open for almost twelve years.

        Digression which I'll likely be adapting into a comment on that
        Savannah ticket:

        There are a couple of entangled issues.  Probably we should more
        aggressively handle brace escape sequences (meaning, they
        terminate the syntax item being scanned as a newline or EOF
        would) in the particularized expression parsers,[3] leaving only
        the control flow request handlers to properly interpret them,
        but the complaint raised in that ticket would not be completely
        resolved even so, because if you have _spaces_ before a `\}`
        escape sequence in your macro call, and because spaces _separate
        arguments in macro calls_ (if not quoted), the argument count
        will still seem too large.  Thus:

        .NM foo \} \} \}

        ...(assuming no trailing spaces) has an argument count of 4.
        The last three arguments are empty, or null strings if you will.
        By contrast:

        .NM foo \}\}\}

        ...has two arguments only, and of course...

        .NM foo\}\}\}

        ...only one.

        Bottom line: sorry, please ignore those error diagnostics.
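
The argument counts claimed in the quoted message can be reproduced with a toy
Python model. This is only a sketch of the behaviour described there, not of
GNU _troff_'s implementation: the name `macro_args` is hypothetical, and the
model assumes that unquoted spaces separate arguments, that each argument ends
at its first `\}`, and that double-quoted arguments never occur.

```python
def macro_args(line):
    """Return the arguments of a macro call line under a toy model.

    Assumptions (not taken from GNU troff's source):
      * the first word is the macro call itself and is dropped;
      * runs of unquoted spaces separate arguments;
      * each argument is cut off at its first closing brace escape
        sequence, so an argument consisting only of such sequences is
        empty but still counted;
      * double-quoted arguments are not handled.
    """
    words = [w for w in line.rstrip("\n").split(" ") if w != ""]
    return [w.split("\\}", 1)[0] for w in words[1:]]


for call in (r".NM foo \} \} \}", r".NM foo \}\}\}", r".NM foo\}\}\}"):
    args = macro_args(call)
    print(len(args), args)
# 4 ['foo', '', '', '']
# 2 ['foo', '']
# 1 ['foo']
```

The point the model makes is that cutting an argument short at `\}` cannot
lower the argument count, because the separating spaces have already split the
line into arguments.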


Once _this_ issue is resolved, I expect to revisit bug #42675.







    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?68257>

_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/