gbranden pushed a commit to branch master
in repository groff.

commit a91cd457d95e4e6b67adf3d00185598afccfc87d
Author: G. Branden Robinson <[email protected]>
AuthorDate: Thu Sep 26 16:50:58 2024 -0500

    [tbl]: Warn on unreliable escape seqs in entries.
    
    * src/preproc/tbl/table.cpp (table::add_entry): Throw warnings if
      comment (`\"`, `\#`) or transparent throughput (`\!`) escape sequences
      encountered in table entry.  Because these escape sequences cause the
      formatter to consume the rest of the input line as their argument,
      they don't play well with tbl, which tries to measure a table entry's
      width by interpolating it inside the delimited `\w` escape sequence.
      You can sometimes get away with this (especially in simple table
      layouts), hence the mere warning, but it can't be relied upon.
    
    Prompted by an observation of the NetHack Guidebook; see
    <https://github.com/NetHack/NetHack/pull/1280/commits/f886f71d491dd7205b9b064368001a34bf8fa9f0>.
    
    Exhibit:
    
    $ cat EXPERIMENTS/bad-escapes-in-tables.roff
    This is my table.
    .sp
    .TS
    L.
    foo \" comment
    bar\# another comment
    baz\!qux
    .\" This should produce no diagnostic.
    T{
    This is fine. \" fingers crossed\! [sic]
    T}
    .TE
    $ ./build/test-groff -T ascii -t EXPERIMENTS/bad-escapes-in-tables.roff|cat -s
    tbl:EXPERIMENTS/bad-escapes-in-tables.roff:5: warning: table entry contains comment escape sequence '\"'
    tbl:EXPERIMENTS/bad-escapes-in-tables.roff:5: warning: table entry contains comment escape sequence '\#'
    tbl:EXPERIMENTS/bad-escapes-in-tables.roff:7: warning: table entry contains transparent throughput escape sequence '\!'
    EXPERIMENTS/bad-escapes-in-tables.roff:5: warning: table wider than line length minus indentation
    This is my table.
    
    foo
    bazqux 3bot 0>?0
    This is fine.
    
    The evident garbage in the table as formatted (plus warnings from the
    formatter if you turn them on) is the consequence of the invalid use of
    the escape sequences in the table data.
    
    In case I haven't mentioned it lately, I generally don't write
    automated tests of invalid inputs because that is an unbounded space.
---
 ChangeLog                 | 12 ++++++++++++
 src/preproc/tbl/table.cpp | 35 ++++++++++++++++++++++++++++++++++-
 2 files changed, 46 insertions(+), 1 deletion(-)

diff --git a/ChangeLog b/ChangeLog
index 51c2ec77e..a192af200 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,15 @@
+2024-09-26  G. Branden Robinson <[email protected]>
+
+       * src/preproc/tbl/table.cpp (table::add_entry): Throw warnings
+       if comment (`\"`, `\#`) or transparent throughput (`\!`) escape
+       sequences encountered in table entry.  Because these escape
+       sequences cause the formatter to consume the rest of the input
+       line as their argument, they don't play well with tbl, which
+       tries to measure a table entry's width by interpolating it
+       inside the delimited `\w` escape sequence.  You can sometimes
+       get away with this (especially in simple table layouts), hence
+       the mere warning, but it can't be relied upon.
+
 2024-09-26  G. Branden Robinson <[email protected]>
 
        * src/preproc/tbl/table.cpp (table::add_entry): When erroring
diff --git a/src/preproc/tbl/table.cpp b/src/preproc/tbl/table.cpp
index ab6eea7ee..596a854b0 100644
--- a/src/preproc/tbl/table.cpp
+++ b/src/preproc/tbl/table.cpp
@@ -1,4 +1,4 @@
-/* Copyright (C) 1989-2023 Free Software Foundation, Inc.
+/* Copyright (C) 1989-2024 Free Software Foundation, Inc.
      Written by James Clark ([email protected])
 
 This file is part of groff.
@@ -1525,7 +1525,40 @@ void table::add_entry(int r, int c, const string &str,
   allocate(r);
   table_entry *e = 0 /* nullptr */;
   int len = str.length();
+  // Diagnose escape sequences that can wreak havoc in generated output.
   if (len > 1) {
+    const char *entryptr = str.contents();
+    // A comment on a control line or in a text block is okay.
+    const char *commentptr = strstr(entryptr, "\\\"");
+    if (commentptr != 0 /* nullptr */) {
+      const char *controlptr = strchr(entryptr, '.');
+      if ((controlptr == 0 /* nullptr */)
+         || (controlptr == entryptr)
+         || (strstr(entryptr, "\n") == 0 /* nullptr */))
+       warning_with_file_and_line(fn, ln, "table entry contains"
+                                  " comment escape sequence '\\\"'");
+    }
+    const char *gcommentptr = strstr(entryptr, "\\#");
+    // If both types of comment are present, the first is what matters.
+    if ((gcommentptr != 0 /* nullptr */)
+       && (gcommentptr < commentptr))
+      commentptr = gcommentptr;
+    if (commentptr != 0 /* nullptr */) {
+      const char *controlptr = strchr(entryptr, '.');
+      if ((controlptr == 0 /* nullptr */)
+         || (controlptr == entryptr)
+         || (strstr(entryptr, "\n") == 0 /* nullptr */))
+       warning_with_file_and_line(fn, ln, "table entry contains"
+                                  " comment escape sequence '\\#'");
+    }
+    // A \! escape sequence after a comment has started is okay.
+    const char *exclptr = strstr(str.contents(), "\\!");
+    if ((exclptr != 0 /* nullptr */)
+       && ((0 /* nullptr */ == commentptr)
+           || (exclptr < commentptr)))
+      warning_with_file_and_line(fn, ln, "table entry contains"
+                                " transparent throughput escape"
+                                " sequence '\\!'");
     string last_two_chars = str.substring((len - 2), 2);
     if ("\\z" == last_two_chars)
       error_with_file_and_line(fn, ln, "table entry ends with"
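
The ordering rules described in the commit message (a comment escape
sequence in a one-line entry is suspect, and `\!` is suspect only if it
appears before any comment starts) can be illustrated outside of tbl.
What follows is a minimal standalone sketch, not the code committed to
table.cpp: the function name `check_entry` is hypothetical, it uses
std::string rather than groff's own string class, it reports only the
first comment escape sequence found in an entry, and it assumes that any
entry containing a newline is a text block.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper, not part of groff: collect the diagnostics a
// checker in the spirit of this commit might report for one table entry.
std::vector<std::string> check_entry(const std::string &entry)
{
  const std::string::size_type npos = std::string::npos;
  std::vector<std::string> warnings;
  std::string::size_type quote = entry.find("\\\"");  // \" comment
  std::string::size_type hash = entry.find("\\#");    // \# comment
  // If both comment forms are present, the first is what matters.
  // npos is the largest size_type value, so std::min() yields npos
  // only when neither form occurs.
  std::string::size_type comment = std::min(quote, hash);
  // Assumption: an entry spanning several lines is a text block, where
  // a comment only swallows the rest of its own line.
  bool is_text_block = (entry.find('\n') != npos);
  if ((comment != npos) && !is_text_block)
    warnings.push_back((comment == quote)
      ? std::string("table entry contains comment escape sequence '\\\"'")
      : std::string("table entry contains comment escape sequence '\\#'"));
  // A \! escape sequence after a comment has started is harmless.
  std::string::size_type excl = entry.find("\\!");
  if ((excl != npos) && ((comment == npos) || (excl < comment)))
    warnings.push_back("table entry contains transparent throughput"
                       " escape sequence '\\!'");
  return warnings;
}
```

Under those assumptions, the three one-line entries from the exhibit
(`foo \" comment`, `bar\# another comment`, and `baz\!qux`) each yield
one diagnostic, while the text block yields none because its `\!`
follows the start of a comment.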

_______________________________________________
Groff-commit mailing list
[email protected]
https://lists.gnu.org/mailman/listinfo/groff-commit