gbranden pushed a commit to branch master
in repository groff.
commit a91cd457d95e4e6b67adf3d00185598afccfc87d
Author: G. Branden Robinson <[email protected]>
AuthorDate: Thu Sep 26 16:50:58 2024 -0500
[tbl]: Warn on unreliable escape seqs in entries.
* src/preproc/tbl/table.cpp (table::add_entry): Throw warnings if
comment (`\"`, `\#`) or transparent throughput (`\!`) escape sequences
encountered in table entry. Because these escape sequences cause the
formatter to consume the rest of the input line as their argument,
they don't play well with tbl, which tries to measure a table entry's
width by interpolating it inside the delimited `\w` escape sequence.
You can sometimes get away with this (especially in simple table
layouts), hence the mere warning, but it can't be relied upon.
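The check described above can be sketched in isolation. This is a minimal illustration, not the code in table.cpp: the helper name `unreliable_escapes` is hypothetical, and it assumes a single-line entry, so it ignores the control-line and text-block exemptions and does not treat an escaped backslash (`\\"`) specially.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper, not part of groff: list the unreliable escape
// sequences that take effect in a single-line table entry.
std::vector<std::string> unreliable_escapes(const std::string &entry)
{
  std::vector<std::string> found;
  const char *s = entry.c_str();
  const char *quote = std::strstr(s, "\\\"");
  const char *hash = std::strstr(s, "\\#");
  // Whichever comment escape comes first is where the comment begins;
  // everything after it is consumed by the formatter.
  const char *comment = quote;
  if (hash != nullptr && (comment == nullptr || hash < comment))
    comment = hash;
  if (comment != nullptr)
    found.push_back(comment == quote ? "\\\"" : "\\#");
  // A \! matters only if it appears before any comment starts.
  const char *excl = std::strstr(s, "\\!");
  if (excl != nullptr && (comment == nullptr || excl < comment))
    found.push_back("\\!");
  return found;
}
```

Reporting only the earliest comment escape reflects the reasoning that a comment swallows the rest of the line, so a later `\#` or `\!` never reaches the formatter.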
Prompted by an observation of the NetHack Guidebook; see
<https://github.com/NetHack/NetHack/pull/1280/commits/\
f886f71d491dd7205b9b064368001a34bf8fa9f0>.
Exhibit:
$ cat EXPERIMENTS/bad-escapes-in-tables.roff
This is my table.
.sp
.TS
L.
foo \" comment
bar\# another comment
baz\!qux
.\" This should produce no diagnostic.
T{
This is fine. \" fingers crossed\! [sic]
T}
.TE
$ ./build/test-groff -T ascii -t EXPERIMENTS/bad-escapes-in-tables.roff|cat -s
tbl:EXPERIMENTS/bad-escapes-in-tables.roff:5: warning: table entry contains comment escape sequence '\"'
tbl:EXPERIMENTS/bad-escapes-in-tables.roff:5: warning: table entry contains comment escape sequence '\#'
tbl:EXPERIMENTS/bad-escapes-in-tables.roff:7: warning: table entry contains transparent throughput escape sequence '\!'
EXPERIMENTS/bad-escapes-in-tables.roff:5: warning: table wider than line length minus indentation
This is my table.
foo
bazqux 3bot 0>?0
This is fine.
The evident garbage in the table as formatted (plus warnings from the
formatter if you turn them on) is the consequence of the invalid use of
these escape sequences in the table data.
In case I haven't mentioned it lately, I generally don't write
automated tests of invalid inputs because that is an unbounded space.
---
ChangeLog | 12 ++++++++++++
src/preproc/tbl/table.cpp | 35 ++++++++++++++++++++++++++++++++++-
2 files changed, 46 insertions(+), 1 deletion(-)
diff --git a/ChangeLog b/ChangeLog
index 51c2ec77e..a192af200 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,15 @@
+2024-09-26 G. Branden Robinson <[email protected]>
+
+ * src/preproc/tbl/table.cpp (table::add_entry): Throw warnings
+ if comment (`\"`, `\#`) or transparent throughput (`\!`) escape
+ sequences encountered in table entry. Because these escape
+ sequences cause the formatter to consume the rest of the input
+ line as their argument, they don't play well with tbl, which
+ tries to measure a table entry's width by interpolating it
+ inside the delimited `\w` escape sequence. You can sometimes
+ get away with this (especially in simple table layouts), hence
+ the mere warning, but it can't be relied upon.
+
2024-09-26 G. Branden Robinson <[email protected]>
* src/preproc/tbl/table.cpp (table::add_entry): When erroring
diff --git a/src/preproc/tbl/table.cpp b/src/preproc/tbl/table.cpp
index ab6eea7ee..596a854b0 100644
--- a/src/preproc/tbl/table.cpp
+++ b/src/preproc/tbl/table.cpp
@@ -1,4 +1,4 @@
-/* Copyright (C) 1989-2023 Free Software Foundation, Inc.
+/* Copyright (C) 1989-2024 Free Software Foundation, Inc.
Written by James Clark ([email protected])
This file is part of groff.
@@ -1525,7 +1525,40 @@ void table::add_entry(int r, int c, const string &str,
allocate(r);
table_entry *e = 0 /* nullptr */;
int len = str.length();
+ // Diagnose escape sequences that can wreak havoc in generated output.
if (len > 1) {
+ const char *entryptr = str.contents();
+ // A comment on a control line or in a text block is okay.
+ const char *commentptr = strstr(entryptr, "\\\"");
+ if (commentptr != 0 /* nullptr */) {
+ const char *controlptr = strchr(entryptr, '.');
+ if ((controlptr == 0 /* nullptr */)
+ || (controlptr == entryptr)
+ || (strstr(entryptr, "\n") == 0 /* nullptr */))
+ warning_with_file_and_line(fn, ln, "table entry contains"
+ " comment escape sequence '\\\"'");
+ }
+ const char *gcommentptr = strstr(entryptr, "\\#");
+ // If both types of comment are present, the first is what matters.
+ if ((gcommentptr != 0 /* nullptr */)
+ && (gcommentptr < commentptr))
+ commentptr = gcommentptr;
+ if (commentptr != 0 /* nullptr */) {
+ const char *controlptr = strchr(entryptr, '.');
+ if ((controlptr == 0 /* nullptr */)
+ || (controlptr == entryptr)
+ || (strstr(entryptr, "\n") == 0 /* nullptr */))
+ warning_with_file_and_line(fn, ln, "table entry contains"
+ " comment escape sequence '\\#'");
+ }
+ // A \! escape sequence after a comment has started is okay.
+ const char *exclptr = strstr(str.contents(), "\\!");
+ if ((exclptr != 0 /* nullptr */)
+ && ((0 /* nullptr */ == commentptr)
+ || (exclptr < commentptr)))
+ warning_with_file_and_line(fn, ln, "table entry contains"
+ " transparent throughput escape"
+ " sequence '\\!'");
string last_two_chars = str.substring((len - 2), 2);
if ("\\z" == last_two_chars)
error_with_file_and_line(fn, ln, "table entry ends with"