URL: <https://savannah.gnu.org/bugs/?67133>
Summary: [groff] pipeline handling conceals fate of commands that die by SIGPIPE
Group: GNU roff
Submitter: gbranden
Submitted: Sat 17 May 2025 02:00:53 AM GMT
Category: Core
Severity: 4 - Important
Item Group: Incorrect behaviour
Status: In Progress
Privacy: Public
Assigned to: gbranden
Open/Closed: Open
Discussion Lock: Any
Planned Release: None
_______________________________________________________
Follow-up Comments:
-------------------------------------------------------
Date: Sat 17 May 2025 02:00:53 AM GMT By: G. Branden Robinson <gbranden>
This has been a super-annoying problem for a long time, and I believe it has
played a role in frustrating _grohtml_ development.
Now that I've got a series of commits ready it's easier to just quote them.
commit 21d5356b77b48a3e4ce2231698fec856603c2489
Author: G. Branden Robinson <[email protected]>
Date: Fri May 16 19:24:38 2025 -0500
[troff]: Add assertions (INSTALL FAILS).
* src/roff/troff/node.cpp (real_output_file::on, real_output_file::off):
Add `assert()`ions of complementary Boolean state before
unconditionally assigning the other. One of them fails, in a quiet
and underhanded way, when using grohtml to format "doc/pic.ms", which
scandalously doesn't provoke a build failure. No automated tests trip
it, either. A document in the wild could conceivably trip either. If
one does, we want to hear about it, preferably with a core file
generated by an unstripped troff executable.
I have a fairly hard rule about not pushing commits that break the
build. Technically this doesn't. groff _builds_ just fine. But if you
try to install it, it will fail:
.../share/doc/groff-1.23.0/html/img
/usr/bin/install: cannot stat '.../GIT/groff/build/doc/img/pic*': No such file or directory
...and sure enough, the pic-1.png and other image files to be inlined
into the generated HTML files aren't there.
Why aren't they there? Why didn't the build fail earlier? Why didn't
groff throw a fit when grohtml couldn't generate the desired output?
("Why aren't you over there stomping Private Pyle's guts out?!")
Two reasons.
1. src/roff/groff/pipeline.c appears to have a 33-year-old bug in it.
2. Our doc/doc.am script doesn't try hard enough to detect failure.
Over the past 8 years, I've on rare occasions seen mysterious failures
from grohtml, usually when formatting pic.ms, but they've always
neglected to leave behind evidence of the crime (like a core dump), and
worse, like in this case, usually weren't detectable if all you did was
build everything, as opposed to trying to install it.
Now at last I think I have my fingers around the throat of the problem.
In the next commits I'll address the two issues above.
ChangeLog | 12 ++++++++++++
src/roff/troff/node.cpp | 6 ++++--
2 files changed, 16 insertions(+), 2 deletions(-)
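The assertion pattern this commit describes can be sketched roughly as follows. This is an illustration only, not the code in "node.cpp": the `output_state` struct and the `output_on`/`output_off` functions are hypothetical stand-ins for `real_output_file::on` and `real_output_file::off`. Each function asserts that the flag holds the complementary value before assigning the new one.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for troff's output file object; only the
   on/off flag matters for this sketch. */
struct output_state {
  bool is_on;
};

/* Turn output on.  The assertion records the expectation that output
   was off beforehand.  If it fails, assert(3) calls abort(3), which
   raises SIGABRT. */
void output_on(struct output_state *s)
{
  assert(!s->is_on);
  s->is_on = true;
}

/* Turn output off, expecting that it was on beforehand. */
void output_off(struct output_state *s)
{
  assert(s->is_on);
  s->is_on = false;
}
```

Calling `output_on()` twice in a row without an intervening `output_off()` would trip the first assertion, which is the kind of unbalanced toggling the commit aims to expose.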
commit 74b8bf5a1a9d60c38cfa4160f328ac20497cf731
Author: G. Branden Robinson <[email protected]>
Date: Fri May 16 08:04:22 2025 -0500
Detect grohtml failure better (BUILD FAILS).
* doc/doc.am (doc/pic.html, doc/webpage.html): Create output as a
".tmp"-suffixed file at first, then check for the first of multiple
associated image files we know should exist, moving the target into
place only if it does.
Per the previous commit, now the build truly fails when "groff -Thtml"
does. But only because we're looking for files that should exist, but
don't. Why is groff exiting with status 0?
For that, we need another change.
ChangeLog | 9 +++++++++
doc/doc.am | 10 +++++++---
2 files changed, 16 insertions(+), 3 deletions(-)
commit 9a3dd212cfbaa6ab5238c4de10848e32991c6c13
Author: G. Branden Robinson <[email protected]>
Date: Fri May 16 19:49:33 2025 -0500
[groff]: Fail if piped cmd signaled (BUILD FAILS).
* src/roff/groff/pipeline.c (run_pipeline) [!_WIN32 && !__MSDOS__ &&
!_UWIN && !__CYGWIN__ && !__EMX__]: _Unconditionally_ set bit 2 of the
return value in the event of _any_ signal hitting a pipelined
process. A workaround that James Clark put in for a SunOS 4.1.1
X11-related bug in groff 1.06 (1992) appears to have led us to grief;
if any of the child processes in the pipeline being wait(2)ed on was
signaled with SIGPIPE, this alteration of the return value was not
being done. If that child was the last in the pipeline (gxditview,
the code presumes), the workaround walks the list of commands in the pipeline and
kill(2)s them all with SIGPIPE. If a process has multiple fatal
signals pending, which one wins? Apparently, on Linux 5.10, in a
fight between SIGPIPE and SIGABRT (raised by abort(3), called by
assert(3)), SIGPIPE always wins. So bit 2 of this function's return
value, which (after a left shift) ultimately becomes groff(1)'s exit
status, never got set, and so groff happily reported success when it
should have screamed hideously of failure. Likely the SunOS 4
workaround should be ripped out entirely, but this fix adequately
detects grohtml failures.
I experimentally instrumented "pipeline.c" a bit, which helped me track
down the issue. The following was as good as a smoking gun to me.
GBR: Unix run_pipeline()
GBR: PID 54503 caught signal 13
GBR: run_pipeline() returning 0
ChangeLog | 24 ++++++++++++++++++++++++
src/roff/groff/pipeline.c | 8 +++-----
2 files changed, 27 insertions(+), 5 deletions(-)
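The idea behind this fix can be sketched as below, assuming a POSIX system. This is not the code in "pipeline.c": the function names and the flag constants are hypothetical, and the value 2 for `CHILD_SIGNALED` is an assumption standing in for the "bit 2" the commit message mentions. The point is that a child terminated by any signal, SIGPIPE included, marks the return value as a failure.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical flag values, chosen only for this sketch. */
enum { CHILD_FAILED = 1, CHILD_SIGNALED = 2 };

/* Fork a child that kills itself with `sig`.  The child restores the
   default disposition and unblocks the signal first, so an inherited
   "ignore" or signal mask cannot keep it alive. */
pid_t spawn_signaled_child(int sig)
{
  pid_t pid = fork();
  if (pid == 0) {
    sigset_t set;
    signal(sig, SIG_DFL);
    sigemptyset(&set);
    sigaddset(&set, sig);
    sigprocmask(SIG_UNBLOCK, &set, NULL);
    raise(sig);
    _exit(0); /* reached only if the signal was not fatal */
  }
  return pid;
}

/* Wait for one child and fold its fate into a status word. */
int reap_child(pid_t pid)
{
  int status;
  int ret = 0;
  assert(pid > 0); /* fork() must have succeeded */
  if (waitpid(pid, &status, 0) < 0)
    return CHILD_FAILED;
  if (WIFSIGNALED(status))
    ret |= CHILD_SIGNALED; /* no exemption for SIGPIPE */
  else if (WIFEXITED(status) && WEXITSTATUS(status) != 0)
    ret |= CHILD_FAILED;
  return ret;
}
```

With this shape, `reap_child(spawn_signaled_child(SIGPIPE))` reports the signal instead of returning 0, which is the behavior the "returning 0" trace above shows was missing.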
commit 718cb68badf5fecadec1bcea7328c02767651a0b
Author: G. Branden Robinson <[email protected]>
Date: Fri May 16 20:44:59 2025 -0500
[troff]: Comment out assertion.
* src/roff/troff/node.cpp (real_output_file::on): Comment out assertion
that fails when formatting "pic.ms" as HTML.
ChangeLog | 5 +++++
src/roff/troff/node.cpp | 2 +-
2 files changed, 6 insertions(+), 1 deletion(-)
_______________________________________________________
Reply to this item at:
<https://savannah.gnu.org/bugs/?67133>
_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/
