URL: <https://savannah.gnu.org/bugs/?66242>
Summary: [troff] generalize `-a` option (as `-A fmt`?)
Group: GNU roff
Submitter: gbranden
Submitted: Sat 21 Sep 2024 10:57:57 PM UTC
Category: Core
Severity: 1 - Wish
Item Group: None
Status: Postponed
Privacy: Public
Assigned to: None
Open/Closed: Open
Discussion Lock: Any
Planned Release: None

_______________________________________________________

Follow-up Comments:

-------------------------------------------------------
Date: Sat 21 Sep 2024 10:57:57 PM UTC By: G. Branden Robinson <gbranden>
Origin: https://github.com/mquinson/po4a/issues/527

It's not a very well-known fact that _groff_ supports output in more
than one format.

And I don't mean PostScript, PDF, or HTML--_groff_ handles all of these
the same way, writing out a document in a page description language that
doesn't have a well-accepted name but which I call "grout". It's a
descendant of the _troff_ output format described by Kernighan in the
Bell Labs CSTR documents #97 and #54 (1992 revision), which I similarly
call "trout". Programs called _output drivers_, like DWB _troff_'s
_dpost_, or _groff_'s _grodvi_, _grolbp_, _grops_, _gropdf_, and
_grotty_, translate that page description language into another file
format or byte stream that a (possibly emulated) hardware device is
prepared to consume.

But that's _not_ what I'm talking about.

As a language compiler, _groff_ builds lists of "nodes", very much like
the abstract syntax tree that is taught in computer science classes.

Since day one it has supported not one but *two* output formats:
_grout_, and "approximate output", which is what you see when you run
`groff -a` on a document. For about 25 years it has also supported
"suppressed output", which is sort of a hack that was put in place to
surmount certain problems with HTML generation that need not concern us
here. The important fact is that there is not a tight coupling between
nodes and their rendering.

Here's a description of `groff -a` output. It closely follows a Unix
_troff_ feature.

    -a     Generate a plain text approximation of the typeset output.
           The read‐only register .A is set to 1. This option produces
           a sort of abstract preview of the formatted output.

           •  Page breaks are marked by a phrase in angle brackets; for
              example, “<beginning of page>”.

           •  Lines are broken where they would be in formatted output.

           •  Vertical motion, apart from that implied by a break, is
              not represented.

           •  A horizontal motion of any size is represented as one
              space. Adjacent horizontal motions are not combined.
              Supplemental inter‐sentence space (configured by the
              second argument to the .ss request) is not represented.

           •  A special character is rendered as its identifier between
              angle brackets; for example, a hyphen appears as “<hy>”.

           The above description should not be considered a
           specification; the details of -a output are subject to
           change.

And here's an example of what that looks like:

    $ nroff -a -man ~/ncurses-HEAD/share/man/man3/beep.3ncurses
    <beginning of page>
    beep(3NCURSES) Library calls beep(3NCURSES)
    NAME
    beep, flash <-> ring the (visual) bell of the terminal with curses
    SYNOPSIS
    #include <ncursesw/curses.h>
    int beep(void);
    int flash(void);
    DESCRIPTION
    beep and flash alert the terminal user: the former by sounding the termi<hy>
    nal's audible alarm, and the latter by visibly attracting attention. Com<hy>
    monly, a terminal implements a visual bell by momentarily reversing the character foreground and background colors on the entire display; even a monochrome device can do this. These functions each attempt the other alert type if the one requested is unavailable. If neither is available, curses performs no action. Nearly all terminals have an audible alert mechanism such as a bell or piezoelectric buzzer, but only some can flash the screen.
    RETURN VALUE
    These functions return OK on success and ERR on failure. In ncurses, beep and flash return OK if the terminal type supports the cor<hy>
    responding capability: bell (bel) for beep and flash_screen (flash) for flash. Otherwise they return ERR.
    EXTENSIONS
    In ncurses, these functions can return ERR.
    PORTABILITY
    X/Open Curses, Issue 4 describes these functions. It specifies no error conditions for them. On SVr4 curses, they always return OK, and X/Open Curses specifies them as doing so.
    HISTORY
    beep and flash appeared in SVr2 (1984).
    SEE ALSO
    ncurses(3NCURSES), terminfo(5)
    ncurses 6.5 2024-07-20 beep(3NCURSES)

You may anticipate where I'm going with this.

One could write a "pod emitter" output class. Like "approximate" (or
"ascii" [_sic_]) output, its `tprint` member functions for node types
that it didn't support (couldn't represent) would be empty. But one
thing a node _does_ know is which font is selected to write the current
glyph. It also knows where the sentence boundaries are (assuming the
input was not written to conceal this information), so it could start a
new output line when encountering one.

So why parse _man_ or try to guess where the font face changes are when
you could have _groff_ *tell* you, with perfect knowledge?

Just wanted to put that idea out there.

And it would be another motivator for superseding `-a` with an
argument-taking option, `-A` I would think, that would give us
flexibility to support several such output formats in the future.

_______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?66242>

_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/