URL:
  <https://savannah.gnu.org/bugs/?66242>

                 Summary: [troff] generalize `-a` option (as `-A fmt`?)
                   Group: GNU roff
               Submitter: gbranden
               Submitted: Sat 21 Sep 2024 10:57:57 PM UTC
                Category: Core
                Severity: 1 - Wish
              Item Group: None
                  Status: Postponed
                 Privacy: Public
             Assigned to: None
             Open/Closed: Open
         Discussion Lock: Any
         Planned Release: None


    _______________________________________________________

Follow-up Comments:


-------------------------------------------------------
Date: Sat 21 Sep 2024 10:57:57 PM UTC By: G. Branden Robinson <gbranden>
Origin: https://github.com/mquinson/po4a/issues/527

It's not a very well known fact that _groff_ supports output in more than one
format.  And I don't mean PostScript, PDF, or HTML--_groff_ handles all of
these the same, writing out a document in a page description language that
doesn't have a well accepted name but which I call "grout".  It's a descendant
of the _troff_ output format described by Kernighan in the Bell Labs CSTR
documents #97 and #54 (1992 revision), which I similarly call "trout".
Programs called _output drivers_, like DWB _troff_'s _dpost_, or _groff_'s
_grodvi_, _grolbp_, _grops_, _gropdf_, and _grotty_, translate that page
description language into another file format or byte stream that a (possibly
emulated) hardware device is prepared to consume.

But that's _not_ what I'm talking about.  As a language compiler, _groff_
builds lists of "nodes", very much like the abstract syntax tree that is
taught in computer science classes.  Since day one it has supported not one
but *two* output formats: _grout_, and "approximate output", which is what you
see when you run `groff -a` on a document.  For about 25 years it has also
supported "suppressed output", which is sort of a hack that was put in place
to surmount certain problems with HTML generation that need not concern us
here.  The important fact is that there is not a tight coupling between nodes
and their rendering.

Here's a description of `groff -a` output.  It closely follows a Unix _troff_
feature.


     -a       Generate a plain text approximation of the typeset output.
              The read‐only register .A is set to 1.  This option
              produces a sort of abstract preview of the formatted
              output.

              •  Page breaks are marked by a phrase in angle brackets;
                 for example, “<beginning of page>”.

              •  Lines are broken where they would be in formatted
                 output.

              •  Vertical motion, apart from that implied by a break, is
                 not represented.

              •  A horizontal motion of any size is represented as one
                 space.  Adjacent horizontal motions are not combined.
                 Supplemental inter‐sentence space (configured by the
                 second argument to the .ss request) is not represented.

              •  A special character is rendered as its identifier
                 between angle brackets; for example, a hyphen appears
                 as “<hy>”.

              The above description should not be considered a
              specification; the details of -a output are subject to
              change.


And here's an example of what that looks like:


$ nroff -a -man ~/ncurses-HEAD/share/man/man3/beep.3ncurses 
<beginning of page>
beep(3NCURSES) Library calls beep(3NCURSES)
NAME 
 beep, flash <-> ring the (visual) bell of the terminal with curses
SYNOPSIS 
 #include <ncursesw/curses.h>
 int beep(void);
 int flash(void);
DESCRIPTION 
 beep and flash alert the terminal user: the former by sounding the termi<hy>
 nal's audible alarm, and the latter by visibly attracting attention. Com<hy>
 monly, a terminal implements a visual bell by momentarily reversing the
 character foreground and background colors on the entire display; even a
 monochrome device can do this. These functions each attempt the other
 alert type if the one requested is unavailable. If neither is available,
 curses performs no action. Nearly all terminals have an audible alert
 mechanism such as a bell or piezoelectric buzzer, but only some can flash
 the screen.
RETURN VALUE 
 These functions return OK on success and ERR on failure.
 In ncurses, beep and flash return OK if the terminal type supports the cor<hy>
 responding capability: bell (bel) for beep and flash_screen (flash) for
 flash. Otherwise they return ERR.
EXTENSIONS 
 In ncurses, these functions can return ERR.
PORTABILITY 
 X/Open Curses, Issue 4 describes these functions. It specifies no error
 conditions for them.
 On SVr4 curses, they always return OK, and X/Open Curses specifies them as
 doing so.
HISTORY 
 beep and flash appeared in SVr2 (1984).
SEE ALSO 
 ncurses(3NCURSES), terminfo(5)
ncurses 6.5 2024-07-20 beep(3NCURSES)
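
To make the rules in the `-a` description concrete, here is a toy model in
Python.  It is only a sketch: the node classes (`Glyph`, `Special`, `HMotion`,
`VMotion`, `Break`, `PageStart`) and the `approximate` function are names
invented for illustration and do not correspond to anything in the _groff_
sources, which are C++.

```python
from dataclasses import dataclass

# Hypothetical node types; groff's real node classes differ.
@dataclass
class Glyph:
    char: str        # an ordinary character

@dataclass
class Special:
    ident: str       # special character identifier, e.g. "hy"

@dataclass
class HMotion:
    width: int       # horizontal motion of any size

@dataclass
class VMotion:
    height: int      # vertical motion not implied by a break

@dataclass
class Break:
    pass

@dataclass
class PageStart:
    pass

def glyphs(text):
    """Turn a string into a list of ordinary glyph nodes."""
    return [Glyph(c) for c in text]

def approximate(nodes):
    """Render a node list following the rules quoted for -a."""
    out = []
    for n in nodes:
        if isinstance(n, Glyph):
            out.append(n.char)
        elif isinstance(n, Special):
            out.append(f"<{n.ident}>")          # identifier in angle brackets
        elif isinstance(n, HMotion):
            out.append(" ")                      # one space each, never combined
        elif isinstance(n, Break):
            out.append("\n")
        elif isinstance(n, PageStart):
            out.append("<beginning of page>\n")
        # VMotion (and anything unknown) is not represented.
    return "".join(out)

sample = (
    [PageStart()]
    + glyphs("termi") + [Special("hy"), Break()]
    + glyphs("nal") + [HMotion(24), HMotion(12)] + glyphs("ok")
    + [VMotion(40), Break()]
)
approx_text = approximate(sample)
```

With this input, `approx_text` is `"<beginning of page>\ntermi<hy>\nnal  ok\n"`:
the two adjacent horizontal motions stay two spaces, and the vertical motion
leaves no trace.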


You may anticipate where I'm going with this.

One could write a "pod emitter" output class.  Like "approximate" (or "ascii"
[_sic_]) output, its `tprint` member functions for node types that it didn't
support (couldn't represent) would be empty.  But one thing a node _does_ know
is which font is selected to write the current glyph.

It also knows where the sentence boundaries are (assuming the input was not
written to conceal this information), so it could start a new output line when
encountering one.
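
As a rough illustration of the idea, and not a proposal for the actual C++
interface, a "pod emitter" over a toy node list might look like the Python
sketch below.  Everything in it is an assumption made for the example: the
`Glyph`, `Space`, and `VMotion` classes, the `font` and `end_of_sentence`
attributes, and the `emit_pod` function.  Escaping of `<` and `>` inside pod
formatting codes is deliberately left out.

```python
from dataclasses import dataclass

# Hypothetical node types; each glyph remembers its selected font.
@dataclass
class Glyph:
    char: str
    font: str = "R"              # "R" (roman), "B" (bold), or "I" (italic)

@dataclass
class Space:
    end_of_sentence: bool = False

@dataclass
class VMotion:
    height: int                  # a node type the emitter cannot represent

POD_CODE = {"B": "B", "I": "I"}  # fonts that map to a pod formatting code

def word(text, font="R"):
    """Turn a string into glyph nodes that share one font."""
    return [Glyph(c, font) for c in text]

def emit_pod(nodes):
    """Emit pod text, using the font recorded on each glyph node."""
    out = []
    current = "R"

    def switch(font):
        nonlocal current
        if font == current:
            return
        if current in POD_CODE:
            out.append(">")                      # close the open code
        if font in POD_CODE:
            out.append(POD_CODE[font] + "<")     # open a new one
        current = font

    for n in nodes:
        if isinstance(n, Glyph):
            switch(n.font)
            out.append(n.char)
        elif isinstance(n, Space):
            switch("R")                          # do not span spaces
            # A sentence boundary starts a new output line.
            out.append("\n" if n.end_of_sentence else " ")
        # Unsupported node types are skipped, like an empty member function.
    switch("R")
    return "".join(out)

sample = (
    word("beep", "B") + [Space()] + word("and") + [Space()]
    + word("flash", "B") + [Space()] + word("alert.")
    + [Space(end_of_sentence=True), VMotion(40)] + word("Done.")
)
pod_text = emit_pod(sample)
```

Here `pod_text` comes out as `"B<beep> and B<flash> alert.\nDone."`.  No
guessing about font changes is involved; the emitter only reads what each
node already records.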

So why parse _man_ or try to guess where the font face changes are when you
could have _groff_ *tell* you, with perfect knowledge?

Just wanted to put that idea out there.  And it would be another motivator for
superseding `-a` with an argument-taking option, `-A` I would think, that
would give us flexibility to support several such output formats in the
future.


