Re: Groff examples repository

G. Branden Robinson Sun, 29 Aug 2021 02:37:34 -0700

At 2021-08-28T12:14:17-0400, Douglas McIlroy wrote:
> -----------------------------------------------------------------
> 
> A small anomaly. Consider
> 
>     .de .
>     .tm Hi
>     ..
>     ..
> 
> The second .. emits "Hi". This fragment also emits "Hi":
> 
>     .de end end
>     .tm Hi
>     .end
> 
> But this (with macro . not previously defined)
> does not:
> 
>     .de . .
>     .tm Hi
>     ..


I used the same example last week to update the "Writing Macros" section
of our Texinfo manual.  The concept of "copy mode" confused me deeply
when I first arrived in the groff community and finally I realized that
the statement, used repeatedly about escapes in our documentation, that
certain ones are "interpreted in copy mode" makes a lot more sense to
people who already know what copy mode is than to those who don't.
Recasting ensued.

I have added some of this material to groff(7) as well, albeit in
aggressively condensed form and with no examples.

It may go without saying, but I have some minor revisions to the below
queued already.  Nevertheless I think in its present state it renders
the subject more accessible.

*** SNIP ***
5.21 Writing Macros
===================

A "macro" is a stored collection of text and control lines that can be
used multiple times.  Use macros to define common operations.  *Note
Strings::, for a (limited) alternative syntax to call macros.  While
requests exist for the purpose of creating macros, simply calling an
undefined macro, or interpolating it as a string, will cause it to be
defined as empty.  *Note Identifiers::.

 -- Request: .de name [end]
     Define a macro NAME, replacing the definition of any existing
     request, macro, string, or diversion called NAME.  GNU 'troff'
     enters "copy mode", storing subsequent input lines in an internal
     buffer.  If the optional second argument is not specified, the
     macro definition ends with the control line '..' (two dots).
     Alternatively, END identifies a macro whose call syntax ends the
     definition of NAME; END is then called normally.  Spaces or tabs
     are permitted after the control character in the line containing
     this ending token (either '.' or 'END'), but a tab immediately
     after the token prevents its recognition as the end of a macro
     definition.(1)  (*note Writing Macros-Footnote-1::)

     Here is a small example macro called 'P' that causes a break and
     inserts some vertical space.  It could be used to separate
     paragraphs.

          .de P
          .  br
          .  sp .8v
          ..

     We can define one macro within another.  Attempting to nest '..'
     naïvely will end the outer definition because the inner definition
     isn't interpreted as such until the outer macro is later
     interpolated.  We can use an end macro instead.  Each level of
     nesting should use a unique end macro.

     An end macro need not be defined until it is called.  This fact
     enables a nested macro definition to begin inside one macro and end
     inside another.  Consider the following example.(2)  (*note Writing
     Macros-Footnote-2::)

          .de m1
          .  de m2 m3
          you
          ..
          .de m3
          Hello,
          Joe.
          ..
          .de m4
          do
          ..
          .m1
          know?
          .  m3
          What
          .m4
          .m2
              => Hello, Joe.  What do you know?

     A nested macro definition _can_ be terminated with '..', and nested
     macros _can_ reuse end macros, but these control lines must be
     escaped multiple times for each level of nesting.  The necessity of
     this escaping and the utility of nested macro definitions will
     become clearer when we employ macro parameters and consider the
     behavior of copy mode in detail.

 -- Request: .de1 name [end]
     The 'de1' request defines a macro that executes with compatibility
     mode disabled (*note Implementation Differences::).  On entry, the
     state of compatibility mode enablement is saved, and on exit it is
     restored.  Observe the extra backslash before the interpolation of
     register 'xxx'; we'll explore this subject in *note Copy Mode::.

          .nr xxx 12345
          .de aa
          The value of xxx is \\n[xxx].
          .  br
          ..
          .de1 bb
          The value of xxx is \\n[xxx].
          ..
          .cp 1
          .aa
              error-> warning: register '[' not defined
              => The value of xxx is 0xxx].
          .bb
              => The value of xxx is 12345.

 -- Request: .dei name [end]
 -- Request: .dei1 name [end]
     The 'dei' request defines a macro indirectly.  That is, it
     interpolates strings named NAME and END before performing the
     definition.

     The following examples are equivalent.

          .ds xx aa
          .ds yy bb
          .dei xx yy

          .de aa bb

     The 'dei1' request is similar to 'dei', but with compatibility mode
     switched off during execution of the defined macro.

     If compatibility mode is on, 'de' and 'dei' behave similarly to
     'de1' and 'dei1', respectively: a "compatibility save" token is
     inserted at the beginning, and a "compatibility restore" token at
     the end, with compatibility mode switched on during execution.

 -- Request: .am name [end]
 -- Request: .am1 name [end]
 -- Request: .ami name [end]
 -- Request: .ami1 name [end]
     'am' appends subsequent input lines to macro NAME, extending its
     definition, and otherwise working as 'de' does.

     To make the previously defined 'P' macro set indented instead of
     block paragraphs, add the necessary code to the existing macro.

          .am P
          .ti +5n
          ..

     The other requests are analogous to their 'de' counterparts.  The
     'am1' request turns off compatibility mode while executing the
     appendment: a "compatibility save" input token is inserted before
     the appended input lines, and a "compatibility restore" input token
     after the end.  The 'ami' request appends indirectly, meaning that
     strings NAME and END are interpolated with the resulting names used
     before appending.  The 'ami1' request is similar to 'ami',
     disabling compatibility mode during interpretation of the appended
     lines.

   Using 'trace.tmac', you can trace calls to 'de', 'de1', 'am', and
'am1'.  You can also use the 'backtrace' request at any point desired to
troubleshoot tricky spots (*note Debugging::).

   *Note Strings::, for the 'als', 'rm', and 'rn' requests to create an
alias of, remove, and rename a macro, respectively.

   Macro identifiers share their name space with requests, strings, and
diversions; *note Identifiers::.  The 'am', 'as', 'da', 'de', 'di', and
'ds' requests (together with their variants) only create a new object if
the name of the macro, diversion, or string is currently undefined or if
it is defined as a request; normally, they modify the value of an
existing object.  *Note the description of the 'als' request: als, for
pitfalls when redefining a macro that is aliased.

 -- Request: .return [anything]
     Exit a macro, immediately returning to the caller.  If called with
     an argument ANYTHING, exit twice--the current macro and the macro
     one level higher.  This is used to define a wrapper macro for
     'return' in 'trace.tmac'.
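
     As a minimal sketch, a macro can use 'return' to bail out early
     when it is called without arguments.  The macro name 'greet' is
     hypothetical and chosen only for illustration.

          .\" 'greet' is an illustrative name, not part of groff.
          .de greet
          .  if !\\n[.$] .return
          .  tm Hello, \\$1!
          ..
          .greet
          .greet world
              error-> Hello, world!

     The first call produces no diagnostic because the macro returns
     before reaching the 'tm' request.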

   (1) While it is possible to define and call a macro '.', you can't
use it as an end macro: during a macro definition, '..' is never handled
as calling '.', even if '.de NAME .' explicitly precedes it.

   (2) This example is adapted from, and structurally isomorphic to,
part of a solution by Tadziu Hoffmann to the problem of
reflowing text multiple times to find an optimal configuration for it.
<https://lists.gnu.org/archive/html/groff/2008-12/msg00006.html>

5.21.1 Parameters
-----------------

Macro calls and string interpolations optionally accept a list of
arguments; recall *note Request and Macro Arguments::.  At the time such
an interpolation takes place, these "parameters" can be examined using a
register and a variety of escape sequences starting with '\$'.

 -- Register: \n[.$]
     The count of parameters available to a macro or string is kept in
     this read-only register.  The 'shift' request can change its value.

   Any individual parameter can be accessed by its position in the list
of arguments to the macro call, numbered from left to right starting at
1, with one of the following escapes.

 -- Escape: \$n
 -- Escape: \$(nn
 -- Escape: \$[nnn]
     Interpolate the Nth, NNth, or NNNth parameter.  The first form only
     accepts a single digit (1<=N<=9), the second two digits
     (01<=NN<=99), and the third any positive integer NNN.  Macros and
     strings can accept an unlimited number of parameters.
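
     The following sketch, which assumes compatibility mode is off,
     contrasts the single-digit and bracketed forms.  The macro name
     'tenth' is hypothetical.  Because '\$n' accepts only one digit,
     '\$10' is read as '\$1' followed by an ordinary '0'.

          .\" 'tenth' is an illustrative name, not part of groff.
          .de tenth
          .  tm $1='\\$1' $[10]='\\$[10]' $10='\\$10'
          ..
          .tenth a b c d e f g h i j
              error-> $1='a' $[10]='j' $10='a0'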

 -- Request: .shift [n]
     Shift the parameters N places (1 by default).  This is a "left
     shift": what was parameter I becomes parameter I-N and so on.  The
     parameters formerly in positions 1 to N are no longer available.
     Shifting by a nonpositive amount performs no operation.  The
     register '.$' is adjusted accordingly.
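
     A brief sketch of the effect, using a hypothetical macro 'peel'
     that reports its parameter count and first parameter before and
     after a shift:

          .\" 'peel' is an illustrative name, not part of groff.
          .de peel
          .  tm count=\\n[.$] first=\\$1
          .  shift
          .  tm count=\\n[.$] first=\\$1
          ..
          .peel a b c
              error-> count=3 first=a
              error-> count=2 first=b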

   In practice, parameter interpolations are usually seen prefixed with
an extra escape character.  This is because the '\$' family of escape
sequences is interpreted even in copy mode.(1)  (*note
Parameters-Footnote-1::)

 -- Escape: \$*
 -- Escape: \$@
 -- Escape: \$^
     In some cases it is convenient to interpolate all of the parameters
     at once (to pass them to a request, for instance).  The '\$*'
     escape concatenates the parameters, separating them with spaces.
     '\$@' is similar, concatenating the parameters, surrounding each by
     double quotes and separating them with spaces.  If not in
     compatibility mode, the interpolation depth of double quotes is
     preserved (*note Request and Macro Arguments::).  '\$^'
     interpolates all parameters as if they were arguments to the 'ds'
     request.

          .de foo
          . tm $1='\\$1'
          . tm $2='\\$2'
          . tm $*='\\$*'
          . tm $@='\\$@'
          . tm $^='\\$^'
          ..
          .foo " This is a "test"
              error-> $1=' This is a '
              error-> $2='test"'
              error-> $*=' This is a  test"'
              error-> $@='" This is a " "test""'
              error-> $^='" This is a "test"'

     '\$*' is useful when writing a macro that doesn't need to
     distinguish its arguments, or even to not interpret them; examples
     include macros that produce diagnostic messages by wrapping the
     'tm' or 'ab' requests.  Use '\$@' when writing a macro that may
     need to shift its parameters and/or wrap a macro or request that
     finds the count significant.  If in doubt, prefer '\$@' to '\$*'.
     An application of '\$^' is seen in 'trace.tmac', which redefines
     some requests and macros for debugging purposes.
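
     To illustrate why the count can matter, here is a sketch with
     three hypothetical macros: 'nargs' reports how many parameters it
     received, while 'viastar' and 'viaat' forward their own parameters
     to it with '\$*' and '\$@', respectively.

          .\" 'nargs', 'viastar', and 'viaat' are illustrative names.
          .de nargs
          .  tm \\n[.$]
          ..
          .de viastar
          .  nargs \\$*
          ..
          .de viaat
          .  nargs \\$@
          ..
          .viastar "a b" c
              error-> 3
          .viaat "a b" c
              error-> 2

     With '\$*', the space inside the first argument becomes an
     argument separator on the way through; '\$@' requotes each
     parameter, so the count survives.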

 -- Escape: \$0
     Interpolate the name by which the executing macro was called.  The
     'als' request can cause a macro to have more than one name.
     Applying string interpolation to a macro does not change this name.

          .de foo
          .  tm \\$0
          ..
          .als bar foo
          .
          .de aaa
          .  foo
          ..
          .de bbb
          .  bar
          ..
          .de ccc
          \\*[foo]\\
          ..
          .de ddd
          \\*[bar]\\
          ..
          .
          .aaa
              error-> foo
          .bbb
              error-> bar
          .ccc
              error-> ccc
          .ddd
              error-> ddd

   (1) If they were not, parameter interpolations would be similar to
command-line parameters--fixed for the entire duration of a 'roff'
program's run.  The advantage of interpolating '\$' escape sequences
even in copy mode is that they can change from one interpolation to the
next, like function parameters.  The additional escape character is the
price of this power.

5.21.2 Copy Mode
----------------

When GNU 'troff' processes certain requests, most importantly those
which define or append to a macro or string, it does so in "copy mode":
it copies the characters of the definition into a dedicated storage
region, interpolating the escape sequences '\n', '\g', '\$', '\*', and
'\V' normally; interpreting '\<RET>' immediately; discarding comments
'\"' and '\#'; interpolating the current leader, escape, or tab
character with '\a', '\e', and '\t', respectively; and storing all other
escape sequences in an encoded form.

   The complement of copy mode--a 'roff' formatter's behavior when not
defining or appending to a macro, string, or diversion--where all macros
are interpolated, requests invoked, and valid escape sequences processed
immediately upon recognition, can be termed "interpretation mode".

 -- Escape: \\
     The escape character, '\' by default, escapes itself.  Thus you can
     control whether a given '\n', '\g', '\$', '\*', or '\V' escape
     sequence is interpreted at the time the macro containing it is
     defined, or later when the macro is called.(1)  (*note Copy
     Mode-Footnote-1::)

          .nr x 20
          .de y
          .nr x 10
          \&\nx
          \&\\nx
          ..
          .y
              => 20 10

     You can think of '\\' as a "delayed" backslash; it is the escape
     character followed by a backslash from which the escape character
     has removed its special meaning.  Consequently, '\\' is not an
     escape sequence in the usual sense.  In any escape sequence '\X'
     that GNU 'troff' does not recognize, the escape character is
     ignored and X is output, with two exceptions--'\\' is the first.

 -- Escape: \.
     '\.' escapes the control character.  It is similar to '\\' in that
     it isn't a true escape sequence.  It is used to permit nested macro
     definitions to end without a named macro call to conclude them.
     Without a syntax for escaping the control character, this would not
     be possible.

          .de m1
          foo
          .
          .  de m2
          bar
          \\..
          .
          ..
          .m1
          .m2
              => foo bar

     The first backslash is consumed while the macro is read, and the
     second is interpreted while executing macro 'm1'.

   'roff' documents should not use the '\\' or '\.' tokens outside of
copy mode; they serve only to obfuscate the input.  Use '\e' to obtain
the escape character, '\[rs]' to obtain a backslash glyph, and '\&'
before '.' and ''' where GNU 'troff' expects them as control characters
if you mean to use them literally (recall *note Requests and Macros::).

   Macro definitions can be nested to arbitrary depth.  The mechanics of
parsing the escape character have significant consequences for this
practice.

     .de M1
     \\$1
     .  de M2
     \\\\$1
     .    de M3
     \\\\\\\\$1
     \\\\..
     .    M3 hand.
     \\..
     .  M2 of
     ..
     This understeer is getting
     .M1 out
         => This understeer is getting out of hand.

   Each escape character is interpreted twice--once in copy mode, when
the macro is defined, and once in interpretation mode, when it is
executed.  As seen above, this fact leads to exponential growth in the
number of escape characters required to delay interpolation of '\n',
'\g', '\$', '\*', and '\V' at each nesting level, which can be daunting.
GNU 'troff' offers a solution.

 -- Escape: \E
     '\E' represents an escape character that is not interpreted in copy
     mode.  You can use it to ease the writing of nested macro
     definitions.

          .de M1
          .  nop \E$1
          .  de M2
          .    nop \E$1
          .    de M3
          .      nop \E$1
          \\\\..
          .    M3 better.
          \\..
          .  M2 bit
          ..
          This vehicle handles
          .M1 a
              => This vehicle handles a bit better.

     Observe that because '\.' is not a true escape sequence, we can't
     use '\E' to keep '..' from ending a macro definition prematurely.
     If the multiplicity of backslashes complicates maintenance, use end
     macros.

     '\E' is also convenient to define strings that contain escape
     sequences that need to work when used in copy mode (for example, as
     macro arguments).  We might define strings to begin and end
     superscripting as follows.(2)  (*note Copy Mode-Footnote-2::)

          .ds { \v'-.9m\s'\En[.s]*7u/10u'+.7m'
          .ds } \v'-.7m\s0+.9m'

     When the 'ec' request is used to redefine the escape character,
     '\E' also makes it easier to distinguish the semantics of an escape
     character from the other meaning(s) its character might have.
     Consider the use of an unusual escape character, '-'.

          .nr a 1
          .ec -
          .de xx
          --na
          ..
          .xx
              => -na

     This result may surprise you; some people expect '1' to be output
     since register 'a' has clearly been defined with that value.  What
     has happened?  The robotic replacement of '\' with '-' has led us
     astray.  As mentioned above, the leading escape character makes the
     following character ordinary.  Written with the default escape
     character, the sequence '--' becomes '\-', which you may recognize
     as the special character escape sequence for the minus sign glyph.
     Since the escape character followed by itself is a valid escape
     sequence, only '\E' yields the expected result.

          .nr a 1
          .ec -
          .de xx
          -Ena
          ..
          .xx
              => 1

   (1) Compare this to the '\def' and '\edef' commands in TeX.

   (2) These are lightly adapted from the 'groff' implementation of the
'ms' macros.
*** SNIP ***

Regards,
Branden
