At 2021-08-28T12:14:17-0400, Douglas McIlroy wrote:
> -----------------------------------------------------------------
>
> A small anomaly. Consider
>
> .de .
> .tm Hi
> ,..
> ..
>
> The second .. emits "Hi". This fragment also emits "Hi":
>
> .de end end
> .tm Hi
> .end
>
> But this (with macro . not previously defined)
> does not:
>
> .de . .
> .tm Hi
> ..
I used the same example last week to update the "Writing Macros" section
of our Texinfo manual.

The concept of "copy mode" confused me deeply when I first arrived in
the groff community and finally I realized that the statement, used
repeatedly about escapes in our documentation, that certain ones are
"interpreted in copy mode" makes a lot more sense to people who already
know what copy mode is than those who don't. Recasting ensued.

I have added some of this material to groff(7) as well, albeit in
aggressively condensed form and with no examples.

It may go without saying, but I have some minor revisions to the below
queued already. Nevertheless I think in its present state it renders the
subject more accessible.

*** SNIP ***
5.21 Writing Macros
===================

A "macro" is a stored collection of text and control lines that can be
used multiple times. Use macros to define common operations. *Note
Strings::, for a (limited) alternative syntax to call macros. While
requests exist for the purpose of creating macros, simply calling an
undefined macro, or interpolating it as a string, will cause it to be
defined as empty. *Note Identifiers::.

 -- Request: .de name [end]
     Define a macro NAME, replacing the definition of any existing
     request, macro, string, or diversion called NAME. GNU 'troff'
     enters "copy mode", storing subsequent input lines in an internal
     buffer. If the optional second argument is not specified, the
     macro definition ends with the control line '..' (two dots).
     Alternatively, END identifies a macro whose call syntax ends the
     definition of NAME; END is then called normally. Spaces or tabs
     are permitted after the control character in the line containing
     this ending token (either '.' or 'END'), but a tab immediately
     after the token prevents its recognition as the end of a macro
     definition.(1) (*note Writing Macros-Footnote-1::)

     Here is a small example macro called 'P' that causes a break and
     inserts some vertical space.
     It could be used to separate paragraphs.

          .de P
          .  br
          .  sp .8v
          ..

     We can define one macro within another. Attempting to nest '..'
     naïvely will end the outer definition because the inner definition
     isn't interpreted as such until the outer macro is later
     interpolated. We can use an end macro instead. Each level of
     nesting should use a unique end macro.

     An end macro need not be defined until it is called. This fact
     enables a nested macro definition to begin inside one macro and end
     inside another. Consider the following example.(2) (*note Writing
     Macros-Footnote-2::)

          .de m1
          .  de m2 m3
          you
          ..
          .de m3
          Hello, Joe.
          ..
          .de m4
          do
          ..
          .m1
          know?
          .  m3
          What
          .m4
          .m2
              => Hello, Joe. What do you know?

     A nested macro definition _can_ be terminated with '..', and nested
     macros _can_ reuse end macros, but these control lines must be
     escaped multiple times for each level of nesting. The necessity of
     this escaping and the utility of nested macro definitions will
     become clearer when we employ macro parameters and consider the
     behavior of copy mode in detail.

 -- Request: .de1 name [end]
     The 'de1' request defines a macro that executes with compatibility
     mode disabled (*note Implementation Differences::). On entry, the
     state of compatibility mode enablement is saved, and on exit it is
     restored. Observe the extra backslash before the interpolation of
     register 'xxx'; we'll explore this subject in *note Copy Mode::.

          .nr xxx 12345
          .de aa
          The value of xxx is \\n[xxx].
          .  br
          ..
          .de1 bb
          The value of xxx is \\n[xxx].
          ..
          .cp 1
          .aa
              error-> warning: register '[' not defined
              => The value of xxx is 0xxx].
          .bb
              => The value of xxx is 12345.

 -- Request: .dei name [end]
 -- Request: .dei1 name [end]
     The 'dei' request defines a macro indirectly. That is, it
     interpolates strings named NAME and END before performing the
     definition.

     The following examples are equivalent.

          .ds xx aa
          .ds yy bb
          .dei xx yy

          .de aa bb

     The 'dei1' request is similar to 'dei', but with compatibility mode
     switched off during execution of the defined macro.

     If compatibility mode is on, 'de' and 'dei' behave similarly to
     'de1' and 'dei1', respectively: a "compatibility save" token is
     inserted at the beginning, and a "compatibility restore" token at
     the end, with compatibility mode switched on during execution.

 -- Request: .am name [end]
 -- Request: .am1 name [end]
 -- Request: .ami name [end]
 -- Request: .ami1 name [end]
     'am' appends subsequent input lines to macro NAME, extending its
     definition, and otherwise working as 'de' does.

     To make the previously defined 'P' macro set indented instead of
     block paragraphs, add the necessary code to the existing macro.

          .am P
          .ti +5n
          ..

     The other requests are analogous to their 'de' counterparts.

     The 'am1' request turns off compatibility mode while executing the
     appendment: a "compatibility save" input token is inserted before
     the appended input lines, and a "compatibility restore" input token
     after the end.

     The 'ami' request appends indirectly, meaning that strings NAME and
     END are interpolated with the resulting names used before
     appending.

     The 'ami1' request is similar to 'ami', disabling compatibility
     mode during interpretation of the appended lines.

Using 'trace.tmac', you can trace calls to 'de', 'de1', 'am', and
'am1'. You can also use the 'backtrace' request at any point desired to
troubleshoot tricky spots (*note Debugging::).

*Note Strings::, for the 'als', 'rm', and 'rn' requests to create an
alias of, remove, and rename a macro, respectively.

Macro identifiers share their name space with requests, strings, and
diversions; *note Identifiers::. The 'am', 'as', 'da', 'de', 'di', and
'ds' requests (together with their variants) only create a new object if
the name of the macro, diversion, or string is currently undefined or if
it is defined as a request; normally, they modify the value of an
existing object.
*Note the description of the 'als' request: als, for pitfalls when
redefining a macro that is aliased.

 -- Request: .return [anything]
     Exit a macro, immediately returning to the caller.

     If called with an argument ANYTHING, exit twice--the current macro
     and the macro one level higher. This is used to define a wrapper
     macro for 'return' in 'trace.tmac'.

(1) While it is possible to define and call a macro '.', you can't
use it as an end macro: during a macro definition, '..' is never handled
as calling '.', even if '.de NAME .' explicitly precedes it.

(2) This example is adapted from, and structurally isomorphic to, part
of a solution by Tadziu Hoffman to the problem of reflowing text
multiple times to find an optimal configuration for it.
<https://lists.gnu.org/archive/html/groff/2008-12/msg00006.html>

5.21.1 Parameters
-----------------

Macro calls and string interpolations optionally accept a list of
arguments; recall *note Request and Macro Arguments::. At the time such
an interpolation takes place, these "parameters" can be examined using a
register and a variety of escape sequences starting with '\$'.

 -- Register: \n[.$]
     The count of parameters available to a macro or string is kept in
     this read-only register. The 'shift' request can change its value.

Any individual parameter can be accessed by its position in the list of
arguments to the macro call, numbered from left to right starting at 1,
with one of the following escapes.

 -- Escape: \$n
 -- Escape: \$(nn
 -- Escape: \$[nnn]
     Interpolate the Nth, NNth, or NNNth parameter. The first form only
     accepts a single digit (1<=N<=9), the second two digits
     (01<=NN<=99), and the third any positive integer NNN. Macros and
     strings can accept an unlimited number of parameters.

 -- Request: .shift [n]
     Shift the parameters N places (1 by default). This is a "left
     shift": what was parameter I becomes parameter I-N and so on. The
     parameters formerly in positions 1 to N are no longer available.
     Shifting by a nonpositive amount performs no operation. The
     register '.$' is adjusted accordingly.

In practice, parameter interpolations are usually seen prefixed with an
extra escape character. This is because the '\$' family of escape
sequences is interpreted even in copy mode.(1) (*note
Parameters-Footnote-1::)

 -- Escape: \$*
 -- Escape: \$@
 -- Escape: \$^
     In some cases it is convenient to interpolate all of the parameters
     at once (to pass them to a request, for instance). The '\$*'
     escape concatenates the parameters, separating them with spaces.
     '\$@' is similar, concatenating the parameters, surrounding each by
     double quotes and separating them with spaces. If not in
     compatibility mode, the interpolation depth of double quotes is
     preserved (*note Request and Macro Arguments::). '\$^'
     interpolates all parameters as if they were arguments to the 'ds'
     request.

          .de foo
          .  tm $1='\\$1'
          .  tm $2='\\$2'
          .  tm $*='\\$*'
          .  tm $@='\\$@'
          .  tm $^='\\$^'
          ..
          .foo " This is a "test"
              error-> $1=' This is a '
              error-> $2='test"'
              error-> $*=' This is a test"'
              error-> $@='" This is a " "test""'
              error-> $^='" This is a "test"'

     '\$*' is useful when writing a macro that doesn't need to
     distinguish its arguments, or even to interpret them; examples
     include macros that produce diagnostic messages by wrapping the
     'tm' or 'ab' requests. Use '\$@' when writing a macro that may
     need to shift its parameters and/or wrap a macro or request that
     finds the count significant. If in doubt, prefer '\$@' to '\$*'.
     An application of '\$^' is seen in 'trace.tmac', which redefines
     some requests and macros for debugging purposes.

 -- Escape: \$0
     Interpolate the name by which the executing macro was called. The
     'als' request can cause a macro to have more than one name.
     Applying string interpolation to a macro does not change this name.

          .de foo
          .  tm \\$0
          ..
          .als bar foo
          .
          .de aaa
          .  foo
          ..
          .de bbb
          .  bar
          ..
          .de ccc
          \\*[foo]\\
          ..
          .de ddd
          \\*[bar]\\
          ..
          .
          .aaa
              error-> foo
          .bbb
              error-> bar
          .ccc
              error-> ccc
          .ddd
              error-> ddd

(1) If they were not, parameter interpolations would be similar to
command-line parameters--fixed for the entire duration of a 'roff'
program's run. The advantage of interpolating '\$' escape sequences
even in copy mode is that they can change from one interpolation to the
next, like function parameters. The additional escape character is the
price of this power.

5.21.2 Copy Mode
----------------

When GNU 'troff' processes certain requests, most importantly those
which define or append to a macro or string, it does so in "copy mode":
it copies the characters of the definition into a dedicated storage
region, interpolating the escape sequences '\n', '\g', '\$', '\*', and
'\V' normally; interpreting '\<RET>' immediately; discarding comments
'\"' and '\#'; interpolating the current leader, escape, or tab
character with '\a', '\e', and '\t', respectively; and storing all other
escape sequences in an encoded form.

The complement of copy mode--a 'roff' formatter's behavior when not
defining or appending to a macro, string, or diversion--where all macros
are interpolated, requests invoked, and valid escape sequences processed
immediately upon recognition, can be termed "interpretation mode".

 -- Escape: \\
     The escape character, '\' by default, escapes itself. Thus you can
     control whether a given '\n', '\g', '\$', '\*', or '\V' escape
     sequence is interpreted at the time the macro containing it is
     defined, or later when the macro is called.(1) (*note Copy
     Mode-Footnote-1::)

          .nr x 20
          .de y
          .nr x 10
          \&\nx
          \&\\nx
          ..
          .y
              => 20 10

     You can think of '\\' as a "delayed" backslash; it is the escape
     character followed by a backslash from which the escape character
     has removed its special meaning. Consequently, '\\' is not an
     escape sequence in the usual sense. In any escape sequence '\X'
     that GNU 'troff' does not recognize, the escape character is
     ignored and X is output, with two exceptions--'\\' is the first.
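
     The same timing distinction applies to string interpolation with
     '\*'. The following is only a minimal sketch: the string
     'greeting' and the macro 'say' are names invented here for
     illustration and do not come from the manual. The unescaped
     interpolation is resolved in copy mode, when 'say' is defined; the
     escaped one is resolved when 'say' is called, after the string has
     been redefined.

          .\" 'greeting' and 'say' are hypothetical names.
          .ds greeting Hello
          .de say
          .  tm at definition: \*[greeting]; at call: \\*[greeting]
          ..
          .ds greeting Goodbye
          .say
              error-> at definition: Hello; at call: Goodbye
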
 -- Escape: \.
     '\.' escapes the control character. It is similar to '\\' in that
     it isn't a true escape sequence. It is used to permit nested macro
     definitions to end without a named macro call to conclude them.
     Without a syntax for escaping the control character, this would not
     be possible.

          .de m1
          foo
          .
          .  de m2
          bar
          \\..
          .
          ..
          .m1
          .m2
              => foo bar

     The first backslash is consumed while the macro is read, and the
     second is interpreted while executing macro 'm1'.

'roff' documents should not use the '\\' or '\.' tokens outside of copy
mode; they serve only to obfuscate the input. Use '\e' to obtain the
escape character, '\[rs]' to obtain a backslash glyph, and '\&' before
'.' and ''' where GNU 'troff' expects them as control characters if you
mean to use them literally (recall *note Requests and Macros::).

Macro definitions can be nested to arbitrary depth. The mechanics of
parsing the escape character have significant consequences for this
practice.

     .de M1
     \\$1
     .  de M2
     \\\\$1
     .    de M3
     \\\\\\\\$1
     \\\\..
     .    M3 hand.
     \\..
     .  M2 of
     ..
     This understeer is getting
     .M1 out
         => This understeer is getting out of hand.

Each escape character is interpreted twice--once in copy mode, when the
macro is defined, and once in interpretation mode, when it is executed.
As seen above, this fact leads to exponential growth in the number of
escape characters required to delay interpolation of '\n', '\g', '\$',
'\*', and '\V' at each nesting level, which can be daunting. GNU
'troff' offers a solution.

 -- Escape: \E
     '\E' represents an escape character that is not interpreted in copy
     mode. You can use it to ease the writing of nested macro
     definitions.

          .de M1
          .  nop \E$1
          .  de M2
          .    nop \E$1
          .    de M3
          .      nop \E$1
          \\\\..
          .    M3 better.
          \\..
          .  M2 bit
          ..
          This vehicle handles
          .M1 a
              => This vehicle handles a bit better.

     Observe that because '\.' is not a true escape sequence, we can't
     use '\E' to keep '..' from ending a macro definition prematurely.
     If the multiplicity of backslashes complicates maintenance, use end
     macros.

     '\E' is also convenient to define strings that contain escape
     sequences that need to work when used in copy mode (for example, as
     macro arguments). We might define strings to begin and end
     superscripting as follows.(2) (*note Copy Mode-Footnote-2::)

          .ds { \v'-.9m\s'\En[.s]*7u/10u'+.7m'
          .ds } \v'-.7m\s0+.9m'

     When the 'ec' request is used to redefine the escape character,
     '\E' also makes it easier to distinguish the semantics of an escape
     character from the other meaning(s) its character might have.
     Consider the use of an unusual escape character, '-'.

          .nr a 1
          .ec -
          .de xx
          --na
          ..
          .xx
              => -na

     This result may surprise you; some people expect '1' to be output
     since register 'a' has clearly been defined with that value. What
     has happened? The robotic replacement of '\' with '-' has led us
     astray. As mentioned above, the leading escape character makes the
     following character ordinary. Written with the default escape
     character, the sequence '--' becomes '\-', which you may recognize
     as the special character escape sequence for the minus sign glyph.
     Since the escape character followed by itself is a valid escape
     sequence, only '\E' yields the expected result.

          .nr a 1
          .ec -
          .de xx
          -Ena
          ..
          .xx
              => 1

(1) Compare this to the '\def' and '\edef' commands in TeX.

(2) These are lightly adapted from the 'groff' implementation of the
'ms' macros.
*** SNIP ***

Regards,
Branden