keithmarshall pushed a commit to branch master
in repository groff.

commit 058b63ce3d614479a64d65d9272cbaa3e2f4b4d1
Author: Keith Marshall <keith.d.marsh...@ntlworld.com>
AuthorDate: Sat Sep 4 12:35:26 2021 +0100

    Sanitize text for use in PDF document outlines.
---
 contrib/pdfmark/ChangeLog     |  30 ++++++++
 contrib/pdfmark/pdfmark.am    |   3 +-
 contrib/pdfmark/pdfmark.ms    |  12 +--
 contrib/pdfmark/sanitize.tmac | 170 ++++++++++++++++++++++++++++++++++++++++++
 contrib/pdfmark/spdf.tmac     |  33 +++++---
 5 files changed, 230 insertions(+), 18 deletions(-)

diff --git a/contrib/pdfmark/ChangeLog b/contrib/pdfmark/ChangeLog
index 65ab4a0..ab034fe 100644
--- a/contrib/pdfmark/ChangeLog
+++ b/contrib/pdfmark/ChangeLog
@@ -1,3 +1,33 @@
+2021-09-03  Keith Marshall  <keith.d.marsh...@ntlworld.com>
+
+	Sanitize text for use in PDF document outlines.
+
+	* sanitize.tmac: New file; it implements...
+	(sanitize): ...this new macro; interprets its first argument as a
+	string name, and copies its remaining arguments to the named string,
+	discarding specific embedded troff escape sequences; currently...
+	(\F): ...only this is identified as "specifically discardable".
+
+	* pdfmark.am (TMACFILES): Add sanitize.tmac
+
+	* spdf.tmac (mso): Include sanitize.tmac
+	(xn*ref, xn*argc): Rename all occurrences...
+	(spdf:refname, spdf:argc): ...to these, respectively.
+	(XN): Stop inserting $* directly into PDF outlines; instead, use...
+	(spdf:bm.text): ...this new string; this is locally defined by...
+	(spdf:bm.define): ...this new macro; passed the original $* from
+	XN, this itself, is locally defined as a redirectable alias for...
+	(spdf:bm.basic): ...this new local macro; it simply copies $*,
+	passed from XN, to the string named by its first argument, (which is
+	always spdf:bm.text), so reproducing previous behaviour.
+	(opt*XN-S): New macro; defined for internal use only, it adds a "-S"
+	option to XN, such that, when specified, it temporarily redirects...
+	(spdf:bm.define): ...this macro mapping alias to...
+	(sanitize): ...this.
+
+	* pdfmark.ms (XN): Add "-S" option for all headings which include...
+	(\F[C]...\F[]): ...this escape sequence.
+
 2021-08-21  Keith Marshall  <keith.d.marsh...@ntlworld.com>
 
 	Define, and use registered trade mark strings.
diff --git a/contrib/pdfmark/pdfmark.am b/contrib/pdfmark/pdfmark.am
index d56dd9b..9a2d030 100644
--- a/contrib/pdfmark/pdfmark.am
+++ b/contrib/pdfmark/pdfmark.am
@@ -1,4 +1,4 @@
-# Copyright (C) 2005-2020 Free Software Foundation, Inc.
+# Copyright (C) 2005-2021 Free Software Foundation, Inc.
 # Written by Keith Marshall (keith.d.marsh...@ntlworld.com)
 # Automake migration by Bertrand Garrigues
 #
@@ -27,6 +27,7 @@ bin_SCRIPTS += pdfroff
 # Files installed in $(tmacdir)
 TMACFILES = \
 	contrib/pdfmark/pdfmark.tmac \
+	contrib/pdfmark/sanitize.tmac \
 	contrib/pdfmark/spdf.tmac
 pdfmarktmacdir = $(tmacdir)
 dist_pdfmarktmac_DATA = $(TMACFILES)
diff --git a/contrib/pdfmark/pdfmark.ms b/contrib/pdfmark/pdfmark.ms
index fdd3e44..2abe022 100644
--- a/contrib/pdfmark/pdfmark.ms
+++ b/contrib/pdfmark/pdfmark.ms
@@ -349,7 +349,7 @@ of their choice, to format their documents, while also using the macros to add PDF features.
 .
 .NH 2
-.XN -N pdfmark-operator -- The \F[C]pdfmark\F[] Operator
+.XN -S -N pdfmark-operator -- The \F[C]pdfmark\F[] Operator
 .LP
 All PDF features are implemented by embedding instances of the
 .B \F[C]pdfmark\F[]
@@ -1178,7 +1178,7 @@ which extend through a page transition;
 .QE
 .
 .NH 3
-.XN Optional Features of the \F[C]pdfhref\F[] Macro
+.XN -S -- Optional Features of the \F[C]pdfhref\F[] Macro
 .LP
 The behaviour of a number of the
 .CW pdfhref
@@ -2340,7 +2340,7 @@ illustrates how this may be accomplished:\(en
 .XN -N add-note -- Annotating a PDF Document using Pop-Up Notes
 .
 .NH 2
-.XN -N pdfsync -- Synchronizing Output and \F[C]pdfmark\F[] Contexts
+.XN -S -N pdfsync -- Synchronizing Output and \F[C]pdfmark\F[] Contexts
 .LP
 It has been noted previously, that the
 .CW pdfview
@@ -2493,7 +2493,7 @@ as to how the macros may be employed with their chosen primary macro package.
 .
 .NH 2
-.XN -N using-spdf -- Using \F[C]pdfmark\F[] Macros with the \F[C]ms\F[] Macro Package
+.XN -S -N using-spdf -- Using \F[C]pdfmark\F[] Macros with the \F[C]ms\F[] Macro Package
 .LP
 The use of the binding macro package,
 .CW spdf.tmac ,
@@ -2544,7 +2544,7 @@ and the issues they are intended to address, are described below.
 .
 .NH 3
-.XN \F[C]ms\F[] Section Headings in PDF Documents
+.XN -S -- \F[C]ms\F[] Section Headings in PDF Documents
 .LP
 Traditionally,
 .CW ms
@@ -2572,7 +2572,7 @@ to be used in conjunction with the macro.
 .
 .NH 4
-.XN -N xn-macro -- The \F[C]XN\F[] Macro
+.XN -S -N xn-macro -- The \F[C]XN\F[] Macro
 .
 .NH 1
 .XN The PDF Publishing Process
diff --git a/contrib/pdfmark/sanitize.tmac b/contrib/pdfmark/sanitize.tmac
new file mode 100644
index 0000000..4efa785
--- /dev/null
+++ b/contrib/pdfmark/sanitize.tmac
@@ -0,0 +1,170 @@
+.ig
+
+sanitize.tmac
+
+Copyright (C) 2021 Free Software Foundation, Inc.
+ Written by Keith Marshall (keith.d.marsh...@ntlworld.com)
+
+This file is part of groff.
+
+groff is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation, either version 3 of the License, or
+(at your option) any later version.
+
+groff is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with this program. If not, see <http://www.gnu.org/licenses/>.
+
+..
+.eo
+.de sanitize
+.\" Usage: .sanitize name text ...
+.\"
+.\" Remove designated formatting escape sequences from "text ..."; return
+.\" the sanitized text in a string register, identified by "name".
+.\"
+.\" Begin by initializing the named result as an empty string, bind it to
+.\" an internal reference name, and discard the "name" argument, to leave
+.\" only the text which is to be sanitized, as residual arguments.
+.\"
+. ds \$1
+. als sanitize:result \$1
+. shift
+.
+.\" Initialize a working string register, which we will cyclically reduce
+.\" until it becomes empty, after starting with all of the text passed as
+.\" the residual arguments, and establish its initial length.
+.\"
+. ds sanitize:residual "\$*\"
+. length sanitize:residual.length "\$*\"
+.
+.\" Begin the cyclic reduction loop...
+.\"
+. while \n[sanitize:residual.length] \{\
+. \"
+. \" ...assuming, at the start of each cycle, that the next character
+. \" will not be skipped, and that it will be moved from the residual,
+. \" to the result, as the character-by-character scan proceeds.
+. \"
+. nr sanitize:skip.count 0
+. sanitize:scan.execute
+.
+. \" For each character scanned, we need to check if it matches the
+. \" normal escape character; the check is most readily performed, if
+. \" an alternative escape character is introduced, and when a match
+. \" is found, we prepare to skip an escape sequence.
+. \"
+. ec !
+. if '!*[sanitize:scan.char]'\' .nr sanitize:skip.count 1
+. ec
+. ie \n[sanitize:skip.count] \{\
+. \"
+. \" When a possible escape sequence has been detected, we back it
+. \" up, (in case it isn't recognized, and we need to reinstate its
+. \" content into the result string), then scan ahead to check for
+. \" an identifiable escape sequence...
+. \"
+. rn sanitize:scan.char sanitize:hold
+. sanitize:scan.execute
+. ie d sanitize:esc-\*[sanitize:scan.char] \
+. \"
+. \" ...which we delegate to its appropriate handler, to skip...
+. \"
+. sanitize:esc-\*[sanitize:scan.char]
+.
+. \" ...but, in the case of an unrecognized escape sequence, we copy
+. \" its backed-up content, followed by the character retrieved from
+. \" the current scan cycle, to the result string.
+. \"
+. el .as sanitize:result "\*[sanitize:hold]\*[sanitize:scan.char]\"
+. \}
+.
+. \" When the current scan cycle has retrieved a character, which isn't
+. \" part of any possible escape sequence, we simply copy that character
+. \" to the result string.
+. \"
+. el .as sanitize:result "\*[sanitize:scan.char]\"
+. \}
+.
+.\" Clean up the register space, by deleting all of the string registers,
+.\" and numeric registers, which are designated as temporary, for private
+.\" use within this macro only.
+.\"
+. rm sanitize:hold sanitize:scan.char sanitize:residual sanitize:result
+. rr sanitize:residual.length sanitize:skip.count
+..
+.de sanitize:scan.execute
+.\" Usage (internal): .sanitize:scan.execute
+.\"
+.\" Perform a single-character reduction of sanitize:residual, by copying
+.\" its initial character to sanitize:scan.char, and then deleting it from
+.\" sanitize:residual itself. (Note that we use arithmetic decrementation
+.\" of sanitize:residual.length, rather than repeating the length request
+.\" on sanitize:residual, because reduction WILL fail when there is only
+.\" one character remaining).
+.\"
+. nr sanitize:residual.length -1
+. ds sanitize:scan.char "\*[sanitize:residual]\"
+. substring sanitize:scan.char 0 0
+. substring sanitize:residual 1
+..
+.de sanitize:skip-(
+.\" Usage (internal): .sanitize:skip-(
+.\"
+.\" For any identified escape sequence, with a two-character property name,
+.\" simply skip over the next two characters in the residual string.
+.\"
+. nr sanitize:residual.length -2
+. substring sanitize:residual 2
+..
+.de sanitize:skip-[
+.\" Usage (internal): .sanitize:skip-[
+.\"
+.\" For any identified escape sequence, with an arbitrary-length property
+.\" name, skip following characters in the residual string, until we find
+.\" a terminal "]" character, or we exhaust the residual.
+.\"
+. while \n[sanitize:skip.count] \{\
+. sanitize:scan.execute
+. ie \n[sanitize:residual.length] \{\
+. \" We haven't yet exhausted the residual; if we find a nested "["
+. \" character, increment the nesting level, otherwise decrement it
+. \" for each "]"; it will become zero at the terminal "]".
+. \"
+. ie '\*[sanitize:scan.char]'[' .nr sanitize:skip.count +1
+. el .if '\*[sanitize:scan.char]']' .nr sanitize:skip.count -1
+. \}
+. \" Stop unconditionally, if we do exhaust the residual.
+. \"
+. el .nr sanitize:skip.count 0
+. \}
+..
+.de sanitize:esc-generic
+.\" Usage: .sanitize:esc-X
+.\"
+.\" (X represents any legitimate single-character escape sequence id).
+.\"
+.\" Handler for skipping "\X" sequences, in text which is to be sanitized;
+.\" this will automatically detect sequences conforming to any of the forms
+.\" "\Xc", "\X(cc", or "\X[...]", and will handle each appropriately. The
+.\" implementation is generic, and may be aliased to handle any specific
+.\" escape sequences, which exhibit similar semantics.
+.\"
+. sanitize:scan.execute
+. if d sanitize:skip-\*[sanitize:scan.char] \
+. sanitize:skip-\*[sanitize:scan.char]
+..
+.\" Map the generic handler to specific escape sequences, as required.
+.\"
+.als sanitize:esc-F sanitize:esc-generic
+.ec
+.\" Local Variables:
+.\" mode: nroff
+.\" End:
+.\" vim: filetype=groff:
+.\" sanitize.tmac: end of file
diff --git a/contrib/pdfmark/spdf.tmac b/contrib/pdfmark/spdf.tmac
index 767f5ee..33591d0 100644
--- a/contrib/pdfmark/spdf.tmac
+++ b/contrib/pdfmark/spdf.tmac
@@ -2,7 +2,7 @@
 
 spdf.tmac
 
-Copyright (C) 2004-2020 Free Software Foundation, Inc.
+Copyright (C) 2004-2021 Free Software Foundation, Inc.
  Written by Keith Marshall (keith.d.marsh...@ntlworld.com)
 
 This file is part of groff.
@@ -25,6 +25,7 @@ along with this program. If not, see <http://www.gnu.org/licenses/>.
 .if !rOPMODE .nr OPMODE 1
 .\"
 .mso s.tmac
+.mso sanitize.tmac
 .mso pdfmark.tmac
 .\"
 .\" Omitted Sections
@@ -82,16 +83,18 @@ along with this program. If not, see <http://www.gnu.org/licenses/>.
 .\" additional spacing parameters may be set relative to the current
 .\" document line spacing, as set by \n[VS]).
 .\"
-.rm xn*ref
+.rm spdf:refname
+.als spdf:bm.define spdf:bm.basic
 .while dopt*XN\\$1 \{\
 . opt*XN\\$1 \\$*
-. shift \\n[xn*argc]
+. shift \\n[spdf:argc]
 . \}
-.rr xn*argc
+.rr spdf:argc
 .if '\\$1'--' .shift
-.if dxn*ref .XM -N \\*[xn*ref] -- \\$@
-.rm xn*ref
-.pdfhref O \\n[nh*hl] "\\*(SN \\$*"
+.if dspdf:refname .XM -N \\*[spdf:refname] -- \\$@
+.rm spdf:refname
+.spdf:bm.define spdf:bm.text "\\$*"
+.pdfhref O \\n[nh*hl] "\\*(SN \\*[spdf:bm.text]"
 .XS
 .if rtc*hl \{\
 . if !dXNVS1 .ds XNVS1 1.0v \" default leading for top level
@@ -119,12 +122,20 @@ along with this program. If not, see <http://www.gnu.org/licenses/>.
 \&\\$*
 ..
 .de opt*XN-N
-.nr xn*argc 2
-.ds xn*ref \\$2
+.ds spdf:refname \\$2
+.nr spdf:argc 2
+..
+.de opt*XN-S
+.als spdf:bm.define sanitize
+.nr spdf:argc 1
 ..
 .de opt*XN-X
-.nr xn*argc 1
-.if !dxn*ref .ds xn*ref \\\\$1
+.if !dspdf:refname .ds spdf:refname \\\\$1
+.nr spdf:argc 1
+..
+.de spdf:bm.basic
+.shift
+.ds spdf:bm.text "\\$*\"
 ..
 .de LU
 .LP

_______________________________________________
Groff-commit mailing list
Groff-commit@gnu.org
https://lists.gnu.org/mailman/listinfo/groff-commit
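
For readers who want to follow the scanning logic of the new sanitize macro without tracing the roff requests, here is a rough Python model of it. It is only a sketch of the behaviour described in the comments of sanitize.tmac, not the macro itself: the function name sanitize_text and the discard parameter are hypothetical, and how the real macro treats a backslash at the very end of the text is an assumption here (the model simply keeps it).

```python
def sanitize_text(text, discard=("F",)):
    """Model of .sanitize: drop "\\Xc", "\\X(cc" and "\\X[...]" for each X in discard.

    Hypothetical helper; "discard" mirrors the sanitize:esc-X aliases, of
    which the commit defines only sanitize:esc-F.
    """
    out = []
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        i += 1
        if ch != "\\":
            out.append(ch)            # ordinary character: copy it through
            continue
        if i >= n:
            out.append(ch)            # assumption: keep a lone trailing backslash
            break
        esc = text[i]
        i += 1
        if esc not in discard:
            out.append("\\" + esc)    # unrecognized escape: reinstate it unchanged
            continue
        if i >= n:
            break
        form = text[i]                # corresponds to sanitize:esc-generic
        i += 1
        if form == "(":
            i += 2                    # two-character name, as in sanitize:skip-(
        elif form == "[":
            depth = 1                 # nesting level, as in sanitize:skip-[
            while depth and i < n:
                c = text[i]
                i += 1
                if c == "[":
                    depth += 1
                elif c == "]":
                    depth -= 1
        # any other character is the one-character form "\Xc", already consumed
    return "".join(out)


if __name__ == "__main__":
    print(sanitize_text(r"The \F[C]pdfmark\F[] Operator"))
```

Applied to one of the headings changed in pdfmark.ms, the model drops both \F[C] and \F[] and leaves the surrounding words intact, which is the text that XN -S would place in the PDF outline; escapes other than \F pass through untouched.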