Re: Why does groff require psutils?
Hi Alexis,

At 2024-04-08T08:17:21-0500, G. Branden Robinson wrote:
> At 2024-04-08T15:10:55+0200, Alexis (surryhill) wrote:
> > On Thu, Apr 04, 2024 at 08:26:44PM -0500, G. Branden Robinson wrote:
> > > Alexis, would you like to look into this more deeply, and maybe
> > > find a solution that will enable us to use ps2ps after all?
> >
> > Of course, Branden. Just to be sure I use the same code-path as you
> > did, which command(s) did you use to build the HTML versions of
> > groff man pages?
>
> Sure thing. This is the script I use to inspect groff's man pages
> (and a couple of other documents) for regressions before each push I
> do.
[...]

Any luck with this?

Regards,
Branden
Re: Why does groff require psutils?
Hi Alexis,

At 2024-04-08T15:10:55+0200, Alexis (surryhill) wrote:
> On Thu, Apr 04, 2024 at 08:26:44PM -0500, G. Branden Robinson wrote:
> > Alexis, would you like to look into this more deeply, and maybe find a
> > solution that will enable us to use ps2ps after all?
>
> Of course, Branden. Just to be sure I use the same code-path as you
> did, which command(s) did you use to build the HTML versions of groff
> man pages?

Sure thing. This is the script I use to inspect groff's man pages (and
a couple of other documents) for regressions before each push I do.
It's saved me from embarrassment countless times. (So naturally I find
other ways to embarrass myself.)

To make it fully useful for regression testing, I keep a cached copy of
the files it creates as of my last push, and then diff the cached and
new directories. But you shouldn't need that additional infrastructure
to reproduce the problem--if in fact you can, and it's not somehow an
artifact of my environment.

Regards,
Branden

#!/bin/bash

set -e

if [ $# -ne 1 ]
then
    echo "need a directory argument (e.g., \"old\", \"new\")" >&2
    exit 1
fi

if ! [ -x ./build/test-groff ]
then
    echo "./build/test-groff does not exist or is not executable" >&2
    exit 2
fi

groff () {
    ../build/test-groff "$@"
}

BFLAG=
#BFLAG=-b
DIR=$1
MANS=(
    ./src/utils/lkbib/lkbib.1.man
    ./src/utils/tfmtodit/tfmtodit.1.man
    ./src/utils/hpftodit/hpftodit.1.man
    ./src/utils/pfbtops/pfbtops.1.man
    ./src/utils/afmtodit/afmtodit.1.man
    ./src/utils/lookbib/lookbib.1.man
    ./src/utils/addftinfo/addftinfo.1.man
    ./src/utils/xtotroff/xtotroff.1.man
    ./src/utils/indxbib/indxbib.1.man
    ./src/roff/nroff/nroff.1.man
    ./src/roff/troff/troff.1.man
    ./src/roff/groff/groff.1.man
    ./src/utils/grog/grog.1.man
    ./src/devices/grodvi/grodvi.1.man
    ./src/devices/grolbp/grolbp.1.man
    ./src/devices/grops/grops.1.man
    ./src/devices/grohtml/grohtml.1.man
    ./src/devices/grolj4/grolj4.1.man
    ./src/devices/grotty/grotty.1.man
    ./src/devices/gropdf/gropdf.1.man
    ./src/devices/gropdf/pdfmom.1.man
    ./src/devices/xditview/gxditview.1.man
    ./src/preproc/preconv/preconv.1.man
    ./src/preproc/tbl/tbl.1.man
    ./src/preproc/soelim/soelim.1.man
    ./src/preproc/eqn/eqn.1.man
    ./src/preproc/eqn/neqn.1.man
    ./src/preproc/pic/pic.1.man
    ./src/preproc/refer/refer.1.man
    ./src/preproc/grn/grn.1.man
    ./contrib/pic2graph/pic2graph.1.man
    ./contrib/hdtbl/groff_hdtbl.7.man
    ./contrib/mm/groff_mm.7.man
    ./contrib/mm/mmroff.1.man
    ./contrib/grap2graph/grap2graph.1.man
    ./contrib/pdfmark/pdfroff.1.man
    ./contrib/rfc1345/groff_rfc1345.7.man
    ./contrib/eqn2graph/eqn2graph.1.man
    ./contrib/gpinyin/gpinyin.1.man
    ./contrib/mom/groff_mom.7.man
    ./contrib/gdiffmk/gdiffmk.1.man
    ./contrib/glilypond/glilypond.1.man
    ./contrib/chem/chem.1.man
    ./contrib/gperl/gperl.1.man
    ./man/groff_tmac.5.man
    ./man/groff_out.5.man
    ./man/groff_diff.7.man
    ./man/groff_char.7.man
    ./man/groff.7.man
    ./man/roff.7.man
    ./man/groff_font.5.man
    ./tmac/groff_trace.7.man
    ./tmac/groff_me.7.man
    ./tmac/groff_ms.7.man
    ./tmac/groff_man.7.man
    ./tmac/groff_man_style.7.man
    ./tmac/groff_mdoc.7.man
    ./tmac/groff_www.7.man
)

MANS_SV=(
    ./contrib/mm/groff_mmse.7.man
)

mkdir "$DIR"
pushd "$DIR" >/dev/null

# the change logs, so we know approximately where we are
cp ../ChangeLog .
for d in chem gdiffmk glilypond gperl gpinyin hdtbl mm mom pdfmark rfc1345 \
    sboxes
do
    cp ../contrib/$d/ChangeLog ./ChangeLog.$d
done

# our Texinfo manual
cp ../build/doc/groff.txt .

# our Texinfo manual via HTML
cp ../build/doc/groff.html .
lynx -dump groff.html > groff.html.txt

# our ms manuals
groff $BFLAG -ww -Tutf8 -ept -ms ../doc/ms.ms > ms.txt

# our me manuals
#groff $BFLAG -ww -Tutf8 -me ../doc/meintro.me > meintro.txt
#groff $BFLAG -ww -Tutf8 -kt -me -mfr ../doc/meintro_fr.me > meintro_fr.txt
#groff $BFLAG -ww -Tutf8 -me ../doc/meref.me > meref.txt
me_pre=../ATTIC/my.me
groff $BFLAG -ww -Tutf8 -me $me_pre ../build/doc/meintro.me > meintro.txt
groff $BFLAG -ww -Tutf8 -kt -me -mfr $me_pre ../build/doc/meintro_fr.me \
    > meintro_fr.txt
groff $BFLAG -ww -Tutf8 -me $me_pre ../build/doc/meref.me > meref.txt

for F in ${MANS[*]} ${MANS_SV[*]}
do
    G=../build/${F%.man}
    if [ -f "$G" ]
    then
        cp "$G" .
    else
        echo "warning: \"$G\" missing" >&2
    fi
done

: ${AD:=l}
ARGS="$BFLAG -ww -dAD=$AD -rCHECKSTYLE=3 -rU1 -Tutf8 -e -t -mandoc"
NOCR=-rcR=0
LOCALE=
ARGS_HTML="$BFLAG -ww -rCHECKSTYLE=3 -Thtml -e -t -mandoc -P-C -P-G"

for P in *.[157]
do
    if [ "$P" = groff_mmse.7 ]
    then
        LOCALE=-msv
    else
        LOCALE=
    fi
    echo $0: $P >&2
    echo "groff $ARGS $LOCALE $P" > "$P.cR.txt"
    groff $ARGS $LOCALE "$P" >> "$P.cR.txt"
    echo "groff $ARGS $LOCALE $NOCR $P" > "$P.no-cR.txt"
    groff $ARGS $LOCALE $NOCR "$P" >> "$P.no-cR.txt"
    echo "" > "$P.html"
    groff $ARGS_HTML $LOCALE -P-I$P $P >> "$P.html"
    rm "$P"
done

popd >/dev/null

# vim:set ai et sw=4 ts=4 tw=80:
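The message describes keeping a cached copy of the generated files and diffing it against a fresh run. A minimal sketch of that comparison step follows; the function name `compare_snapshots` and the directory names are assumptions for illustration, not part of Branden's setup.

```sh
#!/bin/sh
# Compare a cached snapshot directory against a freshly generated one.
# Prints a recursive unified diff; the exit status is 0 when the two
# trees are identical and nonzero when they differ.
# (compare_snapshots is a hypothetical helper name.)
compare_snapshots () {
    old=$1
    new=$2
    diff -ru "$old" "$new"
}
```

Assuming the script above was run once as "old" before a change and once as "new" after it, `compare_snapshots old new | less` would page through any differences in the rendered documents.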
Re: Re: Why does groff require psutils?
On Thu, Apr 04, 2024 at 08:26:44PM -0500, G. Branden Robinson wrote:
> Alexis, would you like to look into this more deeply, and maybe find a
> solution that will enable us to use ps2ps after all?

Of course, Branden. Just to be sure I use the same code-path as you
did, which command(s) did you use to build the HTML versions of groff
man pages?

Alexis
Re: Why does groff require psutils?
At 2024-03-24T22:58:57-0500, G. Branden Robinson wrote:
> At 2024-03-17T12:59:51+0100, Alexis wrote:
> > > I'm sorry I let this fall onto the floor. Picking it up...
> >
> > That's alright; thanks for picking it up again, much appreciated!
> [...]
> > Do let me know if you desire or require further testing or changes, or
> > if I can be helpful in other ways to drive this change forward.
>
> I've pushed this. Thank you!

Unfortunately I've had to revert it. It actually does break inclusions
of tables (and maybe equations, too--I didn't check).

To verify, try building HTML versions of groff man pages. I get output
like this:

pnmcrop: The image is entirely background; there is nothing to crop.
pnmtopng: EOF / read error reading magic number
Calling 'pnmcut 3454 78 271 23 < /tmp/branden/groff-page-tADF85 | pnmcrop -quiet | pnmtopng -quiet -background rgb:f/f/f -transparent rgb:f/f/f> tbl.1-1.png
' returned status 256

...and when viewing the generated HTML document with firefox(1), blank
areas replace the tables.

When I revert the change, the tables spring back to life (albeit as
raster images, as before).

Alexis, would you like to look into this more deeply, and maybe find a
solution that will enable us to use ps2ps after all?

Regards,
Branden
Re: Why does groff require psutils?
At 2024-03-25T19:28:56+0100, Alexis wrote:
> On Sun, Mar 24, 2024 at 10:58:54PM -0500, G. Branden Robinson wrote:
> > I've pushed this. Thank you!
>
> That's great news! Thank you Branden, much appreciated.
>
> Minor nit: Looking at the commit¹ details it seems the commit is
> credited to another person bearing the same, somewhat uncommon name,
> but a different email address.

Whoops! I've apparently been mazed by pseudonyms. I'm terribly sorry.

I can fix your credit in the ChangeLog but unfortunately commit data is
forever.

> The important bit is: the change is coming to Groff :)

A few have complained bitterly that entirely too much has already
arrived... ;-)

Regards,
Branden
Re: Re: Why does groff require psutils?
On Sun, Mar 24, 2024 at 10:58:54PM -0500, G. Branden Robinson wrote:
> I've pushed this. Thank you!

That's great news! Thank you Branden, much appreciated.

Minor nit: Looking at the commit¹ details it seems the commit is
credited to another person bearing the same, somewhat uncommon name,
but a different email address.

The important bit is: the change is coming to Groff :)

Best
Alexis

¹ https://git.savannah.gnu.org/cgit/groff.git/commit/?id=3bde75a958f5f3eea84f1e0098c7b457358792b3
Re: Why does groff require psutils?
At 2024-03-17T12:59:51+0100, Alexis wrote:
> > I'm sorry I let this fall onto the floor. Picking it up...
>
> That's alright; thanks for picking it up again, much appreciated!
[...]
> Do let me know if you desire or require further testing or changes, or
> if I can be helpful in other ways to drive this change forward.

I've pushed this. Thank you!

Regards,
Branden
Re: Why does groff require psutils?
At 2024-03-17T12:53:14-0400, Peter Schaffter wrote:
> On Sun, Mar 17, 2024, Alexis wrote:
> > Note that the documented use of psselect in
> > contrib/momdoc—describing how to rearrange a table of contents
> > generated at the end of a document when using the ps output
> > device—is unchanged as I did not find a suitable replacement.
>
> Switching to ps2ps imposes no penalty on the mom documentation,
> which explicitly instructs users to acquire and install the psutils
> package if it is not already on their system.

I had similar thoughts. The NEWS item documenting the vanishing build
dependency can remind the reader that it implies no reduced utility of
psutils for people _using_ groff, contra _compiling_ it.

Regards,
Branden
Re: Why does groff require psutils?
On Sun, Mar 17, 2024, Alexis wrote:
> Note that the documented use of psselect in contrib/momdoc—describing
> how to rearrange a table of contents generated at the end of a document
> when using the ps output device—is unchanged as I did not find a suitable
> replacement.

Switching to ps2ps imposes no penalty on the mom documentation,
which explicitly instructs users to acquire and install the psutils
package if it is not already on their system.

--
Peter Schaffter
https://www.schaffter.ca
Re: Re: Why does groff require psutils?
Hello Branden,

> I'm sorry I let this fall onto the floor. Picking it up...

That's alright; thanks for picking it up again, much appreciated!

> groff runs it unconditionally if the output device is "html" (or "xhtml"),
> because the "html" device's "DESC" file tells it to.

Ah, that's the piece I was missing, thanks!

> Another way to make sure your code is being exercised is to stick a
> `debug()` call into it.

That's helpful to know. I've done this locally and verified that things
work and seem to produce visually equal results¹ when looking at the
generated html for pic and webpage pages in the doc directory.

Please find attached a patch that, to the best of my knowledge, replaces
psselect with ps2ps in code and documentation.

Note that the documented use of psselect in contrib/momdoc—describing
how to rearrange a table of contents generated at the end of a document
when using the ps output device—is unchanged as I did not find a
suitable replacement. According to the Ghostscript documentation for the
-sPageList option: "[f]or PostScript or PCL input files, the list of
pages must be given in increasing order, you cannot process pages out of
order or repeat pages and this will generate an error."²

Do let me know if you desire or require further testing or changes, or
if I can be helpful in other ways to drive this change forward.

Best
Alexis

¹ Checking the differences between the generated images using
  https://github.com/x1ddos/imgdiff the reported difference was below
  0.01%
² https://ghostscript.readthedocs.io/en/latest/Use.html

diff --git b/m4/groff.m4 a/m4/groff.m4
index ebbe60d52..d5f014316 100644
--- b/m4/groff.m4
+++ a/m4/groff.m4
@@ -182,7 +182,7 @@ AC_DEFUN([GROFF_CHECK_GROHTML_PROGRAMS], [
   missing=
   m4_foreach([groff_prog],
 dnl Keep this list of programs in sync with grohtml test scripts.
-    [[pnmcrop], [pnmcut], [pnmtopng], [pnmtops], [psselect]], [
+    [[pnmcrop], [pnmcut], [pnmtopng], [pnmtops], [ps2ps]], [
     AC_CHECK_PROG(groff_prog, groff_prog, [found], [missing])
     if test $[]groff_prog = missing
     then
diff --git b/src/devices/grohtml/grohtml.1.man a/src/devices/grohtml/grohtml.1.man
index ef617703f..e10091bf2 100644
--- b/src/devices/grohtml/grohtml.1.man
+++ a/src/devices/grohtml/grohtml.1.man
@@ -303,11 +303,12 @@ These include the \%Netpbm tools
 .IR \%pnmcrop ,
 .IR \%pnmcut ,
 and
-.IR \%pnmtopng ;
-\%Ghostscript
-.RI ( gs );
-and the \%PSUtils tool
-.IR \%psselect .
+.IR \%pnmtopng
+as well as
+\%Ghostscript's
+.IR \%gs
+and
+.IR \%ps2ps .
 .
 .
 .\"
diff --git b/src/preproc/html/pre-html.cpp a/src/preproc/html/pre-html.cpp
index cbcc2ccda..401c275cb 100644
--- b/src/preproc/html/pre-html.cpp
+++ a/src/preproc/html/pre-html.cpp
@@ -918,7 +918,7 @@ int imageList::createPage(int pageno)
   fprintf(stderr, "creating page %d\n", pageno);
 #endif
 
-  s = make_string("psselect -q -p%d %s %s\n",
+  s = make_string("ps2ps -sPageList=%d %s %s\n",
                   pageno, psFileName, psPageName);
   html_system(s, 1);
   assert(strlen(image_gen) > 0);
diff --git b/src/roff/groff/tests/html_works_with_grn_and_eqn.sh a/src/roff/groff/tests/html_works_with_grn_and_eqn.sh
index e0709440f..b3b2dc5cd 100755
--- b/src/roff/groff/tests/html_works_with_grn_and_eqn.sh
+++ a/src/roff/groff/tests/html_works_with_grn_and_eqn.sh
@@ -22,7 +22,7 @@ groff="${abs_top_builddir:-.}/test-groff"
 
 # Keep this list of programs in sync with GROFF_CHECK_GROHTML_PROGRAMS
 # in m4/groff.m4.
-for cmd in pnmcrop pnmcut pnmtopng pnmtops psselect
+for cmd in pnmcrop pnmcut pnmtopng pnmtops ps2ps
 do
     if ! command -v $cmd >/dev/null
     then
diff --git b/src/roff/groff/tests/smoke-test_html_device.sh a/src/roff/groff/tests/smoke-test_html_device.sh
index 36dc50e47..9dbb8f298 100755
--- b/src/roff/groff/tests/smoke-test_html_device.sh
+++ a/src/roff/groff/tests/smoke-test_html_device.sh
@@ -22,7 +22,7 @@ groff="${abs_top_builddir:-.}/test-groff"
 
 # Keep this list of programs in sync with GROFF_CHECK_GROHTML_PROGRAMS
 # in m4/groff.m4.
-for cmd in pnmcrop pnmcut pnmtopng pnmtops psselect
+for cmd in pnmcrop pnmcut pnmtopng pnmtops ps2ps
 do
     if ! command -v $cmd >/dev/null
     then
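The patched test scripts loop over the helper programs grohtml needs and check each with `command -v`. A standalone sketch of the same check, useful before trying the patch locally, might look like this; the function name `missing_programs` is an assumption for illustration, while the program list is the one from the patch.

```sh
#!/bin/sh
# Print the name of every listed program that is not found on PATH.
# (missing_programs is a hypothetical helper, not part of groff.)
missing_programs () {
    for cmd in "$@"
    do
        if ! command -v "$cmd" >/dev/null 2>&1
        then
            echo "$cmd"
        fi
    done
}

# The list mirrors the one in the patched test scripts above.
missing=$(missing_programs pnmcrop pnmcut pnmtopng pnmtops ps2ps)
if [ -n "$missing" ]
then
    echo "missing:" $missing >&2
fi
```

Collecting all missing names before reporting, rather than stopping at the first, follows the approach of the Autoconf macro in m4/groff.m4, which accumulates them in a `missing` variable.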
Re: Why does groff require psutils?
Hi Alexis,

At 2023-12-01T17:35:09+0100, Alexis wrote:
> following up on my previous email I'd like to test the attached patch
> (replace_psselect.patch) with groff, but am uncertain how to trigger
> the patched code path.

I'm sorry I let this fall onto the floor. Picking it up...

> If I read src/preproc/html/html.am correctly then pre-html.cpp is
> compiled into pre-grohtml.

Yes.

> I'm uncertain how pre-grohtml is invoked in a common groff pipeline or
> can be invoked manually for testing purposes.

groff runs it unconditionally if the output device is "html" (or "xhtml"),
because the "html" device's "DESC" file tells it to.

https://git.savannah.gnu.org/cgit/groff.git/tree/font/devhtml/DESC.proto?h=1.23.0

> Anyone willing to share some insights or ideas?

Another way to make sure your code is being exercised is to stick a
`debug()` call into it.

diff --git a/src/preproc/html/pre-html.cpp b/src/preproc/html/pre-html.cpp
index cbcc2ccda..831d0a269 100644
--- a/src/preproc/html/pre-html.cpp
+++ b/src/preproc/html/pre-html.cpp
@@ -918,6 +918,7 @@ int imageList::createPage(int pageno)
   fprintf(stderr, "creating page %d\n", pageno);
 #endif
 
+  debug("GBR: hello");
   s = make_string("psselect -q -p%d %s %s\n",
                   pageno, psFileName, psPageName);
   html_system(s, 1);

After doing this, running "make -C build -j" said hello to me many times
(once per document page) thanks to the rebuilds of the "webpage.html"
and "pic.html" documents prompted by the newness of the "pre-grohtml"
executable.

Regards,
Branden
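To see which helper programs a device's "DESC" file asks groff to run, one can look for its `prepro` and `postpro` lines, the keywords groff_font(5) documents for naming a device's preprocessor and postprocessor. The sketch below assumes a hypothetical helper name, `desc_programs`, and leaves the DESC path to the caller, since its location differs between source, build, and installed trees.

```sh
#!/bin/sh
# Print the "prepro" and "postpro" lines of a groff device DESC file,
# i.e. the programs groff runs before and after troff for that device.
# (desc_programs is a hypothetical helper name.)
desc_programs () {
    awk '$1 == "prepro" || $1 == "postpro" { print $1, $2 }' "$1"
}
```

For example, `desc_programs build/font/devhtml/DESC` (path assumed) should list pre-grohtml if the html device is set up as described above.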
Re: Why does groff require psutils?
Hi all,

following up on my previous email I'd like to test the attached patch
(replace_psselect.patch) with groff, but am uncertain how to trigger
the patched code path.

If I read src/preproc/html/html.am correctly then pre-html.cpp is
compiled into pre-grohtml.

I'm uncertain how pre-grohtml is invoked in a common groff pipeline or
can be invoked manually for testing purposes.

Anyone willing to share some insights or ideas?

Best
Alexis

diff --git a/src/preproc/html/pre-html.cpp b/src/preproc/html/pre-html.cpp
index cbcc2ccda..1668c9f67 100644
--- a/src/preproc/html/pre-html.cpp
+++ b/src/preproc/html/pre-html.cpp
@@ -918,7 +918,7 @@ int imageList::createPage(int pageno)
   fprintf(stderr, "creating page %d\n", pageno);
 #endif
 
-  s = make_string("psselect -q -p%d %s %s\n",
+  s = make_string("ps2ps -dFirstPage=%1$d -dLastPage=%1$d %s %s\n",
                   pageno, psFileName, psPageName);
   html_system(s, 1);
   assert(strlen(image_gen) > 0);
Re: Why does groff require psutils?
At 2023-11-26T15:34:10+0100, Ingo Schwarze wrote:
> not related to the "psutils" questions, but this almost made my
> eyes fall out.

Evidently...

> Alexis wrote on Sun, Nov 26, 2023 at 12:28:25PM +0100:
>
> > Would replacing the following in src/preproc/html/pre-html.cpp
> >   s = make_string("psselect -q -p%d %s %s\n",
> >                   pageno, psFileName, psPageName);
>
> WHOA.
>
> What kind of crappy code is that?

The kind that does _not_ appear to put these character pointers under
user control.

> It's really "C Programming 101" that you must *never* do anything
> like that. Obviously, execve(2) or a similar library function
> that does not suffer from shell argument splitting and shell
> metacharacter issues must be used here. If we want to continue
> shipping preproc/html, i think this definitely needs to be fixed.

Hang on now. Before you can declare the above unsafe...and admittedly
we would do well to restore the very next line of context--

  html_system(s, 1);

where `html_system` wraps exactly the standard C library function you
think it does-- we need to consider whether any mayhem-causing
characters can get into psFileName or psPageName in the first place.

# if defined(DEBUGGING) && !defined(DEBUG_FILE_DIR)
/* For a DEBUGGING version, on the Unix host, we can also usually rely
   on being able to use '/tmp' for temporary file storage. (Note that,
   as in the __MSDOS__ or _WIN32 case above, the user may override this
   by defining

     -DDEBUG_FILE_DIR=/path/to/debug/files

   in the CPPFLAGS.) */

#  define DEBUG_FILE_DIR /tmp
# endif

#endif /* not __MSDOS__ or _WIN32 */

#ifdef DEBUGGING
// For a DEBUGGING version, we need some additional macros,
// to direct the captured debugging mode output to appropriately named
// files in the specified DEBUG_FILE_DIR.

# define DEBUG_TEXT(text) #text
# define DEBUG_NAME(text) DEBUG_TEXT(text)
# define DEBUG_FILE(name) DEBUG_NAME(DEBUG_FILE_DIR) "/" name
#endif

[...]

static void makeTempFiles(void)
{
#if defined(DEBUGGING)
  psFileName = DEBUG_FILE("prehtml-ps");
[...]
  psPageName = DEBUG_FILE("prehtml-psn");
[...]
#else /* not DEBUGGING */
[...]
  // psPageName contains a single page of PostScript.
  f = xtmpfile(&psPageName, PS_TEMPLATE_LONG, PS_TEMPLATE_SHORT, true);
  if (0 /* nullptr */ == f)
    sys_fatal("xtmpfile");
  fclose(f);
[...]
  // psFileName contains a PostScript file of the complete document.
  f = xtmpfile(&psFileName, PS_TEMPLATE_LONG, PS_TEMPLATE_SHORT, true);
  if (0 /* nullptr */ == f)
    sys_fatal("xtmpfile");
  fclose(f);

...and that's it. These character pointers point either to a string
literal embedded in the text section of the executable,[1] or are
returned by `xtmpfile`, which wraps the standard C library's `mkstemp()`
as you might expect.[2]

> I mean, for all i know, there are people running "groff -T html"
> on public web servers to serve manual pages to the general public
> via public CGI interfaces...

That's indeed possible, but you may need to tell this sleepy kid in the
back row of the C Programming 101 lecture hall how there's an obvious
hole for injection of spaces or shell operators in the foregoing.

I've expressed elsewhere my idea for what's necessary to eliminate the
`pre-grohtml` preprocessor altogether, and that is the avenue I would
like to pursue, rather than refactoring to foreclose the possibility of
a security hole that may arise if someone else hacks on the program to
_make_ it vulnerable.

As it stands, this looks to me like a code smell rather than even a
theoretical vulnerability. I'm open to correction on this point.

Regards,
Branden

[1] Well, probably. That's traditionally what would happen. As this
    is debugging code that is disabled by default, I didn't bother to
    check. Caveat lector.

[2] src/libs/libgroff/tmpfile.cpp:

// Open a temporary file and with fatal error on failure.
FILE *xtmpfile(char **namep,
               const char *postfix_long, const char *postfix_short,
               int do_unlink)
{
  char *templ = xtmptemplate(postfix_long, postfix_short);
  errno = 0;
  int fd = mkstemp(templ);
  if (fd < 0)
    fatal("cannot create temporary file: %1", strerror(errno));
  errno = 0;
  FILE *fp = fdopen(fd, FOPEN_RWB); // many callers of xtmpfile use binary I/O
  if (!fp)
    fatal("fdopen: %1", strerror(errno));
  if (do_unlink)
    add_tmp_file(templ);
  if (namep)
    *namep = templ;
  else
    delete[] templ;
  return fp;
}
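The disagreement above turns on what happens when a file name is pasted into a command string that a shell then re-parses. The following sketch illustrates the splitting Ingo warns about, and why Branden's point holds only as long as the names cannot contain such characters. The file name `page 1.ps` is a hypothetical example, not something pre-grohtml would generate.

```sh
#!/bin/sh
# Report how many arguments a command receives.
count_args () {
    echo $#
}

name='page 1.ps'    # hypothetical file name containing a space

# Interpolated into a command string and re-parsed by a shell, as a
# system(3)-style call does: the space splits the name into two words.
via_shell=$(sh -c "set -- $name; echo \$#")

# Passed as one entry of an argument vector, as an execve(2)-style
# call would: the name stays a single argument.
via_argv=$(count_args "$name")

echo "via shell: $via_shell, via argv: $via_argv"
```

Names produced from a fixed template by `mkstemp()` contain no spaces or shell operators, which is the basis for calling the existing code a smell rather than a hole.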
Re: Why does groff require psutils?
Hi,

not related to the "psutils" questions, but this almost made my
eyes fall out.

Alexis wrote on Sun, Nov 26, 2023 at 12:28:25PM +0100:

> Would replacing the following in src/preproc/html/pre-html.cpp
>   s = make_string("psselect -q -p%d %s %s\n",
>                   pageno, psFileName, psPageName);

WHOA.

What kind of crappy code is that?

It's really "C Programming 101" that you must *never* do anything
like that. Obviously, execve(2) or a similar library function
that does not suffer from shell argument splitting and shell
metacharacter issues must be used here. If we want to continue
shipping preproc/html, i think this definitely needs to be fixed.

I mean, for all i know, there are people running "groff -T html"
on public web servers to serve manual pages to the general public
via public CGI interfaces...

Yours,
  Ingo
Re: Why does groff require psutils?
Hi all,

from what I understand the "psselect" command is used by the groff html
preprocessor to extract *a single page* from a multi-page Postscript
document. I think the same could be achieved using ghostscript, which
groff already depends on and uses.

Note that I know little about psutils, ghostscript, and Postscript, so
please take the following with a block rather than a grain of salt.

> If someone knows of a replacement for psselect, possibly in a more
> widely deployed and used package, we could conceivably transition to it.
> This would require some testing.

First tests (see minimal working example (MWE) below) suggest that
psselect can be replaced with ghostscript's ps2ps or gs command.

Would replacing the following in src/preproc/html/pre-html.cpp

  s = make_string("psselect -q -p%d %s %s\n",
                  pageno, psFileName, psPageName);

with

  s = make_string("ps2ps -dFirstPage=%1$d -dLastPage=%1$d %s %s\n",
                  pageno, psFileName, psPageName);

or

  s = make_string("echo showpage | "
                  "%s%s -q -dBATCH -dSAFER "
                  "-dFirstPage=%3$d -dLastPage=%3$d "
                  "-sDEVICE=ps2write "
                  "-sOutputFile=%s %s\n",
                  image_gen, EXE_EXT, pageno, psPageName, psFileName);

seem feasible for a medium or even short term transition if the
suggested commands are a worthy equivalent for how psselect is used in
groff?

Please find attached a Makefile as a MWE that:
- generates a multi-page Postscript document using groff
- extracts a page using psselect writing it to psselect-psPageName.ps
- extracts a page using ps2ps writing it to ps2ps-psPageName.ps
- extracts a page using gs writing it to psPageName.ps

The MWE requires make, seq, groff, ghostscript, and psselect. The
support for pdf conversion is added for those who have the (mis)fortune
of working on macOS, which sadly has removed support for PostScript.

Best
Alexis

PAGENO := 3
TOTAL_PAGES := 9

all: psPageName.pdf

# Extract page PAGENO from the multi-page example document psFileName.ps
psPageName.ps: psFileName.ps
	@# This command is currently used in groff's html preprocessor (see pre-html.cpp).
	psselect -q -p${PAGENO} $< psselect-$@
	@# This command is a psselect replacement using ghostscript's ps2ps.
	ps2ps -dFirstPage=${PAGENO} -dLastPage=${PAGENO} $< ps2ps-$@
	@# This command is a psselect replacement using a ghostscript based
	@# pipeline similar to what is already being used in groff's html
	@# preprocessor.
	echo showpage \
	| gs -q -dBATCH -dSAFER -dFirstPage=${PAGENO} -dLastPage=${PAGENO} \
	-sDEVICE=ps2write -sOutputFile=$@ $<

# Generate a multi-page example document
# to extract a single page from.
psFileName.ps:
	seq ${TOTAL_PAGES} \
	| awk '\
	BEGIN {print ".ce 99\n.ps 24\n.vs 32"} \
	{print "Page", NR "\n.bp"} \
	END {print "Page", NR+1 "\n.ex"}' \
	| groff -Tps - > $@

# Convert Postscript to PDF
%.pdf: %.ps
	ps2pdf $< $@
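One quick way to sanity-check the files the Makefile above produces is to count the pages in each. The sketch below counts Document Structuring Conventions `%%Page:` comments; it assumes the file under test carries such comments, which may not hold for every tool's output, and the function name `count_ps_pages` is hypothetical.

```sh
#!/bin/sh
# Count the pages of a PostScript file by counting its DSC "%%Page:"
# comments.  Only meaningful for files that follow the Document
# Structuring Conventions.  (count_ps_pages is a hypothetical helper.)
count_ps_pages () {
    grep -c '^%%Page:' "$1"
}
```

With the MWE's defaults, `count_ps_pages psFileName.ps` would be expected to report the full document's page count, and each extracted file a single page, provided the extraction worked and the output is DSC-conforming.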
Re: Why does groff require psutils?
Hi Lukas,

At 2023-11-19T17:31:40+0100, Lukas Javorsky wrote:
> I've been approached by a maintainer of the `psutils` package that the
> groff is the only package that still requires it. He wants to get rid
> of the package as many of the dependencies have shifted from it, but
> the groff is still remaining
>
> I wanted to ask what is the exact reason groff needs to require
> `psutils`. I've found only a small mention in the README, however not
> much more:
>
> > Ghostscript is required for creation of PDF and (X)HTML output.
> > Production of (X)HTML furthermore demands tools from the 'netpbm'
> > and 'psutils' packages.
>
> Thank you for your answers.

This is not documented as well as it could be. I expected to find the
answer in the GNU Autoconf tests that groff uses, but unless one already
knows what binaries psutils ships, one can't find it.

Long story short: we need the "psselect" command.

Longer story. This appears to be the only psutils command we need.

$ cat -n m4/groff.m4|sed -n '173,191p'
   173  # grohtml needs the following programs to produce images from tbl(1)
   174  # tables and eqn(1) equations.
   175
   176  dnl Any macro that tests $use_grohtml should AC_REQUIRE this.
   177
   178  AC_DEFUN([GROFF_CHECK_GROHTML_PROGRAMS], [
   179    AC_REQUIRE([GROFF_GHOSTSCRIPT_PATH])
   180
   181    use_grohtml=no
   182    missing=
   183    m4_foreach([groff_prog],
   184  dnl Keep this list of programs in sync with grohtml test scripts.
   185      [[pnmcrop], [pnmcut], [pnmtopng], [pnmtops], [psselect]], [
   186      AC_CHECK_PROG(groff_prog, groff_prog, [found], [missing])
   187      if test $[]groff_prog = missing
   188      then
   189        missing="$missing 'groff_prog'"
   190      fi
   191    ])

I have a long-term notion to eliminate the pre-grohtml program (the
preprocessor for grohtml) altogether; that would eliminate psselect and
at least some of our dependency on netpbm. (I think we have another
dependency on netpbm.) But "long-term" may mean "years away".

What does it do?

psselect(1):
    Psselect selects pages from a PostScript document, creating a new
    PostScript file. The input PostScript file should follow the Adobe
    Document Structuring Conventions.

If someone knows of a replacement for psselect, possibly in a more
widely deployed and used package, we could conceivably transition to it.
This would require some testing.

Regards,
Branden