Arto Jonsson <ajons...@kapsi.fi> writes:

> On Fri, May 10, 2013 at 08:04:57AM +0100, Stuart Henderson wrote:
>> On 2013/05/10 13:18, Damien Miller wrote:
>> > On Wed, 8 May 2013, Ted Unangst wrote:
>> >
>> > > On Tue, Apr 30, 2013 at 18:57, Arto Jonsson wrote:
>> > > > Taken from netbsd with minor modifications. Comments?
>> > >
>> > > I don't think you've received much feedback. I don't know how other
>> > > developers feel, but the question I have is can't this be done with a
>> > > rather simple awk script? or perl? One of the reasons we have perl in
>> > > base is precisely so it can be used for things like this.
>> >
>> > This implementation has the benefits of being small, having existing
>> > maintainers (NetBSD) and already having been written and debugged. It
>> > seems like make-work to do it over in Perl.
>>
>> If we do use this implementation, then pascal@'s version from 2011 added
>> some fixes from FreeBSD,
>> http://comments.gmane.org/gmane.os.openbsd.tech/25740
>
> Here's an updated diff. Compared to the previous diff '-' is now handled
> as stdin. From the freebsd version I noticed that the previous diff also
> had useless exit() call which I removed. Comments?

Nitpicking inline...

> Index: Makefile
> ===================================================================
> RCS file: /cvs/src/usr.bin/Makefile,v
> retrieving revision 1.129
> diff -u -p -r1.129 Makefile
> --- Makefile 15 Mar 2013 06:01:41 -0000 1.129
> +++ Makefile 10 May 2013 14:09:23 -0000
> @@ -16,7 +16,7 @@ SUBDIR= apply apropos ar arch asa asn1_c
>  	m4 mail make man mandoc mesg mg \
>  	midiplay mixerctl mkdep mklocale mkstr mktemp modstat nc netstat \
>  	newsyslog \
> -	nfsstat nice nm nohup oldrdist pagesize passwd paste patch pctr \
> +	nfsstat nice nm nl nohup oldrdist pagesize passwd paste patch pctr \
>  	pkg-config pkill \
>  	pr printenv printf quota radioctl ranlib rcs rdist rdistd \
>  	readlink renice rev rpcgen rpcinfo rs rsh rup ruptime rusers rwall \
> Index: nl/Makefile
> ===================================================================
> RCS file: nl/Makefile
> diff -N nl/Makefile
> --- /dev/null 1 Jan 1970 00:00:00 -0000
> +++ nl/Makefile 10 May 2013 14:09:24 -0000
> @@ -0,0 +1,6 @@
> +# $OpenBSD$
> +# $NetBSD: Makefile,v 1.4 2011/08/16 12:00:46 christos Exp $
> +
> +PROG= nl
> +
> +.include <bsd.prog.mk>
> Index: nl/nl.1
> ===================================================================
> RCS file: nl/nl.1
> diff -N nl/nl.1
> --- /dev/null 1 Jan 1970 00:00:00 -0000
> +++ nl/nl.1 10 May 2013 14:09:24 -0000
> @@ -0,0 +1,212 @@
> +.\" $OpenBSD$
> +.\" $NetBSD: nl.1,v 1.12 2012/04/08 22:00:39 wiz Exp $
> +.\"
> +.\" Copyright (c) 1999 The NetBSD Foundation, Inc.
> +.\" All rights reserved.
> +.\"
> +.\" This code is derived from software contributed to The NetBSD Foundation
> +.\" by Klaus Klein.
> +.\"
> +.\" Redistribution and use in source and binary forms, with or without
> +.\" modification, are permitted provided that the following conditions
> +.\" are met:
> +.\" 1. Redistributions of source code must retain the above copyright
> +.\"    notice, this list of conditions and the following disclaimer.
> +.\" 2. Redistributions in binary form must reproduce the above copyright
> +.\"    notice, this list of conditions and the following disclaimer in the
> +.\"    documentation and/or other materials provided with the distribution.
> +.\"
> +.\" THIS SOFTWARE IS PROVIDED BY THE NETBSD FOUNDATION, INC. AND CONTRIBUTORS
> +.\" ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
> +.\" TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
> +.\" PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE FOUNDATION OR CONTRIBUTORS
> +.\" BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
> +.\" CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
> +.\" SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
> +.\" INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
> +.\" CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
> +.\" ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
> +.\" POSSIBILITY OF SUCH DAMAGE.
> +.\"
> +.Dd $Mdocdate$
> +.Dt NL 1
> +.Os
> +.Sh NAME
> +.Nm nl
> +.Nd line numbering filter
> +.Sh SYNOPSIS
> +.Nm
> +.Op Fl p
> +.Op Fl b Ar type
> +.Op Fl d Ar delim
> +.Op Fl f Ar type
> +.Op Fl h Ar type
> +.Op Fl i Ar incr
> +.Op Fl l Ar num
> +.Op Fl n Ar format
> +.Op Fl s Ar sep
> +.Op Fl v Ar startnum
> +.Op Fl w Ar width
> +.Op Ar file
> +.Sh DESCRIPTION
> +The
> +.Nm
> +utility reads lines from the named
> +.Ar file
> +or the standard input if the
> +.Ar file
> +argument is omitted,

Since the upstream version doesn't treat `-' specially, I'd add:

  argument is a single dash
  .Pq Sq \&-
  or absent,

- as in cat(1) - to make that clear.

> +applies a configurable line numbering filter operation and writes the result
> +to the standard output.
> +.Pp
> +The
> +.Nm
> +utility treats the text it reads in terms of logical pages.
> +Unless specified otherwise, line numbering is reset at the start of each
> +logical page.
> +A logical page consists of a header, a body and a footer section; empty
> +sections are valid.
> +Different line numbering options are independently available for header,
> +body and footer sections.
> +.Pp
> +The starts of logical page sections are signaled by input lines containing
> +nothing but one of the following sequences of delimiter characters:
> +.Bd -unfilled -offset indent
> +.Bl -column "\e:\e:\e: " "header "
> +.It Em "Line"	"Start of"
> +.It \e:\e:\e:	header
> +.It \e:\e:	body
> +.It \e:	footer
> +.El
> +.Ed
> +.Pp
> +If the input does not contain any logical page section signaling directives,
> +the text being read is assumed to consist of a single logical page body.
> +.Pp
> +The following options are available:
> +.Bl -tag -width indent
> +.It Fl b Ar type
> +Specify the logical page body lines to be numbered.
> +Recognized
> +.Ar type
> +arguments are:
> +.Bl -tag -width pstringXX
> +.It a
> +Number all lines.
> +.It t
> +Number only non-empty lines.
> +.It n
> +No line numbering.
> +.It p Ns Ar expr
> +Number only those lines that contain the basic regular expression specified
> +by
> +.Ar expr .
> +.El
> +.Pp
> +The default
> +.Ar type
> +for logical page body lines is t.
> +.It Fl d Ar delim
> +Specify the delimiter characters used to indicate the start of a logical
> +page section in the input file.
> +At most two characters may be specified; if only one character is specified,
> +the first character is replaced and the second character remains unchanged.
> +The default
> +.Ar delim
> +characters are ``\e:''.
> +.It Fl f Ar type
> +Specify the same as
> +.Fl b Ar type
> +except for logical page footer lines.
> +The default
> +.Ar type
> +for logical page footer lines is n.
> +.It Fl h Ar type
> +Specify the same as
> +.Fl b Ar type
> +except for logical page header lines.
> +The default
> +.Ar type
> +for logical page header lines is n.
> +.It Fl i Ar incr
> +Specify the increment value used to number logical page lines.
> +The default
> +.Ar incr
> +value is 1.
> +.It Fl l Ar num
> +If numbering of all lines is specified for the current logical section
> +using the corresponding
> +.Fl b
> +a,
> +.Fl f
> +a
> +or
> +.Fl h
> +a
> +option,
> +specify the number of adjacent blank lines to be considered as one.
> +For example,
> +.Fl l
> +2 results in only the second adjacent blank line being numbered.
> +The default
> +.Ar num
> +value is 1.
> +.It Fl n Ar format
> +Specify the line numbering output format.
> +Recognized
> +.Ar format
> +arguments are:
> +.Bl -tag -width lnXX -compact
> +.It ln
> +Left justified.
> +.It rn
> +Right justified, leading zeros suppressed.
> +.It rz
> +Right justified, leading zeros kept.
> +.El
> +.Pp
> +The default
> +.Ar format
> +is rn.
> +.It Fl p
> +Specify that line numbering should not be restarted at logical page delimiters.
> +.It Fl s Ar sep
> +Specify the characters used in separating the line number and the corresponding
> +text line.
> +The default
> +.Ar sep
> +setting is a single tab character.
> +.It Fl v Ar startnum
> +Specify the initial value used to number logical page lines; see also the
> +description of the
> +.Fl p
> +option.
> +The default
> +.Ar startnum
> +value is 1.
> +.It Fl w Ar width
> +Specify the number of characters to be occupied by the line number;
> +in case the
> +.Ar width
> +is insufficient to hold the line number, it will be truncated to its
> +.Ar width
> +least significant digits.
> +The default
> +.Ar width
> +is 6.
> +.El
> +.Sh EXIT STATUS
> +.Ex -std
> +.Sh SEE ALSO
> +.Xr pr 1
> +.Sh STANDARDS
> +The
> +.Nm
> +utility is compliant with the
> +.St -p1003.1-2008
> +specification.
> +.Sh HISTORY
> +The
> +.Nm
> +utility first appeared in
> +.At V.2 .
> Index: nl/nl.c
> ===================================================================
> RCS file: nl/nl.c
> diff -N nl/nl.c
> --- /dev/null 1 Jan 1970 00:00:00 -0000
> +++ nl/nl.c 10 May 2013 14:09:24 -0000
> @@ -0,0 +1,384 @@
> +/* $OpenBSD$ */
> +/* $NetBSD: nl.c,v 1.11 2011/08/16 12:00:46 christos Exp $ */
> +
> +/*-
> + * Copyright (c) 1999 The NetBSD Foundation, Inc.
> + * All rights reserved.
> + *
> + * This code is derived from software contributed to The NetBSD Foundation
> + * by Klaus Klein.
> + *
> + * Redistribution and use in source and binary forms, with or without
> + * modification, are permitted provided that the following conditions
> + * are met:
> + * 1. Redistributions of source code must retain the above copyright
> + *    notice, this list of conditions and the following disclaimer.
> + * 2. Redistributions in binary form must reproduce the above copyright
> + *    notice, this list of conditions and the following disclaimer in the
> + *    documentation and/or other materials provided with the distribution.
> + *
> + * THIS SOFTWARE IS PROVIDED BY THE NETBSD FOUNDATION, INC. AND CONTRIBUTORS
> + * ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
> + * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
> + * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE FOUNDATION OR CONTRIBUTORS
> + * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
> + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
> + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
> + * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
> + * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
> + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
> + * POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#include <sys/cdefs.h>
> +
> +#include <err.h>
> +#include <limits.h>
> +#include <locale.h>
> +#include <regex.h>
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <string.h>
> +#include <unistd.h>
> +
> +typedef enum {
> +	number_all, /* number all lines */
> +	number_nonempty, /* number non-empty lines */
> +	number_none, /* no line numbering */
> +	number_regex /* number lines matching regular expression */
> +} numbering_type;
> +
> +struct numbering_property {
> +	const char * const name; /* for diagnostics */
> +	numbering_type type; /* numbering type */
> +	regex_t expr; /* for type == number_regex */
> +};
> +
> +/* line numbering formats */
> +#define FORMAT_LN "%-*d" /* left justified, leading zeros suppressed */
> +#define FORMAT_RN "%*d" /* right justified, leading zeros suppressed */
> +#define FORMAT_RZ "%0*d" /* right justified, leading zeros kept */
> +
> +#define FOOTER 0
> +#define BODY 1
> +#define HEADER 2
> +#define NP_LAST HEADER
> +
> +static struct numbering_property numbering_properties[NP_LAST + 1] = {
> +	{ "footer", number_none, { 0, 0, 0, 0 } },
> +	{ "body", number_nonempty, { 0, 0, 0, 0 } },
> +	{ "header", number_none, { 0, 0, 0, 0 } },
> +};
> +
> +#define max(a, b) ((a) > (b) ? (a) : (b))
> +
> +/*
> + * Maximum number of characters required for a decimal representation of a
> + * (signed) int; courtesy of tzcode.
> + */
> +#define INT_STRLEN_MAXIMUM \
> +	((sizeof (int) * CHAR_BIT - 1) * 302 / 1000 + 2)
> +
> +static void filter(void);
> +static void parse_numbering(const char *, int);
> +static __dead void usage(void);

"static" here (and at definition time) makes cc -O2 inline the
functions, this makes debugging harder should it be needed and brings no
speed benefit.

> +
> +/*
> + * Pointer to dynamically allocated input line buffer, and its size.
> + */
> +static char *buffer;
> +static size_t buffersize;
> +
> +/*
> + * Dynamically allocated buffer suitable for string representation of ints.
> + */
> +static char *intbuffer;
> +static size_t intbuffersize;
> +
> +/*
> + * Configurable parameters.
> + */
> +/* delimiter characters that indicate the start of a logical page section */
> +static char delim[2] = { '\\', ':' };
> +
> +/* line numbering format */
> +static const char *format = FORMAT_RN;
> +
> +/* increment value used to number logical page lines */
> +static int incr = 1;
> +
> +/* number of adjacent blank lines to be considered (and numbered) as one */
> +static unsigned int nblank = 1;
> +
> +/* whether to restart numbering at logical page delimiters */
> +static int restart = 1;
> +
> +/* characters used in separating the line number and the corrsp. text line */
> +static const char *sep = "\t";
> +
> +/* initial value used to number logical page lines */
> +static int startnum = 1;
> +
> +/* number of characters to be used for the line number */
> +/* should be unsigned but required signed by `*' precision conversion */
> +static int width = 6;
> +
> +
> +int
> +main(int argc, char *argv[])
> +{
> +	int c;
> +	long val;
> +	const char *errstr;
> +
> +	(void)setlocale(LC_ALL, "");
> +
> +	/*
> +	 * Note: this implementation strictly conforms to the XBD Utility
> +	 * Syntax Guidelines and does not permit the optional `file' operand
> +	 * to be intermingled with the options, which is defined in the
> +	 * XCU specification (Issue 5) but declared an obsolescent feature that
> +	 * will be removed from a future issue. It shouldn't matter, though.
> +	 */

I'm not sure that comment is useful.

> +	while ((c = getopt(argc, argv, "pb:d:f:h:i:l:n:s:v:w:")) != -1) {
> +		switch (c) {
> +		case 'p':
> +			restart = 0;
> +			break;
> +		case 'b':
> +			parse_numbering(optarg, BODY);
> +			break;
> +		case 'd':
> +			if (optarg[0] != '\0')
> +				delim[0] = optarg[0];
> +			if (optarg[1] != '\0')
> +				delim[1] = optarg[1];
> +			/* at most two delimiter characters */
> +			if (optarg[2] != '\0') {
> +				errx(EXIT_FAILURE,
> +				    "invalid delim argument -- %s",
> +				    optarg);
> +				/* NOTREACHED */
> +			}
> +			break;
> +		case 'f':
> +			parse_numbering(optarg, FOOTER);
> +			break;
> +		case 'h':
> +			parse_numbering(optarg, HEADER);
> +			break;
> +		case 'i':
> +			incr = strtonum(optarg, INT_MIN, INT_MAX, &errstr);
> +			if (errstr)
> +				errx(EXIT_FAILURE, "increment value is %s: %s",
> +				    errstr, optarg);
> +			break;
> +		case 'l':
> +			nblank = strtonum(optarg, 0, UINT_MAX, &errstr);
> +			if (errstr)
> +				errx(EXIT_FAILURE,
> +				    "blank line value is %s: %s",
> +				    errstr, optarg);
> +			break;
> +		case 'n':
> +			if (strcmp(optarg, "ln") == 0) {
> +				format = FORMAT_LN;
> +			} else if (strcmp(optarg, "rn") == 0) {
> +				format = FORMAT_RN;
> +			} else if (strcmp(optarg, "rz") == 0) {
> +				format = FORMAT_RZ;
> +			} else
> +				errx(EXIT_FAILURE,
> +				    "illegal format -- %s", optarg);
> +			break;
> +		case 's':
> +			sep = optarg;
> +			break;
> +		case 'v':
> +			startnum = strtonum(optarg, INT_MIN, INT_MAX, &errstr);
> +			if (errstr)
> +				errx(EXIT_FAILURE,
> +				    "initial logical page value is %s: %s",
> +				    errstr, optarg);
> +			break;
> +		case 'w':
> +			width = strtonum(optarg, 1, INT_MAX, &errstr);
> +			if (errstr)
> +				errx(EXIT_FAILURE, "width is %s: %s", errstr,
> +				    optarg);
> +			break;
> +		case '?':
> +		default:
> +			usage();
> +			/* NOTREACHED */
> +		}
> +	}
> +	argc -= optind;
> +	argv += optind;
> +
> +	switch (argc) {
> +	case 0:
> +		break;
> +	case 1:
> +		if (strcmp(argv[0], "-") != 0 &&
> +		    freopen(argv[0], "r", stdin) == NULL)
> +			err(EXIT_FAILURE, "Cannot open `%s'", argv[0]);
> +		break;
> +	default:
> +		usage();
> +		/* NOTREACHED */
> +	}
> +
> +	/* Determine the maximum input line length to operate on. */
> +	if ((val = sysconf(_SC_LINE_MAX)) == -1) /* ignore errno */
> +		val = LINE_MAX;
> +	/* Allocate sufficient buffer space (including the terminating NUL). */
> +	buffersize = (size_t)val + 1;
> +	if ((buffer = malloc(buffersize)) == NULL)
> +		err(EXIT_FAILURE, "Cannot allocate input line buffer");
> +
> +	/* Allocate a buffer suitable for preformatting line number. */
> +	intbuffersize = max((int)INT_STRLEN_MAXIMUM, width) + 1; /* NUL */
> +	if ((intbuffer = malloc(intbuffersize)) == NULL)
> +		err(EXIT_FAILURE, "cannot allocate preformatting buffer");
> +
> +	/* Do the work. */
> +	filter();
> +
> +	return EXIT_SUCCESS;
> +	/* NOTREACHED */
> +}
> +
> +static void
> +filter(void)
> +{
> +	int line; /* logical line number */
> +	int section; /* logical page section */
> +	unsigned int adjblank; /* adjacent blank lines */
> +	int consumed; /* intbuffer measurement */
> +	int donumber, idx;
> +
> +	adjblank = 0;
> +	line = startnum;
> +	section = BODY;
> +
> +	while (fgets(buffer, (int)buffersize, stdin) != NULL) {
> +		for (idx = FOOTER; idx <= NP_LAST; idx++) {
> +			/* Does it look like a delimiter? */
> +			if (buffer[2 * idx + 0] == delim[0] &&
> +			    buffer[2 * idx + 1] == delim[1]) {
> +				/* Was this the whole line? */
> +				if (buffer[2 * idx + 2] == '\n') {
> +					section = idx;
> +					adjblank = 0;
> +					if (restart)
> +						line = startnum;
> +					goto nextline;
> +				}
> +			} else {
> +				break;
> +			}
> +		}
> +
> +		switch (numbering_properties[section].type) {
> +		case number_all:
> +			/*
> +			 * Doing this for number_all only is disputable, but
> +			 * the standard expresses an explicit dependency on
> +			 * `-b a' etc.
> +			 */
> +			if (buffer[0] == '\n' && ++adjblank < nblank)
> +				donumber = 0;
> +			else
> +				donumber = 1, adjblank = 0;
> +			break;
> +		case number_nonempty:
> +			donumber = (buffer[0] != '\n');
> +			break;
> +		case number_none:
> +			donumber = 0;
> +			break;
> +		case number_regex:
> +			donumber =
> +			    (regexec(&numbering_properties[section].expr,
> +			    buffer, 0, NULL, 0) == 0);
> +			break;

What about a default case here, to make WARNINGS=Yes shut up?

> +		}
> +
> +		if (donumber) {
> +			consumed = snprintf(intbuffer, intbuffersize, format,
> +			    width, line);
> +			(void)printf("%s",
> +			    intbuffer + max(0, consumed - width));
> +			line += incr;
> +		} else {
> +			(void)printf("%*s", width, "");
> +		}
> +		(void)printf("%s%s", sep, buffer);
> +
> +		if (ferror(stdout))
> +			err(EXIT_FAILURE, "output error");
> +nextline:
> +		;
> +	}
> +
> +	if (ferror(stdin))
> +		err(EXIT_FAILURE, "input error");
> +}
> +
> +/*
> + * Various support functions.
> + */
> +
> +static void
> +parse_numbering(const char *argstr, int section)
> +{
> +	int error;
> +	char errorbuf[NL_TEXTMAX];
> +
> +	switch (argstr[0]) {
> +	case 'a':
> +		numbering_properties[section].type = number_all;
> +		break;
> +	case 'n':
> +		numbering_properties[section].type = number_none;
> +		break;
> +	case 't':
> +		numbering_properties[section].type = number_nonempty;
> +		break;
> +	case 'p':
> +		/* If there was a previous expression, throw it away. */
> +		if (numbering_properties[section].type == number_regex)
> +			regfree(&numbering_properties[section].expr);
> +		else
> +			numbering_properties[section].type = number_regex;
> +
> +		/* Compile/validate the supplied regular expression. */
> +		if ((error = regcomp(&numbering_properties[section].expr,
> +		    &argstr[1], REG_NEWLINE|REG_NOSUB)) != 0) {
> +			(void)regerror(error,
> +			    &numbering_properties[section].expr,
> +			    errorbuf, sizeof (errorbuf));
> +			errx(EXIT_FAILURE,
> +			    "%s expr: %s -- %s",
> +			    numbering_properties[section].name, errorbuf,
> +			    &argstr[1]);
> +		}
> +		break;
> +	default:
> +		errx(EXIT_FAILURE,
> +		    "illegal %s line numbering type -- %s",
> +		    numbering_properties[section].name, argstr);
> +	}
> +}
> +
> +static __dead void
> +usage(void)
> +{
> +	extern char *__progname;
> +
> +	(void)fprintf(stderr, "usage: %s [-p] [-b type] [-d delim] [-f type] "
> +	    "[-h type] [-i incr] [-l num]\n\t[-n format] [-s sep] "
> +	    "[-v startnum] [-w width] [file]\n", __progname);
> +	exit(EXIT_FAILURE);
> +}
>

--
Jérémie Courrèges-Anglas
PGP Key fingerprint: 61DB D9A0 00A4 67CF 2A90 8961 6191 8FBF 06A1 1494
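
PS: to illustrate the default-case remark in filter(), here is a rough
standalone sketch, not a tested patch. should_number() is a hypothetical
helper that does not exist in the diff; the -l blank-line counting and the
regexec(3) call are left out to keep it short. Falling back to "do not
number" in the default case is only an assumption; calling errx() or
abort() there would be just as defensible.

```c
#include <assert.h>
#include <stddef.h>

/* Same enumerators as in the diff. */
typedef enum {
	number_all,
	number_nonempty,
	number_none,
	number_regex
} numbering_type;

/* Hypothetical helper, not part of the diff. */
int should_number(numbering_type, const char *);

/*
 * Decide whether a line gets a number.  Simplified: number_all ignores
 * the adjacent-blank-line logic and number_regex never matches here.
 */
int
should_number(numbering_type type, const char *line)
{
	int donumber;

	assert(line != NULL);

	switch (type) {
	case number_all:
		donumber = 1;
		break;
	case number_nonempty:
		donumber = (line[0] != '\n');
		break;
	case number_none:
		donumber = 0;
		break;
	case number_regex:
		/* the diff calls regexec(3) here; omitted in this sketch */
		donumber = 0;
		break;
	default:
		/*
		 * Not reachable with a valid numbering_type, but it gives
		 * donumber a value on every path (assumed fallback).
		 */
		donumber = 0;
		break;
	}
	return donumber;
}
```

With the default case in place every path through the switch assigns
donumber, which is the kind of thing the compiler tends to complain about
otherwise.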