CVS commit: src/lib/libc/regex

2024-09-24 Thread Valery Ushakov
Module Name:    src
Committed By:   uwe
Date:           Tue Sep 24 14:10:44 UTC 2024

Modified Files:
        src/lib/libc/regex: regex.3

Log Message:
regex.3: brush up markup

Use \N for backreferences in both places for consistency and to make
it more obviously different from \n (besides the different fonts, that
might not be too obvious).

Use \&. instead of .\& to protect punctuation.


To generate a diff of this commit:
cvs rdiff -u -r1.34 -r1.35 src/lib/libc/regex/regex.3

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.

Modified files:

Index: src/lib/libc/regex/regex.3
diff -u src/lib/libc/regex/regex.3:1.34 src/lib/libc/regex/regex.3:1.35
--- src/lib/libc/regex/regex.3:1.34	Sun Sep 22 00:22:08 2024
+++ src/lib/libc/regex/regex.3	Tue Sep 24 14:10:43 2024
@@ -1,4 +1,4 @@
-.\" $NetBSD: regex.3,v 1.34 2024/09/22 00:22:08 christos Exp $
+.\" $NetBSD: regex.3,v 1.35 2024/09/24 14:10:43 uwe Exp $
 .\"
 .\" Copyright (c) 1992, 1993, 1994 Henry Spencer.
 .\" Copyright (c) 1992, 1993, 1994
@@ -37,6 +37,7 @@
 .Dd September 21, 2024
 .Dt REGEX 3
 .Os
+.
 .Sh NAME
 .Nm regcomp ,
 .Nm regexec ,
@@ -44,36 +45,64 @@
 .Nm regfree ,
 .Nm regasub ,
 .Nm regnsub
+.
 .Nd regular-expression library
+.
 .Sh LIBRARY
 .Lb libc
+.
 .Sh SYNOPSIS
+.
 .In regex.h
+.
 .Ft int
 .Fo regcomp
-.Fa "regex_t * restrict preg" "const char * restrict pattern" "int cflags"
+.Fa "regex_t * restrict preg"
+.Fa "const char * restrict pattern"
+.Fa "int cflags"
 .Fc
+.
 .Ft int
 .Fo regexec
-.Fa "const regex_t * restrict preg" "const char * restrict string"
-.Fa "size_t nmatch" "regmatch_t pmatch[restrict]" "int eflags"
+.Fa "const regex_t * restrict preg"
+.Fa "const char * restrict string"
+.Fa "size_t nmatch"
+.Fa "regmatch_t pmatch[restrict]"
+.Fa "int eflags"
 .Fc
 .Ft size_t
 .Fo regerror
-.Fa "int errcode" "const regex_t * restrict preg"
-.Fa "char * restrict errbuf" "size_t errbuf_size"
+.Fa "int errcode"
+.Fa "const regex_t * restrict preg"
+.Fa "char * restrict errbuf"
+.Fa "size_t errbuf_size"
 .Fc
+.
 .Ft void
 .Fn regfree "regex_t *preg"
+.
 .Ft ssize_t
-.Fn regnsub "char *buf" "size_t bufsiz" "const char *sub" "const regmatch_t *rm" "const char *str"
+.Fo regnsub
+.Fa "char *buf"
+.Fa "size_t bufsiz"
+.Fa "const char *sub"
+.Fa "const regmatch_t *rm"
+.Fa "const char *str"
+.Fc
+.
 .Ft ssize_t
-.Fn regasub "char **buf" "const char *sub" "const regmatch_t *rm" "const char *sstr"
+.Fo regasub
+.Fa "char **buf"
+.Fa "const char *sub"
+.Fa "const regmatch_t *rm"
+.Fa "const char *sstr"
+.Fc
+.
 .Sh DESCRIPTION
 These routines implement
 .St -p1003.2
 regular expressions
-.Pq Do RE Dc Ns s ;
+.Pq Do Tn RE Dc Ns s ;
 see
 .Xr re_format 7 .
 The
@@ -100,7 +129,7 @@ It also declares the four functions,
 a type
 .Ft regoff_t ,
 and a number of constants with names starting with
-.Dq Dv REG_ .
+.Ql REG_ .
 .Pp
 The
 .Fn regcomp
@@ -117,8 +146,11 @@ structure pointed to by
 The
 .Fa cflags
 argument
-is the bitwise OR of zero or more of the following flags:
-.Bl -tag -width REG_EXTENDED
+is the bitwise
+.Em or
+of zero or more of the following flags:
+.Bl -tag -width Dv
+.
 .It Dv REG_EXTENDED
 Compile modern
 .Pq Dq extended
@@ -127,11 +159,13 @@ rather than the obsolete
 .Pq Dq basic
 REs that
 are the default.
+.
 .It Dv REG_BASIC
 This is a synonym for 0,
 provided as a counterpart to
 .Dv REG_EXTENDED
 to improve readability.
+.
 .It Dv REG_NOSPEC
 Compile with recognition of all special characters turned off.
 All characters are thus considered ordinary,
@@ -149,42 +183,49 @@ and
 may not be used
 in the same call to
 .Fn regcomp .
+.
 .It Dv REG_ICASE
-Compile for matching that ignores upper/lower case distinctions.
+Compile for matching that ignores upper\|/\^lower case distinctions.
 See
 .Xr re_format 7 .
+.
 .It Dv REG_NOSUB
 Compile for matching that need only report success or failure,
 not what was matched.
+.
 .It Dv REG_NEWLINE
 Compile for newline-sensitive matching.
 By default, newline is a completely ordinary character with no special
 meaning in either REs or strings.
 With this flag,
-.Ql [^
+.Ql \&[^
 bracket expressions and
-.Ql .\&
+.Ql \&.
 never match newline,
 a
-.Ql ^\&
+.Ql \&^
 anchor matches the null string after any newline in the string
 in addition to its normal function,
 and the
-.Ql $\&
+.Ql \&$
 anchor matches the null string before any newline in the
 string in addition to its normal function.
+.
 .It Dv REG_PEND
 The regular expression ends,
-not at the first NUL,
+not at the first
+.Tn NUL ,
 but just before the character pointed to by the
-.Va re_endp
+.Fa re_endp
 member of the structure pointed to by
 .Fa preg .
 The
-.Va re_endp
+.Fa re_endp
 member is of type
 .Ft "const char *" .
-This flag permits inclusion of NULs in the RE;
+This flag permits inclusion of
+.Tn NUL Ns s
+in the RE;
 they are considered ordinary characters.
 This is an extension,
 compatible with but not specified by
@@ -194,43 +235,44 @@ caution in software i
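
The cflags text touched by the diff above (flags combined with bitwise or, REG_NOSUB, REG_ICASE, REG_NEWLINE) can be illustrated with a small sketch. The helper name re_matches is hypothetical and not part of libc; only the standard regcomp(3) interface is assumed.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of libc): return 1 if `pattern` matches
 * somewhere in `string`, 0 if it does not, and -1 if the pattern fails
 * to compile or regexec() reports some other error.
 */
int
re_matches(const char *pattern, const char *string, int cflags)
{
	regex_t re;
	int rc;

	assert(pattern != NULL && string != NULL);

	/* REG_NOSUB: only success or failure is wanted, not offsets. */
	if (regcomp(&re, pattern, cflags | REG_NOSUB) != 0)
		return -1;
	rc = regexec(&re, string, 0, NULL, 0);
	regfree(&re);
	if (rc == 0)
		return 1;
	return rc == REG_NOMATCH ? 0 : -1;
}
```

With this helper, re_matches("^b", "a\nb", REG_EXTENDED) reports no match because newline is an ordinary character by default, while adding REG_NEWLINE to the flags lets the anchor match after the newline.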

CVS commit: src/lib/libc/regex

2024-09-21 Thread Christos Zoulas
Module Name:    src
Committed By:   christos
Date:           Sun Sep 22 00:22:09 UTC 2024

Modified Files:
        src/lib/libc/regex: regex.3

Log Message:
Fix section header (Anonymous)


To generate a diff of this commit:
cvs rdiff -u -r1.33 -r1.34 src/lib/libc/regex/regex.3

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.

Modified files:

Index: src/lib/libc/regex/regex.3
diff -u src/lib/libc/regex/regex.3:1.33 src/lib/libc/regex/regex.3:1.34
--- src/lib/libc/regex/regex.3:1.33	Sat Dec  3 20:29:32 2022
+++ src/lib/libc/regex/regex.3	Sat Sep 21 20:22:08 2024
@@ -1,4 +1,4 @@
-.\" $NetBSD: regex.3,v 1.33 2022/12/04 01:29:32 uwe Exp $
+.\" $NetBSD: regex.3,v 1.34 2024/09/22 00:22:08 christos Exp $
 .\"
 .\" Copyright (c) 1992, 1993, 1994 Henry Spencer.
 .\" Copyright (c) 1992, 1993, 1994
@@ -34,7 +34,7 @@
 .\"	@(#)regex.3	8.4 (Berkeley) 3/20/94
 .\" $FreeBSD: head/lib/libc/regex/regex.3 363817 2020-08-04 02:06:49Z kevans $
 .\"
-.Dd March 11, 2021
+.Dd September 21, 2024
 .Dt REGEX 3
 .Os
 .Sh NAME
@@ -260,7 +260,7 @@ If
 .Fn regcomp
 fails, it returns a non-zero error code;
 see
-.Sx DIAGNOSTICS .
+.Sx RETURN VALUES .
 .Pp
 The
 .Fn regexec
@@ -375,7 +375,7 @@ returns 0 for success and the non-zero c
 for failure.
 Other non-zero error codes may be returned in exceptional situations;
 see
-.Sx DIAGNOSTICS .
+.Sx RETURN VALUES .
 .Pp
 If
 .Dv REG_NOSUB



CVS commit: src/lib/libc/regex

2023-08-30 Thread Christos Zoulas
Module Name:    src
Committed By:   christos
Date:           Wed Aug 30 20:37:24 UTC 2023

Modified Files:
        src/lib/libc/regex: regcomp.c

Log Message:
- cast GETNEXT to unsigned where it is being promoted to int to prevent
  sign-extension (really it would have been better for PEEK*() and GETNEXT()
  to return unsigned char; this would have removed a ton of (uch) casts, but
  it is too intrusive for now).
- fix an isalpha that should have been iswalpha
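
The two points above can be sketched in isolation. The helpers below are hypothetical and are not code from regcomp.c; signed char is used on purpose so that the result does not depend on whether plain char is signed on a given platform, and 8-bit bytes are assumed.

```c
#include <assert.h>
#include <ctype.h>
#include <wchar.h>
#include <wctype.h>

/* Promotion without a cast: a byte >= 0x80 turns into a negative int. */
int
widen_plain(signed char c)
{
	return c;
}

/* Promotion through unsigned char: the result is never negative. */
int
widen_uch(signed char c)
{
	return (unsigned char)c;
}

/* The <ctype.h> functions need an unsigned char value (or EOF). */
int
is_digit_byte(signed char c)
{
	int v = (unsigned char)c;

	assert(v >= 0);
	return isdigit(v) != 0;
}

/* Wide-character values belong with the isw*() family, not isalpha(). */
int
is_wide_alpha(wint_t ch)
{
	return iswalpha(ch) != 0;
}
```

For a byte with the value -23 (0xE9), widen_plain() yields -23, whereas widen_uch() yields 233, which is the kind of difference the (uch) casts are there to avoid.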


To generate a diff of this commit:
cvs rdiff -u -r1.47 -r1.48 src/lib/libc/regex/regcomp.c

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.

Modified files:

Index: src/lib/libc/regex/regcomp.c
diff -u src/lib/libc/regex/regcomp.c:1.47 src/lib/libc/regex/regcomp.c:1.48
--- src/lib/libc/regex/regcomp.c:1.47	Wed Dec 21 12:44:15 2022
+++ src/lib/libc/regex/regcomp.c	Wed Aug 30 16:37:24 2023
@@ -1,4 +1,4 @@
-/*	$NetBSD: regcomp.c,v 1.47 2022/12/21 17:44:15 wiz Exp $	*/
+/*	$NetBSD: regcomp.c,v 1.48 2023/08/30 20:37:24 christos Exp $	*/
 
 /*-
  * SPDX-License-Identifier: BSD-3-Clause
@@ -51,7 +51,7 @@
 static char sccsid[] = "@(#)regcomp.c	8.5 (Berkeley) 3/20/94";
 __FBSDID("$FreeBSD: head/lib/libc/regex/regcomp.c 368359 2020-12-05 03:18:48Z kevans $");
 #endif
-__RCSID("$NetBSD: regcomp.c,v 1.47 2022/12/21 17:44:15 wiz Exp $");
+__RCSID("$NetBSD: regcomp.c,v 1.48 2023/08/30 20:37:24 christos Exp $");
 
 #ifndef LIBHACK
 #define REGEX_GNU_EXTENSIONS
@@ -898,10 +898,10 @@ p_simp_re(struct parse *p, struct branch
 	handled = false;
 
 	assert(MORE());		/* caller should have ensured this */
-	c = GETNEXT();
+	c = (uch)GETNEXT();
 	if (c == '\\') {
 		(void)REQUIRE(MORE(), REG_EESCAPE);
-		cc = GETNEXT();
+		cc = (uch)GETNEXT();
 		c = BACKSL | cc;
 #ifdef REGEX_GNU_EXTENSIONS
 		if (p->gnuext) {
@@ -1083,7 +1083,7 @@ p_count(struct parse *p)
 	int ndigits = 0;
 
 	while (MORE() && isdigit((uch)PEEK()) && count <= DUPMAX) {
-		count = count*10 + (GETNEXT() - '0');
+		count = count*10 + ((uch)GETNEXT() - '0');
 		ndigits++;
 	}
 
@@ -1422,7 +1422,7 @@ may_escape(struct parse *p, const wint_t
 
 	if ((p->pflags & PFLAG_LEGACY_ESC) != 0)
 		return (true);
-	if (isalpha(ch) || ch == '\'' || ch == '`')
+	if (iswalpha(ch) || ch == '\'' || ch == '`')
 		return (false);
 	return (true);
 #ifdef NOTYET



CVS commit: src/lib/libc/regex

2022-12-21 Thread Thomas Klausner
Module Name:    src
Committed By:   wiz
Date:           Wed Dec 21 17:44:15 UTC 2022

Modified Files:
        src/lib/libc/regex: regcomp.c

Log Message:
Remove unneeded -D_OPENBSD_SOURCE


To generate a diff of this commit:
cvs rdiff -u -r1.46 -r1.47 src/lib/libc/regex/regcomp.c

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.

Modified files:

Index: src/lib/libc/regex/regcomp.c
diff -u src/lib/libc/regex/regcomp.c:1.46 src/lib/libc/regex/regcomp.c:1.47
--- src/lib/libc/regex/regcomp.c:1.46	Thu Mar 11 15:00:29 2021
+++ src/lib/libc/regex/regcomp.c	Wed Dec 21 17:44:15 2022
@@ -1,4 +1,4 @@
-/*	$NetBSD: regcomp.c,v 1.46 2021/03/11 15:00:29 christos Exp $	*/
+/*	$NetBSD: regcomp.c,v 1.47 2022/12/21 17:44:15 wiz Exp $	*/
 
 /*-
  * SPDX-License-Identifier: BSD-3-Clause
@@ -51,9 +51,7 @@
 static char sccsid[] = "@(#)regcomp.c	8.5 (Berkeley) 3/20/94";
 __FBSDID("$FreeBSD: head/lib/libc/regex/regcomp.c 368359 2020-12-05 03:18:48Z kevans $");
 #endif
-__RCSID("$NetBSD: regcomp.c,v 1.46 2021/03/11 15:00:29 christos Exp $");
-
-#define _OPENBSD_SOURCE
+__RCSID("$NetBSD: regcomp.c,v 1.47 2022/12/21 17:44:15 wiz Exp $");
 
 #ifndef LIBHACK
 #define REGEX_GNU_EXTENSIONS



CVS commit: src/lib/libc/regex

2022-12-04 Thread Valeriy E. Ushakov
Module Name:    src
Committed By:   uwe
Date:           Sun Dec  4 16:52:49 UTC 2022

Modified Files:
        src/lib/libc/regex: re_format.7

Log Message:
re_format(7): Add subsection headings for ERE and BRE

The first paragraph could use some rewording.  While BRE may be
obsolete, it's still the default for regcomp(3) and the default for
grep(1), sed(1), etc.
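
As a rough illustration of BRE being the default for regcomp(3), the sketch below compiles the same pattern with cflags 0 (basic) and with REG_EXTENDED. The helper name first_match_len is hypothetical; the only assumption is the standard regcomp(3)/regexec(3) interface.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper: length of the first match of `pattern` in
 * `string`, or -1 if there is no match or the pattern does not compile.
 */
long
first_match_len(const char *pattern, const char *string, int cflags)
{
	regex_t re;
	regmatch_t m[1];
	long len = -1;

	assert((cflags & REG_NOSUB) == 0);	/* offsets are needed here */

	if (regcomp(&re, pattern, cflags) != 0)
		return -1;
	if (regexec(&re, string, 1, m, 0) == 0)
		len = (long)(m[0].rm_eo - m[0].rm_so);
	regfree(&re);
	return len;
}
```

With the pattern "a+" and the string "aaa+", REG_EXTENDED treats + as a repetition and matches "aaa", while the default (basic) syntax treats + as an ordinary character and matches the literal "a+" near the end.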


To generate a diff of this commit:
cvs rdiff -u -r1.15 -r1.16 src/lib/libc/regex/re_format.7

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.

Modified files:

Index: src/lib/libc/regex/re_format.7
diff -u src/lib/libc/regex/re_format.7:1.15 src/lib/libc/regex/re_format.7:1.16
--- src/lib/libc/regex/re_format.7:1.15	Sun Aug 28 12:59:50 2022
+++ src/lib/libc/regex/re_format.7	Sun Dec  4 16:52:48 2022
@@ -1,4 +1,4 @@
-.\" $NetBSD: re_format.7,v 1.15 2022/08/28 12:59:50 uwe Exp $
+.\" $NetBSD: re_format.7,v 1.16 2022/12/04 16:52:48 uwe Exp $
 .\"
 .\" Copyright (c) 1992, 1993, 1994 Henry Spencer.
 .\" Copyright (c) 1992, 1993, 1994
@@ -69,7 +69,7 @@ leaves some aspects of RE syntax and sem
 may not be fully portable to other
 .St -p1003.2
 implementations.
-.Pp
+.Ss Extended regular expressions
 A (modern) RE is one\*(DG or more non-empty\*(DG
 .Em branches ,
 separated by
@@ -383,7 +383,7 @@ Programs intended to be portable should 
 than 256 bytes,
 as an implementation can refuse to accept such REs and remain
 POSIX-compliant.
-.Pp
+.Ss Basic regular expressions
 Obsolete
 .Pq Dq basic
 regular expressions differ in several respects.



CVS commit: src/lib/libc/regex

2022-11-05 Thread Taylor R Campbell
Module Name:    src
Committed By:   riastradh
Date:           Sat Nov  5 11:33:55 UTC 2022

Modified Files:
        src/lib/libc/regex: regerror.c

Log Message:
regerror(3): Allow null errbuf if errbuf_size is zero.

The man page says:

   If errbuf_size is 0, errbuf is ignored but the return value is still
   correct.

POSIX says:

   If errbuf_size is 0, regerror() shall ignore the errbuf argument,
   and return the size of the buffer needed to hold the generated
   string.

   https://pubs.opengroup.org/onlinepubs/9699919799/functions/regerror.html

from e...@google.com
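
The behaviour quoted above is what makes the usual two-call sizing idiom work. A minimal sketch, assuming only the POSIX regerror() interface; the helper name re_strerror_alloc is hypothetical:

```c
#include <assert.h>
#include <regex.h>
#include <stdlib.h>

/*
 * Hypothetical helper: return a malloc()ed copy of the message for
 * `errcode`, or NULL if the allocation fails.  The caller frees it.
 */
char *
re_strerror_alloc(int errcode, const regex_t *preg)
{
	size_t need, got;
	char *buf;

	/* First call: errbuf_size is 0, so errbuf is ignored (NULL here). */
	need = regerror(errcode, preg, NULL, 0);
	if ((buf = malloc(need)) == NULL)
		return NULL;

	/* Second call: fill the buffer; the size includes the final NUL. */
	got = regerror(errcode, preg, buf, need);
	assert(got == need);
	(void)got;	/* keeps builds with NDEBUG free of warnings */
	return buf;
}
```

A caller would pass the non-zero value returned by regcomp() or regexec() together with the regex_t it was called on, and free() the result afterwards.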


To generate a diff of this commit:
cvs rdiff -u -r1.25 -r1.26 src/lib/libc/regex/regerror.c

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.

Modified files:

Index: src/lib/libc/regex/regerror.c
diff -u src/lib/libc/regex/regerror.c:1.25 src/lib/libc/regex/regerror.c:1.26
--- src/lib/libc/regex/regerror.c:1.25	Fri Feb 26 19:24:47 2021
+++ src/lib/libc/regex/regerror.c	Sat Nov  5 11:33:55 2022
@@ -1,4 +1,4 @@
-/*	$NetBSD: regerror.c,v 1.25 2021/02/26 19:24:47 christos Exp $	*/
+/*	$NetBSD: regerror.c,v 1.26 2022/11/05 11:33:55 riastradh Exp $	*/
 
 /*-
  * SPDX-License-Identifier: BSD-3-Clause
@@ -46,7 +46,7 @@
 static char sccsid[] = "@(#)regerror.c	8.4 (Berkeley) 3/20/94";
 __FBSDID("$FreeBSD: head/lib/libc/regex/regerror.c 326025 2017-11-20 19:49:47Z pfg $");
 #endif
-__RCSID("$NetBSD: regerror.c,v 1.25 2021/02/26 19:24:47 christos Exp $");
+__RCSID("$NetBSD: regerror.c,v 1.26 2022/11/05 11:33:55 riastradh Exp $");
 
 #include "namespace.h"
 #include 
@@ -139,7 +139,7 @@ regerror(int errcode,
 	char convbuf[50];
 
 	_DIAGASSERT(errcode != REG_ATOI || preg != NULL);
-	_DIAGASSERT(errbuf != NULL);
+	_DIAGASSERT(errbuf_size == 0 || errbuf != NULL);
 
 	if (errcode == REG_ATOI) {
 		s = regatoi(preg, convbuf, sizeof convbuf);



CVS commit: src/lib/libc/regex

2022-08-28 Thread Valeriy E. Ushakov
Module Name:    src
Committed By:   uwe
Date:           Sun Aug 28 12:59:50 UTC 2022

Modified Files:
        src/lib/libc/regex: re_format.7

Log Message:
re_format(7): Use dagger, not double dagger.  Make it superscript.


To generate a diff of this commit:
cvs rdiff -u -r1.14 -r1.15 src/lib/libc/regex/re_format.7

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.

Modified files:

Index: src/lib/libc/regex/re_format.7
diff -u src/lib/libc/regex/re_format.7:1.14 src/lib/libc/regex/re_format.7:1.15
--- src/lib/libc/regex/re_format.7:1.14	Wed Feb 24 09:10:12 2021
+++ src/lib/libc/regex/re_format.7	Sun Aug 28 12:59:50 2022
@@ -1,4 +1,4 @@
-.\" $NetBSD: re_format.7,v 1.14 2021/02/24 09:10:12 wiz Exp $
+.\" $NetBSD: re_format.7,v 1.15 2022/08/28 12:59:50 uwe Exp $
 .\"
 .\" Copyright (c) 1992, 1993, 1994 Henry Spencer.
 .\" Copyright (c) 1992, 1993, 1994
@@ -64,18 +64,19 @@ Obsolete REs mostly exist for backward c
 they will be discussed at the end.
 .St -p1003.2
 leaves some aspects of RE syntax and semantics open;
-`\(dd' marks decisions on these aspects that
+.ds DG \\s-2\\v'-0.4m'\\(dg\\v'0.4m'\\s+2
+`\(dg' marks decisions on these aspects that
 may not be fully portable to other
 .St -p1003.2
 implementations.
 .Pp
-A (modern) RE is one\(dd or more non-empty\(dd
+A (modern) RE is one\*(DG or more non-empty\*(DG
 .Em branches ,
 separated by
 .Ql \&| .
 It matches anything that matches one of the branches.
 .Pp
-A branch is one\(dd or more
+A branch is one\*(DG or more
 .Em pieces ,
 concatenated.
 It matches a match for the first, followed by a match for the second, etc.
@@ -83,7 +84,7 @@ It matches a match for the first, follow
 A piece is an
 .Em atom
 possibly followed
-by a single\(dd
+by a single\*(DG
 .Ql \&* ,
 .Ql \&+ ,
 .Ql \&? ,
@@ -111,7 +112,7 @@ always followed by
 .Ql \&} .
 The integers must lie between 0 and
 .Dv RE_DUP_MAX
-(255\(dd) inclusive,
+(255\*(DG) inclusive,
 and if there are two of them, the first may not exceed the second.
 An atom followed by a bound containing one integer
 .Em i
@@ -144,7 +145,7 @@ An atom is a regular expression enclosed
 regular expression),
 an empty set of
 .Ql ()
-(matching the null string)\(dd,
+(matching the null string)\*(DG,
 a
 .Em bracket expression
 (see below),
@@ -160,16 +161,16 @@ followed by one of the characters
 (matching that character taken as an ordinary character),
 a
 .Ql \e
-followed by any other character\(dd
+followed by any other character\*(DG
 (matching that character taken as an ordinary character,
 as if the
 .Ql \e
-had not been present\(dd),
+had not been present\*(DG),
 or a single character with no other significance (matching that character).
 A
 .Ql \&{
 followed by a character other than a digit is an ordinary
-character, not the beginning of a bound\(dd.
+character, not the beginning of a bound\*(DG.
 It is illegal to end an RE with
 .Ql \e .
 .Pp
@@ -193,7 +194,7 @@ of characters between those two (inclusi
 collating sequence,
 .No e.g. Ql [0-9]
 in ASCII matches any decimal digit.
-It is illegal\(dd for two ranges to share an
+It is illegal\*(DG for two ranges to share an
 endpoint,
 .No e.g. Ql a-c-e .
 Ranges are very collating-sequence-dependent,
@@ -265,7 +266,7 @@ then
 and
 .Ql [xy]
 are all synonymous.
-An equivalence class may not\(dd be an endpoint
+An equivalence class may not\*(DG be an endpoint
 of a range.
 .Pp
 Within a bracket expression, the name of a
@@ -297,7 +298,7 @@ The reverse, matching any character that
 class, the negation operator of bracket expressions may be used:
 .Ql [^[:class:]] .
 .Pp
-There are two special cases\(dd of bracket expressions:
+There are two special cases\*(DG of bracket expressions:
 the bracket expressions
 .Ql [[:<:]]
 and
@@ -377,7 +378,7 @@ and
 becomes
 .Ql [^xX] .
 .Pp
-No particular limit is imposed on the length of REs\(dd.
+No particular limit is imposed on the length of REs\*(DG.
 Programs intended to be portable should not employ REs longer
 than 256 bytes,
 as an implementation can refuse to accept such REs and remain
@@ -424,10 +425,10 @@ and
 by themselves ordinary characters.
 .Ql \&^
 is an ordinary character except at the beginning of the
-RE or\(dd the beginning of a parenthesized subexpression,
+RE or\*(DG the beginning of a parenthesized subexpression,
 .Ql \&$
 is an ordinary character except at the end of the
-RE or\(dd the end of a parenthesized subexpression,
+RE or\*(DG the end of a parenthesized subexpression,
 and
 .Ql \&*
 is an ordinary character if it appears at the beginning of the



Re: CVS commit: src/lib/libc/regex

2021-02-25 Thread Christos Zoulas
In article <5c9e716-7ec1-9c7d-a7cb-21f08946...@invisible.ca>,
Jared McNeill   wrote:
>Building tools on macOS:
>
>/Users/jmcneill/netbsd/git-src/tools/compat/../../lib/libc/regex/regcomp.c:1585:8:
>error: implicit declaration of function 'reallocarray' is invalid
>   in C99 [-Werror,-Wimplicit-function-declaration]
>        ncs = reallocarray(p->g->sets, p->g->ncsets + 1, sizeof(*ncs));
>              ^
>/Users/jmcneill/netbsd/git-src/tools/compat/../../lib/libc/regex/regcomp.c:1585:8:
>note: did you mean 'reallocarr'?
>/Users/jmcneill/netbsd/git-src/tools/compat/compat_defs.h:556:5: note:
>'reallocarr' declared here
>int reallocarr(void *, size_t, size_t);
>    ^

Fixed, thanks!

christos
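
The change that fixed the build is not shown in this thread. Purely as an illustration of what a host without reallocarray(3) has to provide, here is a sketch of overflow-checked growth by one element, the shape of the failing call. The name grow_by_one is hypothetical and this is not the committed fix.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for reallocarray(array, count + 1, elemsize):
 * grow an array of `count` elements by one slot, refusing sizes whose
 * byte count would overflow size_t.
 */
void *
grow_by_one(void *array, size_t count, size_t elemsize)
{
	assert(elemsize != 0);

	/* (count + 1) * elemsize must not exceed SIZE_MAX. */
	if (count >= SIZE_MAX / elemsize) {
		errno = ENOMEM;
		return NULL;
	}
	return realloc(array, (count + 1) * elemsize);
}
```

As with realloc(), the old array is left untouched when NULL is returned, so callers should assign the result to a temporary before overwriting their pointer.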



Re: CVS commit: src/lib/libc/regex

2021-02-25 Thread Jared McNeill

Building tools on macOS:

/Users/jmcneill/netbsd/git-src/tools/compat/../../lib/libc/regex/regcomp.c:1585:8: error: implicit declaration of function 'reallocarray' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
        ncs = reallocarray(p->g->sets, p->g->ncsets + 1, sizeof(*ncs));
              ^
/Users/jmcneill/netbsd/git-src/tools/compat/../../lib/libc/regex/regcomp.c:1585:8: note: did you mean 'reallocarr'?
/Users/jmcneill/netbsd/git-src/tools/compat/compat_defs.h:556:5: note: 'reallocarr' declared here
int reallocarr(void *, size_t, size_t);
    ^


On Tue, 23 Feb 2021, Christos Zoulas wrote:


Module Name:    src
Committed By:   christos
Date:           Tue Feb 23 22:14:59 UTC 2021

Modified Files:
        src/lib/libc/regex: cname.h engine.c re_format.7 regcomp.c regerror.c
            regex.3 regex2.h regexec.c regfree.c utils.h
Removed Files:
        src/lib/libc/regex: cclass.h

Log Message:
sync with FreeBSD:
   - NLS support
   - GNU extensions
   - bug fixes


To generate a diff of this commit:
cvs rdiff -u -r1.7 -r0 src/lib/libc/regex/cclass.h
cvs rdiff -u -r1.7 -r1.8 src/lib/libc/regex/cname.h
cvs rdiff -u -r1.24 -r1.25 src/lib/libc/regex/engine.c
cvs rdiff -u -r1.12 -r1.13 src/lib/libc/regex/re_format.7
cvs rdiff -u -r1.38 -r1.39 src/lib/libc/regex/regcomp.c
cvs rdiff -u -r1.23 -r1.24 src/lib/libc/regex/regerror.c
cvs rdiff -u -r1.26 -r1.27 src/lib/libc/regex/regex.3
cvs rdiff -u -r1.13 -r1.14 src/lib/libc/regex/regex2.h
cvs rdiff -u -r1.22 -r1.23 src/lib/libc/regex/regexec.c
cvs rdiff -u -r1.15 -r1.16 src/lib/libc/regex/regfree.c
cvs rdiff -u -r1.6 -r1.7 src/lib/libc/regex/utils.h

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.




Re: CVS commit: src/lib/libc/regex

2018-02-26 Thread Kamil Rytarowski
On 26.02.2018 15:41, Christos Zoulas wrote:
> On Feb 26,  2:33pm, n...@gmx.com (Kamil Rytarowski) wrote:
> -- Subject: Re: CVS commit: src/lib/libc/regex
> 
> | Looking at the internals of regasub(3) and regnsub(3), 10 is not just a
> | lower limit, but also the upper limit. I will try to explain it better
> | in a documentation and leave the code as it is.
> 
> Yes, the way to fix this is to come up with syntax to support more than
> one digit, perhaps \{10}...
> 
> christos
> 

I have no opinion on changing this and no request to do so. Also, 10
matches (9 without \0) are already a lot, and what matters to me is that
this is well-defined behavior.





Re: CVS commit: src/lib/libc/regex

2018-02-26 Thread Christos Zoulas
On Feb 26,  2:33pm, n...@gmx.com (Kamil Rytarowski) wrote:
-- Subject: Re: CVS commit: src/lib/libc/regex

| Looking at the internals of regasub(3) and regnsub(3), 10 is not just a
| lower limit, but also the upper limit. I will try to explain it better
| in a documentation and leave the code as it is.

Yes, the way to fix this is to come up with syntax to support more than
one digit, perhaps \{10}...

christos


Re: CVS commit: src/lib/libc/regex

2018-02-26 Thread Kamil Rytarowski
On 25.02.2018 00:45, Christos Zoulas wrote:
> On Feb 25, 12:39am, n...@gmx.com (Kamil Rytarowski) wrote:
> -- Subject: Re: CVS commit: src/lib/libc/regex
> 
> | On 14.01.2016 21:41, Christos Zoulas wrote:
> | > +The
> | > +.Fa rm
> | > +array must be at least 10 elements long, and should contain the result
> | > +of the matches from a previous
> | > +.Fn regexec
> | > +call.
> | 
> | Could we have an argument to regasub(3)/regnsub(3) "size_t nmatch" like
> | in regexec(3), instead of assuming >= 10 elements long?
> | 
> | It might not be too late to alter this function. There is only 1 user in
> | GCC and no stable releases with this API.
> | 
> | My rationale is to sanitize these interfaces without caching the number
> | of elements for a regexec(3) call in a sanitizer. Additionally we could
> | have an internal sanity check to prevent out of bound operations on the
> | "regmatch_t *" type.
> 
> Sure, fix it and pullup-8.
> 
> christos
> 

Looking at the internals of regasub(3) and regnsub(3), 10 is not just a
lower limit, but also the upper limit. I will try to explain it better
in a documentation and leave the code as it is.
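
To make the "10 is also the upper limit" point concrete: a reference written as a backslash plus one decimal digit can only name rm[0] through rm[9]. The sketch below is a hypothetical re-implementation for illustration only, not NetBSD's regnsub(3); the names subst_sketch and NSUBMATCH are made up, and only backslash-digit references are handled.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <string.h>

#define NSUBMATCH 10	/* \0 through \9: one decimal digit per reference */

/*
 * Hypothetical sketch: copy `sub` into `buf`, replacing each \N (N a
 * single digit) with the text that rm[N] delimits in `str`.
 * Returns the length written, or -1 if `buf` is too small.
 */
long
subst_sketch(char *buf, size_t bufsiz, const char *sub,
    const regmatch_t rm[NSUBMATCH], const char *str)
{
	size_t len = 0;

	assert(buf != NULL && bufsiz > 0);

	for (; *sub != '\0'; sub++) {
		const char *src = sub;	/* default: copy one literal byte */
		size_t n = 1;

		if (sub[0] == '\\' && sub[1] >= '0' && sub[1] <= '9') {
			const regmatch_t *m = &rm[*++sub - '0'];

			if (m->rm_so < 0)	/* group did not take part */
				continue;
			src = str + m->rm_so;
			n = (size_t)(m->rm_eo - m->rm_so);
		}
		if (n >= bufsiz - len)		/* keep room for the NUL */
			return -1;
		memcpy(buf + len, src, n);
		len += n;
	}
	buf[len] = '\0';
	return (long)len;
}
```

Because the index is computed from a single digit, an array of exactly 10 regmatch_t elements filled by regexec() is both necessary and sufficient for this scheme; supporting more groups would need a different reference syntax, as suggested in the thread.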





Re: CVS commit: src/lib/libc/regex

2018-02-24 Thread Christos Zoulas
On Feb 25, 12:39am, n...@gmx.com (Kamil Rytarowski) wrote:
-- Subject: Re: CVS commit: src/lib/libc/regex

| On 14.01.2016 21:41, Christos Zoulas wrote:
| > +The
| > +.Fa rm
| > +array must be at least 10 elements long, and should contain the result
| > +of the matches from a previous
| > +.Fn regexec
| > +call.
| 
| Could we have an argument to regasub(3)/regnsub(3) "size_t nmatch" like
| in regexec(3), instead of assuming >= 10 elements long?
| 
| It might not be too late to alter this function. There is only 1 user in
| GCC and no stable releases with this API.
| 
| My rationale is to sanitize these interfaces without caching the number
| of elements for a regexec(3) call in a sanitizer. Additionally we could
| have an internal sanity check to prevent out of bound operations on the
| "regmatch_t *" type.

Sure, fix it and pullup-8.

christos


Re: CVS commit: src/lib/libc/regex

2018-02-24 Thread Kamil Rytarowski
On 14.01.2016 21:41, Christos Zoulas wrote:
> +The
> +.Fa rm
> +array must be at least 10 elements long, and should contain the result
> +of the matches from a previous
> +.Fn regexec
> +call.

Could we have an argument to regasub(3)/regnsub(3) "size_t nmatch" like
in regexec(3), instead of assuming >= 10 elements long?

It might not be too late to alter this function. There is only 1 user in
GCC and no stable releases with this API.

My rationale is to sanitize these interfaces without caching the number
of elements for a regexec(3) call in a sanitizer. Additionally we could
have an internal sanity check to prevent out of bound operations on the
"regmatch_t *" type.





Re: CVS commit: src/lib/libc/regex

2011-11-16 Thread Takehiko NOZAKI
Do we have to fix src/dist/nvi/regex too?
It is the same Spencer regex code as src/lib/libc/regex (but modified for
wide characters).

ftp://ftp.netbsd.org/pub/NetBSD/misc/tnozaki/patch-dist_nvi_regex

very truly yours
-- 
Takehiko NOZAKI


2011/10/10 Christos Zoulas :
> Module Name:    src
> Committed By:   christos
> Date:           Sun Oct  9 18:23:00 UTC 2011
>
> Modified Files:
>        src/lib/libc/regex: engine.c regcomp.c regex2.h
>
> Log Message:
> Prevent regcomp/regexec DoS attacks by limiting the amount of memory used
> and the level of recursion. Thanks to Maksymilian Arciemowicz for discovery
> and help with the implementation.
>
>
> To generate a diff of this commit:
> cvs rdiff -u -r1.22 -r1.23 src/lib/libc/regex/engine.c
> cvs rdiff -u -r1.29 -r1.30 src/lib/libc/regex/regcomp.c
> cvs rdiff -u -r1.12 -r1.13 src/lib/libc/regex/regex2.h
>
> Please note that diffs are not public domain; they are subject to the
> copyright notices on the relevant files.
>
>