Re: [PATCH 1/1] regex: fix broken clang build

2022-06-15 Thread Paul Eggert

On 6/14/22 23:51, Jeffrey Walton wrote:

I think you should use __apple_build_version__ to differentiate
between Apple Clang and LLVM Clang.


As near as I can make out, support for -Wvla was added in Clang 2.8, 
which I hope is old enough that we need not worry about Apple messing 
with Clang version numbers. If not, someone with access to old Apple 
releases would have to tell us.
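
As a rough illustration of the kind of guard under discussion, here is a minimal sketch. The names demo_compiler_family, demo_vla_copy and DEMO_DIAG_PRAGMAS are hypothetical, and the version test (GCC 4.6 or later, or any Clang) is an assumption made for the example, not the condition used in the actual patch.

```c
#include <stddef.h>
#include <string.h>

/* Assumption for this sketch: "diagnostic push" needs GCC 4.6 or later,
   and any Clang accepts it together with "-Wvla".  */
#if defined __clang__ || 4 < __GNUC__ + (6 <= __GNUC_MINOR__)
# define DEMO_DIAG_PRAGMAS 1
#else
# define DEMO_DIAG_PRAGMAS 0
#endif

/* Name the compiler family.  Apple Clang defines both __clang__ and
   __apple_build_version__, so the Apple macro is tested first.  */
static char const *
demo_compiler_family (void)
{
#if defined __apple_build_version__
  return "Apple Clang";
#elif defined __clang__
  return "LLVM Clang";
#elif defined __GNUC__
  return "GCC";
#else
  return "other";
#endif
}

#if DEMO_DIAG_PRAGMAS
# pragma GCC diagnostic push
# pragma GCC diagnostic ignored "-Wvla"
#endif

/* Copy at most N bytes of S into a variable-length array and return
   the number of bytes copied.  The VLA is what -Wvla would flag.  */
static size_t
demo_vla_copy (char const *s, size_t n)
{
  char buf[n + 1];
  size_t len = strlen (s);
  if (n < len)
    len = n;
  memcpy (buf, s, len);
  buf[len] = '\0';
  return strlen (buf);
}

#if DEMO_DIAG_PRAGMAS
# pragma GCC diagnostic pop
#endif
```

Keeping the version test in one macro means that only one line needs to change if Apple's version numbering turns out to matter after all.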


What a mess this stuff is. Maybe it would be simpler to disable all -W 
options by default under Clang.





Re: [PATCH 1/1] regex: fix broken clang build

2022-06-14 Thread Paul Eggert

Why isn't a similar patch needed for lib/regex.c?

Can you determine the earliest version of Clang that supports both "# 
pragma GCC diagnostic push" and "# pragma GCC diagnostic ignored 
"-Wvla""? A quick glance at the archived manuals suggests it goes back 
to at least Clang 4 but I suspect it's earlier. Knowing is better than 
guessing.


Thanks.




Re: Parallelization of shell scripts for 'configure' etc.

2022-06-14 Thread Paul Eggert

On 6/14/22 18:55, Ángel wrote:

Do you have any handy example of configure that takes too long to run?


Sure. Coreutils, emacs. Pretty much any nontrivial configure script 
takes longer than it should.


I understand that parallelization of shell scripts is nontrivial.




Re: Parallelization of shell scripts for 'configure' etc.

2022-06-14 Thread Paul Eggert

On 6/14/22 10:11, Nick Bowler wrote:


The resulting config.h is correct but pa.sh took almost 1 minute to run
the configure script, about ten times longer than dash takes to run the
same script.  More than half of that time appears to be spent just
loading the program into pa.sh, before a single shell command is
actually executed.


Ouch. Thanks for looking into this, and sorry it took so much of your 
time. It appears that PaSh itself isn't suitable for prime-time use. 
This isn't too surprising, though, as it's a research project and I 
wouldn't expect it to compete with production shells. The main question 
here is whether the ideas of PaSh are good ones for improving Bash.





Re: Parallelization of shell scripts for 'configure' etc.

2022-06-13 Thread Paul Eggert

On 6/13/22 18:25, Dale R. Worley wrote:

It seems to me that bash provides the needed tools -- "( ... ) &" lets
you run things in parallel.  Similarly, if you've got a lot of small
tasks with a complex web of dependencies, you can encode that in a
"makefile".

It seems to me that the heavy work is rebuilding how "configure" scripts
are constructed based on which items can be run in parallel.


Yes, all that could be done in theory, but it'd take a lot of hacking 
and it's been decades and it hasn't happened.


I'd rather have shell scripts "just work" in parallel with a minimum of 
fuss.





Parallelization of shell scripts for 'configure' etc.

2022-06-13 Thread Paul Eggert
In many GNU projects, the 'configure' script is the biggest barrier to 
building because it takes so long to run. Is there some way that we 
could improve its performance without completely reengineering it, by 
improving Bash so that it can parallelize 'configure' scripts?


For ideas about this, please see PaSh-JIT:

Kallas K, Mustafa T, Bielak J, Karnikis D, Dang THY, Greenberg M, 
Vasilakis N. Practically correct, just-in-time shell script 
parallelization. Proc OSDI 22. July 2022. 
https://nikos.vasilak.is/p/pash:osdi:2022.pdf


I've wanted something like this for *years* (I assigned a simpler 
version to my undergraduates but of course it was too much to expect 
them to implement it) and I hope some sort of parallelization like this 
can get into production with Bash at some point (or some other shell if 
Bash can't use this idea).




Re: Drop Gnulib support for ecvt, fcvt, gcvt, getw, putw?

2022-06-13 Thread Paul Eggert

On 6/12/22 21:54, Bruno Haible wrote:

Paul Eggert wrote:

The functions ecvt, fcvt, gcvt, getw, putw are no longer in POSIX and
the glibc manual pretty much recommends against them. How about if we
drop Gnulib support for them? This would speed up 'configure' a bit,
among other things. As far as I know nobody uses them.

I think this is a wrong axis of optimization.


I should have made it clearer that it was more to avoid confusion than 
to optimize. ("Why does Gnulib mess with ecvt?" I wondered.)


Your other optimization suggestions look reasonable, if someone has the 
time.


For more-drastic performance improvements we could use improvements to 
the shell. I'll send a followup email about this.




Re: fchmodat.c & lchmod.c - O_PATH & AT_EMPTY_PATH on older kernels

2022-06-12 Thread Paul Eggert

On 6/12/22 08:03, Bruno Haible wrote:


Two tests now fail, that succeeded before yesterday's patch.


Thanks for reporting that. Although I don't have MS-Windows I stared at 
the code a bit and I think I see what might be the problem. I found some 
other potential issues too (basically, I didn't switch from 
fstatat/lstat to readlinkat/readlink often enough so the code is still 
prone to EOVERFLOW problems). I installed the attached further patch, 
which I hope fixes things on MS-Windows and fixes the other issues too.
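
The readlinkat idea can be sketched in isolation as follows. The helper name demo_is_symlink is hypothetical. The point is that readlinkat never fills in a struct stat, so it cannot fail with EOVERFLOW the way fstatat can when a file's size or inode number does not fit; on a non-symlink it fails with EINVAL instead.

```c
#ifndef _POSIX_C_SOURCE
# define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Return 1 if FILE, interpreted relative to directory descriptor DIR,
   names a symlink, 0 if it names some other kind of file, and -1
   (with errno set) if the test itself fails, e.g., with ENOENT.  */
static int
demo_is_symlink (int dir, char const *file)
{
  /* One byte suffices: only success or failure matters here,
     not the link's contents.  */
  char buf[1];
  if (0 <= readlinkat (dir, file, buf, sizeof buf))
    return 1;
  return errno == EINVAL ? 0 : -1;
}
```

A caller that only has a file name can pass AT_FDCWD as DIR, which makes the call behave like plain readlink.
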
From d682f8de7f9d384f4cfc482a3ba2960329a8db21 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Sun, 12 Jun 2022 13:46:52 -0700
Subject: [PATCH] fchmodat: port better to MS-Windows etc.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

MS-Windows problem reported by Bruno Haible in:
https://lists.gnu.org/r/bug-gnulib/2022-06/msg00041.html
Although I don’t use MS-Windows I see some related fstatat etc.
problems and am trying to fix them with this further patch.
* lib/fchmodat.c (fchmodat):
* lib/lchmod.c (lchmod):
* lib/lchown.c (lchown)
[!HAVE_LCHOWN && HAVE_CHOWN && !CHOWN_MODIFIES_SYMLINK]:
* lib/renameatu.c (renameatu)
[HAVE_RENAME && RENAME_TRAILING_SLASH_SOURCE_BUG]:
Use readlinkat/readlink instead of fstatat/lstat to test merely
whether a string names a symlink, as this avoids problems
with EOVERFLOW.  Also, I hope it works around the MS-Windows
issues that Bruno noted.
* m4/fchmodat.m4 (gl_PREREQ_FCHMODAT):
Check for readlinkat, not lchmod.
* m4/lchmod.m4 (gl_FUNC_LCHMOD): Do not require AC_CANONICAL_HOST
or check for lstat.
(gl_PREREQ_LCHMOD): Check for readlink.
* modules/lchown (Depends-on): Add readlink.  Do not depend on
lstat merely because !HAVE_LCHOWN.
* modules/renameatu (Depends-on): Add fstatat, readlinkat.
---
 ChangeLog | 26 ++
 lib/fchmodat.c| 16 
 lib/lchmod.c  | 17 +++--
 lib/lchown.c  |  4 ++--
 lib/renameatu.c   |  7 ---
 m4/fchmodat.m4|  4 ++--
 m4/lchmod.m4  |  7 +++
 modules/lchown|  3 ++-
 modules/renameatu |  2 ++
 9 files changed, 56 insertions(+), 30 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 2d0340b933..2daa6d8c81 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,29 @@
+2022-06-12  Paul Eggert  
+
+	fchmodat: port better to MS-Windows etc.
+	MS-Windows problem reported by Bruno Haible in:
+	https://lists.gnu.org/r/bug-gnulib/2022-06/msg00041.html
+	Although I don’t use MS-Windows I see some related fstatat etc.
+	problems and am trying to fix them with this further patch.
+	* lib/fchmodat.c (fchmodat):
+	* lib/lchmod.c (lchmod):
+	* lib/lchown.c (lchown)
+	[!HAVE_LCHOWN && HAVE_CHOWN && !CHOWN_MODIFIES_SYMLINK]:
+	* lib/renameatu.c (renameatu)
+	[HAVE_RENAME && RENAME_TRAILING_SLASH_SOURCE_BUG]:
+	Use readlinkat/readlink instead of fstatat/lstat to test merely
+	whether a string names a symlink, as this avoids problems
+	with EOVERFLOW.  Also, I hope it works around the MS-Windows
+	issues that Bruno noted.
+	* m4/fchmodat.m4 (gl_PREREQ_FCHMODAT):
+	Check for readlinkat, not lchmod.
+	* m4/lchmod.m4 (gl_FUNC_LCHMOD): Do not require AC_CANONICAL_HOST
+	or check for lstat.
+	(gl_PREREQ_LCHMOD): Check for readlink.
+	* modules/lchown (Depends-on): Add readlink.  Do not depend on
+	lstat merely because !HAVE_LCHOWN.
+	* modules/renameatu (Depends-on): Add fstatat, readlinkat.
+
 2022-06-12  Bruno Haible  
 
 	doc: Update O_PATH platforms list.
diff --git a/lib/fchmodat.c b/lib/fchmodat.c
index b233c366de..8ed4cb7398 100644
--- a/lib/fchmodat.c
+++ b/lib/fchmodat.c
@@ -83,9 +83,10 @@ fchmodat (int dir, char const *file, mode_t mode, int flags)
 # if NEED_FCHMODAT_NONSYMLINK_FIX
   if (flags == AT_SYMLINK_NOFOLLOW)
 {
-  struct stat st;
+#  if HAVE_READLINKAT
+  char readlink_buf[1];
 
-#  ifdef O_PATH
+#   ifdef O_PATH
   /* Open a file descriptor with O_NOFOLLOW, to make sure we don't
  follow symbolic links, if /proc is mounted.  O_PATH is used to
  avoid a failure if the file is not readable.
@@ -96,7 +97,7 @@ fchmodat (int dir, char const *file, mode_t mode, int flags)
 
   int err;
   char buf[1];
-  if (0 <= readlinkat (fd, "", buf, sizeof buf))
+  if (0 <= readlinkat (fd, "", readlink_buf, sizeof readlink_buf))
 err = EOPNOTSUPP;
   else if (errno == EINVAL)
 {
@@ -113,17 +114,16 @@ fchmodat (int dir, char const *file, mode_t mode, int flags)
   errno = err;
   if (0 <= err)
 return err == 0 ? 0 : -1;
-#  endif
+#   endif
 
   /* O_PATH + /proc is not supported.  */
-  int fstatat_result = fstatat (dir, file, &st, AT_SYMLINK_NOFOLLOW);
-  if (fstatat_result != 0)
-return fstatat_result;
-  if (S_ISLNK (st.st_mode))
+
+  if (0 <= readlinkat (dir, file, readlink_buf, sizeof readlink_buf))
 

Drop Gnulib support for ecvt, fcvt, gcvt, getw, putw?

2022-06-12 Thread Paul Eggert
The functions ecvt, fcvt, gcvt, getw, putw are no longer in POSIX and 
the glibc manual pretty much recommends against them. How about if we 
drop Gnulib support for them? This would speed up 'configure' a bit, 
among other things. As far as I know nobody uses them.





Re: fchmodat.c & lchmod.c - O_PATH & AT_EMPTY_PATH on older kernels

2022-06-11 Thread Paul Eggert

On 6/8/22 13:27, Lance Fredrickson wrote:
Would be nice to see a fix upstream before more projects update gnulib 
and the issue becomes broader.


It sounds like a source-code configuration issue, as your platform's 
headers in /usr/include correspond to a kernel newer than what you're 
running (which is not supported in general).


That being said, it might not hurt to simplify this messy old code so 
that it's more portable to messy old platforms. Also, now that I'm 
thinking about it, the Gnulib code didn't work if fstatat fails with 
EOVERFLOW and should be using readlinkat instead. Please try the first 
attached patch, which I've installed in Gnulib. I also installed the 2nd 
attached patch which is just a doc update.
From 14379aa60449d0b48b9c247391ba863d271f4cb4 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Sat, 11 Jun 2022 16:58:25 -0700
Subject: [PATCH 1/2] fchmodat: port to old Linux kernel + newer headers
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Problem reported by Lance Fredrickson in:
https://lists.gnu.org/r/bug-gnulib/2022-06/msg00038.html
* lib/fchmodat.c (fchmodat):
* lib/lchmod.c (lchmod): Do not rely on AT_EMPTY_PATH as to
whether syscalls work on ""; instead, if a call fails with
ENOENT assume that those syscalls do not work.
Do not use fstatat to determine whether a file is a symlink,
as this has problems with EOVERFLOW.  Use readlinkat instead,
and if it fails with EINVAL then the file is not a symlink.
Remove #if tests on __linux__ || __ANDROID__ || __CYGWIN__
as this has been a maintenance hassle and it’s unlikely
these days that a new platform would #define O_PATH without also
either supporting /proc or keeping it absent.
* modules/fchmodat (Depends-on): Remove fstatat.
There should be no need for either fchmodat or lchmod to depend on
readlinkat, since they use readlinkat only in contexts where it
should work without Gnulib intervention.
---
 ChangeLog| 21 ++
 lib/fchmodat.c   | 54 --
 lib/lchmod.c | 56 +---
 modules/fchmodat |  1 -
 4 files changed, 59 insertions(+), 73 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 6dbc0b089f..54ac81901d 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,24 @@
+2022-06-11  Paul Eggert  
+
+	fchmodat: port to old Linux kernel + newer headers
+	Problem reported by Lance Fredrickson in:
+	https://lists.gnu.org/r/bug-gnulib/2022-06/msg00038.html
+	* lib/fchmodat.c (fchmodat):
+	* lib/lchmod.c (lchmod): Do not rely on AT_EMPTY_PATH as to
+	whether syscalls work on ""; instead, if a call fails with
+	ENOENT assume that those syscalls do not work.
+	Do not use fstatat to determine whether a file is a symlink,
+	as this has problems with EOVERFLOW.  Use readlinkat instead,
+	and if it fails with EINVAL then the file is not a symlink.
+	Remove #if tests on __linux__ || __ANDROID__ || __CYGWIN__
+	as this has been a maintenance hassle and it’s unlikely
+	these days that a new platform would #define O_PATH without also
+	either supporting /proc or keeping it absent.
+	* modules/fchmodat (Depends-on): Remove fstatat.
+	There should be no need for either fchmodat or lchmod to depend on
+	readlinkat, since they use readlinkat only in contexts where it
+	should work without Gnulib intervention.
+
 2022-06-06  Bruno Haible  
 
 	fopen-gnu: Make this module work again (regression 2022-01-03).
diff --git a/lib/fchmodat.c b/lib/fchmodat.c
index dc53583366..b233c366de 100644
--- a/lib/fchmodat.c
+++ b/lib/fchmodat.c
@@ -85,7 +85,7 @@ fchmodat (int dir, char const *file, mode_t mode, int flags)
 {
   struct stat st;
 
-#  if defined O_PATH && defined AT_EMPTY_PATH
+#  ifdef O_PATH
   /* Open a file descriptor with O_NOFOLLOW, to make sure we don't
  follow symbolic links, if /proc is mounted.  O_PATH is used to
  avoid a failure if the file is not readable.
@@ -94,45 +94,28 @@ fchmodat (int dir, char const *file, mode_t mode, int flags)
   if (fd < 0)
 return fd;
 
-  /* Up to Linux 5.3 at least, when FILE refers to a symbolic link, the
- chmod call below will change the permissions of the symbolic link
- - which is undesired - and on many file systems (ext4, btrfs, jfs,
- xfs, ..., but not reiserfs) fail with error EOPNOTSUPP - which is
- misleading.  Therefore test for a symbolic link explicitly.
- Use fstatat because fstat does not work on O_PATH descriptors
- before Linux 3.6.  */
-  if (fstatat (fd, "", &st, AT_EMPTY_PATH) != 0)
+  int err;
+  char buf[1];
+  if (0 <= readlinkat (fd, "", buf, sizeof buf))
+err = EOPNOTSUPP;
+  else if (errno == EINVAL)
 {
-  int stat_errno = errno;
-  close (fd);
-  errno = stat_errno;
-  return -1;
-

Re: [PATCH 2/3] dfa: new option DFA_STRAY_BACKSLASH_WARN

2022-06-06 Thread Paul Eggert

On 6/6/22 12:37, Jim Meyering wrote:

Once you push that (and assuming you have nothing else pending), I'll
prepare another pre-release snapshot.


Thanks, I pushed it into grep master, after fixing the commentary issue 
Bruno noted.




Re: [PATCH 2/3] dfa: new option DFA_STRAY_BACKSLASH_WARN

2022-06-05 Thread Paul Eggert

On 6/4/22 18:45, Bruno Haible wrote:

And maybe some people find it OK to get the "stray \ before -" warning for
   grep -e '\-x' FILE
and to inhibit it only when the option -e or the marker '--' is not used?


Yes, thanks, this narrower exception sounds better. Revised grep patch 
attached.
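
In isolation, the narrower exception amounts to something like the sketch below. The function name demo_operand_pattern and its fixed_strings parameter are hypothetical stand-ins for the logic in grep's main, and the sketch covers only a pattern given as a plain command-line operand, not one given via -e.

```c
#include <stdbool.h>

/* Return the text to compile for the command-line pattern operand PAT.
   If PAT starts with "\-" and the matcher interprets backslashes
   (i.e., FIXED_STRINGS is false), skip the backslash so that no
   stray-backslash warning is issued later.  Otherwise return PAT
   unchanged.  */
static char const *
demo_operand_pattern (char const *pat, bool fixed_strings)
{
  bool skip_bs = !fixed_strings && pat[0] == '\\' && pat[1] == '-';
  return pat + skip_bs;
}
```

Because the adjustment happens before the pattern reaches the regular expression compiler, "\-" in the middle of a pattern is left alone, which keeps the door open for a future meaning of A\-B.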


[cc'ing this to grep-devel as this is no longer a Gnulib patch (should 
have done that in my previous email...).]
From 019745515bbc6c515dee600f24e80c4003f76b79 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Sun, 5 Jun 2022 10:42:22 -0700
Subject: [PATCH] =?UTF-8?q?grep:=20don=E2=80=99t=20diagnose=20"grep=20'\-c?=
 =?UTF-8?q?'"?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* src/grep.c (main): Skip past leading backslash of a pattern that
begins with "\-".  Inspired by a remark by Bruno Haible in:
https://lists.gnu.org/r/bug-gnulib/2022-06/msg00022.html
---
 src/grep.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/src/grep.c b/src/grep.c
index 59d3431..6b976b4 100644
--- a/src/grep.c
+++ b/src/grep.c
@@ -2848,8 +2848,16 @@ main (int argc, char **argv)
 }
   else if (optind < argc)
 {
+  /* If a command-line regular expression operand starts with '\-',
+ skip the '\'.  This suppress a stray-backslash warning if a
+ script uses the non-POSIX "grep '\-x'" to avoid treating
+ "-x" as an option.  */
+  char const *pat = argv[optind++];
+  bool skip_bs = (matcher != F_MATCHER_INDEX
+  && pat[0] == '\\' && pat[1] == '-');
+
   /* Make a copy so that it can be reallocated or freed later.  */
-  pattern_array = keys = xstrdup (argv[optind++]);
+  pattern_array = keys = xstrdup (pat + skip_bs);
   idx_t patlen = strlen (keys);
   keys[patlen] = '\n';
   keycc = update_patterns (keys, 0, patlen + 1, "");
-- 
2.36.1



Re: [PATCH 2/3] dfa: new option DFA_STRAY_BACKSLASH_WARN

2022-06-04 Thread Paul Eggert

On 6/3/22 20:18, Bruno Haible wrote:

This warning punishes a traditional habit, namely to backslash-escape
a leading ASCII '-' character, so as to avoid it from being interpreted
as an option.


That's the first I've heard of that habit. Even 7th Edition Unix 
supported 'grep -e PAT FILE', but I guess some nonstandard grep 
implementations removed -e. These days, although I would think 'grep -- 
PAT FILE' would work everywhere, there may be scripts that still use 
that old habit.


If we want to support "grep '\-PAT' FILE", we can do so in grep rather 
than in dfa.c; it might not be wise to treat \- as - everywhere, as that 
would preclude future extensions such as A\-B meaning r.e. subtraction. 
We could add something like the attached to GNU grep, if you and Jim 
think it's a good idea (I hope Jim doesn't get tired of these "just one 
more thing" changes before a release...).
From fe5ed6852a51d6b6ea9c407f10019ecd427d4eb0 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Sat, 4 Jun 2022 15:41:38 -0700
Subject: [PATCH] =?UTF-8?q?grep:=20don=E2=80=99t=20diagnose=20"grep=20'\-c?=
 =?UTF-8?q?'"?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* src/dfasearch.c (GEAcompile): Skip past leading backslash of a
pattern that begins with "\-".  Inspired by a remark by Bruno Haible:
https://lists.gnu.org/r/bug-gnulib/2022-06/msg00022.html
---
 src/dfasearch.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/src/dfasearch.c b/src/dfasearch.c
index 8d832f0..730d0e4 100644
--- a/src/dfasearch.c
+++ b/src/dfasearch.c
@@ -203,6 +203,13 @@ GEAcompile (char *pattern, idx_t size, reg_syntax_t syntax_bits,
   dfasyntax (dc->dfa, &localeinfo, syntax_bits, dfaopts);
   bool bs_safe = !localeinfo.multibyte | localeinfo.using_utf8;
 
+  /* If a pattern starts with "\-" omit the backslash, to suppress a
+ stray-backslash warning if a script uses the non-POSIX
+ "grep '\-x'" to avoid treating "-x" as an option.  */
+  bool skip_bs = pattern[0] == '\\' && pattern[1] == '-';
+  pattern += skip_bs;
+  size -= skip_bs;
+
   /* For GNU regex, pass the patterns separately to detect errors like
  "[\nallo\n]\n", where the patterns are "[", "allo" and "]", and
  this should be a syntax error.  The same for backref, where the
-- 
2.36.1



Re: "grep '\]'" warnings suggest a Gnulib DFA patch

2022-06-04 Thread Paul Eggert

On 6/3/22 20:08, Bruno Haible wrote:


But when I think about the thousands of people who use regular expressions
out there. How would they remember that in parentheses both should be
backslash-escaped in EREs
 \(   \)


It's even weirder, in that POSIX says unmatched ')' is treated like '\)' 
in an ERE, which is why gnulib/lib/dfa.c does not warn about it.




but brackets and braces are asymmetric
 \[   ]
 \{   }


Thanks, good catch about \}. We should treat it like \]. (And this means 
regex-quote.c is buggy in a different way, sigh)




Even if the warning message you install in grep has 3 or 5 lines and goes
into all details, we are not serving the community if we force them to adhere
to asymmetric rules, where up to now they could use symmetric rules.


Thanks, and I see Jim agrees too. I installed the first two attached 
patches into Gnulib to do that and to fix regex-quote, propagated this 
into Grep, and installed the last attached patch to Grep to document this.
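
To make the quoting rule concrete, here is a minimal sketch of an ERE quoter that leaves ] and } unescaped. The names demo_ere_special and demo_ere_quote are hypothetical, and the character set is an assumption chosen for the example rather than a copy of the table in lib/regex-quote.c. The sketch also ignores multibyte characters, which real quoting code has to handle.

```c
#include <stddef.h>
#include <string.h>

/* Characters that this sketch backslash-escapes in an ERE.  ']' and
   '}' are deliberately absent: POSIX leaves \] and \} undefined, and
   the unescaped characters are ordinary outside brackets and braces.  */
static char const demo_ere_special[] = "$^.+*?[\\()|{";

/* Store into DST an ERE that matches the string SRC literally, and
   return its length.  DST must have room for 2 * strlen (SRC) + 1
   bytes.  */
static size_t
demo_ere_quote (char *dst, char const *src)
{
  char *p = dst;
  for (; *src; src++)
    {
      if (strchr (demo_ere_special, *src))
        *p++ = '\\';
      *p++ = *src;
    }
  *p = '\0';
  return p - dst;
}
```

With this rule "x{2}" becomes "x\{2}" while "a]b}" is copied through unchanged.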


At some point the behavior of \], \}, and all the other stuff in the Grep 
manual's new "Problematic Expressions" node should be documented in 
gnulib/doc/regex.texi too. I'll cc this to Reuben to see whether he has 
the time.


It might be useful for GNU grep to have a --pedantic flag to check 
regular expression portability, to reject unportable REs like '\]'. But 
any such feature can wait until after the next GNU grep release.
From 0153035f93d5e537efef9119676e120034ac912b Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Fri, 3 Jun 2022 18:46:37 -0700
Subject: [PATCH 1/2] dfa: do not warn about \] and \}
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* lib/dfa.c (lex): Do not warn about \] and \}, since they’re
surely universally supported even though POSIX says their
interpretation is undefined.
---
 ChangeLog | 7 +++
 lib/dfa.c | 2 ++
 lib/dfa.h | 6 +-
 3 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/ChangeLog b/ChangeLog
index 5fe5e9ee23..053fabde2a 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,10 @@
+2022-06-04  Paul Eggert  
+
+	dfa: do not warn about \] and \}
+	* lib/dfa.c (lex): Do not warn about \] and \}, since they’re
+	surely universally supported even though POSIX says their
+	interpretation is undefined.
+
 2022-06-03  Paul Eggert  
 
 	regex-quote: \] -> ] in EREs and BREs
diff --git a/lib/dfa.c b/lib/dfa.c
index bd4c5f0582..4f8367af3f 100644
--- a/lib/dfa.c
+++ b/lib/dfa.c
@@ -1563,6 +1563,8 @@ lex (struct dfa *dfa)
 }
   dfawarn (msg);
 }
+  FALLTHROUGH;
+case ']': case '}':
 normal_char:
   dfa->lex.laststart = false;
   /* For multibyte character sets, folding is done in atom.  Always
diff --git a/lib/dfa.h b/lib/dfa.h
index 91ec1d809f..043f0e9717 100644
--- a/lib/dfa.h
+++ b/lib/dfa.h
@@ -79,7 +79,11 @@ enum
merely a warning.  */
 DFA_CONFUSING_BRACKETS_ERROR = 1 << 2,
 
-/* Warn about stray backslashes before ordinary characters.  */
+/* Warn about stray backslashes before ordinary characters other
+   than ] and } which are special because even though POSIX
+   says \] and \} have undefined interpretation, platforms
+   reliably ignore those stray backlashes and warning about them
+   would likely cause more trouble than it's worth.  */
 DFA_STRAY_BACKSLASH_WARN = 1 << 3,
 
 /* Warn about * appearing out of context at the start of an
-- 
2.34.1

From ac58aead465ab8bea4223060e61c33eb265e8e85 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Sat, 4 Jun 2022 09:55:28 -0700
Subject: [PATCH 2/2] regex-quote: \} -> } in EREs
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* lib/regex-quote.c (ere_special): Don’t use \} in EREs,
as POSIX says the interpretation is undefined.
* tests/test-regex-quote.c (test_bre, test_ere):
Add tests for }.
---
 ChangeLog| 6 ++
 lib/regex-quote.c| 2 +-
 tests/test-regex-quote.c | 2 ++
 3 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/ChangeLog b/ChangeLog
index 053fabde2a..ed21be142f 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,11 @@
 2022-06-04  Paul Eggert  
 
+	regex-quote: \} -> } in EREs
+	* lib/regex-quote.c (ere_special): Don’t use \} in EREs,
+	as POSIX says the interpretation is undefined.
+	* tests/test-regex-quote.c (test_bre, test_ere):
+	Add tests for }.
+
 	dfa: do not warn about \] and \}
 	* lib/dfa.c (lex): Do not warn about \] and \}, since they’re
 	surely universally supported even though POSIX says their
diff --git a/lib/regex-quote.c b/lib/regex-quote.c
index 9b92e98910..41639ea50e 100644
--- a/lib/regex-quote.c
+++ b/lib/regex-quote.c
@@ -29,7 +29,7 @@
 static const char bre_special[] = "$^.*[\\";
 
 /* Characters that are special in an ERE.  */
-stat

Re: Version sort behavior

2022-06-04 Thread Paul Eggert

On 6/3/22 20:48, Bruno Haible wrote:

The coreutils test 'tests/misc/ls-misc.pl' now fails for me:


It works for me, from a fresh checkout and bootstrap of the current 
coreutils commit 93e099e4c3b659b2e329f655fbdc73fdf594a66e on Fedora 36 
x86-64.


From the output, it looks like you might not have the correct version 
of Gnulib (it should be commit 762bd0aa660b0c1c02597e0d2e5c5fbf9bab1b91) 
or you haven't run ./bootstrap lately.




Re: Version sort behavior

2022-06-03 Thread Paul Eggert
Thanks for the bug report. I installed the attached patch into Gnulib 
master and propagated this into Coreutils, so it should be fixed in the 
next Coreutils release.
From 1ba2b66ea45f9bc43cdc0f6f93efa59157d2b2ba Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Fri, 3 Jun 2022 17:27:44 -0700
Subject: [PATCH] =?UTF-8?q?filevercmp:=20don=E2=80=99t=20treat=20entire=20?=
 =?UTF-8?q?filename=20as=20suffix?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Problem reported by Artém S. Tashkinóv in:
https://lists.gnu.org/r/bug-gnulib/2022-06/msg00012.html
* lib/filevercmp.c (file_prefixlen): When stripping
(\.[A-Za-z~][A-Za-z0-9~]*)*$ suffixes, do not strip
the entire file name.
* tests/test-filevercmp.c (examples): Adjust to match new behavior.
---
 ChangeLog   | 10 ++
 lib/filevercmp.c| 18 +++---
 lib/filevercmp.h|  4 +++-
 tests/test-filevercmp.c |  4 ++--
 4 files changed, 26 insertions(+), 10 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 4449ff14f9..95d1314cdc 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,13 @@
+2022-06-03  Paul Eggert  
+
+	filevercmp: don’t treat entire filename as suffix
+	Problem reported by Artém S. Tashkinóv in:
+	https://lists.gnu.org/r/bug-gnulib/2022-06/msg00012.html
+	* lib/filevercmp.c (file_prefixlen): When stripping
+	(\.[A-Za-z~][A-Za-z0-9~]*)*$ suffixes, do not strip
+	the entire file name.
+	* tests/test-filevercmp.c (examples): Adjust to match new behavior.
+
 2022-06-03  Bruno Haible  
 
 	setlocale: Update after Turkey changed its name.
diff --git a/lib/filevercmp.c b/lib/filevercmp.c
index d546e79054..7e54793e61 100644
--- a/lib/filevercmp.c
+++ b/lib/filevercmp.c
@@ -29,6 +29,8 @@
 /* Return the length of a prefix of S that corresponds to the suffix
defined by this extended regular expression in the C locale:
  (\.[A-Za-z~][A-Za-z0-9~]*)*$
+   Use the longest suffix matching this regular expression,
+   except do not use all of S as a suffix if S is nonempty.
If *LEN is -1, S is a string; set *LEN to S's length.
Otherwise, *LEN should be nonnegative, S is a char array,
and *LEN does not change.  */
@@ -36,20 +38,22 @@ static idx_t
 file_prefixlen (char const *s, ptrdiff_t *len)
 {
   size_t n = *len;  /* SIZE_MAX if N == -1.  */
+  idx_t prefixlen = 0;
 
-  for (idx_t i = 0; ; i++)
+  for (idx_t i = 0; ; )
 {
-  idx_t prefixlen = i;
-  while (i + 1 < n && s[i] == '.' && (c_isalpha (s[i + 1])
-  || s[i + 1] == '~'))
-for (i += 2; i < n && (c_isalnum (s[i]) || s[i] == '~'); i++)
-  continue;
-
   if (*len < 0 ? !s[i] : i == n)
 {
   *len = i;
   return prefixlen;
 }
+
+  i++;
+  prefixlen = i;
+  while (i + 1 < n && s[i] == '.' && (c_isalpha (s[i + 1])
+  || s[i + 1] == '~'))
+for (i += 2; i < n && (c_isalnum (s[i]) || s[i] == '~'); i++)
+  continue;
 }
 }
 
diff --git a/lib/filevercmp.h b/lib/filevercmp.h
index 5a33677671..57949760b2 100644
--- a/lib/filevercmp.h
+++ b/lib/filevercmp.h
@@ -61,7 +61,9 @@
without them, using version sort without special priority;
if they do not compare equal, this comparison result is used and
the suffixes are effectively ignored.  Otherwise, the entire
-   strings are compared using version sort.
+   strings are compared using version sort.  When removing a suffix
+   from a nonempty string, remove the maximal-length suffix such that
+   the remaining string is nonempty.
 
This function is intended to be a replacement for strverscmp.  */
 int filevercmp (char const *a, char const *b) _GL_ATTRIBUTE_PURE;
diff --git a/tests/test-filevercmp.c b/tests/test-filevercmp.c
index b2a7e90f3f..998250990d 100644
--- a/tests/test-filevercmp.c
+++ b/tests/test-filevercmp.c
@@ -29,6 +29,8 @@ static const char *const examples[] =
   "",
   ".",
   "..",
+  ".0",
+  ".9",
   ".A",
   ".Z",
   ".a~",
@@ -39,8 +41,6 @@ static const char *const examples[] =
   ".zz~",
   ".zz",
   ".zz.~1~",
-  ".0",
-  ".9",
   ".zz.0",
   ".\1",
   ".\1.txt",
-- 
2.36.1
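
The revised suffix rule in the patch above can be tried out with the following simplified sketch. It handles only NUL-terminated strings and assumes ASCII; demo_isalpha, demo_isalnum and demo_prefixlen are hypothetical names, not Gnulib functions.

```c
#include <stddef.h>

/* ASCII-only stand-ins for the C-locale character tests.  */
static int
demo_isalpha (char c)
{
  return ('A' <= c && c <= 'Z') || ('a' <= c && c <= 'z');
}

static int
demo_isalnum (char c)
{
  return demo_isalpha (c) || ('0' <= c && c <= '9');
}

/* Return the length of the prefix of S that remains after removing the
   longest suffix matching (\.[A-Za-z~][A-Za-z0-9~]*)*$, except that a
   nonempty S always keeps at least its first byte.  */
static size_t
demo_prefixlen (char const *s)
{
  size_t prefixlen = 0;

  for (size_t i = 0; s[i]; )
    {
      /* Byte I belongs to the prefix.  Advancing before looking for
         suffixes is what stops the whole name from being a suffix.  */
      i++;
      prefixlen = i;

      /* Skip over any run of suffix components that starts here.  */
      while (s[i] == '.' && (demo_isalpha (s[i + 1]) || s[i + 1] == '~'))
        for (i += 2; demo_isalnum (s[i]) || s[i] == '~'; i++)
          continue;
    }

  return prefixlen;
}
```

For "foo.tar.gz" the prefix is "foo", whereas for ".zz" the prefix is the whole name, since the leading dot can no longer start a suffix.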



"grep '\]'" warnings suggest a Gnulib DFA patch

2022-06-03 Thread Paul Eggert
While testing, I discovered that master-branch grep's bootstrap script 
contained a regular expression with '\]' that master-branch grep now 
warns about. I fixed this portability bug in 'bootstrap' by installing 
the following patch into Gnulib and propagating this into grep master:


https://git.savannah.gnu.org/cgit/gnulib.git/commit/?id=762bd0aa660b0c1c02597e0d2e5c5fbf9bab1b91

Even though POSIX says the interpretation of \] is undefined (which 
means the Gnulib patch is helpful), it's unlikely that any 
POSIX-conforming regular expression matcher would do anything other than 
treat \] like plain ]. And this suggests that GNU grep's warning about 
\] is perhaps more trouble than it's worth.


So, what do you think of the idea of not warning for this particular 
stray backslash? Proposed Gnulib patch attached, with the idea of 
propagating this into GNU grep before its upcoming release. I haven't 
installed this.
From 0da5279533567c7b3470e550861643c0060e2f0d Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Fri, 3 Jun 2022 18:46:37 -0700
Subject: [PATCH] dfa: do not warn about \]
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* lib/dfa.c (lex): Do not warn about \], since it’s surely
universally supported even though POSIX says its interpretation
is undefined.
---
 ChangeLog | 5 +
 lib/dfa.c | 3 +++
 2 files changed, 8 insertions(+)

diff --git a/ChangeLog b/ChangeLog
index 5fe5e9ee23..73e7898f4e 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,10 @@
 2022-06-03  Paul Eggert  
 
+	dfa: do not warn about \]
+	* lib/dfa.c (lex): Do not warn about \], since it’s surely
+	universally supported even though POSIX says its interpretation
+	is undefined.
+
 	regex-quote: \] -> ] in EREs and BREs
 	* build-aux/bootstrap:
 	* build-aux/bootstrap.conf (gettext_external):
diff --git a/lib/dfa.c b/lib/dfa.c
index bd4c5f0582..d6652432a4 100644
--- a/lib/dfa.c
+++ b/lib/dfa.c
@@ -1563,6 +1563,9 @@ lex (struct dfa *dfa)
 }
   dfawarn (msg);
 }
+  FALLTHROUGH;
+case ']':
+  /* Do not warn about \] as that's more trouble than it's worth.  */
 normal_char:
   dfa->lex.laststart = false;
   /* For multibyte character sets, folding is done in atom.  Always
-- 
2.36.1



Re: Failure running gnulib-tool update on z/OS

2022-06-02 Thread Paul Eggert

On 5/30/22 20:55, Mike Fulton wrote:

/bin/sh gnulib/gnulib-tool --update

The error is on line 1571, where the z/OS shell (an older POSIX ‘sh’) issues:
FSUM7728 bad ${} modifier


I must be missing context, as current gnulib/gnulib-tool's line 1571 
says "autoconf_minversion=" which doesn't have anything to do with ${} 
modifiers. And although I do see the following code starting on line 
1966, I don't see how the failure occurs because the funny ${} modifier 
is inside a single-quoted string that should not be eval'ed.


Is the problem that the z/OS shell accepts ${f//o/e} but rejects 
${1//[!a-zA-Z0-9_]/_}? If so, the fix to gnulib-tool should be simple.


-
if (f=foo; eval echo '${f//o/e}') < /dev/null 2>/dev/null | grep fee >/dev/null; then
  # Bash 2.0 and newer, ksh, and zsh support the syntax
  #   ${param//pattern/replacement}
  # as a shorthand for
  #   `echo "$param" | sed -e "s/pattern/replacement/g"`.
  # Note: The 'eval' is necessary for dash and NetBSD /bin/sh.
  eval 'func_cache_var ()
  {
    cachevar=c_${1//[!a-zA-Z0-9_]/_}
  }'
else
  func_cache_var ()
  {
    case $1 in
      *[!a-zA-Z0-9_]*)
        cachevar=c_`echo "$1" | LC_ALL=C sed -e 's/[^a-zA-Z0-9_]/_/g'` ;;
      *)
        cachevar=c_$1 ;;
    esac
  }
fi



[PATCH] dfa: new options DFA_STAR_WARN, DFA_PLUS_WARN

2022-05-24 Thread Paul Eggert
This lets ‘grep -E '(*a|+b)'’ warn about the * and the +.
* lib/dfa.h (DFA_STAR_WARN, DFA_PLUS_WARN): New flags.
* lib/dfa.c (lex): Support them.
---
 ChangeLog |  7 +++
 lib/dfa.c | 51 ++-
 lib/dfa.h |  8 
 3 files changed, 49 insertions(+), 17 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 088e3b3134..5b20aa58e7 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,10 @@
+2022-05-24  Paul Eggert  
+
+   dfa: new options DFA_STAR_WARN, DFA_PLUS_WARN
+   This lets ‘grep -E '(*a|+b)'’ warn about the * and the +.
+   * lib/dfa.h (DFA_STAR_WARN, DFA_PLUS_WARN): New flags.
+   * lib/dfa.c (lex): Support them.
+
 2022-05-23  Paul Eggert  
 
dfa: '\n' is not governed by RE_LIMITED_OPS
diff --git a/lib/dfa.c b/lib/dfa.c
index 5d92b38b4c..bd4c5f0582 100644
--- a/lib/dfa.c
+++ b/lib/dfa.c
@@ -1311,17 +1311,25 @@ lex (struct dfa *dfa)
 goto default_case;
   if (backslash != ((dfa->syntax.syntax_bits & RE_BK_PLUS_QM) != 0))
 goto normal_char;
-  if (!(dfa->syntax.syntax_bits & RE_CONTEXT_INDEP_OPS)
-  && dfa->lex.laststart)
-goto normal_char;
+  if (dfa->lex.laststart)
+{
+  if (!(dfa->syntax.syntax_bits & RE_CONTEXT_INDEP_OPS))
+goto default_case;
+  if (dfa->syntax.dfaopts & DFA_PLUS_WARN)
+dfawarn (_("? at start of expression"));
+}
   return dfa->lex.lasttok = QMARK;
 
 case '*':
   if (backslash)
 goto normal_char;
-  if (!(dfa->syntax.syntax_bits & RE_CONTEXT_INDEP_OPS)
-  && dfa->lex.laststart)
-goto normal_char;
+  if (dfa->lex.laststart)
+{
+  if (!(dfa->syntax.syntax_bits & RE_CONTEXT_INDEP_OPS))
+goto default_case;
+  if (dfa->syntax.dfaopts & DFA_STAR_WARN)
+dfawarn (_("* at start of expression"));
+}
   return dfa->lex.lasttok = STAR;
 
 case '+':
@@ -1329,9 +1337,13 @@ lex (struct dfa *dfa)
 goto default_case;
   if (backslash != ((dfa->syntax.syntax_bits & RE_BK_PLUS_QM) != 0))
 goto normal_char;
-  if (!(dfa->syntax.syntax_bits & RE_CONTEXT_INDEP_OPS)
-  && dfa->lex.laststart)
-goto normal_char;
+  if (dfa->lex.laststart)
+{
+  if (!(dfa->syntax.syntax_bits & RE_CONTEXT_INDEP_OPS))
+goto default_case;
+  if (dfa->syntax.dfaopts & DFA_PLUS_WARN)
+dfawarn (_("+ at start of expression"));
+}
   return dfa->lex.lasttok = PLUS;
 
 case '{':
@@ -1339,9 +1351,6 @@ lex (struct dfa *dfa)
 goto default_case;
   if (backslash != ((dfa->syntax.syntax_bits & RE_NO_BK_BRACES) == 0))
 goto normal_char;
-  if (!(dfa->syntax.syntax_bits & RE_CONTEXT_INDEP_OPS)
-  && dfa->lex.laststart)
-goto normal_char;
 
   /* Cases:
  {M} - exact count
@@ -1374,16 +1383,24 @@ lex (struct dfa *dfa)
   dfa->lex.maxrep * 10 + *p - '0'));
   }
   }
-if (! ((! backslash || (p != lim && *p++ == '\\'))
+bool invalid_content
+  = ! ((! backslash || (p != lim && *p++ == '\\'))
&& p != lim && *p++ == '}'
&& 0 <= dfa->lex.minrep
&& (dfa->lex.maxrep < 0
-   || dfa->lex.minrep <= dfa->lex.maxrep)))
+   || dfa->lex.minrep <= dfa->lex.maxrep));
+if (invalid_content
+&& (dfa->syntax.syntax_bits & RE_INVALID_INTERVAL_ORD))
+  goto normal_char;
+if (dfa->lex.laststart)
   {
-if (dfa->syntax.syntax_bits & RE_INVALID_INTERVAL_ORD)
-  goto normal_char;
-dfaerror (_("invalid content of \\{\\}"));
+if (!(dfa->syntax.syntax_bits & RE_CONTEXT_INDEP_OPS))
+  goto default_case;
+if (dfa->syntax.dfaopts & DFA_PLUS_WARN)
+  dfawarn (_("{...} at start of expression"));
   }
+if (invalid_content)
+  dfaerror (_("invalid content of \\{\\}"));
 if (RE_DUP_MAX < dfa->lex.maxrep)
   dfaerror (_("regular expression too big"));
 dfa->

[PATCH 2/3] dfa: new option DFA_STRAY_BACKSLASH_WARN

2022-05-23 Thread Paul Eggert
This is for grep, which wants to warn about stray backslashes that
lead to unspecified behavior.  For example, "grep -oi '\a'"
surprisingly is not equivalent to "grep -oi 'a'", so the stray
backslash should be warned about.
* lib/dfa.c: Include wctype.h, for iswprint and iswspace.
(lex): Add support for DFA_STRAY_BACKSLASH_WARN.
* lib/dfa.h (DFA_STRAY_BACKSLASH_WARN): New constant.
---
 ChangeLog |   9 
 lib/dfa.c | 120 --
 lib/dfa.h |   3 ++
 3 files changed, 93 insertions(+), 39 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 407baca335..0c5e799521 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,14 @@
 2022-05-23  Paul Eggert  
 
+   dfa: new option DFA_STRAY_BACKSLASH_WARN
+   This is for grep, which wants to warn about stray backslashes that
+   lead to unspecified behavior.  For example, "grep -oi '\a'"
+   surprisingly is not equivalent to "grep -oi 'a'", so the stray
+   backslash should be warned about.
+   * lib/dfa.c: Include wctype.h, for iswprint and iswspace.
+   (lex): Add support for DFA_STRAY_BACKSLASH_WARN.
+   * lib/dfa.h (DFA_STRAY_BACKSLASH_WARN): New constant.
+
dfa: new option DFA_CONFUSING_BRACKETS_ERROR
This is for grep, which wants [:alpha:] to be an error
at the top level.
diff --git a/lib/dfa.c b/lib/dfa.c
index ba21639521..4833a20d72 100644
--- a/lib/dfa.c
+++ b/lib/dfa.c
@@ -59,6 +59,7 @@ c_isdigit (char c)
 #define _(str) gettext (str)
 
 #include 
+#include <wctype.h>
 
 #include "xalloc.h"
 #include "localeinfo.h"
@@ -1192,8 +1193,7 @@ lex (struct dfa *dfa)
  we set the backslash flag and go through the loop again.
  On the plus side, this avoids having a duplicate of the
  main switch inside the backslash case.  On the minus side,
- it means that just about every case begins with
- "if (backslash) ...".  */
+ it means that just about every case tests the backslash flag.  */
   for (int i = 0; i < 2; ++i)
 {
   if (! dfa->lex.left)
@@ -1248,52 +1248,67 @@ lex (struct dfa *dfa)
 case '7':
 case '8':
 case '9':
-  if (backslash && !(dfa->syntax.syntax_bits & RE_NO_BK_REFS))
-{
-  dfa->lex.laststart = false;
-  return dfa->lex.lasttok = BACKREF;
-}
-  goto normal_char;
+  if (!backslash)
+goto normal_char;
+  if (dfa->syntax.syntax_bits & RE_NO_BK_REFS)
+goto stray_backslash;
+
+  dfa->lex.laststart = false;
+  return dfa->lex.lasttok = BACKREF;
 
 case '`':
-  if (backslash && !(dfa->syntax.syntax_bits & RE_NO_GNU_OPS))
-{
-  /* FIXME: should be beginning of string */
-  return dfa->lex.lasttok = BEGLINE;
-}
-  goto normal_char;
+  if (!backslash)
+goto normal_char;
+  if (dfa->syntax.syntax_bits & RE_NO_GNU_OPS)
+goto stray_backslash;
+
+  /* FIXME: should be beginning of string */
+  return dfa->lex.lasttok = BEGLINE;
 
 case '\'':
-  if (backslash && !(dfa->syntax.syntax_bits & RE_NO_GNU_OPS))
-{
-  /* FIXME: should be end of string */
-  return dfa->lex.lasttok = ENDLINE;
-}
-  goto normal_char;
+  if (!backslash)
+goto normal_char;
+  if (dfa->syntax.syntax_bits & RE_NO_GNU_OPS)
+goto stray_backslash;
+
+  /* FIXME: should be end of string */
+  return dfa->lex.lasttok = ENDLINE;
 
 case '<':
-  if (backslash && !(dfa->syntax.syntax_bits & RE_NO_GNU_OPS))
-return dfa->lex.lasttok = BEGWORD;
-  goto normal_char;
+  if (!backslash)
+goto normal_char;
+  if (dfa->syntax.syntax_bits & RE_NO_GNU_OPS)
+goto stray_backslash;
+
+  return dfa->lex.lasttok = BEGWORD;
 
 case '>':
-  if (backslash && !(dfa->syntax.syntax_bits & RE_NO_GNU_OPS))
-return dfa->lex.lasttok = ENDWORD;
-  goto normal_char;
+  if (!backslash)
+goto normal_char;
+  if (dfa->syntax.syntax_bits & RE_NO_GNU_OPS)
+goto stray_backslash;
+
+  return dfa->lex.lasttok = ENDWORD;
 
 case 'b':
-  if (backslash && !(dfa->syntax.syntax_bits & RE_NO_GNU_OPS))
-return dfa->lex.lasttok = LIMWORD;
-  goto normal_char;
+  if (!backslash)
+goto normal_char;
+  if (dfa->syntax.syntax_

[PATCH 3/3] dfa: '\n' is not governed by RE_LIMITED_OPS

2022-05-23 Thread Paul Eggert
* lib/dfa.c (lex): Pay no attention to RE_LIMITED_OPS when
deciding how to parse '\n', since regcomp.c doesn’t.
---
 ChangeLog | 4 
 lib/dfa.c | 3 +--
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 0c5e799521..088e3b3134 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,9 @@
 2022-05-23  Paul Eggert  
 
+   dfa: '\n' is not governed by RE_LIMITED_OPS
+   * lib/dfa.c (lex): Pay no attention to RE_LIMITED_OPS when
+   deciding how to parse '\n', since regcomp.c doesn’t.
+
dfa: new option DFA_STRAY_BACKSLASH_WARN
This is for grep, which wants to warn about stray backslashes that
lead to unspecified behavior.  For example, "grep -oi '\a'"
diff --git a/lib/dfa.c b/lib/dfa.c
index 4833a20d72..5d92b38b4c 100644
--- a/lib/dfa.c
+++ b/lib/dfa.c
@@ -1401,8 +1401,7 @@ lex (struct dfa *dfa)
   return dfa->lex.lasttok = OR;
 
 case '\n':
-  if (dfa->syntax.syntax_bits & RE_LIMITED_OPS
-  || !(dfa->syntax.syntax_bits & RE_NEWLINE_ALT))
+  if (!(dfa->syntax.syntax_bits & RE_NEWLINE_ALT))
 goto default_case;
   if (backslash)
 goto normal_char;
-- 
2.36.1




[PATCH 1/3] dfa: new option DFA_CONFUSING_BRACKETS_ERROR

2022-05-23 Thread Paul Eggert
This is for grep, which wants [:alpha:] to be an error
at the top level.
* lib/dfa.c (struct regex_syntax): New member dfaopts,
replacing anchor.  All uses changed.
(parse_bracket_exp): Error, not warn, if DFA_CONFUSING_BRACKETS_ERROR.
* lib/dfa.h (DFA_CONFUSING_BRACKETS_ERROR): New constant.
---
 ChangeLog | 10 ++
 lib/dfa.c | 13 ++---
 lib/dfa.h |  6 +-
 3 files changed, 21 insertions(+), 8 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 327100aaf1..407baca335 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,13 @@
+2022-05-23  Paul Eggert  
+
+   dfa: new option DFA_CONFUSING_BRACKETS_ERROR
+   This is for grep, which wants [:alpha:] to be an error
+   at the top level.
+   * lib/dfa.c (struct regex_syntax): New member dfaopts,
+   replacing anchor.  All uses changed.
+   (parse_bracket_exp): Error, not warn, if DFA_CONFUSING_BRACKETS_ERROR.
+   * lib/dfa.h (DFA_CONFUSING_BRACKETS_ERROR): New constant.
+
 2022-05-21  Paul Eggert  
 
strstr-simple: pacify GCC 12.1
diff --git a/lib/dfa.c b/lib/dfa.c
index 5f290ec58e..ba21639521 100644
--- a/lib/dfa.c
+++ b/lib/dfa.c
@@ -399,15 +399,12 @@ struct regex_syntax
 {
   /* Syntax bits controlling the behavior of the lexical analyzer.  */
   reg_syntax_t syntax_bits;
+  int dfaopts;
   bool syntax_bits_set;
 
   /* Flag for case-folding letters into sets.  */
   bool case_fold;
 
-  /* True if ^ and $ match only the start and end of data, and do not match
- end-of-line within data.  */
-  bool anchor;
-
   /* End-of-line byte in data.  */
   unsigned char eolbyte;
 
@@ -836,7 +833,7 @@ unibyte_word_constituent (struct dfa const *dfa, unsigned char c)
 static int
 char_context (struct dfa const *dfa, unsigned char c)
 {
-  if (c == dfa->syntax.eolbyte && !dfa->syntax.anchor)
+  if (c == dfa->syntax.eolbyte && !(dfa->syntax.dfaopts & DFA_ANCHOR))
 return CTX_NEWLINE;
   if (unibyte_word_constituent (dfa, c))
 return CTX_LETTER;
@@ -1140,7 +1137,9 @@ parse_bracket_exp (struct dfa *dfa)
   while ((wc = wc1, (c = c1) != ']'));
 
   if (colon_warning_state == 7)
-dfawarn (_("character class syntax is [[:space:]], not [:space:]"));
+((dfa->syntax.dfaopts & DFA_CONFUSING_BRACKETS_ERROR
+  ? dfaerror : dfawarn)
+ (_("character class syntax is [[:space:]], not [:space:]")));
 
   if (! known_bracket_exp)
 return BACKREF;
@@ -4327,9 +4326,9 @@ dfasyntax (struct dfa *dfa, struct localeinfo const *linfo,
   dfa->canychar = -1;
   dfa->syntax.syntax_bits_set = true;
   dfa->syntax.case_fold = (bits & RE_ICASE) != 0;
-  dfa->syntax.anchor = (dfaopts & DFA_ANCHOR) != 0;
   dfa->syntax.eolbyte = dfaopts & DFA_EOL_NUL ? '\0' : '\n';
   dfa->syntax.syntax_bits = bits;
+  dfa->syntax.dfaopts = dfaopts;
 
   for (int i = CHAR_MIN; i <= CHAR_MAX; ++i)
 {
diff --git a/lib/dfa.h b/lib/dfa.h
index e94e43546d..327b9c7cdf 100644
--- a/lib/dfa.h
+++ b/lib/dfa.h
@@ -73,7 +73,11 @@ enum
 DFA_ANCHOR = 1 << 0,
 
 /* '\0' in data is end-of-line, instead of the traditional '\n'.  */
-DFA_EOL_NUL = 1 << 1
+DFA_EOL_NUL = 1 << 1,
+
+/* Treat [:alpha:] etc. as an error at the top level, instead of
+   merely a warning.  */
+DFA_CONFUSING_BRACKETS_ERROR = 1 << 2,
   };
 
 /* Initialize or reinitialize a DFA.  The arguments are:
-- 
2.36.1
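
The parse_bracket_exp change above picks between dfaerror and dfawarn with a conditional expression and then calls whichever function was chosen, so the message appears only once in the source. A minimal sketch of that idiom follows. The names OPT_BRACKETS_ERROR, report_warning, report_error and diagnose_brackets are hypothetical stand-ins and are not part of dfa.c. Unlike dfaerror, the sketch's error reporter returns instead of exiting.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical option bit, standing in for DFA_CONFUSING_BRACKETS_ERROR.  */
enum { OPT_BRACKETS_ERROR = 1 << 2 };

/* Hypothetical reporters standing in for dfawarn and dfaerror.
   Each returns true if the diagnostic should be treated as fatal.  */
static bool
report_warning (char const *msg)
{
  fprintf (stderr, "warning: %s\n", msg);
  return false;
}

static bool
report_error (char const *msg)
{
  fprintf (stderr, "error: %s\n", msg);
  return true;
}

/* Select the reporter with a conditional expression, then call it,
   so that the message text is written only once.  */
static bool
diagnose_brackets (int opts)
{
  return ((opts & OPT_BRACKETS_ERROR ? report_error : report_warning)
          ("character class syntax is [[:space:]], not [:space:]"));
}
```

Calling diagnose_brackets (OPT_BRACKETS_ERROR) goes through the error reporter; any option word without that bit goes through the warning reporter.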




[PATCH] strstr-simple: pacify GCC 12.1

2022-05-21 Thread Paul Eggert
* lib/str-two-way.h (two_way_long_needle): Pacify GCC 12.1
-Wsuggest-attribute=pure (x86-64, -O2).
---
 ChangeLog | 6 ++
 lib/str-two-way.h | 4 ++--
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 30ffdcb7c3..327100aaf1 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,9 @@
+2022-05-21  Paul Eggert  
+
+   strstr-simple: pacify GCC 12.1
+   * lib/str-two-way.h (two_way_long_needle): Pacify GCC 12.1
+   -Wsuggest-attribute=pure (x86-64, -O2).
+
 2022-05-20  Paul Eggert  
 
dfa: steer cleer of POSIX-reserved symbols
diff --git a/lib/str-two-way.h b/lib/str-two-way.h
index 7ee344aea1..b00017c0b4 100644
--- a/lib/str-two-way.h
+++ b/lib/str-two-way.h
@@ -231,7 +231,7 @@ critical_factorization (const unsigned char *needle, size_t needle_len,
most 2 * HAYSTACK_LEN - NEEDLE_LEN comparisons occur in searching.
If AVAILABLE modifies HAYSTACK_LEN (as in strstr), then at most 3 *
HAYSTACK_LEN - NEEDLE_LEN comparisons occur in searching.  */
-static RETURN_TYPE
+static RETURN_TYPE _GL_ATTRIBUTE_PURE
 two_way_short_needle (const unsigned char *haystack, size_t haystack_len,
   const unsigned char *needle, size_t needle_len)
 {
@@ -325,7 +325,7 @@ two_way_short_needle (const unsigned char *haystack, size_t haystack_len,
If AVAILABLE modifies HAYSTACK_LEN (as in strstr), then at most 3 *
HAYSTACK_LEN - NEEDLE_LEN comparisons occur in searching, and
sublinear performance is not possible.  */
-static RETURN_TYPE
+static RETURN_TYPE _GL_ATTRIBUTE_PURE
 two_way_long_needle (const unsigned char *haystack, size_t haystack_len,
  const unsigned char *needle, size_t needle_len)
 {
-- 
2.36.1
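
For readers unfamiliar with what -Wsuggest-attribute=pure is asking for: a pure function has no side effects, and its result depends only on its arguments and on memory it reads. The sketch below assumes that _GL_ATTRIBUTE_PURE amounts to GCC's pure attribute on GCC-compatible compilers and to nothing elsewhere. SKETCH_ATTRIBUTE_PURE and count_byte are hypothetical names used only for illustration.

```c
#include <stddef.h>

/* Assumption: on GCC-compatible compilers a macro like
   _GL_ATTRIBUTE_PURE boils down to the 'pure' function attribute;
   elsewhere it expands to nothing.  */
#if defined __GNUC__
# define SKETCH_ATTRIBUTE_PURE __attribute__ ((__pure__))
#else
# define SKETCH_ATTRIBUTE_PURE
#endif

/* Return the number of occurrences of byte C in BUF[0..LEN-1].
   The result depends only on the arguments and the memory they
   point to, and the function modifies nothing, so it is pure.  */
static size_t SKETCH_ATTRIBUTE_PURE
count_byte (unsigned char const *buf, size_t len, unsigned char c)
{
  size_t n = 0;
  for (size_t i = 0; i < len; i++)
    n += buf[i] == c;
  return n;
}
```

The attribute sits between the return type and the function name, the same position the patch uses for two_way_short_needle and two_way_long_needle.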




[PATCH] dfa: steer cleer of POSIX-reserved symbols

2022-05-20 Thread Paul Eggert
* lib/dfa.c (str_eq): Rename from streq.  All uses changed.
(c_isdigit): Rename from isasciidigit.  The function worked in
EBCDIC so it wasn’t ASCII-specific anyway.  All uses changed.
---
 ChangeLog |  7 +++
 lib/dfa.c | 20 ++--
 2 files changed, 17 insertions(+), 10 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index d86cf0048b..30ffdcb7c3 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,10 @@
+2022-05-20  Paul Eggert  
+
+   dfa: steer cleer of POSIX-reserved symbols
+   * lib/dfa.c (str_eq): Rename from streq.  All uses changed.
+   (c_isdigit): Rename from isasciidigit.  The function worked in
+   EBCDIC so it wasn’t ASCII-specific anyway.  All uses changed.
+
 2022-05-17  Paul Eggert  
 
parse-datetime: support 'J' military time zone
diff --git a/lib/dfa.c b/lib/dfa.c
index e88fabb442..5f290ec58e 100644
--- a/lib/dfa.c
+++ b/lib/dfa.c
@@ -44,13 +44,13 @@
 #define assume_nonnull(x) assume ((x) != NULL)
 
 static bool
-streq (char const *a, char const *b)
+str_eq (char const *a, char const *b)
 {
   return strcmp (a, b) == 0;
 }
 
 static bool
-isasciidigit (char c)
+c_isdigit (char c)
 {
   return '0' <= c && c <= '9';
 }
@@ -930,7 +930,7 @@ static const struct dfa_ctype *_GL_ATTRIBUTE_PURE
 find_pred (const char *str)
 {
   for (int i = 0; prednames[i].name; i++)
-if (streq (str, prednames[i].name))
+if (str_eq (str, prednames[i].name))
   return &prednames[i];
   return NULL;
 }
@@ -1009,8 +1009,8 @@ parse_bracket_exp (struct dfa *dfa)
worry about that possibility.  */
 {
   char const *class
-= (dfa->syntax.case_fold && (streq (str, "upper")
- || streq (str, "lower"))
+= (dfa->syntax.case_fold && (str_eq (str, "upper")
+ || str_eq (str, "lower"))
? "alpha" : str);
   const struct dfa_ctype *pred = find_pred (class);
   if (!pred)
@@ -1090,7 +1090,7 @@ parse_bracket_exp (struct dfa *dfa)
   if (wc != wc2 || wc == WEOF)
 {
   if (dfa->localeinfo.simple
-  || (isasciidigit (c) & isasciidigit (c2)))
+  || (c_isdigit (c) & c_isdigit (c2)))
 {
   for (int ci = c; ci <= c2; ci++)
 if (dfa->syntax.case_fold && isalpha (ci))
@@ -1339,7 +1339,7 @@ lex (struct dfa *dfa)
 char const *p = dfa->lex.ptr;
 char const *lim = p + dfa->lex.left;
 dfa->lex.minrep = dfa->lex.maxrep = -1;
-for (; p != lim && isasciidigit (*p); p++)
+for (; p != lim && c_isdigit (*p); p++)
   dfa->lex.minrep = (dfa->lex.minrep < 0
  ? *p - '0'
  : MIN (RE_DUP_MAX + 1,
@@ -1352,7 +1352,7 @@ lex (struct dfa *dfa)
   {
 if (dfa->lex.minrep < 0)
   dfa->lex.minrep = 0;
-while (++p != lim && isasciidigit (*p))
+while (++p != lim && c_isdigit (*p))
   dfa->lex.maxrep
 = (dfa->lex.maxrep < 0
? *p - '0'
@@ -4116,7 +4116,7 @@ dfamust (struct dfa const *d)
 idx_t j, ln, rn, n;
 
 /* Guaranteed to be.  Unlikely, but ...  */
-if (streq (lmp->is, rmp->is))
+if (str_eq (lmp->is, rmp->is))
   {
 lmp->begline &= rmp->begline;
 lmp->endline &= rmp->endline;
@@ -4163,7 +4163,7 @@ dfamust (struct dfa const *d)
   for (idx_t i = 0; mp->in[i] != NULL; i++)
 if (strlen (mp->in[i]) > strlen (result))
   result = mp->in[i];
-  if (streq (result, mp->is))
+  if (str_eq (result, mp->is))
 {
   if ((!need_begline || mp->begline) && (!need_endline
  || mp->endline))
-- 
2.36.1




[PATCH] parse-datetime: support 'J' military time zone

2022-05-17 Thread Paul Eggert
Requested by Brian Inglis in:
https://savannah.gnu.org/support/?110644
* lib/parse-datetime.y (parser_control): New member J_zones_seen.
(item): New item 'J'.
(military_table): Add 'J'.
(parse_datetime_body): Set and use J_zones_seen.
* tests/test-parse-datetime.c (main): Test "J".
---
 ChangeLog   | 11 +++
 doc/parse-datetime.texi |  2 +-
 lib/parse-datetime.y| 15 ---
 tests/test-parse-datetime.c | 10 ++
 4 files changed, 34 insertions(+), 4 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 3b1f715527..d86cf0048b 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,14 @@
+2022-05-17  Paul Eggert  
+
+   parse-datetime: support 'J' military time zone
+   Requested by Brian Inglis in:
+   https://savannah.gnu.org/support/?110644
+   * lib/parse-datetime.y (parser_control): New member J_zones_seen.
+   (item): New item 'J'.
+   (military_table): Add 'J'.
+   (parse_datetime_body): Set and use J_zones_seen.
+   * tests/test-parse-datetime.c (main): Test "J".
+
 2022-05-15  Reuben Thomas  
 
doc: Update regex documentation to match implementation.
diff --git a/doc/parse-datetime.texi b/doc/parse-datetime.texi
index 575b4d5aea..44305d136c 100644
--- a/doc/parse-datetime.texi
+++ b/doc/parse-datetime.texi
@@ -304,7 +304,7 @@ Time zone items other than @samp{UTC} and @samp{Z}
 are obsolescent and are not recommended, because they
 are ambiguous; for example, @samp{EST} has a different meaning in
 Australia than in the United States, and @samp{A} has different
-meaning as a military time zone than as an obsolescent
+meaning as a military time zone than as an obsolete
 RFC 822 time zone.  Instead, it's better to use
 unambiguous numeric time zone corrections like @samp{-0500}, as
 described in the previous section.
diff --git a/lib/parse-datetime.y b/lib/parse-datetime.y
index 7220d05dd7..0903c2003e 100644
--- a/lib/parse-datetime.y
+++ b/lib/parse-datetime.y
@@ -205,6 +205,7 @@ typedef struct
   bool rels_seen;
   idx_t dates_seen;
   idx_t days_seen;
+  idx_t J_zones_seen;
   idx_t local_zones_seen;
   idx_t dsts_seen;
   idx_t times_seen;
@@ -624,6 +625,11 @@ item:
 pc->local_zones_seen++;
 debug_print_current_time (_("local_zone"), pc);
   }
+  | 'J'
+  {
+pc->J_zones_seen++;
+debug_print_current_time ("J", pc);
+  }
   | zone
   {
 pc->zones_seen++;
@@ -1153,7 +1159,8 @@ static table const time_zone_table[] =
RFC 822 got these backwards, but RFC 5322 makes the incorrect
treatment optional, so do them the right way here.
 
-   Note 'T' is a special case, as it is used as the separator in ISO
+   'J' is special, as it is local time.
+   'T' is also special, as it is the separator in ISO
8601 date and time of day representation.  */
 static table const military_table[] =
 {
@@ -1166,6 +1173,7 @@ static table const military_table[] =
   { "G", tZONE,  HOUR ( 7) },
   { "H", tZONE,  HOUR ( 8) },
   { "I", tZONE,  HOUR ( 9) },
+  { "J", 'J',0 },
   { "K", tZONE,  HOUR (10) },
   { "L", tZONE,  HOUR (11) },
   { "M", tZONE,  HOUR (12) },
@@ -1816,6 +1824,7 @@ parse_datetime_body (struct timespec *result, char const *p,
   pc.dates_seen = 0;
   pc.days_seen = 0;
   pc.times_seen = 0;
+  pc.J_zones_seen = 0;
   pc.local_zones_seen = 0;
   pc.dsts_seen = 0;
   pc.zones_seen = 0;
@@ -1941,7 +1950,7 @@ parse_datetime_body (struct timespec *result, char const *p,
   else
 {
   if (1 < (pc.times_seen | pc.dates_seen | pc.days_seen | pc.dsts_seen
-   | (pc.local_zones_seen + pc.zones_seen)))
+   | (pc.J_zones_seen + pc.local_zones_seen + pc.zones_seen)))
 {
   if (debugging (&pc))
 {
@@ -1953,7 +1962,7 @@ parse_datetime_body (struct timespec *result, char const *p,
 dbg_printf ("error: seen multiple days parts\n");
   if (pc.dsts_seen > 1)
 dbg_printf ("error: seen multiple daylight-saving parts\n");
-  if ((pc.local_zones_seen + pc.zones_seen) > 1)
+  if ((pc.J_zones_seen + pc.local_zones_seen + pc.zones_seen) > 1)
 dbg_printf ("error: seen multiple time-zone parts\n");
 }
   goto fail;
diff --git a/tests/test-parse-datetime.c b/tests/test-parse-datetime.c
index 4310ee8a3d..6dbcb3ac93 100644
--- a/tests/test-parse-datetime.c
+++ b/tests/test-parse-datetime.c
@@ -151,6 +151,16 @@ main (_GL_UNUSED int argc, char **argv)
   ASSERT (expected.tv_sec == result.tv_sec
   && expected.tv_nsec == result.tv_nsec);
 
+  /* ISO 8601 extended date and time of day representation,
+ ' ' separator, 'J' (

[PATCH] dfa: fix bug with ‘.’ and UTF-8 Hangul Syllables

2022-05-13 Thread Paul Eggert
This fixes a bug introduced in 2019-12-18T05:41:27Z!egg...@cs.ucla.edu,
an earlier patch that fixed dfa.c to not match invalid UTF-8.
Unfortunately that patch had a couple of typos when dfa.c is
matching against the regular expression ‘.’ (dot).  One typo
caused dfa.c to incorrectly reject the valid UTF-8 sequences
(ED)(90-9F)(80-BF) corresponding to U+D400 through U+D7FF, which
are some Hangul Syllables and Hangul Jamo Extended-B.  The other
typo caused dfa.c to incorrectly reject the valid sequences
(F4)(88-8F)(80-BF)(80-BF) which correspond to U+108000 through
U+10FFFF (Supplemental Private Use Area plane B).
* lib/dfa.c (utf8_classes): Fix typos.
* tests/test-dfa-match.sh: Test the fix.
---
 ChangeLog   | 16 
 lib/dfa.c   |  4 ++--
 tests/test-dfa-match.sh | 11 +++
 3 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 6ed8a50735..fe26d37618 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,19 @@
+2022-05-13  Paul Eggert  
+
+   dfa: fix bug with ‘.’ and UTF-8 Hangul Syllables
+   This fixes a bug introduced in 2019-12-18T05:41:27Z!egg...@cs.ucla.edu,
+   an earlier patch that fixed dfa.c to not match invalid UTF-8.
+   Unfortunately that patch had a couple of typos when dfa.c is
+   matching against the regular expression ‘.’ (dot).  One typo
+   caused dfa.c to incorrectly reject the valid UTF-8 sequences
+   (ED)(90-9F)(80-BF) corresponding to U+D400 through U+D7FF, which
+   are some Hangul Syllables and Hangul Jamo Extended-B.  The other
+   typo caused dfa.c to incorrectly reject the valid sequences
+   (F4)(88-8F)(80-BF)(80-BF) which correspond to U+108000 through
+   U+10FFFF (Supplemental Private Use Area plane B).
+   * lib/dfa.c (utf8_classes): Fix typos.
+   * tests/test-dfa-match.sh: Test the fix.
+
 2022-05-12  Paul Eggert  
 
manywarnings: update C warnings for GCC 12
diff --git a/lib/dfa.c b/lib/dfa.c
index a27d096f73..e88fabb442 100644
--- a/lib/dfa.c
+++ b/lib/dfa.c
@@ -1704,7 +1704,7 @@ add_utf8_anychar (struct dfa *dfa)
 /* G. ed (just a token).  */
 
 /* H. 80-9f: 2nd byte of a "GHC" sequence.  */
-CHARCLASS_INIT (0, 0, 0, 0, 0xffff, 0, 0, 0),
+CHARCLASS_INIT (0, 0, 0, 0, 0xffffffff, 0, 0, 0),
 
 /* I. f0 (just a token).  */
 
@@ -1717,7 +1717,7 @@ add_utf8_anychar (struct dfa *dfa)
 /* L. f4 (just a token).  */
 
 /* M. 80-8f: 2nd byte of a "LMCC" sequence.  */
-CHARCLASS_INIT (0, 0, 0, 0, 0xff, 0, 0, 0),
+CHARCLASS_INIT (0, 0, 0, 0, 0xffff, 0, 0, 0),
   };
 
   /* Define the character classes that are needed below.  */
diff --git a/tests/test-dfa-match.sh b/tests/test-dfa-match.sh
index b23851b8c0..4561584c4c 100755
--- a/tests/test-dfa-match.sh
+++ b/tests/test-dfa-match.sh
@@ -42,4 +42,15 @@ in=$(printf "bb\nbb")
 $timeout_10 ${CHECKER} test-dfa-match-aux a "$in" 1 > out || fail=1
 compare /dev/null out || fail=1
 
+# If the platform supports U+00E9 LATIN SMALL LETTER E WITH ACUTE,
+# test U+D45C HANGUL SYLLABLE PYO.
+U_00E9=$(printf '\303\251\n')
+U_D45C=$(printf '\355\221\234\n')
+if testout=$(LC_ALL=en_US.UTF-8 $CHECKER test-dfa-match-aux '^.$' "$U_00E9") &&
+   test "$testout" = 2
+then
+  testout=$(LC_ALL=en_US.UTF-8 $CHECKER test-dfa-match-aux '^.$' "$U_D45C") &&
+  test "$testout" = 3 || fail=1
+fi
+
 Exit $fail
-- 
2.34.1
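
The byte ranges quoted in the commit message can be checked by decoding their endpoints. The sketch below is a bare-bones decoder written for this note, not code from dfa.c. The name decode_utf8 is hypothetical. It assumes the input is a well-formed 3- or 4-byte sequence and does no validation.

```c
#include <stddef.h>

/* Decode the single UTF-8 sequence S[0..LEN-1] into a code point.
   LEN must be 3 or 4 and the sequence must be well formed; this
   sketch performs no validation.  */
static unsigned long
decode_utf8 (unsigned char const *s, size_t len)
{
  /* The lead byte carries 4 payload bits in a 3-byte sequence
     and 3 payload bits in a 4-byte sequence.  */
  unsigned long cp = s[0] & (len == 3 ? 0x0F : 0x07);

  /* Each continuation byte carries 6 payload bits.  */
  for (size_t i = 1; i < len; i++)
    cp = (cp << 6) | (s[i] & 0x3F);
  return cp;
}
```

Feeding it ED 90 80 and ED 9F BF gives the two ends of the first range, and F4 88 80 80 and F4 8F BF BF give the two ends of the second. The test string '\355\221\234' in test-dfa-match.sh is ED 91 9C, which falls inside the first range.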




[PATCH] manywarnings: update C warnings for GCC 12

2022-05-12 Thread Paul Eggert
Adjust for C programs compiled by GCC 12.
(A C++ expert still needs to look at manywarnings-c++.m4.)
* build-aux/gcc-warning.spec: Add warnings introduced in GCC 12.
* m4/manywarnings.m4 (gl_MANYWARN_ALL_GCC): Add -Wbidi-chars=any,ucn
and -Wuse-after-free=3.  Although not enabled by -Wall or -Wextra
they seem suitable for Gnulib-using C code.
---
 ChangeLog  | 10 ++
 build-aux/gcc-warning.spec | 26 +-
 m4/manywarnings.m4 |  2 ++
 3 files changed, 37 insertions(+), 1 deletion(-)

diff --git a/ChangeLog b/ChangeLog
index 8cc17d8d14..6ed8a50735 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,13 @@
+2022-05-12  Paul Eggert  
+
+   manywarnings: update C warnings for GCC 12
+   Adjust for C programs compiled by GCC 12.
+   (A C++ expert still needs to look at manywarnings-c++.m4.)
+   * build-aux/gcc-warning.spec: Add warnings introduced in GCC 12.
+   * m4/manywarnings.m4 (gl_MANYWARN_ALL_GCC): Add -Wbidi-chars=any,ucn
+   and -Wuse-after-free=3.  Although not enabled by -Wall or -Wextra
+   they seem suitable for Gnulib-using C code.
+
 2022-05-11  Paul Eggert  
 
parse-datetime: remove Emacs cruft
diff --git a/build-aux/gcc-warning.spec b/build-aux/gcc-warning.spec
index c0d49f2a6e..cbcbc87f9a 100644
--- a/build-aux/gcc-warning.spec
+++ b/build-aux/gcc-warning.spec
@@ -32,15 +32,21 @@
 -Wanalyzer-shift-count-negative  enabled by -fanalyzer
 -Wanalyzer-shift-count-overflow  enabled by -fanalyzer
 -Wanalyzer-stale-setjmp-buffer implied by -fanalyzer
--Wanalyzer-tainted-array-index FIXME maybe? too much noise
+-Wanalyzer-tainted-allocation-size FIXME requires -fanalyzer-checker=taint
+-Wanalyzer-tainted-array-index FIXME requires -fanalyzer-checker=taint
+-Wanalyzer-tainted-divisor FIXME requires -fanalyzer-checker=taint
+-Wanalyzer-tainted-offset  FIXME requires -fanalyzer-checker=taint
+-Wanalyzer-tainted-size  FIXME requires -fanalyzer-checker=taint
 -Wanalyzer-too-complex enabled by -fanalyzer
 -Wanalyzer-unsafe-call-within-signal-handler   enabled by -fanalyzer
 -Wanalyzer-use-after-free  enabled by -fanalyzer
 -Wanalyzer-use-of-pointer-in-stale-stack-frame enabled by -fanalyzer
+-Wanalyzer-use-of-uninitialized-value  enabled by -fanalyzer
 -Wanalyzer-write-to-const  enabled by -fanalyzer
 -Wanalyzer-write-to-string-literal enabled by -fanalyzer
 -Warray-bounds covered by -Warray-bounds=
 -Warray-bounds=<0,2>   handled specially by gl_MANYWARN_ALL_GCC
+-Warray-compare  enabled by -Wall
 -Warray-parameter  enabled by -Wall
 -Warray-parameter=<0,2>  enabled by -Wall
 -Warray-temporaries  fortran
@@ -49,6 +55,8 @@
 -Wattribute-alias=<0,2>  handled specially by gl_MANYWARN_ALL_GCC
 -Wattribute-warning  default
 -Wattributes   default
+-Wbidi-chars   handled specially by gl_MANYWARN_ALL_GCC
+-Wbidi-chars=  handled specially by gl_MANYWARN_ALL_GCC
 -Wbool-compare enabled by -Wall
 -Wbool-operation   enabled by -Wall
 -Wbuiltin-declaration-mismatch default
@@ -56,10 +64,15 @@
 -Wc++-compat   only useful for code meant to be compiled by a C++ compiler
 -Wc++0x-compat c++
 -Wc++11-compat c++
+-Wc++11-extensions c++
 -Wc++14-compat c++
+-Wc++14-extensions c++
 -Wc++17-compat c++
+-Wc++17-extensions c++
 -Wc++1z-compat c++
 -Wc++20-compat c++
+-Wc++20-extensions c++
+-Wc++23-extensions c++
 -Wc++2a-compat c++
 -Wc-binding-type   fortran
 -Wc11-c2x-compat   c compatibility
@@ -86,11 +99,14 @@
 -Wconversion   FIXME maybe? too much noise; encourages bad changes
 -Wconversion-extra fortran
 -Wconversion-null  c++ and objc++
+-Wcoverage-invalid-line-number default if --coverage
 -Wcoverage-mismatch  default
 -Wcpp  default
 -Wctad-maybe-unsupported   c++ and objc++
 -Wctor-dtor-privacy  c++
 -Wdangling-else  enabled by -Wparentheses
+-Wdangling-pointer enabled by -Wall
+-Wdangling-pointer=<0,2>   enabled by -Wall
 -Wdeclaration-after-statement  needed only for pre-C99, so obsolete
 -Wdelete-incompletec++ 

Re: regex documentation

2022-05-11 Thread Paul Eggert

On 5/11/22 11:09, Reuben Thomas wrote:

Sorry, I don't follow. The concrete example given is: \Sw matches any
character that is not word-constituent. That seems to be [^[:alnum:]]?


In glibc regex, \Sw matches a nonspace followed by a 'w'. That is, it is 
equivalent to [^[:space:]]w and it has a different meaning from the 
decommissioned GNU regex meaning.
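
A small way to check this on a given system is to compile the pattern with regcomp and try a few strings. The sketch below uses only the POSIX regcomp, regexec and regfree calls. The helper name ere_matches is hypothetical. Treating \S as a synonym for [^[:space:]] is an extension that POSIX leaves undefined, so the outcome described after the code is what the glibc behavior stated above implies, not something guaranteed everywhere.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if PATTERN, compiled as an extended regular expression
   with whatever extensions the C library's regcomp accepts, matches
   somewhere in TEXT.  Return false if PATTERN fails to compile.  */
static bool
ere_matches (char const *pattern, char const *text)
{
  regex_t re;
  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return false;
  bool matched = regexec (&re, text, 0, NULL, 0) == 0;
  regfree (&re);
  return matched;
}
```

With glibc, ere_matches ("\\Sw", "aw") should be true, while ere_matches ("\\Sw", " w") and ere_matches ("\\Sw", "a!") should be false: the first lacks a nonspace before the 'w' and the second has no 'w' at all.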


I'm assuming that the goal of gnulib/doc/regex.texi is to document the 
regular expression syntax implemented by glibc, grep, etc. Does that 
match your assumption?




Re: regex documentation

2022-05-11 Thread Paul Eggert

On 5/11/22 09:37, Reuben Thomas wrote:

Only thing I spotted offhand was that \s and \S mean something entirely
different in glibc as syntax classes are not programmable.


I think the documentation as I've edited it is correct.


Sorry, I should have been more specific. In glibc regex, \s is a synonym 
for [[:space:]] and \S is a synonym for [^[:space:]], so the discussion 
in regex.texi of @samp{\s@var{class}} etc. is wrong on a syntactic level 
not just a semantic level.




Re: #ifdef emacs

2022-05-11 Thread Paul Eggert

On 5/11/22 03:27, Reuben Thomas wrote:

Mostly in alloca.c, with one case in parse-datetime.y.

Bruno handled alloca.c, and I did parse-datetime.y with the attached.
Thanks for reporting it.

From 950f04bbf18dad544c61f448206e9dc96cbe3b7a Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Wed, 11 May 2022 09:35:45 -0700
Subject: [PATCH] parse-datetime: remove Emacs cruft
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* lib/parse-datetime.y: Remove an ‘ifdef emacs’.  Emacs has never
used this module.  The module is derived from code taken from
Emacs, but that code was removed from Emacs in the 1990s.
---
 ChangeLog|  7 +++
 lib/parse-datetime.y | 10 --
 2 files changed, 7 insertions(+), 10 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index f4e24d4157..8cc17d8d14 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,10 @@
+2022-05-11  Paul Eggert  
+
+	parse-datetime: remove Emacs cruft
+	* lib/parse-datetime.y: Remove an ‘ifdef emacs’.  Emacs has never
+	used this module.  The module is derived from code taken from
+	Emacs, but that code was removed from Emacs in the 1990s.
+
 2022-05-11  Bruno Haible  
 
 	alloca: Remove old code for Emacs, unused since 2009.
diff --git a/lib/parse-datetime.y b/lib/parse-datetime.y
index 9fc14c9d46..7220d05dd7 100644
--- a/lib/parse-datetime.y
+++ b/lib/parse-datetime.y
@@ -52,16 +52,6 @@
 #define YYMAXDEPTH 20
 #define YYINITDEPTH YYMAXDEPTH
 
-/* Since the code of parse-datetime.y is not included in the Emacs executable
-   itself, there is no need to #define static in this file.  Even if
-   the code were included in the Emacs executable, it probably
-   wouldn't do any harm to #undef it here; this will only cause
-   problems if we try to write to a static variable, which I don't
-   think this code needs to do.  */
-#ifdef emacs
-# undef static
-#endif
-
 #include 
 #include 
 #include 
-- 
2.34.1



Re: regex documentation

2022-05-11 Thread Paul Eggert

On 5/11/22 04:18, Bruno Haible wrote:

Reuben Thomas wrote:

I'm happy to prepare a patch in this case. I would simply remove all
mention of syntax tables, as that functionality is no longer available.


Attached.


Thanks! Looks good to me, except that the comma in line 111 is superfluous.
Paul, OK with you as well?


Only thing I spotted offhand was that \s and \S mean something entirely 
different in glibc as syntax classes are not programmable.




Re: regex module has dropped support for syntax tables

2022-05-10 Thread Paul Eggert

On 5/9/22 14:03, Reuben Thomas wrote:

On Mon, 9 May 2022 at 20:29, Paul Eggert  wrote:


On 5/8/22 15:54, Reuben Thomas wrote:


I sympathise if the gnulib maintainers don't want to reintroduce them; in
that case, could their removal please be flagged up in the docs?


Sure, I installed the attached.



Thanks! I didn't think of this before, is regex.texi supposed to document
GNU regex, then? 


Oh my. For years I thought that gnulib/doc/regex.texi documents the 
regular expressions supported by glibc (and by Gnulib, which mimics 
glibc). Unfortunately it appears that I am wrong, and it's documentation 
for the old GNU regex package (however, with some edits by me that are 
appropriate only for glibc!).


Does anybody use gnulib/doc/regex.texi? If not, I suggest we remove it 
from Gnulib. It's not part of any package, and its presence is confusing 
both Reuben and me.





Failing that, you could also try GNU Emacs's regex implementation, which

is derived from GNU regex 0.12, and which may have fewer bugs than regex
0.12.



That's a good suggestion I hadn't thought of, thanks. I had a look at Emacs
git, and it seems to use glibc regex, though?


It has two copies of the regex code, one from Gnulib (which is what you 
probably saw) and one just for Emacs. I meant the latter. It's in 
emacs/src/regex-emacs.[ch].




Re: regex module has dropped support for syntax tables

2022-05-09 Thread Paul Eggert

On 5/8/22 15:54, Reuben Thomas wrote:


I sympathise if the gnulib maintainers don't want to reintroduce them; in
that case, could their removal please be flagged up in the docs?


Sure, I installed the attached.


Also, do the maintainers have any better suggestion for what I should do
than revert to GNU regex 0.12 for a2ps? It relies on syntax tables for its
style sheets, and I don't want to have to introduce an incompatibility to a
mature program.


Perhaps you can transliterate the regexps using syntax-table features 
into those without? (I'm not familiar with the issue here.)


Failing that, you could also try GNU Emacs's regex implementation, which 
is derived from GNU regex 0.12, and which may have fewer bugs than regex 
0.12.

From 2f2f597641b4350915ea64c2457587d24d3fc9e2 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Mon, 9 May 2022 12:20:24 -0700
Subject: [PATCH] Say that it is not the old interface

---
 modules/regex | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/modules/regex b/modules/regex
index e8ad558642..b780427221 100644
--- a/modules/regex
+++ b/modules/regex
@@ -1,5 +1,8 @@
 Description:
 Regular expression matching.
+This matches the current GNU C Library, so its interface differs from
+the standalone GNU regex library which has long been decommissioned in
+favor of the GNU C Library interface.
 
 Files:
 lib/regex.h
-- 
2.34.1



[PATCH 2/2] libc-config: update to match cdefs

2022-05-05 Thread Paul Eggert
* lib/libc-config.h (__attribute_alloc_align__)
(__attribute_maybe_unused, __fortified_attr_access)
(__glibc_fortify, __glibc_fortify_n, __glibc_likely)
(__glibc_safe_len_cond, __glibc_safe_or_unknown_len)
(__glibc_unsafe_len, __glibc_unsigned_or_positive, __wur):
Undef these too, since lib/cdefs.h now defines them
unconditionally.
---
 ChangeLog |  9 +
 lib/libc-config.h | 11 +++
 2 files changed, 20 insertions(+)

diff --git a/ChangeLog b/ChangeLog
index f1a154027a..ea2c24dee1 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,14 @@
 2022-05-05  Paul Eggert  
 
+   libc-config: update to match cdefs
+   * lib/libc-config.h (__attribute_alloc_align__)
+   (__attribute_maybe_unused, __fortified_attr_access)
+   (__glibc_fortify, __glibc_fortify_n, __glibc_likely)
+   (__glibc_safe_len_cond, __glibc_safe_or_unknown_len)
+   (__glibc_unsafe_len, __glibc_unsigned_or_positive, __wur):
+   Undef these too, since lib/cdefs.h now defines them
+   unconditionally.
+
cdefs: merge from glibc
* lib/cdefs.h (__glibc_safe_or_unknown_len):
Use glibc’s newer version.
diff --git a/lib/libc-config.h b/lib/libc-config.h
index 8fec489378..a56665b1ce 100644
--- a/lib/libc-config.h
+++ b/lib/libc-config.h
@@ -121,6 +121,7 @@
 # undef __attr_dealloc
 # undef __attr_dealloc_free
 # undef __attribute__
+# undef __attribute_alloc_align__
 # undef __attribute_alloc_size__
 # undef __attribute_artificial__
 # undef __attribute_const__
@@ -129,6 +130,7 @@
 # undef __attribute_format_arg__
 # undef __attribute_format_strfmon__
 # undef __attribute_malloc__
+# undef __attribute_maybe_unused__
 # undef __attribute_noinline__
 # undef __attribute_nonstring__
 # undef __attribute_pure__
@@ -142,16 +144,24 @@
 # undef __extern_always_inline
 # undef __extern_inline
 # undef __flexarr
+# undef __fortified_attr_access
 # undef __fortify_function
 # undef __glibc_c99_flexarr_available
+# undef __glibc_fortify
+# undef __glibc_fortify_n
 # undef __glibc_has_attribute
 # undef __glibc_has_builtin
 # undef __glibc_has_extension
+# undef __glibc_likely
 # undef __glibc_macro_warning
 # undef __glibc_macro_warning1
 # undef __glibc_objsize
 # undef __glibc_objsize0
+# undef __glibc_safe_len_cond
+# undef __glibc_safe_or_unknown_len
 # undef __glibc_unlikely
+# undef __glibc_unsafe_len
+# undef __glibc_unsigned_or_positive
 # undef __inline
 # undef __ptr_t
 # undef __restrict
@@ -159,6 +169,7 @@
 # undef __va_arg_pack
 # undef __va_arg_pack_len
 # undef __warnattr
+# undef __wur
 
 /* Include our copy of glibc .  */
 # include 
-- 
2.35.1




[PATCH 1/2] cdefs: merge from glibc

2022-05-05 Thread Paul Eggert
* lib/cdefs.h (__glibc_safe_or_unknown_len):
Use glibc’s newer version.
---
 ChangeLog   |  6 ++
 lib/cdefs.h | 12 ++--
 2 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 02be5e2317..f1a154027a 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,9 @@
+2022-05-05  Paul Eggert  
+
+   cdefs: merge from glibc
+   * lib/cdefs.h (__glibc_safe_or_unknown_len):
+   Use glibc’s newer version.
+
 2022-05-02  Paul Eggert  
 
gettime-res: help the compiler
diff --git a/lib/cdefs.h b/lib/cdefs.h
index cb2514504f..7b8ed5b344 100644
--- a/lib/cdefs.h
+++ b/lib/cdefs.h
@@ -164,13 +164,13 @@
|| (__builtin_constant_p (__l) && (__l) > 0))
 
 /* Length is known to be safe at compile time if the __L * __S <= __OBJSZ
-   condition can be folded to a constant and if it is true.  The -1 check is
-   redundant because since it implies that __glibc_safe_len_cond is true.  */
+   condition can be folded to a constant and if it is true, or unknown (-1) */
 #define __glibc_safe_or_unknown_len(__l, __s, __osz) \
-  (__glibc_unsigned_or_positive (__l)\
-   && __builtin_constant_p (__glibc_safe_len_cond ((__SIZE_TYPE__) (__l), \
-  __s, __osz))   \
-   && __glibc_safe_len_cond ((__SIZE_TYPE__) (__l), __s, __osz))
+  ((__osz) == (__SIZE_TYPE__) -1 \
+   || (__glibc_unsigned_or_positive (__l)\
+   && __builtin_constant_p (__glibc_safe_len_cond ((__SIZE_TYPE__) (__l), \
+  (__s), (__osz)))   \
+   && __glibc_safe_len_cond ((__SIZE_TYPE__) (__l), (__s), (__osz))))
 
 /* Conversely, we know at compile time that the length is unsafe if the
__L * __S <= __OBJSZ condition can be folded to a constant and if it is
-- 
2.35.1
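
To make the intent of the new macro easier to see, here is a rough run-time analogue. It is only a sketch: the real macro additionally requires that the condition fold to a compile-time constant, and the function name and the zero-size guard below are made up for illustration and are not part of glibc or Gnulib. It assumes the underlying length condition is l <= osz / s.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical run-time analogue of __glibc_safe_or_unknown_len.
   OSZ is the object size in bytes, or (size_t) -1 if it is unknown.  */
bool
safe_or_unknown_len (size_t l, size_t s, size_t osz)
{
  if (osz == (size_t) -1)
    return true;        /* Unknown object size: nothing to check.  */
  if (s == 0)
    return true;        /* Guard added for this sketch only.  */
  return l <= osz / s;  /* Same as l * s <= osz, without overflow.  */
}
```

The point of the change is the first test: an unknown object size is treated as "safe or unknown" right away instead of relying on the length condition to cover it.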




Re: bug#54764: encode-time: make DST and TIMEZONE fields of the list argument optional ones

2022-05-02 Thread Paul Eggert

On 4/23/22 07:35, Bernhard Voelker wrote:

lib/gettime-res.c:77:46: error: 'earlier.tv_sec' may be used uninitialized in 
this function \
[-Werror=maybe-uninitialized]


Thanks for reporting that. Although the unnecessary initialization is 
annoying, this time I'm not annoyed enough to complicate the code to 
pacify GCC, so I installed the attached which follows your suggestion.


This patch also lets GCC know that the numbers in question are all 
positive which I suppose might help code generation. It also replaces a 
U+00B5 MICRO SIGN with the recommended U+03BC GREEK SMALL LETTER MU.

From 2ef6006ffc4080cf8c0c1f4d4deeb4c357d7a695 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Mon, 2 May 2022 09:52:48 -0700
Subject: [PATCH] gettime-res: help the compiler

* lib/gettime-res.c (gettime_res): Pacify GCC versions that
incorrectly complain about earlier.tv_sec not being initialized.
Let GCC know that gcd args are always positive.
---
 ChangeLog |  5 +
 lib/gettime-res.c | 10 +-
 2 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index f0c9d331d4..02be5e2317 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,10 @@
 2022-05-02  Paul Eggert  
 
+	gettime-res: help the compiler
+	* lib/gettime-res.c (gettime_res): Pacify GCC versions that
+	incorrectly complain about earlier.tv_sec not being initialized.
+	Let GCC know that gcd args are always positive.
+
 	af_alg: port to Ubuntu 22.04
 	Without this patch, maintainer builds of coreutils fail on Ubuntu
 	22.04 with diagnostics like "./lib/gl_openssl.h:79:1: error:
diff --git a/lib/gettime-res.c b/lib/gettime-res.c
index bb4d0b191d..0a14cd360f 100644
--- a/lib/gettime-res.c
+++ b/lib/gettime-res.c
@@ -52,14 +52,13 @@ gettime_res (void)
   /* On all Gnulib platforms the following calculations do not overflow.  */
 
   long int hz = TIMESPEC_HZ;
-  long int r = hz * res.tv_sec + res.tv_nsec;
-  struct timespec earlier;
-  earlier.tv_nsec = -1;
+  long int r = res.tv_nsec <= 0 ? hz : res.tv_nsec;
+  struct timespec earlier = { .tv_nsec = -1 };
 
   /* On some platforms, clock_getres (CLOCK_REALTIME, ...) yields a
  too-large resolution, under the mistaken theory that it should
  return the timer interval.  For example, on AIX 7.2 POWER8
- clock_getres yields 10 ms even though clock_gettime yields 1 µs
+ clock_getres yields 10 ms even though clock_gettime yields 1 μs
  resolution.  Work around the problem with high probability by
  trying clock_gettime several times and observing the resulting
  bounds on resolution.  */
@@ -79,7 +78,8 @@ gettime_res (void)
 }
   earlier = now;
 
-  r = gcd (r, now.tv_nsec ? now.tv_nsec : hz);
+  if (0 < now.tv_nsec)
+r = gcd (r, now.tv_nsec);
 }
 
   return r;
-- 
2.34.1
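
The gcd idea in this patch can be tried in isolation. The sketch below is not the Gnulib code: refine_resolution and gcd_pos are hypothetical names, and the clock samples are passed in as an array rather than read with clock_gettime, so that the arithmetic is reproducible.

```c
#include <stddef.h>

/* Greatest common divisor of two positive numbers.  */
long int
gcd_pos (long int a, long int b)
{
  while (b != 0)
    {
      long int t = a % b;
      a = b;
      b = t;
    }
  return a;
}

/* Hypothetical helper: start from the claimed resolution RES_NSEC
   (in nanoseconds) and narrow it using N observed tv_nsec values.
   Nonpositive values carry no information and are skipped.  */
long int
refine_resolution (long int res_nsec, long int const *nsec, size_t n)
{
  long int hz = 1000000000;
  long int r = res_nsec <= 0 ? hz : res_nsec;
  for (size_t i = 0; i < n; i++)
    if (0 < nsec[i])
      r = gcd_pos (r, nsec[i]);
  return r;
}
```

For instance, a claimed resolution of 10 ms combined with observed tv_nsec values of 3000 and 5000 narrows to 1000 ns, which is the AIX situation described in the comment.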



[PATCH] af_alg: port to Ubuntu 22.04

2022-05-02 Thread Paul Eggert
Without this patch, maintainer builds of coreutils fail on Ubuntu
22.04 with diagnostics like "./lib/gl_openssl.h:79:1: error:
'MD5_Init' is deprecated: Since OpenSSL 3.0
[-Werror=deprecated-declarations]".  From
<https://wiki.openssl.org/index.php/OpenSSL_1.1.0_Changes>
it appears that Gnulib needs to either define OPENSSL_API_COMPAT
to a version less than 3.0, or use a compatibility layer, or
assume OpenSSL 1.1.0 or later.  The simplest workaround is to
define OPENSSL_API_COMPAT for 1.1.1, the oldest OpenSSL release
still supported.  A better fix would be to rewrite the code to
assume OpenSSL 1.1.1 or later, and stop using the older API.
* lib/md5.h, lib/sha1.h, lib/sha256.h, lib/sha512.h, lib/sm3.h:
Define OPENSSL_API_COMPAT to 0x10101000L to suppress
the deprecation warnings on Ubuntu 22.04.
---
 ChangeLog| 18 ++
 lib/md5.h|  3 +++
 lib/sha1.h   |  3 +++
 lib/sha256.h |  3 +++
 lib/sha512.h |  3 +++
 lib/sm3.h|  3 +++
 6 files changed, 33 insertions(+)

diff --git a/ChangeLog b/ChangeLog
index 5749e2dc69..f0c9d331d4 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,21 @@
+2022-05-02  Paul Eggert  
+
+   af_alg: port to Ubuntu 22.04
+   Without this patch, maintainer builds of coreutils fail on Ubuntu
+   22.04 with diagnostics like "./lib/gl_openssl.h:79:1: error:
+   'MD5_Init' is deprecated: Since OpenSSL 3.0
+   [-Werror=deprecated-declarations]".  From
+   <https://wiki.openssl.org/index.php/OpenSSL_1.1.0_Changes>
+   it appears that Gnulib needs to either define OPENSSL_API_COMPAT
+   to a version less than 3.0, or use a compatibility layer, or
+   assume OpenSSL 1.1.0 or later.  The simplest workaround is to
+   define OPENSSL_API_COMPAT for 1.1.1, the oldest OpenSSL release
+   still supported.  A better fix would be to rewrite the code to
+   assume OpenSSL 1.1.1 or later, and stop using the older API.
+   * lib/md5.h, lib/sha1.h, lib/sha256.h, lib/sha512.h, lib/sm3.h:
+   Define OPENSSL_API_COMPAT to 0x10101000L to suppress
+   the deprecation warnings on Ubuntu 22.04.
+
 2022-05-01  Paul Eggert  
 
vasnprintf: Simplify. Reduce binary code size.
diff --git a/lib/md5.h b/lib/md5.h
index 5b92eac5ec..611c230b81 100644
--- a/lib/md5.h
+++ b/lib/md5.h
@@ -24,6 +24,9 @@
 #include 
 
 # if HAVE_OPENSSL_MD5
+#  ifndef OPENSSL_API_COMPAT
+#   define OPENSSL_API_COMPAT 0x10101000L /* FIXME: Use OpenSSL 1.1+ API.  */
+#  endif
 #  include 
 # endif
 
diff --git a/lib/sha1.h b/lib/sha1.h
index 098678d8da..bc3470a508 100644
--- a/lib/sha1.h
+++ b/lib/sha1.h
@@ -23,6 +23,9 @@
 # include 
 
 # if HAVE_OPENSSL_SHA1
+#  ifndef OPENSSL_API_COMPAT
+#   define OPENSSL_API_COMPAT 0x10101000L /* FIXME: Use OpenSSL 1.1+ API.  */
+#  endif
 #  include 
 # endif
 
diff --git a/lib/sha256.h b/lib/sha256.h
index dc9d87e615..533173a59e 100644
--- a/lib/sha256.h
+++ b/lib/sha256.h
@@ -22,6 +22,9 @@
 # include 
 
 # if HAVE_OPENSSL_SHA256
+#  ifndef OPENSSL_API_COMPAT
+#   define OPENSSL_API_COMPAT 0x10101000L /* FIXME: Use OpenSSL 1.1+ API.  */
+#  endif
 #  include 
 # endif
 
diff --git a/lib/sha512.h b/lib/sha512.h
index f38819faf0..1eb1870227 100644
--- a/lib/sha512.h
+++ b/lib/sha512.h
@@ -22,6 +22,9 @@
 # include "u64.h"
 
 # if HAVE_OPENSSL_SHA512
+#  ifndef OPENSSL_API_COMPAT
+#   define OPENSSL_API_COMPAT 0x10101000L /* FIXME: Use OpenSSL 1.1+ API.  */
+#  endif
 #  include 
 # endif
 
diff --git a/lib/sm3.h b/lib/sm3.h
index 5d606fe7d8..2efe800a12 100644
--- a/lib/sm3.h
+++ b/lib/sm3.h
@@ -31,6 +31,9 @@
 # include 
 
 # if HAVE_OPENSSL_SM3
+#  ifndef OPENSSL_API_COMPAT
+#   define OPENSSL_API_COMPAT 0x10101000L /* FIXME: Use OpenSSL 1.1+ API.  */
+#  endif
 #  include 
 # endif
 
-- 
2.34.1
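
For readers wondering what 0x10101000L stands for: as far as I know it follows the pre-3.0 OpenSSL version layout 0xMNNFFPPSL (major, minor, fix, patch, status), so it denotes 1.1.1. The following sketch decodes it; decode_api_compat and struct api_version are names invented for this example, and it deliberately includes no OpenSSL header so that it builds anywhere.

```c
/* Same guard as in the patch: respect a value chosen by the builder.  */
#ifndef OPENSSL_API_COMPAT
# define OPENSSL_API_COMPAT 0x10101000L
#endif

/* Hypothetical type for this example only.  */
struct api_version { int major, minor, fix; };

/* Split a 0xMNNFFPPSL value into its major, minor and fix fields.  */
struct api_version
decode_api_compat (long int v)
{
  struct api_version r;
  r.major = (int) ((v >> 28) & 0xf);
  r.minor = (int) ((v >> 20) & 0xff);
  r.fix = (int) ((v >> 12) & 0xff);
  return r;
}
```

The #ifndef matters: a package that already defines OPENSSL_API_COMPAT on the command line keeps its own choice.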




Re: vasnprintf.c: "out_of_memory", -Wanalyzer-free-of-non-heap, -Wanalyzer-malloc-leak

2022-05-01 Thread Paul Eggert

On 5/1/22 14:28, Bruno Haible wrote:

I pushed these three patches in your name. (I hope that's fine with you.)


Thanks, looks good.



Re: malloc failing with EAGAIN

2022-05-01 Thread Paul Eggert

On 5/1/22 13:28, Bruno Haible wrote:

I would argue that glibc should use a different errno value in this case.


Either errno value makes sense to me. If you keep doing a non-blocking 
read on a pipe that only you can write to you'll keep getting EAGAIN, 
which has the same feel as a bug where you grab a lock and then keep 
doing a malloc that won't work until you release the lock.


No big deal either way of course.



Re: vasnprintf.c: "out_of_memory", -Wanalyzer-free-of-non-heap, -Wanalyzer-malloc-leak

2022-05-01 Thread Paul Eggert

On 5/1/22 10:16, Bjarni Ingi Gislason wrote:

   I checked what of the about 30 options, I use for compiling "groff",
could cause the only warning (leak) about "vasnprintf.c" and it was the
"-flto".


Ah, the "-flto" suggests that the problem isn't in vasnprintf per se; 
it's in groff, which isn't freeing storage allocated by vasnprintf.


Possibly it's a real memory leak, but possibly it's not really a problem 
at all. It's OK - indeed, a win - to not free storage if you're about to 
exit anyway.




  The compilation of "groff" then got finished with a lot of warnings
(149) about its code.


It's up to you as to whether to investigate these warnings more 
carefully. Please check to see whether they actually make sense before 
reporting, due to the high number of false positives in this area.




Re: vasnprintf.c: "out_of_memory", -Wanalyzer-free-of-non-heap, -Wanalyzer-malloc-leak

2022-04-30 Thread Paul Eggert

On 4/30/22 17:24, Bruno Haible wrote:

These dependencies save an 'errno = ENOMEM;' assignment in one or
two places, but can cause integration problems; I am especially
thinking at the use in GNU libintl and libasprintf.


Ah, I didn't know about the integration problems. I was worried about 
the case where malloc fails with some errno value other than ENOMEM and 
that errno value should be reflected to the user; but that's less 
important than getting integration right. (malloc can fail with EAGAIN 
on GNU/Linux and I assume other errno values are also possible.)
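
To illustrate what I mean by reflecting malloc's errno to the caller, here is a small sketch with a single cleanup path. It is not the vasnprintf code; join2 is a made-up function, and it assumes a POSIX-style malloc that sets errno on failure. It saves errno around free because not every libc guarantees that free leaves errno alone.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical example: return a malloc'd copy of A followed by B,
   or NULL with errno set by whichever step failed.  */
char *
join2 (char const *a, char const *b)
{
  size_t alen = strlen (a), blen = strlen (b);
  char *scratch = NULL;  /* Freed on every path.  */
  char *result = NULL;   /* Returned on success, NULL on failure.  */

  if (SIZE_MAX - alen <= blen)
    {
      errno = ENOMEM;    /* Size overflow: no malloc call set errno.  */
      goto done;
    }
  scratch = malloc (alen + blen + 1);
  if (scratch == NULL)
    goto done;           /* Keep whatever errno malloc set.  */
  memcpy (scratch, a, alen);
  memcpy (scratch + alen, b, blen + 1);

  result = malloc (alen + blen + 1);
  if (result == NULL)
    goto done;
  memcpy (result, scratch, alen + blen + 1);

 done:;
  int saved_errno = errno;
  free (scratch);        /* free (NULL) is a no-op, so no 'if' is needed.  */
  errno = saved_errno;
  return result;
}
```

The caller sees ENOMEM, EAGAIN, or whatever malloc reported, rather than a value the function substituted.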



It would be worth to eliminate the false positive reports by GCC.


Yes, though the weird thing is I'm also using GCC 11.3.1 and am not 
getting the false positives.



We could
use
   assume (result==resultbuf)
for one part.


I am surprised GCC doesn't deduce that itself; it's part of that same 
weird thing.


If this happens only with unusual configuration settings I expect we 
don't need to worry about it. It's just a warning.



buf_malloced is NULL 99% of the time; here I prefer the code
that saves a function call.


Good point; hadn't noticed that. I suppose this can also help make 
branch prediction more accurate.




Re: vasnprintf.c: "out_of_memory", -Wanalyzer-free-of-non-heap, -Wanalyzer-malloc-leak

2022-04-30 Thread Paul Eggert

On 4/30/22 14:11, Bruno Haible wrote:

This is a false positive as well:


Thanks for checking that. I did a similar check before seeing your 
email, and found some opportunities for simplifying the code so that 
these checks could be easier in the future. (With luck it'd also help 
avoid false positives from lower-quality static checkers, which would 
save us time in the future.) What do you think of the attached patch?


A bonus is that it shrinks the size of the vasnprintf text by about 7% 
on Fedora 35 x86-64.

From 9fcda8e08c51791e35441cc19c4d9275211b02f0 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Sat, 30 Apr 2022 15:57:48 -0700
Subject: [PATCH] vasnprintf: simplify cleanup code
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This shrinks the text size by 7% on my platform,
and makes it a bit easier to understand (for me at least).
* lib/vasnprintf.c (divide, VASNPRINTF):
Just call free (x) instead of doing ‘if (x != NULL) free (x);’.
(VASNPRINTF): Simplify by coalescing cleanup code.
Preserve malloc, realloc errno instead of replacing
it with ENOMEM.
(CLEANUP): Remove; no longer needed.
* modules/c-vasnprintf, modules/vasnprintf (Depends-on):
Depend on malloc-posix, realloc-posix so that we can
rely on errno being set on failure.
---
 ChangeLog|  15 +++
 lib/vasnprintf.c | 298 ++-
 modules/c-vasnprintf |   2 +
 modules/vasnprintf   |   2 +
 4 files changed, 85 insertions(+), 232 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 7c7ed13141..5e7f0de809 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,18 @@
+2022-04-30  Paul Eggert  
+
+	vasnprintf: simplify cleanup code
+	This shrinks the text size by 7% on my platform,
+	and makes it a bit easier to understand (for me at least).
+	* lib/vasnprintf.c (divide, VASNPRINTF):
+	Just call free (x) instead of doing ‘if (x != NULL) free (x);’.
+	(VASNPRINTF): Simplify by coalescing cleanup code.
+	Preserve malloc, realloc errno instead of replacing
+	it with ENOMEM.
+	(CLEANUP): Remove; no longer needed.
+	* modules/c-vasnprintf, modules/vasnprintf (Depends-on):
+	Depend on malloc-posix, realloc-posix so that we can
+	rely on errno being set on failure.
+
 2022-04-30  Bruno Haible  
 
 	string: Avoid syntax error on glibc systems with GCC 11.
diff --git a/lib/vasnprintf.c b/lib/vasnprintf.c
index 485745243f..d20d30dc9f 100644
--- a/lib/vasnprintf.c
+++ b/lib/vasnprintf.c
@@ -915,8 +915,7 @@ divide (mpn_t a, mpn_t b, mpn_t *q)
   q_ptr[q_len++] = 1;
 }
   keep_q:
-  if (tmp_roomptr != NULL)
-free (tmp_roomptr);
+  free (tmp_roomptr);
   q->limbs = q_ptr;
   q->nlimbs = q_len;
   return roomptr;
@@ -1865,29 +1864,19 @@ VASNPRINTF (DCHAR_T *resultbuf, size_t *lengthp,
 /* errno is already set.  */
 return NULL;
 
-  /* Frees the memory allocated by this function.  Preserves errno.  */
-#define CLEANUP() \
-  if (d.dir != d.direct_alloc_dir)  \
-free (d.dir);   \
-  if (a.arg != a.direct_alloc_arg)  \
-free (a.arg);
+  TCHAR_T *buf_malloced = NULL;
+  /* Output string accumulator.  */
+  DCHAR_T *result = resultbuf;
 
   if (PRINTF_FETCHARGS (args, &a) < 0)
-{
-  CLEANUP ();
-  errno = EINVAL;
-  return NULL;
-}
-
-  {
+errno = EINVAL;
+  else
+   {
 size_t buf_neededlength;
 TCHAR_T *buf;
-TCHAR_T *buf_malloced;
 const FCHAR_T *cp;
 size_t i;
 DIRECTIVE *dp;
-/* Output string accumulator.  */
-DCHAR_T *result;
 size_t allocated;
 size_t length;
 
@@ -1897,32 +1886,20 @@ VASNPRINTF (DCHAR_T *resultbuf, size_t *lengthp,
   xsum4 (7, d.max_width_length, d.max_precision_length, 6);
 #if HAVE_ALLOCA
 if (buf_neededlength < 4000 / sizeof (TCHAR_T))
-  {
-buf = (TCHAR_T *) alloca (buf_neededlength * sizeof (TCHAR_T));
-buf_malloced = NULL;
-  }
+  buf = (TCHAR_T *) alloca (buf_neededlength * sizeof (TCHAR_T));
 else
 #endif
   {
 size_t buf_memsize = xtimes (buf_neededlength, sizeof (TCHAR_T));
 if (size_overflow_p (buf_memsize))
-  goto out_of_memory_1;
+  goto out_of_memory;
 buf = (TCHAR_T *) malloc (buf_memsize);
 if (buf == NULL)
-  goto out_of_memory_1;
+  goto fail;
 buf_malloced = buf;
   }
 
-if (resultbuf != NULL)
-  {
-result = resultbuf;
-allocated = *lengthp;
-  }
-else
-  {
-result = NULL;
-allocated = 0;
-  }
+allocated = result != NULL ? *lengthp : 0;
 length = 0;
 /* Invariants:
result is either == resultbuf or == NULL or malloc-allocated.
@@ -1942,7 +1919,7 @@ VASNPRINTF (DCHAR_T *resultbuf, size_t *lengthp,
 memory_size = xtimes (allocated, sizeof (DCHAR_T));  \
 if (size_ov

Re: vasnprintf.c: "out_of_memory", -Wanalyzer-free-of-non-heap, -Wanalyzer-malloc-leak

2022-04-30 Thread Paul Eggert

On 4/30/22 07:13, Bjarni Ingi Gislason wrote:

   With latest gnulib version:


I'm not seeing this problem with the current 
(84863a1c4dc8cca8fb0f6f670f67779cdd2d543b) gnulib version on Fedora 35 
x86-64, which has GCC 11.3.1 20220421 (Red Hat 11.3.1-2).


Here's how I tried to reproduce the issue:

./gnulib-tool -h --create-testdir --dir foo vasnprintf
cd foo
./configure
make CFLAGS='-fanalyzer -Wanalyzer-mismatching-deallocation -O2' check

Does the above work for you? If so, how does it differ from what groff does?

The idea is to make the problem reproducible without dealing with groff 
or with whatever changes you made to groff. If that's not possible, I 
guess we'll need a copy of your groff source since it sounds like you've 
modified groff.




Re: glob.m4 leaves the file 'conf-file' behind

2022-04-28 Thread Paul Eggert
Thanks for reporting that; I installed the attached. It also cleans up a 
few test files left behind.

From b5c8b3e7603827ad5b8e0b3c21060cdc49a49339 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Thu, 28 Apr 2022 14:40:48 -0700
Subject: [PATCH] glob: improve config and test cleanup
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Config problem reported by Benno Schulenberg in:
https://lists.gnu.org/r/bug-gnulib/2022-04/msg00071.html
* m4/glob.m4 (gl_GLOB): Clean up temporary file.
Also, name it conf$$-file not conf-file, so it’s cleaned
up on interrupt.
* modules/glob-tests (MOSTLYCLEANFILES):
Append test-glob.tglobfile, test-glob.tgloblink[123].
---
 ChangeLog  | 11 +++
 m4/glob.m4 |  6 +++---
 modules/glob-tests |  5 +
 3 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 3ce3d85884..f111e479bf 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,14 @@
+2022-04-28  Paul Eggert  
+
+	glob: improve config and test cleanup
+	Config problem reported by Benno Schulenberg in:
+	https://lists.gnu.org/r/bug-gnulib/2022-04/msg00071.html
+	* m4/glob.m4 (gl_GLOB): Clean up temporary file.
+	Also, name it conf$$-file not conf-file, so it’s cleaned
+	up on interrupt.
+	* modules/glob-tests (MOSTLYCLEANFILES):
+	Append test-glob.tglobfile, test-glob.tgloblink[123].
+
 2022-04-26  Paul Eggert  
 
 	glob: port to NetBSD 9.2
diff --git a/m4/glob.m4 b/m4/glob.m4
index cf5f93930c..f59b84ff05 100644
--- a/m4/glob.m4
+++ b/m4/glob.m4
@@ -1,4 +1,4 @@
-# glob.m4 serial 25
+# glob.m4 serial 26
 dnl Copyright (C) 2005-2007, 2009-2022 Free Software Foundation, Inc.
 dnl This file is free software; the Free Software Foundation
 dnl gives unlimited permission to copy and/or distribute it,
@@ -70,7 +70,7 @@ char a[_GNU_GLOB_INTERFACE_VERSION == 1 || _GNU_GLOB_INTERFACE_VERSION == 2 ? 1
   AC_CACHE_CHECK([whether glob NOTDIR*/ omits symlink to nondir],
  [gl_cv_glob_omit_nondir_symlinks],
 [if test $cross_compiling != yes; then
-   if ln -s conf-file conf$$-globtest 2>/dev/null && touch conf-file
+   if ln -s conf$$-file conf$$-globtest 2>/dev/null && touch conf$$-file
then
  gl_cv_glob_omit_nondir_symlinks=maybe
else
@@ -94,7 +94,7 @@ char a[_GNU_GLOB_INTERFACE_VERSION == 1 || _GNU_GLOB_INTERFACE_VERSION == 2 ? 1
 :
])
fi
-   rm -f conf$$-globtest
+   rm -f conf$$-file conf$$-globtest
  else
gl_cv_glob_omit_nondir_symlinks="$gl_cross_guess_normal"
  fi
diff --git a/modules/glob-tests b/modules/glob-tests
index f551f6c950..ec519cf38d 100644
--- a/modules/glob-tests
+++ b/modules/glob-tests
@@ -12,3 +12,8 @@ Makefile.am:
 TESTS += test-glob
 check_PROGRAMS += test-glob
 test_glob_LDADD = $(LDADD) $(LIB_MBRTOWC)
+MOSTLYCLEANFILES += \
+  test-glob.tglobfile \
+  test-glob.tgloblink1 \
+  test-glob.tgloblink2 \
+  test-glob.tgloblink3
-- 
2.35.1



Re: build failure of glob module on NetBSD: request for member ‘dd_fd’...

2022-04-26 Thread Paul Eggert

On 4/21/22 00:41, Benno Schulenberg wrote:


glob.c:1361:21: error: request for member ‘dd_fd’ in something not a structure 
or
union


Thanks for reporting that. That's due to a bug in NetBSD 9.2's 
implementation of dirfd. It's not implemented as a function (which is a 
bug in and of itself; POSIX says it must work even if you #undef it), 
and its macro doesn't work with a void * argument (where a function 
would work).


A good way to fix this would be to modify the dirfd module to work 
around the NetBSD bugs. I took the easy way out, though, and simply 
documented the bugs and modified glob to not run afoul of the bugs, by 
installing the attached patch into Gnulib.

diff --git a/ChangeLog b/ChangeLog
index ddd4826bcf..3ce3d85884 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,14 @@
+2022-04-26  Paul Eggert  
+
+	glob: port to NetBSD 9.2
+	Problem reported by Benno Schulenberg in:
+	https://lists.gnu.org/r/bug-gnulib/2022-04/msg00052.html
+	* doc/posix-functions/dirfd.texi: Document NetBSD 9.2 portability
+	bugs.  Remove an old comment about errno that is no longer true
+	of POSIX 2018.
+	* lib/glob.c (glob_in_dir): Convert dirfd arg from void *
+	to DIR * before passing it to dirfd.
+
 2022-04-21  Paul Eggert  
 
 	regex: match [...---...] like V7 grep
diff --git a/doc/posix-functions/dirfd.texi b/doc/posix-functions/dirfd.texi
index 46ad5fe7d4..d0f6c8cdcd 100644
--- a/doc/posix-functions/dirfd.texi
+++ b/doc/posix-functions/dirfd.texi
@@ -18,8 +18,9 @@ Portability problems not fixed by Gnulib:
 @item
 This function always fails on some platforms:
 mingw.
-@end itemize
 
-With the @code{dirfd} module, this functions always sets @code{errno} when it
-fails. (POSIX does not require that @code{dirfd} sets @code{errno} when it
-fails.)
+@item
+There is a @code{dirfd} macro but no function, and the macro does not
+work with an argument of type @code{void *}, as a function would:
+NetBSD 9.2.
+@end itemize
diff --git a/lib/glob.c b/lib/glob.c
index f6993a3706..57cb3bd1d1 100644
--- a/lib/glob.c
+++ b/lib/glob.c
@@ -1357,7 +1357,8 @@ glob_in_dir (const char *pattern, const char *directory, int flags,
 }
   else
 {
-  int dfd = dirfd (stream);
+  DIR *dirp = stream;
+  int dfd = dirfd (dirp);
   int fnm_flags = ((!(flags & GLOB_PERIOD) ? FNM_PERIOD : 0)
| ((flags & GLOB_NOESCAPE) ? FNM_NOESCAPE : 0));
   flags |= GLOB_MAGCHAR;
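
The workaround in glob.c can be shown in a few lines. This is just a sketch of the pattern, with stream_fd as an invented name: when the directory stream arrives as a void *, convert it to DIR * first so that a macro implementation of dirfd sees the type it expects.

```c
#define _POSIX_C_SOURCE 200809L
#include <dirent.h>
#include <sys/types.h>

/* Hypothetical wrapper: STREAM is really a DIR * that was passed
   around as void *.  */
int
stream_fd (void *stream)
{
  DIR *dirp = stream;   /* Implicit void * to DIR * conversion.  */
  return dirfd (dirp);  /* Works whether dirfd is a function or a macro.  */
}
```

Calling dirfd (stream) directly would be fine with a function, since the argument would be converted by the prototype, but a macro that dereferences its argument cannot do that with a void *.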


Re: bug#20657: Accepting [xyz---abc] - three minus signs to mean one

2022-04-24 Thread Paul Eggert

On 4/24/22 06:21, arn...@skeeve.com wrote:

I plan to add a test to gawk; perhaps grep would benefit from one as well?


That'd need more than just a test, as we'd need to also modify regex.m4 
to arrange to replace any system regex that has this incompatibility 
with gnulib regex. And we'd need to document the extension since we 
shouldn't test undocumented features. Although such work could be done, 
I expect it'd be a more productive use of our limited time to get this 
extension into glibc first. I'll add that to my (long) list of things to do.




Re: [BUG REPORT] gnulib Android NDK and/or Cygwin build failure regression after 0c8a563f

2022-04-24 Thread Paul Eggert

On 4/3/22 19:22, osm0...@outlook.com wrote:

Examined a copy of the conftest.out file the script parses those from and 
noticed it had CR+LF due to being from Windows Android NDK and/or on Cygwin,


Sounds like your Cygwin shell is misconfigured. You might try setting 
SHELLOPTS='igncr' in your environment before running the shell.




Re: Accepting [xyz---abc] - three minus signs to mean one

2022-04-21 Thread Paul Eggert

On 4/21/22 00:57, Arnold Robbins wrote:


As far as my testing indicates, dfa.c doesn't need a patch, it seems
to accept "---" inside brackets for a single minus.


Yes, a brief perusal of the dfa.c source code suggests you're right. 
Thanks for looking into this. I tend to agree with you that POSIX is not 
likely to outlaw this extension.




If there are no objections, can we get this into Gnulib?


Although the basic idea looks good, I see a few places where the patch 
can be improved.


* The two calls to re_string_peek_byte might go past the end of the 
pattern (a subscript violation). This is possible because the pattern is 
not necessarily null-terminated.


* The two calls to re_string_fetch_byte can be simplified into a single 
call to re_string_skip_bytes.


* No need to assign to token->opr.c, as it already has the correct value.

* Can fall through to the default case to save a bit of duplicate code.

* glibc still uses comments /* like this */ for style reasons, and we 
should stick to that.


I wrote a patch with these improvements in mind and installed it into 
Gnulib (see attached); hope it works for Gawk too.

From dd83dfb3f2d2e5139ea7d00240b5441daa0b3a56 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Thu, 21 Apr 2022 18:56:12 -0700
Subject: [PATCH] regex: match [...---...] like V7 grep

Problem reported by Arnold Robbins in:
https://bugs.gnu.org/20657
https://lists.gnu.org/r/bug-gnulib/2022-04/msg00053.html
* lib/regcomp.c (peek_token_bracket): Let [...---...] match '-'.
This is an extension to POSIX, and matches V7 Unix grep.
---
 ChangeLog |  9 +
 lib/regcomp.c | 16 +---
 2 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index cd16bbe0cd..ddd4826bcf 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,12 @@
+2022-04-21  Paul Eggert  
+
+	regex: match [...---...] like V7 grep
+	Problem reported by Arnold Robbins in:
+	https://bugs.gnu.org/20657
+	https://lists.gnu.org/r/bug-gnulib/2022-04/msg00053.html
+	* lib/regcomp.c (peek_token_bracket): Let [...---...] match '-'.
+	This is an extension to POSIX, and matches V7 Unix grep.
+
 2022-04-20  Paul Eggert  
 
 	backupfile: fix bug when renaming simple backups
diff --git a/lib/regcomp.c b/lib/regcomp.c
index b607c85320..122c3de58c 100644
--- a/lib/regcomp.c
+++ b/lib/regcomp.c
@@ -2038,15 +2038,25 @@ peek_token_bracket (re_token_t *token, re_string_t *input, reg_syntax_t syntax)
 }
   switch (c)
 {
-case '-':
-  token->type = OP_CHARSET_RANGE;
-  break;
 case ']':
   token->type = OP_CLOSE_BRACKET;
   break;
 case '^':
   token->type = OP_NON_MATCH_LIST;
   break;
+case '-':
+  /* In V7 Unix grep and Unix awk and mawk, [...---...]
+ (3 adjacent minus signs) stands for a single minus sign.
+ Support that without breaking anything else.  */
+  if (! (re_string_cur_idx (input) + 2 < re_string_length (input)
+ && re_string_peek_byte (input, 1) == '-'
+ && re_string_peek_byte (input, 2) == '-'))
+{
+  token->type = OP_CHARSET_RANGE;
+  break;
+}
+  re_string_skip_bytes (input, 2);
+  FALLTHROUGH;
 default:
   token->type = CHARACTER;
 }
-- 
2.35.1



Re: bug#54764: encode-time: make DST and TIMEZONE fields of the list argument optional ones

2022-04-21 Thread Paul Eggert
What appears to be happening here is that the MS-Windows native 
timestamp resolution is 1/64th of a second, and your system's clock is 
offset by 0.0075 s from an integer boundary. I.e., the timestamps in 
increasing order are:


  ...
  1650522862 + 62/64 + 0.0075 = 1650522862.976250
  1650522862 + 63/64 + 0.0075 = 1650522862.991875
  1650522863 +  0/64 + 0.0075 = 1650522863.007500
  1650522863 +  1/64 + 0.0075 = 1650522863.023125
  1650522863 +  2/64 + 0.0075 = 1650522863.038750
  ...

and the system clock never returns a timestamp on an integer boundary 
(i.e., tv_nsec is never zero).


We have two options to express this as Emacs timestamps:

(1) We can keep information about resolution but lose information about 
time, by using a resolution of 15.625 ms (i.e., 1/64 s) and truncating 
timestamps to the nearest 1/64 second.  This would generate the 
following (TICKS . HZ) timestamps:


  ...
  (105633463230 . 64) = 1650522862 + 62/64 = 1650522862.968750
  (105633463231 . 64) = 1650522862 + 63/64 = 1650522862.984375
  (105633463232 . 64) = 1650522863 +  0/64 = 1650522863.000000
  (105633463233 . 64) = 1650522863 +  1/64 = 1650522863.015625
  (105633463234 . 64) = 1650522863 +  2/64 = 1650522863.031250
  ...

(2) We can keep information about time but lose information about the 
resolution, by using a resolution of 0.625 ms (i.e., HZ = 1000000000 / 
625000 = 1600). (We use 0.625 ms because it is the coarsest resolution 
that does not lose time info.) This would generate the following (TICKS 
. HZ) timestamps:


  ...
  (2640836580762 . 1600) = 1650522862 + 1562/1600 = 1650522862.976250
  (2640836580787 . 1600) = 1650522862 + 1587/1600 = 1650522862.991875
  (2640836580812 . 1600) = 1650522863 +   12/1600 = 1650522863.007500
  (2640836580837 . 1600) = 1650522863 +   37/1600 = 1650522863.023125
  (2640836580862 . 1600) = 1650522863 +   62/1600 = 1650522863.038750
  ...

The patch does (2), and this explains the "gettime_res returned 625000 
ns" in your output.


It shouldn't be hard to change the patch to do (1), if desired. I doubt 
whether users will care one way or the other.
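
For anyone who wants to check the arithmetic, here is a small sketch of how a seconds-plus-nanoseconds timestamp maps to a (TICKS . HZ) pair under options (1) and (2). The struct and function names are made up for illustration and are not Emacs internals; the sketch assumes nonnegative timestamps and an HZ that divides 1000000000.

```c
#include <assert.h>
#include <stdint.h>

/* A (TICKS . HZ) pair; illustrative only, not an Emacs data type.  */
struct ticks_hz { int64_t ticks; int64_t hz; };

/* Convert SEC seconds plus NSEC nanoseconds to a count of 1/HZ-second
   ticks, discarding any fraction of a tick.  HZ must divide
   1000000000 so that the tick length is a whole number of ns.  */
static struct ticks_hz
to_ticks (int64_t sec, int64_t nsec, int64_t hz)
{
  assert (0 < hz && 1000000000 % hz == 0);
  assert (0 <= sec && 0 <= nsec && nsec < 1000000000);
  int64_t ns_per_tick = 1000000000 / hz;
  return (struct ticks_hz) { sec * hz + nsec / ns_per_tick, hz };
}
```

With the sample 1650522862.976250, to_ticks (1650522862, 976250000, 64) drops the 0.0075 s offset and yields 105633463230 ticks, whereas HZ = 1600 yields 2640836580762 ticks and keeps the offset.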



> don't we use time values for file timestamps?

Yes, but file timestamps should use the resolution of the file system, 
which in general is different from the resolution of the system clock. 
That's a separate matter, which would be the subject of a separate 
patch. For now we can stick with what we already have in that department.



> And for Windows, all this does is measure the "resolution" of the
> Gnulib emulation of timespec functions on MS-Windows, it tells nothing
> about the real resolution of the system time values.

If Emacs Lisp code (which currently is based on the Gnulib code) can see 
only (say) 1-microsecond timestamps, then it doesn't matter that the 
underlying system clock has (say) 1-nanosecond precision. Of course it 
would be better for Emacs to see the system timestamp resolution, and if 
we can get the time of day on MS-Windows to a precision better than 1/64 
second then I assume Emacs should do that. Once it does, the patch 
should calculate a higher HZ value to tell users about the improved 
resolution.



> if the "time resolution" determined by this procedure
> is different between two systems, does it mean that two time values
> that are 'equal' on one of them could be NOT 'equal' on another?

Sure, but the traditional (HIGH LOW MICROSEC PICOSEC) representation has 
the same issue. For example, if A's clock has 1 ms resolution and B's 
clock has 10 ms resolution, A's (time-convert nil 'list) called twice 
would return (say) the two timestamps (25184 64239 1000 0) and (25184 
64239 1001 0) at the same moments that B's calls would return (25184 
64239 1000 0) twice. A would say that the two timestamps differ; B would 
say they're the same.


This sort of disagreement is inherent to how timestamp resolution works. 
It doesn't matter whether the timestamps are represented by (HIGH LOW 
MICROSEC PICOSEC) or by (TICKS . HZ); users will run into the same 
problem in both cases.





Re: bug#55029: Simple backup swaps source and destination files

2022-04-20 Thread Paul Eggert

On 4/19/22 16:05, Steve Ward wrote:

When doing mv or cp with --backup=simple, if an existing file in
DIRECTORY has the same name as SOURCE, the files appear to be swapped
instead of an in-place backup of the original file in DIRECTORY being
made.


Thanks for the bug report. That's new to coreutils 9.1, and is a big 
enough fail that it suggests we'll need a 9.2 sooner rather than later. 
I introduced the bug when fixing an earlier bug (sorry).


I installed the attached Gnulib patch, which should fix the bug in 
Coreutils, with the attached two Coreutils patches to update to the 
latest Gnulib, and to add a test case for the bug.

From 7347caeb9d902d3fca2c11f69a55a3e578d93bfe Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Wed, 20 Apr 2022 19:34:57 -0700
Subject: [PATCH] backupfile: fix bug when renaming simple backups

* lib/backupfile.c (backupfile_internal): Fix bug when RENAME
and when doing simple backups.  Problem reported by Steve Ward in:
https://bugs.gnu.org/55029
---
 ChangeLog| 5 +
 lib/backupfile.c | 7 +++
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 4b39a6a443..cd16bbe0cd 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,10 @@
 2022-04-20  Paul Eggert  
 
+	backupfile: fix bug when renaming simple backups
+	* lib/backupfile.c (backupfile_internal): Fix bug when RENAME
+	and when doing simple backups.  Problem reported by Steve Ward in:
+	https://bugs.gnu.org/55029
+
 	gettime-res: more-robust sampling
 	* lib/gettime-res.c (gettime_res): If adjacent timestamps are
 	identical search for a differing timestamp.  Also, stop collecting
diff --git a/lib/backupfile.c b/lib/backupfile.c
index 1e9290a187..d9f465a3e0 100644
--- a/lib/backupfile.c
+++ b/lib/backupfile.c
@@ -332,7 +332,7 @@ backupfile_internal (int dir_fd, char const *file,
 return s;
 
   DIR *dirp = NULL;
-  int sdir = AT_FDCWD;
+  int sdir = dir_fd;
   idx_t base_max = 0;
   while (true)
 {
@@ -371,10 +371,9 @@ backupfile_internal (int dir_fd, char const *file,
   if (! rename)
 break;
 
-  int olddirfd = sdir < 0 ? dir_fd : sdir;
-  idx_t offset = sdir < 0 ? 0 : base_offset;
+  idx_t offset = backup_type == simple_backups ? 0 : base_offset;
   unsigned flags = backup_type == simple_backups ? 0 : RENAME_NOREPLACE;
-  if (renameatu (olddirfd, file + offset, sdir, s + offset, flags) == 0)
+  if (renameatu (sdir, file + offset, sdir, s + offset, flags) == 0)
 break;
   int e = errno;
   if (! (e == EEXIST && extended))
-- 
2.35.1

From d1be566b18b9df34a22d61c9aa92bde00a4a6f0e Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Wed, 20 Apr 2022 19:36:44 -0700
Subject: [PATCH 1/2] build: update gnulib submodule to latest

---
 gnulib | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gnulib b/gnulib
index 58c597d13..7347caeb9 160000
--- a/gnulib
+++ b/gnulib
@@ -1 +1 @@
-Subproject commit 58c597d13bc57dce3e97ea97856573f2d68ccb8c
+Subproject commit 7347caeb9d902d3fca2c11f69a55a3e578d93bfe
-- 
2.35.1

From 56b314b384192ab75c23c281968a38ac2cb31617 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Wed, 20 Apr 2022 19:44:56 -0700
Subject: [PATCH 2/2] mv: test Bug#55029

* tests/mv/backup-dir.sh: New test for Bug#55029,
reported by Steve Ward.
---
 NEWS   | 5 +
 tests/mv/backup-dir.sh | 6 ++
 2 files changed, 11 insertions(+)

diff --git a/NEWS b/NEWS
index 7bedb0617..26eb52ca0 100644
--- a/NEWS
+++ b/NEWS
@@ -2,6 +2,11 @@ GNU coreutils NEWS-*- outline -*-
 
 * Noteworthy changes in release ?.? (????-??-??) [?]
 
+** Bug fixes
+
+  'mv --backup=simple f d/' no longer mistakenly backs up d/f to f~.
+  [bug introduced in coreutils-9.1]
+
 
 * Noteworthy changes in release 9.1 (2022-04-15) [stable]
 
diff --git a/tests/mv/backup-dir.sh b/tests/mv/backup-dir.sh
index 84c51afc8..2f708b5b6 100755
--- a/tests/mv/backup-dir.sh
+++ b/tests/mv/backup-dir.sh
@@ -36,4 +36,10 @@ mkdir C D E || framework_failure_
 mv -T --backup=numbered C E/ || fail=1
 mv -T --backup=numbered D E/ || fail=1
 
+# Bug#55029
+mkdir F && echo 1 >1 && echo 2 >2 && cp 1 F/X && cp 2 X || framework_failure_
+mv --backup=simple X F/ || fail=1
+compare 1 F/X~ || fail=1
+compare 2 F/X || fail=1
+
 Exit $fail
-- 
2.35.1



Re: bug#54764: encode-time: make DST and TIMEZONE fields of the list argument optional ones

2022-04-20 Thread Paul Eggert

On 4/20/22 12:30, Eli Zaretskii wrote:


I see the time samples change in jumps of 15 msec.


Could you give the first part of the output? I would like to see what 
the samples are jumping from and to, and how often they jump.


Something like the following is what I'd hope to see from the first 
lines of the output of 'gllib/test-gettime-res x'. What are you seeing?


gettime_res returned 15625000 ns
time = 1650496481.515625000
time = 1650496481.531250000
time = 1650496481.546875000
time = 1650496481.562500000
time = 1650496481.578125000
time = 1650496481.593750000
time = 1650496481.609375000
time = 1650496481.625000000
time = 1650496481.640625000
time = 1650496481.656250000


 Which is expected
on MS-Windows, given the scheduler time tick, but what does that have
to do with the system's time resolution?


The resolution of Elisp's (time-convert nil t) is determined by the 
smallest nonzero gap between timestamps that are returned by C's 
current_timespec. This is the system time resolution as far as Elisp is 
concerned, because Elisp cannot return the current time at a finer 
resolution than what current_timespec gives it. This resolution is not 
necessarily the same as the time resolution of the motherboard clock, OS 
high-res timestamp, file timestamps, etc.



And how is the 0.625 msec
number reported by the program obtained from those samples?


Use the largest resolution R ns consistent with the samples, such that 
1000000000 is an integer multiple of R so that timestamp computations 
are exact.
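
One way to obtain such an R is to take the greatest common divisor of 1000000000 and the nanosecond parts of the samples. The following is only a sketch of that idea under this assumption; it is not a copy of lib/gettime-res.c, and the function names are illustrative.

```c
#include <assert.h>
#include <stddef.h>

/* Greatest common divisor of nonnegative A and B, by Euclid's method.  */
static long
gcd (long a, long b)
{
  assert (0 <= a && 0 <= b);
  while (b != 0)
    {
      long r = a % b;
      a = b;
      b = r;
    }
  return a;
}

/* Return the largest R (in ns) that divides both 1000000000 and every
   tv_nsec value in NSEC[0..N-1].  */
static long
resolution_from_samples (long const *nsec, size_t n)
{
  long res = 1000000000;
  for (size_t i = 0; i < n; i++)
    res = gcd (res, nsec[i]);
  return res;
}
```

Fed nanosecond parts such as 976250000, 991875000 and 7500000 (a 1/64 s clock offset by 0.0075 s), this yields 625000, i.e., 0.625 ms. Fed samples that sit exactly on 1/64 s boundaries, it yields 15625000.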




Re: bug#54764: encode-time: make DST and TIMEZONE fields of the list argument optional ones

2022-04-20 Thread Paul Eggert

On 4/20/22 12:14, Eli Zaretskii wrote:

Sorry, my bad.  The result is the same, but I do get printouts.  What
do you want to know or see from there?


I want to see what the current_timespec's resolution is, which we should 
be able to tell from the debugging output. For example, on my Solaris 10 
sparc platform the command 'gltests/test-gettime-res x' outputs:


gettime_res returned 200 ns
time = 1650482432.256445600
time = 1650482432.256460600
time = 1650482432.256464400
time = 1650482432.256468200
time = 1650482432.256471400
time = 1650482432.256474600
time = 1650482432.256478000
time = 1650482432.256481200
time = 1650482432.256484800
...

and these timestamps say that with very high probability 
current_timespec's clock resolution is indeed 200 ns.




[PATCH 2/2] Port _GL_HAS_C_ATTRIBUTE to pedantic gcc -std=c99

2022-04-19 Thread Paul Eggert
* m4/gnulib-common.m4 (_GL_HAS_C_ATTRIBUTE):
Disable -Wpedantic if using __has_c_attribute and this is not C2x.
---
 ChangeLog   | 4 
 m4/gnulib-common.m4 | 6 +-
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/ChangeLog b/ChangeLog
index c2509b8387..fee51331c8 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,9 @@
 2022-04-19  Paul Eggert  
 
+   Port _GL_HAS_C_ATTRIBUTE to pedantic gcc -std=c99
+   * m4/gnulib-common.m4 (_GL_HAS_C_ATTRIBUTE):
+   Disable -Wpedantic if using __has_c_attribute and this is not C2x.
+
verify: port to pedantic gcc -std=c99
* lib/verify.h (_GL_VERIFY): If we lack both _Static_assert and
static_assert, suppress -Wnested-externs.
diff --git a/m4/gnulib-common.m4 b/m4/gnulib-common.m4
index c5ced04f18..30911d1581 100644
--- a/m4/gnulib-common.m4
+++ b/m4/gnulib-common.m4
@@ -1,4 +1,4 @@
-# gnulib-common.m4 serial 72
+# gnulib-common.m4 serial 73
 dnl Copyright (C) 2007-2022 Free Software Foundation, Inc.
 dnl This file is free software; the Free Software Foundation
 dnl gives unlimited permission to copy and/or distribute it,
@@ -106,6 +106,10 @@ AC_DEFUN([gl_COMMON_BODY], [
 #endif
 
 #ifdef __has_c_attribute
+# if ((defined __STDC_VERSION__ ? __STDC_VERSION__ : 0) <= 201710 \
+  && _GL_GNUC_PREREQ (4, 6))
+#  pragma GCC diagnostic ignored "-Wpedantic"
+# endif
 # define _GL_HAS_C_ATTRIBUTE(attr) __has_c_attribute (__##attr##__)
 #else
 # define _GL_HAS_C_ATTRIBUTE(attr) 0
-- 
2.35.1




[PATCH 1/2] verify: port to pedantic gcc -std=c99

2022-04-19 Thread Paul Eggert
* lib/verify.h (_GL_VERIFY): If we lack both _Static_assert and
static_assert, suppress -Wnested-externs.
---
 ChangeLog| 4 
 lib/verify.h | 3 +++
 2 files changed, 7 insertions(+)

diff --git a/ChangeLog b/ChangeLog
index 9bab736be4..c2509b8387 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,9 @@
 2022-04-19  Paul Eggert  
 
+   verify: port to pedantic gcc -std=c99
+   * lib/verify.h (_GL_VERIFY): If we lack both _Static_assert and
+   static_assert, suppress -Wnested-externs.
+
gettime-res: add tests
* modules/gettime-res-tests, tests/test-gettime-res.c: New files.
 
diff --git a/lib/verify.h b/lib/verify.h
index c2d2a56670..c5c63ae97c 100644
--- a/lib/verify.h
+++ b/lib/verify.h
@@ -215,6 +215,9 @@ template 
 # define _GL_VERIFY(R, DIAGNOSTIC, ...)\
 extern int (*_GL_GENSYM (_gl_verify_function) (void)) \
   [_GL_VERIFY_TRUE (R, DIAGNOSTIC)]
+# if 4 < __GNUC__ + (6 <= __GNUC_MINOR__)
+#  pragma GCC diagnostic ignored "-Wnested-externs"
+# endif
 #endif
 
 /* _GL_STATIC_ASSERT_H is defined if this code is copied into assert.h.  */
-- 
2.35.1




Re: Emacs 28.1 doesn't compile on Mac OS 10.7.5

2022-04-17 Thread Paul Eggert

On 4/17/22 02:13, Mattias Engdegård wrote:

I suppose it wouldn't hurt for an old Mac OS X expert to check the other uses of 
__clang_major__ in Emacs.

I'm no expert, old or not, but I would prefer doing the minimum necessary to 
keep builds working. If that means slightly suboptimal code or diagnostics for 
long-obsolete OS X versions then so be it.



Yes, quite right. My only worry was whether Emacs has incorrect uses of 
 __clang_major__ that cause incorrect user-visible behavior on some 
macOS versions. I'm not worried about suboptimal code or bogus warnings.


Perhaps I'm worrying too much. Not being an old Mac OS X expert (pun was 
intended :-) I don't know. If someone reading this email is such an 
expert and cares about old macOS ports I hope they can spare a few 
minutes to check.




Re: Emacs 28.1 doesn't compile on Mac OS 10.7.5

2022-04-17 Thread Paul Eggert

On 4/16/22 20:28, Jeffrey Walton wrote:

maybe you should define a couple of macros
like GNULIB_LLVM_CLANG_VER and GNULIB_APPLE_CLANG_VER


I hope we don't need to do that. This is software archaeology (Mac OS X 
10.7.5 is so old that neither the Subject: line nor my patch got its 
name right, and nobody mentioned the mistake :-) and these macros would 
clutter the code for little benefit. Most Clang-specific code nowadays 
shouldn't use Clang version numbers; it should use __has_builtin etc.
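
To illustrate the __has_builtin style, here is a minimal sketch that asks the compiler about a specific builtin instead of comparing version numbers. The function name add_overflows is made up for this example. Compilers that lack __has_builtin simply take the portable branch, even if they happen to provide the builtin.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Compilers lacking __has_builtin see every query as false.  */
#ifndef __has_builtin
# define __has_builtin(x) 0
#endif

/* Return true if A + B overflows int.  Otherwise store the sum into
   *SUM and return false.  Do not rely on *SUM when true is returned.  */
static bool
add_overflows (int a, int b, int *sum)
{
  assert (sum != NULL);
#if __has_builtin (__builtin_add_overflow)
  return __builtin_add_overflow (a, b, sum);
#else
  if (b < 0 ? a < INT_MIN - b : INT_MAX - b < a)
    return true;
  *sum = a + b;
  return false;
#endif
}
```

The point of the design is that nothing here depends on __clang_major__, so a vendor renumbering its Clang releases cannot select the wrong branch.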




Re: Emacs 28.1 doesn't compile on Mac OS 10.7.5

2022-04-16 Thread Paul Eggert

On 4/15/22 09:22, Mattias Engdegård wrote:

Paul, would you consider something like that patch (repeated here) for gnulib?


Sure, I installed the attached into Gnulib master on Savannah.

I suppose it wouldn't hurt for an old Mac OS X expert to check the other 
uses of __clang_major__ in Emacs.

From 0cda5beb7962f6567f0c4e377df870fa05c6d681 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Sat, 16 Apr 2022 19:18:03 -0700
Subject: [PATCH] verify: port to Mac OS 10.7.5
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Mac OS 10.7.5 clang sets __clang_major__ to 4 even though it was
derived from Clang 3.2.  Problem reported by Werner Lemberg in:
https://lists.gnu.org/r/emacs-devel/2022-04/msg00779.html
* lib/verify.h (_GL_HAVE__STATIC_ASSERT): Don’t define to 1
when __clang_major__ == 4 && !__cplusplus
&& __STDC_VERSION__ < 201112L && !defined __STRICT_ANSI__.
---
 ChangeLog| 10 ++
 lib/verify.h |  2 +-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/ChangeLog b/ChangeLog
index a9b82a47d2..1e238d14e9 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,13 @@
+2022-04-16  Paul Eggert  
+
+	verify: port to Mac OS 10.7.5
+	Mac OS 10.7.5 clang sets __clang_major__ to 4 even though it was
+	derived from Clang 3.2.  Problem reported by Werner Lemberg in:
+	https://lists.gnu.org/r/emacs-devel/2022-04/msg00779.html
+	* lib/verify.h (_GL_HAVE__STATIC_ASSERT): Don’t define to 1
+	when __clang_major__ == 4 && !__cplusplus
+	&& __STDC_VERSION__ < 201112L && !defined __STRICT_ANSI__.
+
 2022-04-15  Bruno Haible  
 
 	sigsegv: Fix compilation error on arceb CPUs.
diff --git a/lib/verify.h b/lib/verify.h
index 07b2f4866f..c2d2a56670 100644
--- a/lib/verify.h
+++ b/lib/verify.h
@@ -34,7 +34,7 @@
 #ifndef __cplusplus
 # if (201112L <= __STDC_VERSION__ \
   || (!defined __STRICT_ANSI__ \
-  && (4 < __GNUC__ + (6 <= __GNUC_MINOR__) || 4 <= __clang_major__)))
+  && (4 < __GNUC__ + (6 <= __GNUC_MINOR__) || 5 <= __clang_major__)))
 #  define _GL_HAVE__STATIC_ASSERT 1
 # endif
 # if (202000L <= __STDC_VERSION__ \
-- 
2.32.0



Re: Module idx

2022-04-13 Thread Paul Eggert

On 4/12/22 02:12, Marc Nieper-Wißkirchen wrote:

I am wondering how to print (using printf) values of type idx_t
reliably without assuming that idx_t == ptrdiff_t and without
conversion to uintptr_t.


I just use %td, as that works better with i18n.

If we ever change idx_t to some other type (not likely) I plan to change 
the %td instances to something else then; that's easier than worrying 
about this now.
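
A minimal sketch of the %td approach follows. It assumes idx_t is a typedef for ptrdiff_t, which the local typedef below imitates so that the example compiles without Gnulib; format_idx is a made-up helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Assumption for this sketch: idx_t is ptrdiff_t.  */
typedef ptrdiff_t idx_t;

/* Format N into BUF, which has room for BUFSIZE bytes, using the
   ptrdiff_t conversion %td.  Return BUF.  */
static char *
format_idx (char *buf, size_t bufsize, idx_t n)
{
  int len = snprintf (buf, bufsize, "%td", n);
  assert (0 <= len && (size_t) len < bufsize);
  return buf;
}
```

If idx_t ever became some other type, only the typedef and the format string would need to change.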




[PATCH] libgmp: pacify Clang too

2022-04-08 Thread Paul Eggert
* lib/mini-gmp-gnulib.c [NDEBUG]: Also use -Wunused-variable if clang.
Problem reported for Emacs by Mattias Engdegård.
---
 ChangeLog | 6 ++
 lib/mini-gmp-gnulib.c | 3 ++-
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/ChangeLog b/ChangeLog
index fb5802d61b..c3723d255a 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,9 @@
+2022-04-08  Paul Eggert  
+
+   libgmp: pacify Clang too
+   * lib/mini-gmp-gnulib.c [NDEBUG]: Also use -Wunused-variable if clang.
+   Problem reported for Emacs by Mattias Engdegård.
+
 2022-04-04  Paul Eggert  
 
init.sh: don’t assume gzip
diff --git a/lib/mini-gmp-gnulib.c b/lib/mini-gmp-gnulib.c
index a18ee8f6ab..7d09c80e9e 100644
--- a/lib/mini-gmp-gnulib.c
+++ b/lib/mini-gmp-gnulib.c
@@ -40,7 +40,8 @@
 #endif
 
 /* Pacify GCC -Wunused-variable for variables used only in 'assert' calls.  */
-#if defined NDEBUG && 4 < __GNUC__ + (6 <= __GNUC_MINOR__)
+#if (defined NDEBUG \
+ && (4 < __GNUC__ + (6 <= __GNUC_MINOR__) || defined __clang__))
 # pragma GCC diagnostic ignored "-Wunused-variable"
 #endif
 
-- 
2.35.1




Re: gawk-5.1.1 bug report

2022-04-06 Thread Paul Eggert

On 4/6/22 01:24, arn...@skeeve.com wrote:

Most people
would wonder "Why is there a bitwise and here?" and not think of it
as a logical and.


I'm not sure I agree about the "most", as I expect most people won't 
notice or care about this level of detail. However, for people who 
wonder like that, how about adding an explanatory comment? That will help 
people who are unaccustomed to this valid and useful (albeit 
less-common) programming style. Something like the attached (untested) 
patch, perhaps?



& for a logical test can be dangerous since any non-zero
value can be true.


Sure, but that's an issue only when using & on types like 'int'. It's 
not an issue when using & on 'bool'. Similarly, + has rounding issues on 
'float' but that doesn't mean we need to worry about +'s rounding issues 
on 'int'.

diff --git a/lib/dfa.c b/lib/dfa.c
index a27d096f73..391c2ffbf2 100644
--- a/lib/dfa.c
+++ b/lib/dfa.c
@@ -43,6 +43,11 @@
MMU will check anyway.  */
 #define assume_nonnull(x) assume ((x) != NULL)
 
+/* Pacify Clang.  */
+#ifdef __clang__
+ #pragma clang diagnostic ignored "-Wbitwise-instead-of-logical"
+#endif
+
 static bool
 streq (char const *a, char const *b)
 {
@@ -1089,6 +1094,8 @@ parse_bracket_exp (struct dfa *dfa)
   /* Treat [x-y] as a range if x != y.  */
   if (wc != wc2 || wc == WEOF)
 {
+  /* Use "&" instead of "&&", as short-circuit evaluation is
+ not needed and might even slow things down.  */
   if (dfa->localeinfo.simple
   || (isasciidigit (c) & isasciidigit (c2)))
 {


Re: gawk-5.1.1 bug report

2022-04-06 Thread Paul Eggert

On 4/6/22 03:28, Bernhard Voelker wrote:

Well, it was an argument to say that & eliminates a conditional execution
branch, but if both sides of the & operator have to be evaluated


They don't. Neither operand has side effects, so a compiler can evaluate 
either operand and not bother to evaluate the other operand if the 
evaluated operand is false.


Not that I expect compilers to do that in this particular case, as on 
today's processors it can be faster to evaluate both sides 
unconditionally. The hardware evaluates the two sides in parallel and 
this is way faster than a branch predictor miss.



calling the 2nd function is much more
overhead than the savings of & over &&, right?


No, as the functions are inlined so there is no function call overhead.



Re: gawk-5.1.1 bug report

2022-04-06 Thread Paul Eggert

On 4/6/22 00:04, arn...@skeeve.com wrote:

IMHO clear code beats saving a single branch


Sure, but clarity also argues for "&" over "&&" here. Writing "f(x) && 
f(y)" would incorrectly imply that it's important that f(y) should not 
be evaluated when f(x) is false, an implication that is incorrect here. 
Writing "f(x) & f(y)" tells the reader that both sides are safe to 
evaluate and that they can be evaluated in either order, something I 
found worth knowing when I read that part of the code.
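
The style under discussion looks roughly like this when pulled out of context. The helper below is an illustrative stand-in, not the isasciidigit from lib/dfa.c, and Clang may still emit its -Wbitwise-instead-of-logical warning for it unless that warning is disabled.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-in for the inlined helper in lib/dfa.c.  */
static inline bool
is_ascii_digit (char c)
{
  return '0' <= c && c <= '9';
}

/* Return true if both C and C2 are ASCII digits.  */
static bool
both_digits (char c, char c2)
{
  /* Both operands are bool (0 or 1) and free of side effects, so "&"
     yields the same value as "&&" without implying that the second
     call must be skipped when the first is false.  */
  return is_ascii_digit (c) & is_ascii_digit (c2);
}
```

Because each operand is exactly 0 or 1, the usual pitfall of "&" on arbitrary nonzero int values does not arise here.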




Re: gawk-5.1.1 bug report

2022-04-05 Thread Paul Eggert

On 4/5/22 22:18, arn...@skeeve.com wrote:

  dfa.c:1093:27: warning: use of bitwise '&' with boolean operands 
[-Wbitwise-instead-of-logical]


It's valid in C to use bitwise '&' on bool, and doing so here eliminates 
a conditional branch at the machine level, which can be a win.


How about if you disable -Wbitwise-instead-of-logical instead, since 
it's a false alarm?




[PATCH] init.sh: don’t assume gzip

2022-04-04 Thread Paul Eggert
* tests/init.sh (rand_bytes_): Don’t assume gzip is installed.
I found this while testing gzip installation on a platform where I
had removed the installed gzip.  gzip is executed only on
platforms lacking mktemp and /dev/urandom so this code is rarely
used; however, these platforms might also lack gzip since gzip
is neither specified by POSIX nor required by the GNU Coding Standards.
---
 ChangeLog | 10 ++
 tests/init.sh | 43 ---
 2 files changed, 34 insertions(+), 19 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 0f88fceed0..fb5802d61b 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,13 @@
+2022-04-04  Paul Eggert  
+
+   init.sh: don’t assume gzip
+   * tests/init.sh (rand_bytes_): Don’t assume gzip is installed.
+   I found this while testing gzip installation on a platform where I
+   had removed the installed gzip.  gzip is executed only on
+   platforms lacking mktemp and /dev/urandom so this code is rarely
+   used; however, these platforms might also lack gzip since gzip
+   is neither specified by POSIX nor required by the GNU Coding Standards.
+
 2022-03-30  Paul Eggert  
 
glob: sync better with glibc
diff --git a/tests/init.sh b/tests/init.sh
index 933fdd40f3..d5d37c98f8 100644
--- a/tests/init.sh
+++ b/tests/init.sh
@@ -271,12 +271,10 @@ test -n "$EXEEXT" && test -n "$BASH_VERSION" && shopt -s expand_aliases
 #
 # First, try to use the mktemp program.
 # Failing that, we'll roll our own mktemp-like function:
-#  - try to get random bytes from /dev/urandom
+#  - try to get random bytes from /dev/urandom, mapping them to file-name bytes
 #  - failing that, generate output from a combination of quickly-varying
-#  sources and gzip.  Ignore non-varying gzip header, and extract
-#  "random" bits from there.
-#  - given those bits, map to file-name bytes using tr, and try to create
-#  the desired directory.
+#  sources and awk.
+#  - try to create the desired directory.
 #  - make only $MAX_TRIES_ attempts
 
 # Helper function.  Print $N pseudo-random bytes from a-zA-Z0-9.
@@ -296,20 +294,27 @@ rand_bytes_ ()
 return
   fi
 
-  n_plus_50_=`expr $n_ + 50`
-  cmds_='date; date +%N; free; who -a; w; ps auxww; ps -ef'
-  data_=` (eval "$cmds_") 2>&1 | gzip `
-
-  # Ensure that $data_ has length at least 50+$n_
-  while :; do
-len_=`echo "$data_"|wc -c`
-test $n_plus_50_ -le $len_ && break;
-data_=` (echo "$data_"; eval "$cmds_") 2>&1 | gzip `
-  done
-
-  echo "$data_" \
-| dd bs=1 skip=50 count=$n_ 2>/dev/null \
-| LC_ALL=C tr -c $chars_ 01234567$chars_$chars_$chars_
+  # Fall back on quickly-varying sources + awk.
+  # Limit awk program to 7th Edition Unix so that it works even on Solaris 10.
+
+  (date; date +%N; free; who -a; w; ps auxww; ps -ef) 2>&1 | awk '
+ BEGIN {
+   n = '"$n_"'
+   for (i = 0; i < 256; i++)
+ ordinal[sprintf ("%c", i)] = i
+ }
+ {
+   for (i = 1; i <= length; i++)
+ a[ai++ % n] += ordinal[substr ($0, i, 1)]
+ }
+ END {
+   chars = "'"$chars_"'"
+   charslen = length (chars)
+   for (i = 0; i < n; i++)
+ printf "%s", substr (chars, a[i] % charslen + 1, 1)
+   printf "\n"
+ }
+  '
 }
 
 mktempd_ ()
-- 
2.35.1




Re: posix/glob.c: update from gnulib

2022-03-30 Thread Paul Eggert

On 3/30/22 16:40, Paul Eggert wrote:


I updated Gnulib to reflect this change; see first attached patch.


Oops, forgot to attach that patch. Here it is. Also cc'ing to bug-gnulib.

From 8fa9898afa5ee3da8f5d5a4797f98ae62b12d427 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Wed, 30 Mar 2022 16:29:11 -0700
Subject: [PATCH] glob: sync better with glibc

* lib/glob.c (dirfd) [_LIBC]: Use #undef instead of #ifdef.
Problem reported by DJ Delorie.
---
 ChangeLog  | 6 ++
 lib/glob.c | 5 ++---
 2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 7ea4f7797b..0f88fceed0 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,9 @@
+2022-03-30  Paul Eggert  
+
+	glob: sync better with glibc
+	* lib/glob.c (dirfd) [_LIBC]: Use #undef instead of #ifdef.
+	Problem reported by DJ Delorie.
+
 2022-03-23  Paul Eggert  
 
 	glob: test for glibc bug 25659
diff --git a/lib/glob.c b/lib/glob.c
index 52c79b4cd8..f6993a3706 100644
--- a/lib/glob.c
+++ b/lib/glob.c
@@ -57,9 +57,8 @@
 # define sysconf(id) __sysconf (id)
 # define closedir(dir) __closedir (dir)
 # define opendir(name) __opendir (name)
-# ifndef dirfd
-#  define dirfd(str) __dirfd (str)
-# endif
+# undef dirfd
+# define dirfd(str) __dirfd (str)
 # define readdir(str) __readdir64 (str)
 # define getpwnam_r(name, bufp, buf, len, res) \
 __getpwnam_r (name, bufp, buf, len, res)
-- 
2.35.1



Re: Issue building gnulib with clang (as used in GRUB)

2022-03-25 Thread Paul Eggert

On 3/24/22 12:16, Darren Kenny wrote:

Is this a known issue when building with clang? Would you have any
suggestions on how to correctly resolve it?


I didn't know about the issue, or had forgotten about it.

One way to resolve it would be to figure out what clang option is needed 
to suppress the incorrect warning, and to submit a patch to Gnulib that 
will use that option. If clang spells it differently from -Wvla, then 
use the different spelling.



I've tried building with the C flag -D__STDC_NO_VLA__, and this will get
it to build, but that seems like something that should only be defined
by a compiler and not a consumer of the gnulib code.


Another possibility is to ask the clang folks, perhaps by filing a bug 
report, as to why their compiler is rejecting VLAs but is not defining 
__STDC_NO_VLA__. A compiler is supposed to define that macro if it's not 
supporting VLAs.
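
For reference, code that wants to honor __STDC_NO_VLA__ can be written along the following lines. This is a generic sketch with a made-up function, sum_of_squares; it is not the Gnulib code that triggered the warning.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Return the sum of the squares of 0 .. N-1, or -1 if memory is
   exhausted.  The scratch array is a VLA only when the compiler
   does not define __STDC_NO_VLA__.  */
static long
sum_of_squares (size_t n)
{
  assert (0 < n && n <= 1000);  /* keep any stack allocation small */
#ifdef __STDC_NO_VLA__
  long *sq = malloc (n * sizeof *sq);
  if (!sq)
    return -1;
#else
  long sq[n];
#endif
  long sum = 0;
  for (size_t i = 0; i < n; i++)
    {
      sq[i] = (long) i * (long) i;
      sum += sq[i];
    }
#ifdef __STDC_NO_VLA__
  free (sq);
#endif
  return sum;
}
```

A compiler that rejects the VLA branch without defining __STDC_NO_VLA__ defeats this kind of test, which is why a bug report to the compiler maintainers seems appropriate.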



the possible
security issue is that this size variable can be manipulated
to enable mis-use via a stack overflow

That issue shouldn't happen here; i.e., the diagnostic is a false alarm.




Re: [patch v2] glob: resolve DT_UNKNOWN via is_dir

2022-03-23 Thread Paul Eggert
  goto memory_error;
+}
+  p = stpcpy (fullpath, directory);
+  *p++ = '/';
+  strcpy (p, d.name);
+  isdir = is_dir (fullpath, flags, pglob);
+  if (!use_alloca)
+free (fullpath);
+  if (isdir)
+break;
+  continue;
+}
   default: continue;
   }
 
-- 
2.32.0

From 2f7f02986f9d338b5bb0e865bfd278678fb96325 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Wed, 23 Mar 2022 09:52:58 -0700
Subject: [PATCH 2/3] glob: fix symlink and // issues; improve speed

* lib/glob.c: Include fcntl.h.
(dirfd) [_LIBC]: New macro.
(GLOB_STAT64, GLOB_LSTAT64): Remove.  Replace all uses with ...
(GLOB_FSTATAT64): ... this new macro.
(glob_in_dir): Treat DT_LNK like DT_UNKNOWN.
Use directory-relative fstatat unless GLOB_ALTDIRFUNC, or dirfd fails.
Avoid duplicate strlen (directory).
Work even if directory is "/", without turning it into "//".
Use a scratch buffer instead of by-hand alloca stuff.
Use mempcpy and memcpy instead of stpcpy and strcpy.
* modules/glob (Depends-on): Add dirfd, fstatat.  Remove stat.
(License): Change from LGPLv2+ to GPL, since it depends on
fstatat.
---
 ChangeLog| 17 
 lib/glob.c   | 76 +++-
 modules/glob |  5 ++--
 3 files changed, 60 insertions(+), 38 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 8c50a52c78..a0d3519162 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,20 @@
+2022-03-23  Paul Eggert  
+
+	glob: fix symlink and // issues; improve speed
+	* lib/glob.c: Include fcntl.h.
+	(dirfd) [_LIBC]: New macro.
+	(GLOB_STAT64, GLOB_LSTAT64): Remove.  Replace all uses with ...
+	(GLOB_FSTATAT64): ... this new macro.
+	(glob_in_dir): Treat DT_LNK like DT_UNKNOWN.
+	Use directory-relative fstatat unless GLOB_ALTDIRFUNC, or dirfd fails.
+	Avoid duplicate strlen (directory).
+	Work even if directory is "/", without turning it into "//".
+	Use a scratch buffer instead of by-hand alloca stuff.
+	Use mempcpy and memcpy instead of stpcpy and strcpy.
+	* modules/glob (Depends-on): Add dirfd, fstatat.  Remove stat.
+	(License): Change from LGPLv2+ to GPL, since it depends on
+	fstatat.
+
 2022-03-23  DJ Delorie  
 
 	glob: resolve DT_UNKNOWN via is_dir
diff --git a/lib/glob.c b/lib/glob.c
index 0da46ac138..52c79b4cd8 100644
--- a/lib/glob.c
+++ b/lib/glob.c
@@ -28,6 +28,7 @@
 #include 
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -56,6 +57,9 @@
 # define sysconf(id) __sysconf (id)
 # define closedir(dir) __closedir (dir)
 # define opendir(name) __opendir (name)
+# ifndef dirfd
+#  define dirfd(str) __dirfd (str)
+# endif
 # define readdir(str) __readdir64 (str)
 # define getpwnam_r(name, bufp, buf, len, res) \
 __getpwnam_r (name, bufp, buf, len, res)
@@ -69,11 +73,8 @@
 # ifndef GLOB_LSTAT
 #  define GLOB_LSTAT            gl_lstat
 # endif
-# ifndef GLOB_STAT64
-#  define GLOB_STAT64   __stat64
-# endif
-# ifndef GLOB_LSTAT64
-#  define GLOB_LSTAT64  __lstat64
+# ifndef GLOB_FSTATAT64
+#  define GLOB_FSTATAT64        __fstatat64
 # endif
 # include 
 #else /* !_LIBC */
@@ -88,8 +89,7 @@
 # define struct_stat            struct stat
 # define struct_stat64  struct stat
 # define GLOB_LSTAT gl_lstat
-# define GLOB_STAT64stat
-# define GLOB_LSTAT64   lstat
+# define GLOB_FSTATAT64 fstatat
 #endif /* _LIBC */
 
 #include 
@@ -215,7 +215,8 @@ glob_lstat (glob_t *pglob, int flags, const char *fullname)
   } ust;
   return (__glibc_unlikely (flags & GLOB_ALTDIRFUNC)
   ? pglob->GLOB_LSTAT (fullname, &ust.st)
-  : GLOB_LSTAT64 (fullname, &ust.st64));
+  : GLOB_FSTATAT64 (AT_FDCWD, fullname, &ust.st64,
+AT_SYMLINK_NOFOLLOW));
 }
 
 /* Set *R = A + B.  Return true if the answer is mathematically
@@ -257,7 +258,8 @@ is_dir (char const *filename, int flags, glob_t const *pglob)
   struct_stat64 st64;
   return (__glibc_unlikely (flags & GLOB_ALTDIRFUNC)
   ? pglob->gl_stat (filename, &st) == 0 && S_ISDIR (st.st_mode)
-  : GLOB_STAT64 (filename, &st64) == 0 && S_ISDIR (st64.st_mode));
+  : (GLOB_FSTATAT64 (AT_FDCWD, filename, &st64, 0) == 0
+ && S_ISDIR (st64.st_mode)));
 }
 
 /* Find the end of the sub-pattern in a brace expression.  */
@@ -1283,6 +1285,8 @@ glob_in_dir (const char *pattern, const char *directory, int flags,
 {
   size_t dirlen = strlen (directory);
   void *stream = NULL;
+  struct scratch_buffer s;
+  scratch_buffer_init (&s);
 # define GLOBNAMES_MEMBERS(nnames) \
 struct globnames *next; size_t count; char *name[nnames];
   struct globnames { GLOBNAMES_MEMBERS (FLEXIBLE_ARRA

Re: [BUG REPORT] gnulib Android NDK and/or Cygwin build failure regression after 0c8a563f

2022-03-21 Thread Paul Eggert

On 3/21/22 13:30, Chris Renshaw wrote:

: Invalid argumentabi-gcc.exe: error:


That's a little cryptic. Could you explain a bit more what the problem is?


the breaking change is from 
https://git.savannah.gnu.org/gitweb/?p=gnulib.git;a=commit;h=0c8a563f65d44752b33aec42cceec25bd485f2d5


That change looks like it affects only Makefiles. So, what's the 
difference between the working Makefile and the non-working one?




Re: uninorm/composition.c:75:22: runtime error

2022-03-13 Thread Paul Eggert

On 3/13/22 00:04, Simon Josefsson wrote:

Reading the ideas in your responses, I think gnulib could help
developers to use ASan/UBSan in their project by assisting with these
choices.  I'll see if I can come up with anything that is generally
useful, once I get a couple of projects to build and self-test reliably
with ASan/UBSan.


Something like --enable-gcc-warnings, but --enable-sanitizer? That would 
be helpful.


For what it's worth, I just now built GNU Emacs with 'clang 
-fsanitize=undefined' and it complained several times about adding 0 to 
a null pointer, due to code that's in Emacs not in Gnulib. I worked 
around the problem by compiling with '-fsanitize=undefined 
-fno-sanitize=pointer-overflow'. Of course this disables some useful checks.


It is a pain that Clang is wrongheadedly pedantic here. Adding 0 to a 
null pointer is well-defined in C++17, and there's no realistic 
possibility of Clang being ported to a platform where it's undefined 
behavior in C.
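
For anyone who would rather keep the pointer-overflow check enabled, one possible alternative is to skip the addition whenever the offset is zero. The following is only a sketch of that idiom; the name ptr_advance is hypothetical and is not taken from Emacs or Gnulib.

```c
#include <stddef.h>

/* Hypothetical helper: return P advanced by N bytes, but skip the
   addition entirely when N is zero, so that a null P combined with
   N == 0 never performs arithmetic on a null pointer.  */
static char *
ptr_advance (char *p, size_t n)
{
  return n == 0 ? p : p + n;
}
```

The cost is one extra comparison per call, which is presumably why disabling the check is the more attractive workaround for existing code.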




[PATCH] regex: fix double-free

2022-03-11 Thread Paul Eggert
* lib/regex_internal.c (re_dfa_add_node): Don’t free storage
twice if an allocation fails.
---
 ChangeLog|  4 
 lib/regex_internal.c | 22 ++
 2 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 7a6ade78c3..4d49a824e5 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,9 @@
 2022-03-11  Paul Eggert  
 
+   regex: fix double-free
+   * lib/regex_internal.c (re_dfa_add_node): Don’t free storage
+   twice if an allocation fails.
+
regex: fix minor over-allocation
* lib/regexec.c (push_fail_stack): Fix off-by-one error that
over-allocated the stack.
diff --git a/lib/regex_internal.c b/lib/regex_internal.c
index 3945ee7ecb..0e6919f340 100644
--- a/lib/regex_internal.c
+++ b/lib/regex_internal.c
@@ -1396,24 +1396,22 @@ re_dfa_add_node (re_dfa_t *dfa, re_token_t token)
   if (__glibc_unlikely (new_nodes == NULL))
return -1;
   dfa->nodes = new_nodes;
+  dfa->nodes_alloc = new_nodes_alloc;
   new_nexts = re_realloc (dfa->nexts, Idx, new_nodes_alloc);
+  if (new_nexts != NULL)
+   dfa->nexts = new_nexts;
   new_indices = re_realloc (dfa->org_indices, Idx, new_nodes_alloc);
+  if (new_indices != NULL)
+   dfa->org_indices = new_indices;
   new_edests = re_realloc (dfa->edests, re_node_set, new_nodes_alloc);
+  if (new_edests != NULL)
+   dfa->edests = new_edests;
   new_eclosures = re_realloc (dfa->eclosures, re_node_set, 
new_nodes_alloc);
+  if (new_eclosures != NULL)
+   dfa->eclosures = new_eclosures;
   if (__glibc_unlikely (new_nexts == NULL || new_indices == NULL
|| new_edests == NULL || new_eclosures == NULL))
-   {
-  re_free (new_nexts);
-  re_free (new_indices);
-  re_free (new_edests);
-  re_free (new_eclosures);
-  return -1;
-   }
-  dfa->nexts = new_nexts;
-  dfa->org_indices = new_indices;
-  dfa->edests = new_edests;
-  dfa->eclosures = new_eclosures;
-  dfa->nodes_alloc = new_nodes_alloc;
+   return -1;
 }
   dfa->nodes[dfa->nodes_len] = token;
   dfa->nodes[dfa->nodes_len].constraint = 0;
-- 
2.35.1
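
The idea behind the patch above, as I read the diff, is to store each successful realloc result back into the owning structure at once, so that a later cleanup never frees a block that realloc has already released. Here is a minimal sketch of that pattern on a made-up structure; pair_table and its functions are hypothetical and are not part of regex_internal.c.

```c
#include <stdlib.h>

/* Hypothetical container with two parallel arrays that grow together.  */
struct pair_table
{
  int *a;
  long *b;
  size_t alloc;
};

/* Grow T to hold NEW_ALLOC entries.  Return 0 on success, -1 on failure.
   This sketch assumes NEW_ALLOC is nonzero and that the size
   multiplications do not overflow.  Each successful realloc result is
   stored back immediately, so T never holds a pointer that realloc has
   already freed or moved, and freeing T later frees each block once.  */
static int
pair_table_grow (struct pair_table *t, size_t new_alloc)
{
  int *na = realloc (t->a, new_alloc * sizeof *na);
  if (na != NULL)
    t->a = na;
  long *nb = realloc (t->b, new_alloc * sizeof *nb);
  if (nb != NULL)
    t->b = nb;
  if (na == NULL || nb == NULL)
    return -1;   /* T is still consistent; T->alloc is unchanged.  */
  t->alloc = new_alloc;
  return 0;
}

/* Release the storage owned by T.  */
static void
pair_table_free (struct pair_table *t)
{
  free (t->a);
  free (t->b);
}
```

On a partial failure one array may end up larger than T->alloc claims, which wastes a little memory but is harmless.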




Re: [patch v2] glob: resolve DT_UNKNOWN via is_dir

2022-03-11 Thread Paul Eggert
Thanks for looking into this; it's long been on my plate but I haven't 
had time to work on the proper solution, which is basically to rewrite 
glob from scratch (this should make it considerably faster).


As far as your patch goes:

Gnulib prefers spaces to tabs.

The code unnecessarily calls strlen (directory).

The patch mishandles a directory "/" and a file "x", as it stats "//x" 
but this may differ from "/x" on some systems.


For speed the code should prefer fstatat on d.name to stat on the full 
name. Too bad glob_t doesn't have a gl_fstatat entry, but we can use 
fstatat when GLOB_ALTDIRFUNC is not in use.


The patch goes through a loop calling alloca_account, which is not good: 
it'll waste the stack. Better to use a scratch buffer, as in other parts 
of glob.c.


The code should prefer mempcpy and memcpy to stpcpy and strcpy (as in 
the rest of glob.c). That way, the Gnulib glob module needn't depend on 
the stpcpy module (plus the code's more reliable if another thread 
stomps on our strings between strlen and strcpy :-).


I don't see how the patch fixes the case where readdir_result_type (d) 
returns DT_LNK; this might be a symlink to a directory.
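
To illustrate the "//x" and memcpy points above, here is a rough sketch of joining a directory and an entry name without doubling the slash. The helper join_dir_name is hypothetical; it uses malloc for brevity rather than the scratch buffer that glob.c would use, and it assumes the combined length does not overflow size_t.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a newly allocated "DIR/NAME", or NULL on
   allocation failure.  A DIR that already ends in '/' (such as "/")
   does not get a second slash, so "/" plus "x" yields "/x", not "//x".
   Each length is measured once and memcpy does the copying.  */
static char *
join_dir_name (char const *dir, char const *name)
{
  size_t dirlen = strlen (dir);
  size_t namelen = strlen (name);
  int need_slash = dirlen != 0 && dir[dirlen - 1] != '/';
  char *full = malloc (dirlen + need_slash + namelen + 1);
  if (full == NULL)
    return NULL;
  char *p = full;
  memcpy (p, dir, dirlen);
  p += dirlen;
  if (need_slash)
    *p++ = '/';
  memcpy (p, name, namelen + 1);  /* Include the terminating NUL.  */
  return full;
}
```

The caller frees the result. When a directory file descriptor is available, passing the bare name to fstatat avoids building the full name at all.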


Proposed pair of patches attached (I haven't installed these). The first 
is yours but with tabs turned to spaces and with ChangeLog equal to your 
log entry. The second contains my proposed improvements. I haven't 
tested except on the Gnulib test cases (which aren't much).


Of course performance will suffer with all these correctness patches, 
but that can wait until a rewrite.

From 69247bc996c50da0564f6157358ba99b13abbd16 Mon Sep 17 00:00:00 2001
From: DJ Delorie 
Date: Fri, 11 Mar 2022 16:43:39 -0500
Subject: [PATCH 1/2] glob: resolve DT_UNKNOWN via is_dir

[v2: changed malloc failure from ignore to error; added support for
alloca; tested by copying to glibc and testing there]

The DT_* values returned by getdents (readdir) are only hints and
not required.  In fact, some Linux filesystems return DT_UNKNOWN
for most entries, regardless of actual type.  This causes make
to mis-match patterns with a trailing slash (via GLOB_ONLYDIR)
(see make's functions/wildcard test case).  Thus, this patch
detects that case and uses is_dir() to make the type known enough
for proper operation.

Performance in non-DT_UNKNOWN cases is not affected.

The lack of DT_* is a well known issue on older XFS installations
(for example, RHEL 7 and 8, Fedora 28) but can be recreated by
creating an XFS filesystem with flags that mimic older behavior:

$ fallocate -l 10G /xfs.fs
$ mkfs.xfs -n ftype=0 -m crc=0 -f /xfs.fs
$ mkdir /xfs
$ mount -o loop /xfs.fs /xfs
---
 ChangeLog  | 23 +++
 lib/glob.c | 28 +++-
 2 files changed, 50 insertions(+), 1 deletion(-)

diff --git a/ChangeLog b/ChangeLog
index 7a6ade78c3..f39c749ad3 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,26 @@
+2022-03-11  DJ Delorie  
+
+	glob: resolve DT_UNKNOWN via is_dir
+
+	The DT_* values returned by getdents (readdir) are only hints and
+	not required.  In fact, some Linux filesystems return DT_UNKNOWN
+	for most entries, regardless of actual type.  This causes make
+	to mis-match patterns with a trailing slash (via GLOB_ONLYDIR)
+	(see make's functions/wildcard test case).  Thus, this patch
+	detects that case and uses is_dir() to make the type known enough
+	for proper operation.
+
+	Performance in non-DT_UNKNOWN cases is not affected.
+
+	The lack of DT_* is a well known issue on older XFS installations
+	(for example, RHEL 7 and 8, Fedora 28) but can be recreated by
+	creating an XFS filesystem with flags that mimic older behavior:
+
+	$ fallocate -l 10G /xfs.fs
+	$ mkfs.xfs -n ftype=0 -m crc=0 -f /xfs.fs
+	$ mkdir /xfs
+	$ mount -o loop /xfs.fs /xfs
+
 2022-03-11  Paul Eggert  
 
 	regex: fix minor over-allocation
diff --git a/lib/glob.c b/lib/glob.c
index f8d8a306f2..0da46ac138 100644
--- a/lib/glob.c
+++ b/lib/glob.c
@@ -1381,7 +1381,33 @@ glob_in_dir (const char *pattern, const char *directory, int flags,
   if (flags & GLOB_ONLYDIR)
 switch (readdir_result_type (d))
   {
-  case DT_DIR: case DT_LNK: case DT_UNKNOWN: break;
+  case DT_DIR: case DT_LNK: break;
+  case DT_UNKNOWN:
+{
+  /* The filesystem was too lazy to give us a hint,
+ so we have to do it the hard way.  */
+  char *fullpath, *p;
+  bool isdir;
+  int need = strlen (directory) + strlen (d.name) + 2;
+  int use_alloca = glob_use_alloca (alloca_used, need);
+  if (use_alloca)
+fullpath = alloca_account (need, alloca_used);
+  else
+{
+  full

[PATCH 2/2] regex: fix minor over-allocation

2022-03-11 Thread Paul Eggert
* lib/regexec.c (push_fail_stack): Fix off-by-one error that
over-allocated the stack.
---
 ChangeLog | 4 
 lib/regexec.c | 2 +-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/ChangeLog b/ChangeLog
index 50f60c6372..7a6ade78c3 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,9 @@
 2022-03-11  Paul Eggert  
 
+   regex: fix minor over-allocation
+   * lib/regexec.c (push_fail_stack): Fix off-by-one error that
+   over-allocated the stack.
+
regex: fix free_fail_stack undefined behavior
* lib/regexec.c (push_fail_stack): Don’t increment number of
re_fail_stack_t entries until after successful allocation.  This
diff --git a/lib/regexec.c b/lib/regexec.c
index 0691e91e1e..521cb02841 100644
--- a/lib/regexec.c
+++ b/lib/regexec.c
@@ -1309,7 +1309,7 @@ push_fail_stack (struct re_fail_stack_t *fs, Idx str_idx, 
Idx dest_node,
 {
   reg_errcode_t err;
   Idx num = fs->num;
-  if (num + 1 == fs->alloc)
+  if (num == fs->alloc)
 {
   struct re_fail_stack_ent_t *new_array;
   new_array = re_realloc (fs->stack, struct re_fail_stack_ent_t,
-- 
2.35.1




[PATCH 1/2] regex: fix free_fail_stack undefined behavior

2022-03-11 Thread Paul Eggert
* lib/regexec.c (push_fail_stack): Don’t increment number of
re_fail_stack_t entries until after successful allocation.  This
prevents a crash if re_realloc or re_malloc fails here, and a
later free_fail_stack examines regs or a later pop_fail_stack
examines node.  Problem discovered by Coverity scan sent
2022-03-11 11:03:52Z.
---
 ChangeLog | 10 ++
 lib/regexec.c |  5 +++--
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 7713294982..50f60c6372 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,13 @@
+2022-03-11  Paul Eggert  
+
+   regex: fix free_fail_stack undefined behavior
+   * lib/regexec.c (push_fail_stack): Don’t increment number of
+   re_fail_stack_t entries until after successful allocation.  This
+   prevents a crash if re_realloc or re_malloc fails here, and a
+   later free_fail_stack examines regs or a later pop_fail_stack
+   examines node.  Problem discovered by Coverity scan sent
+   2022-03-11 11:03:52Z.
+
 2022-03-10  Paul Eggert  
 
fts: revert change to use AT_NO_AUTOMOUNT
diff --git a/lib/regexec.c b/lib/regexec.c
index aea1e7da52..0691e91e1e 100644
--- a/lib/regexec.c
+++ b/lib/regexec.c
@@ -1308,8 +1308,8 @@ push_fail_stack (struct re_fail_stack_t *fs, Idx str_idx, 
Idx dest_node,
 re_node_set *eps_via_nodes)
 {
   reg_errcode_t err;
-  Idx num = fs->num++;
-  if (fs->num == fs->alloc)
+  Idx num = fs->num;
+  if (num + 1 == fs->alloc)
 {
   struct re_fail_stack_ent_t *new_array;
   new_array = re_realloc (fs->stack, struct re_fail_stack_ent_t,
@@ -1324,6 +1324,7 @@ push_fail_stack (struct re_fail_stack_t *fs, Idx str_idx, 
Idx dest_node,
   fs->stack[num].regs = re_malloc (regmatch_t, 2 * nregs);
   if (fs->stack[num].regs == NULL)
 return REG_ESPACE;
+  fs->num = num + 1;
   memcpy (fs->stack[num].regs, regs, sizeof (regmatch_t) * nregs);
   memcpy (fs->stack[num].regs + nregs, prevregs, sizeof (regmatch_t) * nregs);
   err = re_node_set_init_copy (&fs->stack[num].eps_via_nodes, eps_via_nodes);
-- 
2.35.1
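
The pattern used in this patch and in 2/2 can be sketched on a much simpler stack: bump the entry count only after every allocation has succeeded, and grow only when the array is actually full. The int_stack type below is hypothetical and stands in for re_fail_stack_t only loosely.

```c
#include <stdlib.h>

/* Hypothetical growable stack of ints.  */
struct int_stack
{
  int *items;
  size_t num;    /* Number of entries in use.  */
  size_t alloc;  /* Number of entries allocated.  */
};

/* Push VALUE onto S.  Return 0 on success, -1 on allocation failure.
   S->num is incremented only after the allocation has succeeded, so a
   failed push leaves S describing only fully initialized entries.  */
static int
int_stack_push (struct int_stack *s, int value)
{
  size_t num = s->num;
  if (num == s->alloc)   /* Grow only when the array is really full.  */
    {
      size_t new_alloc = s->alloc ? 2 * s->alloc : 4;
      int *new_items = realloc (s->items, new_alloc * sizeof *new_items);
      if (new_items == NULL)
        return -1;
      s->items = new_items;
      s->alloc = new_alloc;
    }
  s->items[num] = value;
  s->num = num + 1;
  return 0;
}
```

Code that later walks entries 0 through num - 1 to free them then never sees an entry whose members were not set.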




Re: [PATCH] fix descriptions for AT_NO_AUTOMOUNT

2022-03-10 Thread Paul Eggert

On 3/10/22 11:39, Pádraig Brady wrote:


The changes are a net improvement I think since fewer interfaces are used.

I would remove the AT_NO_AUTOMOUNT parameters to fstatat() though,
since they're redundant it seems, and would only result in confusion
if the patch is applied to remove that flag from the fstatat(2) man page.


OK, thanks, I installed the attached to do that.

From 51a5361a285783dd1bdc418bdad043069322d951 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Thu, 10 Mar 2022 13:07:53 -0800
Subject: [PATCH] fts: revert change to use AT_NO_AUTOMOUNT
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* NEWS: Don’t mention AT_NO_AUTOMOUNT.
* lib/fts.c (fts_stat): Don’t use AT_NO_AUTOMOUNT, as
it has no effect with fstatat.
---
 ChangeLog | 7 +++
 NEWS  | 3 +--
 lib/fts.c | 4 ++--
 3 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 294f6286f3..7713294982 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,10 @@
+2022-03-10  Paul Eggert  
+
+	fts: revert change to use AT_NO_AUTOMOUNT
+	* NEWS: Don’t mention AT_NO_AUTOMOUNT.
+	* lib/fts.c (fts_stat): Don’t use AT_NO_AUTOMOUNT, as
+	it has no effect with fstatat.
+
 2022-03-09  Paul Eggert  
 
 	statat: now obsolete
diff --git a/NEWS b/NEWS
index 8f90d8e958..1a1c21970a 100644
--- a/NEWS
+++ b/NEWS
@@ -66,8 +66,7 @@ User visible incompatible changes
 
 Date        Modules         Changes
 
-2022-03-09  statat          This module is deprecated.  Use fstatat instead,
-                            to specify whether you want AT_NO_AUTOMOUNT.
+2022-03-09  statat          This module is deprecated.  Use fstatat instead.
 
 2022-01-05  stack           This module now uses idx_t instead of size_t
                             for indexes and counts.
diff --git a/lib/fts.c b/lib/fts.c
index a1a7c09fdb..494a63af96 100644
--- a/lib/fts.c
+++ b/lib/fts.c
@@ -1775,12 +1775,12 @@ fts_stat(FTS *sp, register FTSENT *p, bool follow)
  * a stat(2).  If that fails, check for a non-existent symlink.  If
  * fail, set the errno from the stat call.
  */
-int flags = (follow ? 0 : AT_SYMLINK_NOFOLLOW) | AT_NO_AUTOMOUNT;
+int flags = follow ? 0 : AT_SYMLINK_NOFOLLOW;
 if (fstatat (sp->fts_cwd_fd, p->fts_accpath, sbp, flags) < 0)
   {
 if (follow && errno == ENOENT
 && 0 <= fstatat (sp->fts_cwd_fd, p->fts_accpath, sbp,
- AT_SYMLINK_NOFOLLOW | AT_NO_AUTOMOUNT))
+ AT_SYMLINK_NOFOLLOW))
   {
 __set_errno (0);
 return FTS_SLNONE;
-- 
2.35.1



Re: [PATCH] fix descriptions for AT_NO_AUTOMOUNT

2022-03-10 Thread Paul Eggert

On 3/10/22 05:46, Pádraig Brady wrote:

After looking at the kernel code, it seems that:
   fstatat() did _not_ imply AT_NO_AUTOMOUNT from 2.6.38 -> 4.11
     I'm not sure it even honored the AT_NO_AUTOMOUNT flag before 4.11
   fstatat() did imply AT_NO_AUTOMOUNT since 4.11


Ouch, so this whole thing has been a false alarm? Well, in some sense 
that's a relief; in another sense I wonder whether we should undo some 
of the recent Gnulib changes.




Re: fstatat + AT_NO_AUTOMOUNT

2022-03-09 Thread Paul Eggert
I audited gnulib's uses of fstatat and found one fishy one that doesn't 
use AT_NO_AUTOMOUNT, namely, in fts.c where the follow-symlink branch 
uses 'stat' whereas the no-follow-symlink branch uses fstatat without 
AT_NO_AUTOMOUNT. I installed the first patch to cause it to be consistent 
in using AT_NO_AUTOMOUNT, which is also consistent with what glibc does 
(though this doesn't necessarily mean it's right - perhaps fts should 
have a new flag to control automounts, depending on what the user wants).


The second attached patch deprecates statat and lstatat, due to the 
confusion already mentioned.


I haven't audited Gnulib's uses of 'stat' and 'lstat'.

From 44f347ce4009cd0100d0e6562939a032b16d6db1 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Wed, 9 Mar 2022 11:54:13 -0800
Subject: [PATCH 1/2] fts: be consistent about AT_NO_AUTOMOUNT
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* lib/fts.c (fts_stat): Use fstatat with AT_NO_AUTOMOUNT
consistently, instead of sometimes using stat (which implies
AT_NO_AUTOMOUNT) and sometimes using fstatat without AT_NO_AUTOMOUNT.
Remove a goto while we’re at it.
---
 ChangeLog |  8 
 lib/fts.c | 34 +-
 2 files changed, 25 insertions(+), 17 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index e3f0ed216c..58873a1762 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,11 @@
+2022-03-09  Paul Eggert  
+
+	fts: be consistent about AT_NO_AUTOMOUNT
+	* lib/fts.c (fts_stat): Use fstatat with AT_NO_AUTOMOUNT
+	consistently, instead of sometimes using stat (which implies
+	AT_NO_AUTOMOUNT) and sometimes using fstatat without AT_NO_AUTOMOUNT.
+	Remove a goto while we’re at it.
+
 2022-03-07  Pádraig Brady  
 
 	fcntl-h: add AT_NO_AUTOMOUNT
diff --git a/lib/fts.c b/lib/fts.c
index 706c56c597..a1a7c09fdb 100644
--- a/lib/fts.c
+++ b/lib/fts.c
@@ -1766,7 +1766,8 @@ fts_stat(FTS *sp, register FTSENT *p, bool follow)
 {
 struct stat *sbp = p->fts_statp;
 
-if (p->fts_level == FTS_ROOTLEVEL && ISSET(FTS_COMFOLLOW))
+if (ISSET (FTS_LOGICAL)
+|| (ISSET (FTS_COMFOLLOW) && p->fts_level == FTS_ROOTLEVEL))
 follow = true;
 
 /*
@@ -1774,22 +1775,21 @@ fts_stat(FTS *sp, register FTSENT *p, bool follow)
  * a stat(2).  If that fails, check for a non-existent symlink.  If
  * fail, set the errno from the stat call.
  */
-if (ISSET(FTS_LOGICAL) || follow) {
-if (stat(p->fts_accpath, sbp)) {
-if (errno == ENOENT
-&& lstat(p->fts_accpath, sbp) == 0) {
-__set_errno (0);
-return (FTS_SLNONE);
-}
-p->fts_errno = errno;
-goto err;
-}
-} else if (fstatat(sp->fts_cwd_fd, p->fts_accpath, sbp,
-   AT_SYMLINK_NOFOLLOW)) {
-p->fts_errno = errno;
-err:memset(sbp, 0, sizeof(struct stat));
-return (FTS_NS);
-}
+int flags = (follow ? 0 : AT_SYMLINK_NOFOLLOW) | AT_NO_AUTOMOUNT;
+if (fstatat (sp->fts_cwd_fd, p->fts_accpath, sbp, flags) < 0)
+  {
+if (follow && errno == ENOENT
+&& 0 <= fstatat (sp->fts_cwd_fd, p->fts_accpath, sbp,
+ AT_SYMLINK_NOFOLLOW | AT_NO_AUTOMOUNT))
+  {
+__set_errno (0);
+return FTS_SLNONE;
+  }
+
+p->fts_errno = errno;
+memset (sbp, 0, sizeof *sbp);
+return FTS_NS;
+  }
 
 if (S_ISDIR(sbp->st_mode)) {
 if (ISDOT(p->fts_name)) {
-- 
2.35.1

From eea9688d521634c58efa81130e509f647bbd9ff9 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Wed, 9 Mar 2022 13:54:53 -0800
Subject: [PATCH 2/2] statat: now obsolete
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* lib/openat.h (statat, lstatat): Now deprecated.
All uses removed, and replaced with fstatat.
* modules/statat: Mark as obsolete, because it’s confusing:
it’s not clear whether it should use AT_NO_AUTOMOUNT,
which is implied by stat and by lstat, but not by fstatat.
* tests/test-statat.c: Disable deprecated-declarations warnings.
---
 ChangeLog   |  8 
 NEWS|  3 +++
 lib/fchownat.c  |  2 +-
 lib/openat.h|  2 ++
 lib/renameatu.c | 15 ---
 lib/unlinkat.c  |  5 +++--
 modules/fchownat|  1 -
 modules/renameatu   |  1 -
 modules/statat  |  7 +++
 modules/unlinkat|  1 -
 tests/test-statat.c |  4 
 11 files changed, 36 insertions(+), 13 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 5

Re: [PATCH] fcntl-h: add AT_NO_AUTOMOUNT

2022-03-07 Thread Paul Eggert

On 3/7/22 06:08, Pádraig Brady wrote:

* lib/fcntl.in.h: Define AT_NO_AUTOMOUNT to 0 where not defined.
This is available on Linux since 2.6.38.


Looks good.

Please feel free to install this sort of thing without waiting for review.



Re: uninorm/composition.c:75:22: runtime error

2022-03-05 Thread Paul Eggert

On 3/4/22 05:58, Simon Josefsson via Gnulib discussion list wrote:

Below that is a patch for another UBSAN complaint about a NULL + 0
operation -- I recall that being discussed before too, but don't recall
the conclusion.


(char *)0 + 0 is undefined behavior, so clang's complaint about it is 
more justifiable than its complaint about assigning an unsigned char 
value to a char. But even with (char *)0 + 0 my feeling was that Clang 
is wrong, as adding 0 to (char *)0 works just fine on all platforms of 
practical interest; so we're better off overall by asking people to use 
-fno-sanitize=pointer-overflow when compiling Gnulib, if they also 
compile with -fsanitize=undefined. See the thread starting here:


https://lists.gnu.org/r/bug-gnulib/2021-10/msg00053.html

and ending here:

https://lists.gnu.org/r/bug-gnulib/2022-02/msg00049.html



Re: uninorm/composition.c:75:22: runtime error

2022-03-05 Thread Paul Eggert

On 3/4/22 05:21, Bruno Haible wrote:

Paul, do you agree that it's a good idea to add an explicit cast, to avoid
implicit conversion from 'unsigned char' to 'char'?


As you noted, there's no language reason to do it. It would be a change 
put in only to pacify Clang, in an area where Clang is buggy. So I'd 
leave the code alone and would use clang's -fno-sanitize-recover to work 
around the Clang bug. It might also help to file a bug report with the 
Clang folks.




Re: bug#32452: 26.1; gnutls_try_handshake maxes out cpu retrying when server is a bit busy

2022-03-03 Thread Paul Eggert

On 3/3/22 05:46, Lars Ingebrigtsen wrote:

So either there's something weird on my laptop, or it sounds like
there's an Autoconf bug in debian/bookworm?


Could be either.

Debian Bookworm uses Autoconf 2.71+patches, as opposed to the Autoconf 
2.69+patches that Fedora 35 uses. For what it's worth I just now tried 
to reproduce the problem on Fedora 35 but with an Autoconf 2.71 that I 
installed by hand, and could not reproduce the problem. However, I 
looked at the patches in autoconf_2.71-2.debian.tar.xz and none of them 
seemed to be relevant (some don't even apply, which is curious).


If you have an easily-reproducible script (runs in the C locale, etc.) 
it might be worth a bug report to the Debian developers. To be honest 
though it sounds like it might be something odd on your laptop.




Re: bug#32452: 26.1; gnutls_try_handshake maxes out cpu retrying when server is a bit busy

2022-03-01 Thread Paul Eggert

On 3/1/22 17:22, Lars Ingebrigtsen wrote:


--- a/lib/gnulib.mk.in
+++ b/lib/gnulib.mk.in
@@ -129,6 +129,7 @@
  #  minmax \
  #  mkostemp \
  #  mktime \
+#  nanosleep \
  #  nproc \
  #  nstrftime \
  #  pathmax \
@@ -2497,6 +2498,16 @@ EXTRA_libgnu_a_SOURCES += mktime.c
  endif
  ## end   gnulib module mktime-internal
  
+## begin gnulib module nanosleep

+ifeq (,$(OMIT_GNULIB_MODULE_nanosleep))
+
+ifneq (,$(GL_COND_OBJ_NANOSLEEP_CONDITION))
+libgnu_a_SOURCES += nanosleep.c
+endif
+
+endif
+## end   gnulib module nanosleep
+
  ## begin gnulib module nproc
  ifeq (,$(OMIT_GNULIB_MODULE_nproc))
  


This diff is wrong, as it omits a line "GL_COND_OBJ_NANOSLEEP_CONDITION 
= @GL_COND_OBJ_NANOSLEEP_CONDITION@".


I ran what should have been something like your commands and got the 
attached patch. One way forward is for you to simply install the 
attached patch and move on from there. Or we can continue to look into 
why things work for me and not for you. I suppose it could be an 
Autoconf bug on your platform, but it'd be an odd one.


Here's a shell transcript of what I did to get the attached patch, on 
Fedora 35 x86-64:


  $ git clone master master-tmp
  Cloning into 'master-tmp'...
  done.
  Updating files: 100% (4608/4608), done.
  $ cd master-tmp
  $ git log HEAD^!
  commit 689a34e2153ec558dbf406809a5e58489250fe1a (HEAD -> master, 
origin/master, origin/HEAD)

  Author: Po Lu 
  Date:   Wed Mar 2 09:46:44 2022 +0800

  Dismiss help text when item becomes unactivated on oldXMenu

  * oldXMenu/Activate.c (XMenuActivate): Dismiss help text when
  leaving an item.
  $ (cd ../gnulib && git log HEAD^! )
  commit 8c4f4d7a3c28f88b64fce2fb1d0dc0e570d1a482 (HEAD -> master, 
origin/master, origin/HEAD)

  Author: Paul Eggert 
  Date:   Tue Mar 1 10:01:22 2022 -0800

  Create lib/Makefile.am after gnulib-comp.m4

  * gnulib-tool (func_import): Create library makefile after
  creating gnulib-comp.m4.  With --gnu-make, the latter depends on
  the former.  See <https://bugs.gnu.org/32452#109>.
  $ sed -i 's/nproc nstrftime/nanosleep &/' admin/merge-gnulib
  $ admin/merge-gnulib
  Checking whether you have the necessary tools...
  (Read INSTALL.REPO for more details on building Emacs)
  Checking for autoconf (need at least version 2.65) ... ok
  Your system has the required tools.
  Building aclocal.m4 ...
  Running 'autoreconf -fi -I m4' ...
  Configuring local git repository...
  '.git/config' -> '.git/config.~1~'
  git config transfer.fsckObjects 'true'
  git config diff.cpp.xfuncname '!^[ 
\t]*[A-Za-z_][A-Za-z_0-9]*:[[:space:]]*($|/[/*])

  ^((::[[:space:]]*)?[A-Za-z_][A-Za-z_0-9]*[[:space:]]*\(.*)$
  ^((#define[[:space:]]|DEFUN).*)$'
  git config diff.elisp.xfuncname 
'^\([^[:space:]]*def[^[:space:]]+[[:space:]]+([^()[:space:]]+)'

  git config diff.m4.xfuncname '^((m4_)?define|A._DEFUN(_ONCE)?)\([^),]*'
  git config diff.make.xfuncname 
'^([$.[:alnum:]_].*:|[[:alnum:]_]+[[:space:]]*([*:+]?[:?]?|!?)=|define .*)'
  git config diff.shell.xfuncname 
'^([[:space:]]*[[:alpha:]_][[:alnum:]_]*[[:space:]]*\(\)|[[:alpha:]_][[:alnum:]_]*=)'
  git config diff.texinfo.xfuncname 
'^@node[[:space:]]+([^,[:space:]][^,]+)'

  Installing git hooks...
  'build-aux/git-hooks/commit-msg' -> '.git/hooks/commit-msg'
  'build-aux/git-hooks/pre-commit' -> '.git/hooks/pre-commit'
  'build-aux/git-hooks/prepare-commit-msg' -> 
'.git/hooks/prepare-commit-msg'

  '.git/hooks/applypatch-msg.sample' -> '.git/hooks/applypatch-msg'
  '.git/hooks/pre-applypatch.sample' -> '.git/hooks/pre-applypatch'
  You can now run './configure'.
  Module list with included dependencies (indented):
  absolute-header
  acl-permissions
alloca-opt
  allocator
  at-internal
  attribute
binary-io
  builtin-expect
byteswap
c-ctype
c-strcase
  c99
canonicalize-lgpl
careadlinkat
  clock-time
  cloexec
close-stream
copy-file-range
count-leading-zeros
count-one-bits
count-trailing-zeros
crypto/md5
crypto/md5-buffer
crypto/sha1-buffer
crypto/sha256-buffer
crypto/sha512-buffer
d-type
diffseq
  dirent
  dirfd
double-slash-root
dtoastr
dtotimespec
dup2
  dynarray
  eloop-threshold
environ
  errno
  euidaccess
execinfo
explicit_bzero
  extensions
  extern-inline
faccessat
fchmodat
fcntl
fcntl-h
fdopendir
file-has-acl
filemode
filename
filevercmp
flexmember
  fpending
fpieee
free-posix
fstatat
fsusage
fsync
futimens
  gen-header
  getdtablesize
  getgroups
getloadavg
getopt-gnu
  getopt-posix
ge

Re: bug#32452: 26.1; gnutls_try_handshake maxes out cpu retrying when server is a bit busy

2022-03-01 Thread Paul Eggert

On 3/1/22 10:52, Lars Ingebrigtsen wrote:

Paul Eggert  writes:


I looked into the problem some more and found what I think is the
underlying problem: gnulib-tool generated lib/gnulib.mk.in before it
generates m4/gnulib-comp.m4, which Makefile-generation relies upon. I
reverted my recent hack to emacs/admin/merge-gnulib and installed the
attached Gnulib patch. Please update to the latest Emacs and Gnulib
and try again.

I tried this now, but the symptoms seem to be the same -- after trying
to use nanosleep, as described before, I still get:

/usr/bin/ld: gnutls.o: in function `gnutls_try_handshake':
/home/larsi/src/emacs/gtest/src/gnutls.c:634: undefined reference to 
`rpl_nanosleep'
collect2: error: ld returned 1 exit status


Do you see this problem with a fresh checkout from the latest master 
branch, combined with the latest Gnulib? If not, problem solved. If so, 
what's the output of 'git status' and of 'git diff' when things fail?




Re: bug#32452: 26.1; gnutls_try_handshake maxes out cpu retrying when server is a bit busy

2022-03-01 Thread Paul Eggert

On 3/1/22 07:36, Lars Ingebrigtsen wrote:


My latest attempt wasn't from a bare checkout -- it was from my normal
development tree, though.


Well, that's annoying. :-)

I looked into the problem some more and found what I think is the 
underlying problem: gnulib-tool generates lib/gnulib.mk.in before it 
generates m4/gnulib-comp.m4, which Makefile-generation relies upon. I 
reverted my recent hack to emacs/admin/merge-gnulib and installed the 
attached Gnulib patch. Please update to the latest Emacs and Gnulib and 
try again.

From 8c4f4d7a3c28f88b64fce2fb1d0dc0e570d1a482 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Tue, 1 Mar 2022 10:01:22 -0800
Subject: [PATCH] Create lib/Makefile.am after gnulib-comp.m4

* gnulib-tool (func_import): Create library makefile after
creating gnulib-comp.m4.  With --gnu-make, the latter depends on
the former.  See <https://bugs.gnu.org/32452#109>.
---
 ChangeLog   |  7 ++
 gnulib-tool | 68 +++--
 2 files changed, 42 insertions(+), 33 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 629ec803fd..c5a80fd3f3 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,10 @@
+2022-03-01  Paul Eggert  
+
+	Create lib/Makefile.am after gnulib-comp.m4
+	* gnulib-tool (func_import): Create library makefile after
+	creating gnulib-comp.m4.  With --gnu-make, the latter depends on
+	the former.  See <https://bugs.gnu.org/32452#109>.
+
 2022-02-26  Paul Eggert  
 
 	gettime-res: fix unlikely overflow bug
diff --git a/gnulib-tool b/gnulib-tool
index 9ee7560209..e420b321d2 100755
--- a/gnulib-tool
+++ b/gnulib-tool
@@ -5692,39 +5692,6 @@ s,//*$,/,'
 func_note_Makefile_am_edit "$dir1" EXTRA_DIST "${dir2}gnulib-cache.m4"
   }
 
-  # Create library makefile.
-  func_dest_tmpfilename $sourcebase/$source_makefile_am
-  destfile="$sourcebase/$source_makefile_am"
-  modules="$main_modules"
-  if $automake_subdir; then
-func_emit_lib_Makefile_am | "$gnulib_dir"/build-aux/prefix-gnulib-mk --from-gnulib-tool --lib-name="$libname" --prefix="$sourcebase/" > "$tmpfile"
-  else
-func_emit_lib_Makefile_am > "$tmpfile"
-  fi
-  if test -f "$destdir"/$sourcebase/$source_makefile_am; then
-if cmp -s "$destdir"/$sourcebase/$source_makefile_am "$tmpfile"; then
-  rm -f "$tmpfile"
-else
-  if $doit; then
-echo "Updating $sourcebase/$source_makefile_am (backup in $sourcebase/$source_makefile_am~)"
-mv -f "$destdir"/$sourcebase/$source_makefile_am "$destdir"/$sourcebase/$source_makefile_am~
-mv -f "$tmpfile" "$destdir"/$sourcebase/$source_makefile_am
-  else
-echo "Update $sourcebase/$source_makefile_am (backup in $sourcebase/$source_makefile_am~)"
-rm -f "$tmpfile"
-  fi
-fi
-  else
-if $doit; then
-  echo "Creating $sourcebase/$source_makefile_am"
-  mv -f "$tmpfile" "$destdir"/$sourcebase/$source_makefile_am
-else
-  echo "Create $sourcebase/$source_makefile_am"
-  rm -f "$tmpfile"
-fi
-func_append added_files "$sourcebase/$source_makefile_am$nl"
-  fi
-
   # Create po/ directory.
   if test -n "$pobase"; then
 # Create po makefile and auxiliary files.
@@ -6131,6 +6098,41 @@ s,//*$,/,'
 fi
   fi
 
+  # Create library makefile.
+  # Do this after creating gnulib-comp.m4, because func_emit_lib_Makefile_am
+  # can run 'autoconf -t', which reads gnulib-comp.m4.
+  func_dest_tmpfilename $sourcebase/$source_makefile_am
+  destfile="$sourcebase/$source_makefile_am"
+  modules="$main_modules"
+  if $automake_subdir; then
+func_emit_lib_Makefile_am | "$gnulib_dir"/build-aux/prefix-gnulib-mk --from-gnulib-tool --lib-name="$libname" --prefix="$sourcebase/" > "$tmpfile"
+  else
+func_emit_lib_Makefile_am > "$tmpfile"
+  fi
+  if test -f "$destdir"/$sourcebase/$source_makefile_am; then
+if cmp -s "$destdir"/$sourcebase/$source_makefile_am "$tmpfile"; then
+  rm -f "$tmpfile"
+else
+  if $doit; then
+echo "Updating $sourcebase/$source_makefile_am (backup in $sourcebase/$source_makefile_am~)"
+mv -f "$destdir"/$sourcebase/$source_makefile_am "$destdir"/$sourcebase/$source_makefile_am~
+mv -f "$tmpfile" "$destdir"/$sourcebase/$source_makefile_am
+  else
+echo "Update $sourcebase/$source_makefile_am (backup in $sourcebase/$source_makefile_am~)"
+rm -f "$tmpfile"
+  fi
+fi
+  else
+if $doit; then
+  echo "Creating $sourcebase/$source_makefile_am"
+  mv -f "$tmpfile" "$destdir"/$sourcebase/$source_makefile_am
+else
+  echo "Create $sourcebase/$source_makefile_am"
+  rm -f "$tmpfile"
+fi
+func_append added_files "$sourcebase/$source_makefile_am$nl"
+  fi
+
   if $gentests; then
 # Create tests makefile.
 func_dest_tmpfilename $testsbase/$tests_makefile_am
-- 
2.32.0



Re: bug#32452: 26.1; gnutls_try_handshake maxes out cpu retrying when server is a bit busy

2022-02-28 Thread Paul Eggert

On 2/28/22 00:59, Lars Ingebrigtsen wrote:

/usr/bin/ld: gnutls.o: in function `gnutls_try_handshake':
/home/larsi/src/emacs/trunk/src/gnutls.c:634: undefined reference to 
`rpl_nanosleep'


Evidently my recent workarounds in Emacs to handle running gnulib-tool 
from a bare checkout were not sufficient. I installed the attached patch 
to up the ante; please give it a try.


It is unfortunate that emacs/admin/merge-gnulib now runs gnulib-tool 
twice from a bare checkout, as gnulib-tool is quite slow.


I think gnulib-tool needs to run twice because it builds 
emacs/lib/gnulib.mk.in before it builds emacs/m4/gnulib-comp.m4, and so 
doesn't use the newly-added emacs/m4/nanosleep.m4 to figure out the new 
X=@X@ lines that needed to be added to emacs/lib/gnulib.mk.in. I suspect 
that this is related to Emacs's using GNU Make rather than Automake.
However, I haven't debugged this.

From d150eb438baa62ef3965ef4dc1f9f342ed839a18 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Mon, 28 Feb 2022 13:16:44 -0800
Subject: [PATCH] Work around merge-gnulib glitch from fresh checkout

* admin/merge-gnulib: In a fresh checkout, run gnulib-tool
twice, instead of merely running autogen.sh twice.
---
 admin/merge-gnulib | 54 +++---
 1 file changed, 32 insertions(+), 22 deletions(-)

diff --git a/admin/merge-gnulib b/admin/merge-gnulib
index fec469c017..7219fadd47 100755
--- a/admin/merge-gnulib
+++ b/admin/merge-gnulib
@@ -102,34 +102,44 @@ gnulib_srcdir=
   exit 1
 }
 
-# gnulib-tool has problems with a bare checkout (Bug#32452#65).
-test -f configure || ./autogen.sh || exit
-
 # Old caches can confuse autoconf when some Gnulib-related changes take effect.
 rm -fr autom4te.cache || exit
 
+# gnulib-tool has problems with a bare checkout (Bug#32452#91).
+if test -f configure; then
+  passes='1'
+else
+  passes='1 2'
+fi
+
 avoided_flags=
 for module in $AVOIDED_MODULES; do
   avoided_flags="$avoided_flags --avoid=$module"
 done
 
-"$gnulib_srcdir"/gnulib-tool --dir="$src" $GNULIB_TOOL_FLAGS \
+for pass in $passes; do
+  case $pass in
+2) echo 'Running gnulib-tool again to work around Bug#32452#91.' >&2
+  esac
+
+  "$gnulib_srcdir"/gnulib-tool --dir="$src" $GNULIB_TOOL_FLAGS \
 	$avoided_flags $GNULIB_MODULES &&
-rm -- "$src"lib/gl_openssl.h \
-  "$src"lib/stdio-read.c "$src"lib/stdio-write.c \
-  "$src"m4/fcntl-o.m4 \
-  "$src"m4/gl-openssl.m4 \
-  "$src"m4/gnulib-cache.m4 "$src"m4/gnulib-tool.m4 \
-  "$src"m4/manywarnings-c++.m4 \
-  "$src"m4/warn-on-use.m4 "$src"m4/wint_t.m4 &&
-cp -- "$gnulib_srcdir"/build-aux/texinfo.tex "$src"doc/misc &&
-cp -- "$gnulib_srcdir"/build-aux/config.guess \
-  "$gnulib_srcdir"/build-aux/config.sub \
-  "$gnulib_srcdir"/build-aux/install-sh \
-  "$gnulib_srcdir"/build-aux/move-if-change \
-   "$src"build-aux &&
-cp -- "$gnulib_srcdir"/lib/af_alg.h \
-  "$gnulib_srcdir"/lib/save-cwd.h \
-   "$src"lib &&
-{ test -z "$src" || cd "$src"; } &&
-./autogen.sh
+  rm -- "$src"lib/gl_openssl.h \
+	"$src"lib/stdio-read.c "$src"lib/stdio-write.c \
+	"$src"m4/fcntl-o.m4 \
+	"$src"m4/gl-openssl.m4 \
+	"$src"m4/gnulib-cache.m4 "$src"m4/gnulib-tool.m4 \
+	"$src"m4/manywarnings-c++.m4 \
+	"$src"m4/warn-on-use.m4 "$src"m4/wint_t.m4 &&
+  cp -- "$gnulib_srcdir"/build-aux/texinfo.tex "$src"doc/misc &&
+  cp -- "$gnulib_srcdir"/build-aux/config.guess \
+	"$gnulib_srcdir"/build-aux/config.sub \
+	"$gnulib_srcdir"/build-aux/install-sh \
+	"$gnulib_srcdir"/build-aux/move-if-change \
+ "$src"build-aux &&
+  cp -- "$gnulib_srcdir"/lib/af_alg.h \
+	"$gnulib_srcdir"/lib/save-cwd.h \
+ "$src"lib &&
+  { test -z "$src" || cd "$src"; } &&
+  ./autogen.sh || exit
+done
-- 
2.32.0



[PATCH] gettime-res: fix unlikely overflow bug

2022-02-26 Thread Paul Eggert
* lib/gettime-res.c (gettime_res): Fix bug when hz * tv_sec overflows.
With 64-bit ‘long’ and nanosecond resolution the bug can occur
starting in the year 2262, with probability about 2e-9.
With 32-bit ‘long’ the bug can occur now, with same probability.
The probability goes up on hosts with worse timestamp resolution.
---
 ChangeLog | 7 +++
 lib/gettime-res.c | 2 +-
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/ChangeLog b/ChangeLog
index 430f81fd39..629ec803fd 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,12 @@
 2022-02-26  Paul Eggert  
 
+   gettime-res: fix unlikely overflow bug
+   * lib/gettime-res.c (gettime_res): Fix bug when hz * tv_sec overflows.
+   With 64-bit ‘long’ and nanosecond resolution the bug can occur
+   starting in the year 2262, with probability about 2e-9.
+   With 32-bit ‘long’ the bug can occur now, with same probability.
+   The probability goes up on hosts with worse timestamp resolution.
+
Document clang -fsanitize=undefined glitch
* doc/gnulib-intro.texi (Unsupported Platforms):
Document incompatibility of ‘clang -fsanitize=undefined’
diff --git a/lib/gettime-res.c b/lib/gettime-res.c
index 3cc07de6da..611f83ad27 100644
--- a/lib/gettime-res.c
+++ b/lib/gettime-res.c
@@ -64,7 +64,7 @@ gettime_res (void)
   for (int i = 0; 1 < r && i < 32; i++)
 {
   struct timespec now = current_timespec ();
-  r = gcd (r, now.tv_nsec ? now.tv_nsec : hz * now.tv_sec);
+  r = gcd (r, now.tv_nsec ? now.tv_nsec : hz);
 }
 
   return r;
-- 
2.35.1
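
A minimal sketch of why the one-line change above is safe, under the stated assumption that r always divides hz (for instance because r starts out as hz and is only ever replaced by a gcd involving itself). When tv_nsec is zero the timestamp is a whole multiple of hz, so gcd (r, hz * tv_sec) equals gcd (r, hz) and the possibly overflowing product need not be formed. The names gcd_long and refine_resolution are hypothetical and are not Gnulib's:

```c
#include <assert.h>

/* Greatest common divisor of two positive values, by Euclid's algorithm.  */
static long int
gcd_long (long int a, long int b)
{
  while (b != 0)
    {
      long int rem = a % b;
      a = b;
      b = rem;
    }
  return a;
}

/* Hypothetical stand-in for the loop body in the patch: refine the
   resolution estimate R (in units of 1/HZ seconds) using one sample
   whose subsecond part is TV_NSEC.  Assumes R divides HZ.  */
static long int
refine_resolution (long int r, long int hz, long int tv_nsec)
{
  assert (0 < r && hz % r == 0);
  assert (0 <= tv_nsec && tv_nsec < hz);
  /* No multiplication by tv_sec, hence no overflow.  */
  return gcd_long (r, tv_nsec ? tv_nsec : hz);
}
```

With hz = 1000000000, a sample whose tv_nsec is 0 leaves r unchanged, whereas a sample with tv_nsec = 500000000 narrows r to 500000000.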




Re: gl_array_list.c:452:29: runtime error: applying zero offset to null pointer

2022-02-26 Thread Paul Eggert

On 11/1/21 18:13, Paul Eggert wrote:


Most likely Paweł can configure his testing environment to suppress 
these false alarms. If not, I suggest firing off a bug report to the 
Clang developers, asking for an easy way to suppress them. In practice 
these particular diagnostics are more trouble than they're worth.


While rereading the Gnulib manual I remembered this issue, found a way 
to suppress Clang's false alarms, and documented it in the attached
Gnulib patch.

From 532b4c9f21473559657e273ef9f8f6fc8c7c2ab1 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Sat, 26 Feb 2022 11:39:32 -0800
Subject: [PATCH] Document clang -fsanitize=undefined glitch
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* doc/gnulib-intro.texi (Unsupported Platforms):
Document incompatibility of ‘clang -fsanitize=undefined’
with Gnulib, and how to work around it by also using
‘-fno-sanitize=pointer-overflow’.
---
 ChangeLog |  8 
 doc/gnulib-intro.texi | 11 +++
 2 files changed, 19 insertions(+)

diff --git a/ChangeLog b/ChangeLog
index 6daf85da3e..430f81fd39 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,11 @@
+2022-02-26  Paul Eggert  
+
+	Document clang -fsanitize=undefined glitch
+	* doc/gnulib-intro.texi (Unsupported Platforms):
+	Document incompatibility of ‘clang -fsanitize=undefined’
+	with Gnulib, and how to work around it by also using
+	‘-fno-sanitize=pointer-overflow’.
+
 2022-02-25  Darshit Shah  
 
 	modules/unicase/special-casing: Fix compilation error
diff --git a/doc/gnulib-intro.texi b/doc/gnulib-intro.texi
index a80c0995f5..0bc9701561 100644
--- a/doc/gnulib-intro.texi
+++ b/doc/gnulib-intro.texi
@@ -235,6 +235,17 @@ and Gnulib-using code would have if it were intended to be portable to
 all practical POSIX or C platforms.
 
 @itemize @bullet
+@item
+Clang's @option{-fsanitize=undefined} option causes the program to
+crash if it adds zero to a null pointer -- behavior that is undefined
+in strict C, but which yields a null pointer on all practical porting
+targets and which the Gnulib portability guidelines allow.
+
+If you use Clang with @option{-fsanitize=undefined}, you can work
+around the problem by also using @samp{-fno-sanitize=pointer-overflow},
+although this may also disable some unrelated and useful pointer checks.
+Perhaps someday the Clang developers will fix the infelicity.
+
 @item
 The IBM i's pointers are 128 bits wide and it lacks the two types
 @code{intptr_t} and @code{uintptr_t}, which are optional in the C and
-- 
2.32.0
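
For code that would rather avoid the diagnostic than disable the check, one possible alternative (not something the patch above does) is to skip the pointer addition when the count is zero. The sketch below is only an illustration; array_end and sum_ints are hypothetical names:

```c
#include <assert.h>
#include <stddef.h>

/* Return the end pointer of an N-element array starting at BASE.
   When N is zero BASE may be null, and no arithmetic is performed,
   so the sanitizer has nothing to complain about.  */
static int const *
array_end (int const *base, size_t n)
{
  assert (base != NULL || n == 0);
  return n ? base + n : base;
}

/* Sum N ints starting at BASE; BASE may be null when N is zero.  */
static long int
sum_ints (int const *base, size_t n)
{
  long int total = 0;
  int const *end = array_end (base, n);
  /* Use != rather than < so that two null pointers are compared
     only for equality.  */
  for (int const *p = base; p != end; p++)
    total += *p;
  return total;
}
```

The cost is an extra branch per call, which is the trade-off the documentation patch sidesteps by recommending -fno-sanitize=pointer-overflow instead.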



Re: [PATCH] modules/unicase/special-casing: Fix compilation error

2022-02-25 Thread Paul Eggert

Thanks, that fixes a typo I introduced on Christmas Eve. I installed it.



Re: [PATCH] Port FALLTHROUGH to Apple Clang before 6

2022-02-25 Thread Paul Eggert
Thanks for reporting the problem. We've run into it elsewhere in Gnulib, 
and used a slightly-different approach, so I installed the attached 
which I hope fixes things for you.

From e86394634c8acfa7b95d1016e7ce1e5cae33207b Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Fri, 25 Feb 2022 15:30:42 -0800
Subject: [PATCH] =?UTF-8?q?Port=20=5F=5Fhas=5Fattribute=20to=20Apple?=
 =?UTF-8?q?=E2=80=99s=20Clang=20renumbering?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Problem reported by Kirill A. Korinsky in:
https://lists.gnu.org/r/bug-gnulib/2022-02/msg00034.html
* config/srclist.txt: Comment out sys/cdefs.h for now.
* lib/cdefs.h (__glibc_has_attribute):
* m4/gnulib-common.m4 (gl_COMMON_BODY):
Port to Apple’s renumbering of Clang versions.
---
 ChangeLog   | 8 
 config/srclist.txt  | 2 +-
 lib/cdefs.h | 4 +++-
 m4/gnulib-common.m4 | 4 +++-
 4 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 49a1e6a168..0b8525e926 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,13 @@
 2022-02-25  Paul Eggert  
 
+	Port __has_attribute to Apple’s Clang renumbering
+	Problem reported by Kirill A. Korinsky in:
+	https://lists.gnu.org/r/bug-gnulib/2022-02/msg00034.html
+	* config/srclist.txt: Comment out sys/cdefs.h for now.
+	* lib/cdefs.h (__glibc_has_attribute):
+	* m4/gnulib-common.m4 (gl_COMMON_BODY):
+	Port to Apple’s renumbering of Clang versions.
+
 	nanosleep: simplify by using pselect
 	GNU Emacs avoids Gnulib’s ‘select’ module and uses only pselect,
 	which it implements in a special way on MS-DOS.
diff --git a/config/srclist.txt b/config/srclist.txt
index dc69587e99..d89cd30ed9 100644
--- a/config/srclist.txt
+++ b/config/srclist.txt
@@ -64,7 +64,7 @@ $LIBCSRC malloc/scratch_buffer_grow.c	lib/malloc
 $LIBCSRC malloc/scratch_buffer_grow_preserve.c	lib/malloc
 $LIBCSRC malloc/scratch_buffer_set_array_size.c	lib/malloc
 #$LIBCSRC include/intprops.h lib
-$LIBCSRC misc/sys/cdefs.h		lib
+#$LIBCSRC misc/sys/cdefs.h		lib
 #$LIBCSRC posix/regcomp.c		lib
 $LIBCSRC posix/regex.c			lib
 $LIBCSRC posix/regex.h			lib
diff --git a/lib/cdefs.h b/lib/cdefs.h
index 44d3826bca..cb2514504f 100644
--- a/lib/cdefs.h
+++ b/lib/cdefs.h
@@ -41,7 +41,9 @@
Similarly for __has_builtin, etc.  */
 #if (defined __has_attribute \
  && (!defined __clang_minor__ \
- || 3 < __clang_major__ + (5 <= __clang_minor__)))
+ || (defined __apple_build_version__ \
+ ? 600 <= __apple_build_version__ \
+ : 3 < __clang_major__ + (5 <= __clang_minor__))))
 # define __glibc_has_attribute(attr) __has_attribute (attr)
 #else
 # define __glibc_has_attribute(attr) 0
diff --git a/m4/gnulib-common.m4 b/m4/gnulib-common.m4
index dbc4079614..c5ced04f18 100644
--- a/m4/gnulib-common.m4
+++ b/m4/gnulib-common.m4
@@ -69,7 +69,9 @@ AC_DEFUN([gl_COMMON_BODY], [
 [/* Attributes.  */
 #if (defined __has_attribute \
  && (!defined __clang_minor__ \
- || 3 < __clang_major__ + (5 <= __clang_minor__)))
+ || (defined __apple_build_version__ \
+ ? 600 <= __apple_build_version__ \
+ : 3 < __clang_major__ + (5 <= __clang_minor__))))
 # define _GL_HAS_ATTRIBUTE(attr) __has_attribute (__##attr##__)
 #else
 # define _GL_HAS_ATTRIBUTE(attr) _GL_ATTR_##attr
-- 
2.35.1
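
A self-contained sketch of how a guard of this shape might be used. The condition is copied from the patch above, including its 600 threshold; MY_HAS_ATTRIBUTE, MY_FALLTHROUGH and classify are hypothetical names for illustration and are not Gnulib's:

```c
#include <assert.h>

/* Trust __has_attribute only if the compiler is not Clang, or is a
   Clang new enough by either the Apple or the LLVM numbering.  */
#if (defined __has_attribute \
     && (!defined __clang_minor__ \
         || (defined __apple_build_version__ \
             ? 600 <= __apple_build_version__ \
             : 3 < __clang_major__ + (5 <= __clang_minor__))))
# define MY_HAS_ATTRIBUTE(attr) __has_attribute (attr)
#else
# define MY_HAS_ATTRIBUTE(attr) 0
#endif

/* Mark an intentional fall-through if the compiler supports it;
   otherwise expand to a harmless no-op expression.  */
#if MY_HAS_ATTRIBUTE (__fallthrough__)
# define MY_FALLTHROUGH __attribute__ ((__fallthrough__))
#else
# define MY_FALLTHROUGH ((void) 0)
#endif

/* Case 2 deliberately falls through into case 1.  */
static int
classify (int c)
{
  int score = 0;
  switch (c)
    {
    case 2:
      score += 2;
      MY_FALLTHROUGH;
    case 1:
      score += 1;
      break;
    default:
      break;
    }
  return score;
}
```

On a compiler where the guard evaluates to 0 the macro degrades to a no-op, so the code still compiles, merely without the annotation.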



Re: Ncurses support?

2022-02-25 Thread Paul Eggert

On 2/25/22 11:26, Mike Frysinger wrote:

you also seem to be unfamiliar with what goes into a functional curses library
(including the terminfo library).  this is not a trivial undertaking by any
means.


I contributed to termcap in the 1970s, when Bill Joy was still in charge 
of it. You can see some of my work even today, by searching for "This 
description is tricky" in FreeBSD's /etc/termcap.


This experience convinced me to never work on anything like that again. 
What a mess it is! My hat is off to anybody who wants to maintain it.




Re: bug#32452: 26.1; gnutls_try_handshake maxes out cpu retrying when server is a bit busy

2022-02-25 Thread Paul Eggert

On 2/24/22 18:27, Lars Ingebrigtsen wrote:


But autogen.sh fails:

Running 'autoreconf -fi -I m4' ...
configure.ac:6060: warning: gl_FUNC_SELECT is m4_require'd but not m4_defun'd


This is because Gnulib's 'nanosleep' module depended on the 'select' 
module, but Emacs's admin/merge-gnulib avoids the 'select' module 
(because Emacs relies on pselect instead and has its own MS-DOS pselect 
substitute).


Gnulib's nanosleep appears to use select only for old Unixish platforms 
that were relevant in 2000 but aren't practical porting targets any 
more. So I installed into Gnulib the attached patch to simplify Gnulib 
nanosleep by having it fall back on pselect rather than select, and to 
not bother with signal handling. This should cause your addition of 
nanosleep to admin/merge-gnulib to add only the files lib/nanosleep.c 
and m4/nanosleep.m4 (not the other, signal-related files you mentioned; 
they shouldn't be needed with Emacs).
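
A rough sketch of what falling back on pselect can look like, assuming a POSIX host that declares pselect in <sys/select.h>. This is an illustration, not the code in the attached patch; sleep_with_pselect is a hypothetical name, and unlike nanosleep it does not report the remaining time:

```c
#ifndef _POSIX_C_SOURCE
# define _POSIX_C_SOURCE 200809L
#endif

#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <time.h>

/* Sleep for *DELAY by waiting on no file descriptors at all.
   Return 0 on success, -1 with errno set on failure.  */
static int
sleep_with_pselect (struct timespec const *delay)
{
  assert (delay != NULL);
  if (delay->tv_sec < 0
      || delay->tv_nsec < 0 || 1000000000 <= delay->tv_nsec)
    {
      errno = EINVAL;
      return -1;
    }
  /* With empty descriptor sets and a null signal mask, pselect
     simply blocks until the timeout expires or a signal arrives.  */
  return pselect (0, NULL, NULL, NULL, delay, NULL);
}
```

If a signal interrupts the wait, pselect fails with EINTR, so a caller that needs the full delay would have to retry.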


You might also want to adopt my recent little merge-gnulib changes.

I notice that Emacs's GNUstep code calls 'select'. For completeness this 
should be 'pselect' instead, so that Emacs never calls 'select'.

From 2510ffcdcdad4e5cd20455b4891de4f5e128072a Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Fri, 25 Feb 2022 11:54:49 -0800
Subject: [PATCH] nanosleep: simplify by using pselect
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

GNU Emacs avoids Gnulib’s ‘select’ module and uses only pselect,
which it implements in a special way on MS-DOS.
Unfortunately, though, nanosleep uses ‘select’;
problem reported by Lars Ingebrigtsen (Bug#32452#74).
As far as I can tell, Gnulib nanosleep's use of
‘select’ with signals is only for ancient platforms
that Gnulib no longer cares about, so remove that use of ‘select’.
I don’t know of any platforms that still need this fallback code,
but just in case, fall back to pselect instead, while removing
signal handling that shouldn’t be needed nowadays.
* lib/nanosleep.c: Do not include sig-handler.h, sys/time.h.
(SIGCONT, suspended, sighandler, my_usleep): Remove.
(nanosleep) [!HAVE_BUG_BIG_NANOSLEEP && !(_WIN32 && !__CYGWIN__)]:
Just call pselect.
* m4/nanosleep.m4 (gl_FUNC_NANOSLEEP): Do not check for sys/time.h
or call gl_FUNC_SELECT.  Do not include sys/time.h or worry
about LIBSOCKET.
(gl_PREREQ_NANOSLEEP): Remove as it’s no longer needed.
All uses removed.
* modules/nanosleep (Depends-on): Add pselect.
Remove select, sigaction, sys_time.
---
 ChangeLog | 25 +
 lib/nanosleep.c   | 89 +++
 m4/nanosleep.m4   | 24 +
 modules/nanosleep |  7 +---
 4 files changed, 31 insertions(+), 114 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 3499d066e4..49a1e6a168 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,28 @@
+2022-02-25  Paul Eggert  
+
+	nanosleep: simplify by using pselect
+	GNU Emacs avoids Gnulib’s ‘select’ module and uses only pselect,
+	which it implements in a special way on MS-DOS.
+	Unfortunately, though, nanosleep uses ‘select’;
+	problem reported by Lars Ingebrigtsen (Bug#32452#74).
+	As far as I can tell, Gnulib nanosleep's use of
+	‘select’ with signals is only for ancient platforms
+	that Gnulib no longer cares about, so remove that use of ‘select’.
+	I don’t know of any platforms that still need this fallback code,
+	but just in case, fall back to pselect instead, while removing
+	signal handling that shouldn’t be needed nowadays.
+	* lib/nanosleep.c: Do not include sig-handler.h, sys/time.h.
+	(SIGCONT, suspended, sighandler, my_usleep): Remove.
+	(nanosleep) [!HAVE_BUG_BIG_NANOSLEEP && !(_WIN32 && !__CYGWIN__)]:
+	Just call pselect.
+	* m4/nanosleep.m4 (gl_FUNC_NANOSLEEP): Do not check for sys/time.h
+	or call gl_FUNC_SELECT.  Do not include sys/time.h or worry
+	about LIBSOCKET.
+	(gl_PREREQ_NANOSLEEP): Remove as it’s no longer needed.
+	All uses removed.
+	* modules/nanosleep (Depends-on): Add pselect.
+	Remove select, sigaction, sys_time.
+
 2022-02-24  Paul Eggert  
 
 	userspec: warn about '.' separator
diff --git a/lib/nanosleep.c b/lib/nanosleep.c
index 5294c646ae..446794edc0 100644
--- a/lib/nanosleep.c
+++ b/lib/nanosleep.c
@@ -23,7 +23,6 @@
 #include 
 
 #include "intprops.h"
-#include "sig-handler.h"
 #include "verify.h"
 
 #include 
@@ -32,7 +31,6 @@
 #include 
 #include 
 
-#include <sys/time.h>
 #include 
 
 #include 
@@ -181,45 +179,9 @@ nanosleep (const struct timespec *requested_delay,
 }
 
 #else
-/* Unix platforms lacking nanosleep. */
-
-/* Some systems (MSDOS) don't have SIGCONT.
-   Using SIGTERM here turns the signal-handling code below
-   into a no-op on such systems. */
-# ifndef SIGCONT
-#  define SIGCONT SIGTERM
-# endif
-
-static sig_atomic_t volatile suspended;
-
-/* Handle SIGCONT.

[PATCH 2/2] userspec: warn about '.' separator

2022-02-24 Thread Paul Eggert
Problem reported by Dan Jacobson (Bug#44770).
* lib/userspec.c: Don’t include stdbool.h since it’s now in our API.
(parse_user_spec_warn): New function, broken out of parse_user_spec
and with a new PWARN arg.
(parse_user_spec): Use it.
* lib/userspec.h: Include stdbool.h and declare new function.
* tests/test-userspec.c (struct test.in): Now a char array
so that it can be modified.
(T): Make the placeholder a valid test, as that simplifies
the code.  Omit NULL placeholder at the end, likewise.
(main): Set up T in the new way, and test that the "."  separator
acts like the ":" separator except with a warning if it works.
---
 ChangeLog | 14 
 lib/userspec.c| 26 ---
 lib/userspec.h|  8 +++--
 tests/test-userspec.c | 77 ---
 4 files changed, 91 insertions(+), 34 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index dda27c7eb9..3499d066e4 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,19 @@
 2022-02-24  Paul Eggert  
 
+   userspec: warn about '.' separator
+   Problem reported by Dan Jacobson (Bug#44770).
+   * lib/userspec.c: Don’t include stdbool.h since it’s now in our API.
+   (parse_user_spec_warn): New function, broken out of parse_user_spec
+   and with a new PWARN arg.
+   (parse_user_spec): Use it.
+   * lib/userspec.h: Include stdbool.h and declare new function.
+   * tests/test-userspec.c (struct test.in): Now a char array
+   so that it can be modified.
+   (T): Make the placeholder a valid test, as that simplifies
+   the code.  Omit NULL placeholder at the end, likewise.
+   (main): Set up T in the new way, and test that the "."  separator
+   acts like the ":" separator except with a warning if it works.
+
userspec: no need for static vars
* lib/userspec.c (parse_with_separator): Simplify.
 
diff --git a/lib/userspec.c b/lib/userspec.c
index e75feb255d..0635bf8426 100644
--- a/lib/userspec.c
+++ b/lib/userspec.c
@@ -22,7 +22,6 @@
 /* Specification.  */
 #include "userspec.h"
 
-#include <stdbool.h>
 #include 
 #include 
 #include 
@@ -251,15 +250,18 @@ parse_with_separator (char const *spec, char const *separator,
Either one might be NULL instead, indicating that it was not
given and the corresponding numeric ID was left unchanged.
 
-   Return NULL if successful, a static error message string if not.  */
+   Return NULL if successful, a static error message string if not.
+   If PWARN is null, return NULL instead of a warning;
+   otherwise, set *PWARN to true depending on whether returning a warning.  */
 
 char const *
-parse_user_spec (char const *spec, uid_t *uid, gid_t *gid,
- char **username, char **groupname)
+parse_user_spec_warn (char const *spec, uid_t *uid, gid_t *gid,
+  char **username, char **groupname, bool *pwarn)
 {
   char const *colon = gid ? strchr (spec, ':') : NULL;
   char const *error_msg =
 parse_with_separator (spec, colon, uid, gid, username, groupname);
+  bool warn = false;
 
   if (gid && !colon && error_msg)
 {
@@ -272,12 +274,26 @@ parse_user_spec (char const *spec, uid_t *uid, gid_t *gid,
   char const *dot = strchr (spec, '.');
   if (dot
   && ! parse_with_separator (spec, dot, uid, gid, username, groupname))
-error_msg = NULL;
+{
+  warn = true;
+  error_msg = pwarn ? N_("warning: '.' should be ':'") : NULL;
+}
 }
 
+  if (pwarn)
+*pwarn = warn;
   return error_msg;
 }
 
+/* Like parse_user_spec_warn, but generate only errors; no warnings.  */
+
+char const *
+parse_user_spec (char const *spec, uid_t *uid, gid_t *gid,
+ char **username, char **groupname)
+{
+  return parse_user_spec_warn (spec, uid, gid, username, groupname, NULL);
+}
+
 #ifdef TEST
 
 # define NULL_CHECK(s) ((s) == NULL ? "(null)" : (s))
diff --git a/lib/userspec.h b/lib/userspec.h
index fcfa4b9b74..7d5d063e7e 100644
--- a/lib/userspec.h
+++ b/lib/userspec.h
@@ -19,10 +19,14 @@
 #ifndef USERSPEC_H
 # define USERSPEC_H 1
 
+# include <stdbool.h>
 # include 
 
-const char *
-parse_user_spec (const char *spec_arg, uid_t *uid, gid_t *gid,
+char const *
+parse_user_spec (char const *spec_arg, uid_t *uid, gid_t *gid,
  char **username_arg, char **groupname_arg);
+char const *
+parse_user_spec_warn (char const *spec_arg, uid_t *uid, gid_t *gid,
+  char **username_arg, char **groupname_arg, bool *pwarn);
 
 #endif
diff --git a/tests/test-userspec.c b/tests/test-userspec.c
index c2e36ffb50..65287a592d 100644
--- a/tests/test-userspec.c
+++ b/tests/test-userspec.c
@@ -35,7 +35,7 @@
 
 struct test
 {
-  const char *in;
+  char in[100];
   uid_t uid;
   gid_t gid;
   const char *user_name;
@@ -48,6 +48,7 @@ static struct test T[] =
 { "&q

[PATCH 1/2] userspec: no need for static vars

2022-02-24 Thread Paul Eggert
* lib/userspec.c (parse_with_separator): Simplify.
---
 ChangeLog  |  5 +
 lib/userspec.c | 10 +++---
 2 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 9748e5ab50..dda27c7eb9 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,8 @@
+2022-02-24  Paul Eggert  
+
+   userspec: no need for static vars
+   * lib/userspec.c (parse_with_separator): Simplify.
+
 2022-02-22  Benno Schulenberg(tiny change)
 
doc: add two missing closing parentheses
diff --git a/lib/userspec.c b/lib/userspec.c
index f05ccbe635..e75feb255d 100644
--- a/lib/userspec.c
+++ b/lib/userspec.c
@@ -103,10 +103,6 @@ parse_with_separator (char const *spec, char const *separator,
   uid_t *uid, gid_t *gid,
   char **username, char **groupname)
 {
-  static const char *E_invalid_user = N_("invalid user");
-  static const char *E_invalid_group = N_("invalid group");
-  static const char *E_bad_spec = N_("invalid spec");
-
   const char *error_msg;
   struct passwd *pwd;
   struct group *grp;
@@ -167,7 +163,7 @@ parse_with_separator (char const *spec, char const *separator,
 {
   /* If there is no group,
  then there may not be a trailing ":", either.  */
-  error_msg = E_bad_spec;
+  error_msg = N_("invalid spec");
 }
   else
 {
@@ -176,7 +172,7 @@ parse_with_separator (char const *spec, char const *separator,
   && tmp <= MAXUID && (uid_t) tmp != (uid_t) -1)
 unum = tmp;
   else
-error_msg = E_invalid_user;
+error_msg = N_("invalid user");
 }
 }
   else
@@ -209,7 +205,7 @@ parse_with_separator (char const *spec, char const *separator,
   && tmp <= MAXGID && (gid_t) tmp != (gid_t) -1)
 gnum = tmp;
   else
-error_msg = E_invalid_group;
+error_msg = N_("invalid group");
 }
   else
 gnum = grp->gr_gid;
-- 
2.35.1
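
The optional PWARN out-parameter that [PATCH 2/2] of this series adds to parse_user_spec_warn follows a common C convention: a null pointer means "the caller does not want to hear about warnings". A much-simplified sketch of that convention follows; find_separator and find_separator_quiet are hypothetical, and unlike the real code this sketch tries '.' whenever ':' is absent rather than only after a failed lookup:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return a pointer to the user/group separator in SPEC, or NULL if
   there is none.  Prefer ':'; accept '.' as a deprecated fallback.
   If PWARN is nonnull, set *PWARN to whether the fallback was used.  */
static char const *
find_separator (char const *spec, bool *pwarn)
{
  assert (spec != NULL);
  char const *sep = strchr (spec, ':');
  bool warn = false;

  if (!sep)
    {
      sep = strchr (spec, '.');
      warn = sep != NULL;
    }

  if (pwarn)
    *pwarn = warn;
  return sep;
}

/* Like find_separator, but for callers that ignore warnings.  */
static char const *
find_separator_quiet (char const *spec)
{
  return find_separator (spec, NULL);
}
```

Keeping the old entry point as a thin wrapper, as the patch does with parse_user_spec, lets existing callers compile unchanged.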




Re: textstyle.h missing when building bison from current head

2022-02-23 Thread Paul Eggert

On 2/23/22 09:20, Paul Eggert wrote:


I haven't had time yet to look into the Emacs situation.


I've now fixed the problem in Emacs, I hope. It was related to Emacs 
using GNU Make instead of Automake for makefile conditions, and this 
collided with Gnulib's new way of doing conditional makes. This is 
discussed in:


https://bugs.gnu.org/32452#50

Since Bison uses Automake its problem is probably unrelated to Emacs's.



Re: textstyle.h missing when building bison from current head

2022-02-23 Thread Paul Eggert

On 2/23/22 05:02, Anthony Heading wrote:


I'm getting the error below when trying to build bison HEAD from github,
with a vanilla sequence of 'bootstrap', 'configure', 'make'. The latest
v3.8.2 bison release builds fine as I think it precedes gnulib updates in
December.


Yes, we're seeing similar problems with Emacs; we can't use 
Gnulib-since-December because of changes incompatible with how Emacs 
uses Gnulib (Emacs doesn't use Automake, and relies on its own 
'admin/merge-gnulib' script to use Gnulib). See:


https://bugs.gnu.org/32452#47

I haven't had time yet to look into the Emacs situation. I have so far 
resisted the temptation to suggest going back to Gnulib's old way of 
doing this sort of thing.




Re: bug#48085: date -d greater than 23 years ago gives error invalid date

2022-02-19 Thread Paul Eggert

On 4/28/21 16:23, Mark Krenz wrote:

So I'm not sure if this is a problem with coreutils or a change in the
zoneinfo database. Any ideas?


This appears to be a problem in the GNU C library, when its mktime 
deciphers the relatively unusual time zone history of Indiana.
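
The fallback that the patch attached to this message adds for such cases can be sketched in isolation as follows. The arithmetic mirrors the dst_difference lines of the patch; the wrapper function fallback_adjust and its parameter names are hypothetical, and the range check that the real code applies afterwards is omitted:

```c
#include <assert.h>

/* T is the timestamp mktime found for the requested wall-clock time,
   REQUESTED_ISDST is the caller's tm_isdst, and ACTUAL_ISDST is the
   tm_isdst that T really has.  If no nearby timestamp with the
   requested isdst exists, assume DST differs from standard time by
   exactly one hour and shift T accordingly.  */
static long long int
fallback_adjust (long long int t, int requested_isdst, int actual_isdst)
{
  /* +1 if we wanted standard time but got DST, -1 if the reverse,
     0 if the two agree.  */
  int dst_difference = (requested_isdst == 0) - (actual_isdst == 0);
  assert (-1 <= dst_difference && dst_difference <= 1);
  return t + 60 * 60 * dst_difference;
}
```

For example, a caller that asked for standard time but landed on a DST timestamp gets a result 3600 seconds later, since the same wall-clock reading in standard time corresponds to a later instant.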


I installed the attached patch into Gnulib and propagated it into 
Coreutils, so the issue should be fixed in the next release of GNU 
Coreutils. Eventually this patch should migrate from Gnulib to glibc so 
that other apps get the fix. Thanks for reporting the issue.

From 06b2e943be39284783ff81ac6c9503200f41dba3 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Sat, 19 Feb 2022 15:04:43 -0800
Subject: [PATCH] mktime: improve heuristic for ca-1986 Indiana DST

Problem reported by Mark Krenz <https://bugs.gnu.org/48085>.
* lib/mktime.c (__mktime_internal): Be more generous about
accepting arguments with the wrong value of tm_isdst, by falling
back to a one-hour DST difference if we find no nearby DST that is
unusual.  This fixes a problem where "1986-04-28 00:00 EDT" was
rejected when TZ="America/Indianapolis" because the nearest DST
timestamp occurred in 1970, a temporal distance too great for the
old heuristic.  This also narrows the search a bit, which
is a minor performance win.
* m4/mktime.m4 (gl_FUNC_MKTIME_WORKS):
Check for putenv failures and for Bug#48085.
* tests/test-parse-datetime.c (main):
Test for setenv failures and for Bug#48085.
---
 ChangeLog   | 17 +
 lib/mktime.c| 28 
 m4/mktime.m4| 29 +
 tests/test-parse-datetime.c | 21 +++--
 4 files changed, 81 insertions(+), 14 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 4bf0cec7f0..4d56be83d4 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,20 @@
+2022-02-19  Paul Eggert  
+
+	mktime: improve heuristic for ca-1986 Indiana DST
+	Problem reported by Mark Krenz <https://bugs.gnu.org/48085>.
+	* lib/mktime.c (__mktime_internal): Be more generous about
+	accepting arguments with the wrong value of tm_isdst, by falling
+	back to a one-hour DST difference if we find no nearby DST that is
+	unusual.  This fixes a problem where "1986-04-28 00:00 EDT" was
+	rejected when TZ="America/Indianapolis" because the nearest DST
+	timestamp occurred in 1970, a temporal distance too great for the
+	old heuristic.  This also narrows the search a bit, which
+	is a minor performance win.
+	* m4/mktime.m4 (gl_FUNC_MKTIME_WORKS):
+	Check for putenv failures and for Bug#48085.
+	* tests/test-parse-datetime.c (main):
+	Test for setenv failures and for Bug#48085.
+
 2022-02-12  Paul Eggert  
 
 	filevercmp: fix several unexpected results
diff --git a/lib/mktime.c b/lib/mktime.c
index aa12e28e16..7dc9d67ef9 100644
--- a/lib/mktime.c
+++ b/lib/mktime.c
@@ -429,8 +429,13 @@ __mktime_internal (struct tm *tp,
 	 time with the right value, and use its UTC offset.
 
 	 Heuristic: probe the adjacent timestamps in both directions,
-	 looking for the desired isdst.  This should work for all real
-	 time zone histories in the tz database.  */
+	 looking for the desired isdst.  If none is found within a
+	 reasonable duration bound, assume a one-hour DST difference.
+	 This should work for all real time zone histories in the tz
+	 database.  */
+
+  /* +1 if we wanted standard time but got DST, -1 if the reverse.  */
+  int dst_difference = (isdst == 0) - (tm.tm_isdst == 0);
 
   /* Distance between probes when looking for a DST boundary.  In
 	 tzdata2003a, the shortest period of DST is 601200 seconds
@@ -441,12 +446,14 @@ __mktime_internal (struct tm *tp,
 	 periods when probing.  */
   int stride = 601200;
 
-  /* The longest period of DST in tzdata2003a is 536454000 seconds
-	 (e.g., America/Jujuy starting 1946-10-01 01:00).  The longest
-	 period of non-DST is much longer, but it makes no real sense
-	 to search for more than a year of non-DST, so use the DST
-	 max.  */
-  int duration_max = 536454000;
+  /* In TZDB 2021e, the longest period of DST (or of non-DST), in
+	 which the DST (or adjacent DST) difference is not one hour,
+	 is 457243209 seconds: e.g., America/Cambridge_Bay with leap
+	 seconds, starting 1965-10-31 00:00 in a switch from
+	 double-daylight time (-05) to standard time (-07), and
+	 continuing to 1980-04-27 02:00 in a switch from standard time
+	 (-07) to daylight time (-06).  */
+  int duration_max = 457243209;
 
   /* Search in both directions, so the maximum distance is half
 	 the duration; add the stride to avoid off-by-1 problems.  */
@@ -483,6 +490,11 @@ __mktime_internal (struct tm *tp,
 	  }
 	  }
 
+  /* No unusual DST offset was found nearby.  Assume one-hour DST.  */
+  t += 60 * 60 * dst_difference;
+  if (mktime_min <= t && t <= mktime_max && convert_time (convert, t, &tm))
+	goto offse

Re: bug#49239: Unexpected results with sort -V

2022-02-12 Thread Paul Eggert

On 6/28/21 10:54, Kamil Dudka wrote:

You are right.  The matching algorithm was not implemented correctly and
the patch you attached fixes it.


I looked into Bug#49239 and found some more places where the 
documentation disagreed with the code. I installed the attached patches 
into Gnulib and Coreutils, respectively, which should bring the two into 
agreement and should fix the bugs that Michael reported albeit in a 
different way than his proposed patch. Briefly:


* The code didn't allow file name suffixes to be the entire file name, 
but the documentation did. Here I went with the documentation. I could 
be talked into the other way; it shouldn't matter much either way.


* The code did the preliminary test (without suffixes) using strcmp, the 
documentation said it should use version comparison. Here I went with 
the documentation.


* As Michael mentioned, sort -V mishandled NUL. I fixed this by adding a 
Gnulib function filenvercmp that treats NUL as just another character.


* As Michael also mentioned, filevercmp fell back on strcmp if version 
sort found no difference, which meant sort's --stable flag was 
ineffective. I fixed this by not having filevercmp fall back on strcmp.


* I fixed the two-consecutive dot and trailing-dot bugs Michael 
mentioned, by rewriting the suffix finder to not have that confusing 
READ_ALPHA state variable, and to instead implement the regular 
expression's nested * operators in the usual way with nested loops.
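
To make the last point concrete, here is a rough sketch of a suffix finder written with nested loops and no state variable, assuming the suffix pattern is (\.[A-Za-z~][A-Za-z0-9~]*)*$ and that the input is a byte string with an explicit length. It is an illustration, not the installed code; prefix_len, is_alpha and is_alnum are hypothetical names, and in this sketch the first byte is always counted as part of the prefix:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* ASCII-only classification, independent of the locale.  */
static bool
is_alpha (char c)
{
  return ('a' <= c && c <= 'z') || ('A' <= c && c <= 'Z');
}

static bool
is_alnum (char c)
{
  return is_alpha (c) || ('0' <= c && c <= '9');
}

/* Return the length of the part of S (N bytes, possibly containing
   NUL) that precedes the trailing run of suffix groups.  */
static size_t
prefix_len (char const *s, size_t n)
{
  assert (s != NULL || n == 0);
  size_t prefixlen = 0;

  for (size_t i = 0; i < n; )
    {
      /* Tentatively treat everything up to and including this byte
         as the prefix.  */
      i++;
      prefixlen = i;

      /* Outer *: zero or more groups, each starting with '.' followed
         by a letter or '~'.  Inner *: the rest of the group.  */
      while (i + 1 < n && s[i] == '.'
             && (is_alpha (s[i + 1]) || s[i + 1] == '~'))
        for (i += 2; i < n && (is_alnum (s[i]) || s[i] == '~'); i++)
          continue;
    }

  return prefixlen;
}
```

If the groups do not run all the way to the end of the string, the outer for loop resumes and moves prefixlen past them, which is how two consecutive dots or a trailing dot end up in the prefix rather than in the suffix.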


Thanks, Michael, for reporting the problem. I'm boldly closing the 
Coreutils bug report as fixed.

From 9f48fb992a3d7e96610c4ce8be969cff2d61a01b Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Sat, 12 Feb 2022 16:27:05 -0800
Subject: [PATCH] filevercmp: fix several unexpected results

Problems reported by Michael Debertol in <https://bugs.gnu.org/49239>.
While looking into this, I spotted some more areas where the
code and documentation did not agree, or where the documentation
was unclear.  The biggest change needed by coreutils is a new
function filenvercmp that can compare byte strings containing NUL.
* lib/filevercmp.c: Do not include sys/types.h, stdlib.h, string.h.
Include idx.h, verify.h.
(match_suffix): Remove, replacing all uses with calls to ...
(file_prefixlen): ... this new function.  Simplify it by
avoiding the need for a confusing READ_ALPHA state variable.
Change its API to something more useful, with a *LEN arg.
(file_prefixlen, verrevcmp):
Prefer idx_t to size_t where either will do.
(order): Change args to S, POS, LEN instead of just S[POS].
This lets us handle NUL bytes correctly.  Callers changed.
Verify that ints are sufficiently wide for its API.
(verrevcmp): Don't assume that S1[S1_LEN] is a non-digit,
and likewise for S2[S2_LEN].  The byte might not be accessible
if filenvercmp is being called.
(filevercmp): Reimplement by calling filenvercmp.
(filenvercmp): New function, rewritten without the assumption
that the inputs are null-terminated.
Remove "easy comparison to see if strings are identical", as the
use of it later (a) was undocumented, and (b) caused sort -V to be
unstable.  When both strings start with ".", do not skip past
the "."s before looking for suffixes, as this disagreed
with the documentation.
* lib/filevercmp.h: Fix comments, which had many mistakes.
(filenvercmp): New decl.
* modules/filevercmp (Depends-on): Add idx, verify.  Remove string.
* tests/test-filevercmp.c: Include string.h.
(examples): Reorder examples ".0" and ".9" that matched the code
but not the documentation.  The code has been fixed to match the
documentation.  Add some examples involving \1 so that they
can be tried with both \1 and \0.  Add some other examples
taken from the bug report.
(equals): New set of test cases.
(sign, test_filevercmp): New functions.
(main): Remove test case where the fixed filevercmp disagrees with
strverscmp.  Use test_filevercmp instead of filevercmp, so that
we also test filenvercmp.  Test the newly-introduced EQUALS cases.
---
 ChangeLog   |  46 ++
 lib/filevercmp.c| 187 +---
 lib/filevercmp.h|  66 ++
 modules/filevercmp  |   3 +-
 tests/test-filevercmp.c |  94 +++-
 5 files changed, 284 insertions(+), 112 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 62162cbfce..4bf0cec7f0 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,49 @@
+2022-02-12  Paul Eggert  
+
+	filevercmp: fix several unexpected results
+	Problems reported by Michael Debertol in <https://bugs.gnu.org/49239>.
+	While looking into this, I spotted some more areas where the
+	code and documentation did not agree, or where the documentation
+	was unclear.  The biggest change needed by coreutils is a new
+	function filenvercmp that can compare byte strings containing NUL.
+	* lib/filevercmp.c: Do not include sys/types.h, stdl

Re: helper to bump scriptversion=

2022-02-09 Thread Paul Eggert

On 2/7/22 18:33, Mike Frysinger wrote:

is there a helper script that i'm missing to update $scriptversion
automatically for me ?


I normally rely on Emacs to update $scriptversion. For example, the 
following at the bottom of the bootstrap file tells Emacs to update 
$scriptversion whenever I edit the file.


# Local Variables:
# eval: (add-hook 'before-save-hook 'time-stamp)
# time-stamp-start: "scriptversion="
# time-stamp-format: "%:y-%02m-%02d.%02H"
# time-stamp-time-zone: "UTC0"
# time-stamp-end: "; # UTC"
# End:

This could be turned into a shell script that calls Emacs.
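
For anyone who wants the same stamp without Emacs, the format itself is easy to reproduce. The sketch below is only an assumption about what such a helper might look like: format_scriptversion is a hypothetical name, and it produces the UTC "year-month-day.hour" string that the time-stamp-format and time-stamp-time-zone settings above describe. It does not edit any file.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: format T as a UTC stamp such as "2022-02-09.17",
   mirroring time-stamp-format "%:y-%02m-%02d.%02H" with time zone UTC0.
   Return the number of bytes stored (excluding the trailing NUL),
   or 0 on failure.  */
static size_t
format_scriptversion (time_t t, char *buf, size_t bufsize)
{
  assert (buf);

  /* gmtime gives broken-down UTC, so the result ignores the local TZ.  */
  struct tm const *tm = gmtime (&t);
  if (!tm)
    return 0;

  return strftime (buf, bufsize, "%Y-%m-%d.%H", tm);
}
```

A wrapper would still need to substitute the result between "scriptversion=" and "; # UTC" in the script, which is the part Emacs's time-stamp does automatically.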



Re: bug#50115: date command arithmetic involving the epoch produces "invalid date"

2022-02-05 Thread Paul Eggert
Thanks for the bug report. I installed the attached patches to Gnulib 
and to Coreutils, and the fix should be in the next Coreutils release.

From aa0d1e7800903f2d75432d78aa64a0e9770e83f2 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Sat, 5 Feb 2022 11:05:44 -0800
Subject: [PATCH] parse-datetime: allow calculations to yield -1

Problem reported by Jeremy Cantrell <https://bugs.gnu.org/50115>.
* lib/parse-datetime.y (parse_datetime_body): When calling mktime,
use an unmodified and negative tm_wday or tm_yday to detect an error,
as a (time_t) -1 return value is valid on most hosts.
* tests/test-parse-datetime.c (main): Add a test for the bug.
---
 ChangeLog   |  9 +
 lib/parse-datetime.y| 22 +++---
 tests/test-parse-datetime.c |  8 
 3 files changed, 28 insertions(+), 11 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 5445802ea2..18dcb3fe3f 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,12 @@
+2022-02-05  Paul Eggert  
+
+	parse-datetime: allow calculations to yield -1
+	Problem reported by Jeremy Cantrell <https://bugs.gnu.org/50115>.
+	* lib/parse-datetime.y (parse_datetime_body): When calling mktime,
+	use an unmodified and negative tm_wday or tm_yday to detect an error,
+	as a (time_t) -1 return value is valid on most hosts.
+	* tests/test-parse-datetime.c (main): Add a test for the bug.
+
 2022-02-04  Paul Eggert  
 
 	userspec: help fix GNU ‘id’ incompatibility
diff --git a/lib/parse-datetime.y b/lib/parse-datetime.y
index c40fdcef7f..9fc14c9d46 100644
--- a/lib/parse-datetime.y
+++ b/lib/parse-datetime.y
@@ -2076,21 +2076,20 @@ parse_datetime_body (struct timespec *result, char const *p,
   if (pc.days_seen && ! pc.dates_seen)
 {
   intmax_t dayincr;
-  if (INT_MULTIPLY_WRAPV ((pc.day_ordinal
-   - (0 < pc.day_ordinal
-  && tm.tm_wday != pc.day_number)),
-  7, &dayincr)
-  || INT_ADD_WRAPV ((pc.day_number - tm.tm_wday + 7) % 7,
-dayincr, &dayincr)
-  || INT_ADD_WRAPV (dayincr, tm.tm_mday, &tm.tm_mday))
-Start = -1;
-  else
+  tm.tm_yday = -1;
+  if (! (INT_MULTIPLY_WRAPV ((pc.day_ordinal
+  - (0 < pc.day_ordinal
+ && tm.tm_wday != pc.day_number)),
+ 7, &dayincr)
+ || INT_ADD_WRAPV ((pc.day_number - tm.tm_wday + 7) % 7,
+   dayincr, &dayincr)
+ || INT_ADD_WRAPV (dayincr, tm.tm_mday, &tm.tm_mday)))
 {
   tm.tm_isdst = -1;
   Start = mktime_z (tz, &tm);
 }
 
-  if (Start == (time_t) -1)
+  if (tm.tm_yday < 0)
 {
   if (debugging (&pc))
 dbg_printf (_("error: day '%s' "
@@ -2156,8 +2155,9 @@ parse_datetime_body (struct timespec *result, char const *p,
   tm.tm_min = tm0.tm_min;
   tm.tm_sec = tm0.tm_sec;
   tm.tm_isdst = tm0.tm_isdst;
+  tm.tm_wday = -1;
   Start = mktime_z (tz, &tm);
-  if (Start == (time_t) -1)
+  if (tm.tm_wday < 0)
 {
   if (debugging (&pc))
 dbg_printf (_("error: adding relative date resulted "
diff --git a/tests/test-parse-datetime.c b/tests/test-parse-datetime.c
index 059c810cd1..1e7955bc96 100644
--- a/tests/test-parse-datetime.c
+++ b/tests/test-parse-datetime.c
@@ -398,6 +398,14 @@ main (_GL_UNUSED int argc, char **argv)
   ASSERT (result.tv_sec == thur2 + ((i + 3) % 7 - 7) * 24 * 3600);
 }
 
+  p = "1970-12-31T23:59:59+00:00 - 1 year";  /* Bug#50115 */
+  now.tv_sec = -1;
+  now.tv_nsec = 0;
+  ASSERT (parse_datetime (&result, p, &now));
+  LOG (p, now, result);
+  ASSERT (result.tv_sec == now.tv_sec
+  && result.tv_nsec == now.tv_nsec);
+
   p = "THURSDAY UTC+00";  /* The epoch was on Thursday.  */
   now.tv_sec = 0;
   now.tv_nsec = 0;
-- 
2.32.0
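
The idiom used in the patch above can be sketched in isolation with plain mktime. This is a simplified illustration rather than the Gnulib code: checked_mktime is a hypothetical wrapper, and it assumes (as the patch does) that a failing mktime leaves tm_wday unmodified, which holds on the hosts I know of but is worth checking on unusual platforms.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical wrapper, not part of Gnulib: convert *TM to a time_t
   and store it in *RESULT.  Return false only if mktime failed.
   A return value of (time_t) -1 is not treated as failure, since it
   can be the valid time one second before the epoch.  */
static bool
checked_mktime (struct tm *tm, time_t *result)
{
  assert (tm && result);

  /* Plant a sentinel that a successful mktime must overwrite with a
     weekday in the range 0..6.  */
  tm->tm_wday = -1;
  time_t t = mktime (tm);

  /* Assumption: on failure mktime leaves tm_wday untouched.  */
  if (tm->tm_wday < 0)
    return false;

  *result = t;
  return true;
}
```

With TZ set to UTC0, passing the broken-down time 1969-12-31 23:59:59 should succeed and store -1 on hosts where time_t is signed, which is the case that the old "Start == (time_t) -1" test misreported as an error.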

From cf6c84989968c5081c683bbef77825fc35e03c9d Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Sat, 5 Feb 2022 11:08:45 -0800
Subject: [PATCH 1/2] build: update gnulib submodule to latest

---
 gnulib | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gnulib b/gnulib
index ff208d546..aa0d1e780 160000
--- a/gnulib
+++ b/gnulib
@@ -1 +1 @@
-Subproject commit ff208d546a26fee39a0191297c11560da74b5dee
+Subproject commit aa0d1e7800903f2d75432d78aa64a0e9770e83f2
-- 
2.32.0

From 8a3dedfef9479c53cd9016139ce00d58a6006ba2 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Sat, 5 Feb 2022 13:46:44 -0800
Subject: [PATCH 2/2] date: test against bug#50115

* tests/misc/date.pl: Add test.
---
 tests/misc/date.pl

[PATCH] userspec: help fix GNU ‘id’ incompatibility

2022-02-04 Thread Paul Eggert
* lib/userspec.c (parse_with_separator):
Don’t set *username to a numeric string that is not a user name,
and similarly for *groupname.  Needed to fix Bug#53631.
---
 ChangeLog  | 7 +++
 lib/userspec.c | 2 ++
 2 files changed, 9 insertions(+)

diff --git a/ChangeLog b/ChangeLog
index a202950bd9..5445802ea2 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,10 @@
+2022-02-04  Paul Eggert  
+
+   userspec: help fix GNU ‘id’ incompatibility
+   * lib/userspec.c (parse_with_separator):
+   Don’t set *username to a numeric string that is not a user name,
+   and similarly for *groupname.  Needed to fix Bug#53631.
+
 2022-01-30  Pádraig Brady  
 
argmatch: add variants that only match full argument
diff --git a/lib/userspec.c b/lib/userspec.c
index 99ac93bb53..f05ccbe635 100644
--- a/lib/userspec.c
+++ b/lib/userspec.c
@@ -161,6 +161,7 @@ parse_with_separator (char const *spec, char const *separator,
   pwd = (*u == '+' ? NULL : getpwnam (u));
   if (pwd == NULL)
 {
+  username = NULL;
   bool use_login_group = (separator != NULL && g == NULL);
   if (use_login_group)
 {
@@ -202,6 +203,7 @@ parse_with_separator (char const *spec, char const *separator,
   grp = (*g == '+' ? NULL : getgrnam (g));
   if (grp == NULL)
 {
+  groupname = NULL;
   unsigned long int tmp;
   if (xstrtoul (g, NULL, 10, &tmp, "") == LONGINT_OK
   && tmp <= MAXGID && (gid_t) tmp != (gid_t) -1)
-- 
2.34.1
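
The intent of the change can be sketched outside of userspec.c as follows. This is a simplified stand-in, not the Gnulib function: resolve_user is a hypothetical name, it handles only the user half of a spec, and it uses plain strtoul where Gnulib uses xstrtoul. The point it illustrates is that the name is reported only when the spec matched a real user name, so a purely numeric spec yields an ID with a null name.

```c
#include <assert.h>
#include <errno.h>
#include <pwd.h>
#include <stdbool.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper, not the Gnulib API: resolve SPEC to a user ID.
   On success store the ID in *UID, and set *NAME to SPEC only if SPEC
   matched a user name; a numeric SPEC leaves *NAME null.  */
static bool
resolve_user (char const *spec, uid_t *uid, char const **name)
{
  assert (spec && uid && name);
  *name = NULL;

  /* A leading '+' forces numeric interpretation, as in the patch.  */
  struct passwd *pwd = (*spec == '+' ? NULL : getpwnam (spec));
  if (pwd)
    {
      *uid = pwd->pw_uid;
      *name = spec;
      return true;
    }

  /* Not a known name: accept it only if it is a decimal number that
     fits in uid_t and is not the reserved value (uid_t) -1.  */
  char *end;
  errno = 0;
  unsigned long int n = strtoul (spec, &end, 10);
  if (errno != 0 || end == spec || *end != '\0')
    return false;
  if ((uid_t) n != n || (uid_t) n == (uid_t) -1)
    return false;

  *uid = n;
  return true;
}
```

A caller such as 'id' can then distinguish "a name that exists" from "a number that was merely parsed" by checking whether the returned name is null.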




Fwd: bug#50745: coreutils-8.32 gnulib test results on hppa HP-UX 11.11

2022-01-28 Thread Paul Eggert

I'm forwarding this bug report to bug-gnulib from here:

https://bugs.gnu.org/50745

I don't have time to work on HP-UX 11.11 porting glitches right now, but 
perhaps someone else does.

--- Begin Message ---
See attached.

Dave

-- 
John David Anglin  dave.ang...@bell.net



gnulib-testsuite-hppa-hpux11.11.log.gz
Description: GNU Zip compressed data
--- End Message ---

