Re: *alloc-gnu on glibc

2021-05-15 Thread Bruno Haible
Pádraig Brady wrote:
> Ah this issue is restricted to glibc's MALLOC_CHECK_ implementation.

Ah, that explains why I got different results than you did.

> I've reported the glibc bug at:
> https://sourceware.org/bugzilla/show_bug.cgi?id=27870

Thanks!

> The attached avoids this part of the test if it's being used.

Applied, with a comment tweak.

Bruno




Re: *alloc-gnu on glibc

2021-05-15 Thread Pádraig Brady

On 15/05/2021 16:49, Bruno Haible wrote:

Hi Pádraig,


On glibc-2.31-5.fc32.x86_64 I'm seeing this test failure with coreutils,
since the tests are now checking for a specific errno.
I just checked a later Fedora 34 system which has the same issue.


For me, a testdir created through

./gnulib-tool --create-testdir --dir=../testdir --single-configure \
   malloc-gnu realloc-gnu calloc-gnu reallocarray

passes all tests on Fedora 32 (glibc 2.31) and Fedora 33 (glibc 2.32).


Oh. Ah this issue is restricted to glibc's MALLOC_CHECK_ implementation.
The attached avoids this part of the test if it's being used.
I've reported the glibc bug at:
https://sourceware.org/bugzilla/show_bug.cgi?id=27870

With this applied, your command above passes all tests.

cheers,
Pádraig
>From defe5bebcd1e2bb75f15eb54ae1135afaae3c477 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= 
Date: Sat, 15 May 2021 17:50:33 +0100
Subject: [PATCH] realloc-gnu: avoid glibc MALLOC_CHECK_ issue

* tests/test-realloc-gnu.c (main): if MALLOC_CHECK_ env var
is set then don't check ENOMEM is returned from realloc().
See https://sourceware.org/bugzilla/show_bug.cgi?id=27870
Note it doesn't suffice to unsetenv() this var within the program,
as the hooks have already been set up at that stage.
---
 ChangeLog| 9 +
 tests/test-realloc-gnu.c | 4 +++-
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/ChangeLog b/ChangeLog
index b802233cf..30663cb38 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,12 @@
+2021-05-15  Pádraig Brady  
+
+	realloc-gnu: avoid glibc MALLOC_CHECK_ issue
+	* tests/test-realloc-gnu.c (main): if MALLOC_CHECK_ env var
+	is set then don't check ENOMEM is returned from realloc().
+	See https://sourceware.org/bugzilla/show_bug.cgi?id=27870
+	Note it doesn't suffice to unsetenv() this var within the program,
+	as the hooks have already been set up at that stage.
+
 2021-05-14  Paul Eggert  
 
 	c-stack: work around Solaris 11 bugs
diff --git a/tests/test-realloc-gnu.c b/tests/test-realloc-gnu.c
index 3a787ed91..5534f2ea1 100644
--- a/tests/test-realloc-gnu.c
+++ b/tests/test-realloc-gnu.c
@@ -38,7 +38,9 @@ main (int argc, char **argv)
   size_t one = argc != 12345;
   p = realloc (p, PTRDIFF_MAX + one);
   ASSERT (p == NULL);
-  ASSERT (errno == ENOMEM);
+  /* Avoid https://sourceware.org/bugzilla/show_bug.cgi?id=27870  */
+  if (!getenv ("MALLOC_CHECK_"))
+ASSERT (errno == ENOMEM);
 }
 
   free (p);
-- 
2.26.2



Re: *alloc-gnu on glibc

2021-05-15 Thread Bruno Haible
Hi Pádraig,

> On glibc-2.31-5.fc32.x86_64 I'm seeing this test failure with coreutils,
> since the tests are now checking for a specific errno.
> I just checked a later Fedora 34 system which has the same issue.

For me, a testdir created through

   ./gnulib-tool --create-testdir --dir=../testdir --single-configure \
  malloc-gnu realloc-gnu calloc-gnu reallocarray

passes all tests on Fedora 32 (glibc 2.31) and Fedora 33 (glibc 2.32).

> Specifically test-realloc-gnu is enabled in coreutils
> and it's failing as realloc is returning NULL as expected, but errno is 0.
> Specifically on glibc realloc(NULL, PTRDIFF_MAX+1) does return 0,errno=ENOMEM,
> but realloc(non_null, PTRDIFF_MAX+1) returns 0,errno=0.

POSIX 
says two things:

  1) By enumerating the error values, it implies that when realloc() fails,
 errno gets set to an error number. See
 

  2) "If there is not enough available memory, realloc() shall return a null
  pointer and set errno to [ENOMEM]."

If realloc() is failing but does not set errno, or sets errno to 0, that's a
POSIX violation. To be reported in the glibc bugzilla.

If realloc() is failing and sets errno to some value != ENOMEM, that is
technically valid (since you can debate which is the "cause" of the failure
and which errno value provides the best information). Personally I think
that
  - for an argument > PTRDIFF_MAX, EOVERFLOW is a reasonable error code,
  - EAGAIN is not a good error code, since "Out of memory" is a better error
message than "Resource temporarily unavailable".

> In any case I'd be in favor of relaxing the test,

I'm in favour of reporting the POSIX violation in the first place.

> some notes on solaris below...
> ...
> but I did check the code in isolation on a 64 bit solaris 10 system.
> ...
> The following shows that it's important to ensure the compiler
> is explicitly running in 64 bit mode, as the default is 32 bit,

My knowledge about how to invoke the compiler is here:
https://gitlab.com/ghwiki/gnow-how/-/wikis/Platforms/Configuration
If you think some of the table entries are wrong, please say so.

> Both the above returned immediately for me.

Solaris 11 apparently behaves differently than Solaris 10.

Bruno




Re: *alloc-gnu on glibc

2021-05-15 Thread Paul Eggert

On 5/15/21 9:04 AM, Pádraig Brady wrote:


I'm wondering should we be more relaxed here as,
there is only one standardised ENOMEM error from realloc,
so code doesn't have to behave differently based on what the errno is. 


We've already relaxed the errno test for reallocarray, where NetBSD 
fails with EOVERFLOW on ptrdiff_t overflow. (My own feeling is that 
NetBSD is right and glibc is wrong here; perhaps I should file a glibc 
bug report.)


The allocator modules are in some sense backwards. malloc-gnu requires 
malloc-posix because GNU guarantees are a superset of POSIX guarantees. 
And yet GNU programs often don't care about the POSIX guarantee of errno 
being set, whereas some do care about the GNU guarantee that malloc 
returns NULL only on failure.



Now I accept that error messages may be more accurate,
but one should opt into that I think. 


That opt-in was what malloc-posix was supposed to be.

I see three options:

1. Make the realloc-gnu test more generous.

2. Fix realloc-posix to check for this glibc bug and replace realloc if 
the test fails.


3. Complicate the set of Gnulib allocator modules so that apps can 
more-accurately express their needs.


Although (3) is in some sense the "best" it's probably too complicated. 
What we have is already too complicated. I'd rather have just malloc-gnu 
and not worry about non-GNU behavior (that's the Gnulib way, right?).


(2) is surely the right way to go in the long run. The glibc behavior is 
simply a bug.


In the short run I can see the temptation of (1).




Re: *alloc-gnu on glibc

2021-05-15 Thread Pádraig Brady

On glibc-2.31-5.fc32.x86_64 I'm seeing this test failure with coreutils,
since the tests are now checking for a specific errno.
I just checked a later Fedora 34 system which has the same issue.

Specifically test-realloc-gnu is enabled in coreutils
and it's failing as realloc is returning NULL as expected, but errno is 0.
Specifically on glibc realloc(NULL, PTRDIFF_MAX+1) does return 0,errno=ENOMEM,
but realloc(non_null, PTRDIFF_MAX+1) returns 0,errno=0.

I'm wondering should we be more relaxed here as,
there is only one standardised ENOMEM error from realloc,
so code doesn't have to behave differently based on what the errno is.
Especially if it's using xrealloc etc.
Now I accept that error messages may be more accurate,
but one should opt into that I think.

Should we adjust things so depending on realloc-gnu
would not have the more stringent requirement of ENOMEM
being returned from realloc(), and for projects that wanted that,
they could depend on the realloc-posix module instead?

In any case I'd be in favor of relaxing the test,
rather than replacing realloc everywhere.

some notes on solaris below...

On 14/05/2021 18:04, Bruno Haible wrote:

On Solaris 11.3 (the machine gcc211.fsffrance.org, in 64-bit mode), there
are test failures:
   FAIL: test-calloc-gnu
   FAIL: test-malloc-gnu
   FAIL: test-realloc-gnu
   FAIL: test-reallocarray
Each of these tests takes about 30 seconds before failing, and is not
immediately reactive to Ctrl-C.


These EAGAIN failure modes sound like the value passed to realloc() etc.
is too small to ensure ENOMEM was return immediately.
I've not looked into how the compiler for the m4 test is invoked,
but I did check the code in isolation on a 64 bit solaris 10 system.
The following shows that it's important to ensure the compiler
is explicitly running in 64 bit mode, as the default is 32 bit,
resulting in 32 bit longs and hence too small of values passed in:

> gcc malloc-m4.c
> ./a.out
PTRDIFF_MAX = 2147483647
p=20f90 errno=0
p=0 errno=12
p=0 errno=12

> gcc -m64 malloc-m4.c
> ./a.out
PTRDIFF_MAX = 9223372036854775807
p=0 errno=12
p=0 errno=12
p=0 errno=12

Both the above returned immediately for me.
Though I also notice the existing note in malloc.m4 on this
check being too dangerous to run in general due to this issue.

cheers,
Pádraig