Re: Possible false valgrind error reports

2023-02-21 Thread Tom Lane
Karina Litskevich  writes:
> Thank you, I moved comment changes to 0001 and rewrote the fix through Min().

Looks good.  I pushed it after a little more fiddling with the comments.
Thanks for the report and patch!

regards, tom lane




Re: Possible false valgrind error reports

2023-02-17 Thread Karina Litskevich
Thank you, I moved comment changes to 0001 and rewrote the fix through Min().

> The first hunk in 0001 doesn't seem quite right yet:
>
>   * old allocation.
>   */
>  #ifdef USE_VALGRIND
> -if (oldsize > chunk->requested_size)
> +if (size > chunk->requested_size && oldsize > chunk->requested_size)
>  VALGRIND_MAKE_MEM_UNDEFINED((char *) pointer + 
> chunk->requested_size,
>  oldsize - chunk->requested_size);
>  #endif
>
> If size < oldsize, aren't we still doing the wrong thing?  Seems like
> maybe it has to be like

If size > chunk->requested_size than chksize >= oldsize and so we can mark this
memory without worries. Region from size to chksize will be marked NOACCESS
later anyway:

/* Ensure any padding bytes are marked NOACCESS. */
VALGRIND_MAKE_MEM_NOACCESS((char *) pointer + size, chksize - size);

I agree that it's not obvious, so I changed the first hunk like this:

- if (oldsize > chunk->requested_size)
+ if (Min(size, oldsize) > chunk->requested_size)
  VALGRIND_MAKE_MEM_UNDEFINED((char *) pointer + chunk->requested_size,
- oldsize - chunk->requested_size);
+ Min(size, oldsize) - chunk->requested_size);

Any ideas on how to make this place easier to understand and comment above it
concise and clear are welcome.

There is another thing about this version. New line
+ Min(size, oldsize) - chunk->requested_size);
is longer than 80 symbols and I don't know what's the best way to avoid this
without making it look weird.

I also noticed that if RANDOMIZE_ALLOCATED_MEMORY is defined then
randomize_mem()
have already marked this memory UNDEFINED. So we only "may need to adjust
trailing bytes" if RANDOMIZE_ALLOCATED_MEMORY isn't defined. I reflected it in
v2 of 0001 too.
From 009ef9bbabcd71e0d5f2b60799c0b71d21fb9767 Mon Sep 17 00:00:00 2001
From: Karina Litskevich 
Date: Fri, 17 Feb 2023 15:32:05 +0300
Subject: [PATCH v2 2/2] Change variable name in AllocSetRealloc()

---
 src/backend/utils/mmgr/aset.c | 32 +---
 1 file changed, 17 insertions(+), 15 deletions(-)

diff --git a/src/backend/utils/mmgr/aset.c b/src/backend/utils/mmgr/aset.c
index ace4907ce9..9584d50b14 100644
--- a/src/backend/utils/mmgr/aset.c
+++ b/src/backend/utils/mmgr/aset.c
@@ -1112,7 +1112,7 @@ AllocSetRealloc(void *pointer, Size size)
 	AllocBlock	block;
 	AllocSet	set;
 	MemoryChunk *chunk = PointerGetMemoryChunk(pointer);
-	Size		oldsize;
+	Size		oldchksize;
 	int			fidx;
 
 	/* Allow access to private part of chunk header. */
@@ -1140,11 +1140,11 @@ AllocSetRealloc(void *pointer, Size size)
 
 		set = block->aset;
 
-		oldsize = block->endptr - (char *) pointer;
+		oldchksize = block->endptr - (char *) pointer;
 
 #ifdef MEMORY_CONTEXT_CHECKING
 		/* Test for someone scribbling on unused space in chunk */
-		Assert(chunk->requested_size < oldsize);
+		Assert(chunk->requested_size < oldchksize);
 		if (!sentinel_ok(pointer, chunk->requested_size))
 			elog(WARNING, "detected write past chunk end in %s %p",
  set->header.name, chunk);
@@ -1197,15 +1197,15 @@ AllocSetRealloc(void *pointer, Size size)
 #else
 		/*
 		 * If this is an increase, realloc() will have left any newly-allocated
-		 * part (from oldsize to chksize) UNDEFINED, but we may need to adjust
+		 * part (from oldchksize to chksize) UNDEFINED, but we may need to adjust
 		 * trailing bytes from the old allocation (from chunk->requested_size to
-		 * oldsize) as they are marked NOACCESS.  Make sure not to mark extra
-		 * bytes in case chunk->requested_size < size < oldsize.
+		 * oldchksize) as they are marked NOACCESS.  Make sure not to mark extra
+		 * bytes in case chunk->requested_size < size < oldchksize.
 		 */
 #ifdef USE_VALGRIND
-		if (Min(size, oldsize) > chunk->requested_size)
+		if (Min(size, oldchksize) > chunk->requested_size)
 			VALGRIND_MAKE_MEM_UNDEFINED((char *) pointer + chunk->requested_size,
-		Min(size, oldsize) - chunk->requested_size);
+		Min(size, oldchksize) - chunk->requested_size);
 #endif
 #endif
 
@@ -1223,7 +1223,7 @@ AllocSetRealloc(void *pointer, Size size)
 		 * portion DEFINED.  Make sure not to mark memory behind the new
 		 * allocation in case it's smaller than old one.
 		 */
-		VALGRIND_MAKE_MEM_DEFINED(pointer, Min(size, oldsize));
+		VALGRIND_MAKE_MEM_DEFINED(pointer, Min(size, oldchksize));
 #endif
 
 		/* Ensure any padding bytes are marked NOACCESS. */
@@ -1248,11 +1248,11 @@ AllocSetRealloc(void *pointer, Size size)
 
 	fidx = MemoryChunkGetValue(chunk);
 	Assert(FreeListIdxIsValid(fidx));
-	oldsize = GetChunkSizeFromFreeListIdx(fidx);
+	oldchksize = GetChunkSizeFromFreeListIdx(fidx);
 
 #ifdef MEMORY_CONTEXT_CHECKING
 	/* Test for someone scribbling on unused space in chunk */
-	if (chunk->requested_size < oldsize)
+	if (chunk->requested_size < oldchksize)
 		if (!sentinel_ok(pointer, chunk->requested_size))
 			elog(WARNING, "detected write past chunk end in %s %p",
  set->header.name, 

Re: Possible false valgrind error reports

2023-02-14 Thread Tom Lane
Karina Litskevich  writes:
> In 82d0a46ea32 AllocSetRealloc() was changed to allow decreasing size of
> external chunks and give memory back to the malloc pool. Two
> VALGRIND_MAKE_MEM_UNDEFINED() calls were not changed to work properly in the
> case of decreasing size: they can mark memory behind the new allocated
> memory
> UNDEFINED. If this memory was already allocated and initialized, it's
> expected
> to be DEFINED. So it can cause false valgrind error reports. I fixed it in
> 0001 patch.

Hmm, I see the concern: adjusting the Valgrind marking of bytes beyond the
newly-realloced block is wrong because it might tromp on memory allocated
in another way.  However, I'm not sure about the details of your patch.

The first hunk in 0001 doesn't seem quite right yet:

  * old allocation.
  */
 #ifdef USE_VALGRIND
-if (oldsize > chunk->requested_size)
+if (size > chunk->requested_size && oldsize > chunk->requested_size)
 VALGRIND_MAKE_MEM_UNDEFINED((char *) pointer + 
chunk->requested_size,
 oldsize - chunk->requested_size);
 #endif

If size < oldsize, aren't we still doing the wrong thing?  Seems like
maybe it has to be like

 if (size > chunk->requested_size && oldsize > chunk->requested_size)
 VALGRIND_MAKE_MEM_UNDEFINED((char *) pointer + 
chunk->requested_size,
 Min(size, oldsize) - 
chunk->requested_size);

  * allocation; it could have been as small as one byte.  We have to be
  * conservative and just mark the entire old portion DEFINED.
  */
-VALGRIND_MAKE_MEM_DEFINED(pointer, oldsize);
+if (size >= oldsize)
+VALGRIND_MAKE_MEM_DEFINED(pointer, oldsize);
+else
+VALGRIND_MAKE_MEM_DEFINED(pointer, size);
 #endif

This is OK, though I wonder if it'd read better as

+VALGRIND_MAKE_MEM_DEFINED(pointer, Min(size, oldsize));


I've not thought hard about whether I like the variable renaming proposed
in 0002.  I do suggest though that those comment changes are an integral
part of the bug fix and hence belong in 0001.

regards, tom lane




Possible false valgrind error reports

2023-02-14 Thread Karina Litskevich
Hi hackers,

In 82d0a46ea32 AllocSetRealloc() was changed to allow decreasing size of
external chunks and give memory back to the malloc pool. Two
VALGRIND_MAKE_MEM_UNDEFINED() calls were not changed to work properly in the
case of decreasing size: they can mark memory behind the new allocated
memory
UNDEFINED. If this memory was already allocated and initialized, it's
expected
to be DEFINED. So it can cause false valgrind error reports. I fixed it in
0001
patch.

Also, it took me a while to understand what's going on there, so in 0002
patch
I tried to improve comments and renamed a variable. Its name "oldsize"
confused
me. I first thought "oldsize" and "size" represent the same parameters of
the
old and new chunk. But actually "size" is new "chunk->requested_size" and
"oldsize" is old "chksize". So I believe it's better to rename "oldsize"
into
"oldchksize".

Best regards,
Karina Litskevich
Postgres Professional: http://postgrespro.com/
From 9556b7918e6abccb2dc19f20dbf572d41cd35cb4 Mon Sep 17 00:00:00 2001
From: Karina Litskevich 
Date: Tue, 14 Feb 2023 17:13:17 +0300
Subject: [PATCH v1 1/2] Fix VALGRIND_MAKE_MEM_DEFINED() calls

---
 src/backend/utils/mmgr/aset.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/src/backend/utils/mmgr/aset.c b/src/backend/utils/mmgr/aset.c
index 740729b5d0..1ff0bb83cb 100644
--- a/src/backend/utils/mmgr/aset.c
+++ b/src/backend/utils/mmgr/aset.c
@@ -1199,7 +1199,7 @@ AllocSetRealloc(void *pointer, Size size)
 		 * old allocation.
 		 */
 #ifdef USE_VALGRIND
-		if (oldsize > chunk->requested_size)
+		if (size > chunk->requested_size && oldsize > chunk->requested_size)
 			VALGRIND_MAKE_MEM_UNDEFINED((char *) pointer + chunk->requested_size,
 		oldsize - chunk->requested_size);
 #endif
@@ -1215,7 +1215,10 @@ AllocSetRealloc(void *pointer, Size size)
 		 * allocation; it could have been as small as one byte.  We have to be
 		 * conservative and just mark the entire old portion DEFINED.
 		 */
-		VALGRIND_MAKE_MEM_DEFINED(pointer, oldsize);
+		if (size >= oldsize)
+			VALGRIND_MAKE_MEM_DEFINED(pointer, oldsize);
+		else
+			VALGRIND_MAKE_MEM_DEFINED(pointer, size);
 #endif
 
 		/* Ensure any padding bytes are marked NOACCESS. */
-- 
2.25.1

From 7cdaee9caaf96564237e4cfa8279699a36942624 Mon Sep 17 00:00:00 2001
From: Karina Litskevich 
Date: Tue, 14 Feb 2023 17:18:30 +0300
Subject: [PATCH v1 2/2] Better comments and variable name in AllocSetRealloc()

---
 src/backend/utils/mmgr/aset.c | 46 ---
 1 file changed, 27 insertions(+), 19 deletions(-)

diff --git a/src/backend/utils/mmgr/aset.c b/src/backend/utils/mmgr/aset.c
index 1ff0bb83cb..35007750a9 100644
--- a/src/backend/utils/mmgr/aset.c
+++ b/src/backend/utils/mmgr/aset.c
@@ -1112,7 +1112,7 @@ AllocSetRealloc(void *pointer, Size size)
 	AllocBlock	block;
 	AllocSet	set;
 	MemoryChunk *chunk = PointerGetMemoryChunk(pointer);
-	Size		oldsize;
+	Size		oldchksize;
 	int			fidx;
 
 	/* Allow access to private part of chunk header. */
@@ -1140,11 +1140,11 @@ AllocSetRealloc(void *pointer, Size size)
 
 		set = block->aset;
 
-		oldsize = block->endptr - (char *) pointer;
+		oldchksize = block->endptr - (char *) pointer;
 
 #ifdef MEMORY_CONTEXT_CHECKING
 		/* Test for someone scribbling on unused space in chunk */
-		Assert(chunk->requested_size < oldsize);
+		Assert(chunk->requested_size < oldchksize);
 		if (!sentinel_ok(pointer, chunk->requested_size))
 			elog(WARNING, "detected write past chunk end in %s %p",
  set->header.name, chunk);
@@ -1194,14 +1194,17 @@ AllocSetRealloc(void *pointer, Size size)
 #endif
 
 		/*
-		 * realloc() (or randomize_mem()) will have left any newly-allocated
-		 * part UNDEFINED, but we may need to adjust trailing bytes from the
-		 * old allocation.
+		 * If this is an increase, realloc() (or randomize_mem()) will have left
+		 * any newly-allocated part UNDEFINED, but we may need to adjust
+		 * trailing bytes from the old allocation as they are marked NOACCESS.
 		 */
 #ifdef USE_VALGRIND
-		if (size > chunk->requested_size && oldsize > chunk->requested_size)
+		if (size > chunk->requested_size && oldchksize > chunk->requested_size)
+		{
+			Assert(chksize >= oldchksize);
 			VALGRIND_MAKE_MEM_UNDEFINED((char *) pointer + chunk->requested_size,
-		oldsize - chunk->requested_size);
+		oldchksize - chunk->requested_size);
+		}
 #endif
 
 		chunk->requested_size = size;
@@ -1211,12 +1214,15 @@ AllocSetRealloc(void *pointer, Size size)
 #else			/* !MEMORY_CONTEXT_CHECKING */
 
 		/*
-		 * We don't know how much of the old chunk size was the actual
-		 * allocation; it could have been as small as one byte.  We have to be
-		 * conservative and just mark the entire old portion DEFINED

Re: valgrind error

2020-06-06 Thread Tom Lane
I wrote:
> Noah Misch  writes:
>> Apparently, valgrind-3.15.0 doesn't complain about undefined input
>> to _mm_crc32_u* functions.  We should not be surprised if Valgrind gains the
>> features necessary to complain about the other implementations.

> Perhaps it already has ... I wonder if anyone's tried this on ARMv8
> lately.

I installed Fedora 32/aarch64 on a Raspberry Pi 3B+, and can report that
valgrind 3.16.0 is just as blind to this problem in pg_comp_crc32c_armv8
as it is in pg_comp_crc32c_sse42.  Seems odd, but there you have it.

(There are some other issues, but they seem fit for separate threads.)

regards, tom lane




Re: valgrind error

2020-06-05 Thread Tom Lane
Noah Misch  writes:
> On Fri, Jun 05, 2020 at 12:17:54PM -0400, Tom Lane wrote:
>> as you have it, I'd prefer to use
>> -   fun:pg_comp_crc32c
>> +   fun:pg_comp_crc32c_sb8
>> which precisely matches what 4f700bc did.  The other way seems like
>> it's giving a free pass to problems that could lurk in unrelated CRC
>> implementations.

> The undefined data is in the CRC input, namely the padding bytes in xl_*
> structs.

Oh, I see.  Objection withdrawn.

> Apparently, valgrind-3.15.0 doesn't complain about undefined input
> to _mm_crc32_u* functions.  We should not be surprised if Valgrind gains the
> features necessary to complain about the other implementations.

Perhaps it already has ... I wonder if anyone's tried this on ARMv8
lately.

regards, tom lane




Re: valgrind error

2020-06-05 Thread Noah Misch
On Fri, Jun 05, 2020 at 12:17:54PM -0400, Tom Lane wrote:
> Noah Misch  writes:
> > I can reproduce this on a 2017-vintage CPU with ./configure
> > ... USE_SLICING_BY_8_CRC32C=1 and then running "make installcheck-parallel"
> > under valgrind-3.15.0 (as packaged by RHEL 7.8).  valgrind.supp has a
> > suppression for CRC calculations, but it didn't get the memo when commit
> > 4f700bc renamed the function.  The attached patch fixes the suppression.
> 
> I can also reproduce this, on RHEL 8.2 which likewise has valgrind-3.15.0,
> using the same configuration to force use of that CRC function.  I concur
> with your diagnosis that this is just a missed update of the pre-existing
> suppression rule.  However, rather than
> 
> -   fun:pg_comp_crc32c
> +   fun:pg_comp_crc32c*
> 
> as you have it, I'd prefer to use
> 
> -   fun:pg_comp_crc32c
> +   fun:pg_comp_crc32c_sb8
> 
> which precisely matches what 4f700bc did.  The other way seems like
> it's giving a free pass to problems that could lurk in unrelated CRC
> implementations.

The undefined data is in the CRC input, namely the padding bytes in xl_*
structs.  Apparently, valgrind-3.15.0 doesn't complain about undefined input
to _mm_crc32_u* functions.  We should not be surprised if Valgrind gains the
features necessary to complain about the other implementations.

Most COMP_CRC32C callers don't have a suppression, so Valgrind still studies
each CRC implementation via those other callers.




Re: valgrind error

2020-06-05 Thread Tom Lane
Noah Misch  writes:
> I can reproduce this on a 2017-vintage CPU with ./configure
> ... USE_SLICING_BY_8_CRC32C=1 and then running "make installcheck-parallel"
> under valgrind-3.15.0 (as packaged by RHEL 7.8).  valgrind.supp has a
> suppression for CRC calculations, but it didn't get the memo when commit
> 4f700bc renamed the function.  The attached patch fixes the suppression.

I can also reproduce this, on RHEL 8.2 which likewise has valgrind-3.15.0,
using the same configuration to force use of that CRC function.  I concur
with your diagnosis that this is just a missed update of the pre-existing
suppression rule.  However, rather than

-   fun:pg_comp_crc32c
+   fun:pg_comp_crc32c*

as you have it, I'd prefer to use

-   fun:pg_comp_crc32c
+   fun:pg_comp_crc32c_sb8

which precisely matches what 4f700bc did.  The other way seems like
it's giving a free pass to problems that could lurk in unrelated CRC
implementations.

regards, tom lane




Re: valgrind error

2020-06-05 Thread Noah Misch
On Sun, May 10, 2020 at 09:29:05AM -0400, Andrew Dunstan wrote:
> On 4/18/20 9:15 AM, Andrew Dunstan wrote:
> > I was just trying to revive lousyjack, my valgrind buildfarm animal
> > which has been offline for 12 days, after having upgraded the machine
> > (fedora 31, gcc 9.3.1, valgrind 3.15) and noticed lots of errors like this:

> > {
> >    
> >    Memcheck:Value8
> >    fun:pg_comp_crc32c_sb8
> >    fun:XLogRecordAssemble
> >    fun:XLogInsert
> >    fun:LogCurrentRunningXacts
> >    fun:LogStandbySnapshot
> >    fun:CreateCheckPoint
> >    fun:CheckpointerMain
> >    fun:AuxiliaryProcessMain
> >    fun:StartChildProcess
> >    fun:reaper
> >    obj:/usr/lib64/libpthread-2.30.so
> >    fun:select
> >    fun:ServerLoop
> >    fun:PostmasterMain
> >    fun:main
> > }

> After many hours of testing I have a culprit for this. The error appears
> with valgrind 3.15.0  with everything else held constant. 3.14.0  does
> not produce the problem.

I suspect 3.15.0 is just better at tracking the uninitialized data.  A
more-remote possibility is valgrind-3.14.0 emulating sse42.  That would make
pg_crc32c_sse42_available() return true, avoiding the pg_comp_crc32c_sb8().

> andrew@freddo:bf (master)*$ lscpu
...
> Flags:   fpu vme de pse tsc msr pae mce cx8 apic sep mtrr
> pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext
> fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl
> nonstop_tsc cpuid extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy
> svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs
> skinit wdt hw_pstate vmmcall npt lbrv svm_lock nrip_save
> 
> 
> I did not manage to reproduce this anywhere else, tried on various
> physical, Virtualbox and Docker instances.

I can reproduce this on a 2017-vintage CPU with ./configure
... USE_SLICING_BY_8_CRC32C=1 and then running "make installcheck-parallel"
under valgrind-3.15.0 (as packaged by RHEL 7.8).  valgrind.supp has a
suppression for CRC calculations, but it didn't get the memo when commit
4f700bc renamed the function.  The attached patch fixes the suppression.
Author: Noah Misch 
Commit: Noah Misch 

Refresh function name in CRC-associated Valgrind suppressions.

Back-patch to 9.5, where commit 4f700bcd20c087f60346cb8aefd0e269be8e2157
first appeared.

Reviewed by FIXME.  Reported by Andrew Dunstan.

Discussion: 
https://postgr.es/m/4dfabec2-a3ad-0546-2d62-f816c97ed...@2ndquadrant.com

diff --git a/src/tools/valgrind.supp b/src/tools/valgrind.supp
index ec47a22..acdb620 100644
--- a/src/tools/valgrind.supp
+++ b/src/tools/valgrind.supp
@@ -45,7 +45,7 @@
padding_XLogRecData_CRC
Memcheck:Value8
 
-   fun:pg_comp_crc32c
+   fun:pg_comp_crc32c*
fun:XLogRecordAssemble
 }
 
@@ -89,7 +89,7 @@
 {
padding_twophase_CRC
Memcheck:Value8
-   fun:pg_comp_crc32c
+   fun:pg_comp_crc32c*
fun:EndPrepare
 }
 


Re: valgrind error

2020-05-10 Thread Andrew Dunstan


On 4/18/20 9:15 AM, Andrew Dunstan wrote:
> I was just trying to revive lousyjack, my valgrind buildfarm animal
> which has been offline for 12 days, after having upgraded the machine
> (fedora 31, gcc 9.3.1, valgrind 3.15) and noticed lots of errors like this:
>
>
> 2020-04-17 19:26:03.483 EDT [63741:3] pg_regress LOG:  statement: CREATE
> DATABASE "regression" TEMPLATE=template0
> ==63717== VALGRINDERROR-BEGIN
> ==63717== Use of uninitialised value of size 8
> ==63717==    at 0xAC5BB5: pg_comp_crc32c_sb8 (pg_crc32c_sb8.c:82)
> ==63717==    by 0x55A98B: XLogRecordAssemble (xloginsert.c:785)
> ==63717==    by 0x55A268: XLogInsert (xloginsert.c:461)
> ==63717==    by 0x8BC9E0: LogCurrentRunningXacts (standby.c:1005)
> ==63717==    by 0x8BC8F9: LogStandbySnapshot (standby.c:961)
> ==63717==    by 0x550CB3: CreateCheckPoint (xlog.c:8937)
> ==63717==    by 0x82A3B2: CheckpointerMain (checkpointer.c:441)
> ==63717==    by 0x56347D: AuxiliaryProcessMain (bootstrap.c:453)
> ==63717==    by 0x83CA18: StartChildProcess (postmaster.c:5474)
> ==63717==    by 0x83A120: reaper (postmaster.c:3045)
> ==63717==    by 0x4874B1F: ??? (in /usr/lib64/libpthread-2.30.so)
> ==63717==    by 0x5056F29: select (in /usr/lib64/libc-2.30.so)
> ==63717==    by 0x8380A0: ServerLoop (postmaster.c:1691)
> ==63717==    by 0x837A1F: PostmasterMain (postmaster.c:1400)
> ==63717==    by 0x74A71D: main (main.c:210)
> ==63717==  Uninitialised value was created by a stack allocation
> ==63717==    at 0x8BC942: LogCurrentRunningXacts (standby.c:984)
> ==63717==
> ==63717== VALGRINDERROR-END
> {
>    
>    Memcheck:Value8
>    fun:pg_comp_crc32c_sb8
>    fun:XLogRecordAssemble
>    fun:XLogInsert
>    fun:LogCurrentRunningXacts
>    fun:LogStandbySnapshot
>    fun:CreateCheckPoint
>    fun:CheckpointerMain
>    fun:AuxiliaryProcessMain
>    fun:StartChildProcess
>    fun:reaper
>    obj:/usr/lib64/libpthread-2.30.so
>    fun:select
>    fun:ServerLoop
>    fun:PostmasterMain
>    fun:main
> }
>
>


After many hours of testing I have a culprit for this. The error appears
with valgrind 3.15.0  with everything else held constant. 3.14.0  does
not produce the problem.  So lousyjack will be back on the air before long.


Here are the build flags it's using:


CFLAGS=-Wall -Wmissing-prototypes -Wpointer-arith
-Wdeclaration-after-statement -Werror=vla -Wendif-labels
-Wmissing-format-attribute -Wformat-security -fno-strict-aliasing
-fwrapv -fexcess-precision=standard -Wno-format-truncation
-Wno-stringop-truncatio
n -g -fno-omit-frame-pointer -O0 -fPIC
CPPFLAGS=-DUSE_VALGRIND  -DRELCACHE_FORCE_RELEASE -D_GNU_SOURCE
-I/usr/include/libxml2


and valgrind is invoked like this:


valgrind --quiet --trace-children=yes --track-origins=yes
--read-var-info=yes --num-callers=20 --leak-check=no
--gen-suppressions=all --error-limit=no
--suppressions=../pgsql/src/tools/valgrind.supp
--error-markers=VALGRINDERROR-BEGIN,VALGRINDERROR-END bin/postgres -D data-C


Does anyone see anything here that needs tweaking?


Note that this is quite an old machine:


andrew@freddo:bf (master)*$ lscpu
Architecture:    x86_64
CPU op-mode(s):  32-bit, 64-bit
Byte Order:  Little Endian
CPU(s):  2
On-line CPU(s) list: 0,1
Thread(s) per core:  1
Core(s) per socket:  2
Socket(s):   1
NUMA node(s):    1
Vendor ID:   AuthenticAMD
CPU family:  16
Model:   6
Model name:  AMD Athlon(tm) II X2 215 Processor
Stepping:    2
CPU MHz: 2700.000
CPU max MHz: 2700.
CPU min MHz: 800.
BogoMIPS:    5425.13
Virtualization:  AMD-V
L1d cache:   64K
L1i cache:   64K
L2 cache:    512K
NUMA node0 CPU(s):   0,1
Flags:   fpu vme de pse tsc msr pae mce cx8 apic sep mtrr
pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext
fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl
nonstop_tsc cpuid extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy
svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs
skinit wdt hw_pstate vmmcall npt lbrv svm_lock nrip_save


I did not manage to reproduce this anywhere else, tried on various
physical, Virtualbox and Docker instances.


cheers


andrew

-- 
Andrew Dunstanhttps://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





valgrind error

2020-04-18 Thread Andrew Dunstan


I was just trying to revive lousyjack, my valgrind buildfarm animal
which has been offline for 12 days, after having upgraded the machine
(fedora 31, gcc 9.3.1, valgrind 3.15) and noticed lots of errors like this:


2020-04-17 19:26:03.483 EDT [63741:3] pg_regress LOG:  statement: CREATE
DATABASE "regression" TEMPLATE=template0
==63717== VALGRINDERROR-BEGIN
==63717== Use of uninitialised value of size 8
==63717==    at 0xAC5BB5: pg_comp_crc32c_sb8 (pg_crc32c_sb8.c:82)
==63717==    by 0x55A98B: XLogRecordAssemble (xloginsert.c:785)
==63717==    by 0x55A268: XLogInsert (xloginsert.c:461)
==63717==    by 0x8BC9E0: LogCurrentRunningXacts (standby.c:1005)
==63717==    by 0x8BC8F9: LogStandbySnapshot (standby.c:961)
==63717==    by 0x550CB3: CreateCheckPoint (xlog.c:8937)
==63717==    by 0x82A3B2: CheckpointerMain (checkpointer.c:441)
==63717==    by 0x56347D: AuxiliaryProcessMain (bootstrap.c:453)
==63717==    by 0x83CA18: StartChildProcess (postmaster.c:5474)
==63717==    by 0x83A120: reaper (postmaster.c:3045)
==63717==    by 0x4874B1F: ??? (in /usr/lib64/libpthread-2.30.so)
==63717==    by 0x5056F29: select (in /usr/lib64/libc-2.30.so)
==63717==    by 0x8380A0: ServerLoop (postmaster.c:1691)
==63717==    by 0x837A1F: PostmasterMain (postmaster.c:1400)
==63717==    by 0x74A71D: main (main.c:210)
==63717==  Uninitialised value was created by a stack allocation
==63717==    at 0x8BC942: LogCurrentRunningXacts (standby.c:984)
==63717==
==63717== VALGRINDERROR-END
{
   
   Memcheck:Value8
   fun:pg_comp_crc32c_sb8
   fun:XLogRecordAssemble
   fun:XLogInsert
   fun:LogCurrentRunningXacts
   fun:LogStandbySnapshot
   fun:CreateCheckPoint
   fun:CheckpointerMain
   fun:AuxiliaryProcessMain
   fun:StartChildProcess
   fun:reaper
   obj:/usr/lib64/libpthread-2.30.so
   fun:select
   fun:ServerLoop
   fun:PostmasterMain
   fun:main
}


I can't see what the problem is immediately.


cheers


andrew


-- 
Andrew Dunstanhttps://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





Re: contrib/bloom Valgrind error

2019-09-28 Thread Michael Paquier
On Fri, Sep 27, 2019 at 09:02:34PM -0700, Peter Geoghegan wrote:
> My Valgrind test script reports the following error, triggered from
> within contrib/bloom's regression test suite on master as of right
> now:
> 
> I suspect that the recent commit 69f94108 is involved here, but I
> haven't confirmed that explanation myself.

It looks that the complain is about the set of custom reloptions
initialized by bloom in _PG_init(), and that lockmode is actually not
set after fetching it via AlterTableGetLockLevel(), which is exactly
what 736b84e was addressing.

By repeating the beginning of the regression tests of bloom, I am
unfortunately not able to reproduce the problem.  Here is what I used
to start the server with valgrind:
valgrind --suppressions=$PG_SOURCE/src/tools/valgrind.supp
--trace-children=yes --track-origins=yes --leak-check=full
--read-var-info=yes postgres -D $PGDATA

What kind of commands and or compilation options do you use?
--
Michael


signature.asc
Description: PGP signature


contrib/bloom Valgrind error

2019-09-27 Thread Peter Geoghegan
My Valgrind test script reports the following error, triggered from
within contrib/bloom's regression test suite on master as of right
now:

""
2019-09-27 20:53:50.910 PDT 9740 DEBUG:  building index "bloomidx" on
table "tst" serially
2019-09-27 20:53:51.049 PDT 9740 DEBUG:  CommitTransaction(1) name:
unnamed; blockState: STARTED; state: INPROGRESS, xid/subid/cid:
20721/1/2
2019-09-27 20:53:51.052 PDT 9740 DEBUG:  StartTransaction(1) name:
unnamed; blockState: DEFAULT; state: INPROGRESS, xid/subid/cid: 0/1/0
2019-09-27 20:53:51.054 PDT 9740 LOG:  statement: ALTER INDEX bloomidx
SET (length=80);
==9740== VALGRINDERROR-BEGIN
==9740== Conditional jump or move depends on uninitialised value(s)
==9740==at 0x26D400: RangeVarGetRelidExtended (namespace.c:349)
==9740==by 0x33D084: AlterTableLookupRelation (tablecmds.c:3445)
==9740==by 0x4D0BC1: ProcessUtilitySlow (utility.c:)
==9740==by 0x4D0802: standard_ProcessUtility (utility.c:927)
==9740==by 0x4D083B: ProcessUtility (utility.c:360)
==9740==by 0x4CD0A4: PortalRunUtility (pquery.c:1175)
==9740==by 0x4CDBC0: PortalRunMulti (pquery.c:1321)
==9740==by 0x4CE7F9: PortalRun (pquery.c:796)
==9740==by 0x4CA3D9: exec_simple_query (postgres.c:1231)
==9740==by 0x4CB3BD: PostgresMain (postgres.c:4256)
==9740==by 0x4547DE: BackendRun (postmaster.c:4459)
==9740==by 0x4547DE: BackendStartup (postmaster.c:4150)
==9740==by 0x4547DE: ServerLoop (postmaster.c:1718)
==9740==by 0x455E2C: PostmasterMain (postmaster.c:1391)
==9740==by 0x3B94AC: main (main.c:210)
==9740==  Uninitialised value was created by a stack allocation
==9740==at 0x402F202: _PG_init (blutils.c:54)
==9740==
==9740== VALGRINDERROR-END
{
   
   Memcheck:Cond
   fun:RangeVarGetRelidExtended
   fun:AlterTableLookupRelation
   fun:ProcessUtilitySlow
   fun:standard_ProcessUtility
   fun:ProcessUtility
   fun:PortalRunUtility
   fun:PortalRunMulti
   fun:PortalRun
   fun:exec_simple_query
   fun:PostgresMain
   fun:BackendRun
   fun:BackendStartup
   fun:ServerLoop
   fun:PostmasterMain
   fun:main
}
""

I suspect that the recent commit 69f94108 is involved here, but I
haven't confirmed that explanation myself.

-- 
Peter Geoghegan