On 10/11/2014 10:20 PM, Bruce Momjian wrote:

Uh, was this fixed?  I see a cleanup commit for this C file, but this
report is from June:

        commit 07a4a93a0e35a778c77ffbbbc18de29e859e18f0
        Author: Heikki Linnakangas <heikki.linnakan...@iki.fi>
        Date:   Fri May 16 09:47:50 2014 +0300
        
            Initialize tsId and dbId fields in WAL record of COMMIT PREPARED.
        
            Commit dd428c79 added dbId and tsId to the xl_xact_commit struct but missed
            that prepared transaction commits reuse that struct. Fix that.
        
            Because those fields were left unitialized, replaying a commit prepared WAL
            record in a hot standby node would fail to remove the relcache init file.
            That can lead to "could not open file" errors on the standby. Relcache init
            file only needs to be removed when a system table/index is rewritten in the
            transaction using two phase commit, so that should be rare in practice. In
            HEAD, the incorrect dbId/tsId values are also used for filtering in logical
            replication code, causing the transaction to always be filtered out.
        
            Analysis and fix by Andres Freund. Backpatch to 9.0 where hot standby was
            introduced.

No, that was a different issue.

(more below)

On Mon, Jun 30, 2014 at 11:58:59AM +0200, Andres Freund wrote:
Hi,

I've just rerun valgrind for the first time in a while and saw the
following splat. My guess is it exists since bb38fb0d43c, but that's
blindly guessing:

==2049== Use of uninitialised value of size 8
==2049==    at 0x4FE66D: EndPrepare (twophase.c:1063)
==2049==    by 0x4F231B: PrepareTransaction (xact.c:2217)
==2049==    by 0x4F2A38: CommitTransactionCommand (xact.c:2676)
==2049==    by 0x79013E: finish_xact_command (postgres.c:2408)
==2049==    by 0x78DE97: exec_simple_query (postgres.c:1062)
==2049==    by 0x791FDD: PostgresMain (postgres.c:4010)
==2049==    by 0x71B13B: BackendRun (postmaster.c:4113)
==2049==    by 0x71A86D: BackendStartup (postmaster.c:3787)
==2049==    by 0x71714C: ServerLoop (postmaster.c:1566)
==2049==    by 0x716804: PostmasterMain (postmaster.c:1219)
==2049==    by 0x679405: main (main.c:219)
==2049==  Uninitialised value was created by a stack allocation
==2049==    at 0x4FE16C: StartPrepare (twophase.c:942)
==2049==
==2049== Syscall param write(buf) points to uninitialised byte(s)
==2049==    at 0x5C69640: __write_nocancel (syscall-template.S:81)
==2049==    by 0x4FE6AE: EndPrepare (twophase.c:1064)
==2049==    by 0x4F231B: PrepareTransaction (xact.c:2217)
==2049==    by 0x4F2A38: CommitTransactionCommand (xact.c:2676)
==2049==    by 0x79013E: finish_xact_command (postgres.c:2408)
==2049==    by 0x78DE97: exec_simple_query (postgres.c:1062)
==2049==    by 0x791FDD: PostgresMain (postgres.c:4010)
==2049==    by 0x71B13B: BackendRun (postmaster.c:4113)
==2049==    by 0x71A86D: BackendStartup (postmaster.c:3787)
==2049==    by 0x71714C: ServerLoop (postmaster.c:1566)
==2049==    by 0x716804: PostmasterMain (postmaster.c:1219)
==2049==    by 0x679405: main (main.c:219)
==2049==  Address 0x64694ed is 1,389 bytes inside a block of size 8,192 alloc'd
==2049==    at 0x4C27B8F: malloc (vg_replace_malloc.c:298)
==2049==    by 0x8E766E: AllocSetAlloc (aset.c:853)
==2049==    by 0x8E8E04: MemoryContextAllocZero (mcxt.c:627)
==2049==    by 0x8A54D3: AtStart_Inval (inval.c:704)
==2049==    by 0x4F1DFC: StartTransaction (xact.c:1841)
==2049==    by 0x4F28D1: StartTransactionCommand (xact.c:2529)
==2049==    by 0x7900A7: start_xact_command (postgres.c:2383)
==2049==    by 0x78DAF4: exec_simple_query (postgres.c:860)
==2049==    by 0x791FDD: PostgresMain (postgres.c:4010)
==2049==    by 0x71B13B: BackendRun (postmaster.c:4113)
==2049==    by 0x71A86D: BackendStartup (postmaster.c:3787)
==2049==    by 0x71714C: ServerLoop (postmaster.c:1566)
==2049==  Uninitialised value was created by a stack allocation
==2049==    at 0x4FE16C: StartPrepare (twophase.c:942)

It's probably just padding - twophase.c:1063 is the CRC32 computation of
the record data.

Yeah. The padding bytes in TwoPhaseFileHeader were not initialized.

That's simple enough to fix, but when I run valgrind, I get a whole bunch of similar messages. A few are from pgstat: the padding bytes in the pgstat messages are not initialized. One comes from write_relcache_init_file(); again I believe it's padding bytes being uninitialized (in FormData_pg_attribute). And one comes from the XLogInsert in heap_insert; there's an uninitialized padding byte in xl_heap_insert. And so forth. Is it worthwhile to hunt down all of these? If there aren't many more than these, it probably is worth it, but I fear this might be an endless effort. Have we been clean of these warnings at any point in the past?
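
For illustration only, here is a minimal sketch of the kind of fix being discussed: zero the whole struct, padding included, before filling in the fields, so that a checksum or write() over the raw bytes never touches uninitialized memory. DemoHeader, demo_header_init() and demo_checksum() are hypothetical names made up for this sketch; they are not the real TwoPhaseFileHeader or the CRC32 code in twophase.c, and the checksum is a simple FNV-1a stand-in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for an on-disk header struct (NOT the real
 * TwoPhaseFileHeader). On common ABIs the compiler inserts padding
 * after 'flag' and after 'nitems'.
 */
typedef struct DemoHeader
{
    uint8_t  flag;
    uint32_t xid;
    uint16_t nitems;
} DemoHeader;

/*
 * Byte-wise FNV-1a hash, used here only as a stand-in for a CRC that is
 * computed over the raw bytes of the struct, padding included.
 */
static uint32_t
demo_checksum(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint32_t    h = 2166136261u;
    size_t      i;

    for (i = 0; i < len; i++)
    {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

/*
 * Zero the entire struct first, so the padding bytes have a defined
 * value, then assign the individual fields.
 */
static void
demo_header_init(DemoHeader *hdr, uint8_t flag, uint32_t xid, uint16_t nitems)
{
    memset(hdr, 0, sizeof(DemoHeader));
    hdr->flag = flag;
    hdr->xid = xid;
    hdr->nitems = nitems;
}
```

Strictly speaking, the C standard leaves padding bytes unspecified after a member store, so this relies on how compilers behave in practice rather than on a language guarantee. The sketch assumes that is acceptable for silencing valgrind on the checksum and write paths.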

- Heikki


