Re: [HACKERS] uninitialized values in revised prepared xact code

Heikki Linnakangas Fri, 24 Oct 2014 07:15:02 -0700

On 10/11/2014 10:20 PM, Bruce Momjian wrote:


Uh, was this fixed.  I see a cleanup commit for this C file, but this
report is from June:

        commit 07a4a93a0e35a778c77ffbbbc18de29e859e18f0
        Author: Heikki Linnakangas <[email protected]>
        Date:   Fri May 16 09:47:50 2014 +0300
        
            Initialize tsId and dbId fields in WAL record of COMMIT PREPARED.
        
            Commit dd428c79 added dbId and tsId to the xl_xact_commit struct 
but missed
            that prepared transaction commits reuse that struct. Fix that.
        
            Because those fields were left unitialized, replaying a commit 
prepared WAL
            record in a hot standby node would fail to remove the relcache init 
file.
            That can lead to "could not open file" errors on the standby. 
Relcache init
            file only needs to be removed when a system table/index is 
rewritten in the
            transaction using two phase commit, so that should be rare in 
practice. In
            HEAD, the incorrect dbId/tsId values are also used for filtering in 
logical
            replication code, causing the transaction to always be filtered out.
        
            Analysis and fix by Andres Freund. Backpatch to 9.0 where hot 
standby was
            introduced.


No, that was a different issue.

(more below)

On Mon, Jun 30, 2014 at 11:58:59AM +0200, Andres Freund wrote:

Hi,

I've just rerun valgrind for the first time in a while and saw the
following splat. My guess is it exists since bb38fb0d43c, but that's
blindly guessing:

==2049== Use of uninitialised value of size 8
==2049==    at 0x4FE66D: EndPrepare (twophase.c:1063)
==2049==    by 0x4F231B: PrepareTransaction (xact.c:2217)
==2049==    by 0x4F2A38: CommitTransactionCommand (xact.c:2676)
==2049==    by 0x79013E: finish_xact_command (postgres.c:2408)
==2049==    by 0x78DE97: exec_simple_query (postgres.c:1062)
==2049==    by 0x791FDD: PostgresMain (postgres.c:4010)
==2049==    by 0x71B13B: BackendRun (postmaster.c:4113)
==2049==    by 0x71A86D: BackendStartup (postmaster.c:3787)
==2049==    by 0x71714C: ServerLoop (postmaster.c:1566)
==2049==    by 0x716804: PostmasterMain (postmaster.c:1219)
==2049==    by 0x679405: main (main.c:219)
==2049==  Uninitialised value was created by a stack allocation
==2049==    at 0x4FE16C: StartPrepare (twophase.c:942)
==2049==
==2049== Syscall param write(buf) points to uninitialised byte(s)
==2049==    at 0x5C69640: __write_nocancel (syscall-template.S:81)
==2049==    by 0x4FE6AE: EndPrepare (twophase.c:1064)
==2049==    by 0x4F231B: PrepareTransaction (xact.c:2217)
==2049==    by 0x4F2A38: CommitTransactionCommand (xact.c:2676)
==2049==    by 0x79013E: finish_xact_command (postgres.c:2408)
==2049==    by 0x78DE97: exec_simple_query (postgres.c:1062)
==2049==    by 0x791FDD: PostgresMain (postgres.c:4010)
==2049==    by 0x71B13B: BackendRun (postmaster.c:4113)
==2049==    by 0x71A86D: BackendStartup (postmaster.c:3787)
==2049==    by 0x71714C: ServerLoop (postmaster.c:1566)
==2049==    by 0x716804: PostmasterMain (postmaster.c:1219)
==2049==    by 0x679405: main (main.c:219)
==2049==  Address 0x64694ed is 1,389 bytes inside a block of size 8,192 alloc'd
==2049==    at 0x4C27B8F: malloc (vg_replace_malloc.c:298)
==2049==    by 0x8E766E: AllocSetAlloc (aset.c:853)
==2049==    by 0x8E8E04: MemoryContextAllocZero (mcxt.c:627)
==2049==    by 0x8A54D3: AtStart_Inval (inval.c:704)
==2049==    by 0x4F1DFC: StartTransaction (xact.c:1841)
==2049==    by 0x4F28D1: StartTransactionCommand (xact.c:2529)
==2049==    by 0x7900A7: start_xact_command (postgres.c:2383)
==2049==    by 0x78DAF4: exec_simple_query (postgres.c:860)
==2049==    by 0x791FDD: PostgresMain (postgres.c:4010)
==2049==    by 0x71B13B: BackendRun (postmaster.c:4113)
==2049==    by 0x71A86D: BackendStartup (postmaster.c:3787)
==2049==    by 0x71714C: ServerLoop (postmaster.c:1566)
==2049==  Uninitialised value was created by a stack allocation
==2049==    at 0x4FE16C: StartPrepare (twophase.c:942)

It's probably just padding - twophase.c:1063 is the CRC32 computation of
the record data.


Yeah. The padding bytes in TwoPhaseFileHeader were not initialized.

That's simple enough to fix, but when I run valgrind, I get a lot wholebunch of similar messages. A few are from pgstat: the padding bytes inthe pgstat messages are not initialized. One comes fromwrite_relcache_init_file(); again I believe it's padding bytes beinguninitialized (in FormData_pg_attribute). And one from the XLogInsertfrom heap_insert; there's an uninitialized padding byte inxl_heap_insert. And so forth.. Is it worthwhile to hunt down all ofthese? If there aren't many more than these, it probably is worth it,but I fear this might be an endless effort. Have we been clean of thesewarnings at any point in the past?


- Heikki



--
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] uninitialized values in revised prepared xact code

Reply via email to