On Sat, Jul 16, 2016 at 10:08 AM, Andres Freund <and...@anarazel.de> wrote:
> On 2016-07-14 20:53:07 -0700, Andres Freund wrote:
>> On 2016-07-13 23:06:07 -0700, Andres Freund wrote:
>> > won't enter the branch, because HEAP_XMAX_LOCK_ONLY won't be set. Which
>> > will leave t_ctid and HEAP_HOT_UPDATED set differently on the master and
>> > standby / after crash recovery. I'm failing to see any harmful
>> > consequences right now, but differences between master and standby are a
>> > bad thing.
>>
>> I think it's actually critical, because HEAP_HOT_UPDATED /
>> HEAP_XMAX_LOCK_ONLY are used to terminate ctid chasing loops (like
>> heap_hot_search_buffer()).
>
> I've pushed a quite heavily revised version of the first patch to
> 9.1-master. I manually verified using pageinspect, gdb breakpoints and a
> standby that xmax, infomask etc are set correctly (leading to finding
> a4d357bf). As there's noticeable differences, especially 9.2->9.3,
> between versions, I'd welcome somebody having a look at the commits.

Whoa, man. Thanks!

I was pinged just this weekend about a setup that has likely hit this
exact problem, in the shape of "tuple concurrently updated" errors. In
that test, a node was kill-9-ed by some framework because it did not
finish its shutdown checkpoint within some time limit, which forced it
to go through crash recovery. I have not been able to get my hands on
the raw data to look at the flags set within those tuples, but I have a
strong feeling that this is related. After a couple of rounds of doing
so, it was possible to see "tuple concurrently updated" errors on 9.4
for a relation that has few pages and a high update rate.

More seriously, I have spent some time looking at what you have pushed
on each branch, and the fixes look correct to me.
--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers