Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-10-10 Thread Wood, Dan
I found one glitch with our merge of the original dup row fix. With that corrected AND Alvaro’s Friday fix, things are solid. No dups. No index corruption. Thanks so much. On 10/10/17, 7:25 PM, "Michael Paquier" wrote: On Tue, Oct 10, 2017 at 11:14 PM, Alvaro Herrera wrote: > I

Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-10-08 Thread Wood, Dan
I’m unclear on what is being repro’d in 9.6. Are you getting the duplicate rows problem or just the reindex problem? Are you testing with asserts enabled (I’m not)? If you are getting the dup rows, consider the code in the block in heapam.c that starts with the comment “replace multi by update

Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-10-05 Thread Wood, Dan
Yes, I’ve been testing 9.6. I’ll try Alvaro’s patch today. I would prefer to focus on either latest 9X or 11dev. Does Alvaro’s patch presume any of the other patch to set COMMITTED in the freeze code? On 10/4/17, 7:17 PM, "Michael Paquier" wrote: On Thu, Oct 5, 2017 at 10:3

Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-10-04 Thread Wood, Dan
Whatever you do, make sure to also test 250 clients running lock.sql. Even with the community’s fix plus YiWen’s fix I can still get duplicate rows. What works for “in-block” HOT chains may not work when spanning blocks. Once nearly all 250 clients have done their updates and everybody is waiti

Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-10-03 Thread Wood, Dan
| 49155 | 36963 | 36961 7 | (0,7) | 8032 | 11010 | 32771 | 36961 | 0 (7 rows) On 10/3/17, 6:20 PM, "Peter Geoghegan" wrote: On Tue, Oct 3, 2017 at 6:09 PM, Wood, Dan wrote: > I’ve just started looking at this again after a fe

Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-10-03 Thread Wood, Dan
I’ve just started looking at this again after a few weeks’ break. There is a tangled web of issues here. With the community fix we get a corrupted page (invalid redirect ptr from indexed item). The cause of that is: pruneheap.c: /* * Check the tuple XMIN agai