On Mon, Jul 18, 2016 at 2:03 PM, Andres Freund <and...@anarazel.de> wrote: > On 2016-07-18 10:02:52 +0530, Amit Kapila wrote: >> > >> >> Consider the below scenario. >> >> Vacuum >> a. acquires a cleanup lock for page - 10 >> b. busy in checking visibility of tuples >> --assume, here it takes some time and in the meantime Session-1 >> performs step (a) and (b) and start waiting in step- (c) >> c. marks the page as all-visible (PageSetAllVisible) >> d. unlockandrelease the buffer >> >> Session-1 >> a. In heap_lock_tuple(), readbuffer for page-10 >> b. check PageIsAllVisible(), found page is not all-visible, so didn't >> acquire the visbilitymap_pin >> c. LockBuffer in ExlusiveMode - here it will wait for vacuum to >> release the lock >> d. Got the lock, but now the page is marked as all-visible, so ideally >> need to recheck the page and acquire the visibilitymap_pin > > So, I've tried pretty hard to reproduce that. While the theory above is > sound, I believe the relevant code-path is essentially dead for SQL > callable code, because we'll always hold a buffer pin before even > entering heap_update/heap_lock_tuple. >
It is possible that we don't hold any buffer pin before entering heap_update() and or heap_lock_tuple(). For heap_update(), it is possible when it enters via simple_heap_update() path. For heap_lock_tuple(), it is possible for ON CONFLICT DO Update statement and may be others as well. Let me also try to explain with a test for both the cases, if above is not clear enough. Case-1 for heap_update() ----------------------------------- Session-1 Create table t1(c1 int); Alter table t1 alter column c1 set default 10; --via debugger stop at StoreAttrDefault()/heap_update, while you are in heap_update(), note down the block number Session-2 vacuum (DISABLE_PAGE_SKIPPING) pg_attribute; -- In lazy_scan_heap(), stop at line (if (all_visible && !all_visible_according_to_vm))) for block number noted in Session-1. Session-1 In debugger, proceed and let it wait at lockbuffer, note that it will not pin the visibility map. Session-2 Set the visibility flag and complete the operation. Session-1 You will notice that it will attempt to unlock the buffer, pin the visibility map, lock the buffer again. Case-2 for heap_lock_tuple() ---------------------------------------- Session-1 Create table i_conflict(c1 int, c2 char(100)); Create unique index idx_u on i_conflict(c1); Insert into i_conflict values(1,'aaa'); Insert into i_conflict values(1,'aaa') On Conflict (c1) Do Update Set c2='bbb'; -- via debugger, stop at line 385 in nodeModifyTable.c (In ExecInsert(), at if (onconflict == ONCONFLICT_UPDATE)). Session-2 ------------- vacuum (DISABLE_PAGE_SKIPPING) i_conflict --stop before setting the all-visible flag Session-1 -------------- In debugger, proceed and let it wait at lockbuffer, note that it will not pin the visibility map. Session-2 --------------- Set the visibility flag and complete the operation. Session-1 -------------- PANIC: wrong buffer passed to visibilitymap_clear --this is problematic. Attached patch fixes the problem for me. Note, I have not tried to reproduce the problem for heap_lock_updated_tuple_rec(), but I think if you are convinced with above cases, then we should have a similar check in it as well. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
pin_vm_lock_tuple-v1.patch
Description: Binary data
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers