On 26/06/15 15:43, marco.nenciar...@2ndquadrant.it wrote:
> The following bug has been logged on the website:
> 
> Bug reference:      13473
> Logged by:          Marco Nenciarini
> Email address:      marco.nenciar...@2ndquadrant.it
> PostgreSQL version: 9.4.4
> Operating system:   all
> Description:        
> 
> = Symptoms
> 
> Consider a simple master -> standby setup with hot_standby_feedback
> enabled. If a backend on the standby is holding the cluster xmin and the
> master runs a VACUUM FREEZE on the same database as the standby's backend,
> the vacuum generates a recovery conflict and the query running on the
> standby is canceled.
> 
> = How to reproduce it
> 
> Run the following operations on an idle cluster.
> 
> 1) connect to the standby and simulate a long running query:
> 
>    select pg_sleep(3600);
> 
> 2) connect to the master and run the following script
> 
>    create table t(id int primary key);
>    insert into t select generate_series(1, 10000);
>    vacuum freeze verbose t;
>    drop table t;
> 
> 3) after 30 seconds the pg_sleep query on standby will be canceled.
> 
> = Expected output
> 
> The hot standby feedback should have prevented the query cancellation.
> 
> = Analysis
> 
> I've run postgres at DEBUG2 logging level, and I can confirm that the vacuum
> correctly sees the OldestXmin propagated by the standby through the hot
> standby feedback.
> The issue is in the heap_xlog_freeze function, which calls
> ResolveRecoveryConflictWithSnapshot as its first step, passing the cutoff_xid
> value as the first argument.
> The cutoff_xid is the OldestXmin that was active when the vacuum ran, so it
> represents a running xid.
> However, ResolveRecoveryConflictWithSnapshot expects its first argument to
> be latestRemovedXid, which represents the highest xid that has actually
> been removed, so there is an off-by-one error.
> 
> I've been able to reproduce this issue for every version of postgres since
> 9.0 (9.0, 9.1, 9.2, 9.3, 9.4 and current master)
> 
> = Proposed solution
> 
> In heap_xlog_freeze we need to subtract one from the value of cutoff_xid
> before passing it to ResolveRecoveryConflictWithSnapshot.
> 
> 
> 

Attached is a proposed patch that solves the issue.
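
To make the off-by-one concrete, here is a minimal standalone sketch. It is
not PostgreSQL source: xid_retreat, xid_precedes_or_equals and
snapshot_conflicts are hypothetical names for this illustration. It assumes
that a standby snapshot conflicts when its xmin is less than or equal to the
xid passed in, compares xids modulo 2^32, and handles only normal xids.
xid_retreat is modeled on the TransactionIdRetreat macro used in the patch,
which decrements and skips the special xids below FirstNormalTransactionId.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* Xids 0, 1 and 2 are reserved; normal xids start at 3. */
#define FIRST_NORMAL_XID ((TransactionId) 3)

/* Step back one xid, wrapping past the reserved values (hypothetical helper). */
static TransactionId
xid_retreat(TransactionId xid)
{
    do
    {
        xid--;
    } while (xid < FIRST_NORMAL_XID);
    return xid;
}

/* Circular "a <= b" comparison; valid for normal xids only. */
static bool
xid_precedes_or_equals(TransactionId a, TransactionId b)
{
    return (int32_t) (a - b) <= 0;
}

/*
 * Simplified conflict test (an assumption of this sketch): a standby
 * snapshot conflicts if its xmin is not newer than the latest removed xid.
 */
static bool
snapshot_conflicts(TransactionId standby_xmin, TransactionId latest_removed_xid)
{
    return xid_precedes_or_equals(standby_xmin, latest_removed_xid);
}
```

With the standby holding xmin 1000 and the vacuum using cutoff_xid 1000,
snapshot_conflicts(1000, 1000) is true, which corresponds to the unwanted
cancellation, while snapshot_conflicts(1000, xid_retreat(1000)) is false.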

Regards,
Marco

-- 
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciar...@2ndquadrant.it | www.2ndQuadrant.it
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index caacc10..28edb17 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -7571,9 +7571,12 @@ heap_xlog_freeze_page(XLogReaderState *record)
        if (InHotStandby)
        {
                RelFileNode rnode;
+               TransactionId latestRemovedXid = cutoff_xid;
+
+               TransactionIdRetreat(latestRemovedXid);
 
                XLogRecGetBlockTag(record, 0, &rnode, NULL, NULL);
-               ResolveRecoveryConflictWithSnapshot(cutoff_xid, rnode);
+               ResolveRecoveryConflictWithSnapshot(latestRemovedXid, rnode);
        }
 
        if (XLogReadBufferForRedo(record, 0, &buffer) == BLK_NEEDS_REDO)
