On Wed, Oct 4, 2017 at 8:35 AM, Petr Jelinek <petr.jeli...@2ndquadrant.com> wrote:
> On 02/10/17 18:59, Petr Jelinek wrote:
>>>
>>> Now fix the trigger function:
>>> CREATE OR REPLACE FUNCTION replication_trigger_proc() RETURNS TRIGGER AS $$
>>> BEGIN
>>> RETURN NEW;
>>> END $$ LANGUAGE plpgsql;
>>>
>>> And manually perform at master two updates inside one transaction:
>>>
>>> postgres=# begin;
>>> BEGIN
>>> postgres=# update pgbench_accounts set abalance=abalance+1 where aid=26;
>>> UPDATE 1
>>> postgres=# update pgbench_accounts set abalance=abalance-1 where aid=26;
>>> UPDATE 1
>>> postgres=# commit;
>>> <hangs>
>>>
>>> and in replica log we see:
>>> 2017-10-02 18:40:26.094 MSK [2954] LOG: logical replication apply
>>> worker for subscription "sub" has started
>>> 2017-10-02 18:40:26.101 MSK [2954] ERROR: attempted to lock invisible
>>> tuple
>>> 2017-10-02 18:40:26.102 MSK [2882] LOG: worker process: logical
>>> replication worker for subscription 16403 (PID 2954) exited with exit
>>> code 1
>>>
>>> Error happens in trigger.c:
>>>
>>> #3 0x000000000069bddb in GetTupleForTrigger (estate=0x2e36b50,
>>> epqstate=0x7ffc4420eda0, relinfo=0x2dcfe90, tid=0x2dd08ac,
>>> lockmode=LockTupleNoKeyExclusive, newSlot=0x7ffc4420ec40) at
>>> trigger.c:3103
>>> #4 0x000000000069b259 in ExecBRUpdateTriggers (estate=0x2e36b50,
>>> epqstate=0x7ffc4420eda0, relinfo=0x2dcfe90, tupleid=0x2dd08ac,
>>> fdw_trigtuple=0x0, slot=0x2dd0240) at trigger.c:2748
>>> #5 0x00000000006d2395 in ExecSimpleRelationUpdate (estate=0x2e36b50,
>>> epqstate=0x7ffc4420eda0, searchslot=0x2dd0358, slot=0x2dd0240)
>>> at execReplication.c:461
>>> #6 0x0000000000820894 in apply_handle_update (s=0x7ffc442163b0) at
>>> worker.c:736
>>
>> We have locked the same tuple in RelationFindReplTupleByIndex() just
>> before this gets called and didn't get the same error. I guess we do
>> something wrong with snapshots. Will need to investigate more.
>>
>
> Okay, so it's not snapshot. It's the fact that we don't set the
> es_output_cid in replication worker which GetTupleForTrigger is using
> when locking the tuple. Attached one-liner fixes it.
>
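
As an illustration of the diagnosis quoted above, here is a toy model in C of why a stale command id can produce "attempted to lock invisible tuple" on the second update of a transaction. This is not PostgreSQL code and not the attached patch (which is not reproduced in this message). ToyTuple, toy_lock_tuple and toy_second_update_can_lock are hypothetical names, and the model assumes (a) that a row version created by the current transaction is visible only to commands numbered higher than the one that created it, and (b) that an executor state whose output command id was never set carries the value 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int CommandId;

/* Hypothetical stand-in for a row version: it records only which command
 * of the current transaction created it. */
typedef struct
{
    CommandId cmin;
} ToyTuple;

typedef enum
{
    TOY_LOCK_OK,
    TOY_LOCK_INVISIBLE
} ToyLockResult;

/* Assumed visibility rule: a version created by command N of our own
 * transaction can be locked only by a command with id greater than N. */
ToyLockResult
toy_lock_tuple(const ToyTuple *tup, CommandId cid)
{
    assert(tup != NULL);
    return (tup->cmin >= cid) ? TOY_LOCK_INVISIBLE : TOY_LOCK_OK;
}

/* Simulate two updates of the same row inside one transaction and report
 * whether the second update manages to lock the row version. */
bool
toy_second_update_can_lock(bool set_output_cid)
{
    CommandId current_cid = 0;

    /* First update runs as command 0 and creates a new row version. */
    ToyTuple new_version = { .cmin = current_cid };

    /* The command counter advances before the second update starts. */
    current_cid++;

    /* Assumption: an output command id that was never set stays at 0. */
    CommandId output_cid = set_output_cid ? current_cid : 0;

    return toy_lock_tuple(&new_version, output_cid) == TOY_LOCK_OK;
}
```

In this model, toy_second_update_can_lock(false) reports that the lock fails, because the lock is attempted with command id 0 against a version created by command 0, while toy_second_update_can_lock(true) succeeds. The model would also explain why the earlier lock in RelationFindReplTupleByIndex() did not fail, if that code path passes an up-to-date command id; that part is an inference, not something stated in the thread.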

Thank you for the patch. This bug can happen even without the trigger,
and I confirmed that the bug is fixed by the patch. I think the patch
fixes it properly.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center