Re: [HACKERS] Logical decoding restart problems

Konstantin Knizhnik Fri, 19 Aug 2016 09:37:30 -0700


On 19.08.2016 19:06, Petr Jelinek wrote:

I don't think this will work correctly, there will be gap between whenthe new slot starts to decode and the drop of the old one as the newslot first needs to make snapshot.
Do I understand correctly that you are not using replication origins?


No, we are using replication origins to prevent recursive replication.

The gap is not a biggest problem because in multimaster we commit onlythose transaction which are acknowledged by all live nodes.So if some transaction falls in the gap... it is unlucky and will beaborted.

The problem is that for some reasons consistent point is not so
consistent and we get partly decoded transactions.
I.e. transaction body consists of two UPDATE but reorder_buffer extracts
only the one (last) update and sent this truncated transaction to
destination causing consistency violation at replica.  I started
investigation of logical decoding code and found several things which I
do not understand.

Never seen this happen. Do you have more details about what exactly ishappening?


This is transaction at primary node:

root@knizhnik:/home/knizhnik/postgres_cluster/contrib/mmts# docker exec-ti node1 bashpostgres@9a04a0c9f246:/pg$ pg_xlogdumpdata/pg_xlog/000000010000000000000001 | fgrep "tx: 6899"rmgr: Heap len (rec/tot): 7/ 53, tx: 6899, lsn:0/019EFBD8, prev 0/019EFB80, desc: LOCK off 12: xid 6899 LOCK_ONLYEXCL_LOCK , blkref #0: rel 1663/12407/16421 blk 60rmgr: Heap len (rec/tot): 14/ 74, tx: 6899, lsn:0/019EFC10, prev 0/019EFBD8, desc: HOT_UPDATE off 12 xmax 6899 ; new off224 xmax 6899, blkref #0: rel 1663/12407/16421 blk 60rmgr: Heap len (rec/tot): 7/ 53, tx: 6899, lsn:0/019EFC60, prev 0/019EFC10, desc: LOCK off 153: xid 6899 LOCK_ONLYEXCL_LOCK , blkref #0: rel 1663/12407/16421 blk 49rmgr: Heap len (rec/tot): 14/ 82, tx: 6899, lsn:0/019EFC98, prev 0/019EFC60, desc: UPDATE off 153 xmax 6899 ; new off 55xmax 6899, blkref #0: rel 1663/12407/16421 blk 62, blkref #1: rel1663/12407/16421 blk 49rmgr: Btree len (rec/tot): 2/ 64, tx: 6899, lsn:0/019EFCF0, prev 0/019EFC98, desc: INSERT_LEAF off 294, blkref #0: rel1663/12407/16424 blk 23rmgr: Transaction len (rec/tot): 236/ 265, tx: 6899, lsn:0/019EFD30, prev 0/019EFCF0, desc: PREPAREpg_xlogdump: FATAL: error in WAL record at 0/1AC0E70: invalid recordlength at 0/1AC0EA8: wanted 24, got 0

This is the replicated transaction at other node (it is ont clear fromthe trace, but believe me, it is the same transaction):

root@knizhnik:/home/knizhnik/postgres_cluster/contrib/mmts# docker exec-ti node2 bashpostgres@e5b16d82ce06:/pg$ pg_xlogdumpdata/pg_xlog/000000010000000000000001 | fgrep "tx: 6882"rmgr: Heap len (rec/tot): 14/ 74, tx: 6882, lsn:0/019F3240, prev 0/019F31F0, desc: HOT_UPDATE off 113 xmax 6882 ; newoff 219 xmax 0, blkref #0: rel 1663/12407/16421 blk 53rmgr: Heap len (rec/tot): 14/ 82, tx: 6882, lsn:0/019F5CB8, prev 0/019F5C60, desc: UPDATE off 163 xmax 6882 ; new off 4xmax 0, blkref #0: rel 1663/12407/16421 blk 62, blkref #1: rel1663/12407/16421 blk 51rmgr: Btree len (rec/tot): 2/ 64, tx: 6882, lsn:0/019F5D10, prev 0/019F5CB8, desc: INSERT_LEAF off 284, blkref #0: rel1663/12407/16424 blk 23rmgr: Transaction len (rec/tot): 248/ 274, tx: 6882, lsn:0/019F61F8, prev 0/019F60E8, desc: PREPAREpg_xlogdump: FATAL: error in WAL record at 0/1AD3AD8: invalid recordlength at 0/1AD3B10: wanted 24, got 0



And "shorten" version of the same transaction at the third (recovered) node:

root@knizhnik:/home/knizhnik/postgres_cluster/contrib/mmts# docker exec-ti node3 bashpostgres@b4066d4211bc:/pg$ pg_xlogdumpdata/pg_xlog/000000010000000000000001 | fgrep "tx: 6893"rmgr: Heap len (rec/tot): 7/ 53, tx: 6893, lsn:0/01A29828, prev 0/01A297E0, desc: LOCK off 172: xid 6893 LOCK_ONLYEXCL_LOCK , blkref #0: rel 1663/12407/16421 blk 50rmgr: Heap len (rec/tot): 14/ 82, tx: 6893, lsn:0/01A29860, prev 0/01A29828, desc: UPDATE+INIT off 172 xmax 6893 ; newoff 1 xmax 6893, blkref #0: rel 1663/12407/16421 blk 64, blkref #1: rel1663/12407/16421 blk 50rmgr: Btree len (rec/tot): 2/ 64, tx: 6893, lsn:0/01A298B8, prev 0/01A29860, desc: INSERT_LEAF off 314, blkref #0: rel1663/12407/16424 blk 23rmgr: Transaction len (rec/tot): 236/ 265, tx: 6893, lsn:0/01A298F8, prev 0/01A298B8, desc: PREPAREpg_xlogdump: FATAL: error in WAL record at 0/1ACBBF8: invalid recordlength at 0/1ACBC30: wanted 24, got 0


You can see one update instead of two.

Sorry, I have not saved trace with output of logical decoder. Bu t itreally decodes just one update!

What I have done:

DROP_REPLICATION_SLOT "mtm_slot_1";
CREATE_REPLICATION_SLOT "mtm_slot_1" LOGICAL "multimaster";

START_REPLICATION SLOT "mtm_slot_1" LOGICAL 0/0 ("startup_params_format"'1', "max_proto_version" '1', "min_proto_version" '1',"forward_changesets" '1', "mtm_replication_mode" 'recovered');

I have also tried to calculate last origin LSN for this node andexplicitly specify it in START_REPLICATION.

But it doesn't help: the same problem persists.





--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Logical decoding restart problems

Reply via email to