On Wednesday, July 27, 2022 4:22 PM houzj.f...@fujitsu.com wrote:
> On Tuesday, July 26, 2022 5:34 PM Dilip Kumar <dilipbal...@gmail.com> wrote:
> > 3.
> > Why are we restricting parallel apply workers only for the streamed
> > transactions? Streaming depends upon the size of the logical decoding
> > work mem, so making streaming and parallel apply tightly coupled seems
> > too restrictive to me. Do we see some obvious problems in applying
> > other transactions in parallel?
>
> We thought there could be some conflict failures and deadlocks if we
> parallel apply normal transactions, which would need transaction
> dependency checks [1]. But I will do some more research on this and
> share the result soon.
After thinking about this, I confirmed that it would be easy to cause deadlock errors if we parallel apply normal transactions without additional dependency analysis and commit-order-preserving handling.

The basic idea for parallel applying normal transactions in the first version is: the main apply worker receives data from the publisher and passes it to an apply bgworker without applying it itself. Only before applying the final COMMIT command does the apply bgworker need to wait for any previous transaction to finish, in order to preserve the commit order. This means we could pass the next transaction's data to another apply bgworker before the previous transaction is committed in the first apply bgworker. In this approach, we have to do the dependency analysis, because it is easy to cause deadlock errors when applying DMLs in parallel (see the attachment for examples of where the deadlock could happen). So, it is a bit different from streaming transactions.

Alternatively, we could apply the next transaction only after the first transaction is committed, in which case we would not need the dependency analysis. But that would not bring a noticeable performance improvement even if we start several apply workers, because the actual DMLs are not performed in parallel.

Based on the above, we plan to first introduce the patch to perform streaming logical transactions by background workers, and then introduce parallel apply for normal transactions, whose design is different and needs some additional handling.

Best regards,
Hou zj

> [1]
> https://www.postgresql.org/message-id/CAA4eK1%2BwyN6zpaHUkCLorEWNx75MG0xhMwcFhvjqm2KURZEAGw%40mail.gmail.com
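The commit-order rule described above (workers apply DMLs in parallel but must COMMIT strictly in publisher order) can be illustrated with a toy model. This is only a sketch of the ordering idea, not the patch's actual C implementation; the function name and threading machinery are invented for illustration:

```python
# Toy model (not PostgreSQL code): each "apply bgworker" applies its
# transaction's changes immediately, but waits for the previous
# transaction to commit before issuing its own COMMIT, so the commit
# order on the subscriber matches the publisher's commit order.
import threading

def run_parallel_apply(n_txns):
    commit_order = []
    lock = threading.Lock()
    # One "committed" flag per transaction, in publisher commit order.
    done = [threading.Event() for _ in range(n_txns)]

    def worker(i):
        # ... apply the DMLs of transaction i here (may interleave
        # freely with other workers; this is where deadlocks can arise
        # without dependency analysis) ...
        if i > 0:
            done[i - 1].wait()      # wait for the previous txn's COMMIT
        with lock:
            commit_order.append(i)  # the "COMMIT" of transaction i
        done[i].set()               # unblock the next transaction

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_txns)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return commit_order

print(run_parallel_apply(4))  # commits always happen in order: [0, 1, 2, 3]
```

The key point the sketch shows: only the COMMITs are serialized, so the DMLs of transaction i+1 can run while transaction i is still uncommitted, which is exactly what makes the deadlocks in the attached examples possible.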
**************CASE 1**************
CREATE TABLE test(a int primary key);
INSERT INTO test VALUES(1);
INSERT INTO test VALUES(3);

--------------On publisher--------------
Tx1:
BEGIN;
UPDATE test SET a = 2 WHERE a = 1;
DELETE FROM test WHERE a = 3;
INSERT INTO test VALUES(3);
COMMIT;

Tx2:
BEGIN;
UPDATE test SET a = 4 WHERE a = 3;
INSERT INTO test VALUES(1);
COMMIT;

--------------On subscriber--------------
Tx1:
BEGIN;
UPDATE test SET a = 2 WHERE a = 1;

Tx2:
BEGIN;
UPDATE test SET a = 4 WHERE a = 3;

Tx1:
DELETE FROM test WHERE a = 3;

Tx2:
INSERT INTO test VALUES(1);
ERROR: deadlock detected

**************CASE 2**************
CREATE TABLE test(a int);
INSERT INTO test VALUES(1);
CREATE TABLE test2(a int);
INSERT INTO test2 VALUES(1);

--------------On publisher--------------
Tx1:
BEGIN;
UPDATE test SET a = 2 WHERE a = 1;
UPDATE test2 SET a = 2 WHERE a = 1;
COMMIT;

Tx2:
BEGIN;
INSERT INTO test2 VALUES(1);
UPDATE test2 SET a = 2 WHERE a = 1;
INSERT INTO test VALUES(1);
UPDATE test SET a = 2 WHERE a = 1;
COMMIT;

--------------On subscriber--------------
Tx1:
BEGIN;
UPDATE test SET a = 2 WHERE a = 1;

Tx2:
BEGIN;
INSERT INTO test2 VALUES(1);
UPDATE test2 SET a = 2 WHERE a = 1;

Tx1:
UPDATE test2 SET a = 2 WHERE a = 1;

Tx2:
INSERT INTO test VALUES(1);
UPDATE test SET a = 2 WHERE a = 1;
COMMIT;
COMMIT;
ERROR: deadlock detected
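In both cases the server detects a cycle in the lock wait-for graph: in CASE 1, Tx1's DELETE blocks on the row Tx2 updated, while Tx2's INSERT blocks on the primary-key entry touched by Tx1's uncommitted UPDATE. A toy sketch of that cycle check (an illustration of the wait-for-graph idea, not PostgreSQL's actual deadlock detector):

```python
# Toy wait-for graph for CASE 1 (illustrative only):
#   Tx1 -> Tx2  (Tx1's DELETE ... WHERE a = 3 waits on the row Tx2 updated)
#   Tx2 -> Tx1  (Tx2's INSERT VALUES(1) waits on the key Tx1's UPDATE touched)
def has_deadlock(waits_for):
    # Detect a cycle in the wait-for graph via depth-first search.
    def dfs(node, path):
        if node in path:
            return True
        return any(dfs(nxt, path | {node}) for nxt in waits_for.get(node, ()))
    return any(dfs(n, set()) for n in waits_for)

case1 = {"Tx1": ["Tx2"], "Tx2": ["Tx1"]}
print(has_deadlock(case1))  # True -> "ERROR: deadlock detected"
```

With serialized apply (the no-dependency-analysis alternative mentioned above), Tx2's DMLs would not start until Tx1 committed, so no such cycle could form in the first place.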