On Wednesday, July 27, 2022 4:22 PM houzj.f...@fujitsu.com wrote:
> On Tuesday, July 26, 2022 5:34 PM Dilip Kumar <dilipbal...@gmail.com> wrote:
> > 3.
> > Why are we restricting parallel apply workers only for the streamed
> > transactions? Streaming depends upon the size of the logical decoding
> > work mem, so making streaming and parallel apply tightly coupled seems
> > too restrictive to me. Do we see some obvious problems in applying
> > other transactions in parallel?
>
> We thought there could be some conflict failures and deadlocks if we
> parallel apply normal transactions, which would need transaction
> dependency checks [1]. But I will do some more research on this and
> share the result soon.
After thinking about this, I confirmed that it would be easy to cause deadlock errors if we parallel apply normal transactions without additional dependency analysis and commit-order-preserving handling.

The basic idea for parallel applying normal transactions in the first version is: the main apply worker receives data from the publisher and passes it to an apply bgworker without applying it itself. Only before applying the final COMMIT command does the apply bgworker need to wait for any previous transaction to finish, in order to preserve the commit order. This means we could pass the next transaction's data to another apply bgworker before the previous transaction is committed in the first apply bgworker. In this approach, we have to do the dependency analysis, because it is easy to cause deadlock errors when applying DMLs in parallel (see the attachment for examples of where the deadlock could happen). So, it is a bit different from streaming transactions.

Alternatively, we could apply the next transaction only after the first transaction is committed, in which case we would not need the dependency analysis. But that would not bring a noticeable performance improvement even if we start several apply workers, because the actual DMLs are not performed in parallel.

Based on the above, we plan to first introduce the patch to perform streaming logical transactions by background workers, and then introduce parallel apply for normal transactions, whose design is different and needs some additional handling.

Best regards,
Hou zj

> [1]
> https://www.postgresql.org/message-id/CAA4eK1%2BwyN6zpaHUkCLorEWNx75MG0xhMwcFhvjqm2KURZEAGw%40mail.gmail.com
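The commit-order rule described above (workers apply DMLs in parallel but must COMMIT strictly in publisher order) can be illustrated with a toy model. This is only a sketch of the ordering idea, not the patch's actual C implementation; the function name and threading machinery are invented for illustration:

```python
# Toy model (not PostgreSQL code): each "apply bgworker" applies its
# transaction's changes immediately, but waits for the previous
# transaction to commit before issuing its own COMMIT, so the commit
# order on the subscriber matches the publisher's commit order.
import threading

def run_parallel_apply(n_txns):
    commit_order = []
    lock = threading.Lock()
    # One "committed" flag per transaction, in publisher commit order.
    done = [threading.Event() for _ in range(n_txns)]

    def worker(i):
        # ... apply the DMLs of transaction i here (may interleave
        # freely with other workers; this is where deadlocks can arise
        # without dependency analysis) ...
        if i > 0:
            done[i - 1].wait()      # wait for the previous txn's COMMIT
        with lock:
            commit_order.append(i)  # the "COMMIT" of transaction i
        done[i].set()               # unblock the next transaction

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_txns)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return commit_order

print(run_parallel_apply(4))  # commits always happen in order: [0, 1, 2, 3]
```

The key point the sketch shows: only the COMMITs are serialized, so the DMLs of transaction i+1 can run while transaction i is still uncommitted, which is exactly what makes the deadlocks in the attached examples possible.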
**************CASE 1**************
CREATE TABLE test(a int primary key);
INSERT INTO test VALUES(1);
INSERT INTO test VALUES(3);

--------------On publisher--------------
Tx1:
BEGIN;
UPDATE test SET a = 2 WHERE a = 1;
DELETE FROM test WHERE a = 3;
INSERT INTO test VALUES(3);
COMMIT;

Tx2:
BEGIN;
UPDATE test SET a = 4 WHERE a = 3;
INSERT INTO test VALUES(1);
COMMIT;

--------------On subscriber--------------
Tx1:
BEGIN;
UPDATE test SET a = 2 WHERE a = 1;

Tx2:
BEGIN;
UPDATE test SET a = 4 WHERE a = 3;

Tx1:
DELETE FROM test WHERE a = 3;

Tx2:
INSERT INTO test VALUES(1);
ERROR: deadlock detected

**************CASE 2**************
CREATE TABLE test(a int);
INSERT INTO test VALUES(1);
CREATE TABLE test2(a int);
INSERT INTO test2 VALUES(1);

--------------On publisher--------------
Tx1:
BEGIN;
UPDATE test SET a = 2 WHERE a = 1;
UPDATE test2 SET a = 2 WHERE a = 1;
COMMIT;

Tx2:
BEGIN;
INSERT INTO test2 VALUES(1);
UPDATE test2 SET a = 2 WHERE a = 1;
INSERT INTO test VALUES(1);
UPDATE test SET a = 2 WHERE a = 1;
COMMIT;

--------------On subscriber--------------
Tx1:
BEGIN;
UPDATE test SET a = 2 WHERE a = 1;

Tx2:
BEGIN;
INSERT INTO test2 VALUES(1);
UPDATE test2 SET a = 2 WHERE a = 1;

Tx1:
UPDATE test2 SET a = 2 WHERE a = 1;

Tx2:
INSERT INTO test VALUES(1);
UPDATE test SET a = 2 WHERE a = 1;
COMMIT;
COMMIT;
ERROR: deadlock detected
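In both cases the server detects a cycle in the lock wait-for graph: in CASE 1, Tx1's DELETE blocks on the row Tx2 updated, while Tx2's INSERT blocks on the primary-key entry touched by Tx1's uncommitted UPDATE. A toy sketch of that cycle check (an illustration of the wait-for-graph idea, not PostgreSQL's actual deadlock detector):

```python
# Toy wait-for graph for CASE 1 (illustrative only):
#   Tx1 -> Tx2  (Tx1's DELETE ... WHERE a = 3 waits on the row Tx2 updated)
#   Tx2 -> Tx1  (Tx2's INSERT VALUES(1) waits on the key Tx1's UPDATE touched)
def has_deadlock(waits_for):
    # Detect a cycle in the wait-for graph via depth-first search.
    def dfs(node, path):
        if node in path:
            return True
        return any(dfs(nxt, path | {node}) for nxt in waits_for.get(node, ()))
    return any(dfs(n, set()) for n in waits_for)

case1 = {"Tx1": ["Tx2"], "Tx2": ["Tx1"]}
print(has_deadlock(case1))  # True -> "ERROR: deadlock detected"
```

With serialized apply (the no-dependency-analysis alternative mentioned above), Tx2's DMLs would not start until Tx1 committed, so no such cycle could form in the first place.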