[HACKERS] Logical replication and multimaster

Konstantin Knizhnik Mon, 30 Nov 2015 08:21:31 -0800

Hello all,

We have implemented an ACID multimaster based on logical replication and our DTM (distributed transaction manager) plugin.

The good news is that it works and no inconsistencies have been detected.
But unfortunately it is very, very slow...

On standalone PostgreSQL I am able to achieve about 30000 TPS with 10 clients performing simple debit-credit transactions. With a multimaster consisting of three nodes spawned on the same system I get about 100 (one hundred) TPS.

There are two main reasons for such awful performance:

1. Logical replication serializes all transactions: there is a single connection between the wal-sender and the receiver BGW (background worker).

2. 2PC synchronizes transaction commit at all nodes.

Neither of these two reasons is a show stopper by itself.

If we remove DTM and do asynchronous logical replication, then the performance of the multimaster increases to 6000 TPS (please notice that in this test all multimaster nodes are spawned on the same system, sharing its resources, so 6k is not a bad result compared with 30k on a standalone system).

And according to 2ndQuadrant results, BDR performance is very close to hot standby.

On the other hand, our previous experiments with DTM show only about a 2 times slowdown compared with vanilla PostgreSQL.

But the result of combining DTM and logical replication is frustrating.
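
To make the gap concrete, here is a rough back-of-the-envelope sketch using only the figures quoted in this message. Two things in it are assumptions, not measurements: that the two slowdowns would simply multiply if they were independent, and that the 100 TPS case is fully serialized (one transaction at a time). The function names are made up for illustration.

```python
# Throughput figures quoted in this message (transactions per second).
STANDALONE_TPS = 30000
ASYNC_MULTIMASTER_TPS = 6000   # logical replication without DTM
SYNC_MULTIMASTER_TPS = 100     # logical replication + DTM (2PC)
DTM_ONLY_SLOWDOWN = 2.0        # "about 2 times" from earlier DTM experiments


def slowdown(baseline_tps, measured_tps):
    """How many times slower the measured setup is than the baseline."""
    return baseline_tps / measured_tps


def serialized_latency_ms(tps):
    """Time per transaction in ms IF transactions run strictly one at a time.

    This is an assumption used only to get a feeling for the numbers.
    """
    return 1000.0 / tps


async_slowdown = slowdown(STANDALONE_TPS, ASYNC_MULTIMASTER_TPS)
sync_slowdown = slowdown(STANDALONE_TPS, SYNC_MULTIMASTER_TPS)

# Naive expectation: the two slowdowns are independent and multiply.
naive_expected_tps = STANDALONE_TPS / (async_slowdown * DTM_ONLY_SLOWDOWN)

print(async_slowdown)                                # 5.0
print(sync_slowdown)                                 # 300.0
print(naive_expected_tps)                            # 3000.0
print(naive_expected_tps / SYNC_MULTIMASTER_TPS)     # 30.0
print(serialized_latency_ms(SYNC_MULTIMASTER_TPS))   # 10.0
```

So under these assumptions the combination is about 30 times slower than the naive product of the two individual slowdowns would suggest, and 100 TPS corresponds to roughly 10 ms per transaction if commits are processed strictly one after another.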

I wonder whether this is a fundamental limitation of the logical replication approach, which is efficient only for asynchronous replication, or whether it can somehow be tuned/extended to efficiently support synchronous replication?


We have also considered alternative approaches:
1. Statement-based replication.
2. Trigger-based replication.
3. Replication using custom nodes.

In the case of statement-based replication it is hard to guarantee that the data is identical at different nodes. Approaches 2 and 3 are much harder to implement and require us to "reinvent" a substantial part of logical replication. They also require some kind of connection pool which can be used to send replicated transactions to the peer nodes (to avoid serialization of parallel transactions, as in the case of logical replication).

But it looks like there is not much sense in having multiple network connections between one pair of nodes. It seems better to have one connection between nodes, but provide parallel execution of received transactions at the destination side. But this also seems to be nontrivial. We now have in PostgreSQL some infrastructure for background workers, but there is still no abstraction of a worker pool and job queue which would provide a simple way to organize parallel execution of some jobs. I wonder if somebody is working on this now, or should we try to propose our solution?
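
To illustrate what is meant by a worker pool with a job queue, here is a minimal sketch in Python rather than in PostgreSQL C code. It is not our implementation and not an existing PostgreSQL API: `WorkerPool`, `submit`, `shutdown` and `apply_txn` are hypothetical names. One receiver puts incoming transactions into a queue, and a fixed set of workers applies them in parallel. The sketch deliberately ignores the hard part, which is ordering and conflicts between transactions.

```python
import queue
import threading


class WorkerPool:
    """Fixed-size pool of workers fed from a single job queue (illustrative only)."""

    def __init__(self, n_workers, apply_fn):
        self._jobs = queue.Queue()
        self._apply_fn = apply_fn
        self._workers = [
            threading.Thread(target=self._run, daemon=True)
            for _ in range(n_workers)
        ]
        for worker in self._workers:
            worker.start()

    def submit(self, txn):
        """Called by the single receiver for each transaction read from the connection."""
        self._jobs.put(txn)

    def _run(self):
        while True:
            txn = self._jobs.get()
            try:
                if txn is None:      # sentinel: no more work for this worker
                    return
                self._apply_fn(txn)  # apply the received transaction
            finally:
                self._jobs.task_done()

    def shutdown(self):
        """Wait for all submitted jobs, then stop every worker."""
        self._jobs.join()
        for _ in self._workers:
            self._jobs.put(None)
        for worker in self._workers:
            worker.join()


# Usage: 100 fake "transactions" applied by 4 workers.
applied = []
applied_lock = threading.Lock()


def apply_txn(txn):
    with applied_lock:
        applied.append(txn)


pool = WorkerPool(4, apply_txn)
for txn_id in range(100):
    pool.submit(txn_id)
pool.shutdown()

print(sorted(applied) == list(range(100)))  # True
```

The point of the design is that there is only one producer (the single connection between the nodes), while the number of consumers can be changed independently of the number of network connections.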


Best regards,
Konstantin



