On 22.10.2019 20:22, Tomas Vondra wrote:
On Tue, Oct 22, 2019 at 11:01:48AM +0530, Dilip Kumar wrote:
On Tue, Oct 22, 2019 at 10:46 AM Amit Kapila <amit.kapil...@gmail.com> wrote:
  In general, yours and Alexey's test results
show that there is merit in having workers apply such transactions.
  OTOH, as noted above [1], we are also worried about the performance
of rollbacks if we follow that approach.  I am not sure how much we
need to worry about rollbacks if commits are faster, but can we think
of recording the changes in memory and only writing them to a file if
the changes are above a certain threshold?  I think that might help
save I/O in many cases.  I am not very sure how much additional
workers can help if we do that, but they might still help.  I think we
need to do some tests and experiments to figure out the best
approach.  What do you think?
I agree with that point.  I think we might need to make some small
changes and run tests to see what the best method is for handling the
streamed changes at the subscriber end.


Tomas, Alexey, do you have any thoughts on this matter?  I think it is
important that we figure out the way to proceed with this patch.

[1] - https://www.postgresql.org/message-id/b25ce80e-f536-78c8-d5c8-a5df3e230785%40postgrespro.ru



I think the patch should do the simplest thing possible, i.e. what it
does today. Otherwise we'll never get it committed.


I have to agree with Tomas that keeping things as simple as possible should be the main priority right now. Otherwise, the entire patch set will go through the next release cycle without being committed even partially. At the same time, it solves an important problem from my perspective: it moves I/O overhead from the primary to the replica by streaming large transactions, which is a nice feature to have.

Later it would be possible to replace the logical apply worker with a bgworkers pool in a separate patch, if we decide that it is a viable solution. Anyway, regarding Amit's questions:

- I doubt that maintaining a separate buffer on the apply side before spilling to disk would help enough. We already have ReorderBuffer with the logical_work_mem limit, and if we exceeded that limit on the sender side, then most probably we will exceed it on the applier side as well, unless this new buffer is made significantly larger than logical_work_mem to keep multiple open xacts in memory (a sketch of the buffering idea follows this list).

- I still think that we should optimize the database for commits, not rollbacks. A bgworkers pool is dramatically slower for a rollbacks-only load, while being at least twice as fast for a commits-only one (a toy model of this asymmetry is also sketched below). I do not know how it will perform under a real-life load, but this drawback may be unacceptable for a general-purpose database like Postgres.

- Tomas' implementation of streaming with spilling does not have this bias between commits and aborts. However, it shows a noticeable performance drop (~5x slower than master [1]) for a large transaction consisting of many small rows, though it is not an order of magnitude slower.
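
To make the first point concrete, here is a minimal sketch of the buffering idea as I read it: keep streamed changes in memory and start spilling only after a size threshold is crossed. All names here (ApplyChangeBuffer, apply_buffer_add, SPILL_THRESHOLD) are invented for illustration and are not from the patch set:

/* Hypothetical sketch, not code from the patch set. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define SPILL_THRESHOLD (64 * 1024)     /* invented per-xact limit */

typedef struct ApplyChangeBuffer
{
    char   *data;        /* in-memory accumulation of change records */
    size_t  used;
    size_t  allocated;
    FILE   *spill_file;  /* NULL until the threshold is crossed */
} ApplyChangeBuffer;

static void
apply_buffer_add(ApplyChangeBuffer *buf, const char *change, size_t len)
{
    if (buf->spill_file == NULL && buf->used + len > SPILL_THRESHOLD)
    {
        /* Threshold crossed: open a spill file, flush what we have. */
        buf->spill_file = tmpfile();
        fwrite(buf->data, 1, buf->used, buf->spill_file);
        buf->used = 0;
    }

    if (buf->spill_file != NULL)
    {
        /* Already spilling: append directly to the file. */
        fwrite(change, 1, len, buf->spill_file);
        return;
    }

    /* Still below the threshold: keep the change in memory. */
    if (buf->used + len > buf->allocated)
    {
        size_t newsize = buf->allocated ? buf->allocated : 1024;

        while (newsize < buf->used + len)
            newsize *= 2;
        buf->data = realloc(buf->data, newsize);
        buf->allocated = newsize;
    }
    memcpy(buf->data + buf->used, change, len);
    buf->used += len;
}

int
main(void)
{
    ApplyChangeBuffer buf = {0};
    char change[128];

    memset(change, 'x', sizeof(change));

    /* 1000 x 128 B = 128 kB total, so this crosses the 64 kB limit. */
    for (int i = 0; i < 1000; i++)
        apply_buffer_add(&buf, change, sizeof(change));

    printf("spilled to disk: %s\n", buf.spill_file ? "yes" : "no");

    /* On abort we just free the buffer; tmpfile() cleans up on close. */
    free(buf.data);
    if (buf.spill_file)
        fclose(buf.spill_file);
    return 0;
}

Note that an abort stays cheap here in both modes: we drop the buffer or close the temporary file without ever having touched the tables.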

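And a toy cost model for the second point, the commit/abort asymmetry of eager apply versus spilling. The weights, and the assumption that the pool fully overlaps apply with streaming, are invented; only the shape of the tradeoff matters:

/* Toy model, not a benchmark; all numbers are made up. */
#include <stdio.h>

#define APPLY_COST 10   /* executing one change against the database */
#define SPILL_COST  1   /* appending one change to a spill file */

int
main(void)
{
    long nchanges = 1000000;    /* one large streamed transaction */

    /*
     * bgworkers pool: changes are applied eagerly while the transaction
     * is still streaming, so almost no work is left at commit time, but
     * an abort throws away everything that was already applied.
     */
    long pool_commit = 0;
    long pool_abort = nchanges * APPLY_COST;

    /*
     * Spilling: nothing is applied until the commit record arrives, so
     * the full apply cost is paid after commit, while an abort merely
     * unlinks the spill file.
     */
    long spill_commit = nchanges * APPLY_COST;
    long spill_abort = nchanges * SPILL_COST;

    printf("commit: pool=%ld spill=%ld\n", pool_commit, spill_commit);
    printf("abort:  pool=%ld spill=%ld\n", pool_abort, spill_abort);
    return 0;
}
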
Another thing is that about a year ago I found some problems with MVCC/visibility and fixed them somehow [1]. If I understand correctly, Tomas adapted some of those fixes into his patch set, but I think that this part should be carefully reviewed again. I would be glad to check it, but right now I am a little confused by all the patch set variants in the thread. Which one is the latest? Does it still depend on 2PC decoding?

[1] https://www.postgresql.org/message-id/flat/40c38758-04b5-74f4-c963-cf300f9e5dff%40postgrespro.ru#98d06fefc88122385dacb2f03f7c30f7


Thanks for moving this patch forward!

--
Alexey Kondratov

Postgres Professional https://www.postgrespro.com
Russian Postgres Company


