Fwd: Perform streaming logical transactions by background workers and parallel apply

2022-12-03 Thread Peter Smith
(Resending this because somehow my previous post did not appear in the
mail archives)

---------- Forwarded message ---------
From: Peter Smith 
Date: Fri, Dec 2, 2022 at 7:59 PM
Subject: Re: Perform streaming logical transactions by background
workers and parallel apply
To: houzj.f...@fujitsu.com 
Cc: Amit Kapila , Masahiko Sawada
, wangw.f...@fujitsu.com
, Dilip Kumar ,
shiy.f...@fujitsu.com , PostgreSQL Hackers



Here are my review comments for patch v54-0001.

======

FILE: .../replication/logical/applyparallelworker.c

1. File header comment

1a.

+ * This file contains the code to launch, set up, and teardown parallel apply
+ * worker which receives the changes from the leader worker and invokes
+ * routines to apply those on the subscriber database.

"parallel apply worker" -> "a parallel apply worker"

~

1b.

+ *
+ * This file contains routines that are intended to support setting up, using
+ * and tearing down a ParallelApplyWorkerInfo which is required to communicate
+ * among leader and parallel apply workers.

"that are intended to support" -> "for"

"required to communicate among leader and parallel apply workers." ->
"required so the leader worker and parallel apply workers can
communicate with each other."

~

1c.

+ *
+ * The parallel apply workers are assigned (if available) as soon as xact's
+ * first stream is received for subscriptions that have set their 'streaming'
+ * option as parallel. The leader apply worker will send changes to this new
+ * worker via shared memory. We keep this worker assigned till the transaction
+ * commit is received and also wait for the worker to finish at commit. This
+ * preserves commit ordering and avoid file I/O in most cases, although we
+ * still need to spill to a file if there is no worker available. See comments
+ * atop logical/worker to know more about streamed xacts whose changes are
+ * spilled to disk. It is important to maintain commit order to avoid failures
+ * due to (a) transaction dependencies, say if we insert a row in the first
+ * transaction and update it in the second transaction on publisher then
+ * allowing the subscriber to apply both in parallel can lead to failure in the
+ * update. (b) deadlocks, allowing transactions that update the same set of
+ * rows/tables in the opposite order to be applied in parallel can lead to
+ * deadlocks.

"due to (a)" -> "due to: "

"(a) transaction dependencies, " -> "(a) transaction dependencies - "

". (b) deadlocks, " => "; (b) deadlocks - "

~

1d.

+ *
+ * We maintain a worker pool to avoid restarting workers for each streaming
+ * transaction. We maintain each worker's information in the
+ * ParallelApplyWorkersList. After successfully launching a new worker, its
+ * information is added to the ParallelApplyWorkersList. Once the worker
+ * finishes applying the transaction, we mark it available for re-use. Now,
+ * before starting a new worker to apply the streaming transaction, we check
+ * the list for any available worker. Note that we maintain a maximum of half
+ * the max_parallel_apply_workers_per_subscription workers in the pool and
+ * after that, we simply exit the worker after applying the transaction.
+ *

"We maintain a worker pool" -> "A worker pool is used"

"We maintain each worker's information" -> "We maintain each worker's
information (ParallelApplyWorkerInfo)"

"we mark it available for re-use" -> "it is marked as available for re-use"

"Note that we maintain a maximum of half" -> "Note that we retain a
maximum of half"

~

1e.

+ * XXX This worker pool threshold is a bit arbitrary and we can provide a GUC
+ * variable for this in the future if required.

"a bit arbitrary" -> "arbitrary"

~

1f.

+ *
+ * The leader apply worker will create a separate dynamic shared memory segment
+ * when each parallel apply worker starts. The reason for this design is that
+ * we cannot count how many workers will be started. It may be possible to
+ * allocate enough shared memory in one segment based on the maximum number of
+ * parallel apply workers (max_parallel_apply_workers_per_subscription), but
+ * this would waste memory if no process is actually started.
+ *

"we cannot count how many workers will be started." -> "we cannot
predict how many workers will be needed."

~

1g.

+ * The dynamic shared memory segment will contain (a) a shm_mq that is used to
+ * send changes in the transaction from leader apply worker to parallel apply
+ * worker (b) another shm_mq that is used to send errors (and other messages
+ * reported via elog/ereport) from the parallel apply worker to leader apply
+ * worker (c) necessary information to be shared among parallel apply workers
+ * and leader apply worker (i.e. members of ParallelApplyWorkerShared).

"will contain (a)" => "contains: (a)"

"worker (b)" -> "worker; (b)

"worker (c)" -> "worker; (c)"

"and leader apply worker" -> "and the leader apply worker"

~

1h.

+ *
+ * Locking Considerations
+ * ----------------------
+ * Since the 

Fwd: Perform streaming logical transactions by background workers and parallel apply

2022-12-02 Thread Peter Smith
---------- Forwarded message ---------
From: Peter Smith 
Date: Sat, Dec 3, 2022 at 8:03 AM
Subject: Re: Perform streaming logical transactions by background
workers and parallel apply
To: Amit Kapila 


On Fri, Dec 2, 2022 at 8:57 PM Amit Kapila  wrote:
>
> On Fri, Dec 2, 2022 at 2:29 PM Peter Smith  wrote:
> >
> > 3. pa_setup_dsm
> >
> > +/*
> > + * Set up a dynamic shared memory segment.
> > + *
> > + * We set up a control region that contains a fixed-size worker info
> > + * (ParallelApplyWorkerShared), a message queue, and an error queue.
> > + *
> > + * Returns true on success, false on failure.
> > + */
> > +static bool
> > +pa_setup_dsm(ParallelApplyWorkerInfo *winfo)
> >
> > IMO that's confusing to say "fixed-size worker info" when it's
> > referring to the ParallelApplyWorkerShared structure and not the other
> > ParallelApplyWorkerInfo.
> >
> > Might be better to say:
> >
> > "a fixed-size worker info (ParallelApplyWorkerShared)" -> "a
> > fixed-size struct (ParallelApplyWorkerShared)"
> >
> > ~~~
> >
>
> I find the existing wording better than what you are proposing. We can
> remove the structure name if you think that is better, but IMO the
> current wording is good.
>

Including the structure name was helpful, but "worker info" made me
wrongly think it was referring to ParallelApplyWorkerInfo (i.e.
"worker info" reads too much like WorkerInfo). So any different way of
saying "worker info" might avoid that confusion.

--
Kind Regards,
Peter Smith.
Fujitsu Australia