Re: [WIP] Pipelined Recovery

Xuneng Zhou Thu, 25 Jun 2026 00:47:51 -0700

Hi Imran,

On Tue, Jun 23, 2026 at 9:27 PM Imran Zaheer <[email protected]> wrote:
>
> Hi
>
> I am attaching the new series of patches.
>
> What has changed?
>
> * Rebased
>
> * The patch set is now split into two new patches. This will make the
> code easier to understand and review.
>
> * The v4-0003 patch contains code mostly related to keeping the
> recovery states synced between the startup process and the pipeline
> process. Most of these changes were required to make the streaming
> replication work.
>
> * The v4-0002 patch now only contains the consumer code that handles
> receiving the decoded records from the shmem queue and moving the redo
> loop forward.
>
> * The v4-0004 contains some basic tests to see if the pipeline worker
> is functioning as expected. More testing was done by passing
> PG_TEST_INITDB_EXTRA_OPTS="-c wal_pipeline=on" before running the
> recovery test suite.


+1 for splitting the patch set into smaller components to make the
review process smoother.
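
For readers following along, the producer/consumer split described in
the quoted notes can be sketched roughly as below. This is an
illustration only, not code from the patch set: it uses Python threads
and a bounded in-process queue as a stand-in for the pipeline worker
and the shmem queue, and every name in it (producer, consumer,
run_pipeline, SENTINEL, the toy "decode" step) is hypothetical.

```python
import queue
import threading

SENTINEL = None  # hypothetical end-of-stream marker, not part of the patch


def producer(raw_records, q):
    """Stand-in for the pipeline worker: decode records and enqueue them."""
    for lsn, payload in raw_records:
        decoded = {"lsn": lsn, "payload": payload.upper()}  # toy "decode"
        q.put(decoded)  # blocks while the bounded queue is full
    q.put(SENTINEL)


def consumer(q, applied):
    """Stand-in for the startup process: dequeue and 'apply' in order."""
    while True:
        rec = q.get()  # blocks while the queue is empty
        if rec is SENTINEL:
            break
        applied.append((rec["lsn"], rec["payload"]))


def run_pipeline(raw_records, queue_size=4):
    """Run producer and consumer concurrently; return the applied records."""
    q = queue.Queue(maxsize=queue_size)
    applied = []
    worker = threading.Thread(target=producer, args=(raw_records, q))
    worker.start()
    consumer(q, applied)
    worker.join()
    return applied
```

The bounded queue gives back-pressure in both directions: the producer
stalls when it gets too far ahead, and the consumer stalls when no
decoded record is available yet.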

> * Other than that, the cpu overhead during deserialization is
> optimized by skipping multiple copies of the decoded record and
> directly passing the pointer to the shmem queue. There is still some
> overhead visible during serialization that could be improved at the
> producer end.
>
> * Signal handling for the pipeline worker is improved so that
> promotion signals are sent to both the startup process and the
> producer worker by the postmaster.
>
>
> You will also find the new benchmarks attached [1] and the pdf report
> overview. A simple cpu profiling on the pipelined startup process
> shows that the cpu overhead during reading records has now been
> removed and offloaded to the producer worker.
>
> Before pipelining:
>
> Around 50% of the cpu time is spent on fetching the wal record. Note that
> in this workload pipeline is off so don't worry about the new func
> ReceiveRecord(), it's just a wrapper around ReadRecord().
>
>   Children      Self  Command   Shared O  Symbol
> -   98.85%     0.21%  postgres  postgres  [.] PerformWalRecovery
>    - 98.64% PerformWalRecovery
>       - 51.00% ReceiveRecord
>          - 50.78% ReadRecord
>             - 50.52% XLogPrefetcherReadRecord
>                - 49.61% XLogPrefetcherNextBlock
>                   + 25.33% XLogReadAhead
>                   + 22.32% PrefetchSharedBuffer
>                   + 0.76% smgropen
>       - 46.68% ApplyWalRecord
>          + 29.23% heap_redo
>          + 9.51% heap2_redo
>          + 4.74% btree_redo
>          + 1.11% xlog_redo
>          + 0.80% xact_redo
>
>
> After Pipelining:
>
> Here the only cpu work needed to read a record is fetching the decoded
> record from the queue. Most of the time (89.13%) the cpu is busy
> applying the wal record.
>
>   Children      Self  Command   Shared O  Symbol
> -   98.74%     0.37%  postgres  postgres  [.] PerformWalRecovery
>    - 98.37% PerformWalRecovery
>       - 89.13% ApplyWalRecord
>          + 56.89% heap_redo
>          + 18.28% heap2_redo
>          + 8.01% btree_redo
>          + 2.02% xlog_redo
>          + 1.15% xact_redo
>       - 7.80% ReceiveRecord
>          + 7.63% WalPipeline_ReceiveRecord
>
> This cpu optimization can only be observed when the recovery process
> is not I/O bound. Running pgbench on a workload that is fully in
> memory shows around 30% performance gains. You can see more
> benchmarking details in the attached drive link [1].

The perf result looks promising!
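
For a rough sense of scale, the two quoted profiles can be combined
into a back-of-the-envelope estimate. The sketch below is illustrative
arithmetic only, not a measurement: it assumes the absolute cpu cost of
ApplyWalRecord is the same in both runs, so that the growth of its
share reflects a shrinking total. The function name implied_speedup is
hypothetical.

```python
def implied_speedup(apply_share_before, apply_share_after):
    """Ratio of total startup-process cpu time before vs. after.

    Assumes the absolute cost of applying records is unchanged, so
    total_before / total_after == apply_share_after / apply_share_before.
    """
    return apply_share_after / apply_share_before


# "Children" percentages taken from the quoted perf output.
before = {"ApplyWalRecord": 46.68, "ReceiveRecord": 51.00}
after = {"ApplyWalRecord": 89.13, "ReceiveRecord": 7.80}

speedup = implied_speedup(before["ApplyWalRecord"], after["ApplyWalRecord"])
print(f"implied startup-process cpu speedup: {speedup:.2f}x")
```

Under that assumption the startup process needs roughly half the cpu
time per unit of WAL. The end-to-end gain also depends on I/O and on
the producer keeping up, so it need not match this figure (the quoted
in-memory pgbench run reports around 30%).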

> Some comments on the attached pdf and benchmarking: they show that
> the pipeline gives a larger performance advantage when most of the
> workload runs in memory, i.e. when enough shared buffers are
> configured.
>
> If you want to do some experiments, please be my guest; I would be
> happy to see more testing. You can share what performance advantage
> you are getting from this. You can also refer to the benchmarking
> script that I have been using [2].
>
>
> Looking forward to your review, comments, etc.

I haven't had a chance for a meaningful review yet, but expect to do so soon.

--
Regards,
Xuneng Zhou
HighGo Software Co., Ltd.
