Hi
This test is not meaningful. Just as you wouldn't benchmark InnoDB with
innodb_redo_log_capacity or innodb_max_dirty_pages_pct set to their
minimum values, you shouldn't benchmark with max_wal_size = 32MB. You
used an extreme configuration to make the case for double writes. Why
didn't you compare using best-practice settings?

Thanks

On Fri, Feb 27, 2026 at 7:43 PM 陈宗志 <[email protected]> wrote:

> Hi wenhui,
>
> Here are the latest benchmark results for the Double Write Buffer (DWB)
> proposal. In this round of testing, I have included the two-phase
> checkpoint batch fsync optimization and evaluated the impact of
> wal_compression (lz4) on both FPW and DWB.
>
> Test Environment:
> - PostgreSQL: 19devel (with DWB patch applied)
> - Hardware: Linux 5.10, x86_64
> - Configuration:
>   * shared_buffers = 1GB
>   * max_wal_size = 32MB (to stress checkpoint frequency)
>   * wal_compression = lz4
>   * double_write_buffer_size = 128MB (for DWB mode)
> - Workload: sysbench 1.1.0, 10 tables x 1,000,000 rows (~2.3GB dataset)
> - Method: 16 threads, 60 seconds per run, each mode tested
>   independently (only one instance running at a time to eliminate
>   I/O contention).
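[For reference, a sketch of how runs like these are typically invoked
with sysbench 1.x; the loop and connection parameters are placeholders,
not taken from the original message:]

```shell
# Prepare the dataset once, then run each workload for 60 seconds at 16
# threads, matching the parameters described above.
sysbench oltp_write_only \
    --db-driver=pgsql --pgsql-db=sbtest --pgsql-user=postgres \
    --tables=10 --table-size=1000000 \
    prepare

for wl in oltp_write_only oltp_update_non_index oltp_read_write; do
    sysbench "$wl" \
        --db-driver=pgsql --pgsql-db=sbtest --pgsql-user=postgres \
        --tables=10 --table-size=1000000 \
        --threads=16 --time=60 \
        run
done
```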
>
> Three modes compared:
> - FPW: io_torn_pages_protection = full_pages (current default)
> - DWB: io_torn_pages_protection = double_writes
> - OFF: io_torn_pages_protection = off (no protection, baseline)
>
> Results with wal_compression = lz4
> ----------------------------------
> 1. oltp_write_only (pure write transactions: UPDATE + DELETE + INSERT)
>
> Mode         TPS      vs FPW    vs OFF
> ----      ------      ------    ------
> FPW       13,772           -    -64.3%
> DWB       20,660      +50.0%    -46.5%
> OFF       38,588     +180.2%         -
>
> 2. oltp_update_non_index (single UPDATE per transaction)
>
> Mode         TPS      vs FPW    vs OFF
> ----      ------      ------    ------
> FPW       59,427           -    -57.5%
> DWB      104,328      +75.6%    -25.4%
> OFF      139,870     +135.4%         -
>
> 3. oltp_read_write (mixed: 70% reads + 30% writes)
>
> Mode         TPS      vs FPW    vs OFF
> ----      ------      ------    ------
> FPW        6,232           -     -9.0%
> DWB        4,408      -29.3%    -35.6%
> OFF        6,845       +9.8%         -
>
>
> Results without wal_compression (for comparison)
> ------------------------------------------------
> Workload                 FPW      DWB      DWB vs FPW
> --------              ------   ------      ----------
> oltp_write_only        9,651   22,111         +129.1%
> oltp_update_non_index 48,624   98,356         +102.3%
> oltp_read_write        5,414    5,275           -2.6%
>
>
> Key Observations:
>
> 1. Write-heavy workloads: DWB outperforms FPW by +50% to +76% even
>    with lz4 compression enabled. Without lz4, the advantage grows
>    to +102% to +129% because uncompressed full-page images cause
>    severe WAL bloat.
>
> 2. lz4 compression significantly helps FPW: For oltp_write_only, lz4
>    boosts FPW from 9,651 to 13,772 TPS (+43%), while DWB sees minimal
>    change (22,111 -> 20,660). This is expected -- lz4 compresses the
>    8KB full-page images that FPW writes to WAL, but DWB doesn't
>    generate FPIs at all, so lz4 has little effect on DWB's WAL volume.
>
> 3. Read-heavy mixed workloads: DWB shows a regression (-29%) in
>    oltp_read_write with lz4. This workload is 70% reads with only 4
>    write operations per transaction, so FPW overhead is minimal.
>    Meanwhile, DWB incurs additional I/O overhead from writing pages
>    to the double write buffer file, which outweighs the WAL savings
>    in this scenario.
>
> 4. Batch fsync optimization is critical for DWB: The two-phase
>    checkpoint approach (batch all DWB writes in Phase 1 -> single
>    fsync -> data file writes in Phase 2) reduces checkpoint DWB
>    fsyncs from millions to ~hundreds. For example, in
>    oltp_write_only: 1,157,729 DWB page writes -> only 148 fsyncs.
>
> Summary:
>
> DWB provides substantial performance benefits for write-intensive
> workloads with frequent checkpoints, which is the scenario where FPW
> overhead is most pronounced. The advantage is largest without WAL
> compression (+102% to +129%) and remains strong (+50% to +76%) even
> with lz4 enabled. For read-dominated mixed workloads, DWB currently shows
> overhead that needs further optimization (reducing non-checkpoint
> DWB fsync costs).
>
> Regards,
> Baotiao
>
