Hi,

This test is completely meaningless. Just as you wouldn't set innodb_redo_log_capacity or innodb_max_dirty_pages_pct to their minimum values, you used an extreme configuration here to make the case for double writes. Why didn't you compare using best-practice settings?
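For concreteness, a configuration closer to common production practice might look like the fragment below. The values are illustrative only, not a recommendation; the point is simply that checkpoint pressure should not be artificially maximized the way max_wal_size = 32MB does in the quoted test:

```ini
# postgresql.conf -- illustrative "best practice"-style values, not a
# recommendation; contrast with the max_wal_size = 32MB used in the test.
shared_buffers = 8GB                 # sized to the machine, not 1GB
max_wal_size = 16GB                  # let checkpoint_timeout drive checkpoints
checkpoint_timeout = 15min
checkpoint_completion_target = 0.9
wal_compression = lz4
```

With settings in this range, checkpoints are timed rather than forced every few seconds by WAL volume, so the FPW-after-checkpoint cost that the benchmark stresses would occur far less often.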
Thanks.

On Fri, Feb 27, 2026 at 7:43 PM 陈宗志 <[email protected]> wrote:
> Hi wenhui,
>
> Here are the latest benchmark results for the Double Write Buffer (DWB)
> proposal. In this round of testing, I have included the two-phase
> checkpoint batch fsync optimization and evaluated the impact of
> wal_compression (lz4) on both FPW and DWB.
>
> Test Environment:
> - PostgreSQL: 19devel (with DWB patch applied)
> - Hardware: Linux 5.10, x86_64
> - Configuration:
>   * shared_buffers = 1GB
>   * max_wal_size = 32MB (to stress checkpoint frequency)
>   * wal_compression = lz4
>   * double_write_buffer_size = 128MB (for DWB mode)
> - Workload: sysbench 1.1.0, 10 tables x 1,000,000 rows (~2.3GB dataset)
> - Method: 16 threads, 60 seconds per run, each mode tested
>   independently (only one instance running at a time to eliminate
>   I/O contention).
>
> Three modes compared:
> - FPW: io_torn_pages_protection = full_pages (current default)
> - DWB: io_torn_pages_protection = double_writes
> - OFF: io_torn_pages_protection = off (no protection, baseline)
>
> Results with wal_compression = lz4
> ----------------------------------
> 1. oltp_write_only (pure write transactions: UPDATE + DELETE + INSERT)
>
>    Mode   TPS       vs FPW    vs OFF
>    ----   -------   -------   ------
>    FPW    13,772    -         -64.3%
>    DWB    20,660    +50.0%    -46.5%
>    OFF    38,588    +180.2%   -
>
> 2. oltp_update_non_index (single UPDATE per transaction)
>
>    Mode   TPS       vs FPW    vs OFF
>    ----   -------   -------   ------
>    FPW    59,427    -         -57.5%
>    DWB    104,328   +75.6%    -25.4%
>    OFF    139,870   +135.4%   -
>
> 3. oltp_read_write (mixed: 70% reads + 30% writes)
>
>    Mode   TPS       vs FPW    vs OFF
>    ----   -------   -------   ------
>    FPW    6,232     -         -9.0%
>    DWB    4,408     -29.3%    -35.6%
>    OFF    6,845     +9.8%     -
>
> Results without wal_compression (for comparison)
> ------------------------------------------------
>
>    Workload                FPW      DWB      DWB vs FPW
>    --------                ------   ------   ----------
>    oltp_write_only         9,651    22,111   +129.1%
>    oltp_update_non_index   48,624   98,356   +102.3%
>    oltp_read_write         5,414    5,275    -2.6%
>
> Key Observations:
>
> 1. Write-heavy workloads: DWB outperforms FPW by +50% to +76% even
>    with lz4 compression enabled. Without lz4, the advantage grows
>    to +102% to +129% because uncompressed full-page images cause
>    severe WAL bloat.
>
> 2. lz4 compression significantly helps FPW: For oltp_write_only, lz4
>    boosts FPW from 9,651 to 13,772 TPS (+43%), while DWB sees minimal
>    change (22,111 -> 20,660). This is expected -- lz4 compresses the
>    8KB full-page images that FPW writes to WAL, but DWB doesn't
>    generate FPIs at all, so lz4 has little effect on DWB's WAL volume.
>
> 3. Read-heavy mixed workloads: DWB shows a regression (-29%) in
>    oltp_read_write with lz4. This workload is 70% reads with only 4
>    write operations per transaction, so FPW overhead is minimal.
>    Meanwhile, DWB incurs additional I/O overhead from writing pages
>    to the double write buffer file, which outweighs the WAL savings
>    in this scenario.
>
> 4. Batch fsync optimization is critical for DWB: The two-phase
>    checkpoint approach (batch all DWB writes in Phase 1 -> single
>    fsync -> data file writes in Phase 2) reduces checkpoint DWB
>    fsyncs from millions to ~hundreds. For example, in
>    oltp_write_only: 1,157,729 DWB page writes -> only 148 fsyncs.
>
> Summary:
>
> DWB provides substantial performance benefits for write-intensive
> workloads with frequent checkpoints, which is the scenario where FPW
> overhead is most pronounced. The advantage is most significant without
> WAL compression (+100~130%), and remains strong (+50~76%) even with
> lz4 enabled. For read-dominated mixed workloads, DWB currently shows
> overhead that needs further optimization (reducing non-checkpoint
> DWB fsync costs).
>
> Regards,
> Baotiao
