Your patch has been added to the PostgreSQL unapplied patches list at: http://momjian.postgresql.org/cgi-bin/pgpatches
It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.

---------------------------------------------------------------------------

Simon Riggs wrote:
> transaction_guarantee.v11.patch
> - keep current, cleanup, more comments and docs
>
> Brief Performance Analysis
> --------------------------
>
> I've tested 3 scenarios:
> 1. normal
> 2. wal_writer_delay = 100ms
> 3. wal_writer_delay = 100ms and transaction_guarantee = off
>
> On my laptop, with a scale=1 pgbench database with 1 connection I
> consistently get around 85 tps in mode (1), with a slight performance
> drop in mode (2). In mode (3) I get anywhere from 200 to 900 tps,
> depending upon how well cached everything is, with 700 tps being fairly
> typical. fsync = off gives around 900 tps.
>
> Also good speedups with multiple session tests.
>
> make installcheck passes in 120 sec in mode (3), though 155 sec in mode
> (1) and 158 sec in mode (2).
>
> Basic Implementation
> --------------------
>
> xact.c
> xact.h
>
> The basic implementation simply records the LSN of the xlog commit
> record in a shared memory area, the deferred fsync cache.
>
> ipci.c
>
> The cache is protected by an LWLock called DeferredFsyncLock.
>
> lwlock.h
>
> A WALWriter process wakes up regularly to perform a background flush of
> WAL up to the point of the highest LSN in the deferred fsync cache.
>
> walwriter.c
> walwriter.h
> postmaster.c
>
> WALWriter can be enabled only at server start.
> (All above same as March 11 version)
>
> Correctness
> -----------
>
> postgres.c
>
> Only certain code paths can execute transaction_guarantee = off
> transactions, though the main code paths for OLTP allow it.
>
> xlog.c
>
> CreateCheckpoint() must protect against starting a checkpoint when
> commits are not yet flushed, so an additional flush must occur here.
>
> vacuum.c
>
> VACUUM FULL cannot move tuples until their states are all known, so this
> command triggers a background flush also.
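
To make the basic implementation described above easier to follow, here
is a toy model in Python. It is only a sketch of the idea (record each
commit LSN in a locked cache, let a background writer flush WAL up to
the highest cached LSN), not the patch's C code. The names
DeferredFsyncCache, Wal and wal_writer_cycle are hypothetical and chosen
for illustration; only DeferredFsyncLock comes from the description.

```python
import threading


class DeferredFsyncCache:
    """Toy model: commit-record LSNs of transactions not yet flushed."""

    def __init__(self):
        self._lock = threading.Lock()  # stands in for DeferredFsyncLock
        self._pending = {}             # xid -> LSN of its commit record

    def record_commit(self, xid, commit_lsn):
        # Called at commit when the transaction does not wait for the flush.
        with self._lock:
            self._pending[xid] = commit_lsn

    def highest_lsn(self):
        # Returns None when no commit is waiting for a flush.
        with self._lock:
            return max(self._pending.values(), default=None)

    def forget_flushed(self, flushed_lsn):
        # Drop entries whose commit record is now durable.
        with self._lock:
            self._pending = {
                xid: lsn for xid, lsn in self._pending.items()
                if lsn > flushed_lsn
            }


class Wal:
    """Toy WAL that only tracks how far it has been flushed."""

    def __init__(self):
        self.flushed_lsn = 0

    def flush(self, upto_lsn):
        self.flushed_lsn = max(self.flushed_lsn, upto_lsn)


def wal_writer_cycle(cache, wal):
    """One wakeup of the background writer (assumed to run every
    wal_writer_delay milliseconds): flush up to the highest cached LSN."""
    target = cache.highest_lsn()
    if target is not None:
        wal.flush(target)
        cache.forget_flushed(wal.flushed_lsn)
    return wal.flushed_lsn
```

A commit that arrives between highest_lsn() and forget_flushed() is kept
in this model, because only entries at or below the flushed LSN are
dropped; it is picked up on the next cycle.
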
>
> clog.c
> clog.h
> slru.c
> slru.h
>
> Changes to Clog and SLRU enforce the basic rule of WAL-before-data;
> without it, the record of a commit might reach disk before the WAL
> is flushed. This is implemented by storing an LSN for each clog page.
>
> transam.c
> transam.h
> twophase.c
> xact.c
>
> The above files have API changes that allow the LSN at transaction
> commit to be passed through to the Clog.
>
> tqual.c
> tqual.h
> multixact.c
> multixact.h
>
> Visibility hint bits must also not be set before the transaction is
> flushed, so other changes are required to ensure we store the LSN of
> each transaction, not just the maximum LSN. Changes to tqual.c appear
> extensive, though this is just refactoring to allow us to make
> additional function calls before setting bits - there are no functional
> changes to any HeapTupleSatisfies... functions.
>
> xact.c
>
> Contains the module for the Deferred Transaction functions and in
> particular the deferred transaction cache. This could be a separate
> module, since there is only a slight link with the other xact.c code.
>
> User Interface
> --------------
>
> guc.c
> postgresql.conf.sample
> guc_table.h
>
> New parameters have been added, with a new parameter grouping of
> WAL_COMMITS created to control the various commit parameters.
>
> Performance Tuning
> ------------------
>
> The WALWriter wakes up every wal_writer_delay milliseconds. There are
> two protections against mis-setting this parameter.
>
> pmsignal.h
>
> The WALWriter will also be woken by a signal if the DF cache has nearly
> filled and flushing would be desirable.
>
> The WALWriter will also loop without any delay if the number of
> transactions committed while it was writing WAL is above a threshold
> value.
>
> Docs
> ----
> The fsync parameter has been removed from postgresql.conf.sample and the
> docs, though it still exists in this patch to allow performance testing
> during Beta.
> It is suggested that fsync = off should mean the same thing as
> transaction_guarantee = off, wal_writer_delay = 100ms, if it is
> specified in postgresql.conf or on the server command line.
>
> A new section in wal.sgml will describe this in more detail, later.
>
> Open Questions
> --------------
>
> 1. Should the DFC use a standard hash table? Custom code allows both
> additional speed and the ability to signal when it fills.
>
> 2. Should tqual.c update the LSN of a heap page with the LSN of the
> transaction commit that it can read from the DF cache?
>
> 3. Should the WALWriter also do the wal_buffers half-full write at the
> start of XLogInsert()?
>
> 4. The recent changes to remove CheckpointStartLock haven't changed the
> code path for deferred transactions, so a similar solution might be
> possible there also.
>
> 5. Is it correct to do WAL-before-flush for clog only, or should this
> be multixact also?
>
> All of the above are fairly minor changes.
>
> Any other thoughts/comments/tests welcome.
>
> --
> Simon Riggs
> EnterpriseDB http://www.enterprisedb.com

--
  Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us
  EnterpriseDB http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
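
For readers following the Correctness section of the quoted message, the
WAL-before-data rule for clog pages can be pictured with a small Python
model. This is a hedged sketch of the stated idea (each clog page stores
an LSN, and WAL must be flushed that far before the page is written),
not code from the patch. ClogPage, Wal and write_clog_page are
hypothetical names used only for illustration.

```python
class Wal:
    """Toy WAL that tracks the flushed position and counts real flushes."""

    def __init__(self):
        self.flushed_lsn = 0
        self.flush_calls = 0

    def flush(self, upto_lsn):
        # Only do work when the request is beyond what is already durable.
        if upto_lsn > self.flushed_lsn:
            self.flushed_lsn = upto_lsn
            self.flush_calls += 1


class ClogPage:
    """Toy clog page: commit status bits plus one LSN for the whole page."""

    def __init__(self):
        self.committed = set()
        self.page_lsn = 0      # highest commit-record LSN recorded here
        self.on_disk = False

    def set_committed(self, xid, commit_lsn):
        # The commit LSN is passed through so the page remembers how far
        # WAL must be flushed before this page may be written.
        self.committed.add(xid)
        self.page_lsn = max(self.page_lsn, commit_lsn)
        self.on_disk = False


def write_clog_page(page, wal):
    """WAL-before-data: make WAL durable up to page_lsn, then write."""
    if page.page_lsn > wal.flushed_lsn:
        wal.flush(page.page_lsn)
    page.on_disk = True
```

In this model a second write of an unchanged page causes no extra WAL
flush, since the WAL is already durable past the page's LSN.
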