Marty Scholes <[EMAIL PROTECTED]> writes: > My experience with Oracle (and now limited experience with Pg) is that > the major choke point in performance is not the CPU or read I/O, it is > the log performance of big update and select statements.
If your load is primarily big update statements, maybe so... > Essentially, the data is written twice: first to the log and then the > data files. This would be ok except the transaction is regularly frozen > while the log files sync to disk with a bunch of tiny (8KB for Oracle > and Pg) write requests. I don't think I buy that claim. We don't normally fsync the log file except at transaction commit (and read-only transactions don't generate any commit record, so they don't cause an fsync). If a single transaction is generating lots of log data, it doesn't have to wait for that data to hit disk before it can do more stuff. But having said that --- on some platforms our default WAL sync method is open_datasync, which could result in the sort of behavior you are talking about. Try experimenting with the other possible values of wal_sync_method to see if you like them better. > If a transaction will do large updates or inserts, why don't we just log > the parsed statements in the WAL instead of the individual data blocks > that have changed? As already pointed out, this would not give enough information to reproduce the database state. > Some informal testing suggests that we get a factor of 8 improvement in > speed here if we completely disable fsync() in large updates under Pg. That probably gets you into a situation where no I/O is really happening at all, it's just being absorbed by kernel disk buffers. Unfortunately that doesn't have a lot to do with the performance you can get if you want to be sure you don't lose data ... BTW, one thing you can do to reduce the WAL I/O volume in Postgres is to increase the inter-checkpoint interval (there are two settings to increase, one time-based and one volume-based). The first write of a given data page after a checkpoint dumps the whole page into WAL, as a safety measure to deal with partial page writes during power failures. So right after a checkpoint the WAL volume goes way up. With a longer interval between checkpoints you don't pay that price as often. regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]