Hannu Krosing <[EMAIL PROTECTED]> writes:
> The writer process should just issue a continuous stream of
> aio_write()'s while there are any waiters and keep track which waiters
> are safe to continue - thus no guessing of who's gonna commit.

This recipe sounds like "eat I/O bandwidth whether we need it or not".
It might be optimal in the case where activity is so heavy that we do
actually need a WAL write on every disk revolution, but in any scenario
where we're not maxing out the WAL disk's bandwidth, it will hurt
performance. In particular, it would seriously degrade performance if
the WAL file isn't on its own spindle but has to share bandwidth with
data file access.

What we really want, of course, is "write on every revolution where
there's something worth writing" --- either we've filled a WAL block
or there is a commit pending. But that just gets us back into the same
swamp of how-do-you-guess-whether-more-commits-will-arrive-soon. I don't
see how an extra process makes that problem any easier.

BTW, it would seem to me that aio_write() buys nothing over plain write()
in terms of ability to gang writes. If we issue the write at time T
and it completes at T+X, we really know nothing about exactly when in
that interval the data was read out of our WAL buffers. We cannot
assume that commit records that were stored into the WAL buffer during
that interval got written to disk. The only safe assumption is that
only records that were in the buffer at time T are down to disk; and
that means that late arrivals lose. You can't issue aio_write
immediately after the previous one completes and expect that this
optimizes performance --- you have to delay it as long as you possibly
can in hopes that more commit records arrive. So it comes down to being
the same problem.

			regards, tom lane
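
P.S. To make the aio_write() argument concrete, here is a toy model in Python. It is only a sketch under stated assumptions, not anything taken from the backend: the function names, the arrival times, and the 8-unit write duration are all made up for illustration, and time is in arbitrary units. The one rule it encodes is the one stated above: a write issued at time T can be assumed to cover only the commit records that were already in the buffer at T.

```python
def durable_after_write(arrival_times, issue_time):
    """Commit records we may assume are on disk once a write issued at
    issue_time completes: only those already buffered at issue time."""
    return [t for t in arrival_times if t <= issue_time]


def count_writes(arrival_times, write_duration, delay):
    """Count the writes needed to flush every commit record.

    Hypothetical policy: each write is issued `delay` time units after
    the point where the writer is free and at least one record is pending.
    """
    pending = sorted(arrival_times)
    now = 0.0
    writes = 0
    while pending:
        issue_time = max(now, pending[0]) + delay
        covered = durable_after_write(pending, issue_time)
        # `pending` is sorted, so the covered records form a prefix.
        pending = pending[len(covered):]
        now = issue_time + write_duration
        writes += 1
    return writes


# Made-up workload: four commits arrive one time unit apart.
arrivals = [0.0, 1.0, 2.0, 3.0]

# Issuing immediately: the first write covers only the first commit,
# so the three late arrivals have to wait for a second write.
eager = count_writes(arrivals, write_duration=8.0, delay=0.0)

# Delaying the write by 3 units lets all four commits share one write.
delayed = count_writes(arrivals, write_duration=8.0, delay=3.0)
```

In this toy workload the eager policy needs two writes and the delayed one needs a single write, but picking that delay required knowing the arrival times in advance, which is exactly the guess we can't make.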