On Sat, 9 Oct 2004, Tom Lane wrote:

> mmap provides msync which is comparable to fsync, but AFAICS it
> provides no way to prevent an in-memory change from reaching disk too
> soon.  This would mean that WAL entries would have to be written *and
> flushed* before we could make the data change at all, which would
> convert multiple updates of a single page into a series of
> write-and-wait-for-WAL-fsync steps.  Not good.  fsync'ing WAL once per
> transaction is bad enough, once per atomic action is intolerable.

Back when I was working out how to do this, I reckoned that you could
use mmap by keeping a write queue for each modified page. Reading, you'd
have to read the datum from the page and then check the write queue for
that page to see if that datum had been updated, using the new value if
it's there. Writing, you'd add the modified datum to the write queue,
but not apply the write queue to the page until you'd had confirmation
that the corresponding transaction log entry had been written. So
multiple writes are no big deal; they just all queue up in the write
queue, and at any time you can apply as much of the write queue to the
page itself as the current log entry will allow.

There are several different strategies available for mapping and
unmapping the pages, and in fact there might need to be several
available to get the best performance out of different systems. Most
OSes do not seem to be optimized for having thousands or tens of
thousands of small mappings (certainly NetBSD isn't), but I've never
done any performance tests to see what kind of strategies might work
well or not.

cjs
-- 
Curt Sampson  <[EMAIL PROTECTED]>   +81 90 7737 2974   http://www.NetBSD.org
     Make up enjoying your city life...produced by BIC CAMERA
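
A minimal sketch of the per-page write queue described above, in Python
for brevity. Everything named here is an assumption made for
illustration and not PostgreSQL code: the class `QueuedPage`, the use of
an integer log sequence number (`lsn`) to tie each queued write to its
log entry, and a plain bytearray standing in for the mmap'd page. It
also assumes writes are queued in increasing LSN order.

```python
from collections import deque


class QueuedPage:
    """One page plus its queue of writes not yet covered by flushed log."""

    def __init__(self, size):
        self.data = bytearray(size)  # stands in for the mmap'd page
        self.pending = deque()       # (lsn, offset, bytes), oldest first

    def write(self, lsn, offset, value):
        # Queue the change instead of touching the page itself.
        if offset < 0 or offset + len(value) > len(self.data):
            raise ValueError("write falls outside the page")
        self.pending.append((lsn, offset, bytes(value)))

    def read(self, offset, length):
        # Start from the page contents, then overlay queued writes in
        # queue order so that the most recent value wins.
        buf = bytearray(self.data[offset:offset + length])
        for _lsn, w_off, value in self.pending:
            lo = max(offset, w_off)
            hi = min(offset + length, w_off + len(value))
            if lo < hi:
                buf[lo - offset:hi - offset] = value[lo - w_off:hi - w_off]
        return bytes(buf)

    def apply_up_to(self, flushed_lsn):
        # Apply only writes whose log entries are confirmed written.
        applied = 0
        while self.pending and self.pending[0][0] <= flushed_lsn:
            _lsn, w_off, value = self.pending.popleft()
            self.data[w_off:w_off + len(value)] = value
            applied += 1
        return applied
```

Readers always see the newest value because `read` consults the queue,
while the page itself only changes inside `apply_up_to`, so nothing can
reach disk through the mapping ahead of its log entry. Several updates
to one page just accumulate in `pending` without waiting on a log flush.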