Neil Conway <[EMAIL PROTECTED]> writes:
> - to modify a BufferDesc's meta data (shared refcount, flags, tag,
> etc.), one must hold the buffer's "meta data lock". Also, I
> removed the existing "io_in_progress" lock; instead, we now hold
> the buffer's meta data lock while doing buffer I/O.

The latter is a really bad idea IMHO. The io_in_progress lock can be
held for eons (in CPU terms) and should not be blocking people who
simply want to bump their refcount up and down. In particular, there
is no reason for an in-process write to block people who want
read-only access to the buffer. It's possible that you could combine
the io_in_progress lock with the cntx_lock, but I doubt locking out
access to the metadata during i/o is a good idea.
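To make the problem concrete: under your scheme even a bare pin has to
take the per-buffer meta data lock, and so it queues up behind any
write that happens to be in flight. Roughly this ("meta_lock" is just
shorthand for whatever the patch calls the per-buffer lock):

    /*
     * Illustrative only: a reader that merely wants to bump the shared
     * refcount still waits out the entire smgrwrite() being performed
     * under the same per-buffer lock.
     */
    LWLockAcquire(buf->meta_lock, LW_EXCLUSIVE);
    buf->refcount++;
    LWLockRelease(buf->meta_lock);

With a separate io_in_progress lock, the pin only has to wait for the
brief interval in which the metadata is actually being changed.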
> - if a backend holds the BufMgrLock, it will never try to
> LWLockAcquire() a per-buffer meta data lock, due to the risk of
> deadlock (and the loss in concurrency if we got blocked waiting
> while still holding the BufMgrLock). Instead it does a
> LWLockConditionalAcquire() and handles an acquisition failure
> sanely

Hmm, what's "sanely" exactly? It seems to me that figuring out how to
not need to lock in this direction is a critical part of the redesign,
so you can't just handwave here and expect people to understand.

> - Remove SetBufferCommitInfoNeedsSave(). AFAICS, this is now
> completely equivalent to WriteNoReleaseBuffer(), so I just removed
> the former and replaced all the calls to it with calls to the latter.

The reason I've kept the separation was as a form of documentation as
to the reason for each write. Although they currently do the same
thing, that might not always be true. I'd prefer not to eliminate the
distinction from the source code --- though I'd not object if you want
to make SetBufferCommitInfoNeedsSave a macro that invokes the other
routine (see the sketch at the end of this mail).

> - Removed the following now-unused buffer flag bits: BM_VALID,
> BM_DELETED, BM_JUST_DIRTIED, BM_IO_IN_PROGRESS, and shrunk the
> 'flags' field of the BufferDesc down to 8 bits (from 16)

I cannot believe that it's workable to remove BM_JUST_DIRTIED. How are
you handling the race conditions that this was for? I'm also
unconvinced that removing BM_IO_IN_PROGRESS is a good idea.

> - Make 'PrivateRefCount' an array of uint16s, rather than longs. This
> saves 2 bytes * shared_buffers per backend on 32-bit machines and 6
> bytes * shared_buffers per backend on some 64-bit machines. It means
> a given backend can only pin a single buffer 65,535 times, but that
> should be more than enough. Similarly, made LocalRefCount an array
> of uint16s.

I think both of these are ill-considered micro-optimizations. How do
you know that the pin count can't exceed 64K? Consider recursive
plpgsql functions for a likely counterexample.

> I was thinking of adding overflow checking to the refcount increment
> code to make sure we fail safely if a backend *does* try to exceed
> this number of pins, but I can't imagine a scenario when it would
> actually occur, so I haven't bothered.

Leaving the arrays as longs is a much saner approach IMHO.

> - Remove the BufferLocks array. AFAICS this wasn't actually necessary.

Please put that back. It is there to avoid unnecessary acquisitions of
buffer locks during UnlockBuffers (which is executed during any
transaction abort). Without it, you will need to lock every single
buffer during an abort in order to check its flags (see the second
sketch at the end of this mail).

			regards, tom lane
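PS: By "a macro that invokes the other routine" I mean nothing fancier
than this sketch (names as in the current bufmgr headers; untested):

    /*
     * Keep the old name purely as documentation of why the buffer is
     * being dirtied; it simply expands to the existing routine.
     */
    #define SetBufferCommitInfoNeedsSave(buffer)  WriteNoReleaseBuffer(buffer)

And on BufferLocks: the point of the array is that UnlockBuffers can
skip buffers this backend never locked, roughly as the current loop in
bufmgr.c does (simplified from memory):

    int     i;

    /*
     * At transaction abort, only buffers with a nonzero BufferLocks
     * entry need to have their cntx_lock looked at.
     */
    for (i = 0; i < NBuffers; i++)
    {
        bits8   buflocks = BufferLocks[i];

        if (buflocks == 0)
            continue;           /* never locked this buffer */

        /* ... release whatever locks the flags say we still hold ... */
    }

Without BufferLocks, that early "continue" goes away and every abort
has to take every buffer's lock just to discover there is nothing to do.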