On Sat, Jun 2, 2012 at 1:48 AM, Merlin Moncure <mmonc...@gmail.com> wrote:
> Buffer pins aren't a cache: with a cache you are trying to mask a slow
> operation (like a disk I/O) with a faster one so that the number of slow
> operations is minimized. Buffer pins, however, are very different in that
> we only care about contention on the reference count (the buffer itself
> is not locked!), which makes me suspicious that caching-type algorithms
> are the wrong place to be looking. I think it comes down to picking
> between your relatively complex but general lock-displacement approach
> and a specific strategy based on known bottlenecks.
I agree that pins aren't like a cache. I mentioned the caching algorithms
because they work based on access frequency, and highly contended locks are
likely to be accessed frequently, even from a single backend. However, this
only makes sense for the delayed-unpinning method, and I have also come to
the conclusion that it isn't likely to work well. Besides delaying cleanup,
the overhead for the common case of uncontended access is just too high.

It seems to me that even the nailing approach will need a replacement
algorithm. The local pins still need to be published globally, and because
shared memory is fixed in size, the maximum number of locally pinned nailed
buffers needs to be limited as well.

But anyway, I managed to completely misread the profile that Sergey gave.
Somehow I missed that the time went into the retry TAS in s_lock instead of
the inlined TAS. This shows that the issue isn't just cacheline ping-pong
but cacheline stealing. This could be somewhat mitigated by making pinning
lock-free. The Nb-GCLOCK paper that Robert posted earlier in another thread
describes an approach for this. I have a WIP patch (attached) that makes
the clock sweep lock-free in the common case. The patch gave a 40%
performance increase for an extremely allocation-heavy load running with
64 clients on a 4-core, 1-socket system, and lesser gains across the board.
Pinning has a shorter lock duration (and a different lock type), so the
gain might be smaller, or it might be a larger problem and yield a higher
gain.

Either way, I think the nailing approach should be explored further:
cacheline ping-pong could still be a problem with higher numbers of
processors, and losing the spinlock also loses the ability to detect
contention.

Ants Aasma
--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de
Attachment: lockfree-getbuffer.patch (binary data)
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers