Re: [HACKERS] PrivateRefCount patch has got issues

2015-01-17 Thread Andres Freund
Hi,

Sorry for taking long to get back to this...

On 2014-12-21 13:21:56 -0500, Tom Lane wrote:
 The idea I'd been wondering about hinged on the same observation that we
 know the buffer is not pinned (by our process) already, but the mechanics
 would be closer to what we do in resource managers: reserve space first,
 do the thing that needs to be remembered, bump the count using the
 reserved space.  Given the way you've set this up, the idea boils down to
 having a precheck call that forces there to be an empty slot in the local
 fastpath array (by pushing something out to the hash table if necessary)
 before we start trying to pin the buffer.  Then it's guaranteed that the
 record step will succeed.

After pondering the problem for a while I agree that this is the best
approach. I've implemented it and it imo actually looks cleaner than
before because GetPrivateRefCountEntry() is much simpler.

 You could possibly even arrange it so that
 it's known which array entry needs to be used and then the record part
 is just a couple of inline instructions, so that it'd be OK to do that
 while still holding the spinlock.  Otherwise it would still be a good idea
 to do the record after releasing the spinlock IMO; but either way this
 avoids the issue of possible state inconsistency due to a midflight
 palloc failure.

I've indeed done it in a way that the reserved entry is
remembered. Primarily because that will, in many situations, allow us to
avoid searching the array for a new entry because we can setup a
reserved slot when forgetting a private refcount entry.  While the
previous reservation would make it safe to do the private tracking with
the spinlock held I've moved it to after the release; based on our
observation that it's never used for buffers already locally
pinned. Also added an assert checking that that actually still is the
case.  That observation also allows us to entirely forgo searching the
private refcount array in PinBuffer_locked(), we can just directly enter
the new entry. That is actually visible in profiles that are much larger
than s_b.


I've just written this patch down in one go, so I'll let the topic rest
for a couple days and commit it then after looking through it again.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services
From 233092e994b843b61d2841d222a294a65515e312 Mon Sep 17 00:00:00 2001
From: Andres Freund and...@anarazel.de
Date: Sat, 17 Jan 2015 15:53:55 +0100
Subject: [PATCH] Fix various shortcomings of the new PrivateRefCount
 infrastructure.

As noted by Tom Lane the improvements in 4b4b680c3d6 had the problem
that in some situations we searched, entered and modified entries in
the private refcount hash while holding a spinlock. I had tried to
keep the logic entirely local to PinBuffer_Locked(), but that's not
really possible given it's called with a spinlock held...

Besides being disadvantageous from a performance point of view, this
also has problems with error handling safety. If we failed inserting
an entry into the hashtable due to an out of memory error, we'd error
out with a held spinlock. Not good.

Change the way private refcounts are manipulated: Before a buffer can
be tracked an entry has to be reserved using
ReservePrivateRefCountEntry(); then, if a entry is not found using
GetPrivateRefCountEntry(), it can be entered with
NewPrivateRefCountEntry().

Also take advantage of the fact that PinBuffer_Locked() currently is
never called for buffers that already have been pinned by the current
backend and don't search the private refcount entries for preexisting
local pins. That results in a small, but measurable, performance
improvement.

Additionally make ReleaseBuffer() always call UnpinBuffer() for shared
buffers. That avoids doing duplicating work in an eventual
UnpinBuffer() call that already has been done in ReleaseBuffer() and
saves some code.

Per discussion with Tom Lane.

Discussion: 15028.1418772...@sss.pgh.pa.us
---
 src/backend/storage/buffer/bufmgr.c | 359 
 1 file changed, 195 insertions(+), 164 deletions(-)

diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 7eed0ba..77a8e0e 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -106,7 +106,7 @@ static volatile BufferDesc *PinCountWaitBuf = NULL;
  *
  *
  * To avoid - as we used to - requiring an array with NBuffers entries to keep
- * track of local buffers we use a small sequentially searched array
+ * track of local buffers, we use a small sequentially searched array
  * (PrivateRefCountArray) and a overflow hash table (PrivateRefCountHash) to
  * keep track of backend local pins.
  *
@@ -117,40 +117,129 @@ static volatile BufferDesc *PinCountWaitBuf = NULL;
  *
  * Note that in most scenarios the number of pinned buffers will not exceed
  * REFCOUNT_ARRAY_ENTRIES.
+ *
+ *
+ * To 

Re: [HACKERS] PrivateRefCount patch has got issues

2014-12-21 Thread Andres Freund
Hi,

On 2014-12-16 18:25:13 -0500, Tom Lane wrote:
 I just happened to look into bufmgr.c for the first time in awhile, and
 noticed the privaterefcount-is-no-longer-a-simple-array stuff.  It doesn't
 look too well thought out to me.  In particular, PinBuffer_Locked calls
 GetPrivateRefCountEntry while holding a buffer-header spinlock.  That
 seems completely unacceptable.

Argh, yes. That certainly isn't ok.

The easiest way to fix that seems to be to declare that PinBuffer_Locked
can only be used when we're guaranteed to not have pinned the
buffer. That happens to be true for all the existing users. In fact all
of them even seem to require the refcount to be zero across all
backends.  That prerequisite then allows to increase the buffer header
refcount before releasing the spinlock *and* before increasing the
private refcount.

 It's also depressing that the very common code path
 ReleaseBuffer-UnpinBuffer results in a double search of the
 array/hashtable; that should be refactored to avoid that.

Sounds like a good idea, will see how that works out to look.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PrivateRefCount patch has got issues

2014-12-21 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2014-12-16 18:25:13 -0500, Tom Lane wrote:
 I just happened to look into bufmgr.c for the first time in awhile, and
 noticed the privaterefcount-is-no-longer-a-simple-array stuff.  It doesn't
 look too well thought out to me.  In particular, PinBuffer_Locked calls
 GetPrivateRefCountEntry while holding a buffer-header spinlock.  That
 seems completely unacceptable.

 Argh, yes. That certainly isn't ok.

 The easiest way to fix that seems to be to declare that PinBuffer_Locked
 can only be used when we're guaranteed to not have pinned the
 buffer. That happens to be true for all the existing users. In fact all
 of them even seem to require the refcount to be zero across all
 backends.  That prerequisite then allows to increase the buffer header
 refcount before releasing the spinlock *and* before increasing the
 private refcount.

Hm, if you do it like that, what happens if we get a palloc failure while
trying to record the private refcount?  I think you must not bump the pin
count in shared memory unless you're certain you can record the fact that
you've done so.

The idea I'd been wondering about hinged on the same observation that we
know the buffer is not pinned (by our process) already, but the mechanics
would be closer to what we do in resource managers: reserve space first,
do the thing that needs to be remembered, bump the count using the
reserved space.  Given the way you've set this up, the idea boils down to
having a precheck call that forces there to be an empty slot in the local
fastpath array (by pushing something out to the hash table if necessary)
before we start trying to pin the buffer.  Then it's guaranteed that the
record step will succeed.  You could possibly even arrange it so that
it's known which array entry needs to be used and then the record part
is just a couple of inline instructions, so that it'd be OK to do that
while still holding the spinlock.  Otherwise it would still be a good idea
to do the record after releasing the spinlock IMO; but either way this
avoids the issue of possible state inconsistency due to a midflight
palloc failure.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] PrivateRefCount patch has got issues

2014-12-16 Thread Tom Lane
I just happened to look into bufmgr.c for the first time in awhile, and
noticed the privaterefcount-is-no-longer-a-simple-array stuff.  It doesn't
look too well thought out to me.  In particular, PinBuffer_Locked calls
GetPrivateRefCountEntry while holding a buffer-header spinlock.  That
seems completely unacceptable.  It's certainly a huge violation of our
design principle that spinlocks should be held for only a few
instructions; and I rather suspect that a palloc failure down inside the
hashtable entry-allocation code would leave things in a bad state.  It's
also depressing that the very common code path ReleaseBuffer-UnpinBuffer
results in a double search of the array/hashtable; that should be
refactored to avoid that.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers