On 24.11.2010 07:07, Robert Haas wrote:
Per previous threats, I spent some time tonight running oprofile
(using the directions Tom Lane was foolish enough to provide me back
in May). I took testlibpq.c and hacked it up to just connect to the
server and then disconnect in a tight loop without doing anything
useful, hoping to measure the overhead of starting up a new
connection. Ha, ha, funny about that:
120899 18.0616  postgres        AtProcExit_Buffers
 56891  8.4992  libc-2.11.2.so  memset
 30987  4.6293  libc-2.11.2.so  memcpy
 26944  4.0253  postgres        hash_search_with_hash_value
 26554  3.9670  postgres        AllocSetAlloc
 20407  3.0487  libc-2.11.2.so  _int_malloc
 17269  2.5799  libc-2.11.2.so  fread
 13005  1.9429  ld-2.11.2.so    do_lookup_x
 11850  1.7703  ld-2.11.2.so    _dl_fixup
 10194  1.5229  libc-2.11.2.so  _IO_file_xsgetn
In English: the #1 overhead here is actually something that happens
when processes EXIT, not when they start. Essentially all the time is
in two lines:
 56920  6.6006 :    for (i = 0; i < NBuffers; i++)
               :    {
 98745 11.4507 :        if (PrivateRefCount[i] != 0)
Oh, that's quite surprising.
Anything we can do about this? That's a lot of overhead, and it'd be
a lot worse on a big machine with 8GB of shared_buffers.
Micro-optimizing that search for the first non-zero value helps a little bit
(patch attached). It reduces the percentage shown by oprofile from about 16%
to 12% on my laptop.
For bigger gains, I think you need to somehow make PrivateRefCount itself
smaller. Perhaps use only one byte per buffer instead of an int32, with some
sort of overflow list for the rare case that a buffer is pinned more than 255
times. Or make it a hash table instead of a simple lookup array. But whatever
you do, you have to be very careful not to add overhead to
PinBuffer/UnpinBuffer; those can already be quite high in oprofile reports of
real applications. It might be worth experimenting a bit: at the moment
PrivateRefCount takes up 512 kB of memory per 1 GB of shared_buffers (one
int32 per 8 kB buffer). Machines with a high shared_buffers setting have no
shortage of memory, but a large array like that can waste a lot of precious
CPU cache.
Now, the other question is whether this really matters. Even if we eliminated
that loop in AtProcExit_Buffers altogether, would connect/disconnect still be
so slow that you'd have to use a connection pooler if you do that a lot?
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 54c7109..03593fd 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -1665,11 +1665,20 @@ static void
 AtProcExit_Buffers(int code, Datum arg)
 {
 	int			i;
+	int		   *ptr;
+	int		   *end;
 
 	AbortBufferIO();
 	UnlockBuffers();
 
-	for (i = 0; i < NBuffers; i++)
+	/* Fast search for the first non-zero entry in PrivateRefCount */
+	end = (int *) &PrivateRefCount[NBuffers - 1];
+	ptr = (int *) PrivateRefCount;
+	while (ptr < end && *ptr == 0)
+		ptr++;
+	i = ((int32 *) ptr) - PrivateRefCount;
+
+	for (; i < NBuffers; i++)
 	{
 		if (PrivateRefCount[i] != 0)
 		{
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers