[GENERAL] PostgreSQL hanging on new connections?

2010-11-29 Thread hubert depesz lubaczewski
hi,
got (another time, it's not the first thing) this situation:
1. server: 64gb of ram, 24 cores of xeon processors (hyperthreading is off), 
storage is fusion io.
2. at one point load skyrocketed to 48 (i know, it's not high with this
hardware)
3. we were not able to connect to PostgreSQL from psql - psql when run
just froze.
4. postgresql: PostgreSQL 8.3.12 on x86_64-redhat-linux-gnu, compiled by
GCC gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-48)

What I did gather:
straced postmaster when the problem was happening, and I was opening new
connections. strace looks like this:

16:54:36.886720 select(6, [3 4 5], NULL, NULL, {54, 366000}) = 1 (in [4], left 
{51, 563000})
16:54:39.768318 rt_sigprocmask(SIG_SETMASK, ~[ILL TRAP ABRT BUS FPE SEGV CONT 
SYS RTMIN RT_1], NULL, 8) = 0
16:54:39.770795 accept(4, {sa_family=AF_INET, sin_port=htons(59661), 
sin_addr=inet_addr("10.50.50.7")}, [16]) = 8
16:54:39.773895 getsockname(8, {sa_family=AF_INET, sin_port=htons(5432), 
sin_addr=inet_addr("10.60.0.10")}, [16]) = 0
16:54:39.774522 setsockopt(8, SOL_TCP, TCP_NODELAY, [1], 4) = 0
16:54:39.774579 setsockopt(8, SOL_SOCKET, SO_KEEPALIVE, [1], 4) = 0
16:54:39.774729 clone(child_stack=0, 
flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, 
child_tidptr=0x2b06874a4460) = 12891
16:54:39.854208 close(8)= 0
16:54:39.855800 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
16:54:39.856853 select(6, [3 4 5], NULL, NULL, {60, 0}) = 1 (in [4], left {57, 
842000})
16:54:42.018570 rt_sigprocmask(SIG_SETMASK, ~[ILL TRAP ABRT BUS FPE SEGV CONT 
SYS RTMIN RT_1], NULL, 8) = 0
16:54:42.027079 accept(4, {sa_family=AF_INET, sin_port=htons(59662), 
sin_addr=inet_addr("10.50.50.7")}, [16]) = 8
16:54:42.046511 getsockname(8, {sa_family=AF_INET, sin_port=htons(5432), 
sin_addr=inet_addr("10.60.0.10")}, [16]) = 0
16:54:42.095917 setsockopt(8, SOL_TCP, TCP_NODELAY, [1], 4) = 0
16:54:42.109190 setsockopt(8, SOL_SOCKET, SO_KEEPALIVE, [1], 4) = 0
16:54:42.111603 clone(child_stack=0, 
flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, 
child_tidptr=0x2b06874a4460) = 12912
16:54:42.114379 close(8) = 0
16:54:42.120551 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
16:54:42.128922 select(6, [3 4 5], NULL, NULL, {60, 0}) = 1 (in [4], left {58, 
593000})

so, everything looks right.

but - I was also stracing backends - just after creation (I did strace -ff -p 
 -o x.log).

strace from single backend, just forked off:

16:54:39.852638 close(3)= 0
16:54:39.852823 close(4)= 0
16:54:39.852872 close(5)= 0
16:54:39.852910 close(6)= 0
16:54:39.852994 setsid()= 12891
16:54:39.853050 rt_sigaction(SIGTERM, {0x5ae490, [], SA_RESTORER|SA_RESTART, 
0x3bd66302d0}, {0x588580, [], SA_RESTORER|SA_RESTART, 0x3bd66302d0}, 8) = 0
16:54:39.853169 rt_sigaction(SIGQUIT, {0x5ae490, [], SA_RESTORER|SA_RESTART, 
0x3bd66302d0}, {0x588580, [], SA_RESTORER|SA_RESTART, 0x3bd66302d0}, 8) = 0
16:54:39.853215 rt_sigaction(SIGALRM, {0x5ae490, [], SA_RESTORER, 
0x3bd66302d0}, {0x1, [], SA_RESTORER, 0x3bd66302d0}, 8) = 0
16:54:39.853259 rt_sigprocmask(SIG_SETMASK, ~[QUIT ILL TRAP ABRT BUS FPE SEGV 
ALRM TERM CONT SYS RTMIN RT_1], NULL, 8) = 0
16:54:39.853504 write(2, "\0\0^\0[2\0\0t2010-11-29 16:54:39 UTC  [12891]: [1-1] 
LOG:  connection received: host=10.50.50.7 port=59661\n", 103) = 103
16:54:39.853622 setitimer(ITIMER_REAL, {it_interval={0, 0}, it_value={60, 0}}, 
NULL) = 0
16:54:39.853689 recvfrom(8, "\0\0\0\10\4\322\26/", 8192, 0, NULL, NULL) = 8
16:54:39.853822 sendto(8, "N", 1, 0, NULL, 0) = 1
16:54:39.853901 recvfrom(8, 
"\0\0\0%\0\3\0\0user\0web\0database\0x\0\0", 8192, 0, NULL, NULL) = 37
16:54:39.856007 setitimer(ITIMER_REAL, {it_interval={0, 0}, it_value={0, 0}}, 
NULL) = 0
16:54:39.856059 rt_sigprocmask(SIG_SETMASK, ~[ILL TRAP ABRT BUS FPE SEGV CONT 
SYS RTMIN RT_1], NULL, 8) = 0
16:54:39.856130 write(2, "\0\0r\0[2\0\0t2010-11-29 16:54:39 UTC 
10.50.50.7(59661) [12891]: [2-1] LOG:  connection authorized: user=web 
database=x\n", 123) = 123
16:54:39.856255 rt_sigaction(SIGHUP, {0x5ad340, [], SA_RESTORER|SA_RESTART, 
0x3bd66302d0}, {0x588a90, [], SA_RESTORER|SA_RESTART, 0x3bd66302d0}, 8) = 0
16:54:39.856315 rt_sigaction(SIGINT, {0x5ae350, [], SA_RESTORER|SA_RESTART, 
0x3bd66302d0}, {0x588580, [], SA_RESTORER|SA_RESTART, 0x3bd66302d0}, 8) = 0
16:54:39.856359 rt_sigaction(SIGTERM, {0x5ae3f0, [], SA_RESTORER|SA_RESTART, 
0x3bd66302d0}, {0x5ae490, [], SA_RESTORER|SA_RESTART, 0x3bd66302d0}, 8) = 0
16:54:39.856402 rt_sigaction(SIGQUIT, {0x5ae4a0, [], SA_RESTORER|SA_RESTART, 
0x3bd66302d0}, {0x5ae490, [], SA_RESTORER|SA_RESTART, 0x3bd66302d0}, 8) = 0
16:54:39.856445 rt_sigaction(SIGALRM, {0x5a6160, [], SA_RESTORER, 
0x3bd66302d0}, {0x5ae490, [], SA_RESTORER, 0x3bd66302d0}, 8) = 0
16:54:39.856484 rt_sigaction(SIGPIPE, {0x1, [], SA_RESTORER|SA_RESTART, 
0x3bd66302d0}, {0x1, [], SA_RESTORER|SA_RESTART, 0x3bd66302d0}, 8) = 0
16:54:39.856526 rt_sigaction(SIGUSR1, {0x5a1740, [], S

Re: [GENERAL] PostgreSQL hanging on new connections?

2010-11-29 Thread Tom Lane
hubert depesz lubaczewski  writes:
> straced postmaster when the problem was happening, and I was opening new
> connections. strace looks like this:
> [ backend hangs on semop immediately after reading global/pg_database ]

It looks like something had exclusive lock on the database that new
connections wanted to connect to.  AFAICS the only action in 8.3 that
would do that would be a DROP DATABASE or RENAME DATABASE.  What was
that other session doing?

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL hanging on new connections?

2010-11-29 Thread hubert depesz lubaczewski
On Mon, Nov 29, 2010 at 01:23:00PM -0500, Tom Lane wrote:
> hubert depesz lubaczewski  writes:
> > straced postmaster when the problem was happening, and I was opening new
> > connections. strace looks like this:
> > [ backend hangs on semop immediately after reading global/pg_database ]
> 
> It looks like something had exclusive lock on the database that new
> connections wanted to connect to.  AFAICS the only action in 8.3 that
> would do that would be a DROP DATABASE or RENAME DATABASE.  What was
> that other session doing?

just some select.

as far as I know there are no backends with such commands - drop/rename
db.

depesz

-- 
Linkedin: http://www.linkedin.com/in/depesz  /  blog: http://www.depesz.com/
jid/gtalk: dep...@depesz.com / aim:depeszhdl / skype:depesz_hdl / gg:6749007

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL hanging on new connections?

2010-11-29 Thread hubert depesz lubaczewski
On Mon, Nov 29, 2010 at 01:23:00PM -0500, Tom Lane wrote:
> hubert depesz lubaczewski  writes:
> > straced postmaster when the problem was happening, and I was opening new
> > connections. strace looks like this:
> > [ backend hangs on semop immediately after reading global/pg_database ]
> 
> It looks like something had exclusive lock on the database that new
> connections wanted to connect to.  AFAICS the only action in 8.3 that
> would do that would be a DROP DATABASE or RENAME DATABASE.  What was
> that other session doing?

verified in logs. there were no alter database or drop database
statements.

Best regards,

depesz

-- 
Linkedin: http://www.linkedin.com/in/depesz  /  blog: http://www.depesz.com/
jid/gtalk: dep...@depesz.com / aim:depeszhdl / skype:depesz_hdl / gg:6749007

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL hanging on new connections?

2010-11-29 Thread Vick Khera
On Mon, Nov 29, 2010 at 1:23 PM, Tom Lane  wrote:
> hubert depesz lubaczewski  writes:
>> straced postmaster when the problem was happening, and I was opening new
>> connections. strace looks like this:
>> [ backend hangs on semop immediately after reading global/pg_database ]
>
> It looks like something had exclusive lock on the database that new
> connections wanted to connect to.  AFAICS the only action in 8.3 that
> would do that would be a DROP DATABASE or RENAME DATABASE.  What was
> that other session doing?

Every once in a while when I connect to my big DB it feels like it
stalls the connection for a few seconds.  I have no idea what causes
it, but it does feel like it has to do with the load.  The load is not
all that high relative to what my box can handle.  It doesn't happen
often enough for me to track it down, though.  It instinct is that it
is waiting on a lock, but clearly there is no rename/drop happening on
my main db else i'd be out of business :-)  I too run 8.3 as primary
right now... testing 9.0 for deployment soon-ish.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL hanging on new connections?

2010-11-29 Thread Joshua D. Drake
On Mon, 2010-11-29 at 15:57 -0500, Vick Khera wrote:
> On Mon, Nov 29, 2010 at 1:23 PM, Tom Lane  wrote:
> > hubert depesz lubaczewski  writes:
> >> straced postmaster when the problem was happening, and I was opening new
> >> connections. strace looks like this:
> >> [ backend hangs on semop immediately after reading global/pg_database ]
> >
> > It looks like something had exclusive lock on the database that new
> > connections wanted to connect to.  AFAICS the only action in 8.3 that
> > would do that would be a DROP DATABASE or RENAME DATABASE.  What was
> > that other session doing?
> 
> Every once in a while when I connect to my big DB it feels like it
> stalls the connection for a few seconds.  I have no idea what causes
> it, but it does feel like it has to do with the load.  The load is not
> all that high relative to what my box can handle.  It doesn't happen
> often enough for me to track it down, though.  It instinct is that it
> is waiting on a lock, but clearly there is no rename/drop happening on
> my main db else i'd be out of business :-)  I too run 8.3 as primary
> right now... testing 9.0 for deployment soon-ish.

Initiating the connection all the caches etc? Perhaps trying to allocate
shared memory?

JD 

-- 
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL hanging on new connections?

2010-11-29 Thread Ben Chobot
On Nov 29, 2010, at 12:57 PM, Vick Khera wrote:

> On Mon, Nov 29, 2010 at 1:23 PM, Tom Lane  wrote:
>> hubert depesz lubaczewski  writes:
>>> straced postmaster when the problem was happening, and I was opening new
>>> connections. strace looks like this:
>>> [ backend hangs on semop immediately after reading global/pg_database ]
>> 
>> It looks like something had exclusive lock on the database that new
>> connections wanted to connect to.  AFAICS the only action in 8.3 that
>> would do that would be a DROP DATABASE or RENAME DATABASE.  What was
>> that other session doing?
> 
> Every once in a while when I connect to my big DB it feels like it
> stalls the connection for a few seconds.  I have no idea what causes
> it, but it does feel like it has to do with the load.  The load is not
> all that high relative to what my box can handle.  It doesn't happen
> often enough for me to track it down, though.  It instinct is that it
> is waiting on a lock, but clearly there is no rename/drop happening on
> my main db else i'd be out of business :-)  I too run 8.3 as primary
> right now... testing 9.0 for deployment soon-ish.

We're having similar issues on 8.4.[245]... occasionally psql takes anywhere 
from a few to several dozen seconds to connect. I've been unsuccessfully trying 
to blame spikes in the OS run queue (we desperately need some connection 
pooling) but if it's something to do with locks I can't see in pg_locks, that 
would explain why I haven't been able to figure out what's going on yet
-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL hanging on new connections?

2010-11-29 Thread Dusan Misic
On Mon, Nov 29, 2010 at 10:03 PM, Ben Chobot  wrote:

> On Nov 29, 2010, at 12:57 PM, Vick Khera wrote:
>
> > On Mon, Nov 29, 2010 at 1:23 PM, Tom Lane  wrote:
> >> hubert depesz lubaczewski  writes:
> >>> straced postmaster when the problem was happening, and I was opening
> new
> >>> connections. strace looks like this:
> >>> [ backend hangs on semop immediately after reading global/pg_database ]
> >>
> >> It looks like something had exclusive lock on the database that new
> >> connections wanted to connect to.  AFAICS the only action in 8.3 that
> >> would do that would be a DROP DATABASE or RENAME DATABASE.  What was
> >> that other session doing?
> >
> > Every once in a while when I connect to my big DB it feels like it
> > stalls the connection for a few seconds.  I have no idea what causes
> > it, but it does feel like it has to do with the load.  The load is not
> > all that high relative to what my box can handle.  It doesn't happen
> > often enough for me to track it down, though.  It instinct is that it
> > is waiting on a lock, but clearly there is no rename/drop happening on
> > my main db else i'd be out of business :-)  I too run 8.3 as primary
> > right now... testing 9.0 for deployment soon-ish.
>
> We're having similar issues on 8.4.[245]... occasionally psql takes
> anywhere from a few to several dozen seconds to connect. I've been
> unsuccessfully trying to blame spikes in the OS run queue (we desperately
> need some connection pooling) but if it's something to do with locks I can't
> see in pg_locks, that would explain why I haven't been able to figure out
> what's going on yet
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>

This is normal. PostgreSQL needs to create new server process to handle your
requested connection.

Then it needs to allocate resources to that new connection. It initializes
shared memory for that connection. That is the stall you are mentioning.

If you are time sensitive on connection times, then use connection pooling
software like pg-bouncer or pg-pool2. Pooling will keep your server instance
"hot" and those lags won't happen.


Re: [GENERAL] PostgreSQL hanging on new connections?

2010-11-30 Thread Craig Ringer

On 11/30/2010 03:28 PM, Dusan Misic wrote:

We're having similar issues on 8.4.[245]... occasionally psql takes
anywhere from a few to several dozen seconds to connect. I've been
unsuccessfully trying to blame spikes in the OS run queue (we
desperately need some connection pooling) but if it's something to
do with locks I can't see in pg_locks, that would explain why I
haven't been able to figure out what's going on yet
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org
)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


This is normal. PostgreSQL needs to create new server process to handle
your requested connection.

Then it needs to allocate resources to that new connection. It
initializes shared memory for that connection. That is the stall you are
mentioning.


Eh, what?

Forking a backend and attaching to shared memory should *not* take "a 
few seconds". On my test machine it takes 100ms to fork psql, connect to 
the postmaster, fork a backend, init the backend, authenticate, run a 
dummy query and exit psql.


If you're seeing delays like that, your machine is horrifyingly 
overloaded or there's something else wrong.


--
Craig Ringer

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL hanging on new connections?

2010-11-30 Thread Dusan Misic
On Tue, Nov 30, 2010 at 10:20 AM, Craig Ringer
wrote:

> On 11/30/2010 03:28 PM, Dusan Misic wrote:
>
>>We're having similar issues on 8.4.[245]... occasionally psql takes
>>anywhere from a few to several dozen seconds to connect. I've been
>>unsuccessfully trying to blame spikes in the OS run queue (we
>>desperately need some connection pooling) but if it's something to
>>do with locks I can't see in pg_locks, that would explain why I
>>haven't been able to figure out what's going on yet
>>--
>>Sent via pgsql-general mailing list (pgsql-general@postgresql.org
>>)
>>To make changes to your subscription:
>>http://www.postgresql.org/mailpref/pgsql-general
>>
>>
>> This is normal. PostgreSQL needs to create new server process to handle
>> your requested connection.
>>
>> Then it needs to allocate resources to that new connection. It
>> initializes shared memory for that connection. That is the stall you are
>> mentioning.
>>
>
> Eh, what?
>
> Forking a backend and attaching to shared memory should *not* take "a few
> seconds". On my test machine it takes 100ms to fork psql, connect to the
> postmaster, fork a backend, init the backend, authenticate, run a dummy
> query and exit psql.
>
> If you're seeing delays like that, your machine is horrifyingly overloaded
> or there's something else wrong.
>
> --
> Craig Ringer
>

This is not happening to me. Even on my home computer (which is terribly
slow single core Sempron 2600+ with 128 kB of L2 cache, 1.5 GB DDR RAM and
with slow ATA hard disks) it is fast to connect. Maximum (slowest)
connection time is about 500 ms. I did some tuning in postgresql.conf.

I'm very happy with PostgreSQL speed on my 'snail' computer. :)


Re: [GENERAL] PostgreSQL hanging on new connections?

2010-11-30 Thread Florian Weimer
* hubert depesz lubaczewski:

> Now, the question is: why did it hang? Is there anything we can do to
> make it *not* hang?

It might be some general system overload issue.  Try running "echo w >
/proc/sysrq-trigger" as root the next time it happens.  This will dump
kernel backtraces to dmesg, which might hint to what's going on with
the system.

-- 
Florian Weimer
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL hanging on new connections?

2010-11-30 Thread hubert depesz lubaczewski
On Mon, Nov 29, 2010 at 03:57:29PM -0500, Vick Khera wrote:
> On Mon, Nov 29, 2010 at 1:23 PM, Tom Lane  wrote:
> > hubert depesz lubaczewski  writes:
> >> straced postmaster when the problem was happening, and I was opening new
> >> connections. strace looks like this:
> >> [ backend hangs on semop immediately after reading global/pg_database ]
> >
> > It looks like something had exclusive lock on the database that new
> > connections wanted to connect to.  AFAICS the only action in 8.3 that
> > would do that would be a DROP DATABASE or RENAME DATABASE.  What was
> > that other session doing?
> 
> Every once in a while when I connect to my big DB it feels like it
> stalls the connection for a few seconds.  I have no idea what causes
> it, but it does feel like it has to do with the load.  The load is not
> all that high relative to what my box can handle.  It doesn't happen
> often enough for me to track it down, though.  It instinct is that it
> is waiting on a lock, but clearly there is no rename/drop happening on
> my main db else i'd be out of business :-)  I too run 8.3 as primary
> right now... testing 9.0 for deployment soon-ish.
 
 jkust so it will be clear - on some other occasion when we had the
 problem, i left it hng for a while. 5 minutes later i still didn't get
 the connection.

 Best regards,

 depesz


-- 
Linkedin: http://www.linkedin.com/in/depesz  /  blog: http://www.depesz.com/
jid/gtalk: dep...@depesz.com / aim:depeszhdl / skype:depesz_hdl / gg:6749007

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL hanging on new connections?

2010-11-30 Thread hubert depesz lubaczewski
On Tue, Nov 30, 2010 at 10:12:47AM +, Florian Weimer wrote:
> * hubert depesz lubaczewski:
> 
> > Now, the question is: why did it hang? Is there anything we can do to
> > make it *not* hang?
> 
> It might be some general system overload issue.  Try running "echo w >
> /proc/sysrq-trigger" as root the next time it happens.  This will dump
> kernel backtraces to dmesg, which might hint to what's going on with
> the system.

will check the idea, but the thing is that aside from postgresql -
everything else on the server is working fine - ssh/top/*stat.
and other postgreses seem to be locked too.

Best regards,

depesz

-- 
Linkedin: http://www.linkedin.com/in/depesz  /  blog: http://www.depesz.com/
jid/gtalk: dep...@depesz.com / aim:depeszhdl / skype:depesz_hdl / gg:6749007

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general