Hi Jeff,


On Tuesday, May 14, 2013, Ioana Danes wrote:

>Hi all,
>
>I have a production database that sometimes runs out of memory at nightly
>vacuum.
>
>The application typically runs with around 40 postgres connections, but there
>are times when the connection count increases because of some queries going on.

You should consider not allowing that to happen.  If 40 connections is the 
sweet spot for your hardware, then you can't change reality simply by changing 
the connection pool setting.
 
Yes, I agree.


>The reason is that the operations are slow, so the terminals time out and try
>to reconnect using new connections.


So, "The beatings will continue until morale improves".  You should consider 
not allowing that to happen, either.  If the original operation is slow, why 
would trying it again on a new connection (while leaving the original one 
behind to clog things up) be any better?

Yes, I agree; it is just hard to distinguish between a real and a false request
for new connections (high sales vs. bad queries).

 
>Some time ago I started to have problems with too many connections being open,
>so I lowered the limit to 300 connections. It was all good until recently, when
>even with 180 connections I am running out of memory on vacuum... So the
>connections increase to 180, the system still runs properly for another 2 days,
>but then the nightly vacuum runs out of memory.
>
>The fix is to restart postgres... If I only close the connections the problem
>is still there, so I need to restart postgres.
>

How are you closing the connections? 

I restart the application server. The problem is that max_idle connections was
set to 1000 on the JDBC connection pool, so once the spike happened the app
would run with 300 connections, with 250 or so of them IDLE most of the time.
I am fixing that.
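
For reference, a minimal sketch of the kind of pool change involved, assuming an
Apache Commons DBCP BasicDataSource (the pool library is an assumption here, and
the driver URL and credentials are placeholders, not the application's real
settings):

import org.apache.commons.dbcp.BasicDataSource;

public class PoolConfig {
    // Sketch only: cap the pool so surplus idle connections are released
    // instead of sitting around long after a spike has passed.
    public static BasicDataSource createDataSource() {
        BasicDataSource ds = new BasicDataSource();
        ds.setDriverClassName("org.postgresql.Driver");
        ds.setUrl("jdbc:postgresql://dbhost:5432/appdb"); // placeholder host/db
        ds.setUsername("appuser");                        // placeholder credentials
        ds.setPassword("secret");
        ds.setMaxActive(40);   // hard cap on connections handed out by the pool
        ds.setMaxIdle(10);     // do not keep hundreds of idle connections around
        ds.setMinEvictableIdleTimeMillis(60000);     // evict connections idle longer than 1 minute
        ds.setTimeBetweenEvictionRunsMillis(30000);  // run the idle-connection evictor every 30 s
        return ds;
    }
}

With a cap like this, connections above maxIdle get closed when they are returned
to the pool, rather than lingering until the application server is restarted.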

 
>If I don't restart postgres, then the system will run out of memory on queries
>at some point...
>Another important thing is that during the 1am vacuum nothing else is running
>on that server, so all the connections are idle.

Truly idle, or idle in transaction, or not even that?  If the "abandoned" 
connections have completed whatever long running thing caused them to be 
abandoned, then they should no longer exist.  What are the largest processes 
according to top?

They were IDLE (not IDLE IN TRANSACTION).
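
In case it is useful, here is a minimal sketch of how the connection states can
be checked from the app side by querying pg_stat_activity over JDBC. This assumes
PostgreSQL 9.2 or later, where pg_stat_activity has a "state" column (on older
versions the same information shows up in current_query as '<IDLE>' vs
'<IDLE> in transaction'); the URL and credentials are placeholders:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class ConnectionStates {
    public static void main(String[] args) throws Exception {
        // Placeholder connection details.
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:postgresql://dbhost:5432/appdb", "appuser", "secret");
             Statement st = conn.createStatement();
             // Count backends by state: idle, idle in transaction, active, ...
             ResultSet rs = st.executeQuery(
                 "SELECT state, count(*) FROM pg_stat_activity GROUP BY state")) {
            while (rs.next()) {
                System.out.println(rs.getString(1) + ": " + rs.getLong(2));
            }
        }
    }
}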


You seem to have a series of unfortunate events here, but I think you are 
tackling them from the wrong end.  What are these slow queries that take up a 
lot of memory, and why can't they be improved?  Given that you have slow 
queries that take a lot of memory, why does your app respond to this by 
launching a Denial of Service attack against its own database, and why do you 
let it do that?

I never said that I am ignoring the bad queries or the bad processes. Sometimes
it just takes time to catch and fix them; that is just the reality...
The problem is not always obvious and easy to reproduce.


Thank you very much for your response,
ioana
