2009/5/28 Greg Smith <gsm...@gregsmith.com>

> On Thu, 28 May 2009, Flavio Henrique Araque Gurgel wrote:
>
>> It is 2.6.24. We had to apply the kswapd patch as well. It's important,
>> especially if you see system % going as high as 99% in top and lose
>> control of the machine. I have read that 2.6.28 had this patch
>> accepted into mainline.
>>
>
> It would help if you gave more specific information about what you're
> talking about.  I know there was a bunch of back and forth on the "kswapd
> should only wait on IO if there is IO" patch, where it was committed and then
> reverted etc, but it's not clear to me if that's what you're talking
> about--and if so, what that has to do with the context switch problem.
>
> Back to Fabrix's problem.  You're fighting a couple of losing battles here.
>  Let's go over the initial list:
>
> 1) You have 32 cores.  You think they should be allowed to schedule
> 3500 active connections across them.  That doesn't work, and what happens
> is exactly the sort of context switch storm you're showing data for. Think
> about it for a minute:  how many of those can really be doing work at any
> time?  32, that's how many.  Now, you need some multiple of the number of
> cores to try to make sure everybody is always busy, but that multiple should
> be closer to 10X the number of cores rather than 100X. You need to adjust
> the connection pool ratio so that the PostgreSQL max_connections is closer
> to 500 than 5000, and this is by far the most critical thing for you to do.
>  The PostgreSQL connection handler is known to be bad at handling high
> connection loads compared to the popular pooling projects, so you really
> shouldn't throw this problem at it. While kernel problems stack on top of
> that, you really shouldn't start at kernel fixes; nail the really
> fundamental and obvious problem first.
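
A rough sketch of the sizing arithmetic in point 1, in case it helps to see the numbers side by side. The function name and the default multiple of 10 are illustrative assumptions taken from the "closer to 10X than 100X" rule of thumb above, not a formula from PostgreSQL or any pooler.

```python
def suggested_max_connections(cores: int, multiple: int = 10) -> int:
    """Rough ceiling for max_connections: a small multiple of the core count.

    Hypothetical helper; the multiple of 10 is the rule of thumb quoted
    in the message above, not a measured value.
    """
    if cores <= 0 or multiple <= 0:
        raise ValueError("cores and multiple must be positive")
    return cores * multiple


cores = 32          # from the thread
active_now = 3500   # from the thread

target = suggested_max_connections(cores)
print(f"target max_connections: ~{target}")
print(f"current connections per core: {active_now / cores:.0f}")
print(f"target connections per core:  {target / cores:.0f}")
```

With 32 cores this lands in the low hundreds, which is in line with the "closer to 500 than 5000" advice; the exact multiple is something to tune under load rather than a fixed answer.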


The application is not closing its connections; the development team is
making a change to close each connection once its job is done. So most
connections are in the idle state.  How much would this help? Could this
be the real problem?


>
> 2) You have very new hardware and a very old kernel.  Once you've done the
> above, if you're still not happy with performance, at that point you should
> consider using a newer one.  It's fairly simple to build a Linux kernel
> using the same basic kernel parameters as the stock RedHat one. 2.6.28 is
> six months old now, is up to 2.6.28.10, and has gotten a lot more testing
> than most kernels due to it being the Ubuntu 9.04 default. I'd suggest you
> try out that version.


OK, I'll test whether updating the kernel improves this.


>
> 3) A system with 128GB of RAM is in a funny place where the defaults and
> the usual rules of thumb for a lot of parameters ("set shared_buffers to
> 1/4 of RAM") are all bad ideas.  shared_buffers seems to top out its
> usefulness around 10GB on current generation hardware/software, and some
> Linux memory tunables have defaults on 2.6.18 that are insane for your
> system; vm.dirty_ratio at 40 comes to mind as the one I run into most.
>  Some of that gets fixed just by moving to a newer kernel, some doesn't.
>  Again, these aren't the problems you're having now though; they're the ones
> you'll have in the future *if* you fix the more fundamental problems first.
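
To put numbers on point 3, here is a small sketch of what the usual defaults work out to on a 128GB machine. It treats the dirty ratio as a percentage of total RAM, which is a simplifying assumption (the kernel applies it to the memory it considers available, so the real limit is somewhat lower), and the helper name is hypothetical.

```python
GIB = 1024 ** 3


def dirty_limit_bytes(ram_bytes: int, dirty_ratio_pct: int) -> int:
    """Approximate dirty-page ceiling, treating the ratio as a share of total RAM."""
    return ram_bytes * dirty_ratio_pct // 100


ram = 128 * GIB                           # RAM size from the thread
quarter_rule = ram // 4                   # "set shared_buffers to 1/4 of RAM"
dirty_at_40 = dirty_limit_bytes(ram, 40)  # dirty ratio of 40, as quoted above

print(f"1/4-of-RAM shared_buffers: {quarter_rule / GIB:.0f} GiB")
print(f"dirty pages allowed at 40%: ~{dirty_at_40 / GIB:.1f} GiB")
```

The 1/4 rule gives roughly three times the ~10GB where shared_buffers is said to stop helping, and a 40% dirty ratio lets tens of gigabytes of unwritten data pile up before writers are forced to flush.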




>
>
> --
> * Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD
>
