On 11/20/2012 04:08 PM, Jeff Janes wrote:

> Shaun Thomas reports one that is (I assume) not read intensive, but
> his diagnosis is that this is a kernel bug where a larger
> shared_buffers for no good reason causes the kernel to kill off its
> page cache.

We're actually very read intensive. According to pg_stat_statements, we regularly top out at 42k queries per second, and pg_stat_database says we're pushing 7k TPS.
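
For reference, a TPS figure can be derived from pg_stat_database by sampling its cumulative xact_commit and xact_rollback counters twice and dividing the difference by the elapsed time. A minimal sketch, assuming that approach (the function name and tuple layout are made up for illustration, and this is not necessarily how the number above was measured):

```python
def tps(before, after, elapsed_seconds):
    """Transactions per second between two pg_stat_database samples.

    Each sample is a (xact_commit, xact_rollback) tuple, e.g. from
        SELECT xact_commit, xact_rollback FROM pg_stat_database
        WHERE datname = current_database();
    The tuple layout is an assumption made for this sketch.
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    # Both counters are cumulative, so the difference between samples
    # is the number of transactions finished in the interval.
    delta = (after[0] + after[1]) - (before[0] + before[1])
    return delta / elapsed_seconds
```

Taking the two samples a few seconds apart and passing the wall-clock gap as elapsed_seconds should smooth out short bursts.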

But I'm still sure this is a kernel bug. Moving shared_buffers from 4GB to 6GB or 8GB causes the kernel to cut the active page cache in half, in addition to freeing 1/4 of RAM to just sit around doing nothing. That in turn causes kswapd to work constantly, while our IO drivers work to undo the damage. It's a positive feedback loop: I can reliably drive the load up to 800+ with an 800-client pgbench using two threads, all while having 0% CPU free.
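
One way to watch the effect described above is to sample /proc/meminfo while the benchmark runs and compare the free and file-cache shares of RAM at each shared_buffers setting. A minimal sketch, assuming a Linux kernel that reports the Active(file) and Inactive(file) fields; the helper names are hypothetical:

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> number."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip anything that is not "Name: value"
        parts = rest.split()
        if parts and parts[0].isdigit():
            # Most values are in kB; a few (e.g. HugePages_Total) are counts.
            fields[name.strip()] = int(parts[0])
    return fields


def cache_summary(fields):
    """Return free and file-cache memory as percentages of MemTotal."""
    total = fields["MemTotal"]
    return {
        "free_pct": 100.0 * fields["MemFree"] / total,
        "active_file_pct": 100.0 * fields.get("Active(file)", 0) / total,
        "inactive_file_pct": 100.0 * fields.get("Inactive(file)", 0) / total,
    }
```

Passing the contents of /proc/meminfo to parse_meminfo and then to cache_summary every few seconds should show whether free_pct climbs and active_file_pct drops once shared_buffers goes past 4GB.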

Set shared_buffers back to 4GB, and not only does the problem completely disappear, but the load settles down to around 9, and the machine becomes about 60% idle. Something in there is fantastically broken, but I can't point a finger at what.

I was just chiming in because, in the absence of an obvious PG-related culprit, the problem could be the OS itself. It certainly was in our case.

That, or PG has a memory leak that only appears at > 4GB of shared buffers.

--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-444-8534
stho...@optionshouse.com
