We haven't seen any issues since we decreased shared_buffers.  We also
tuned some of the longer running / more frequently executed queries, so
that may have had an effect as well, but my money would be on the
shared_buffers change.  If the issue reappears I'll try to get a perf
profile again and post back, but if you don't hear from me again you can
assume the problem is solved.

Thank you all again for the help.

-Dave

On Fri, Sep 13, 2013 at 11:05 AM, David Whittaker <d...@iradix.com> wrote:

>
>
>
> On Fri, Sep 13, 2013 at 10:52 AM, Merlin Moncure <mmonc...@gmail.com> wrote:
>
>> On Thu, Sep 12, 2013 at 3:06 PM, David Whittaker <d...@iradix.com> wrote:
>> > Hi All,
>> >
>> > We lowered shared_buffers to 8G and increased effective_cache_size
>> > accordingly.  So far, we haven't seen any issues since the adjustment.
>> > The issues have come and gone in the past, so I'm not convinced it
>> > won't crop up again, but I think the best course is to wait a week or
>> > so and see how things work out before we make any other changes.
>> >
>> > Thank you all for your help, and if the problem does reoccur, we'll look
>> > into the other options suggested, like using a patched postmaster and
>> > compiling for perf -g.
>> >
>> > Thanks again, I really appreciate the feedback from everyone.
>>
>> Interesting -- please respond with a follow up if/when you feel
>> satisfied the problem has gone away.  Andres was right; I initially
>> mis-diagnosed the problem (there is another issue I'm chasing that has
>> a similar performance presentation but originates from a different
>> area of the code).
>>
>> That said, if reducing shared_buffers made *your* problem go away as
>> well, then this is more evidence that we have an underlying contention
>> mechanic that is somehow influenced by the setting.  Speaking frankly,
>> under certain workloads we seem to have contention issues in the
>> general area of the buffer system.  I'm thinking (guessing) that the
>> problem is that usage_count is getting incremented faster than the buffers
>> are getting cleared out which is then causing the sweeper to spend
>> more and more time examining hotly contended buffers.  This may make
>> no sense in the context of your issue; I haven't looked at the code
>> yet.  Also, I've been unable to cause this to happen in simulated
>> testing.  But I'm suspicious (and dollars to doughnuts '0x347ba9' is
>> spinlock related).
>>
>> Anyways, thanks for the report and (hopefully) the follow up.
>>
>> merlin
>>
>
> You guys have taken the time to help me through this; following up is the
> least I can do.  So far we're still looking good.
>

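For anyone reading this thread later, Merlin's guess above (usage_count
rising faster than the sweep can bring it back down) can be illustrated
with a toy model.  This is only a sketch under stated assumptions: it is
not PostgreSQL's code, it ignores pinning, locking and the freelist, and
the function names are made up for illustration.  The one detail taken
from PostgreSQL is that usage_count is capped at 5.

```python
MAX_USAGE_COUNT = 5  # PostgreSQL caps usage_count at this value


def touch(usage_counts, idx):
    """Model a buffer access: bump usage_count, saturating at the cap."""
    usage_counts[idx] = min(usage_counts[idx] + 1, MAX_USAGE_COUNT)


def sweep_for_victim(usage_counts, hand):
    """Advance the clock hand until a buffer with usage_count == 0 is found.

    Every nonzero count the hand passes is decremented by one.
    Returns (victim_index, new_hand, buffers_examined).
    """
    n = len(usage_counts)
    examined = 0
    while True:
        examined += 1
        idx = hand
        hand = (hand + 1) % n
        if usage_counts[idx] == 0:
            return idx, hand, examined
        usage_counts[idx] -= 1


if __name__ == "__main__":
    cold = [0] * 4                 # nothing recently used
    hot = [MAX_USAGE_COUNT] * 4    # every buffer touched repeatedly
    print("cold pool:", sweep_for_victim(cold, 0))
    print("hot pool: ", sweep_for_victim(hot, 0))
```

In the cold pool the first buffer examined is the victim.  In the hot
pool the hand has to go all the way around five times before any count
reaches zero, so the work per eviction grows with both the pool size and
how quickly backends re-touch buffers.  In this toy model that is
consistent with a smaller shared_buffers making the symptom less severe,
though it proves nothing about the real cause here.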