Re: [PERFORM] postgresql.conf recommendations

2013-02-06 Thread Kevin Grittner
Johnny Tan wrote: > Wouldn't this be controlled by our checkpoint settings, though? Spread checkpoints made the issue less severe, but on servers with a lot of RAM I've had to make the above changes (or even go lower with shared_buffers) to prevent a burst of writes from overwhelming the RAID c

Re: [PERFORM] postgresql.conf recommendations

2013-02-06 Thread Josh Krupka
I originally got started down that trail because running perf top while having some of the slow query issues showed compaction_alloc at the top of the list. That function is the THP page compaction which lead me to some pages like: http://www.olivierdoucet.info/blog/2012/05/19/debugging-a-mysql-st

Re: [PERFORM] postgresql.conf recommendations

2013-02-06 Thread Jeff Janes
On Tue, Feb 5, 2013 at 2:02 PM, Johnny Tan wrote: > checkpoint_completion_target = 0.5 # checkpoint target duration, 0.0 - 1.0 I always set this to 0.9. I don't know why the default is 0.5. > But periodically, there are spikes in our app's db response time. Normally, > the app's db response t

Re: [PERFORM] postgresql.conf recommendations

2013-02-06 Thread Johnny Tan
On Tue, Feb 5, 2013 at 11:23 PM, Josh Krupka wrote: > I've been looking into something on our system that sounds similar to what > you're seeing. I'm still researching it, but I'm suspecting the memory > compaction that runs as part of transparent huge pages when memory is > allocated... yet to

Re: [PERFORM] postgresql.conf recommendations

2013-02-06 Thread Johnny Tan
On Wed, Feb 6, 2013 at 2:13 PM, David Whittaker wrote: > We disabled THP all together, with the thought that we might re-enable > without defrag if we got positive results. At this point I don't think THP > is the root cause though, so I'm curious to see if anyone else gets > positive results fr

Re: [PERFORM] postgresql.conf recommendations

2013-02-06 Thread Johnny Tan
On Wed, Feb 6, 2013 at 7:49 AM, Kevin Grittner wrote: > "a...@hsk.hk" wrote: > > Johnny Tan wrote: > > >>shared_buffers = 48GB# min 128kB > > > From the postgresql.conf, I can see that the shared_buffers is > > set to 48GB which is not small, it would be possible that the > > large buffer cach

Re: [PERFORM] postgresql.conf recommendations

2013-02-06 Thread David Whittaker
Hi Josh, On Wed, Feb 6, 2013 at 1:20 PM, Josh Krupka wrote: > David, > > Interesting observations. I had not been tracking the interrupts but > perhaps I should take a look. How are you measuring them over a period of > time, or are you just getting them real time? > I initially saw it happen

Re: [PERFORM] postgresql.conf recommendations

2013-02-06 Thread Josh Krupka
David, Interesting observations. I had not been tracking the interrupts but perhaps I should take a look. How are you measuring them over a period of time, or are you just getting them real time? Did you turn off THP all together or just the THP defrag? On Wed, Feb 6, 2013 at 10:42 AM, David

Re: [PERFORM] Slow Query Help

2013-02-06 Thread Pavan Deolasee
On Wed, Feb 6, 2013 at 9:52 PM, Kevin Grittner wrote: > Will Platnick wrote: >> Will Platnick wrote: > >>> The only thing that stands out is: on your production server I see >>> "Total runtime: 7.515 ms", but the top node in EXPLAIN ANAYZE shows >>> actual time as 0.179 ms. Not sure where that a

Re: [PERFORM] Slow Query Help

2013-02-06 Thread Will Platnick
Clients are technically our pgbouncer which is on the same machine. The explain analyze was local through psql direct to postgresql. On Wednesday, February 6, 2013 at 11:22 AM, Kevin Grittner wrote: > Will Platnick mailto:wplatn...@gmail.com)> wrote: > > Will Platnick mailto:wplatn...@gmail.com

Re: [PERFORM] Slow Query Help

2013-02-06 Thread Kevin Grittner
Will Platnick wrote: > Will Platnick wrote: >> The only thing that stands out is: on your production server I see >> "Total runtime: 7.515 ms", but the top node in EXPLAIN ANAYZE shows >> actual time as 0.179 ms. Not sure where that additional time is being >> spent though. It could be ExecutorS

Re: [PERFORM] postgresql.conf recommendations

2013-02-06 Thread David Whittaker
Josh/Johnny, We've been seeing a similar problem as well, and had also figured THP was involved. We found this in syslog: https://gist.github.com/davewhittaker/4723285, which led us to disable THP 2 days ago. At first the results seemed good. In particular, our issues always seemed interrupt re

Re: [PERFORM] Slow Query Help

2013-02-06 Thread Will Platnick
Good eye, I totally missed that! Any ideas on how to troubleshoot this delay? On Wednesday, February 6, 2013 at 3:51 AM, Pavan Deolasee wrote: > On Tue, Feb 5, 2013 at 9:15 AM, Will Platnick (mailto:wplatn...@gmail.com)> wrote: > > We upgraded from PG 9.1 to 9.2. Since the upgrade, the # of

Re: [PERFORM] postgresql.conf recommendations

2013-02-06 Thread Kevin Grittner
"a...@hsk.hk" wrote: > Johnny Tan wrote: >>shared_buffers = 48GB# min 128kB > From the postgresql.conf, I can see that the shared_buffers is > set to 48GB which is not small,  it would be possible that the > large buffer cache could be "dirty", when a checkpoint starts, it > would cause a check

Re: [PERFORM] Slow Query Help

2013-02-06 Thread Pavan Deolasee
On Tue, Feb 5, 2013 at 9:15 AM, Will Platnick wrote: > We upgraded from PG 9.1 to 9.2. Since the upgrade, the # of active queries > has raised significantly, especially during our peak time where lots of > users are logging in. According to New Relic, this query is now taking up > the most amount