Re: [PERFORM] Massive I/O spikes during checkpoint

2012-07-10 Thread David Kerr

On Jul 9, 2012, at 10:51 PM, Maxim Boguk wrote:

 
 
 But what appears to be happening is that all of the data is being written out 
 at the end of the checkpoint.
 
 This happens at every checkpoint while the system is under load.
 
 I get the feeling that this isn't the correct behavior and i've done 
 something wrong. 
 
 
 
 It's not the checkpoint itself.
 It's the fsync after the checkpoint that creates the write spikes hurting the server.
 You should set sysctl vm.dirty_background_bytes and vm.dirty_bytes to
 reasonably low values

So use bla_bytes instead of bla_ratio?

 (for a RAID controller with 512MB of cache I would suggest something like
 vm.dirty_background_bytes = 33554432 
 vm.dirty_bytes = 268435456
 32MB and 256MB respectively)

I'll take a look.

 
 If your server doesn't have RAID with a BBU cache, then you should tune these
 values much lower.
 
 Please read http://blog.2ndquadrant.com/tuning_linux_for_low_postgresq/ 
 and related posts.

Yeah, I saw that; I guess I didn't put 2+2 together. Thanks.





Re: [PERFORM] Massive I/O spikes during checkpoint

2012-07-10 Thread Maxim Boguk
On Tue, Jul 10, 2012 at 4:03 PM, David Kerr d...@mr-paradox.net wrote:


 On Jul 9, 2012, at 10:51 PM, Maxim Boguk wrote:



 But what appears to be happening is that all of the data is being written
 out at the end of the checkpoint.

 This happens at every checkpoint while the system is under load.

 I get the feeling that this isn't the correct behavior and i've done
 something wrong.



 It's not the checkpoint itself.
 It's the fsync after the checkpoint that creates the write spikes hurting the server.

 You should set sysctl vm.dirty_background_bytes and vm.dirty_bytes to
 reasonably low values


 So use bla_bytes instead of bla_ratio?


Yes, because on a 256GB server
echo 10 > /proc/sys/vm/dirty_ratio
is equivalent to 26GB dirty_bytes

and
echo 5 > /proc/sys/vm/dirty_background_ratio
is equivalent to 13GB dirty_background_bytes

Those are really huge values.

So the kernel doesn't start writing any pages out in the background until it
has at least 13GB of dirty pages in kernel memory.
And at the end of the checkpoint the kernel tries to flush all dirty pages to disk.

Even echo 1 > /proc/sys/vm/dirty_background_ratio is too high a value for a
contemporary server.
That is why the *_bytes controls were added to the kernel.
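
The ratio-to-bytes arithmetic above can be checked with a minimal shell sketch. The function name ratio_to_bytes is made up for illustration, and the sketch assumes the percentage applies to total RAM; the kernel actually applies it to the memory it considers dirtyable, so real thresholds come out somewhat lower.

```shell
#!/bin/sh
# Approximate the byte threshold implied by a vm.dirty_*ratio setting.
# Assumption: the percentage is taken of total RAM (an upper bound).
ratio_to_bytes() {
    # $1 = RAM in GiB, $2 = ratio in percent
    echo $(( $1 * 1024 * 1024 * 1024 * $2 / 100 ))
}

ratio_to_bytes 256 10   # dirty_ratio = 10 on a 256 GiB server
ratio_to_bytes 256 5    # dirty_background_ratio = 5 on the same server
```

The two results are about 25.6 GiB and 12.8 GiB, which is where the rounded 26GB and 13GB figures come from.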

-- 
Maxim Boguk
Senior Postgresql DBA
http://www.postgresql-consulting.com/


Re: [PERFORM] Massive I/O spikes during checkpoint

2012-07-10 Thread David Kerr

On 7/9/2012 11:14 PM, Maxim Boguk wrote:



On Tue, Jul 10, 2012 at 4:03 PM, David Kerr d...@mr-paradox.net wrote:


On Jul 9, 2012, at 10:51 PM, Maxim Boguk wrote:




But what appears to be happening is that all of the data is
being written out at the end of the checkpoint.

This happens at every checkpoint while the system is under load.

I get the feeling that this isn't the correct behavior and
i've done something wrong.



It's not the checkpoint itself.
It's the fsync after the checkpoint that creates the write spikes hurting
the server.
You should set sysctl vm.dirty_background_bytes and vm.dirty_bytes
to reasonably low values


So use bla_bytes instead of bla_ratio?


Yes, because on a 256GB server
echo 10 > /proc/sys/vm/dirty_ratio
is equivalent to 26GB dirty_bytes

and
echo 5 > /proc/sys/vm/dirty_background_ratio
is equivalent to 13GB dirty_background_bytes

Those are really huge values.

Sigh, yeah, I never bothered to think that through.


So the kernel doesn't start writing any pages out in the background until it
has at least 13GB of dirty pages in kernel memory.
And at the end of the checkpoint the kernel tries to flush all dirty pages to disk.

Even echo 1 > /proc/sys/vm/dirty_background_ratio is too high a value for a
contemporary server.
That is why the *_bytes controls were added to the kernel.


Awesome, Thanks.

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Massive I/O spikes during checkpoint

2012-07-10 Thread Andres Freund
On Tuesday, July 10, 2012 08:14:00 AM Maxim Boguk wrote:
 On Tue, Jul 10, 2012 at 4:03 PM, David Kerr d...@mr-paradox.net wrote:
  On Jul 9, 2012, at 10:51 PM, Maxim Boguk wrote:
  But what appears to be happening is that all of the data is being
  written out at the end of the checkpoint.
  
  This happens at every checkpoint while the system is under load.
  
  I get the feeling that this isn't the correct behavior and i've done
  something wrong.
  
  It's not the checkpoint itself.
  It's the fsync after the checkpoint that creates the write spikes hurting
  the server.
  
  You should set sysctl vm.dirty_background_bytes and vm.dirty_bytes to
  reasonably low values
  
  
  So use bla_bytes instead of bla_ratio?
 
 Yes, because on a 256GB server
 echo 10 > /proc/sys/vm/dirty_ratio
 is equivalent to 26GB dirty_bytes
 
 and
 echo 5 > /proc/sys/vm/dirty_background_ratio
 is equivalent to 13GB dirty_background_bytes
 
 Those are really huge values.
 
 So the kernel doesn't start writing any pages out in the background until it
 has at least 13GB of dirty pages in kernel memory.
 And at the end of the checkpoint the kernel tries to flush all dirty pages to disk.
That's not entirely true. The kernel will also write out pages which haven't
been written to for dirty_expire_centisecs.

But yes, adjusting dirty_* is definitely a good idea.
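
To see how long a dirty page may sit before it becomes eligible for that age-based writeout, the two timer tunables can be read and converted to seconds. This is a minimal sketch: centisecs_to_seconds is a made-up helper name, and the values reported depend on the kernel and distribution (3000 and 500 are common defaults, but treat that as an assumption to verify on your own system).

```shell
#!/bin/sh
# Convert a *_centisecs value (hundredths of a second) to seconds.
centisecs_to_seconds() {
    printf '%d.%02d\n' $(( $1 / 100 )) $(( $1 % 100 ))
}

# Report the age-based writeback timers if this system exposes them.
for name in dirty_expire_centisecs dirty_writeback_centisecs; do
    f=/proc/sys/vm/$name
    if [ -r "$f" ]; then
        printf '%s = %s s\n' "$name" "$(centisecs_to_seconds "$(cat "$f")")"
    fi
done
```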

Andres
-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [PERFORM] Massive I/O spikes during checkpoint

2012-07-10 Thread Jeff Janes
On Tue, Jul 10, 2012 at 5:44 AM, Andres Freund and...@2ndquadrant.com wrote:
 On Tuesday, July 10, 2012 08:14:00 AM Maxim Boguk wrote:

 So the kernel doesn't start writing any pages out in the background until it
 has at least 13GB of dirty pages in kernel memory.
 And at the end of the checkpoint the kernel tries to flush all dirty pages to disk.

 That's not entirely true. The kernel will also write out pages which haven't
 been written to for dirty_expire_centisecs.

There seem to be many situations in which it totally fails to do that.

Although I've never been able to categorize just what those situations are.

Cheers,

Jeff



Re: [PERFORM] Massive I/O spikes during checkpoint

2012-07-10 Thread Andres Freund
On Tuesday, July 10, 2012 03:36:35 PM Jeff Janes wrote:
 On Tue, Jul 10, 2012 at 5:44 AM, Andres Freund and...@2ndquadrant.com wrote:
  On Tuesday, July 10, 2012 08:14:00 AM Maxim Boguk wrote:
  So the kernel doesn't start writing any pages out in the background until
  it has at least 13GB of dirty pages in kernel memory.
  And at the end of the checkpoint the kernel tries to flush all dirty pages
  to disk.
  
  That's not entirely true. The kernel will also write out pages which
  haven't been written to for dirty_expire_centisecs.
 
 There seem to be many situations in which it totally fails to do that.
Totally as in dirty pages sitting around without any I/O activity? Or just not
aggressive enough?

Currently it's a bit hard to speculate about any of this without specifying the
kernel, because there have been massive rewrites of all that stuff in several
kernels in the last two years...

Andres
-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [PERFORM] Massive I/O spikes during checkpoint

2012-07-09 Thread Maxim Boguk



 But what appears to be happening is that all of the data is being written
 out at the end of the checkpoint.

 This happens at every checkpoint while the system is under load.

 I get the feeling that this isn't the correct behavior and i've done
 something wrong.



It's not the checkpoint itself.
It's the fsync after the checkpoint that creates the write spikes hurting the server.

You should set sysctl vm.dirty_background_bytes and vm.dirty_bytes to
reasonably low values
(for a RAID controller with 512MB of cache I would suggest something like
vm.dirty_background_bytes = 33554432
vm.dirty_bytes = 268435456
32MB and 256MB respectively)

If your server doesn't have RAID with a BBU cache, then you should tune
these values much lower.

Please read http://blog.2ndquadrant.com/tuning_linux_for_low_postgresq/
and related posts.
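
The byte values suggested above can be generated rather than typed by hand. This is a minimal sketch: the helper names mib_to_bytes and dirty_sysctl_lines are made up for illustration, and the 32 and 256 arguments are simply the sizes from this message, not a general recommendation.

```shell
#!/bin/sh
# Convert a size in MiB to bytes.
mib_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

# Print sysctl.conf-style lines for the two thresholds.
# $1 = background threshold in MiB, $2 = hard threshold in MiB
dirty_sysctl_lines() {
    printf 'vm.dirty_background_bytes = %s\n' "$(mib_to_bytes "$1")"
    printf 'vm.dirty_bytes = %s\n' "$(mib_to_bytes "$2")"
}

dirty_sysctl_lines 32 256
```

To apply the result, one option is to append the two lines to /etc/sysctl.conf and run sysctl -p as root, or to set each value immediately with sysctl -w. Note that the *_bytes and *_ratio forms of each setting are mutually exclusive: once a *_bytes value is written, the matching *_ratio reads back as 0.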

-- 
Maxim Boguk
Senior Postgresql DBA.
http://www.postgresql-consulting.com/

Phone RU: +7 910 405 4718
Phone AU: +61 45 218 5678

Skype: maxim.boguk
Jabber: maxim.bo...@gmail.com
МойКруг: http://mboguk.moikrug.ru/

People problems are solved with people.
If people cannot solve the problem, try technology.
People will then wish they'd listened at the first stage.


Re: [PERFORM] Massive I/O spikes during checkpoint

2012-07-09 Thread Jeff Janes
On Mon, Jul 9, 2012 at 10:39 PM, David Kerr d...@mr-paradox.net wrote:

 I thought that the idea of checkpoint_completion_target was that we try to
 finish writing out the data throughout the entire checkpoint (leaving some
 room to spare, in my case 30% of the total estimated checkpoint time)

 But what appears to be happening is that all of the data is being written
 out at the end of the checkpoint.

Postgres is writing data out to the kernel throughout the checkpoint.
But the kernel is just buffering it up dirty, until the end of the
checkpoint when the fsyncs start landing like bombs.
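
One way to watch this buffering happen is to sample the Dirty and Writeback counters in /proc/meminfo while a checkpoint runs. This is a minimal sketch; dirty_snapshot is a made-up helper name, and the optional file argument exists only so the function can be pointed at a saved copy of meminfo.

```shell
#!/bin/sh
# Print the Dirty and Writeback lines from a meminfo-style file.
# $1 = file to read (defaults to /proc/meminfo)
dirty_snapshot() {
    grep -E '^(Dirty|Writeback):' "${1:-/proc/meminfo}"
}

# Take one sample if the file is available on this system.
if [ -r /proc/meminfo ]; then
    dirty_snapshot
fi

# Hypothetical usage during a checkpoint, one sample per second:
#   while sleep 1; do date +%T; dirty_snapshot; done
```

If Dirty climbs steadily through the checkpoint and then collapses while Writeback jumps, that would be consistent with the pattern described here.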


 This happens at every checkpoint while the system is under load.

 I get the feeling that this isn't the correct behavior and i've done
 something wrong.

 Also, I didn't see this sort of behavior in PG 8.3; unfortunately, I
 don't have data to back that statement up.

Did you have less RAM back when you were running PG 8.3?

 Any suggestions. I'm willing and able to profile, or whatever.

How much RAM do you have?  What are your settings for /proc/sys/vm/dirty_* ?

Cheers,

Jeff



Re: [PERFORM] Massive I/O spikes during checkpoint

2012-07-09 Thread David Kerr

On Jul 9, 2012, at 10:52 PM, Jeff Janes wrote:

 On Mon, Jul 9, 2012 at 10:39 PM, David Kerr d...@mr-paradox.net wrote:
 
 I thought that the idea of checkpoint_completion_target was that we try to
 finish writing out the data throughout the entire checkpoint (leaving some
 room to spare, in my case 30% of the total estimated checkpoint time)
 
 But what appears to be happening is that all of the data is being written
 out at the end of the checkpoint.
 
 Postgres is writing data out to the kernel throughout the checkpoint.
 But the kernel is just buffering it up dirty, until the end of the
 checkpoint when the fsyncs start landing like bombs.

Ahh. duh!

I guess I assumed that the point of spreading the checkpoint I/O was
spreading the syncs out.

 
 
 This happens at every checkpoint while the system is under load.
 
 I get the feeling that this isn't the correct behavior and i've done
 something wrong.
 
 Also, I didn't see this sort of behavior in PG 8.3; unfortunately, I
 don't have data to back that statement up.
 
 Did you have less RAM back when you were running PG 8.3?
Nope. I was on RHEL 5.5 back then, though.

 
 Any suggestions. I'm willing and able to profile, or whatever.
 
 How much RAM do you have?  What are your settings for /proc/sys/vm/dirty_* ?

256G,
and I've been running with this for a while now, but I think it's the default
in RHEL 6+:
echo 10 > /proc/sys/vm/dirty_ratio
echo 5 > /proc/sys/vm/dirty_background_ratio
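
For reporting the current settings, all dirty_* tunables can be dumped in one pass. This is a minimal sketch; show_dirty_settings is a made-up helper name, and the optional directory argument exists only so it can be tried against a scratch directory.

```shell
#!/bin/sh
# Print every dirty_* tunable found in a directory.
# $1 = directory to scan (defaults to /proc/sys/vm)
show_dirty_settings() {
    dir=${1:-/proc/sys/vm}
    for f in "$dir"/dirty_*; do
        [ -r "$f" ] || continue      # skip when the glob matched nothing
        printf '%s = %s\n' "${f##*/}" "$(cat "$f")"
    done
}

show_dirty_settings
```

Reading these files does not require root; only writing new values does.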


