Re: [GENERAL] Why checkpoint_timeout had maximum value of 1h?

2012-04-03 Thread Greg Smith

On 03/29/2012 06:57 AM, Maxim Boguk wrote:

Is there any real reason why checkpoint_timeout limited to 1hour?


Just to keep people from accidentally setting a value that's dangerously 
high.  There can be some pretty bad drops in performance if you let 
writes pile up for too long, once the checkpoint really does start running.


In my case I have some replicas with WAL on SAS raid and PGDATA on SSD 
with limited write enduranceIn that case having 
checkpoint_timeout=10hour could reduce amout of writes on SSD  by 
factor of 10, and increase planned ssd lifetime by the same amount.


The big write endurance problem is WAL data, and you're already 
addressing that.  Note that if nothing has been written out since the 
last one, the checkpoint won't actually do anything.  So this 10X 
endurance idea might only work out on a system that's always doing 
something.  You'll certainly get less wear; without measuring your 
workload better, I can't say just what the multiplier is.


The other idea you should be considering, if you haven't already, is not 
provisioning all of the space.



I would like to have ability to set checkpoint_timeout=high value
and (whats even better) checkpoint_timeout=0 - in that case checkpoint 
happen when all checkpoint_segments were used.

Is there any serious drawbacks in that idea?
Is it safe to increase that limit in source and rebuild database?  
(9.0 and 9.1 case)


You can edit src/backend/utils/misc/guc.c , find checkpoint_time, and 
change the 3600 value there to something higher.  You will need to 
rebuild the whole database cluster with that setting (initdb), and 
moving a database cluster of files between your tweaked version to/from 
a regular PostgreSQL will do strange things.  You can prevent that from 
happening accidentally by editing src/include/catalog/catversion.h ; 
find the line that looks like this:


#define CATALOG_VERSION_NO201202141

And change it.  It's just MMDDN to create a complete catalog serial 
number, where N is an incrementing number if more than one change is 
made on the same day.  If you do that and increase the upper bound on 
checkpoint_timeout, that should do what you want, while protecting 
against the dangerous situation--where system catalog doesn't match the 
database binaries.


Setting checkpoint_timeout to 0 instead won't work--it will checkpoint 
all of the time then.  The bottom limit is 30 seconds and you don't want 
to touch that.  It's possible to make 0 mean never timeout, but that 
would require juggling a couple of code pieces around.  The idea of just 
making the timeout big is a safer thing to consider.  I'm not sure if 
you'll really see the gains you're hoping for, but it should be easy 
enough to test.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Why checkpoint_timeout had maximum value of 1h?

2012-03-29 Thread Maxim Boguk
Hi all,

Is there any real reason why checkpoint_timeout limited to 1hour?

In my case I have some replicas with WAL on SAS raid and PGDATA on SSD with
limited write endurance.
And I don't worry about possible long time recovery after power failure in
that case.
Whats more working dataset fill in shared buffers, so almost no dirty
buffers evictions by bgwriter or backends happened.

In that case having checkpoint_timeout=10hour could reduce amout of writes
on SSD  by factor of 10, and increase planned ssd lifetime by the same
amount.

I would like to have ability to set checkpoint_timeout=high value
and (whats even better) checkpoint_timeout=0 - in that case checkpoint
happen when all checkpoint_segments were used.

Is there any serious drawbacks in that idea?

Is it safe to increase that limit in source and rebuild database?  (9.0 and
9.1 case)

-- 
Maxim Boguk
Senior Postgresql DBA.

Phone RU: +7 910 405 4718
Phone AU: +61 45 218 5678

Skype: maxim.boguk
Jabber: maxim.bo...@gmail.com

LinkedIn profile: http://nz.linkedin.com/in/maximboguk
If they can send one man to the moon... why can't they send them all?

МойКруг: http://mboguk.moikrug.ru/
People problems are solved with people.
If people cannot solve the problem, try technology.
People will then wish they'd listened at the first stage.