On Thu, 4 Dec 2008, Gregory Stark wrote:

My point was more that you could have a data warehouse on a non-dedicated machine, you could have a web server on a non-dedicated machine, or you could have a mixed server on a non-dedicated machine.

I should just finish the documentation, where there will be a big disclaimer saying "THESE SETTINGS ASSUME A SERVER DEDICATED TO POSTGRESQL!" That's the context here. Why, after you follow my tuning instructions, you're lucky if the server will run anything but the database afterwards.

Josh's logic is impeccable -- for the specific use case he's describing of a
truly dedicated server with enough disk space for a major production database.
But not every install is going to have gigabytes of space reserved for it and
not every admin is going to realize that he really should set aside gigabytes
of space even though he only expects his database to be a few megabytes.

It's really quite simple. Josh and I don't care directly about disk space used by the WAL for people with trivial databases. At all. Whatsoever. Maybe once, long ago, when we were young and frugal and skinny[1]; not now, or probably ever again in the future. If that's your concern, maybe there can be some companion utility named pgmiser that lowers parameters back down again. Your mascot can be some sort of animal that efficiently lives off small scraps of food or something.[2]

The context here is pgtune, which is aiming to make a fat elephant of a server faster so that there's an answer to people who say "My benchmarks are all running really slow, is this because my system with 16PT of RAM is only using 32MB of it for the database? This sucks, I'm going back to Oracle which used all my RAM." If there are people who instead think, "hey, I'll run this tuning utility to make my database faster, then it will also be a lot smaller!", maybe we can find a class about space/time tradeoffs in algorithm design to send them to or something.[3]

There are exactly two important things here. The first is how large checkpoint_segments needs to be for the considerable overhead of checkpoints to be bearable. That drives the setting up. Our super-fat DW application gets set to at least 64 so that when you bulk-load another TB of data into it, that doesn't get bottlenecked dumping gigabytes of dirty buffers every few seconds. If the database crashes and recovery reads or writes a bunch of data, who cares about random writes, because your SAN has a 4GB write cache on it and dozens of drives slaving away.
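As a concrete sketch of that DW starting point (illustrative values drawn from the numbers above, not pgtune's actual output; checkpoint_completion_target and its 0.9 value are my assumption here), a dedicated warehouse's postgresql.conf might begin with:

```ini
# Hypothetical DW-profile starting values, not a definitive recommendation.
# checkpoint_segments applies to the 8.x-era releases under discussion.
checkpoint_segments = 64            # spread checkpoints out during bulk loads
checkpoint_completion_target = 0.9  # smooth checkpoint I/O across the interval
```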

Driving the setting down is knowing how much time you'll have to wait for recovery to happen, which is really a measure of what your tolerance for downtime is. We're thinking that someone who picks the Desktop tuning may have no tolerance for the database being sluggish coming back up after Windows crashed and they rebooted, so they get a tiny setting to make recovery super fast.

Everybody else in our sample profiles falls in the middle of those two extremes, which is why the values curve the way they do. Web app? Probably not a lot of write volume, probably trouble if it's down a long time; how about 8, on the low side, but it gives checkpoints more time to spread out their I/O so worst-case latency isn't as bad. That's the sort of analysis those numbers come from. Do performance tuning long enough, juggling these trade-offs for new people all the time, and you get a gut feel for the right ballpark an app should start at based on its type. The whole idea behind this tool is that we're taking some of that hard-won knowledge and trying to automate the distribution of it.
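The per-profile logic above can be sketched as a simple lookup. This is not pgtune's actual code; the DW, web, and desktop values come from the discussion above, while the OLTP and mixed values are assumptions filled in for illustration.

```python
# Sketch of pgtune-style profile selection for checkpoint_segments.
# Only "dw", "web", and "desktop" values are taken from the discussion;
# "oltp" and "mixed" are assumed middle-ground numbers.
CHECKPOINT_SEGMENTS = {
    "dw": 64,       # bulk loads: keep checkpoints from bottlenecking
    "oltp": 16,     # assumed: steady writes, moderate recovery tolerance
    "mixed": 16,    # assumed
    "web": 8,       # low write volume, but downtime hurts
    "desktop": 3,   # the old default: fastest possible crash recovery
}

def pick_checkpoint_segments(workload: str) -> int:
    """Return a starting checkpoint_segments value for a workload type."""
    try:
        return CHECKPOINT_SEGMENTS[workload]
    except KeyError:
        raise ValueError(f"unknown workload type: {workload!r}")

print(pick_checkpoint_segments("web"))
```

The point of the table shape is exactly the curve described above: the two extremes anchor the ends, and everything else interpolates between downtime tolerance and write volume.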

It's great that Postgres has such great documentation but whenever we have the
chance to replace something with an option which doesn't need any
documentation that would be even better. I'm just exploring whether that's an
option here.

I would be glad to have a post-CommitFest discussion of this very topic as it's quite a pain to me in its current form. Just not right now because it's too late to touch it.

Nobody's even tried to do this side of things before. They always got bogged down in trying to parse config files and such.

It's actually because most of them were working in Perl, which encourages deviant behavior where people delight in converting useful ideas into illegible punctuation rather than actually getting anything done. Except for that other Greg around here who's not involved in this discussion, his Perl is pretty good.

[1] Josh is being aggressively bulked up right now for his next sumo match.

[2] Like a rat, which would give you an excuse to add the long overdue PL/Ratfor.

[3] This wouldn't actually help them learn anything, but it would make their heads explode at which point all their problems are gone.

--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers