Greg Smith <[EMAIL PROTECTED]> writes:

> On Thu, 4 Dec 2008, Gregory Stark wrote:
>
>> What I'm suggesting is that you shouldn't have to special case this. That you
>> should expect whatever formulas you're using to produce the same values as
>> initdb if they were run on the same machine initdb is targeting.
>
> The reason that approach isn't taken is that the model here assumes the OS
> overhead is negligible relative to everything else going on.

OK, that does make sense. But the non-memory parameters...

... I've cut part of my response for a separate thread ...

>> And "desktop" seems like an outlier here. I suppose it's meant to capture
>> whether postgres is on a dedicated box? But it's possible to have a
>> non-dedicated oltp application or non-dedicated data warehouse box just as
>> easily.
>
> That's the target for something that's not a dedicated server--a desktop PC
> you use as a general workstation, maybe you're installing PostgreSQL as a
> developer that's competing with your web server and other apps; something
> like that. There might be a better name for that.

My point was more that you could have a data warehouse on a non-dedicated
machine, you could have a web server on a non-dedicated machine, or you could
have a mixed server on a non-dedicated machine. I don't see how you would
decide whether to set constraint_exclusion for "desktop", for example.

>>> Is 368MB of overhead unreasonable for a web application database today
>>
>> Well I think it's more than most people expect a single application install
>> to take up before they start putting data in it.
>
> Segments don't get allocated until you churn through that much WAL activity;
> that figure is an upper-bound after you've pushed more than that worth of data
> through WAL and into the database.  The only example where this overhead isn't
> dwarfed by the size of the resulting database 

Right, well, no, it won't be dwarfed -- it'll be about the same size. I.e., if
you load 100MB into the database there'll be about 100MB of logs generated, up
to the point where you hit this maximum upper bound.
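A back-of-the-envelope sketch of that bound (my assumptions, not from the
thread: the default 16MB WAL segment size and the documented steady-state
limit of roughly 2 * checkpoint_segments + 1 retained segment files -- under
which the 368MB figure above would correspond to checkpoint_segments = 11):

```python
# Rough worst-case disk space held by pg_xlog. Assumes the default 16MB
# segment size and a steady-state bound of about
# (2 * checkpoint_segments + 1) retained segment files.
WAL_SEGMENT_MB = 16

def wal_space_upper_bound_mb(checkpoint_segments):
    """Approximate upper bound on WAL disk usage, in megabytes."""
    return (2 * checkpoint_segments + 1) * WAL_SEGMENT_MB

print(wal_space_upper_bound_mb(3))   # old default of 3 segments -> 112
print(wal_space_upper_bound_mb(11))  # -> 368
```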

But yes, you're right that it's the upper bound on the extra space allocated
in addition to the size of the database. And how much extra space should we
allocate?

I don't see why this extra space bound should depend on the type of workload,
OLTP vs. DSS -- only on how much disk space the admin is willing to dedicate
to Postgres. Assuming the admin of a 1TB server is willing to dedicate 1GB to
logs, and the admin of a 1GB server would be annoyed to have to throw in more
than a few hundred megs, seems as reasonable a place as any to start.

> is where some small number of records are inserted, then constantly updated
> and vacuumed. And you know what? The person doing that is likely to really
> benefit from having checkpoint_segments set to a larger value. Update and
> vacuum heavy workloads are exactly the sort where you end up checkpointing
> too often with the default parameters.

Well there are a few problems with this: a) we have HOT now, so vacuums don't
need to be part of the picture; b) if you're updating the same pages over and
over again, a checkpoint will be super-quick since there will only be a few
pages to write out, so no, you don't really need a large checkpoint_segments
for any performance reason.

Josh's logic is impeccable -- for the specific use case he's describing of a
truly dedicated server with enough disk space for a major production database.
But not every install is going to have gigabytes of space reserved for it and
not every admin is going to realize that he really should set aside gigabytes
of space even though he only expects his database to be a few megabytes.

>> I'm really beginning to think the root of the problem is the name. If it were
>> "transaction_log_max_space" and measured in megabytes people would be happy
>> to say "ok, I'll make space for 100MB of logs" or whatever. Today they don't
>> know what to set it to or what the impact of setting it will be.
>
> Unless they do something crazy like read the documentation:

Well we know nobody does that :/

It's great that Postgres has such good documentation, but whenever we have
the chance to replace something with an option that doesn't need any
documentation, that's even better. I'm just exploring whether that's an
option here.
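To make that concrete, here's a hypothetical sketch of the inverse mapping
such a "transaction_log_max_space" knob would need -- turning a megabyte
budget into a checkpoint_segments value (my assumptions: the default 16MB
segment size and a worst case of roughly 2n + 1 retained segment files;
the function name is made up for illustration):

```python
WAL_SEGMENT_MB = 16  # default segment size; an assumption, not a GUC lookup

def segments_for_budget(max_space_mb):
    """Largest checkpoint_segments n whose worst-case pg_xlog footprint,
    about (2 * n + 1) segments, still fits in max_space_mb.
    Hypothetical helper illustrating the proposed knob."""
    n = (max_space_mb // WAL_SEGMENT_MB - 1) // 2
    return max(n, 1)

print(segments_for_budget(100))  # "make space for 100MB of logs" -> 2
print(segments_for_budget(368))  # -> 11
```

So an admin saying "100MB of logs" would land at a smaller checkpoint_segments
than today's tuning advice suggests, which is exactly the trade-off being
debated here.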

> What fun.  I'm beginning to remember why nobody has ever managed to deliver a
> community tool that helps with this configuration task before.

Well I don't think this is why. Nobody's even tried to do this side of things
before. They always got bogged down in trying to parse config files and such.

-- 
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
