Re: [HACKERS] [WIP] cache estimates, cache access cost

2011-06-20 Thread Kevin Grittner
Greg Smith g...@2ndquadrant.com wrote:
 On 06/19/2011 06:15 PM, Kevin Grittner wrote:
 I think the point is that if, on a fresh system, the first access
 to a table is something which uses a table scan -- like select
 count(*) -- that all indexed access would then tend to be
 suppressed for that table.  After all, for each individual query,
 selfishly looking at its own needs in isolation, it likely
 *would* be faster to use the cached heap data.
 
 If those accesses can compete with other activity, such that the
 data really does stay in the cache rather than being evicted, then
 what's wrong with that?
 
The problem is that if somehow the index *does* find its way into
cache, the queries might all run an order of magnitude faster by
using it.  The *first* query to bite the bullet and read through the
index wouldn't, of course, since it would have all that random disk
access.  But it's not hard to imagine an application mix where this
feature could cause a surprising ten-fold performance drop after
someone does a table scan which could persist indefinitely.  I'm not
risking that in production without a clear mechanism to
automatically recover from that sort of cache skew.
 
-Kevin



Re: [HACKERS] [WIP] cache estimates, cache access cost

2011-06-20 Thread Greg Smith

Kevin Grittner wrote:

But it's not hard to imagine an application mix where this
feature could cause a surprising ten-fold performance drop after
someone does a table scan which could persist indefinitely.  I'm not
risking that in production without a clear mechanism to
automatically recover from that sort of cache skew


The idea that any of this will run automatically is a dream at this 
point, so saying you want to automatically recover from problems with 
the mechanism that doesn't even exist yet is a bit premature.  Some of 
the implementation ideas here might eventually lead to where real-time 
cache information is used, and that is where the really scary feedback 
loops you are right to be worried about come into play.  The idea for 
now is that you'll run this new type of ANALYZE CACHE operation 
manually, supervised and at a time where recent activity reflects the 
sort of workload you want to optimize for.  And then you should review 
its results to make sure the conclusions it drew about your cache 
population aren't really strange.


To help with that, I was thinking of writing a sanity check tool that 
showed how the cached percentages this discovers compare against the 
historical block hit percentages for the relation.  Showing how the values 
changed from what they were previously set to after a second ANALYZE 
CACHE would probably be useful too.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us





Re: [HACKERS] [WIP] cache estimates, cache access cost

2011-06-20 Thread Kevin Grittner
Greg Smith g...@2ndquadrant.com wrote:
 
 The idea that any of this will run automatically is a dream at
 this point, so saying you want to automatically recover from
 problems with the mechanism that doesn't even exist yet is a bit
 premature.
 
Well, I certainly didn't mean it to be a reason not to move forward
with development -- I wouldn't have raised the issue had you not
said this upthread:
 
 I don't see how sequential scan vs. index costing will be any
 different on a fresh system than it is now.
 
All I was saying is: I do; here's how...
 
Carry on.
 
-Kevin



Re: [HACKERS] [WIP] cache estimates, cache access cost

2011-06-19 Thread Greg Stark
On Tue, Jun 14, 2011 at 4:04 PM, Robert Haas robertmh...@gmail.com wrote:
 1. ANALYZE happens far too infrequently to believe that any data taken
 at ANALYZE time will still be relevant at execution time.
 2. Using data gathered by ANALYZE will make plans less stable, and our
 users complain not infrequently about the plan instability we already
 have, therefore we should not add more.
 3. Even if the data were accurate and did not cause plan instability, we
 have no evidence that using it will improve real-world performance.

I feel like this is all baseless FUD. ANALYZE isn't perfect but it's
our interface for telling postgres to gather stats and we generally
agree that having stats and modelling the system behaviour as
accurately as practical is the right direction so we need a specific
reason why this stat and this bit of modeling is a bad idea before we
dismiss it.

I think the kernel of truth in these concerns is simply that
everything else ANALYZE looks at mutates only on DML. If you load the
same data into two databases and run ANALYZE you'll get (modulo random
sampling) the same stats. And if you never modify it and analyze it
again a week later you'll get the same stats again. So autovacuum can
guess when to run analyze based on the number of DML operations, it
can run it without regard to how busy the system is, and it can hold
off on running it if the data hasn't changed.

In the case of the filesystem buffer cache the cached percentage will
vary over time regardless of whether the data changes. Plain select
queries will change it, even other activity outside the database will
change it. There are a bunch of strategies for mitigating this
problem: we might want to look at the cache situation more frequently,
discount the results we see more aggressively, and possibly
maintain a kind of running average over time.
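
(To make the running-average idea concrete, here is a minimal standalone C
sketch of smoothing successive cached-fraction observations with an
exponentially weighted moving average; the smoothing factor and sample
values are illustrative assumptions, not anything from the patch:)

#include <stdio.h>

/*
 * Smooth cached-fraction observations with an exponentially weighted
 * moving average, so one burst of unusual activity cannot swing the
 * planner input too far.  alpha is the weight given to the newest sample.
 */
static double
smooth_cached_fraction(double previous, double observed)
{
    const double alpha = 0.2;   /* assumed smoothing factor */

    return (1.0 - alpha) * previous + alpha * observed;
}

int
main(void)
{
    double est = 0.0;                           /* fresh system: nothing cached */
    double samples[] = {0.9, 0.85, 0.1, 0.88};  /* hypothetical observations */

    for (int i = 0; i < 4; i++)
    {
        est = smooth_cached_fraction(est, samples[i]);
        printf("after sample %d: est = %.3f\n", i + 1, est);
    }
    return 0;
}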

There's another problem which I haven't seen mentioned. Because the
access method will affect the cache there's the possibility of
feedback loops. e.g. A freshly loaded system prefers sequential scans
for a given table because without the cache the seeks of random reads
are too expensive... causing it to never load that table into cache...
causing that table to never be cached and never switch to an index
method. It's possible there are mitigation strategies for this as well
such as keeping a running average over time and discounting the
estimates with some heuristic values.







-- 
greg



Re: [HACKERS] [WIP] cache estimates, cache access cost

2011-06-19 Thread Cédric Villemain
2011/6/19 Greg Stark st...@mit.edu:
 On Tue, Jun 14, 2011 at 4:04 PM, Robert Haas robertmh...@gmail.com wrote:
 1. ANALYZE happens far too infrequently to believe that any data taken
 at ANALYZE time will still be relevant at execution time.
 2. Using data gathered by ANALYZE will make plans less stable, and our
 users complain not infrequently about the plan instability we already
 have, therefore we should not add more.
 3. Even if the data were accurate and did not cause plan instability, we
 have no evidence that using it will improve real-world performance.

 I feel like this is all baseless FUD. ANALYZE isn't perfect but it's
 our interface for telling postgres to gather stats and we generally
 agree that having stats and modelling the system behaviour as
 accurately as practical is the right direction so we need a specific
 reason why this stat and this bit of modeling is a bad idea before we
 dismiss it.

 I think the kernel of truth in these concerns is simply that
 everything else ANALYZE looks at mutates only on DML. If you load the
 same data into two databases and run ANALYZE you'll get (modulo random
 sampling) the same stats. And if you never modify it and analyze it
 again a week later you'll get the same stats again. So autovacuum can
 guess when to run analyze based on the number of DML operations, it
 can run it without regard to how busy the system is, and it can hold
 off on running it if the data hasn't changed.

 In the case of the filesystem buffer cache the cached percentage will
 vary over time regardless of whether the data changes. Plain select
 queries will change it, even other activity outside the database will
 change it. There are a bunch of strategies for mitigating this
 problem: we might want to look at the cache situation more frequently,
 discount the results we see more aggressively, and possibly
 maintain a kind of running average over time.

Yes.


 There's another problem which I haven't seen mentioned. Because the
 access method will affect the cache there's the possibility of
 feedback loops. e.g. A freshly loaded system prefers sequential scans
 for a given table because without the cache the seeks of random reads
 are too expensive... causing it to never load that table into cache...
 causing that table to never be cached and never switch to an index
 method. It's possible there are mitigation strategies for this as well

Yeah, that's one of the problems to solve. So far I've tried to keep the
planner behaving as it currently does when rel_oscache == 0, so that a
fresh server will produce the same plans as a server without
rel_oscache.

Those points are to be solved in the cost estimates (and selfuncs). For
this case, there is a balance between page filtering cost and index
access cost. *And* once the table is in cache, the index costs less and
can be the better choice because it needs less filtering (fewer rows,
fewer pages, less work). There is also a possible issue here (if using
the index evicts the table from cache), but I am not too afraid of that
right now.

 such as keeping a running average over time and discounting the
 estimates with some heuristic values.

Yes, definitely something to think about. My biggest fear here is for
shared servers (where services compete for the OS cache, defeating the
kernel's caching strategies).

--
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support



Re: [HACKERS] [WIP] cache estimates, cache access cost

2011-06-19 Thread Robert Haas
On Sun, Jun 19, 2011 at 9:38 AM, Greg Stark st...@mit.edu wrote:
 On Tue, Jun 14, 2011 at 4:04 PM, Robert Haas robertmh...@gmail.com wrote:
 1. ANALYZE happens far too infrequently to believe that any data taken
 at ANALYZE time will still be relevant at execution time.
 2. Using data gathered by ANALYZE will make plans less stable, and our
 users complain not infrequently about the plan instability we already
 have, therefore we should not add more.
 3. Even if the data were accurate and did not cause plan instability, we
 have no evidence that using it will improve real-world performance.

 I feel like this is all baseless FUD. ANALYZE isn't perfect but it's
 our interface for telling postgres to gather stats and we generally
 agree that having stats and modelling the system behaviour as
 accurately as practical is the right direction so we need a specific
 reason why this stat and this bit of modeling is a bad idea before we
 dismiss it.

 I think the kernel of truth in these concerns is simply that
 everything else ANALYZE looks at mutates only on DML. If you load the
 same data into two databases and run ANALYZE you'll get (modulo random
 sampling) the same stats. And if you never modify it and analyze it
 again a week later you'll get the same stats again. So autovacuum can
 guess when to run analyze based on the number of DML operations, it
 can run it without regard to how busy the system is, and it can hold
 off on running it if the data hasn't changed.

 In the case of the filesystem buffer cache the cached percentage will
 vary over time regardless of whether the data changes. Plain select
 queries will change it, even other activity outside the database will
 change it. There are a bunch of strategies for mitigating this
 problem: we might want to look at the cache situation more frequently,
 discount the results we see more aggressively, and possibly
 maintain a kind of running average over time.

 There's another problem which I haven't seen mentioned. Because the
 access method will affect the cache there's the possibility of
 feedback loops. e.g. A freshly loaded system prefers sequential scans
 for a given table because without the cache the seeks of random reads
 are too expensive... causing it to never load that table into cache...
 causing that table to never be cached and never switch to an index
 method. It's possible there are mitigation strategies for this as well
 such as keeping a running average over time and discounting the
 estimates with some heuristic values.

*scratches head*

Well, yeah.  I completely agree with you that these are the things we
need to worry about.  Maybe I did a bad job explaining myself, because
ISTM you said my concerns were FUD and then went on to restate them in
different words.

I'm not bent out of shape about using ANALYZE to try to gather the
information.  That's probably a reasonable approach if it turns out we
actually need to do it at all.  I am not sure we do.  What I've argued
for in the past is that we start by estimating the percentage of the
relation that will be cached based on its size relative to
effective_cache_size, and allow the administrator to override the
percentage on a per-relation basis if it turns out to be wrong.  That
would avoid all of these concerns and allow us to focus on the issue
of how the caching percentages impact the choice of plan, and whether
the plans that pop out are in fact better when you provide information
on caching as input.  If we have that facility in core, then people
can write scripts or plug-in modules to do ALTER TABLE .. SET
(caching_percentage = XYZ) every hour or so based on the sorts of
statistics that Cedric is gathering here, and users will be able to
experiment with a variety of algorithms and determine which ones work
the best.
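
(As one rough illustration of that starting point, here is a sketch under
assumed names and a deliberately crude heuristic, not code from any patch:
estimated_cached_fraction, rel_pages, effective_cache_pages, and override
are all hypothetical.)

/*
 * Sketch only: derive a default cached fraction from the relation's size
 * relative to effective_cache_size, and let an administrator-supplied
 * per-relation override win when it is set.
 */
static double
estimated_cached_fraction(double rel_pages,
                          double effective_cache_pages,
                          double override)       /* < 0 means "not set" */
{
    double frac;

    if (override >= 0.0)
        return override;        /* explicit per-relation setting wins */

    if (rel_pages <= 0.0)
        return 1.0;             /* empty or tiny relation: call it cached */

    /* crude default: a relation larger than the cache is assumed to be
     * proportionally less cached */
    frac = effective_cache_pages / rel_pages;
    return (frac > 1.0) ? 1.0 : frac;
}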

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] [WIP] cache estimates, cache access cost

2011-06-19 Thread Cédric Villemain
2011/6/19 Robert Haas robertmh...@gmail.com:
 On Sun, Jun 19, 2011 at 9:38 AM, Greg Stark st...@mit.edu wrote:
 On Tue, Jun 14, 2011 at 4:04 PM, Robert Haas robertmh...@gmail.com wrote:
 1. ANALYZE happens far too infrequently to believe that any data taken
 at ANALYZE time will still be relevant at execution time.
 2. Using data gathered by ANALYZE will make plans less stable, and our
 users complain not infrequently about the plan instability we already
 have, therefore we should not add more.
 3. Even if the data were accurate and did not cause plan instability, we
 have no evidence that using it will improve real-world performance.

 I feel like this is all baseless FUD. ANALYZE isn't perfect but it's
 our interface for telling postgres to gather stats and we generally
 agree that having stats and modelling the system behaviour as
 accurately as practical is the right direction so we need a specific
 reason why this stat and this bit of modeling is a bad idea before we
 dismiss it.

 I think the kernel of truth in these concerns is simply that
 everything else ANALYZE looks at mutates only on DML. If you load the
 same data into two databases and run ANALYZE you'll get (modulo random
 sampling) the same stats. And if you never modify it and analyze it
 again a week later you'll get the same stats again. So autovacuum can
 guess when to run analyze based on the number of DML operations, it
 can run it without regard to how busy the system is, and it can hold
 off on running it if the data hasn't changed.

 In the case of the filesystem buffer cache the cached percentage will
 vary over time regardless of whether the data changes. Plain select
 queries will change it, even other activity outside the database will
 change it. There are a bunch of strategies for mitigating this
 problem: we might want to look at the cache situation more frequently,
 discount the results we see more aggressively, and possibly
 maintain a kind of running average over time.

 There's another problem which I haven't seen mentioned. Because the
 access method will affect the cache there's the possibility of
 feedback loops. e.g. A freshly loaded system prefers sequential scans
 for a given table because without the cache the seeks of random reads
 are too expensive... causing it to never load that table into cache...
 causing that table to never be cached and never switch to an index
 method. It's possible there are mitigation strategies for this as well
 such as keeping a running average over time and discounting the
 estimates with some heuristic values.

 *scratches head*

 Well, yeah.  I completely agree with you that these are the things we
 need to worry about.  Maybe I did a bad job explaining myself, because
 ISTM you said my concerns were FUD and then went on to restate them in
 different words.

 I'm not bent out of shape about using ANALYZE to try to gather the
 information.  That's probably a reasonable approach if it turns out we
 actually need to do it at all.  I am not sure we do.  What I've argued
 for in the past is that we start by estimating the percentage of the
 relation that will be cached based on its size relative to
 effective_cache_size, and allow the administrator to override the
 percentage on a per-relation basis if it turns out to be wrong.  That
 would avoid all of these concerns and allow us to focus on the issue
 of how the caching percentages impact the choice of plan, and whether
 the plans that pop out are in fact better when you provide information
 on caching as input.  If we have that facility in core, then people
 can write scripts or plug-in modules to do ALTER TABLE .. SET
 (caching_percentage = XYZ) every hour or so based on the sorts of
 statistics that Cedric is gathering here, and users will be able to
 experiment with a variety of algorithms and determine which ones work
 the best.

Robert, I am very surprised.
My patch does offer that.

1st, I used ANALYZE because it is the way I found to update pg_class.
You are suggesting ALTER TABLE instead; that is fine, but give me that
lock-free :)  Otherwise we have, ahem, Alvaro's pg_class_ng (I find this
one interesting because it will be a lot easier to have different values
on a standby server if we find a way to make pg_class_ng 'updatable' per
server).
So, as long as the value can be changed without problems, I don't care
where it resides.

2nd, I provided the patches in the last CF exactly to allow us to get to
the exciting part: the cost-estimate changes. (After all, we can work
on the cost estimates, and if we later find a way to use ALTER TABLE or
pg_class_ng, we can just do that instead of the ANALYZE magic.)

3rd, you can already write a plugin to set the value of rel_oscache
(exactly like the one you would write for an ALTER TABLE SET reloscache ...)

RelationGetRelationOSCacheInFork(Relation relation, ForkNumber forkNum)
{
   float4 percent = 0;
   /* if a plugin is present, let it manage things */
   if (OSCache_hook)
   
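
(The archived message is cut off above.  Purely as a guess at the shape of
such a hook-dispatching function, it might continue roughly like this; the
hook invocation and the zero default are assumptions, not the patch's
actual code:)

float4
RelationGetRelationOSCacheInFork(Relation relation, ForkNumber forkNum)
{
    float4      percent = 0;

    /* if a plugin is present, let it manage things */
    if (OSCache_hook)
        percent = (*OSCache_hook) (relation, forkNum);

    return percent;
}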

Re: [HACKERS] [WIP] cache estimates, cache access cost

2011-06-19 Thread Greg Smith

On 06/19/2011 09:38 AM, Greg Stark wrote:

There's another problem which I haven't seen mentioned. Because the
access method will affect the cache there's the possibility of
feedback loops. e.g. A freshly loaded system prefers sequential scans
for a given table because without the cache the seeks of random reads
are too expensive...


Not sure if it's been mentioned in this thread yet, but the feedback 
issue has popped up in regard to this area plenty of times.  I think 
everyone who's producing regular input into this is aware of it, even if 
it's not mentioned regularly.  I'm not too concerned about the specific 
case you warned about because I don't see how sequential scan vs. index 
costing will be any different on a fresh system than it is now.  But 
there are plenty of cases like it to be mapped out here, and many are 
not solvable--they're just something that needs to be documented as a risk.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us





Re: [HACKERS] [WIP] cache estimates, cache access cost

2011-06-19 Thread Kevin Grittner
Greg Smith  wrote:
 
 I'm not too concerned about the specific case you warned about
 because I don't see how sequential scan vs. index costing will be
 any different on a fresh system than it is now.
 
I think the point is that if, on a fresh system, the first access to
a table is something which uses a table scan -- like select count(*)
-- that all indexed access would then tend to be suppressed for that
table.  After all, for each individual query, selfishly looking at
its own needs in isolation, it likely *would* be faster to use the
cached heap data.
 
I see two ways out of that -- one hard and one easy.  One way would
be to somehow look at the impact on the cache of potential plans and
the resulting impact on overall throughput of the queries being run
with various cache contents.  That's the hard one, in case anyone
wasn't clear.  ;-)  The other way would be to run some percentage of
the queries *without* considering current cache contents, so that the
cache can eventually adapt to the demands.
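
(A standalone sketch of that second idea, with an assumed explore_fraction
knob; nothing here comes from an actual patch:)

#include <stdbool.h>
#include <stdlib.h>

/*
 * With probability explore_fraction, plan as if the cache were cold, so
 * the chosen plan can pull other data (e.g. an index) into the cache and
 * the system gets a chance to escape a cache-driven local optimum.
 */
static bool
use_cache_aware_costing(double explore_fraction)
{
    return ((double) rand() / RAND_MAX) >= explore_fraction;
}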
 
-Kevin



Re: [HACKERS] [WIP] cache estimates, cache access cost

2011-06-19 Thread Greg Smith

On 06/19/2011 06:15 PM, Kevin Grittner wrote:

I think the point is that if, on a fresh system, the first access to
a table is something which uses a table scan -- like select count(*)
-- that all indexed access would then tend to be suppressed for that
table.  After all, for each individual query, selfishly looking at
its own needs in isolation, it likely *would* be faster to use the
cached heap data.
   


If those accesses can compete with other activity, such that the data 
really does stay in the cache rather than being evicted, then what's 
wrong with that?  We regularly have people stop by asking how to pin 
particular relations to the cache, to support exactly this sort of scenario.


What I would expect on any mixed workload is that the table would 
slowly get holes shot in it, as individual sections were evicted for 
more popular index data.  And eventually there'd be little enough left 
for it to win over an index scan.  But if people keep using the copy of 
the table in memory instead, enough so that it never really falls out of 
cache, well that's not necessarily even a problem--it could be 
considered a solution for some.


The possibility that people can fit their entire table into RAM and it 
never leaves there is turning downright probable in some use cases now.  
A good example is cloud instances using EC2, where people often 
architect their systems such that the data set put onto any one node 
fits into RAM.  As soon as that's not true you suffer too much from disk 
issues, so breaking the databases into RAM-sized pieces turns out to be 
very good practice.  It's possible to tune fairly well for this case 
right now--just make the page costs all low.  The harder case that I see 
a lot is where all the hot data fits into cache, but there's a table or 
two of history/archives that don't.  And that would be easier to do the 
right thing with given this bit of "what's in the cache?" percentage 
information.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us





Re: [HACKERS] [WIP] cache estimates, cache access cost

2011-06-19 Thread Robert Haas
On Sun, Jun 19, 2011 at 3:32 PM, Cédric Villemain
cedric.villemain.deb...@gmail.com wrote:
 2nd, I provided the patches in the last CF exactly to allow us to get to
 the exciting part: the cost-estimate changes. (After all, we can work
 on the cost estimates, and if we later find a way to use ALTER TABLE or
 pg_class_ng, we can just do that instead of the ANALYZE magic.)

We're talking past each other here, somehow.  The cost-estimating part
does not require this patch in order to do something useful, but this
patch, AFAICT, absolutely does require the cost-estimating part to do
something useful.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



[HACKERS] [WIP] cache estimates, cache access cost

2011-06-14 Thread Cédric Villemain
2011/5/16 Greg Smith g...@2ndquadrant.com:
 Cédric Villemain wrote:


 http://git.postgresql.org/gitweb?p=users/c2main/postgres.git;a=shortlog;h=refs/heads/analyze_cache


 This rebases easily to make Cedric's changes move to the end; I just pushed
 a version with that change to
 https://github.com/greg2ndQuadrant/postgres/tree/analyze_cache if anyone
 wants a cleaner one to browse.  I've attached a patch too if that's more
 your thing.

 I'd recommend not getting too stuck on the particular hook Cédric has added
 here to compute the cache estimate, which uses mmap and mincore to figure it
 out.  It's possible to compute similar numbers, albeit less accurate, using
 an approach similar to how pg_buffercache inspects things.  And I even once
 wrote a background writer extension that collected this sort of data as it
 was running the LRU scan anyway.  Discussions of this idea seem to focus on
 how the "what's in the cache?" data is collected, which as far as I'm
 concerned is the least important part.  There are multiple options, some
 work better than others, and there's no reason that can't be swapped out
 later.  The more important question is how to store the data collected and
 then use it for optimizing queries.
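
(For readers unfamiliar with the approach mentioned above, here is a
self-contained sketch of using mmap() and mincore() to report how much of a
file is resident in the OS page cache.  It is illustrative only, not the
plugin from the patch:)

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int
main(int argc, char **argv)
{
    int            fd;
    struct stat    st;
    void          *map;
    unsigned char *vec;          /* Linux prototype; some BSDs use char * */
    size_t         pages, resident = 0;
    long           pagesize = sysconf(_SC_PAGESIZE);

    if (argc != 2)
    {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return 1;
    }
    fd = open(argv[1], O_RDONLY);
    if (fd < 0 || fstat(fd, &st) < 0 || st.st_size == 0)
        return 1;

    pages = (st.st_size + pagesize - 1) / pagesize;
    map = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    vec = malloc(pages);
    if (map == MAP_FAILED || vec == NULL)
        return 1;

    /* mincore() fills one byte per page; bit 0 set means "resident" */
    if (mincore(map, st.st_size, vec) == 0)
    {
        for (size_t i = 0; i < pages; i++)
            if (vec[i] & 1)
                resident++;
        printf("%zu of %zu pages in OS cache (%.1f%%)\n",
               resident, pages, 100.0 * resident / pages);
    }

    free(vec);
    munmap(map, st.st_size);
    close(fd);
    return 0;
}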

Attached are updated patches without the plugin itself. I've also
added the cache_page_cost GUC; unlike the other page_cost settings, this
one is not per-tablespace.

There are 6 patches:

0001-Add-reloscache-column-to-pg_class.patch
0002-Add-a-function-to-update-the-new-pg_class-cols.patch
0003-Add-ANALYZE-OSCACHE-VERBOSE-relation.patch
0004-Add-a-Hook-to-handle-OSCache-stats.patch
0005-Add-reloscache-to-Index-Rel-OptInfo.patch
0006-Add-cache_page_cost-GUC.patch

I have some comments on my own code:

* I am not sure of the best datatype to use for 'reloscache'
* I didn't include the catalog number change in the patch itself.
* oscache_update_relstats() is very similar to vac_update_relstats();
it may be better to merge them, but reloscache should not be updated at
the same time as the other stats.
* There is probably too much work done in do_oscache_analyze_rel()
because I kept vac_open_indexes() (not a big drama atm)
* I don't know gram.y very well, so I am not sure my changes cover all
cases.
* No tests; similar columns and GUCs do not have tests either, but this
still lacks a test for ANALYZE OSCACHE

-- 
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support
From d2fe7e85aea31cfe8cd6559a060f71c424fe03af Mon Sep 17 00:00:00 2001
From: Cédric Villemain ced...@2ndquadrant.fr
Date: Wed, 25 May 2011 23:17:36 +0200
Subject: [PATCH 1/7] Add reloscache column to pg_class

1 column reloscache to contain the percentage of pages in cache per relation.
May be used by the planner and updated with ANALYZE OSCACHE; (not done yet,
see next commits)
---
 doc/src/sgml/catalogs.sgml |   11 +
 src/backend/catalog/heap.c |1 +
 src/backend/utils/cache/relcache.c |2 +
 src/include/catalog/pg_class.h |   44 ++-
 4 files changed, 37 insertions(+), 21 deletions(-)

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
new file mode 100644
index 8504555..4cfad39
*** a/doc/src/sgml/catalogs.sgml
--- b/doc/src/sgml/catalogs.sgml
***
*** 1634,1639 
--- 1634,1650 
   </row>
  
   <row>
+   <entry><structfield>reloscache</structfield></entry>
+   <entry><type>float4</type></entry>
+   <entry></entry>
+   <entry>
+    Percentage of the files in OS cache.  This is only an estimate used by
+    the planner.  It is updated by <command>ANALYZE OSCACHE</command>.
+    By default, the value is not updated and an extension is required.
+   </entry>
+  </row>
+ 
+  <row>
    <entry><structfield>reltoastrelid</structfield></entry>
    <entry><type>oid</type></entry>
    <entry><literal><link linkend="catalog-pg-class"><structname>pg_class</structname></link>.oid</literal></entry>
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
new file mode 100644
index a6e541d..2043c40
*** a/src/backend/catalog/heap.c
--- b/src/backend/catalog/heap.c
*** InsertPgClassTuple(Relation pg_class_des
*** 764,769 
--- 764,770 
  	values[Anum_pg_class_reltablespace - 1] = ObjectIdGetDatum(rd_rel->reltablespace);
  	values[Anum_pg_class_relpages - 1] = Int32GetDatum(rd_rel->relpages);
  	values[Anum_pg_class_reltuples - 1] = Float4GetDatum(rd_rel->reltuples);
+ 	values[Anum_pg_class_reloscache - 1] = Float4GetDatum(rd_rel->reloscache);
  	values[Anum_pg_class_reltoastrelid - 1] = ObjectIdGetDatum(rd_rel->reltoastrelid);
  	values[Anum_pg_class_reltoastidxid - 1] = ObjectIdGetDatum(rd_rel->reltoastidxid);
  	values[Anum_pg_class_relhasindex - 1] = BoolGetDatum(rd_rel->relhasindex);
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
new file mode 100644
index d7e94ff..ca09e3b
*** a/src/backend/utils/cache/relcache.c
--- 

Re: [HACKERS] [WIP] cache estimates, cache access cost

2011-06-14 Thread Robert Haas
On Tue, Jun 14, 2011 at 10:29 AM, Cédric Villemain
cedric.villemain.deb...@gmail.com wrote:
 0001-Add-reloscache-column-to-pg_class.patch
 0002-Add-a-function-to-update-the-new-pg_class-cols.patch
 0003-Add-ANALYZE-OSCACHE-VERBOSE-relation.patch
 0004-Add-a-Hook-to-handle-OSCache-stats.patch
 0005-Add-reloscache-to-Index-Rel-OptInfo.patch
 0006-Add-cache_page_cost-GUC.patch

It seems to me that posting updated versions of this patch gets us no
closer to addressing the concerns I (and Tom, on other threads)
expressed about this idea previously.  Specifically:

1. ANALYZE happens far too infrequently to believe that any data taken
at ANALYZE time will still be relevant at execution time.
2. Using data gathered by ANALYZE will make plans less stable, and our
users complain not infrequently about the plan instability we already
have, therefore we should not add more.
3. Even if the data were accurate and did not cause plan instability, we
have no evidence that using it will improve real-world performance.

Now, it's possible that you or someone else could provide some
experimental evidence refuting these points.  But right now there
isn't any, and until there is, -1 from me on applying any of this.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] [WIP] cache estimates, cache access cost

2011-06-14 Thread Cédric Villemain
2011/6/14 Robert Haas robertmh...@gmail.com:
 On Tue, Jun 14, 2011 at 10:29 AM, Cédric Villemain
 cedric.villemain.deb...@gmail.com wrote:
 0001-Add-reloscache-column-to-pg_class.patch
 0002-Add-a-function-to-update-the-new-pg_class-cols.patch
 0003-Add-ANALYZE-OSCACHE-VERBOSE-relation.patch
 0004-Add-a-Hook-to-handle-OSCache-stats.patch
 0005-Add-reloscache-to-Index-Rel-OptInfo.patch
 0006-Add-cache_page_cost-GUC.patch

 It seems to me that posting updated versions of this patch gets us no
 closer to addressing the concerns I (and Tom, on other threads)
 expressed about this idea previously.  Specifically:

 1. ANALYZE happens far too infrequently to believe that any data taken
 at ANALYZE time will still be relevant at execution time.

ANALYZE happens when people execute it, else it is auto-analyze and I
am not providing auto-analyze-oscache.
ANALYZE OSCACHE is just a very simple wrapper to update pg_class. The
frequency is not important here, I believe.

 2. Using data gathered by ANALYZE will make plans less stable, and our
 users complain not infrequently about the plan instability we already
 have, therefore we should not add more.

Again, it is hard to do an UPDATE pg_class SET reloscache, so I used
the ANALYZE logic.
Also, I have taken into account the fact that someone may want to SET
the values, as was also suggested, so my patches allow you to say:
'this table is 95% in cache, the DBA said' (which is stable, not based
on OS stats).

This case has been suggested several times and is covered by my patch.

 3. Even if the data were accurate and did not cause plan stability, we
 have no evidence that using it will improve real-world performance.

I have not finished my work on cost estimation; I believe this work
will take some time and can be done in another commitfest. At the
moment my patches do not change anything in the decisions of the
planner; they just offer the tools I need to hack on cost estimates.


 Now, it's possible that you or someone else could provide some
 experimental evidence refuting these points.  But right now there
 isn't any, and until there is, -1 from me on applying any of this.

I was trying to split the patch into groups of features to reduce
its size. The work is in progress.


 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company




-- 
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support



Re: [HACKERS] [WIP] cache estimates, cache access cost

2011-06-14 Thread Greg Smith

On 06/14/2011 11:04 AM, Robert Haas wrote:

Even if the data were accurate and did not cause plan instability, we
have no evidence that using it will improve real-world performance.
   


That's the dependency Cédric has provided us a way to finally make 
progress on.  Everyone says there's no evidence that this whole approach 
will improve performance.  But we can't collect such data, to prove or 
disprove it helps, without a proof of concept patch that implements 
*something*.  You may not like the particular way the data is collected 
here, but it's a working implementation that may be useful for some 
people.  I'll take data collected at ANALYZE time as a completely 
reasonable way to populate the new structures with realistic enough test 
data to use initially.


Surely at least one other way to populate the statistics, and possibly 
multiple other ways that the user selects, will be needed eventually.  I 
commented a while ago on this thread:  every one of these discussions 
always gets dragged into the details of how the cache statistics data 
will be collected and rejects whatever is suggested as not good enough.  
Until that stops, no progress will ever get made on the higher level 
details.  By its nature, developing toward integrating cached 
percentages is going to lurch forward on both collecting the cache 
data and using the cache knowledge in queries fronts almost 
independently.  This is not a commit candidate; it's the first useful 
proof of concept step for something we keep talking about but never 
really doing.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us





Re: [HACKERS] [WIP] cache estimates, cache access cost

2011-06-14 Thread Robert Haas
On Tue, Jun 14, 2011 at 1:10 PM, Greg Smith g...@2ndquadrant.com wrote:
 On 06/14/2011 11:04 AM, Robert Haas wrote:
 Even if the data were accurate and did not cause plan instability, we
 have no evidence that using it will improve real-world performance.

 That's the dependency Cédric has provided us a way to finally make progress
 on.  Everyone says there's no evidence that this whole approach will improve
 performance.  But we can't collect such data, to prove or disprove it helps,
 without a proof of concept patch that implements *something*.  You may not
 like the particular way the data is collected here, but it's a working
 implementation that may be useful for some people.  I'll take data
 collected at ANALYZE time as a completely reasonable way to populate the
 new structures with realistic enough test data to use initially.

But there's no reason that code (which may or may not eventually prove
useful) has to be incorporated into the main tree.  We don't commit
code so people can go benchmark it; we ask for the benchmarking to be
done first, and then if the results are favorable, we commit the code.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] [WIP] cache estimates, cache access cost

2011-06-14 Thread Robert Haas
On Tue, Jun 14, 2011 at 12:06 PM, Cédric Villemain
cedric.villemain.deb...@gmail.com wrote:
 1. ANALYZE happens far too infrequently to believe that any data taken
 at ANALYZE time will still be relevant at execution time.

 ANALYZE happens when people execute it, else it is auto-analyze and I
 am not providing auto-analyze-oscache.
 ANALYZE OSCACHE is just a very simple wrapper to update pg_class. The
 frequency is not important here, I believe.

Well, I'm not saying you have to have all the answers to post a WIP
patch, certainly.  But in terms of getting something committable, it
seems like we need to have at least an outline of what the long-term
plan is.  If ANALYZE OSCACHE is an infrequent operation, then the data
isn't going to be a reliable guide to what will happen at execution
time...

 2. Using data gathered by ANALYZE will make plans less stable, and our
 users complain not infrequently about the plan instability we already
 have, therefore we should not add more.

...and if it is a frequent operation then it's going to result in
unstable plans (and maybe pg_class bloat).  There's a fundamental
tension here that I don't think you can just wave your hands at.

 I was trying to split the patch into groups of features to reduce
 its size. The work is in progress.

Totally reasonable, but I can't see committing any of it without some
evidence that there's light at the end of the tunnel.  No performance
tests *whatsoever* have been done.  We can debate the exact amount of
evidence that should be required to prove that something is useful
from a performance perspective, but we at least need some.  I'm
beating on this point because I believe that the whole idea of trying
to feed this information back into the planner is going to turn out to
be something that we don't want to do.  I think it's going to turn out
to have downsides that are far larger than the upsides.  I am
completely willing to be be proven wrong, but right now I think this
will make things worse and you think it will make things better and I
don't see any way to bridge that gap without doing some measurements.

For example, if you run this patch on a system and subject that system
to a relatively even workload, how much do the numbers bounce around
between runs?  What if you vary the workload, so that you blast it
with OLTP traffic at some times and then run reporting queries at
other times?  Or different tables become hot at different times?

Once you've written code to make the planner do something with the
caching % values, then you can start to explore other questions.  Can
you generate plan instability, especially on complex queries, which
are more prone to change quickly based on small changes in the cost
estimates?  Can you demonstrate a workload where bad performance is
inevitable with the current code, but with your code, the system
becomes self-tuning and ends up with good performance?  What happens
if you have a large cold table with a small hot end where all activity
is concentrated?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] [WIP] cache estimates, cache access cost

2011-06-14 Thread Cédric Villemain
2011/6/14 Robert Haas robertmh...@gmail.com:
 On Tue, Jun 14, 2011 at 12:06 PM, Cédric Villemain
 cedric.villemain.deb...@gmail.com wrote:
 1. ANALYZE happens far too infrequently to believe that any data taken
 at ANALYZE time will still be relevant at execution time.

 ANALYZE happens when people execute it, else it is auto-analyze and I
 am not providing auto-analyze-oscache.
 ANALYZE OSCACHE is just a very simple wrapper to update pg_class. The
 frequency is not important here, I believe.

 Well, I'm not saying you have to have all the answers to post a WIP
 patch, certainly.  But in terms of getting something committable, it
 seems like we need to have at least an outline of what the long-term
 plan is.  If ANALYZE OSCACHE is an infrequent operation, then the data
 isn't going to be a reliable guide to what will happen at execution
 time...

Ok.


 2. Using data gathered by ANALYZE will make plans less stable, and our
 users complain not infrequently about the plan instability we already
 have, therefore we should not add more.

 ...and if it is a frequent operation then it's going to result in
 unstable plans (and maybe pg_class bloat).  There's a fundamental
 tension here that I don't think you can just wave your hands at.

I don't want to hide that point, which is correct.
The idea is not to have something that needs to be updated too often,
but it does need to be taken into account.


 I was trying to split the patch into groups of features to reduce
 its size. The work is in progress.

 Totally reasonable, but I can't see committing any of it without some
 evidence that there's light at the end of the tunnel.  No performance
 tests *whatsoever* have been done.  We can debate the exact amount of
 evidence that should be required to prove that something is useful
 from a performance perspective, but we at least need some.  I'm
 beating on this point because I believe that the whole idea of trying
 to feed this information back into the planner is going to turn out to
 be something that we don't want to do.  I think it's going to turn out
 to have downsides that are far larger than the upsides.

It is possible, yes.
I try to make the changes such that if reloscache has its default value,
the planner keeps the same behavior as in the past.

 I am
 completely willing to be be proven wrong, but right now I think this
 will make things worse and you think it will make things better and I
 don't see any way to bridge that gap without doing some measurements.

correct.


 For example, if you run this patch on a system and subject that system
 to a relatively even workload, how much do the numbers bounce around
 between runs?  What if you vary the workload, so that you blast it
 with OLTP traffic at some times and then run reporting queries at
 other times?  Or different tables become hot at different times?

This is all true, and it is *already* true.
Like the thread about random_page_cost vs index_page_cost, where the
good option is to change the parameters at certain moments of the day
(IIRC the use case).

I mean that I agree those benchmarks need to be done; hopefully I can
fix some use cases while not breaking others too much, or not at all,
or ...


 Once you've written code to make the planner do something with the
 caching % values, then you can start to explore other questions.  Can
 you generate plan instability, especially on complex queries, which
 are more prone to change quickly based on small changes in the cost
 estimates?  Can you demonstrate a workload where bad performance is
 inevitable with the current code, but with your code, the system

My next step is the cost estimation changes. I already have some very
small use cases where the minimal changes I have made so far are
interesting, but that is not enough to present as evidence.

 becomes self-tuning and ends up with good performance?  What happens
 if you have a large cold table with a small hot end where all activity
 is concentrated?

We are at step 3 here :-)  I already have some ideas for handling those
situations, but they are not yet polished.

The current idea is to be conservative, like PostgreSQL used to be, for example:

/*
 * disk and cache costs
 * This assumes an agnostic knowledge of the data repartition and query
 * usage, despite the fact that large tables may have a hot part of 10%
 * which is the only requested part, or that we select only (c)old data
 * so the cache is useless.  We keep the original strategy to not guess
 * too much and just weight the cost globally.
 */
run_cost += baserel->pages * (spc_seq_page_cost * (1 - baserel->oscache)
                              + cache_page_cost * baserel->oscache);
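
(To illustrate with made-up numbers: with spc_seq_page_cost = 1.0, a
hypothetical cache_page_cost = 0.1, and oscache = 0.9, the per-page charge
becomes (1 - 0.9) * 1.0 + 0.9 * 0.1 = 0.19, while with oscache = 0 it stays
at 1.0, i.e. the current behaviour.)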



 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company




-- 
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support


Re: [HACKERS] [WIP] cache estimates, cache access cost

2011-06-14 Thread Alvaro Herrera
Excerpts from Cédric Villemain's message of mar jun 14 10:29:36 -0400 2011:

 Attached are updated patches without the plugin itself. I've also
 added the cache_page_cost GUC, this one is not per tablespace, like
 others page_cost.
 
 There are 6 patches:
 
 0001-Add-reloscache-column-to-pg_class.patch

Hmm, do you really need this to be a new column?  Would it work to have
it be a reloption?

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] [WIP] cache estimates, cache access cost

2011-06-14 Thread Cédric Villemain
2011/6/14 Alvaro Herrera alvhe...@commandprompt.com:
 Excerpts from Cédric Villemain's message of mar jun 14 10:29:36 -0400 2011:

 Attached are updated patches without the plugin itself. I've also
 added the cache_page_cost GUC, this one is not per tablespace, like
 others page_cost.

 There are 6 patches:

 0001-Add-reloscache-column-to-pg_class.patch

 Hmm, do you really need this to be a new column?  Would it work to have
 it be a reloption?

If we can have ALTER TABLE running under a heavy workload, why not.
I am a bit scared by the effect of such a reloption: it focuses on a
hint-oriented strategy, when I would like to allow a dynamic strategy
from the server. That work is not done and may not work, so a reloption
is good at least as a backup (and is more in line with the idea
suggested by Tom and others).


 --
 Álvaro Herrera alvhe...@commandprompt.com
 The PostgreSQL Company - Command Prompt, Inc.
 PostgreSQL Replication, Consulting, Custom Development, 24x7 support




-- 
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support



Re: [HACKERS] [WIP] cache estimates, cache access cost

2011-06-14 Thread Alvaro Herrera
Excerpts from Cédric Villemain's message of mar jun 14 17:10:20 -0400 2011:

 If we can have ALTER TABLE running under a heavy workload, why not.
 I am a bit scared by the effect of such a reloption: it focuses on a
 hint-oriented strategy, when I would like to allow a dynamic strategy
 from the server. That work is not done and may not work, so a reloption
 is good at least as a backup (and is more in line with the idea
 suggested by Tom and others).

Hmm, sounds like yet another use case for pg_class_nt.  Why do these
keep popping up?

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] [WIP] cache estimates, cache access cost

2011-06-14 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes:
 Excerpts from Cédric Villemain's message of mar jun 14 10:29:36 -0400 2011:
 0001-Add-reloscache-column-to-pg_class.patch

 Hmm, do you really need this to be a new column?  Would it work to have
 it be a reloption?

If it's to be updated in the same way as ANALYZE updates reltuples and
relpages (ie, an in-place non-transactional update), I think it'll have
to be a real column.

regards, tom lane



Re: [HACKERS] [WIP] cache estimates, cache access cost

2011-06-14 Thread Greg Smith

On 06/14/2011 01:16 PM, Robert Haas wrote:

But there's no reason that code (which may or may not eventually prove
useful) has to be incorporated into the main tree.  We don't commit
code so people can go benchmark it; we ask for the benchmarking to be
done first, and then if the results are favorable, we commit the code.
   


Who said anything about this being a commit candidate?  The WIP in the 
subject says it's not intended to be.  The community asks people to 
submit design ideas early so that ideas around them can be explored 
publicly.  One of the things that needs to be explored, and that could 
use some community feedback, is exactly how this should be benchmarked 
in the first place.  This topic--planning based on cached 
percentage--keeps coming up, but hasn't gone very far as an abstract 
discussion.  Having a patch to test lets it turn to a concrete one.


Note that I already listed myself as the reviewer  here, so it's not 
even like this is asking explicitly for a community volunteer to help.  
Would you like us to research this privately and then dump a giant patch 
that is commit candidate quality on everyone six months from now, 
without anyone else getting input to the process, or would you like the 
work to happen here?  I recommended Cédric not ever bother soliciting 
ideas early, because I didn't want to get into this sort of debate.  I 
avoid sending anything here unless I already have a strong idea about 
the solution, because it's hard to keep criticism at bay even with 
that.  He was more optimistic about working within the community 
contribution guidelines and decided to send this over early instead.  If 
you feel this is too rough to even discuss, I'll mark it returned with 
feedback and we'll go develop this ourselves.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us





Re: [HACKERS] [WIP] cache estimates, cache access cost

2011-06-14 Thread Bruce Momjian
Greg Smith wrote:
 On 06/14/2011 01:16 PM, Robert Haas wrote:
  But there's no reason that code (which may or may not eventually prove
  useful) has to be incorporated into the main tree.  We don't commit
  code so people can go benchmark it; we ask for the benchmarking to be
  done first, and then if the results are favorable, we commit the code.
 
 
 Who said anything about this being a commit candidate?  The WIP in the 
 subject says it's not intended to be.  The community asks people to 
 submit design ideas early so that ideas around them can be explored 
 publicly.  One of the things that needs to be explored, and that could 
 use some community feedback, is exactly how this should be benchmarked 
 in the first place.  This topic--planning based on cached 
 percentage--keeps coming up, but hasn't gone very far as an abstract 
 discussion.  Having a patch to test lets it turn to a concrete one.
 
 Note that I already listed myself as the reviewer  here, so it's not 
 even like this is asking explicitly for a community volunteer to help.  
 Would you like us to research this privately and then dump a giant patch 
 that is commit candidate quality on everyone six months from now, 
 without anyone else getting input to the process, or would you like the 
  work to happen here?  I recommended Cédric not ever bother soliciting 
 ideas early, because I didn't want to get into this sort of debate.  I 
 avoid sending anything here unless I already have a strong idea about 
 the solution, because it's hard to keep criticism at bay even with 
 that.  He was more optimistic about working within the community 
 contribution guidelines and decided to send this over early instead.  If 
 you feel this is too rough to even discuss, I'll mark it returned with 
 feedback and we'll go develop this ourselves.

I would like to see us continue researching in this direction.  I think
perhaps the background writer would be ideal for collecting this
information because it scans the buffer cache already, and frequently.
(Yes, I know it can't access databases.)

I think random_page_cost is a dead-end --- it will never be possible for
it to produce the right value for us.  Its value is tied up in caching,
e.g. the default 4 is not the right value for a physical drive (it
should be much higher), but kernel and shared buffer caching require it
to be a hybrid number that isn't really realistic.  And once we have
caching baked into that number, the caching is obviously not going to be
even across all tables.  Hence, there is no way for random_page_cost to be
improved and we have to start thinking about alternatives.
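
(A rough worked illustration of that hybrid nature, with made-up figures:
if a truly random read were, say, 40 times the cost of a sequential one
but roughly 90% of random reads were satisfied from cache, the blended
ratio would be about 0.9 * 1 + 0.1 * 40 = 4.9, which is the only way a
default as low as 4 can make sense for spinning disks.)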

Basically, random_page_cost is a terrible setting and we have to admit
that and move forward.  I realize the concerns about unstable plans, and
we might need to give users the option of stable plans with a fixed
random_page_cost, but at this point we don't even have enough data to
know we need that.  What we do know is that random_page_cost is
inadequate.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] [WIP] cache estimates, cache access cost

2011-06-14 Thread Tom Lane
Greg Smith g...@2ndquadrant.com writes:
 On 06/14/2011 01:16 PM, Robert Haas wrote:
 But there's no reason that code (which may or may not eventually prove
 useful) has to be incorporated into the main tree.  We don't commit
 code so people can go benchmark it; we ask for the benchmarking to be
 done first, and then if the results are favorable, we commit the code.

 Who said anything about this being a commit candidate?  The WIP in the 
 subject says it's not intended to be.  The community asks people to 
 submit design ideas early so that ideas around them can be explored 
 publicly.  One of the things that needs to be explored, and that could 
 use some community feedback, is exactly how this should be benchmarked 
 in the first place.  This topic--planning based on cached 
 percentage--keeps coming up, but hasn't gone very far as an abstract 
 discussion.  Having a patch to test lets it turn to a concrete one.

Yeah, it *can't* go very far as an abstract discussion ... we need some
realistic testing to decide whether this is a good idea, and you can't
get that without code.

I think the real underlying issue here is that we have this CommitFest
process that is focused on getting committable or nearly-committable
code into the tree, and it just doesn't fit well for experimental code.
I concur with Robert's desire to not push experimental code into the
main repository, but we need to have *some* way of working with it.
Maybe a separate repo where experimental branches could hang out would
be helpful?

(Another way of phrasing my point is that "WIP" is not conveying the
true status of this patch.  Maybe "Experimental" would be an appropriate
label.)

regards, tom lane



Re: [HACKERS] [WIP] cache estimates, cache access cost

2011-06-14 Thread Greg Smith

On 06/14/2011 07:08 PM, Tom Lane wrote:

I concur with Robert's desire to not push experimental code into the
main repository, but we need to have *some* way of working with it.
Maybe a separate repo where experimental branches could hang out would
be helpful?
   


Well, this one is sitting around in branches at both git.postgresql.org 
and github so far, both being updated periodically.  Maybe there's some 
value around an official experimental repository too, but I thought that 
was the idea of individual people having their own directories on 
git.postgresql.org.  Do we need something fancier than that?  It would be 
nice, but it seems like little return on investment to improve that, 
relative to what you can do easily enough now.


The idea David Fetter has been advocating of having a bit rot farm to 
help detect when the experimental branches drift too far out of date 
tries to make that concept really formal.  I like that idea, too, but 
find it hard to marshal enough resources to do something about it.  The 
current status quo isn't that terrible; noticing bit rot when it's 
relevant isn't that hard to do.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us





Re: [HACKERS] [WIP] cache estimates, cache access cost

2011-06-14 Thread Robert Haas
On Tue, Jun 14, 2011 at 6:17 PM, Greg Smith g...@2ndquadrant.com wrote:
 Who said anything about this being a commit candidate?  The WIP in the
 subject says it's not intended to be.  The community asks people to submit
 design ideas early so that ideas around them can be explored publicly.  One
 of the things that needs to be explored, and that could use some community
 feedback, is exactly how this should be benchmarked in the first place.
  This topic--planning based on cached percentage--keeps coming up, but
 hasn't gone very far as an abstract discussion.  Having a patch to test lets
 it turn to a concrete one.

 Note that I already listed myself as the reviewer  here, so it's not even
 like this is asking explicitly for a community volunteer to help.  Would you
 like us to research this privately and then dump a giant patch that is
 commit candidate quality on everyone six months from now, without anyone
 else getting input to the process, or would you like the work to happen
 here?  I recommended Cédric not ever bother soliciting ideas early, because
 I didn't want to get into this sort of debate.  I avoid sending anything
 here unless I already have a strong idea about the solution, because it's
 hard to keep criticism at bay even with that.  He was more optimistic about
 working within the community contribution guidelines and decided to send
 this over early instead.  If you feel this is too rough to even discuss,
 I'll mark it returned with feedback and we'll go develop this ourselves.

My usual trope on this subject is that WIP patches tend to elicit
helpful feedback if and only if the patch author is clear about what
sort of feedback they are seeking.  I'm interested in this topic, so,
I'm willing to put some effort into it; but, as I've said before, I
think this patch is coming from the wrong end, so in the absence of
any specific guidance on what sort of input would be useful, that's
the feedback you're getting.  Feel free to clarify what would be more
helpful.  :-)

Incidentally, I have done a bit of math around how to rejigger the
costing formulas to take cached_page_cost and caching_percentage into
account, which I think is the most interesting end to start this work
from.  If it's helpful, I can write it up in a more organized way and
post that; it likely wouldn't be that much work to incorporate it into
what Cedric has here already.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
