Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-14 Thread marcin mank
On Thu, Jul 9, 2009 at 5:38 AM, Noah Misch n...@leadboat.com wrote:
 Describing in those terms illuminates much.  While the concepts do suggest 2^N
 worst-case planning cost, my artificial test case showed a rigid 4^N pattern;
 what could explain that?


Isn't that just because the planner has to examine O(2^N) subsets of
relations, and do O(2^N) work for each of them? To create a level-N join
the planner chooses pairs of level-k and level-(N-k) joins. The count of
level-k joins is O(2^k), the count of level-(N-k) ones is O(2^(N-k)).
Together it is O(N) * O(2^N) * O(2^k) * O(2^(N-k)), which is O(N * 4^N).
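One way to pin down the 4^N exactly, assuming the worst case where every
subset of the N relations is a joinable level (so the subset counts are the
exact C(N,k) rather than my O(2^k) bounds): the number of (level-k,
level-(N-k)) pairings summed over all splits is

    \sum_{k=0}^{N} \binom{N}{k} \binom{N}{N-k}
        = \binom{2N}{N}
        = \Theta\!\left(\frac{4^N}{\sqrt{N}}\right)

by Vandermonde's identity and Stirling's approximation, so a rigid 4^N-like
growth curve is just what a fully unconstrained search should show.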


This is for the worst case. If we could make a better estimate of the
required planning time (I believe that the input data for a good
heuristic is a matrix which says which relation is constrained to
which relation), we could make better decisions about when to flatten
subqueries, collapse joins, launch geqo...

Greetings
Marcin



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-12 Thread Andres Freund
Hi Tom,

On Saturday 11 July 2009 20:33:14 Tom Lane wrote:
 Andres Freund and...@anarazel.de writes:
  I just realized doing it in a naive way (in geqo()) causes the state to
  be reset multiple times during one query - at every invocation of
  make_rel_from_joinlist.
 
  Does anybody see a problem with that?

 I think that's probably what we want.  Otherwise, you'd have a situation
 where two identical subproblems might get planned differently.

 This ties into something I was thinking about earlier: the planner is
 potentially recursive (eg, it might call a user-defined function that
 contains a plannable query), and it'd probably be a good idea if the
 behavior of GEQO wasn't sensitive to that.  So the RNG's state needs to
 be kept in PlannerInfo or some associated structure, not be global.
Hm. I looked a bit into this and I don't see a real problem with a global
random state if that one gets reinitialized on every geqo() invocation. If I
understood the code correctly - which is not certain at all - while
make_rel_from_joinlist is itself recursive, the actual join search code is
not. Correct?
Thus it would be enough to reset the seed on every geqo() invocation...

Any counter arguments?

Andres



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-12 Thread Tom Lane
Andres Freund and...@anarazel.de writes:
 On Saturday 11 July 2009 20:33:14 Tom Lane wrote:
 This ties into something I was thinking about earlier: the planner is
 potentially recursive (eg, it might call a user-defined function that
 contains a plannable query), and it'd probably be a good idea if the
 behavior of GEQO wasn't sensitive to that.  So the RNG's state needs to
 be kept in PlannerInfo or some associated structure, not be global.

 Hm. I looked a bit into this and I don't see a real problem with a global
 random state if that one gets reinitialized on every geqo() invocation. If I
 understood the code correctly - which is not certain at all - while
 make_rel_from_joinlist is itself recursive, the actual join search code is
 not. Correct?

I wouldn't count on that.  GEQO is not recursive with respect to a
particular query, but there's still the risk of the planner deciding
to call out to some user-defined code while it does selectivity
estimates.  So the planner as a whole has to be re-entrant.

Now you could probably argue that the impact of extra RNG resets on
the overall behavior is small enough that it doesn't matter.  But if
you really want to promise consistent GEQO behavior then I think the
RNG state has to be local to a particular planner instantiation.

regards, tom lane



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-12 Thread Andres Freund
On Sunday 12 July 2009 16:44:59 Tom Lane wrote:
 Andres Freund and...@anarazel.de writes:
  On Saturday 11 July 2009 20:33:14 Tom Lane wrote:
  This ties into something I was thinking about earlier: the planner is
  potentially recursive (eg, it might call a user-defined function that
  contains a plannable query), and it'd probably be a good idea if the
  behavior of GEQO wasn't sensitive to that.  So the RNG's state needs to
  be kept in PlannerInfo or some associated structure, not be global.
 
  Hm. I looked a bit into this and I don't see a real problem with a global
  random state if that one gets reinitialized on every geqo() invocation.
  If I understood the code correctly - which is not certain at all - while
  make_rel_from_joinlist is itself recursive, the actual join search code
  is not. Correct?
 I wouldn't count on that.  GEQO is not recursive with respect to a
 particular query, but there's still the risk of the planner deciding
 to call out to some user-defined code while it does selectivity
 estimates.  So the planner as a whole has to be re-entrant.

 Now you could probably argue that the impact of extra RNG resets on
 the overall behavior is small enough that it doesn't matter.  But if
 you really want to promise consistent GEQO behavior then I think the
 RNG state has to be local to a particular planner instantiation.
I just did not see that it could call selectivity estimation functions. This
will mean adding an additional parameter (PlannerInfo) to most of the geqo
functions, but I don't see a problem there.
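Roughly the shape that would take - a sketch only, with illustrative names
(GeqoPrivateData, join_search_private; pg_erand48 standing in for whatever
portable erand48-style generator we settle on), not a finished patch:

/*
 * Sketch: keep GEQO's RNG state with the planner instance instead of in a
 * global variable, so a recursive planner invocation cannot disturb it.
 */
typedef struct GeqoPrivateData
{
    unsigned short random_state[3];     /* state for an erand48-style RNG */
} GeqoPrivateData;

/* run at the start of every geqo() invocation, for reproducible plans */
static void
geqo_set_seed(PlannerInfo *root, double seed)
{
    GeqoPrivateData *priv = (GeqoPrivateData *) root->join_search_private;

    /* map the proposed geqo_seed GUC's [0,1) value onto the 48 state bits */
    memset(priv->random_state, 0, sizeof(priv->random_state));
    memcpy(priv->random_state, &seed,
           Min(sizeof(priv->random_state), sizeof(seed)));
}

/* geqo_random.h would then draw from the planner-local state, e.g.: */
#define geqo_rand(root) \
    pg_erand48(((GeqoPrivateData *) (root)->join_search_private)->random_state)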

Has anybody tried GEQO without ERX recently?

Andres



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-12 Thread Tom Lane
Andres Freund and...@anarazel.de writes:
 Has anybody tried GEQO without ERX recently?

No, I don't think so.  Anything that's ifdef'd out at the moment has
been that way for quite a few years, and so is likely broken anyhow :-(

regards, tom lane



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-11 Thread Andres Freund
On Wednesday 08 July 2009 23:46:02 Tom Lane wrote:
 Kevin Grittner kevin.gritt...@wicourts.gov writes:
  For a moment it seemed logical to suggest a session GUC for the seed,
  so if you got a bad plan you could keep rolling the dice until you got
  one you liked; but my right-brain kept sending shivers down my spine
 to suggest just how uncomfortable it was with that idea.

 If memory serves, we actually had exactly that at some point.  But I
 think the reason it got taken out was that it interfered with the
 behavior of the random() function for everything else.  We'd have to
 give GEQO its own private random number generator.
All of GEQO's usage of random() seems to be concentrated in geqo_random.h - so
it would be a small change.
I will happily tackle that, if only to narrow down in which cases geqo fails
to plan - a behaviour we have seen at times with somewhat crazier queries.

The only question I have is whether random_r or similar is available on
enough platforms... Does anybody have an idea about this?
On most unixoid systems one could just wrap erand48() if random_r is not
available.
Windows?

Andres



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-11 Thread Tom Lane
Andres Freund and...@anarazel.de writes:
 The only question I have is whether random_r or similar is available on
 enough platforms... Does anybody have an idea about this?
 On most unixoid systems one could just wrap erand48() if random_r is not
 available.
 Windows?

random_r() isn't in the Single Unix Spec AFAICS, and I also don't find
it on HPUX 10.20, so I'd vote against depending on it.  What I do see
in SUS is initstate() and setstate() which could be used to control
the random() function:
http://www.opengroup.org/onlinepubs/007908799/xsh/initstate.html
It would also work to leave random() for use by the core code and have
GEQO depend on something from the drand48() family:
http://www.opengroup.org/onlinepubs/007908799/xsh/drand48.html
Probably drand48() is less random than random(), but for the limited
purposes of GEQO I doubt we care very much.

So far as I can find in a quick google search, neither of these families
of functions exist on Windows :-(.  So I think maybe the best approach
is the second one --- we could implement a port/ module that provides a
version of whichever drand48 function we need.
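For scale, the drand48 family really is tiny. A sketch of what such a port/
file could contain, using the recurrence the Single Unix Spec defines for
these functions (function names are illustrative, not a settled API):

#include <stdint.h>

/* X(n+1) = (0x5DEECE66D * X(n) + 0xB) mod 2^48, per the SUS drand48 spec */
static uint64_t
rand48_step(unsigned short xseed[3])
{
    uint64_t    x = ((uint64_t) xseed[2] << 32) |
                    ((uint64_t) xseed[1] << 16) |
                     (uint64_t) xseed[0];

    x = (x * UINT64_C(0x5DEECE66D) + UINT64_C(0xB)) &
        ((UINT64_C(1) << 48) - 1);
    xseed[0] = (unsigned short) (x & 0xFFFF);
    xseed[1] = (unsigned short) ((x >> 16) & 0xFFFF);
    xseed[2] = (unsigned short) ((x >> 32) & 0xFFFF);
    return x;
}

/* erand48() equivalent: uniform double in [0.0, 1.0) from caller-held state */
double
pg_erand48(unsigned short xseed[3])
{
    return (double) rand48_step(xseed) / (double) (UINT64_C(1) << 48);
}

Since all the state lives in the caller's xseed array, this is naturally
re-entrant - which is exactly the property GEQO would need.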

On reflection I think the best user API is probably a geqo_seed GUC in
the range 0 to 1, and have GEQO always reset its seed to that value at
start of a planning cycle.  This ensures plan stability, and if you need
to experiment with alternative plans you can change to different seed
values.  The "no reset" behavior doesn't seem to have much real-world
usefulness, because even if you chance to get a good plan, you have no
way to reproduce it...

regards, tom lane



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-11 Thread Andres Freund
Hi,

On Saturday 11 July 2009 18:23:59 Tom Lane wrote:
 Andres Freund and...@anarazel.de writes:
  The only question I have is whether random_r or similar is available on
  enough platforms... Does anybody have an idea about this?
  On most unixoid systems one could just wrap erand48() if random_r is not
  available.
  Windows?

 random_r() isn't in the Single Unix Spec AFAICS, and I also don't find
 it on HPUX 10.20, so I'd vote against depending on it.  What I do see
 in SUS is initstate() and setstate() which could be used to control
 the random() function:
 http://www.opengroup.org/onlinepubs/007908799/xsh/initstate.html
Using setstate() has a bit too many possible implications for my taste - 
especially as there is no way to portably get/save the current random state.

(Running a known query to reset random's seed, or some such.)

 It would also work to leave random() for use by the core code and have
 GEQO depend on something from the drand48() family:
 http://www.opengroup.org/onlinepubs/007908799/xsh/drand48.html
 Probably drand48() is less random than random(), but for the limited
 purposes of GEQO I doubt we care very much.
Yes.

 So far as I can find in a quick google search, neither of these families
 of functions exist on Windows :-(.  So I think maybe the best approach
 is the second one --- we could implement a port/ module that provides a
 version of whichever drand48 function we need.
Okay.
It would be possible to implement random_r the same way if we are going to
write something for Windows anyway - is it possible that it might be useful
somewhere else?

 On reflection I think the best user API is probably a geqo_seed GUC in
 the range 0 to 1, and have GEQO always reset its seed to that value at
 start of a planning cycle.  This ensures plan stability, and if you need
 to experiment with alternative plans you can change to different seed
 values.  The "no reset" behavior doesn't seem to have much real-world
 usefulness, because even if you chance to get a good plan, you have no
 way to reproduce it...
That was my thinking as well. 

Should geqo_seed be documented from the start or not?

Andres



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-11 Thread Tom Lane
Andres Freund and...@anarazel.de writes:
 On Saturday 11 July 2009 18:23:59 Tom Lane wrote:
 So far as I can find in a quick google search, neither of these families
 of functions exist on Windows :-(.  So I think maybe the best approach
 is the second one --- we could implement a port/ module that provides a
 version of whichever drand48 function we need.

 It would be possible to implement random_r the same way if we are going to
 write something for Windows anyway - is it possible that it might be useful
 somewhere else?

I think a decent version of random_r would be more work than it's worth.

(In practice we'd probably just lift the module from one of the BSDen
anyway, so maybe "work" is the wrong measure here, but code size is
still relevant.)

regards, tom lane



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-11 Thread Kenneth Marshall
On Sat, Jul 11, 2009 at 12:23:59PM -0400, Tom Lane wrote:
 Andres Freund and...@anarazel.de writes:
  The only question I have is whether random_r or similar is available on
  enough platforms... Does anybody have an idea about this?
  On most unixoid systems one could just wrap erand48() if random_r is not
  available.
  Windows?
 
 random_r() isn't in the Single Unix Spec AFAICS, and I also don't find
 it on HPUX 10.20, so I'd vote against depending on it.  What I do see
 in SUS is initstate() and setstate() which could be used to control
 the random() function:
 http://www.opengroup.org/onlinepubs/007908799/xsh/initstate.html
 It would also work to leave random() for use by the core code and have
 GEQO depend on something from the drand48() family:
 http://www.opengroup.org/onlinepubs/007908799/xsh/drand48.html
 Probably drand48() is less random than random(), but for the limited
 purposes of GEQO I doubt we care very much.
 
Ugh, tracking down problems caused by a poor random number generator
is difficult. Poor randomness often causes weird results from
algorithms that were designed around the assumption of a good random
number source.

 So far as I can find in a quick google search, neither of these families
 of functions exist on Windows :-(.  So I think maybe the best approach
 is the second one --- we could implement a port/ module that provides a
 version of whichever drand48 function we need.
 
I think that having a port/module for a random number generator is a
good idea. There are a number of good, fast random number generators
to choose from.

Cheers,
Ken

 On reflection I think the best user API is probably a geqo_seed GUC in
 the range 0 to 1, and have GEQO always reset its seed to that value at
 start of a planning cycle.  This ensures plan stability, and if you need
 to experiment with alternative plans you can change to different seed
 values.  The "no reset" behavior doesn't seem to have much real-world
 usefulness, because even if you chance to get a good plan, you have no
 way to reproduce it...
 
   regards, tom lane
 
 



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-11 Thread Andres Freund
On Saturday 11 July 2009 18:23:59 Tom Lane wrote:
 On reflection I think the best user API is probably a geqo_seed GUC in
 the range 0 to 1, and have GEQO always reset its seed to that value at
 start of a planning cycle.  This ensures plan stability, and if you need
 to experiment with alternative plans you can change to different seed
 values.  The "no reset" behavior doesn't seem to have much real-world
 usefulness, because even if you chance to get a good plan, you have no
 way to reproduce it...
I just realized doing it in a naive way (in geqo()) causes the state to be 
reset multiple times during one query - at every invocation of
make_rel_from_joinlist.
Does anybody see a problem with that?

Andres



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-11 Thread Tom Lane
Andres Freund and...@anarazel.de writes:
 I just realized doing it in a naive way (in geqo()) causes the state to be 
 reset multiple times during one query - at every invocation of
 make_rel_from_joinlist.

 Does anybody see a problem with that?

I think that's probably what we want.  Otherwise, you'd have a situation
where two identical subproblems might get planned differently.

This ties into something I was thinking about earlier: the planner is
potentially recursive (eg, it might call a user-defined function that
contains a plannable query), and it'd probably be a good idea if the
behavior of GEQO wasn't sensitive to that.  So the RNG's state needs to
be kept in PlannerInfo or some associated structure, not be global.

regards, tom lane



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-10 Thread Robert Haas
On Wed, Jul 8, 2009 at 4:57 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Well, the reason I'm not voting for #3 is that it looks like a lot of
 work to implement something that would basically be a planner hint,
 which I'm generally against; furthermore, it's a hint that there's been
 no demand for.  (We're not even certain that anyone is using the ability
 to *fully* specify the join order, much less wanting some undetermined
 compromise between manual and automatic control.)  And anyway I didn't
 hear anyone volunteering to do it.  So the realistic alternatives are
 #1, #2, or do nothing; and out of those I like #2.

I took a look at this and it seems that #3 can be implemented with
essentially no additional code (the handful of lines I added were
more than balanced out by some simplifications in ruleutils.c).  Of
course you still don't have to like it.  :-)

Patch attached.

...Robert
*** a/doc/src/sgml/config.sgml
--- b/doc/src/sgml/config.sgml
***************
*** 2251,2313 ****
  SELECT * FROM parent WHERE key = 2400;
      </listitem>
     </varlistentry>
  
-      <varlistentry id="guc-from-collapse-limit" xreflabel="from_collapse_limit">
-       <term><varname>from_collapse_limit</varname> (<type>integer</type>)</term>
-       <indexterm>
-        <primary><varname>from_collapse_limit</> configuration parameter</primary>
-       </indexterm>
-       <listitem>
-        <para>
-         The planner will merge sub-queries into upper queries if the
-         resulting <literal>FROM</literal> list would have no more than
-         this many items.  Smaller values reduce planning time but might
-         yield inferior query plans.  The default is eight.
-         For more information see <xref linkend="explicit-joins">.
-        </para>
- 
-        <para>
-         Setting this value to <xref linkend="guc-geqo-threshold"> or more
-         may trigger use of the GEQO planner, resulting in nondeterministic
-         plans.  See <xref linkend="runtime-config-query-geqo">.
-        </para>
-       </listitem>
-      </varlistentry>
- 
-      <varlistentry id="guc-join-collapse-limit" xreflabel="join_collapse_limit">
-       <term><varname>join_collapse_limit</varname> (<type>integer</type>)</term>
-       <indexterm>
-        <primary><varname>join_collapse_limit</> configuration parameter</primary>
-       </indexterm>
-       <listitem>
-        <para>
-         The planner will rewrite explicit <literal>JOIN</>
-         constructs (except <literal>FULL JOIN</>s) into lists of
-         <literal>FROM</> items whenever a list of no more than this many items
-         would result.  Smaller values reduce planning time but might
-         yield inferior query plans.
-        </para>
- 
-        <para>
-         By default, this variable is set the same as
-         <varname>from_collapse_limit</varname>, which is appropriate
-         for most uses. Setting it to 1 prevents any reordering of
-         explicit <literal>JOIN</>s. Thus, the explicit join order
-         specified in the query will be the actual order in which the
-         relations are joined. The query planner does not always choose
-         the optimal join order; advanced users can elect to
-         temporarily set this variable to 1, and then specify the join
-         order they desire explicitly.
-         For more information see <xref linkend="explicit-joins">.
-        </para>
- 
-        <para>
-         Setting this value to <xref linkend="guc-geqo-threshold"> or more
-         may trigger use of the GEQO planner, resulting in nondeterministic
-         plans.  See <xref linkend="runtime-config-query-geqo">.
-        </para>
-       </listitem>
-      </varlistentry>
- 
    </variablelist>
   </sect2>
  </sect1>
--- 2251,2256 ----
*** a/doc/src/sgml/perform.sgml
--- b/doc/src/sgml/perform.sgml
***************
*** 599,606 ****
  WHERE tablename = 'road';
  
   <para>
    It is possible
!   to control the query planner to some extent by using the explicit <literal>JOIN</>
!   syntax.  To see why this matters, we first need some background.
   </para>
  
   <para>
--- 599,607 ----
  
   <para>
    It is possible
!   to control the query planner to some extent by using <literal>JOIN</>
!   with the <literal>FORCE</> keyword.  To see why this matters, we first need
!   some background.
   </para>
  
   <para>
***************
*** 675,681 ****
  SELECT * FROM a LEFT JOIN b ON (a.bid = b.id) LEFT JOIN c ON (a.cid = c.id);
   <para>
    Even though most kinds of <literal>JOIN</> don't completely constrain
    the join order, it is possible to instruct the
!   <productname>PostgreSQL</productname> query planner to treat all
    <literal>JOIN</> clauses as constraining the join order anyway.
    For example, these three queries are logically equivalent:
  <programlisting>
--- 676,682 ----
   <para>
    Even though most kinds of <literal>JOIN</> don't completely constrain
    the join order, it is possible to instruct the
!   <productname>PostgreSQL</productname> query planner to treat certain
    <literal>JOIN</> clauses as constraining the join order anyway.
    For example, these three queries are logically 

Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-10 Thread Dimitri Fontaine

Hi,

On 10 July 2009, at 17:22, Robert Haas wrote:

I took a look at this and it seems that #3 can be implemented with
essentially no additional code (the handful of lines I added were
more than balanced out by some simplifications in ruleutils.c).  Of
course you still don't have to like it.  :-)


I see you're using the following syntax:
! SELECT * FROM a INNER FORCE JOIN (b INNER FORCE JOIN c ON (b.ref = c.id)) ON (a.id = b.id);


The only place I've seen that before is MySQL's straight_join feature:
  http://dev.mysql.com/doc/refman/5.0/en/join.html

My first thought at the time was "what a lame optimiser, if I'm to tell
it where not to reorder joins", but perhaps this was because the
option there, IIRC, could impact the results...


I guess I'm not going to like it, but nonetheless, if we're going to  
support the feature, what about calling it the same as MySQL, unless  
the standard has something to say?

--
dim


Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-10 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 I took a look at this and it seems that #3 can be implemented with
 essentially no additional code (the handful of lines I added were
 more than balanced out by some simplifications in ruleutils.c).  Of
 course you still don't have to like it.  :-)

You're right, I don't.  Even if I thought this were a good idea, which
I most definitely do not, the need to add a nonstandard fully-reserved
word is sufficient reason to reject it.  (The patch tries to pretend
it's not going to reserve the word, but that only works because you have
carefully changed only one of the five JOIN productions, leading to
bizarrely non-orthogonal syntax.)

regards, tom lane



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-10 Thread Robert Haas
On Jul 10, 2009, at 11:48 AM, Dimitri Fontaine dfonta...@hi-media.com wrote:



Hi,

On 10 July 2009, at 17:22, Robert Haas wrote:

I took a look at this and it seems that #3 can be implemented with
essentially no additional code (the handful of lines I added were
more than balanced out by some simplifications in ruleutils.c).  Of
course you still don't have to like it.  :-)


I see you're using the following syntax:
! SELECT * FROM a INNER FORCE JOIN (b INNER FORCE JOIN c ON (b.ref = c.id)) ON (a.id = b.id);


The only place I've seen that before is MySQL's straight_join feature:
 http://dev.mysql.com/doc/refman/5.0/en/join.html

My first thought at the time was "what a lame optimiser, if I'm to
tell it where not to reorder joins", but perhaps this was because
the option there, IIRC, could impact the results...


I guess I'm not going to like it, but nonetheless, if we're going to  
support the feature, what about calling it the same as MySQL, unless  
the standard has something to say?


Well, I think we would be well-advised to get past the threshold issue  
of whether or not the overall design sucks before quibbling over  
details like my particular choice of keyword.  That having been said,  
if the MySQL feature to which you linked actually does the same thing  
as what I implemented here, then it's amazingly poorly documented. We  
certainly don't guarantee anything about the order in which the input  
tables are read; I'd ask what that even means except I don't care.  
We're only making a promise about where that join will be implemented  
in the plan tree as compared with *other joins*.


It was reported upthread that Oracle uses ORDERED for this but I don't  
know whether either the syntax or the semantics match what I did  
here.  At any rate the particular choice of keyword seems rather  
insignificant; I picked it because it was already a keyword and seemed  
vaguely appropriate, but it could easily be changed.


...Robert


Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-10 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote: 
 
 At any rate the particular choice of keyword seems rather  
 insignificant; I picked it because it was already a keyword and
 seemed vaguely appropriate, but it could easily be changed.
 
Actually, if we were going to add fine-grained optimizer hints for
this (which I'm not at all convinced is a good idea), I'd be tempted
to go with what I saw a few years ago in SAP-DB (later rebranded as
MySQL MaxDB): treat parentheses around JOIN operations as optimizer
hints.  I would only consider this as remotely sane if there was an
enable_* GUC to disable the normal reordering of joins.  It introduces
no new keywords.  The queries are still standard compliant.  We would
just have a configuration option which treats an optional part of the
standard syntax as a hint.
 
In other words:
 
select * from (t1 join t2 on whatever) join t3 on whatever;
 
would limit optimizer choice from those available with:
 
select * from t1 join t2 on whatever join t3 on whatever;
 
-Kevin



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-10 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes:
 Actually, if we were going to add fine-grained optimizer hints for
 this (which I'm not at all convinced is a good idea), I'd be tempted
 to go with what I saw a few years ago in SAP-DB (later rebranded as
 MySQL MaxDB): treat parentheses around JOIN operations as optimizer
 hints.

That's a *truly* horrid idea, as sometimes you need them simply to
get the precedence correct.  Such awful design from SAP doesn't surprise
me, and MySQL copying a bad idea surprises me even less, but let's not
go there.

regards, tom lane



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-10 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote: 
 Kevin Grittner kevin.gritt...@wicourts.gov writes:
 treat parentheses around JOIN operations as optimizer hints.
 
 That's a *truly* horrid idea, as sometimes you need them simply to
 get the precedence correct.
 
You do, but it's been pretty rare in my experience, and we're
considering alternatives which give a lot less flexibility than this.
 
The *truly* awful thing about the SAP-DB implementation is that it
wasn't optional -- parentheses in this part of a query always limited
optimizer options; I sure wouldn't want to go there again.  I thought
we were talking about options for what to do when an enable_* setting
was off for diagnostic purposes.
 
-Kevin



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-10 Thread Robert Haas

On Jul 10, 2009, at 12:06 PM, Tom Lane t...@sss.pgh.pa.us wrote:


Robert Haas robertmh...@gmail.com writes:

I took a look at this and it seems that #3 can be implemented with
essentially no additional code (the handful of lines I added were
more than balanced out by some simplifications in ruleutils.c).  Of
course you still don't have to like it.  :-)


You're right, I don't.  Even if I thought this were a good idea, which
I most definitely do not, the need to add a nonstandard fully-reserved
word is sufficient reason to reject it.  (The patch tries to pretend
it's not going to reserve the word, but that only works because you have
carefully changed only one of the five JOIN productions, leading to
bizarrely non-orthogonal syntax.)


Well, it's pretty obvious that only one of those productions is  
actually a problem, and that is the one which produces an undecorated  
JOIN. The others could all be changed easily enough, but they add no  
expressive power, so I didn't think it very worthwhile to add MORE
non-standard syntax.  In any event, the current non-orthogonality is
exponentially more bizarre: you can constrain the join order by  
setting join_collapse_limit to 1, but then you must write the joins  
you want constrained using JOIN and the others as FROM items, which of  
course doesn't work at all for LEFT or RIGHT joins and will have  
random and unpredictable effects on subqueries pulled in via VIEWs.   
Needing to write INNER to be able to use FORCE pales by comparison.


That having been said, I'm not excited about pushing water up a hill.   
The important thing here is to remove the collapse limits; providing a  
tool to control the join order that won't make you want to beat your
head against a brick wall is just something that can be trivially done with no
extra code, not my primary goal.


...Robert



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-10 Thread Jaime Casanova
On Fri, Jul 10, 2009 at 10:22 AM, Robert Haas robertmh...@gmail.com wrote:
 I took a look at this and it seems that #3 can be implemented with
 essentially no additional code (the handful of lines I added where
 more than balanced out by some simplifications in ruleutils.c).  Of
 course you still don't have to like it.  :-)

 Patch attached.


! SELECT * FROM a INNER FORCE JOIN (b INNER FORCE JOIN c ON (b.ref = c.id)) ON (a.id = b.id);

What's next? FORCE INDEX?
Once we implement one hint like this, the complaints about the others
will lose force.


-- 
Sincerely,
Jaime Casanova
PostgreSQL support and training
Systems consulting and development
Guayaquil - Ecuador
Cel. +59387171157



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-10 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes:
 You do, but it's been pretty rare in my experience, and we're
 considering alternatives which give a lot less flexibility than this.

I dunno about "considering".  We've already wasted vastly more time on
this than it's worth.  AFAIR there has never been one single user
request for the ability to partially constrain join order.  I think we
should do an enable_join_ordering boolean and quit wasting brainpower on
the issue.

regards, tom lane



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-10 Thread Ron Mayer
Tom Lane wrote:
 Kevin Grittner kevin.gritt...@wicourts.gov writes:
 You do, but it's been pretty rare in my experience, and we're
 considering alternatives which give a lot less flexibility than this.
 
 I dunno about "considering".  We've already wasted vastly more time on
 this than it's worth.  AFAIR there has never been one single user
 request for the ability to partially constrain join order.  I think we
 should do an enable_join_ordering boolean and quit wasting brainpower on
 the issue.

I think I've found it useful in the past[1], but I also think we
already have a way to give postgres such hints using subselects
and offset 0.

Instead of SAP-DB's
 select * from (t1 join t2 on whatever) join t3 on whatever;
ISTM we can already do
 select * from (select * from t1 join t2 on whatever offset 0) as a
 join t3 on whatever;
which seems like a reasonable way of hinting which parentheses
can be reordered and which can't.


Would these new proposals (GUCs or syntax hacks) give anything that
I can't already do?



[1] http://archives.postgresql.org/pgsql-performance/2007-12/msg00088.php



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-10 Thread Robert Haas
On Fri, Jul 10, 2009 at 2:44 PM, Jaime
Casanova jcasa...@systemguards.com.ec wrote:
 On Fri, Jul 10, 2009 at 10:22 AM, Robert Haas robertmh...@gmail.com wrote:
 I took a look at this and it seems that #3 can be implemented with
 essentially no additional code (the handful of lines I added were
 more than balanced out by some simplifications in ruleutils.c).  Of
 course you still don't have to like it.  :-)

 Patch attached.


 ! SELECT * FROM a INNER FORCE JOIN (b INNER FORCE JOIN c ON (b.ref = c.id)) ON (a.id = b.id);

 What's next? FORCE INDEX?
 Once we implement one hint like this, the complaints about the others
 will lose force.

That would be pretty useless, because while it might work for trivial
cases, in a complex plan, it's almost certainly not going to do what
you want unless you specify every detail of the entire plan along with
it, which kind of defeats the purpose of using a database with a
sophisticated planner.  Actually, it seems to me that if we were to
add hints, the place to start would be with selectivity, since in my
experience bad selectivity estimates (often by 3, 4, 5, or 6 orders of
magnitude) are often the reason for a bad plan, and if you can fix
that, the rest takes care of itself - but fixing it is often very
difficult.  Of course, improving the selectivity estimator would be
even better, because time spent applying hints to queries is time that
could be better spent doing something else, and in fact one of the
reasons why I started using PostgreSQL is because complex queries Just
Work.

Constraining the join order does seem a lot less useful, because (for
example) it won't compel the planner to use a hash join instead of a
nested-loop join, or to make a sensible decision about which relations should
be on the inner side of the join.  But I still think that it's worth
considering, because:

1. We basically already support the functionality - it's just broken.
Today, you can control the join order by setting join_collapse_limit =
1; then, the joins you want to order you specify using JOIN syntax;
the ones you don't want to order, you write as FROM items.  But all of
your outer joins will necessarily have the order forced, because
there's no way to write those as FROM items.  And if you want to use a
view that was written without this convention in mind, you're hosed.
So I think we either ought to remove it or fix it.

2. It's actually LESS code to support it completely than it is to
leave it the way it is now while removing from_collapse_limit and
replacing join_collapse_limit with enable_join_ordering.  Compare the
patch I sent earlier with the one that I'm about to send.

3. The point of this particular type of hint, at least as I see it, is
not so much to force the planner into the correct query plan as it is
to constrain the planning time.  There are very few tools available
for this today: join_collapse_limit and from_collapse_limit do it by
unexpectedly producing terrible plans, and the only other option is
GEQO.  In a case like the one posted upthread where you have ten or so
joins that can be reordered almost arbitrarily, you could shove a
single FORCE right into the middle and split it into two problems that
can be planned consecutively.  This isn't that different from what
join_collapse_limit does today, but it doesn't force a uniform
threshold on you across the board; you can apply where, when, and to
the extent necessary to control planning time for a particular query,
and you have to explicitly ask for it so you can't complain you were
ambushed if it doesn't work out.

But I can see I'm outvoted, so I give up!

...Robert



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-10 Thread Robert Haas
On Fri, Jul 10, 2009 at 2:48 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Kevin Grittner kevin.gritt...@wicourts.gov writes:
 You do, but it's been pretty rare in my experience, and we're
 considering alternatives which give a lot less flexibility than this.

 I dunno about "considering".  We've already wasted vastly more time on
 this than it's worth.  AFAIR there has never been one single user
 request for the ability to partially constrain join order.  I think we
 should do an enable_join_ordering boolean and quit wasting brainpower on
 the issue.

Patch attached.

...Robert
*** a/doc/src/sgml/config.sgml
--- b/doc/src/sgml/config.sgml
***************
*** 1798,1803 ****
--- 1798,1823 ----
  archive_command = 'copy %p C:\\server\\archivedir\\%f'  # Windows
      </listitem>
     </varlistentry>
  
+      <varlistentry id="guc-enable-join-ordering" xreflabel="enable_join_ordering">
+       <term><varname>enable_join_ordering</varname> (<type>boolean</type>)</term>
+       <indexterm>
+        <primary><varname>enable_join_ordering</> configuration parameter</primary>
+       </indexterm>
+       <listitem>
+        <para>
+         Allows the planner to reorder joins, which is generally appropriate
+         for most uses. Setting it to false prevents any reordering of
+         explicit <literal>JOIN</>s. Thus, the explicit join order
+         specified in the query will be the actual order in which the
+         relations are joined. The query planner does not always choose
+         the optimal join order; advanced users can elect to
+         temporarily set this variable to false, and then specify the join
+         order they desire explicitly.
+         For more information see <xref linkend="explicit-joins">.
+        </para>
+       </listitem>
+      </varlistentry>
+ 
      <varlistentry id="guc-enable-mergejoin" xreflabel="enable_mergejoin">
       <term><varname>enable_mergejoin</varname> (<type>boolean</type>)</term>
       <indexterm>
***************
*** 2251,2313 ****
  SELECT * FROM parent WHERE key = 2400;
      </listitem>
     </varlistentry>
  
-      <varlistentry id="guc-from-collapse-limit" xreflabel="from_collapse_limit">
-       <term><varname>from_collapse_limit</varname> (<type>integer</type>)</term>
-       <indexterm>
-        <primary><varname>from_collapse_limit</> configuration parameter</primary>
-       </indexterm>
-       <listitem>
-        <para>
-         The planner will merge sub-queries into upper queries if the
-         resulting <literal>FROM</literal> list would have no more than
-         this many items.  Smaller values reduce planning time but might
-         yield inferior query plans.  The default is eight.
-         For more information see <xref linkend="explicit-joins">.
-        </para>
- 
-        <para>
-         Setting this value to <xref linkend="guc-geqo-threshold"> or more
-         may trigger use of the GEQO planner, resulting in nondeterministic
-         plans.  See <xref linkend="runtime-config-query-geqo">.
-        </para>
-       </listitem>
-      </varlistentry>
- 
-      <varlistentry id="guc-join-collapse-limit" xreflabel="join_collapse_limit">
-       <term><varname>join_collapse_limit</varname> (<type>integer</type>)</term>
-       <indexterm>
-        <primary><varname>join_collapse_limit</> configuration parameter</primary>
-       </indexterm>
-       <listitem>
-        <para>
-         The planner will rewrite explicit <literal>JOIN</>
-         constructs (except <literal>FULL JOIN</>s) into lists of
-         <literal>FROM</> items whenever a list of no more than this many items
-         would result.  Smaller values reduce planning time but might
-         yield inferior query plans.
-        </para>
- 
-        <para>
-         By default, this variable is set the same as
-         <varname>from_collapse_limit</varname>, which is appropriate
-         for most uses. Setting it to 1 prevents any reordering of
-         explicit <literal>JOIN</>s. Thus, the explicit join order
-         specified in the query will be the actual order in which the
-         relations are joined. The query planner does not always choose
-         the optimal join order; advanced users can elect to
-         temporarily set this variable to 1, and then specify the join
-         order they desire explicitly.
-         For more information see <xref linkend="explicit-joins">.
-        </para>
- 
-        <para>
-         Setting this value to <xref linkend="guc-geqo-threshold"> or more
-         may trigger use of the GEQO planner, resulting in nondeterministic
-         plans.  See <xref linkend="runtime-config-query-geqo">.
-        </para>
-       </listitem>
-      </varlistentry>
- 
    </variablelist>
   </sect2>
  </sect1>
--- 2271,2276 ----
*** a/doc/src/sgml/perform.sgml
--- b/doc/src/sgml/perform.sgml
***************
*** 692,699 ****
  SELECT * FROM a JOIN (b JOIN c ON (b.ref = c.id)) ON (a.id = b.id);
   <para>
    To force the planner to follow the join order laid out by explicit
    <literal>JOIN</>s,
!   set the <xref linkend="guc-join-collapse-limit"> run-time parameter to 1.
!   (Other possible values are discussed below.)
   </para>
  

Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-09 Thread Robert Haas

On Jul 8, 2009, at 8:26 PM, Tom Lane t...@sss.pgh.pa.us wrote:


Robert Haas robertmh...@gmail.com writes:

That was my first reaction too, but now I'm wondering whether we
shouldn't just do #1.  #2 is a planner hint, too, just not a very good
one.  If, as you suggest, it isn't actually useful, then why keep it
at all? (On the other hand, if someone thinks they need it, it would
be interesting to know the use case, and think about the best way to
address it.)


Well, I can cite one reasonably plausible use case: when you have an
umpteen-way join you need to execute a lot, and you don't want to pay
for an exhaustive search, but GEQO doesn't reliably find a good plan.
What you can do is let the system do an exhaustive search once to find
the best plan, then you rearrange the query to specify that join order
via JOINs, and turn off join collapsing.  Presto, good plan every time
with very little planning time expended.

Now, your answer will probably be that we should provide some better
mechanism for re-using a previously identified plan structure.  No
doubt that would be ideal, but the amount of effort required to get
there is nontrivial, and I'm not at all convinced it would be repaid
in usefulness.  Whereas what I describe above is something that costs
us nearly nothing to provide.  The usefulness might be marginal too,
but on the basis of cost/benefit ratio it's a clear win.


I don't think that would be my answer because plan caching sounds  
hard.  It would be nice to have, though. :-)


But it's clearly a planner hint, however you slice it.

It occurs to me that one way to make GEQO less scary would be to take
out the nondeterminism by resetting its random number generator for
out the nondeterminism by resetting its random number generator for
each query.  You might get a good plan or an awful one, but at least
it'd be the same one each time.  DBAs like predictability.



Hmm, that doesn't sound appealing to me, but I'm only a DBA at need.


I was imagining a GUC that would make the reset optional, in case anyone
really does want to have unstable plans.  I don't have much doubt about
what typical users will prefer though.


OK.

...Robert



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-09 Thread Dimitri Fontaine
Hi,

Robert Haas robertmh...@gmail.com writes:
 On Jul 8, 2009, at 8:26 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Now, your answer will probably be that we should provide some better
 mechanism for re-using a previously identified plan structure.  No
 doubt that would be ideal, but the amount of effort required to get
 there is nontrivial, and I'm not at all convinced it would be repaid
 in usefulness.  Whereas what I describe above is something that costs
 us nearly nothing to provide.  The usefulness might be marginal too,
 but on the basis of cost/benefit ratio it's a clear win.

 I don't think that would be my answer because plan caching sounds hard.  It
 would be nice to have, though. :-)

In fact I think marshalling plans (and unmarshalling them) would rather
be the easy part of it; what looks very hard is the problem already
mentioned here in another context (gathering statistics on views):
  http://archives.postgresql.org/pgsql-performance/2009-06/msg00118.php

How to match a random user query with the stored plans with as little
analysis as possible?

 But it's clearly a planner hint, however you slice it.

Agreed.

Regards,
-- 
dim



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-09 Thread Peter Hunsberger
On Wed, Jul 8, 2009 at 8:26 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Robert Haas robertmh...@gmail.com writes:
 That was my first reaction too, but now I'm wondering whether we
 shouldn't just do #1.  #2 is a planner hint, too, just not a very good
 one.  If, as you suggest, it isn't actually useful, then why keep it
 at all? (On the other hand, if someone thinks they need it, it would
 be interesting to know the use case, and think about the best way to
 address it.)

 Well, I can cite one reasonably plausible use case: when you have an
 umpteen-way join you need to execute a lot, and you don't want to pay
 for an exhaustive search, but GEQO doesn't reliably find a good plan.
 What you can do is let the system do an exhaustive search once to find
 the best plan, then you rearrange the query to specify that join order
 via JOINs, and turn off join collapsing.  Presto, good plan every time
 with very little planning time expended.

In the Oracle world they use the hint ORDERED for this purpose.  As
the Oracle optimizer has gotten smarter over the years I've seen less
need to use it, but overall, compared to other Oracle hints, it does
not seem to be extremely sensitive to data / index / stats changes.
When you think about why such ordering might work, that makes sense to
me: small tables can be used early to prune large tables later on.
Typically, these smaller tables are static config-info-type data.
E.g. pick a species, then choose which of the 10 million pathology
samples you have match that species.


 Now, your answer will probably be that we should provide some better
 mechanism for re-using a previously identified plan structure.  No
 doubt that would be ideal, but the amount of effort required to get
 there is nontrivial, and I'm not at all convinced it would be repaid
 in usefulness.  Whereas what I describe above is something that costs
 us nearly nothing to provide.  The usefulness might be marginal too,
 but on the basis of cost/benefit ratio it's a clear win.


Again Oracle has a mechanism for doing this.  Don't know the details,
but our DBA would if anyone cares...

-- 
Peter Hunsberger



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-09 Thread Noah Misch
On Wed, Jul 08, 2009 at 05:23:16PM -0400, Tom Lane wrote:
 Noah Misch n...@leadboat.com writes:
  The exponential factor seems smaller for real queries.  I have a query of
  sixteen joins that takes 71s to plan deterministically; it looks like this:
 
  SELECT 1 FROM fact JOIN dim0 ... JOIN dim6
  JOIN t t0 ON fact.key = t.key AND t.x = MCV0
  LEFT JOIN t t1 ON fact.key = t.key AND t.x = MCV1
  JOIN t t2 ON fact.key = t.key AND t.x = MCV2
  LEFT JOIN t t3 ON fact.key = t.key AND t.x = NON-MCV0
  LEFT JOIN t t4 ON fact.key = t.key AND t.x = NON-MCV1
  LEFT JOIN t t5 ON fact.key = t.key AND t.x = NON-MCV2
  LEFT JOIN t t6 ON fact.key = t.key AND t.x = NON-MCV3
  LEFT JOIN t t7 ON fact.key = t.key AND t.x = NON-MCV4
 
 I'm confused here --- I think you must have over-anonymized your query.
 Surely the ON conditions for the left joins should be referencing t3,
 t4, etc?

Yes.  Aside from that error, I picked the wrong features to emphasize and gloss
over.  Only two relations (amidst `dim0 ... dim6') resemble dimensions, having
an implied relative position on the plan tree.  The other fourteen relations
share a join key.  That explains the distinctive plan cost; this query resembles
the utterly unconstrained artificial query to a larger degree than most.

  For the real query, removing one join drops plan time to 26s, and
  removing two drops the time to 11s.  I don't have a good theory for
  the multiplier changing from 4 for the trivial demonstration to ~2.5
  for this real query.
 
 The rule of thumb that says that an n-way join requires 2^n work is only
 true if we consider every single combination of possible joins, which
 normally we don't.  The planner prefers join paths that follow join
 clauses, and will only consider clauseless joins when it has no other
 choice.  I believe that real queries tend to be pretty non-flat in this
 space and so the number of join paths to consider is a lot less than 2^n.
 Your synthesized query, on the other hand, allows any relation to be
 joined to any other --- it might not look that way, but after creating
 derived equalities there will be a potential join clause linking every
 relation to every other one.  So I think you were testing the worst case,
 and I'm not surprised that more-typical queries would show a slower
 growth curve.

Describing in those terms illuminates much.  While the concepts do suggest 2^N
worst-case planning cost, my artificial test case showed a rigid 4^N pattern;
what could explain that?

Thanks,
nm




Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-09 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:
 
 If, as you suggest, it isn't actually useful, then why keep it at
 all?
 
I was imagining that someone who has a query which is taking a long
time to run, and asserts that it would run faster if only the
optimizer would arrange the joins a certain way, could test that
theory, as part of the diagnostic process.  (In other words, for
similar reasons that the other enable_* GUC settings exist.)
 
I would certainly not want to use it in production, and wouldn't
expect that would normally be a very good idea.
 
-Kevin



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-09 Thread Kevin Grittner
Resending to list due to failure:
451 4.3.5 Server configuration problem
 
Joshua Tolley eggyk...@gmail.com wrote:
 
 Entire stored plans, or predetermined seeds for GEQO's random number
 generator all boil down to saying, "I want you to use this plan
 henceforth and forever."
 
A predetermined seed for geqo's random number generator only
guarantees that you get the same plan for the same query as long as
the statistics remain the same.  After the next ANALYZE, you may still
get a different plan.
 
 Do we know that GEQO plans are, in reality, less stable than the
 usual planner?
 
Yeah, I had a complaint of a slow query.  When I ran EXPLAIN ANALYZE
on it I ran it several times (as I usually do, to establish what
happens with and without caching).  I got different plans.  So I ran
the query a bunch of times and got a decent plan about 90% of the
time, and an awful one about 10% of the time.  It was clearly a geqo
effect, so I didn't post on it.  (Why post?  The product was clearly
operating as intended.)  There was an easy solution; I disabled geqo.
I'm sure it's useful to some; we haven't missed it.
 
-Kevin



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-09 Thread Andres Freund
On Thursday 09 July 2009 17:37:41 Kevin Grittner wrote:
 Resending to list due to failure:
 451 4.3.5 Server configuration problem

 Joshua Tolley eggyk...@gmail.com wrote:
  Entire stored plans, or predetermined seeds for GEQO's random number
  generator all boil down to saying, "I want you to use this plan
  henceforth and forever."

 A predetermined seed for geqo's random number generator only
 guarantees that you get the same plan for the same query as long as
 the statistics remain the same.  After the next ANALYZE, you may still
 get a different plan.

  Do we know that GEQO plans are, in reality, less stable than the
  usual planner?
Yes, definitely. E.g. in the referenced schema (somewhere else in the thread) I
even have queries which fail to plan only sometimes.

Andres



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-09 Thread Noah Misch
On Wed, Jul 08, 2009 at 10:28:02AM -0500, Robert Haas wrote:
 On Jul 8, 2009, at 8:43 AM, Noah Misch n...@leadboat.com wrote:
 The exponential factor seems smaller for real queries.  I have a query of
 sixteen joins that takes 71s to plan deterministically; it looks like this:

 SELECT 1 FROM fact JOIN dim0 ... JOIN dim6
 JOIN t t0 ON fact.key = t.key AND t.x = MCV0
 LEFT JOIN t t1 ON fact.key = t.key AND t.x = MCV1
 JOIN t t2 ON fact.key = t.key AND t.x = MCV2
 LEFT JOIN t t3 ON fact.key = t.key AND t.x = NON-MCV0
 LEFT JOIN t t4 ON fact.key = t.key AND t.x = NON-MCV1
 LEFT JOIN t t5 ON fact.key = t.key AND t.x = NON-MCV2
 LEFT JOIN t t6 ON fact.key = t.key AND t.x = NON-MCV3
 LEFT JOIN t t7 ON fact.key = t.key AND t.x = NON-MCV4

 Very interesting!  I am guessing that the problem here is that all the  
 inner joins commute, as do all the left joins, so there are a lot of  
 possibilities to explore.  I also suspect that the actual join order  
 doesn't matter much, so it's a good candidate for GEQO. Even if you had 
 some restriction clauses against your dimension/t tables, that would 
 probably just mean that you want to do those joins first, and the rest 
 afterwards, which still seems like it ought to be OK for GEQO.

 But, in many queries this size, the join order is more constrained.   
 Some of the later joins depend on the tables added by the earlier ones, 
 rather than all referring back to the base table.  Is there some way we 
 could estimate the number of join offerings we'll have to consider 
 relatively cheaply?  That might be a better metric than # of tables.

Observing the pathological case taking plan time proportional to 4^N, apply
geqo_threshold as "use GEQO when comparing more than geqo_threshold * 4^N join
order possibilities"?  I'm not sure whether that figure is available (early
enough) to factor into the decision.  Such a change would probably imply a lower
default geqo_threshold, around 9 to 11.  The number of typical queries subject
to GEQO would nonetheless decrease.

nm




Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-08 Thread Jan Urbański
Tom Lane wrote:
 Kevin Grittner kevin.gritt...@wicourts.gov writes:
 I guess the question is whether there is anyone who has had a contrary
 experience.  (There must have been some benchmarks to justify adding
 geqo at some point?)
 
 The CVS history shows that geqo was integrated on 1997-02-19, which
 I think means that it must have been developed against Postgres95

 So while I don't doubt that geqo was absolutely essential when it was
 written, it's fair to question whether it still provides a real win.
 And we could definitely stand to take another look at the default
 thresholds.

Well there is a TODO item about implementing an alternative to GEQO
(which is being treated more and more as the underdog of the project):
http://archives.postgresql.org/message-id/15658.1241278636%40sss.pgh.pa.us

Would people be interested in someone working on that item?

Cheers,
Jan



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-08 Thread Noah Misch
On Tue, Jul 07, 2009 at 09:31:14AM -0500, Kevin Grittner wrote:
 I don't remember any clear resolution to the wild variations in plan
 time mentioned here:
  
 http://archives.postgresql.org/pgsql-hackers/2009-06/msg00743.php
  
 I think it would be prudent to try to figure out why small changes in
 the query caused the large changes in the plan times Andres was
 seeing.  Has anyone else ever seen such behavior?  Can we get
 examples?  (It should be enough to get the statistics and the schema,
 since this is about planning time, not run time.)

With joins between statistically indistinguishable columns, I see planning times
change by a factor of ~4 for each join added or removed (postgres 8.3).  Varying
join_collapse_limit in the neighborhood of the actual number of joins has a
similar effect.  See attachment with annotated timings.  The example uses a
single table joined to itself, but using distinct tables with identical contents
yields the same figures.

The exponential factor seems smaller for real queries.  I have a query of
sixteen joins that takes 71s to plan deterministically; it looks like this:

SELECT 1 FROM fact JOIN dim0 ... JOIN dim6
JOIN t t0 ON fact.key = t.key AND t.x = MCV0
LEFT JOIN t t1 ON fact.key = t.key AND t.x = MCV1
JOIN t t2 ON fact.key = t.key AND t.x = MCV2
LEFT JOIN t t3 ON fact.key = t.key AND t.x = NON-MCV0
LEFT JOIN t t4 ON fact.key = t.key AND t.x = NON-MCV1
LEFT JOIN t t5 ON fact.key = t.key AND t.x = NON-MCV2
LEFT JOIN t t6 ON fact.key = t.key AND t.x = NON-MCV3
LEFT JOIN t t7 ON fact.key = t.key AND t.x = NON-MCV4

For the real query, removing one join drops plan time to 26s, and removing two
drops the time to 11s.  I don't have a good theory for the multiplier changing
from 4 for the trivial demonstration to ~2.5 for this real query.  Re-enabling
geqo drops plan time to .5s.  These tests used default_statistics_target = 1000,
but dropping that to 100 does not change anything dramatically.
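
For anyone wanting to reproduce numbers like these, a minimal sketch (assuming
psql and the temp table t from the attached script; plain EXPLAIN plans the
query without executing it, so the reported time is essentially all planning):

\timing on
SET geqo = off;
SET join_collapse_limit = 100;
EXPLAIN SELECT 1 FROM t t00 NATURAL JOIN t t01 NATURAL JOIN t t02
NATURAL JOIN t t03 NATURAL JOIN t t04 NATURAL JOIN t t05;
-- psql prints "Time: ..." after the plan; compare it across join counts.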

 I guess the question is whether there is anyone who has had a contrary
 experience.  (There must have been some benchmarks to justify adding
 geqo at some point?)

I have queries with a few more joins (19-21), and I cancelled attempts to plan
them deterministically after 600+ seconds and 10+ GiB of memory usage.  Even
with geqo_effort = 10, they plan within 5-15s with good results.

All that being said, I've never encountered a situation where a value other than
1 or inf for *_collapse_limit appeared optimal.
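
Spelled out, those two settings look like this (a sketch; "inf" in practice
just means any value at least as large as the relation count of the biggest
query):

-- Take the query text literally: JOIN syntax fixes the join order and
-- FROM-list subqueries are planned as separate subproblems.
SET join_collapse_limit = 1;
SET from_collapse_limit = 1;

-- Or collapse everything and let the join search (or GEQO) decide:
SET join_collapse_limit = 1000;
SET from_collapse_limit = 1000;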

nm
SET geqo = off;
SET join_collapse_limit = 100;
CREATE TEMP TABLE t AS SELECT * FROM generate_series(1, 1000) f(n); ANALYZE t;

--- Vary join count
-- 242.4s
EXPLAIN SELECT 1 FROM t t00 NATURAL JOIN t t01 NATURAL JOIN t t02 NATURAL JOIN t
t03 NATURAL JOIN t t04 NATURAL JOIN t t05 NATURAL JOIN t t06 NATURAL JOIN t t07
NATURAL JOIN t t08 NATURAL JOIN t t09 NATURAL JOIN t t10 NATURAL JOIN t t11
NATURAL JOIN t t12;
-- 31.2s
EXPLAIN SELECT 1 FROM t t00 NATURAL JOIN t t01 NATURAL JOIN t t02 NATURAL JOIN t
t03 NATURAL JOIN t t04 NATURAL JOIN t t05 NATURAL JOIN t t06 NATURAL JOIN t t07
NATURAL JOIN t t08 NATURAL JOIN t t09 NATURAL JOIN t t10 NATURAL JOIN t t11;
-- 8.1s
EXPLAIN SELECT 1 FROM t t00 NATURAL JOIN t t01 NATURAL JOIN t t02 NATURAL JOIN t
t03 NATURAL JOIN t t04 NATURAL JOIN t t05 NATURAL JOIN t t06 NATURAL JOIN t t07
NATURAL JOIN t t08 NATURAL JOIN t t09 NATURAL JOIN t t10;
-- 2.0s
EXPLAIN SELECT 1 FROM t t00 NATURAL JOIN t t01 NATURAL JOIN t t02 NATURAL JOIN t
t03 NATURAL JOIN t t04 NATURAL JOIN t t05 NATURAL JOIN t t06 NATURAL JOIN t t07
NATURAL JOIN t t08 NATURAL JOIN t t09;
-- 0.5s
EXPLAIN SELECT 1 FROM t t00 NATURAL JOIN t t01 NATURAL JOIN t t02 NATURAL JOIN t
t03 NATURAL JOIN t t04 NATURAL JOIN t t05 NATURAL JOIN t t06 NATURAL JOIN t t07
NATURAL JOIN t t08;

--- Vary join_collapse_limit
-- 8.1s
SET join_collapse_limit = 100;
EXPLAIN SELECT 1 FROM t t00 NATURAL JOIN t t01 NATURAL JOIN t t02 NATURAL JOIN t
t03 NATURAL JOIN t t04 NATURAL JOIN t t05 NATURAL JOIN t t06 NATURAL JOIN t t07
NATURAL JOIN t t08 NATURAL JOIN t t09 NATURAL JOIN t t10;
-- 8.0s
SET join_collapse_limit = 11;
EXPLAIN SELECT 1 FROM t t00 NATURAL JOIN t t01 NATURAL JOIN t t02 NATURAL JOIN t
t03 NATURAL JOIN t t04 NATURAL JOIN t t05 NATURAL JOIN t t06 NATURAL JOIN t t07
NATURAL JOIN t t08 NATURAL JOIN t t09 NATURAL JOIN t t10;
-- 2.2s
SET join_collapse_limit = 10;
EXPLAIN SELECT 1 FROM t t00 NATURAL JOIN t t01 NATURAL JOIN t t02 NATURAL JOIN t
t03 NATURAL JOIN t t04 NATURAL JOIN t t05 NATURAL JOIN t t06 NATURAL JOIN t t07
NATURAL JOIN t t08 NATURAL JOIN t t09 NATURAL JOIN t t10;
-- 0.5s
SET join_collapse_limit = 9;
EXPLAIN SELECT 1 FROM t t00 NATURAL JOIN t t01 NATURAL JOIN t t02 NATURAL JOIN t
t03 NATURAL JOIN t t04 NATURAL JOIN t t05 NATURAL JOIN t t06 NATURAL JOIN t t07
NATURAL JOIN t t08 NATURAL JOIN t t09 NATURAL JOIN t t10;
-- 0.1s
SET join_collapse_limit = 8;
EXPLAIN SELECT 1 FROM t t00 NATURAL JOIN t t01 NATURAL JOIN t t02 NATURAL JOIN t
t03 NATURAL JOIN t t04 NATURAL JOIN t t05 NATURAL JOIN t t06 NATURAL JOIN t t07
NATURAL JOIN t t08 NATURAL JOIN t t09 NATURAL JOIN t t10;

Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-08 Thread Kenneth Marshall
Hi,

When I was first familiarizing myself with PostgreSQL, I took a
walk through its documentation on GEQO and similar processes in
the literature. One big advantage of GEQO is that you can trade
off planning time for plan optimization. I do agree that it should
be updated, but there were definite cases in the literature where
an exhaustive search could take orders of magnitude more planning
time than the difference in execution time between the resulting
plans.

My two cents,
Ken


Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-08 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, Jul 7, 2009 at 6:33 PM, Tom Lanet...@sss.pgh.pa.us wrote:
 It's pretty much all-or-nothing now: the GUC does not give you any sort
 of useful control over *which* joins are reorderable.

 Yes.  So the way I see it, the options are:

 1. We can remove join_collapse_limit completely and provide no
 substitute.  In this case, the ability to explicitly specify the join
 order will be gone.

 2. We can remove join_collapse_limit but provide a different, Boolean
 GUC instead, like enable_join_reordering.  In this case, we're not
 actually reducing the number of GUCs, just the size of the foot-gun.

 3. We can remove join_collapse_limit and provide an alternative way to
 explicitly specify the join order that is more flexible.  This both
 reduces the number of GUCs and arguably provides some useful
 functionality that we don't have now.

 It sounds like your vote is for #2, which, as I say, seems like a
 feature with one arm tied behind its back, but hey, what do I know?

Well, the reason I'm not voting for #3 is that it looks like a lot of
work to implement something that would basically be a planner hint,
which I'm generally against; furthermore, it's a hint that there's been
no demand for.  (We're not even certain that anyone is using the ability
to *fully* specify the join order, much less wanting some undetermined
compromise between manual and automatic control.)  And anyway I didn't
hear anyone volunteering to do it.  So the realistic alternatives are
#1, #2, or do nothing; and out of those I like #2.

 Accepting that as the consensus in the absence of contrary votes, we
 still need to decide what to do about from_collapse_threshold and
 geqo_threshold.  I'm pretty sure that we shouldn't eliminate GEQO or
 geqo_threshold, because the basic algorithm is clearly exponential
 time and eventually you have to start worrying about that, but we
 could raise the value.  What to do about from_collapse_threshold is
 less clear to me.

I do not think there is a good argument for eliminating geqo_threshold.
There might well be an argument for cranking up its default value;
but that would take some hard data, which seems lacking at the moment.

I'm on the fence about from_collapse_threshold.  The argument for having
it seems to be that there might be cases where not folding a subquery
is preferable to folding it and then taking your chances with GEQO.
But I'm not really convinced there are any.

It occurs to me that one way to make GEQO less scary would be to take
out the nondeterminism by resetting its random number generator for
each query.  You might get a good plan or an awful one, but at least
it'd be the same one each time.  DBAs like predictability.

regards, tom lane



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-08 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote: 
 
 It occurs to me that one way to make GEQO less scary would be to
 take out the nondeterminism by resetting its random number generator
 for each query.  You might get a good plan or an awful one, but at
 least it'd be the same one each time.  DBAs like predictability.
 
+1  The biggest reason that I've tended to avoid geqo is that I would
never know when it might do something really stupid with a query one
time out of some large number, leading to mysterious complaints which
could eat a lot of time.
 
For a moment it seemed logical to suggest a session GUC for the seed,
so if you got a bad plan you could keep rolling the dice until you got
one you liked; but my right-brain kept sending shivers down my spine
to suggest just how uncomfortable it was with that idea
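 
If such a knob existed (the GUC and view names below are hypothetical, not
actual settings), the dice-rolling would look something like:

SET geqo_seed = 0.42;                 -- hypothetical session GUC: pick a seed
EXPLAIN SELECT * FROM big_join_view;  -- hypothetical view: inspect the plan
SET geqo_seed = 0.97;                 -- bad plan?  reroll and plan again

which is exactly the workflow being shivered at.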
 
-Kevin



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-08 Thread Kenneth Marshall
On Wed, Jul 08, 2009 at 04:13:11PM -0500, Kevin Grittner wrote:
 Tom Lane t...@sss.pgh.pa.us wrote: 
  
  It occurs to me that one way to make GEQO less scary would be to
  take out the nondeterminism by resetting its random number generator
  for each query.  You might get a good plan or an awful one, but at
  least it'd be the same one each time.  DBAs like predictability.
  
 +1  The biggest reason that I've tended to avoid geqo is that I would
 never know when it might do something really stupid with a query one
 time out of some large number, leading to mysterious complaints which
 could eat a lot of time.
  
 For a moment it seemed logical to suggest a session GUC for the seed,
 so if you got a bad plan you could keep rolling the dice until you got
 one you liked; but my right-brain kept sending shivers down my spine
 to suggest just how uncomfortable it was with that idea
  
 -Kevin
 

+1  I like the idea of a session GUC for the random number seed. If
we can come up with a way to prune the search space more aggressively,
GEQO (or GEQO2) will be much less prone to generating a bad plan.
I also think that a session variable would make it easier to test
GEQO* by removing the nondeterminism.

Ken



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-08 Thread Tom Lane
Noah Misch n...@leadboat.com writes:
 With joins between statistically indistinguishable columns, I see planning
 times change by a factor of ~4 for each join added or removed (postgres 8.3).
 Varying join_collapse_limit in the neighborhood of the actual number of joins
 has a similar effect.  See attachment with annotated timings.  The example
 uses a single table joined to itself, but using distinct tables with
 identical contents yields the same figures.

Hey, some hard data!  Thanks for doing that.

 The exponential factor seems smaller for real queries.  I have a query of
 sixteen joins that takes 71s to plan deterministically; it looks like this:

 SELECT 1 FROM fact JOIN dim0 ... JOIN dim6
 JOIN t t0 ON fact.key = t.key AND t.x = MCV0
 LEFT JOIN t t1 ON fact.key = t.key AND t.x = MCV1
 JOIN t t2 ON fact.key = t.key AND t.x = MCV2
 LEFT JOIN t t3 ON fact.key = t.key AND t.x = NON-MCV0
 LEFT JOIN t t4 ON fact.key = t.key AND t.x = NON-MCV1
 LEFT JOIN t t5 ON fact.key = t.key AND t.x = NON-MCV2
 LEFT JOIN t t6 ON fact.key = t.key AND t.x = NON-MCV3
 LEFT JOIN t t7 ON fact.key = t.key AND t.x = NON-MCV4

I'm confused here --- I think you must have over-anonymized your query.
Surely the ON conditions for the left joins should be referencing t3,
t4, etc?

 For the real query, removing one join drops plan time to 26s, and
 removing two drops the time to 11s.  I don't have a good theory for
 the multiplier changing from 4 for the trivial demonstration to ~2.5
 for this real query.

The rule of thumb that says that an n-way join requires 2^n work is only
true if we consider every single combination of possible joins, which
normally we don't.  The planner prefers join paths that follow join
clauses, and will only consider clauseless joins when it has no other
choice.  I believe that real queries tend to be pretty non-flat in this
space and so the number of join paths to consider is a lot less than 2^n.
Your synthesized query, on the other hand, allows any relation to be
joined to any other --- it might not look that way, but after creating
derived equalities there will be a potential join clause linking every
relation to every other one.  So I think you were testing the worst case,
and I'm not surprised that more-typical queries would show a slower
growth curve.
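
A toy illustration of that derived-equality effect (a sketch, recreating the
temp table from Noah's script so it stands alone):

CREATE TEMP TABLE t AS SELECT * FROM generate_series(1, 1000) f(n);
ANALYZE t;
-- Only a.n = b.n and b.n = c.n are written, but the planner derives
-- a.n = c.n from the equivalence class, so even the (a, c) pairing has
-- a join clause and no join order is clauseless.
EXPLAIN SELECT 1
FROM t a
JOIN t b ON a.n = b.n
JOIN t c ON b.n = c.n;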

This is why I don't trust benchmarking the planner on artificial
queries.  Still, I appreciate your work, because it gives us at least
some hard data to go on.

regards, tom lane



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-08 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes:
 For a moment it seemed logical to suggest a session GUC for the seed,
 so if you got a bad plan you could keep rolling the dice until you got
 one you liked; but my right-brain kept sending shivers down my spine
 to suggest just how uncomfortable it was with that idea

If memory serves, we actually had exactly that at some point.  But I
think the reason it got taken out was that it interfered with the
behavior of the random() function for everything else.  We'd have to
give GEQO its own private random number generator.

regards, tom lane



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-08 Thread Kenneth Marshall
On Wed, Jul 08, 2009 at 05:46:02PM -0400, Tom Lane wrote:
 Kevin Grittner kevin.gritt...@wicourts.gov writes:
  For a moment it seemed logical to suggest a session GUC for the seed,
  so if you got a bad plan you could keep rolling the dice until you got
  one you liked; but my right-brain kept sending shivers down my spine
  to suggest just how uncomfortable it was with that idea
 
 If memory serves, we actually had exactly that at some point.  But I
 think the reason it got taken out was that it interfered with the
 behavior of the random() function for everything else.  We'd have to
 give GEQO its own private random number generator.
 
   regards, tom lane
 
A separate random number generator for GEQO makes a lot of sense.

Cheers,
Ken



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-08 Thread Robert Haas

On Jul 8, 2009, at 3:57 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
  On Tue, Jul 7, 2009 at 6:33 PM, Tom Lanet...@sss.pgh.pa.us wrote:
  It's pretty much all-or-nothing now: the GUC does not give you any sort
  of useful control over *which* joins are reorderable.

  Yes.  So the way I see it, the options are:

  1. We can remove join_collapse_limit completely and provide no
  substitute.  In this case, the ability to explicitly specify the join
  order will be gone.

  2. We can remove join_collapse_limit but provide a different, Boolean
  GUC instead, like enable_join_reordering.  In this case, we're not
  actually reducing the number of GUCs, just the size of the foot-gun.

  3. We can remove join_collapse_limit and provide an alternative way to
  explicitly specify the join order that is more flexible.  This both
  reduces the number of GUCs and arguably provides some useful
  functionality that we don't have now.

  It sounds like your vote is for #2, which, as I say, seems like a
  feature with one arm tied behind its back, but hey, what do I know?

 Well, the reason I'm not voting for #3 is that it looks like a lot of
 work to implement something that would basically be a planner hint,
 which I'm generally against; furthermore, it's a hint that there's been
 no demand for.  (We're not even certain that anyone is using the ability
 to *fully* specify the join order, much less wanting some undetermined
 compromise between manual and automatic control.)  And anyway I didn't
 hear anyone volunteering to do it.  So the realistic alternatives are
 #1, #2, or do nothing; and out of those I like #2.

That was my first reaction too, but now I'm wondering whether we
shouldn't just do #1.  #2 is a planner hint, too, just not a very good
one.  If, as you suggest, it isn't actually useful, then why keep it
at all? (On the other hand, if someone thinks they need it, it would
be interesting to know the use case, and think about the best way to
address it.)

 Accepting that as the consensus in the absence of contrary votes, we
 still need to decide what to do about from_collapse_threshold and
 geqo_threshold.  I'm pretty sure that we shouldn't eliminate GEQO or
 geqo_threshold, because the basic algorithm is clearly exponential
 time and eventually you have to start worrying about that, but we
 could raise the value.  What to do about from_collapse_threshold is
 less clear to me.

 I do not think there is a good argument for eliminating geqo_threshold.
 There might well be an argument for cranking up its default value;
 but that would take some hard data, which seems lacking at the moment.

 I'm on the fence about from_collapse_threshold.  The argument for having
 it seems to be that there might be cases where not folding a subquery
 is preferable to folding it and then taking your chances with GEQO.
 But I'm not really convinced there are any.


Me either.  You could probably get the same effect in other ways if  
you actually needed it, like OFFSET 0 or wrapping the subquery in a  
SRF.  I'm leaning more and more toward thinking we should just nuke it.
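
For reference, the OFFSET 0 idiom mentioned there, as a sketch (table t is
hypothetical): a subquery with a LIMIT or OFFSET clause is not pulled up into
the parent, so the dummy OFFSET fences its planning off from the parent's join
search, much as a small from_collapse_limit would.

SELECT b.n
FROM (SELECT n FROM t WHERE n % 2 = 0
      OFFSET 0) AS sub     -- OFFSET 0 blocks subquery flattening
JOIN t b ON b.n = sub.n;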



 It occurs to me that one way to make GEQO less scary would be to take
 out the nondeterminism by resetting its random number generator for
 each query.  You might get a good plan or an awful one, but at least
 it'd be the same one each time.  DBAs like predictability.

Hmm, that doesn't sound appealing to me, but I'm only a DBA at need.

...Robert



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-08 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 That was my first reaction too, but now I'm wondering whether we  
 shouldn't just do #1.  #2 is a planner hint, too, just not a very good  
 one.  If, as you suggest, it isn't actually useful, then why keep it  
 at all? (On the other hand, if someone thinks they need it, it would  
 be interesting to know the use case, and think about the best way to  
 address it.)

Well, I can cite one reasonably plausible use case: when you have an
umpteen-way join you need to execute a lot, and you don't want to pay
for an exhaustive search, but GEQO doesn't reliably find a good plan.
What you can do is let the system do an exhaustive search once to find
the best plan, then you rearrange the query to specify that join order
via JOINs, and turn off join collapsing.  Presto, good plan every time
with very little planning time expended.
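
As a sketch of that workflow (tables a, b, c and their join clauses are
invented for illustration):

-- Once, offline: let the exhaustive planner search all join orders.
SET geqo = off;
SET join_collapse_limit = 100;
EXPLAIN SELECT 1 FROM a, b, c
WHERE b.a_id = a.id AND c.b_id = b.id;

-- Suppose it settled on c, then b, then a.  Rewrite the query with
-- explicit JOINs in that order and pin it:
SET join_collapse_limit = 1;
SELECT 1
FROM c
JOIN b ON c.b_id = b.id
JOIN a ON b.a_id = a.id;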

Now, your answer will probably be that we should provide some better
mechanism for re-using a previously identified plan structure.  No
doubt that would be ideal, but the amount of effort required to get
there is nontrivial, and I'm not at all convinced it would be repaid
in usefulness.  Whereas what I describe above is something that costs
us nearly nothing to provide.  The usefulness might be marginal too,
but on the basis of cost/benefit ratio it's a clear win.

 It occurs to me that one way to make GEQO less scary would be to take
 out the nondeterminism by resetting its random number generator for
 each query.  You might get a good plan or an awful one, but at least
 it'd be the same one each time.  DBAs like predictability.

 Hmm, that doesn't sound appealing to me, but I'm only a DBA at need.

I was imagining a GUC that would make the reset optional, in case anyone
really does want to have unstable plans.  I don't have much doubt about
what typical users will prefer though.

regards, tom lane



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-08 Thread Joshua Tolley
On Wed, Jul 08, 2009 at 09:26:35PM -0400, Tom Lane wrote:
 Robert Haas robertmh...@gmail.com writes:
  That was my first reaction too, but now I'm wondering whether we  
  shouldn't just do #1.  #2 is a planner hint, too, just not a very good  
  one.  If, as you suggest, it isn't actually useful, then why keep it  
  at all? (On the other hand, if someone thinks they need it, it would  
  be interesting to know the use case, and think about the best way to  
  address it.)
 
 Well, I can cite one reasonably plausible use case: when you have an
 umpteen-way join you need to execute a lot, and you don't want to pay
 for an exhaustive search, but GEQO doesn't reliably find a good plan.
 What you can do is let the system do an exhaustive search once to find
 the best plan, then you rearrange the query to specify that join order
 via JOINs, and turn off join collapsing.  Presto, good plan every time
 with very little planning time expended.

 Now, your answer will probably be that we should provide some better
 mechanism for re-using a previously identified plan structure.  No
 doubt that would be ideal, but the amount of effort required to get
 there is nontrivial, and I'm not at all convinced it would be repaid
 in usefulness.  Whereas what I describe above is something that costs
 us nearly nothing to provide.  The usefulness might be marginal too,
 but on the basis of cost/benefit ratio it's a clear win.

This sounds like planner hints to me. The argument against hinting, AIUI, is
that although the plan you've guaranteed via hints may be a good one today,
when the data change a bit your carefully crafted plan happens to become a bad
one, but you're no longer around to change the hints accordingly. Entire
stored plans, or predetermined seeds for GEQO's random number generator all
boil down to saying, "I want you to use this plan henceforth and forever."

  It occurs to me that one way to make GEQO less scary would be to take
  out the nondeterminism by resetting its random number generator for
  each query.  You might get a good plan or an awful one, but at least
  it'd be the same one each time.  DBAs like predictability.
 
  Hmm, that doesn't sound appealing to me, but I'm only a DBA at need.
 
 I was imagining a GUC that would make the reset optional, in case anyone
 really does want to have unstable plans.  I don't have much doubt about
 what typical users will prefer though.

Do we know that GEQO plans are, in reality, less stable than the usual
planner? Certainly on paper it appears they could be, but the mailing lists
are full of emails about "this query's plan changed and performance suddenly
tanked; how do I fix it?", so I'm unconvinced this is a problem unique to GEQO.
Which in turn boils down to "we need real world data to look at."

--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com




Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-08 Thread Tom Lane
Joshua Tolley eggyk...@gmail.com writes:
 This sounds like planner hints to me. The argument against hinting, AIUI, is
 that although the plan you've guaranteed via hints may be a good one today,
 when the data change a bit your carefully crafted plan happens to become a bad
 one, but you're no longer around to change the hints accordingly.

That's one argument against them.  Another one is that time put into
developing a planner hints mechanism is time that would be better spent
on fixing the underlying planner problem.  However, that second argument
doesn't apply to something like join_collapse_limit, whose
implementation is pretty nearly a one-liner (as are the various existing
enable_whatever switches).  Again, it's all about cost/benefit ratio.

 Do we know that GEQO plans are, in reality, less stable than than usual
 planner?

Yes, we do.  There aren't that many examples in the archives, but that
likely is because join_collapse_limit and from_collapse_limit are by
default set to try to prevent use of GEQO.  The most recent clear-cut
example I can find is
http://archives.postgresql.org/pgsql-general/2008-10/msg01449.php
wherein the guy has apparently written a flat FROM of 17 tables,
and so neither collapse_limit will kick in.  If we get rid of the
collapse_limits as proposed in this thread, you'll start seeing
those sorts of complaints all over the place, unless we make
GEQO deterministic.

regards, tom lane



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-08 Thread Tom Lane
Noah Misch n...@leadboat.com writes:
 Describing in those terms illuminates much.  While the concepts do suggest 2^N
 worst-case planning cost, my artificial test case showed a rigid 4^N pattern;
 what could explain that?

Well, the point of the 2^N concept is just that adding one more relation
multiplies the planning work by a constant factor.  It's useful data
that you find the factor to be about 4, but I wouldn't have expected the
model to tell us that.

regards, tom lane



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-07 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote: 
 
 I'm interested in hearing from anyone who has practical experience
 with tuning these variables, or any ideas on what we should test to
 get a better idea as to how to set them.
 
I don't remember any clear resolution to the wild variations in plan
time mentioned here:
 
http://archives.postgresql.org/pgsql-hackers/2009-06/msg00743.php
 
I think it would be prudent to try to figure out why small changes in
the query caused the large changes in the plan times Andres was
seeing.  Has anyone else ever seen such behavior?  Can we get
examples?  (It should be enough to get the statistics and the schema,
since this is about planning time, not run time.)
 
My own experience is that when we investigate a complaint about a
query not performing to user or application programmer expectations,
we have sometimes found that boosting these values has helped.  We
boost them overall (in postgresql.conf) without ever having seen a
downside.  We currently have geqo disabled and set both collapse
limits to 20.  We should probably just set them both to several
hundred and not wait until some query with more than 20 tables
performs badly, but I'm not sure we have any of those yet.
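 
In postgresql.conf terms, the configuration described above is simply (for
reference; the shipped defaults are geqo = on and collapse limits of 8):

geqo = off
from_collapse_limit = 20
join_collapse_limit = 20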
 
In short, my experience is that when setting these higher has made any
difference at all, it has always generated a plan that saved more time
than the extra planning required.  Well, I'd bet that there has been
an increase in the plan time of some queries which wound up with the
same plan anyway, but the difference has never been noticeable; the
net effect has been a plus for us.
 
I guess the question is whether there is anyone who has had a contrary
experience.  (There must have been some benchmarks to justify adding
geqo at some point?)
 
-Kevin



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-07 Thread Robert Haas
On Jul 7, 2009, at 9:31 AM, Kevin Grittner kevin.gritt...@wicourts.gov wrote:
 Robert Haas robertmh...@gmail.com wrote:
  I'm interested in hearing from anyone who has practical experience
  with tuning these variables, or any ideas on what we should test to
  get a better idea as to how to set them.

 I don't remember any clear resolution to the wild variations in plan
 time mentioned here:

 http://archives.postgresql.org/pgsql-hackers/2009-06/msg00743.php

 I think it would be prudent to try to figure out why small changes in
 the query caused the large changes in the plan times Andres was
 seeing.  Has anyone else ever seen such behavior?  Can we get
 examples?  (It should be enough to get the statistics and the schema,
 since this is about planning time, not run time.)

Well, there's not really enough information there to figure out
specifically what was happening, but from 10,000 feet,
join_collapse_limit and from_collapse_limit constrain the join order.
If the estimates are all accurate, setting them to a value < infinity
will either leave the plans unchanged or make them worse.  If it's
making them better, then the estimates are off and the join order
constraint happens to be preventing the planner from considering the
case that really hurts you.  But that's mostly luck.

 My own experience is that when we investigate a complaint about a
 query not performing to user or application programmer expectations,
 we have sometimes found that boosting these values has helped.  We
 boost them overall (in postgresql.conf) without ever having seen a
 downside.  We currently have geqo disabled and set both collapse
 limits to 20.  We should probably just set them both to several
 hundred and not wait until some query with more than 20 tables
 performs badly, but I'm not sure we have any of those yet.

 In short, my experience is that when setting these higher has made any
 difference at all, it has always generated a plan that saved more time
 than the extra planning required.  Well, I'd bet that there has been
 an increase in the plan time of some queries which wound up with the
 same plan anyway, but the difference has never been noticeable; the
 net effect has been a plus for us.

You have a big dataset AIUI so the right values for you might be too
high for some people with, say, OLTP workloads.

 I guess the question is whether there is anyone who has had a contrary
 experience.  (There must have been some benchmarks to justify adding
 geqo at some point?)

GEQO or something like it is certainly needed for very large planning
problems.  The non-GEQO planner takes exponential time in the size of
the problem, so at some point that's going to get ugly.  But
triggering it at the level we do now seems unnecessarily pessimistic
about what constitutes too much planning.

...Robert



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-07 Thread Andres Freund
Hi Kevin, Hi all,

On Tuesday 07 July 2009 16:31:14 Kevin Grittner wrote:
 Robert Haas robertmh...@gmail.com wrote:
  I'm interested in hearing from anyone who has practical experience
  with tuning these variables, or any ideas on what we should test to
  get a better idea as to how to set them.

 I don't remember any clear resolution to the wild variations in plan
 time mentioned here:

 http://archives.postgresql.org/pgsql-hackers/2009-06/msg00743.php

 I think it would be prudent to try to figure out why small changes in
 the query caused the large changes in the plan times Andres was
 seeing.  Has anyone else ever seen such behavior?  Can we get
 examples?  (It should be enough to get the statistics and the schema,
 since this is about planning time, not run time.)
I don't think it is surprising that small changes to those variables change
the plan time widely on a complex query.
E.g. an increase by one in from_collapse_limit can completely change the plan,
because optimization changes due to more inlining.

I don't know the exact behaviour in the case where more joins exist than
join_collapse_limit, but it is not hard to imagine that this can also
dramatically change the plan complexity. As there were quite a few different
views involved, all the changes to the *_limit variables could have triggered
plan changes in different parts of the query.

I plan to revisit the issue you referenced, btw. First there was the release
phase, and then I could not motivate myself to investigate a bit more...

The mail you referenced contains a completely bogus and ugly query that shows
similar symptoms, by the way. I guess the variations would be even bigger if
differently sized views/subqueries were used.

 My own experience is that when we investigate a complaint about a
 query not performing to user or application programmer expectations,
 we have sometimes found that boosting these values has helped.  We
 boost them overall (in postgresql.conf) without ever having seen a
 downside.  We currently have geqo disabled and set both collapse
 limits to 20.  We should probably just set them both to several
 hundred and not wait until some query with more than 20 tables
 performs badly, but I'm not sure we have any of those yet.
I have not found consistently better results with geqo enabled. Some queries
are better, others worse. Often the comparison is not reliably reproducible.
(The possibility to set geqo to some known starting value would be nice for
such comparisons.)

I cannot reasonably plan some queries with join_collapse_limit set to 20. At
least not without setting the geqo limit very low and geqo_effort to a low
value.
So I would definitely not agree that removing j_c_l is a good idea. 
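
For concreteness, the knobs referred to here (a sketch; geqo_effort ranges
from 1 to 10, default 5, trading search time for plan quality):

SET geqo = on;
SET geqo_threshold = 12;  -- use GEQO at this many FROM items or more
SET geqo_effort = 1;      -- low effort: fastest, least thorough search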

Andres



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-07 Thread Tom Lane
Andres Freund and...@anarazel.de writes:
 I cannot reasonably plan some queries with join_collapse_limit set to 20. At 
 least not without setting the geqo limit very low and geqo_effort to a low 
 value.
 So I would definitely not agree that removing j_c_l is a good idea. 

Can you show some specific examples?  All of this discussion seems like
speculation in a vacuum ...

regards, tom lane



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-07 Thread Andres Freund
On Tuesday 07 July 2009 17:40:50 Tom Lane wrote:
 Andres Freund and...@anarazel.de writes:
  I cannot reasonably plan some queries with join_collapse_limit set to 20.
  At least not without setting the geqo limit very low and geqo_effort to
  a low value.
  So I would definitely not agree that removing j_c_l is a good idea.
 Can you show some specific examples?  All of this discussion seems like
 speculation in a vacuum ...
I still may not publish the original schema (and I still have not heard any
reasonable reasons for that) - but the crazy query in the referenced email
shows similar problems and has a somewhat similar structure.

If that is not enough I will try to design a schema that is similar and 
different enough from the original schema. Will take a day or two though.


Andres



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-07 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes:
 I guess the question is whether there is anyone who has had a contrary
 experience.  (There must have been some benchmarks to justify adding
 geqo at some point?)

The CVS history shows that geqo was integrated on 1997-02-19, which
I think means that it must have been developed against Postgres95
(or even earlier Berkeley releases?).  That was certainly before any
of the current community's work on the optimizer began.  A quick look
at the code as it stood on that date suggests that the regular
optimizer's behavior for large numbers of rels was a lot worse than it
is today --- notably, it looks like it would consider a whole lot more
Cartesian-product joins than we do now; especially if you had bushy
mode turned on, which you'd probably have to do to find good plans in
complicated cases.  There were also a bunch of enormous inefficiencies
that we've whittled down over time, such as the mechanisms for comparing
pathkeys or the use of integer Lists to represent relid sets.

So while I don't doubt that geqo was absolutely essential when it was
written, it's fair to question whether it still provides a real win.
And we could definitely stand to take another look at the default
thresholds.

regards, tom lane



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-07 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 One possibility would be to remove join_collapse_limit entirely, but
 that would eliminate one possibly-useful piece of functionality that
 it currently enables: namely, the ability to exactly specify the join
 order by setting join_collapse_limit to 1.  So one possibility would
 be to rename the variable something like explicit_join_order and make
 it a Boolean; another possibility would be to change the default value
 to INT_MAX.

As the person who put in those thresholds, I kind of prefer going over
to the boolean definition.  That was the alternative that we considered;
the numeric thresholds were used instead because they were easy to
implement and seemed to possibly offer more control.  But I'm not
convinced that anyone has really used them profitably.  I agree that
the ability to use JOIN syntax to specify the join order exactly (with
join_collapse_limit=1) is the only really solid use-case anyone has
proposed for either threshold.  I'm interested in Andreas' comment that
he has use-cases where using the collapse_limit is better than allowing
geqo to take over for very large problems ... but I think we need to see
those use-cases and see if there's a better fix.

regards, tom lane



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-07 Thread Greg Stark
On Tue, Jul 7, 2009 at 5:58 PM, Tom Lanet...@sss.pgh.pa.us wrote:
 So while I don't doubt that geqo was absolutely essential when it was
 written, it's fair to question whether it still provides a real win.
 And we could definitely stand to take another look at the default
 thresholds

The whole point of these parameters is to save time planning large
complex queries -- which are rarely going to be the kind of short,
simple, fast to execute oltp queries where planning time makes a big
difference. The larger and more complex the query, the more likely it is to
be a long-running dss or olap style query where shaving one percent
off the runtime would be worth spending many seconds planning.

I propose that there's a maximum reasonable planning time within which a
programmer would normally expect the database to be able to come up
with a plan for virtually any query. Personally I would be
surprised if a plain EXPLAIN took more than, say, 30s, perhaps even
something more like 10s.

We should benchmark the planner on increasingly large sets of
relations on a typical developer machine and set geqo to whatever
value the planner can handle in that length of time. I suspect even at
10s you're talking about substantially larger values than the current
default.

-- 
greg
http://mit.edu/~gsstark/resume.pdf



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-07 Thread Tom Lane
Greg Stark gsst...@mit.edu writes:
 We should benchmark the planner on increasingly large sets of
 relations on a typical developer machine and set geqo to whatever
 value the planner can handle in that length of time. I suspect even at
 10s you're talking about substantially larger values than the current
 default.

The problem is to find some realistic benchmark cases.  That's one
reason why I was pestering Andreas to see his actual use cases ...

regards, tom lane



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-07 Thread Andres Freund
On Tuesday 07 July 2009 19:45:44 Tom Lane wrote:
 Greg Stark gsst...@mit.edu writes:
  We should benchmark the planner on increasingly large sets of
  relations on a typical developer machine and set geqo to whatever
  value the planner can handle in that length of time. I suspect even at
  10s you're talking about substantially larger values than the current
  default.
 The problem is to find some realistic benchmark cases.  That's one
 reason why I was pestering Andreas to see his actual use cases ...
I will start writing a reduced/altered schema tomorrow then...

Andres


PS: Its Andres btw ;-)



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-07 Thread Robert Haas

On Jul 7, 2009, at 12:32 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
  One possibility would be to remove join_collapse_limit entirely, but
  that would eliminate one possibly-useful piece of functionality that
  it currently enables: namely, the ability to exactly specify the join
  order by setting join_collapse_limit to 1.  So one possibility would
  be to rename the variable something like explicit_join_order and make
  it a Boolean; another possibility would be to change the default value
  to INT_MAX.

 As the person who put in those thresholds, I kind of prefer going over
 to the boolean definition.

I'm OK with that, but out of conservatism suggested changing the
default to unlimited in this release.  If by chance there is something
we're missing and these parameters are doing someone any good, we can
suggest that they set them back to the old values rather than telling
them to use a private build.  If on the other hand we don't get any
complaints, we can remove them with greater confidence in a future
release.  But maybe that's too conservative.

Now, here's another thought: if we think it's reasonable for people to
want to explicitly specify the join order, a GUC isn't really the best
fit, because it's all or nothing.  Maybe we'd be better off dropping
the GUCs entirely and adding some other bit of syntax that forces the
join order, but only for that particular join.

 That was the alternative that we considered;
 the numeric thresholds were used instead because they were easy to
 implement and seemed to possibly offer more control.  But I'm not
 convinced that anyone has really used them profitably.  I agree that
 the ability to use JOIN syntax to specify the join order exactly (with
 join_collapse_limit=1) is the only really solid use-case anyone has
 proposed for either threshold.  I'm interested in Andreas' comment that
 he has use-cases where using the collapse_limit is better than allowing
 geqo to take over for very large problems ... but I think we need to
 see those use-cases and see if there's a better fix.

    regards, tom lane

Agreed.

...Robert



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-07 Thread Dimitri Fontaine

On 7 Jul 2009, at 19:37, Greg Stark wrote:
 I propose that there's a maximum reasonable planning time

It sounds so much like the planner_effort GUC that has been talked
about in the past...

  http://archives.postgresql.org/pgsql-performance/2009-05/msg00137.php

...except this time you want to measure it in seconds. The problem
with measuring it in seconds is that when the time has elapsed, it's
not easy to switch from the classic planner to geqo and avoid beginning
from scratch again.

Would it be possible to start geqo from the current planner state?

Another idea would be to have more complex metrics for deciding when
to run geqo, that is, guesstimate the query planning difficulty very
quickly, based on more than just the number of relations in the from:
presence of subqueries, UNION, EXISTS, IN, or branches in the where
clause, number of operators and index support for them, maybe some
information from the stats too... The idea would be to
 - set an effort threshold from where we'd better run geqo (GUC,
   disabling possible)
 - if threshold enabled, compute metrics
 - if metric >= threshold, use geqo; if not, classic planner
 - maybe default to disabling the threshold

It seems it'd be easier to set the new GUC on a per query basis...

The obvious problem to this approach is that computing the metric will
take some time better spent at planning queries, but maybe we could
have a fast path for easy queries, which will look a lot like $subject.


Regards,
--
dim

I hope this will give readers better ideas than its bare content...


Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-07 Thread Dimitri Fontaine

On 7 Jul 2009, at 21:16, Robert Haas wrote:
 Now, here's another thought: if we think it's reasonable for people
 to want to explicitly specify the join order, a GUC isn't really the
 best fit, because it's all or nothing.  Maybe we'd be better off
 dropping the GUCs entirely and adding some other bit of syntax that
 forces the join order, but only for that particular join.

MySQL calls them Straight Joins:
http://www.mysqlperformanceblog.com/2006/12/28/mysql-session-variables-and-hints/

I'm not sure our best move here would be in this direction :)
--
dim


Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-07 Thread Tom Lane
Dimitri Fontaine dfonta...@hi-media.com writes:
 Another idea would be to have more complex metrics for deciding when  
 to run geqo, that is guesstimate the query planning difficulty very  
 quickly, based on more than just the number of relations in the from:  
 presence of subqueries, UNION, EXISTS, IN, or branches in where  
 clause, number of operators and index support for them, maybe some  
 information from the stats too...

Pointless, since GEQO is only concerned with examining alternative join
orderings.  I see no reason whatever to think that number-of-relations
isn't the correct variable to test to decide whether to use it.

regards, tom lane



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-07 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote: 
 
 if we think it's reasonable for people to want to explicitly specify
 the join order
 
Regardless of the syntax (GUC or otherwise), that is an optimizer
hint.  I thought we were trying to avoid those.
 
Although -- we do have all those enable_* GUC values which are also
optimizer hints; perhaps this should be another of those? 
enable_join_reorder?
 
-Kevin



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-07 Thread Dimitri Fontaine

On 7 Jul 2009, at 21:45, Tom Lane wrote:
 Dimitri Fontaine dfonta...@hi-media.com writes:
  Another idea would be to have more complex metrics for deciding when
  to run geqo

 Pointless, since GEQO is only concerned with examining alternative join
 orderings.  I see no reason whatever to think that number-of-relations
 isn't the correct variable to test to decide whether to use it.

Oh. It seems I prefer showing my ignorance rather than learning enough
first. Writing mails is so much easier...


Sorry for the noise,
--
dim


Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-07 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes:
 Although -- we do have all those enable_* GUC values which are also
 optimizer hints; perhaps this should be another of those? 
 enable_join_reorder?

Not a bad suggestion, especially since turning it off would usually be
considered just about as bad an idea as turning off the other ones.

regards, tom lane



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-07 Thread Robert Haas
On Jul 7, 2009, at 3:03 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote:
 Robert Haas robertmh...@gmail.com wrote:
  if we think it's reasonable for people to want to explicitly specify
  the join order

 Regardless of the syntax (GUC or otherwise), that is an optimizer
 hint.  I thought we were trying to avoid those.

I guess my point is that there's not a lot of obvious benefit in
allowing the functionality to exist but handicapping it so that it's
useful in as few cases as possible.  If the consensus is that we want
half a feature (but not more or less than half), that's OK with me,
but it's not obvious to me why we should choose to want that.


...Robert



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-07 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 I guess my point is that there's not a lot of obvious benefit in  
 allowing the functionality to exist but handicapping it so that it's  
 useful in as few cases as possible.  If the consensus is that we want  
 half a feature (but not more or less than half), that's OK with me,  
 but it's not obvious to me why we should choose to want that.

Well, the question to my mind is whether the collapse_limit GUCs in
their current form actually represent a feature ;-).  They were put
in pretty much entirely on speculation that someone might find them
useful.  Your argument is that they are not only useless but a foot-gun,
and so far we haven't got any clear contrary evidence.  If we accept
that argument then we should take them out, not just change the default.

My own thought is that from_collapse_limit has more justification,
since it basically acts to stop a subquery from being flattened when
that would make the parent query too complex, and that seems like a
more understandable and justifiable behavior than treating JOIN
syntax specially.  But I'm fine with removing join_collapse_limit
or reducing it to a boolean.
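
To put a number on that, here is a minimal sketch (hypothetical tables
t1 through t6 and x, y, z; nothing from any real schema):

SET from_collapse_limit = 8;   -- the current default
-- Pulling the 3-relation subquery up into a parent that already has six
-- FROM items would produce a 9-item FROM list, and 9 > 8, so the subquery
-- is kept as a separate planning subproblem instead of being flattened.
EXPLAIN SELECT *
FROM t1, t2, t3, t4, t5, t6,
     (SELECT x.id
      FROM x, y, z
      WHERE x.id = y.id AND y.id = z.id) AS sub
WHERE t1.id = t2.id AND t2.id = t3.id AND t3.id = t4.id
  AND t4.id = t5.id AND t5.id = t6.id AND t6.id = sub.id;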

regards, tom lane



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-07 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:
 Kevin Grittner kevin.gritt...@wicourts.gov wrote:
 Robert Haas robertmh...@gmail.com wrote:

 if we think it's reasonable for people to want to explicitly
 specify the join order

 Regardless of the syntax (GUC or otherwise), that is an optimizer
 hint.  I thought we were trying to avoid those.
 
 I guess my point is that there's not a lot of obvious benefit in  
 allowing the functionality to exist but handicapping it so that it's
 useful in as few cases as possible.  If the consensus is that we
 want half a feature (but not more or less than half), that's OK with
 me, but it's not obvious to me why we should choose to want that.
 
Ever since I've been following these lists, there has been a pretty
steady dribble of requests for optimizer hints of one type or another.
The standard response has always been, "If you have a query which is
optimizing poorly, show us, and we'll try to fix the optimizer so that
it doesn't need hints to do a good job."  The enable_* GUCs
effectively *are* optimizer hints, but they are strongly discouraged
for anything but diagnostic purposes.  I guess I don't see any reason
to consider this issue as being different.
 
If implementing them this way seems to handicap them, I guess that's
probably to discourage their use, which seems reasonable to me; I bet
people would shoot themselves in the foot much more often than they
would help themselves with such options, we'd lose information which
might help to improve the optimizer, and the list would probably get a
pretty steady stream of slow queries which would turn out to be
(after digging deep enough) caused by people using hints that caused a
sub-optimal plan to be chosen.
 
-Kevin



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-07 Thread Alvaro Herrera
Tom Lane wrote:

 My own thought is that from_collapse_limit has more justification,
 since it basically acts to stop a subquery from being flattened when
 that would make the parent query too complex, and that seems like a
 more understandable and justifiable behavior than treating JOIN
 syntax specially.

Isn't that what we use OFFSET 0 for?  That one also has the nice
property that you can actually specify which subquery you want to
prevent from being flattened.
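
For example (a minimal sketch, with hypothetical orders and payments
tables):

SELECT o.id, p.amount
FROM orders o
JOIN (SELECT customer_id, amount
      FROM payments
      WHERE amount > 0
      OFFSET 0) p          -- fence: blocks pull-up of this subquery only
  ON p.customer_id = o.customer_id;

Without the OFFSET 0, the subquery would ordinarily be flattened into
the parent's FROM list; with it, the subquery is planned on its own.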

Personally I have never seen a case where the collapse_limits were
useful tools.

-- 
Alvaro Herrera                        http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-07 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes:
 Tom Lane wrote:
 My own thought is that from_collapse_limit has more justification,
 since it basically acts to stop a subquery from being flattened when
 that would make the parent query too complex, and that seems like a
 more understandable and justifiable behavior than treating JOIN
 syntax specially.

 Isn't that what we use OFFSET 0 for?  That one has also the nice
 property that you can actually specify which subquery you want to
 prevent from being flattened.

Well, if you want to modify your queries to prevent long planning times,
that'd be one way to do it.  It doesn't seem like a generally useful
answer to me though.  For example, typically the subquery would actually
be a view that might be used in various contexts.  If you stick an
OFFSET in it then you disable flattening in all those contexts, likely
not the best answer.
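
For instance, with a hypothetical view:

CREATE VIEW big_payments AS
SELECT customer_id, amount
FROM payments
WHERE amount > 1000
OFFSET 0;   -- travels with the view: every query using it loses flattening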

 Personally I have never seen a case where the collapse_limits were
 useful tools.

I'm not convinced they're useful either.

regards, tom lane



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-07 Thread Robert Haas

On Jul 7, 2009, at 4:56 PM, Tom Lane t...@sss.pgh.pa.us wrote:


Robert Haas robertmh...@gmail.com writes:

I guess my point is that there's not a lot of obvious benefit in
allowing the functionality to exist but handicapping it so that it's
useful in as few cases as possible.  If the consensus is that we want
half a feature (but not more or less than half), that's OK with me,
but it's not obvious to me why we should choose to want that.


Well, the question to my mind is whether the collapse_limit GUCs in
their current form actually represent a feature ;-).  They were put
in pretty much entirely on speculation that someone might find them
useful.  Your argument is that they are not only useless but a foot-gun,
and so far we haven't got any clear contrary evidence.  If we accept
that argument then we should take them out, not just change the default.


My own thought is that from_collapse_limit has more justification,
since it basically acts to stop a subquery from being flattened when
that would make the parent query too complex, and that seems like a
more understandable and justifiable behavior than treating JOIN
syntax specially.  But I'm fine with removing join_collapse_limit
or reducing it to a boolean.


That's pretty much where I am with it, too.  The feature I was  
referring to was not the collapse limits, but the ability to  
explicitly specify the join order, which perhaps could be a useful  
tool for reducing planning time or coping with bad estimates if you  
could do it for only some of the joins in the query, but which we're  
instead proposing to keep as an all-or-nothing flag.


...Robert



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-07 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Jul 7, 2009, at 4:56 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 My own thought is that from_collapse_limit has more justification,

 That's pretty much where I am with it, too.  The feature I was  
 referring to was not the collapse limits, but the ability to  
 explicitly specify the join order, which perhaps could be a useful  
 tool for reducing planning time or coping with bad estimates if you  
 could do it for only some of the joins in the query, but which we're  
 instead proposing to keep as an all-or-nothing flag.

It's pretty much all-or-nothing now: the GUC does not give you any sort
of useful control over *which* joins are reorderable.

regards, tom lane



Re: [HACKERS] *_collapse_limit, geqo_threshold

2009-07-07 Thread Robert Haas
On Tue, Jul 7, 2009 at 6:33 PM, Tom Lanet...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Jul 7, 2009, at 4:56 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 My own thought is that from_collapse_limit has more justification,

 That's pretty much where I am with it, too.  The feature I was
 referring to was not the collapse limits, but the ability to
 explicitly specify the join order, which perhaps could be a useful
 tool for reducing planning time or coping with bad estimates if you
 could do it for only some of the joins in the query, but which we're
 instead proposing to keep as an all-or-nothing flag.

 It's pretty much all-or-nothing now: the GUC does not give you any sort
 of useful control over *which* joins are reorderable.

Yes.  So the way I see it, the options are:

1. We can remove join_collapse_limit completely and provide no
substitute.  In this case, the ability to explicitly specify the join
order will be gone.
2. We can remove join_collapse_limit but provide a different, Boolean
GUC instead, like enable_join_reordering.  In this case, we're not
actually reducing the number of GUCs, just the size of the foot-gun.
3. We can remove join_collapse_limit and provide an alternative way to
explicitly specify the join order that is more flexible.  This both
reduces the number of GUCs and arguably provides some useful
functionality that we don't have now.

It sounds like your vote is for #2, which, as I say, seems like a
feature with one arm tied behind its back, but hey, what do I know?

Accepting that as the consensus in the absence of contrary votes, we
still need to decide what to do about from_collapse_limit and
geqo_threshold.  I'm pretty sure that we shouldn't eliminate GEQO or
geqo_threshold, because the basic join search algorithm is clearly
exponential time and eventually you have to start worrying about that,
but we could raise the value.  What to do about from_collapse_limit is
less clear to me.

...Robert



[HACKERS] *_collapse_limit, geqo_threshold

2009-07-06 Thread Robert Haas
I think we should try to do something about join_collapse_limit,
from_collapse_limit, and geqo_threshold for 8.5.

http://archives.postgresql.org/message-id/9134.1243289...@sss.pgh.pa.us
http://archives.postgresql.org/message-id/603c8f070905251800g5b86d2dav26eca7f417d15...@mail.gmail.com

I'm still of the opinion that join_collapse_limit is a loaded
foot-gun, because I don't think that users will expect that a join
specified this way:

SELECT ... FROM a JOIN b ON Pab JOIN c ON Pac JOIN d ON Pad ...

will behave differently than one specified this way:

SELECT ... FROM a, b, c, d WHERE Pab AND Pac AND Pad ...

The whole purpose of join_collapse_limit in the first instance is to
prevent planning time from getting out of control, but I don't see how
we can view it as a very effective safety valve when it depends so
heavily on which syntax is used. If the planning time for an N-way
join is excessive, then we're going to have a problem with excessive
planning time whenever the second syntax is selected, and I don't see
any reason to believe that users see the second syntax as dangerous
in terms of planning time but the first syntax as safer.
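
To illustrate the asymmetry with the hypothetical tables above (a
sketch, assuming join columns named id and a_id):

SET join_collapse_limit = 1;

-- Join order frozen exactly as written: ((a JOIN b) JOIN c) JOIN d.
EXPLAIN SELECT *
FROM a JOIN b ON a.id = b.a_id
       JOIN c ON a.id = c.a_id
       JOIN d ON a.id = d.a_id;

-- Same relations and predicates, but the planner is still free to
-- reorder these, because no explicit JOIN syntax is involved.
EXPLAIN SELECT *
FROM a, b, c, d
WHERE a.id = b.a_id AND a.id = c.a_id AND a.id = d.a_id;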

One possibility would be to remove join_collapse_limit entirely, but
that would eliminate one possibly useful piece of functionality that
it currently enables: namely, the ability to exactly specify the join
order by setting join_collapse_limit to 1.  So one possibility would
be to rename the variable something like explicit_join_order and make
it a Boolean; another possibility would be to change the default value
to INT_MAX.

The approach I've taken in the attached patch is to make 0 mean
unlimited and make that the default value.  I don't have a strong
feeling about whether that's better than the other two options,
although it seems cleaner to me or I'd not have written the patch that
way.  We could also consider adopting this same approach for
from_collapse_limit, though for some reason that behavior seems
marginally less pathological to me.

At any rate, regardless of whether this patch (or one of the other
approaches mentioned above) is adopted for 8.5, I think we should
raise the default values for whatever is left.  The defaults basically
haven't been modified since they were put in, and my experience is
that even queries with 10 to 15 joins perform acceptably for OLTP
workloads, which are exactly the workloads where query planning time
is most likely to be an issue.  So I would propose raising each of the
limits by 4 (to 12 for from_collapse_limit and join_collapse_limit if
we don't unlimit them entirely, and to 16 for geqo_threshold).  I'm
interested in hearing from anyone who has practical experience with
tuning these variables, or any ideas on what we should test to get a
better idea as to how to set them.
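
In concrete terms, that proposal is (session-level sketch; the 0 =
unlimited reading of join_collapse_limit assumes the attached patch):

SET from_collapse_limit = 12;  -- up from the long-standing default of 8
SET join_collapse_limit = 12;  -- up from 8; or 0 for unlimited with the patch
SET geqo_threshold = 16;       -- up from the default of 12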

Thanks,

...Robert
*** a/doc/src/sgml/config.sgml
--- b/doc/src/sgml/config.sgml
*************** SELECT * FROM parent WHERE key = 2400;
*** 2288,2296 ****
         </para>

         <para>
!         By default, this variable is set the same as
!         <varname>from_collapse_limit</varname>, which is appropriate
!         for most uses. Setting it to 1 prevents any reordering of
          explicit <literal>JOIN</>s. Thus, the explicit join order
          specified in the query will be the actual order in which the
          relations are joined. The query planner does not always choose
--- 2288,2295 ----
         </para>

         <para>
!         By default, this variable is set to <literal>0</>, which always
!         allows rewriting.  Setting it to 1 prevents any reordering of
          explicit <literal>JOIN</>s. Thus, the explicit join order
          specified in the query will be the actual order in which the
          relations are joined. The query planner does not always choose
*** a/src/backend/optimizer/plan/initsplan.c
--- b/src/backend/optimizer/plan/initsplan.c
*************** deconstruct_recurse(PlannerInfo *root, Node *jtnode, bool below_outer_join,
*** 477,483 ****
  			/* force the join order exactly at this node */
  			joinlist = list_make1(list_make2(leftjoinlist, rightjoinlist));
  		}
! 		else if (list_length(leftjoinlist) + list_length(rightjoinlist) <=
  				 join_collapse_limit)
  		{
  			/* OK to combine subproblems */
--- 477,484 ----
  			/* force the join order exactly at this node */
  			joinlist = list_make1(list_make2(leftjoinlist, rightjoinlist));
  		}
! 		else if (join_collapse_limit == 0
! 				 || list_length(leftjoinlist) + list_length(rightjoinlist) <=
  				 join_collapse_limit)
  		{
  			/* OK to combine subproblems */
*** a/src/backend/utils/misc/guc.c
--- b/src/backend/utils/misc/guc.c
*************** static struct config_int ConfigureNamesInt[] =
*** 1275,1284 ****
  						 "constructs are not flattened."),
  			gettext_noop("The planner will flatten explicit JOIN "
  						 "constructs into lists of FROM items whenever a "
! 						 "list of no more than this many items would result.")
  		},
  		&join_collapse_limit,
! 		8, 1, INT_MAX,