[HACKERS] PROPOSAL: geqo improvement

2009-01-04 Thread marcin mank
Hello, List.

There are cases when GEQO returns a very bad plan in some rare
executions of a query. To decrease likehood of this happening, I
propose:

When GEQO detects that what it found is in fact a miserable plan  it
restarts the search. Simple math shows that if the probability of a
bad plan found in one 'go' is p, the overall probability of a bad plan
is p^N  .

GEQO would decide that the plan is bad when the calculated cost of the
plan would exceed the time spent planning so far a fixed number of
times (100 ? a configurable parameter ?) .
I think a function infering cost from time spent could be calculated
from cpu_operator_cost - or is there a better way?

As a safety mean I wish to limit the number of replannings to a fixed
value (10? 20? a configurable parameter?)

If I introduce some configuration variables, I plan to infer the
defaults from geqo_effort (no concrete plan for this now).

An alternative to restarting the search might be just extending it -
running the main loop of geqo() function longer.  I plan restarting
because I`m afraid the real reason for getting bad plans could be that
the algorithm is getting into some local minimum and can`t get out. I
will explore that more.

If there is agreement to do this, it looks simple enough that I
volunteer to implement it. Please tell me what is the deadline for
this to make into 8.4 .

What I lack is good test cases to verify the solution.

Greetings
Marcin Mańk

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PROPOSAL: geqo improvement

2009-01-04 Thread Robert Haas
2009/1/4 marcin mank marcin.m...@gmail.com:
 GEQO would decide that the plan is bad when the calculated cost of the
 plan would exceed the time spent planning so far a fixed number of
 times (100 ? a configurable parameter ?) .
 I think a function infering cost from time spent could be calculated
 from cpu_operator_cost - or is there a better way?

It sounds like you're proposing to compare the time spent planning to
the estimated execution time.  AFAICS, those things are unrelated, so
I'm not sure what you hope to figure out by comparing them.

 An alternative to restarting the search might be just extending it -
 running the main loop of geqo() function longer.  I plan restarting
 because I`m afraid the real reason for getting bad plans could be that
 the algorithm is getting into some local minimum and can`t get out. I
 will explore that more.

It sounds like you may have some concrete queries that suffer from
this problem.  It might be helpful to post the queries and the good
and bad plans.  It may be that the problem can be fixed with some
tuning of the existing parameters.

 If there is agreement to do this, it looks simple enough that I
 volunteer to implement it. Please tell me what is the deadline for
 this to make into 8.4 .

The deadline for the final CommitFest was November 1st, so I think it
is too late for 8.4.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PROPOSAL: geqo improvement

2009-01-04 Thread marcin mank
 It sounds like you're proposing to compare the time spent planning to
 the estimated execution time.  AFAICS, those things are unrelated, so
 I'm not sure what you hope to figure out by comparing them.

The idea is: If we are to spend a LOT of resources executing the
query, we might as well burn some cycles in hope of finding a better
plan.


 It sounds like you may have some concrete queries that suffer from
 this problem.  It might be helpful to post the queries and the good
 and bad plans.  It may be that the problem can be fixed with some
 tuning of the existing parameters.

Actually, no. This is my random thought based on observing some
threads where people get bad plans due to GEQO.


 The deadline for the final CommitFest was November 1st, so I think it
 is too late for 8.4.

ugh.. too bad. I`m still interested anyway :)

Marcin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers