subject:"Re\: \[PERFORM\] Overriding the optimizer"

Re: [PERFORM] Overriding the optimizer

2005-12-20 Thread Jim C. Nasby

On Sat, Dec 17, 2005 at 07:31:40AM -0500, Jaime Casanova wrote:
> > > Yeah it would - an implementation I have seen that I like is where the
> > > developer can supply the *entire* execution plan with a query. This is
> > > complex enough to make casual use unlikely :-), but provides the ability
> > > to try out other plans, and also fix that vital query that must run
> > > today.
> >
> > Being able to specify an exact plan would also provide for query plan
> > stability; something that is critically important in certain
> > applications. If you have to meet a specific response time requirement
> > for a query, you can't afford to have the optimizer suddenly decide that
> > some other plan might be faster when in fact it's much slower.
> 
> Plan stability doesn't mean time response stability...
> The plan that today is almost instantaneous tomorrow can take hours...

Sure, if your underlying data changes that much, but that's often not
going to happen in production systems (especially OLTP where this is
most important).

Of course if you have a proposal for ensuring that a query always
finishes in X amount of time, rather than always using the same plan,
I'd love to hear it. ;)
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [PERFORM] Overriding the optimizer

2005-12-17 Thread Jaime Casanova

> > Yeah it would - an implementation I have seen that I like is where the
> > developer can supply the *entire* execution plan with a query. This is
> > complex enough to make casual use unlikely :-), but provides the ability
> > to try out other plans, and also fix that vital query that must run
> > today.
>
> Being able to specify an exact plan would also provide for query plan
> stability; something that is critically important in certain
> applications. If you have to meet a specific response time requirement
> for a query, you can't afford to have the optimizer suddenly decide that
> some other plan might be faster when in fact it's much slower.

Plan stability doesn't mean time response stability...
The plan that today is almost instantaneous tomorrow can take hours...

--
regards,
Jaime Casanova
(DBA: DataBase Aniquilator ;)

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

Re: [PERFORM] Overriding the optimizer

2005-12-17 Thread David Lang


On Fri, 16 Dec 2005, Mark Kirkwood wrote:


Craig A. James wrote:



What would be cool would be some way the developer could alter the plan, 
but they way of doing so would strongly encourage the developer to send the 
information to this mailing list.  Postgres would essentially say, "Ok, you 
can do that, but we want to know why!"




Yeah it would - an implementation I have seen that I like is where the 
developer can supply the *entire* execution plan with a query. This is 
complex enough to make casual use unlikely :-), but provides the ability to 
try out other plans, and also fix that vital query that must run today.


hmm, I wonder if this option would have uses beyond the production hacks 
that are being discussed.


specificly developers working on the optimizer (or related things like 
clustered databases) could use the same hooks to develop and modify the 
'optimizer' externally to postgres (doing an explain would let them find 
the costs that postgres thinks each option has, along with it's 
reccomendation, but the developer could try different execution plans 
without having to recompile postgres between runs. and for clustered 
databases where the data is split between machines this would be a hook 
that the cluster engine could use to put it's own plan into place without 
having to modify and recompile)


David Lang

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org

Re: [PERFORM] Overriding the optimizer

2005-12-16 Thread Tom Lane

"Craig A. James" <[EMAIL PROTECTED]> writes:
> How about this: Instead of arguing in the abstract, tell me in
> concrete terms how you would address the very specific example I gave,
> where myfunc() is a user-written function.  To make it a little more
> challenging, try this: myfunc() can behave very differently depending
> on the parameters, and sometimes (but not always), the application
> knows how it will behave and could suggest a good execution plan.

A word to the wise:

regression=# explain select * from tenk1 where ten > 5 and ten < 9
regression-# and myfunc(unique1,unique2);
QUERY PLAN
--
 Seq Scan on tenk1  (cost=0.00..533.00 rows=982 width=244)
   Filter: ((ten > 5) AND (ten < 9) AND myfunc(unique1, unique2))
(2 rows)

regression=# explain select * from tenk1 where myfunc(unique1,unique2)
regression-# and ten > 5 and ten < 9;
QUERY PLAN
--
 Seq Scan on tenk1  (cost=0.00..533.00 rows=982 width=244)
   Filter: (myfunc(unique1, unique2) AND (ten > 5) AND (ten < 9))
(2 rows)

I might have taken your original complaint more seriously if it
weren't so blatantly bogus.  Your query as written absolutely
would not have evaluated myfunc() first, because there was no
reason for the planner to reorder the WHERE list.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

Re: [PERFORM] Overriding the optimizer

2005-12-16 Thread Kyle Cordes


Jaime Casanova wrote:


What I would
really like is for my DBMS to give me a little more pushback - I'd like
to ask it to run a query, and have it either find a "good" way to run
the query, or politely refuse to run it at all.
   




set statement_timeout in postgresql.conf
 



That is what I am doing now, and it is much better than nothing.

But it's not really sufficient, in that it is still quite possible for 
users repeatedly trying an operation that unexpectedly causes excessive 
DB usage, to load down the system to the point of failure.  In other 
words, I'd ideally like it to give up right away, not after N seconds of 
table scanning my 100-million-row tables... and not with a timeout, but 
with an explicit throwing up of its hands, exasperated, that it could 
not come up with an efficient way to run my query.


Kyle Cordes
www.kylecordes.com



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match

Re: [PERFORM] Overriding the optimizer

2005-12-16 Thread Jim C. Nasby

On Fri, Dec 16, 2005 at 04:16:58PM +1300, Mark Kirkwood wrote:
> Craig A. James wrote:
> 
> >
> >What would be cool would be some way the developer could alter the plan, 
> >but they way of doing so would strongly encourage the developer to send 
> >the information to this mailing list.  Postgres would essentially say, 
> >"Ok, you can do that, but we want to know why!"
> >
> 
> Yeah it would - an implementation I have seen that I like is where the 
> developer can supply the *entire* execution plan with a query. This is 
> complex enough to make casual use unlikely :-), but provides the ability 
> to try out other plans, and also fix that vital query that must run 
> today.

Being able to specify an exact plan would also provide for query plan
stability; something that is critically important in certain
applications. If you have to meet a specific response time requirement
for a query, you can't afford to have the optimizer suddenly decide that
some other plan might be faster when in fact it's much slower.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [PERFORM] Overriding the optimizer

2005-12-16 Thread Jim C. Nasby

On Fri, Dec 16, 2005 at 03:31:03PM +1300, Mark Kirkwood wrote:
> After years of using several other database products (some supporting 
> hint type constructs and some not), I have come to believe that hinting 
> (or similar) actually *hinders* the development of a great optimizer.

I don't think you can assume that would hold true for an open-source
database. Unlike a commercial database, it's trivially easy to notify
developers about a bad query plan. With a commercial database you'd have
to open a support ticket and hope they actually use that info to improve
the planner. Here you need just send an email to this list and the
developers will at least see it, and will usually try and fix the issue.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

Re: [PERFORM] Overriding the optimizer

2005-12-16 Thread Jim C. Nasby

On Thu, Dec 15, 2005 at 09:48:55PM -0800, Kevin Brown wrote:
> Craig A. James wrote:
> > Kevin Brown wrote:
> > >>Hints are dangerous, and I consider them a last resort.
> > >
> > >If you consider them a last resort, then why do you consider them to
> > >be a better alternative than a workaround such as turning off
> > >enable_seqscan, when all the other tradeoffs are considered?
> > 
> > If I understand enable_seqscan, it's an all-or-nothing affair.  Turning it 
> > off turns it off for the whole database, right?  The same is true of all 
> > of the planner-tuning parameters in the postgres conf file.
> 
> Nope.  What's in the conf file are the defaults.  You can change them
> on a per-connection basis, via the SET command.  Thus, before doing
> your problematic query:
> 
> SET enable_seqscan = off;
> 
> and then, after your query is done, 
> 
> SET enable_seqscan = on;

You can also turn it off inside a transaction and have it only affect
that transaction so that you can't accidentally forget to turn it back
on (which could seriously hose things up if you're doing this in
conjunction with a connection pool).
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

Re: [PERFORM] Overriding the optimizer

2005-12-16 Thread Mitch Skinner

On Thu, 2005-12-15 at 18:23 -0800, Craig A. James wrote:
> So, "you still have no problem" is exactly wrong, because Postgres picked the 
> wrong plan.  Postgres decided that applying myfunc() to 10,000,000 rows was a 
> better plan than an index scan of 50,000 row_nums.  So I'm screwed.

FWIW,
The cost_functionscan procedure in costsize.c has the following comment:
/*
 * For now, estimate function's cost at one operator eval per
function
 * call.  Someday we should revive the function cost estimate
columns in * pg_proc...
 */

I recognize that you're trying to talk about the issue in general rather
than about this particular example.  However, the example does seem to
me to be exactly the case where the effort might be better spent
improving the optimizer (reviving the function cost estimate columns),
rather than implementing a general hinting facility.  Which one is more
effort?  I don't really know for sure, but cost_functionscan does seem
pretty straightforward.

What percentage of problems raised on this list can be fixed by setting
configuration parameters, adding indexes, increasing statistics, or
re-architecting a crazy schema?  I've only been lurking for a few
months, but it seems like a pretty large fraction.  Of the remainder,
what percentage represent actual useful feedback about what needs
improvement in the optimizer?  A pretty large fraction, I think.
Including your example.

Personally, I think whoever was arguing for selectivity hints in
-hackers recently made a pretty good point, so I'm partly on your side.
Actually, function cost "hints" don't really seem that much different
from selectivity hints, and both seem to me to be slicker solutions
(closer to the right level of abstraction) than a general hint facility.

Mitch

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [PERFORM] Overriding the optimizer

2005-12-16 Thread Craig A. James


Jaime Casanova wrote:

The context is this - in a busy OLTP system, sometimes a query comes
through that, for whatever reason (foolishness on my part as a
developer, unexpected use by a user, imperfection of the optimizer,
etc.), takes a really long time to run, usually because it table-scans
one or more large tables.  If several of these happen at once, it can
grind an important production system effectively to a halt.  I'd like to
have a few users/operations get a "sorry, I couldn't find a good way to
do that" message, rather than all the users find that their system has
effectively stopped working.
... 
set statement_timeout in postgresql.conf


I found it's better to use "set statement_timeout" in the code, rather than setting it 
globally.  Someone else pointed out to me that setting it in postgresql.conf makes it apply to ALL 
transactions, including VACUUM, ANALYZE and so forth.  I put it in my code just around the queries 
that are "user generated" -- queries that are from users' input.  I expect any SQL that I 
write to finish in a reasonable time ;-).

Craig

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match

Re: [PERFORM] Overriding the optimizer

2005-12-16 Thread Tomasz Rybak

Dnia 16-12-2005, pią o godzinie 16:16 +1300, Mark Kirkwood napisał(a):
> Craig A. James wrote:
> 
> > 
> > What would be cool would be some way the developer could alter the plan, 
> > but they way of doing so would strongly encourage the developer to send 
> > the information to this mailing list.  Postgres would essentially say, 
> > "Ok, you can do that, but we want to know why!"
> > 
> 
> Yeah it would - an implementation I have seen that I like is where the 
> developer can supply the *entire* execution plan with a query. This is 
> complex enough to make casual use unlikely :-), but provides the ability 
> to try out other plans, and also fix that vital query that must run 
> today.

I think you could use SPI for that.
There is function SPI_prepare, which prepares plan,
and SPI_execute_plan, executing it.
These functions are defined in src/backend/executor/spi.c.

I think (someone please correct me if I'm wrong) you could
prepare plan yourself, instead of taking it from SPI_prepare,
and give it to SPI_execute_plan.

SPI_prepare calls _SPI_prepare_plan, which parses query and calls
pg_analyze_and_rewrite. In your version don't call this function,
but provide PostgreSQL with your own plan (not-optimised according to
PostrgeSQL, but meeting your criteria).

-- 
Tomasz Rybak <[EMAIL PROTECTED]>

---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [PERFORM] Overriding the optimizer

2005-12-16 Thread Jaime Casanova

On 12/16/05, Kyle Cordes <[EMAIL PROTECTED]> wrote:
> Kevin Brown wrote:
>
> >Craig A. James wrote:
> >
> >
> >>Hints are dangerous, and I consider them a last resort.
> >>
> >>
> >
> >If you consider them a last resort, then why do you consider them to
> >be a better alternative than a workaround such as turning off
> >enable_seqscan, when all the other tradeoffs are considered?
> >
> >
>
> I would like a bit finer degree of control on this - I'd like to be able
> to tell PG that for my needs, it is never OK to scan an entire table of
> more than N rows.  I'd typically set N to 1,000,000 or so.  What I would
> really like is for my DBMS to give me a little more pushback - I'd like
> to ask it to run a query, and have it either find a "good" way to run
> the query, or politely refuse to run it at all.
>
> Yes, I know that is an unusual request  :-)
>
> The context is this - in a busy OLTP system, sometimes a query comes
> through that, for whatever reason (foolishness on my part as a
> developer, unexpected use by a user, imperfection of the optimizer,
> etc.), takes a really long time to run, usually because it table-scans
> one or more large tables.  If several of these happen at once, it can
> grind an important production system effectively to a halt.  I'd like to
> have a few users/operations get a "sorry, I couldn't find a good way to
> do that" message, rather than all the users find that their system has
> effectively stopped working.
>
> Kyle Cordes
> www.kylecordes.com
>
>

set statement_timeout in postgresql.conf

--
Atentamente,
Jaime Casanova
(DBA: DataBase Aniquilator ;)

---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [PERFORM] Overriding the optimizer

2005-12-16 Thread Kyle Cordes


Kevin Brown wrote:


Craig A. James wrote:
 


Hints are dangerous, and I consider them a last resort.
   



If you consider them a last resort, then why do you consider them to
be a better alternative than a workaround such as turning off
enable_seqscan, when all the other tradeoffs are considered?
 



I would like a bit finer degree of control on this - I'd like to be able 
to tell PG that for my needs, it is never OK to scan an entire table of 
more than N rows.  I'd typically set N to 1,000,000 or so.  What I would 
really like is for my DBMS to give me a little more pushback - I'd like 
to ask it to run a query, and have it either find a "good" way to run 
the query, or politely refuse to run it at all.


Yes, I know that is an unusual request  :-)

The context is this - in a busy OLTP system, sometimes a query comes 
through that, for whatever reason (foolishness on my part as a 
developer, unexpected use by a user, imperfection of the optimizer, 
etc.), takes a really long time to run, usually because it table-scans 
one or more large tables.  If several of these happen at once, it can 
grind an important production system effectively to a halt.  I'd like to 
have a few users/operations get a "sorry, I couldn't find a good way to 
do that" message, rather than all the users find that their system has 
effectively stopped working.


Kyle Cordes
www.kylecordes.com



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match

Re: [PERFORM] Overriding the optimizer

2005-12-15 Thread Bruno Wolff III

On Thu, Dec 15, 2005 at 21:41:06 -0800,
  "Craig A. James" <[EMAIL PROTECTED]> wrote:
> 
> If I understand enable_seqscan, it's an all-or-nothing affair.  Turning it 
> off turns it off for the whole database, right?  The same is true of all of 

You can turn it off just for specific queries. However, it will apply to
all joins within a query.

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

Re: [PERFORM] Overriding the optimizer

2005-12-15 Thread Kevin Brown

Craig A. James wrote:
> Kevin Brown wrote:
> >>Hints are dangerous, and I consider them a last resort.
> >
> >If you consider them a last resort, then why do you consider them to
> >be a better alternative than a workaround such as turning off
> >enable_seqscan, when all the other tradeoffs are considered?
> 
> If I understand enable_seqscan, it's an all-or-nothing affair.  Turning it 
> off turns it off for the whole database, right?  The same is true of all 
> of the planner-tuning parameters in the postgres conf file.

Nope.  What's in the conf file are the defaults.  You can change them
on a per-connection basis, via the SET command.  Thus, before doing
your problematic query:

SET enable_seqscan = off;

and then, after your query is done, 

SET enable_seqscan = on;

> >If your argument is that planner hints would give you finer grained
> >control, then the question is whether you'd rather the developers
> >spend their time implementing planner hints or improving the planner.
> 
> I agree 100% -- I'd much prefer a better planner.  But when it comes down 
> to a do-or-die situation, you need a hack, some sort of workaround, to get 
> you working *today*.

And that's why I was asking about workarounds versus planner hints.  I
expect that the situations in which the planner gets things wrong
*and* where there's no workaround are very rare indeed.


-- 
Kevin Brown   [EMAIL PROTECTED]

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

Re: [PERFORM] Overriding the optimizer

2005-12-15 Thread Craig A. James


Kevin Brown wrote:

Hints are dangerous, and I consider them a last resort.


If you consider them a last resort, then why do you consider them to
be a better alternative than a workaround such as turning off
enable_seqscan, when all the other tradeoffs are considered?


If I understand enable_seqscan, it's an all-or-nothing affair.  Turning it off 
turns it off for the whole database, right?  The same is true of all of the 
planner-tuning parameters in the postgres conf file.  Since the optimizer does 
a good job most of the time, I'd hate to change a global setting like this -- 
what else would be affected?  I could try this, but it would make me nervous to 
alter the whole system to fix one particular query.


If your argument is that planner hints would give you finer grained
control, then the question is whether you'd rather the developers
spend their time implementing planner hints or improving the planner.


I agree 100% -- I'd much prefer a better planner.  But when it comes down to a 
do-or-die situation, you need a hack, some sort of workaround, to get you 
working *today*.

Regards,
Craig

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [PERFORM] Overriding the optimizer

2005-12-15 Thread Kevin Brown

Craig A. James wrote:
> Hints are dangerous, and I consider them a last resort.

If you consider them a last resort, then why do you consider them to
be a better alternative than a workaround such as turning off
enable_seqscan, when all the other tradeoffs are considered?

If your argument is that planner hints would give you finer grained
control, then the question is whether you'd rather the developers
spend their time implementing planner hints or improving the planner.
I'd rather they did the latter, as long as workarounds are available
when needed.  A workaround will probably give the user greater
incentive to report the problem than use of planner hints.


-- 
Kevin Brown   [EMAIL PROTECTED]

---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [PERFORM] Overriding the optimizer

2005-12-15 Thread Kevin Brown

Craig A. James wrote:
> 
> 
> Christopher Kings-Lynne wrote:
>   select * from my_table where row_num >= 5 and row_num < 
> 10
>    and myfunc(foo, bar);
> >>>
> >>>
> >>>You just create an index on myfunc(foo, bar)
> >>
> >>
> >>only if myfunc(foo, bar) is immutable...
> >
> >
> >And if it's not then the best any database can do is to index scan 
> >row_num - so still you have no problem.
> 
> Boy, you picked a *really* bad example ;-)
> 
> The problem is that Postgres decided to filter on myfunc() *first*, and 
> then filter on row_num, resulting in a query time that jumped from seconds 
> to hours.  And there's no way for me to tell Postgres not to do that!

Apologies in advance if all of this has been said, or if any of it is
wrong.

What kind of plan do you get if you eliminate the myfunc(foo, bar)
from the query entirely?  An index scan or a full table scan?  If the
latter then (assuming that the statistics are accurate) the reason you
want inclusion of myfunc() to change the plan must be the expense of
the function, not the expense of the scan (index versus sequential).
While the expense of the function isn't, as far as I know, known or
used by the planner, that obviously needn't be the case.

On the other hand, if the inclusion of the function call changes the
plan that is selected from an index scan to a sequential scan, then
that, I think, is clearly a bug, since even a zero-cost function
cannot make the sequential scan more efficient than an index scan
which is already more efficient than the base sequential scan.

> So, "you still have no problem" is exactly wrong, because Postgres picked 
> the wrong plan.  Postgres decided that applying myfunc() to 10,000,000 
> rows was a better plan than an index scan of 50,000 row_nums.  So I'm 
> screwed.

If PostgreSQL is indeed applying myfunc() to 10,000,000 rows, then
that is a bug if the function is declared VOLATILE (which is the
default if no volatility is specified), because it implies that it's
applying the function to rows that don't match the selection
condition.  From your prior description, it sounds like your function
is declared STABLE.

For your specific situation, my opinion is that the proper
modification to PostgreSQL would be to give it (if it isn't already
there) the ability to include the cost of functions in the plan.  The
cost needn't be something that it automatically measures -- it could
be specified at function creation time.

-- 
Kevin Brown   [EMAIL PROTECTED]

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [PERFORM] Overriding the optimizer

2005-12-15 Thread David Lang


On Thu, 15 Dec 2005, Craig A. James wrote:

The example I raised in a previous thread, of irregular usage, is the same: I 
have a particular query that I *always* want to be fast even if it's only 
used rarely, but the system swaps its tables out of the file-system cache, 
based on "low usage", even though the "high usage" queries are low priority. 
How can Postgres know such things when there's no way for me to tell it?


actually, postgres doesn't manage the file-system cache, it deliberatly 
leaves that up to the OS it is running on to do that job.


one (extremely ugly) method that you could use would be to have a program 
that looks up what files are used to store your high priority tables and 
then write a trivial program to keep those files in memory (it may be as 
simple as mmaping the files and then going to sleep, or you may have to 
read various points through the file to keep them current in the cache, it 
WILL vary depending on your OS and filesystem in use)


oracle goes to extremes with this sort of control, I'm actually mildly 
surprised that they still run on a host OS and haven't completely taken 
over the machine (I guess they don't want to have to write device drivers, 
that's about the only OS code they really want to use, they do their own 
memory management, filesystem, and user systems), by avoiding areas like 
this postgres sacrafices a bit of performance, but gains a much broader 
set of platforms (hardware and OS) that it can run on. and this by itself 
can result in significant wins (does oracle support Opteron CPU's in 64 
bit mode yet? as of this summer it just wasn't an option)


David Lang

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly

Re: [PERFORM] Overriding the optimizer

2005-12-15 Thread Jaime Casanova

On 12/15/05, David Lang <[EMAIL PROTECTED]> wrote:
> On Thu, 15 Dec 2005, Craig A. James wrote:
>
> > Mark Kirkwood wrote:
> >> I hear what you are saying, but to use this fine example - I don't know
> >> what the best plan is - these experiments part of an investigation to
> find
> >> *if* there is a better plan, and if so, why Postgres is not finding it.
> >>
> >>> There isn't a database in the world that is as smart as a developer, or
> >>> that can have insight into things that only a developer can possibly
> know.
> >>
> >> That is often true - but the aim is to get Postgres's optimizer closer to
> >> developer smartness.
> >
> > What would be cool would be some way the developer could alter the plan,
> but
> > they way of doing so would strongly encourage the developer to send the
> > information to this mailing list.  Postgres would essentially say, "Ok,
> you
> > can do that, but we want to know why!"
>
> at the risk of sounding flippent (which is NOT what I intend) I will point
> out that with the source you can change the optimizer any way you need to
> :-)
>
> that being said, in your example the issue is the cost of the user created
> function and the fact that postgres doesn't know it's cost.
>
> would a resonable answer be to give postgres a way to learn how expensive
> the call is?
>
> a couple ways I could see to do this.
>
> 1. store some stats automagicly when the function is called and update the
> optimization plan when you do an ANALYSE
>
> 2. provide a way for a user to explicitly set a cost factor for a function
> (with a default value that's sane for fairly trivial functions so that it
> would only have to be set for unuseually expensive functions)
>
> now, neither of these will work all the time if a given function is
> sometimes cheap and sometimes expensive (depending on it's parameters),
> but in that case I would say that if the application knows that a function
> will be unusueally expensive under some conditions (and knows what those
> conditions will be) it may be a reasonable answer to duplicate the
> function, one copy that it uses most of the time, and a second copy that
> it uses when it expects it to be expensive. at this point the cost of the
> function can be set via either of the methods listed above)
>
> >> After years of using several other database products (some supporting
> hint
> >> type constructs and some not), I have come to believe that hinting (or
> >> similar) actually *hinders* the development of a great optimizer.
> >
> > I agree.  It takes the pressure off the optimizer gurus.  If the users can
> > just work around every problem, then the optimizer can suck and the system
> is
> > still usable.
> >
> > Lest anyone think I'm an all-out advocate of overriding the optimizer, I
> know
> > from first-hand experience what a catastrophe it can be.  An Oracle hint I
> > used worked fine on my test schema, but the customer's "table" turned out
> to
> > be a view, and Oracle's optimizer worked well on the view whereas my hint
> was
> > horrible.  Unfortunately, without the hint, Oracle sucked when working on
> an
> > ordinary table.  Hints are dangerous, and I consider them a last resort.
>
> I've been on the linux-kernel mailing list for the last 9 years, and have
> seen a similar debate rage during that entire time about kernel memory
> management. overall both of these tend to be conflicts between short-term
> and long-term benifits.
>
> in the short-term the application user wants to be able to override the
> system to get the best performance _now_
>
> in the long run the system designers don't trust the application
> programmers to get the hints right and want to figure out the right
> optimizer plan, even if it takes a lot longer to do so.
>
> the key to this balance seems to be to work towards as few controls as
> possible, becouse the user will get them wrong far more frequently then
> they get them right, but when you hit a point where there's absolutly no
> way for the system to figure things out (and it's a drastic difference)
> provide the application with a way to hint to the system that things are
> unusueal, but always keep looking for patterns that will let the system
> detect the need itself
>
> even the existing defaults are wrong as frequently as they are right (they
> were set when hardware was very different then it is today) so some way to
> gather real-world stats and set the system defaults based on actual
> hardware performance is really the right way to go (even for things like
> sequential scan speed that are set in the config file today)
>
> David Lang
>

there was discussion on this and IIRC the consensus was that could be
useful tu give some statistics to user defined functions... i don't if
someone is working on this or even if it is doable...

--
regards,
Jaime Casanova
(DBA: DataBase Aniquilator ;)

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settin

Re: [PERFORM] Overriding the optimizer

2005-12-15 Thread Jaime Casanova

On 12/15/05, Craig A. James <[EMAIL PROTECTED]> wrote:
> > Yeah it would - an implementation I have seen that I like is where the
> > developer can supply the *entire* execution plan with a query. This is
> > complex enough to make casual use unlikely :-), but provides the ability
> > to try out other plans, and also fix that vital query that must run
> > today.
>
> So, to move on to the concrete...
>
> I'm not familiar with the innards of Postgres except in a theoretical way.
> Maybe this is a totally naive or dumb question, but I have to ask:   How
> hard would it be to essentially turn off the optimizer?
>
> 1. Evaluate WHERE clauses left-to-right.
>
> select ... from FOO where A and B and C;
>
> This would just apply the criteria left-to-right, first A, then B, then C.
> If an index was available it would use it, but only in left-to-right order,
> i.e. if A had no index but B did, then too bad, you should have written "B
> and A and C".
>

pg < 8.1 when you use multi-column indexes do exactly this... but i
don't know why everyone wants this...

>
> 2. Evaluate joins left-to-right.
>
> select ... from FOO join BAR on (...) join BAZ on (...) where ...
>
> This would join FOO to BAR, then join the result to BAZ.  The only
> optimization would be to apply relevant "where" conditions to each join
> before processing the next join.
>

using explicit INNER JOIN syntax and parenthesis

>
> 3. Don't flatten sub-selects
>
> select ... from (select ... from FOO where ...) as X where ...;
>

select ... from (select ... from FOO where ... offset 0) as X where ...;

> This would do the inner select then use the result in the outer select, and
> wouldn't attempt to flatten the query.
>
> Thanks,
> Craig
>

what else?

--
regards,
Jaime Casanova
(DBA: DataBase Aniquilator ;)

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

Re: [PERFORM] Overriding the optimizer

2005-12-15 Thread David Lang


On Thu, 15 Dec 2005, Craig A. James wrote:


Mark Kirkwood wrote:
I hear what you are saying, but to use this fine example - I don't know 
what the best plan is - these experiments part of an investigation to find 
*if* there is a better plan, and if so, why Postgres is not finding it.


There isn't a database in the world that is as smart as a developer, or 
that can have insight into things that only a developer can possibly know.


That is often true - but the aim is to get Postgres's optimizer closer to 
developer smartness.


What would be cool would be some way the developer could alter the plan, but 
they way of doing so would strongly encourage the developer to send the 
information to this mailing list.  Postgres would essentially say, "Ok, you 
can do that, but we want to know why!"


at the risk of sounding flippent (which is NOT what I intend) I will point 
out that with the source you can change the optimizer any way you need to 
:-)


that being said, in your example the issue is the cost of the user created 
function and the fact that postgres doesn't know it's cost.


would a resonable answer be to give postgres a way to learn how expensive 
the call is?


a couple ways I could see to do this.

1. store some stats automagicly when the function is called and update the 
optimization plan when you do an ANALYSE


2. provide a way for a user to explicitly set a cost factor for a function 
(with a default value that's sane for fairly trivial functions so that it 
would only have to be set for unuseually expensive functions)


now, neither of these will work all the time if a given function is 
sometimes cheap and sometimes expensive (depending on it's parameters), 
but in that case I would say that if the application knows that a function 
will be unusueally expensive under some conditions (and knows what those 
conditions will be) it may be a reasonable answer to duplicate the 
function, one copy that it uses most of the time, and a second copy that 
it uses when it expects it to be expensive. at this point the cost of the 
function can be set via either of the methods listed above)


After years of using several other database products (some supporting hint 
type constructs and some not), I have come to believe that hinting (or 
similar) actually *hinders* the development of a great optimizer.


I agree.  It takes the pressure off the optimizer gurus.  If the users can 
just work around every problem, then the optimizer can suck and the system is 
still usable.


Lest anyone think I'm an all-out advocate of overriding the optimizer, I know 
from first-hand experience what a catastrophe it can be.  An Oracle hint I 
used worked fine on my test schema, but the customer's "table" turned out to 
be a view, and Oracle's optimizer worked well on the view whereas my hint was 
horrible.  Unfortunately, without the hint, Oracle sucked when working on an 
ordinary table.  Hints are dangerous, and I consider them a last resort.


I've been on the linux-kernel mailing list for the last 9 years, and have 
seen a similar debate rage during that entire time about kernel memory 
management. overall both of these tend to be conflicts between short-term 
and long-term benifits.


in the short-term the application user wants to be able to override the 
system to get the best performance _now_


in the long run the system designers don't trust the application 
programmers to get the hints right and want to figure out the right 
optimizer plan, even if it takes a lot longer to do so.


the key to this balance seems to be to work towards as few controls as 
possible, becouse the user will get them wrong far more frequently then 
they get them right, but when you hit a point where there's absolutly no 
way for the system to figure things out (and it's a drastic difference) 
provide the application with a way to hint to the system that things are 
unusueal, but always keep looking for patterns that will let the system 
detect the need itself


even the existing defaults are wrong as frequently as they are right (they 
were set when hardware was very different then it is today) so some way to 
gather real-world stats and set the system defaults based on actual 
hardware performance is really the right way to go (even for things like 
sequential scan speed that are set in the config file today)


David Lang

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org

Re: [PERFORM] Overriding the optimizer

2005-12-15 Thread Craig A. James

Yeah it would - an implementation I have seen that I like is where the 
developer can supply the *entire* execution plan with a query. This is 
complex enough to make casual use unlikely :-), but provides the ability 
to try out other plans, and also fix that vital query that must run 
today.


So, to move on to the concrete...

I'm not familiar with the innards of Postgres except in a theoretical way.  
Maybe this is a totally naive or dumb question, but I have to ask:   How hard 
would it be to essentially turn off the optimizer?

1. Evaluate WHERE clauses left-to-right.

select ... from FOO where A and B and C;

This would just apply the criteria left-to-right, first A, then B, then C.  If an index 
was available it would use it, but only in left-to-right order, i.e. if A had no index 
but B did, then too bad, you should have written "B and A and C".


2. Evaluate joins left-to-right.

select ... from FOO join BAR on (...) join BAZ on (...) where ...

This would join FOO to BAR, then join the result to BAZ.  The only optimization would be 
to apply relevant "where" conditions to each join before processing the next 
join.


3. Don't flatten sub-selects

select ... from (select ... from FOO where ...) as X where ...;

This would do the inner select then use the result in the outer select, and 
wouldn't attempt to flatten the query.

Thanks,
Craig

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq

Re: [PERFORM] Overriding the optimizer

2005-12-15 Thread Craig A. James


Christopher Kings-Lynne wrote:
So your main example bad query is possibly just a case of lack of 
analyze stats and wrong postgresql.conf config?  And that's what causes 
you to shut down your database?  Don't you want your problem FIXED?


I'm trying to help by raising a question that I think is important, and have an 
honest, perhaps vigorous, but respectful, discussion about it.  I respect 
everyone's opinion, and I hope you respect mine.  I've been in this business a 
long time, and I don't raise issues lightly.

Yes, I want my query fixed.  And I may post it, in a thread with a new title.  
In fact, I posted a different query with essentially the same problem a while 
back and got nothing that helped:

   http://archives.postgresql.org/pgsql-performance/2005-11/msg00133.php

(I can't help but point out that Tom's response was to suggest a way to fool the 
optimizer so as to prevent it from "optimizing" the query.  In other words, he 
told me a trick that would force a particular plan on the optimizer.  Which is exactly 
the point of this discussion.)

The point is that the particular query is not relevant -- it's the fact that 
this topic (according to Tom) has been and continues to be raised.  This should 
tell us all something, that it's not going to go away, and that it's a real 
issue.

Regards,
Craig

---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [PERFORM] Overriding the optimizer

2005-12-15 Thread Christopher Kings-Lynne

... This seems like a case where PostgreSQL's current optimiser should 
easily know what to do if your config settings are correct and you've 
been running ANALYZE, so I'd like to see your settings and the explain 
analyze plan...


I could, but it would divert us from the main topic of this discussion.  
It's not about that query, which was just an example.  It's the larger 
issue.


So your main example bad query is possibly just a case of lack of 
analyze stats and wrong postgresql.conf config?  And that's what causes 
you to shut down your database?  Don't you want your problem FIXED?


But like I said - no developer is interested in doing planner hints. 
Possibly you could get a company to sponsor it.  Maybe what you want is 
a statement of "If someone submits a good, working, fully implemented 
patch that does planner hints, then we'll accept it."


Chris


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq

Re: [PERFORM] Overriding the optimizer

2005-12-15 Thread Mark Kirkwood


Craig A. James wrote:



What would be cool would be some way the developer could alter the plan, 
but they way of doing so would strongly encourage the developer to send 
the information to this mailing list.  Postgres would essentially say, 
"Ok, you can do that, but we want to know why!"




Yeah it would - an implementation I have seen that I like is where the 
developer can supply the *entire* execution plan with a query. This is 
complex enough to make casual use unlikely :-), but provides the ability 
to try out other plans, and also fix that vital query that must run 
today.


cheers

Mark

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match

Re: [PERFORM] Overriding the optimizer

2005-12-15 Thread Craig A. James


Christopher Kings-Lynne wrote:

Can you paste explain analyze and your effective_cache_size, etc. settings.
... 
This seems like a case where PostgreSQL's current optimiser should 
easily know what to do if your config settings are correct and you've 
been running ANALYZE, so I'd like to see your settings and the explain 
analyze plan...


I could, but it would divert us from the main topic of this discussion.  It's 
not about that query, which was just an example.  It's the larger issue.

Tom's earlier response tells the story better than I can:

This discussion has been had before (many times) ... see
the -hackers archives for detailed arguments. 


If it's "been had before (many times)", and now I'm bringing it up again, then 
it's clearly an ongoing problem that hasn't been resolved.

Craig

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [PERFORM] Overriding the optimizer

2005-12-15 Thread Craig A. James


Tom Lane wrote:

This discussion has been had before (many times) ... see the -hackers
archives for detailed arguments.  The one that carries the most weight
in my mind is that planner hints embedded in applications will not adapt
to changing circumstances --- the plan that was best when you designed
the code might not be best today.


Absolutely right.  But what am I supposed to do *today* if the planner makes a 
mistake?  Shut down my web site?

Ropes are useful, but you can hang yourself with them.  Knives are useful, but 
you can cut yourself with them.  Should we ban useful tools because they cause 
harm to the careless?

Craig

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [PERFORM] Overriding the optimizer

2005-12-15 Thread Craig A. James


Mark Kirkwood wrote:
I hear what you are saying, but to use this fine example - I don't know 
what the best plan is - these experiments part of an investigation to 
find *if* there is a better plan, and if so, why Postgres is not finding 
it.


There isn't a database in the world that is as smart as a developer, 
or that can have insight into things that only a developer can 
possibly know.


That is often true - but the aim is to get Postgres's optimizer closer 
to developer smartness.


What would be cool would be some way the developer could alter the plan, but they way of 
doing so would strongly encourage the developer to send the information to this mailing 
list.  Postgres would essentially say, "Ok, you can do that, but we want to know 
why!"

After years of using several other database products (some supporting 
hint type constructs and some not), I have come to believe that hinting 
(or similar) actually *hinders* the development of a great optimizer.


I agree.  It takes the pressure off the optimizer gurus.  If the users can just 
work around every problem, then the optimizer can suck and the system is still 
usable.

Lest anyone think I'm an all-out advocate of overriding the optimizer, I know from 
first-hand experience what a catastrophe it can be.  An Oracle hint I used worked fine on 
my test schema, but the customer's "table" turned out to be a view, and 
Oracle's optimizer worked well on the view whereas my hint was horrible.  Unfortunately, 
without the hint, Oracle sucked when working on an ordinary table.  Hints are dangerous, 
and I consider them a last resort.

Craig

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq

Re: [PERFORM] Overriding the optimizer

2005-12-15 Thread Craig A. James


Christopher Kings-Lynne wrote:
I don't necessarily disagree with your assertion that we need planner 
hints, but unless you or someone else is willing to submit a patch with 
the feature it's unlikely to ever be implemented...


Now that's an answer I understand and appreciate.  Open-source development 
relies on many volunteers, and I've benefitted from it since the early 1980's 
when emacs and Common Lisp first came to my attention.  I've even written a 
widely-circulated article about open-source development, which some of you may 
have read:

http://www.moonviewscientific.com/essays/software_lifecycle.htm

I hope nobody here thinks I'm critical of all the hard work that's been put into 
Postgres.  My hope is to raise the awareness of this issue in the hope that it's at least 
put on "the list" for serious consideration.

Craig


---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [PERFORM] Overriding the optimizer

2005-12-15 Thread Christopher Kings-Lynne


Boy, you picked a *really* bad example ;-)

The problem is that Postgres decided to filter on myfunc() *first*, and 
then filter on row_num, resulting in a query time that jumped from 
seconds to hours.  And there's no way for me to tell Postgres not to do 
that!


Can you paste explain analyze and your effective_cache_size, etc. settings.

So, "you still have no problem" is exactly wrong, because Postgres 
picked the wrong plan.  Postgres decided that applying myfunc() to 
10,000,000 rows was a better plan than an index scan of 50,000 
row_nums.  So I'm screwed.


This seems like a case where PostgreSQL's current optimiser should 
easily know what to do if your config settings are correct and you've 
been running ANALYZE, so I'd like to see your settings and the explain 
analyze plan...


Chris


---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [PERFORM] Overriding the optimizer

2005-12-15 Thread Mark Kirkwood

Craig A. James wrote:
I asked a while back if there were any plans to allow developers to 
override the optimizer's plan and force certain plans, and received a 
fairly resounding "No".  The general feeling I get is that a lot of work 
has gone into the optimizer, and by God we're going to use it!

I think this is just wrong, and I'm curious whether I'm alone in this 
opinion.

Over and over, I see questions posted to this mailing list about 
execution plans that don't work out well.  Many times there are good 
answers - add an index, refactor the design, etc. - that yield good 
results.  But, all too often the answer comes down to something like 
this recent one:

  > Right on. Some of these "coerced" plans may perform   > much better. 
If so, we can look at tweaking your runtime

  > config: e.g.
  >
  > effective_cache_size
  > random_page_cost
  > default_statistics_target
  >
  > to see if said plans can be chosen "naturally".

I see this over and over.  Tweak the parameters to "force" a certain 
plan, because there's no formal way for a developer to say, "I know the 
best plan."

I hear what you are saying, but to use this fine example - I don't know 
what the best plan is - these experiments part of an investigation to 
find *if* there is a better plan, and if so, why Postgres is not finding it.

There isn't a database in the world that is as smart as a developer, or 
that can have insight into things that only a developer can possibly 
know.

That is often true - but the aim is to get Postgres's optimizer closer 
to developer smartness.

After years of using several other database products (some supporting 
hint type constructs and some not), I have come to believe that hinting 
(or similar) actually *hinders* the development of a great optimizer.

Best wishes

Mark

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [PERFORM] Overriding the optimizer

2005-12-15 Thread Craig A. James




Christopher Kings-Lynne wrote:
  select * from my_table where row_num >= 5 and row_num < 
10

   and myfunc(foo, bar);



You just create an index on myfunc(foo, bar)



only if myfunc(foo, bar) is immutable...



And if it's not then the best any database can do is to index scan 
row_num - so still you have no problem.


Boy, you picked a *really* bad example ;-)

The problem is that Postgres decided to filter on myfunc() *first*, and then 
filter on row_num, resulting in a query time that jumped from seconds to hours. 
 And there's no way for me to tell Postgres not to do that!

So, "you still have no problem" is exactly wrong, because Postgres picked the 
wrong plan.  Postgres decided that applying myfunc() to 10,000,000 rows was a better plan 
than an index scan of 50,000 row_nums.  So I'm screwed.

Craig

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match

Re: [PERFORM] Overriding the optimizer

2005-12-15 Thread Christopher Kings-Lynne

  > Right on. Some of these "coerced" plans may perform   > much better. 
If so, we can look at tweaking your runtime

  > config: e.g.
  >
  > effective_cache_size
  > random_page_cost
  > default_statistics_target
  >
  > to see if said plans can be chosen "naturally".

I see this over and over.  Tweak the parameters to "force" a certain 
plan, because there's no formal way for a developer to say, "I know the 
best plan."


No, this is "fixing your wrongn, inaccurate parameters so that 
postgresql can choose a better plan".


I don't necessarily disagree with your assertion that we need planner 
hints, but unless you or someone else is willing to submit a patch with 
the feature it's unlikely to ever be implemented...


Chris


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match

Re: [PERFORM] Overriding the optimizer

2005-12-15 Thread Craig A. James


Christopher Kings-Lynne wrote:

   select * from my_table where row_num >= 5 and row_num < 10
and myfunc(foo, bar);



You just create an index on myfunc(foo, bar)


Thanks, but myfunc() takes parameters (shown here as "foo, bar"), one of which 
is not a column, it's external and changes with every query.  A function index won't work.

Craig

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org

Re: [PERFORM] Overriding the optimizer

2005-12-15 Thread Craig A. James


Tom,


I see this over and over.  Tweak the parameters to "force" a certain
plan, because there's no formal way for a developer to say, "I know
the best plan."


I think you've misunderstood those conversations entirely.  The point
is not to force the planner into a certain plan, it is to explore what's
going on with a view to understanding why the planner isn't making a
good choice, and thence hopefully improve the planner in future.


No, I understood the conversations very clearly.  But no matter how clever the 
optimizer, it simply can't compete with a developer who has knowledge that 
Postgres *can't* have.  The example of a user-written function is obvious.


There isn't a database in the world that is as smart as a developer,


People who are convinced they are smarter than the machine are often
wrong ;-). 


Often, but not always -- as I noted in my original posting.  And when the 
developer is smarter than Postgres, and Postgres makes the wrong choice, what 
is the developer supposed to do?  This isn't academic -- the wrong plans 
Postgres makes can be *catastrophic*, e.g. turning a 3-second query into a 
three-hour query.

How about this: Instead of arguing in the abstract, tell me in concrete terms 
how you would address the very specific example I gave, where myfunc() is a 
user-written function.  To make it a little more challenging, try this: 
myfunc() can behave very differently depending on the parameters, and sometimes 
(but not always), the application knows how it will behave and could suggest a 
good execution plan.

(And before anyone suggests that I rewrite myfunc(), I should explain that it's 
in the class of NP-complete problems.  The function is inherently hard and 
can't be made faster or more predictable.)

The example I raised in a previous thread, of irregular usage, is the same: I have a particular 
query that I *always* want to be fast even if it's only used rarely, but the system swaps its 
tables out of the file-system cache, based on "low usage", even though the "high 
usage" queries are low priority.  How can Postgres know such things when there's no way for me 
to tell it?

The answers from the Postgres community were essentially, "Postgres is smarter than 
you, let it do its job."  Unfortunately, this response completely ignores the 
reality: Postgres is NOT doing its job, and can't, because it doesn't have enough 
information.

Craig


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match

Re: [PERFORM] Overriding the optimizer

2005-12-15 Thread Christopher Kings-Lynne


  select * from my_table where row_num >= 5 and row_num < 10
   and myfunc(foo, bar);


You just create an index on myfunc(foo, bar)


only if myfunc(foo, bar) is immutable...


And if it's not then the best any database can do is to index scan 
row_num - so still you have no problem.


Chris


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org

Re: [PERFORM] Overriding the optimizer

2005-12-15 Thread Jaime Casanova

On 12/15/05, Christopher Kings-Lynne <[EMAIL PROTECTED]> wrote:
> >select * from my_table where row_num >= 5 and row_num < 10
> > and myfunc(foo, bar);
>
> You just create an index on myfunc(foo, bar)
>
> Chris
>

only if myfunc(foo, bar) is immutable...

--
regards,
Jaime Casanova
(DBA: DataBase Aniquilator ;)

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [PERFORM] Overriding the optimizer

2005-12-15 Thread Christopher Kings-Lynne


   select * from my_table where row_num >= 5 and row_num < 10
and myfunc(foo, bar);


You just create an index on myfunc(foo, bar)

Chris


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly

Re: [PERFORM] Overriding the optimizer

2005-12-15 Thread Tom Lane

"Craig A. James" <[EMAIL PROTECTED]> writes:
> I see this over and over.  Tweak the parameters to "force" a certain
> plan, because there's no formal way for a developer to say, "I know
> the best plan."

I think you've misunderstood those conversations entirely.  The point
is not to force the planner into a certain plan, it is to explore what's
going on with a view to understanding why the planner isn't making a
good choice, and thence hopefully improve the planner in future.  (Now,
that's not necessarily what the user with an immediate problem is
thinking, but that's definitely what the developers are thinking.)

> There isn't a database in the world that is as smart as a developer,

People who are convinced they are smarter than the machine are often
wrong ;-).  If we did put in the nontrivial amount of work needed to
have such a facility, it would probably get abused more often than it
was used correctly.  I'd rather spend the work on making the planner
better.

This discussion has been had before (many times) ... see the -hackers
archives for detailed arguments.  The one that carries the most weight
in my mind is that planner hints embedded in applications will not adapt
to changing circumstances --- the plan that was best when you designed
the code might not be best today.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

40 matches

Mail list logo