Re: [HACKERS] query optimization with UDFs

2006-10-14 Thread Markus Schaber
Hi, Tom,

Tom Lane wrote:
 Neil Conway [EMAIL PROTECTED] writes:
 BTW, I think it would make sense to implement a limited subset of the
 xfunc ideas: add options to CREATE FUNCTION to allow cost information to
 be specified, and then take advantage of this information instead of
 using the existing constant kludges. This would be a tangible
 improvement, and would have minimal impact on the planner.
 
 The trick is to figure out what a useful parameterized cost model would
 look like.  IIRC, the main reason the xfunc code rotted on the vine was
 that its cost parameters didn't seem to be either easy to select or
 powerful in predicting actual cost.  We'd have to do better this time.

I don't know what the xfunc people did, but at least for some
variable-length data types (arrays, PostGIS geometries, text), some
function costs (concatenation, GeomUnion, etc.) can be estimated from
the average field size of the tables involved.

Has that idea been considered?
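
For illustration, the per-column statistics that ANALYZE already gathers
expose such an average width, so an estimate of this kind would not need
any new bookkeeping. (The table and column names below are just
placeholders for the sketch.)

    -- avg_width is the average stored size, in bytes, of the column's values;
    -- 'roads' and 'geom' are hypothetical names used only for this example.
    SELECT tablename, attname, avg_width
      FROM pg_stats
     WHERE tablename = 'roads'
       AND attname = 'geom';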

HTH,
Markus

-- 
Markus Schaber | Logical Tracking & Tracing International AG
Dipl. Inf. | Software Development GIS

Fight against software patents in Europe! www.ffii.org
www.nosoftwarepatents.org





Re: [HACKERS] query optimization with UDFs

2006-10-14 Thread Tom Lane
Markus Schaber [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 The trick is to figure out what a useful parameterized cost model would
 look like.  IIRC, the main reason the xfunc code rotted on the vine was
 that its cost parameters didn't seem to be either easy to select or
 powerful in predicting actual cost.  We'd have to do better this time.

 I don't know what the xfunc people did, but at least for some
 variable-length data types (arrays, PostGIS geometries, text), some
 function costs (concatenation, GeomUnion, etc.) can be estimated from
 the average field size of the tables involved.

 Has that idea been considered?

[ shrug... ]  Concatenation is definitely not one of the functions we
need to worry about.  In fact, I'd say that only functions containing
database accesses are really worth worrying about --- and the hard
part of modeling them is the possible variation in the number of rows
touched depending on their parameters.
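
To make that concrete, here is a sketch of the kind of function that is
hard to cost (the orders table and customer_id column are invented for
the example): the work done per call depends entirely on which customer
id is passed in.

    -- Cost varies with the parameter: a customer with ten orders is cheap,
    -- one with ten million is not, and the planner cannot tell which.
    CREATE FUNCTION orders_for_customer(cust integer) RETURNS bigint AS
        'SELECT count(*) FROM orders WHERE customer_id = $1'
    LANGUAGE sql STABLE;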

regards, tom lane



Re: [HACKERS] query optimization with UDFs

2006-10-10 Thread Tom Lane
Neil Conway [EMAIL PROTECTED] writes:
 BTW, I think it would make sense to implement a limited subset of the
 xfunc ideas: add options to CREATE FUNCTION to allow cost information to
 be specified, and then take advantage of this information instead of
 using the existing constant kludges. This would be a tangible
 improvement, and would have minimal impact on the planner.

The trick is to figure out what a useful parameterized cost model would
look like.  IIRC, the main reason the xfunc code rotted on the vine was
that its cost parameters didn't seem to be either easy to select or
powerful in predicting actual cost.  We'd have to do better this time.

regards, tom lane



[HACKERS] query optimization with UDFs

2006-10-09 Thread jungmin shin
Hello Everybody,

I have a question about the optimization of queries that include UDFs.
Does anybody know what Postgres does to optimize queries that contain UDFs?
Does the Postgres query optimizer do anything special with UDFs?

Thanks,
Jungmin Shin 


Re: [HACKERS] query optimization with UDFs

2006-10-09 Thread Neil Conway
On Mon, 2006-10-09 at 22:49 -0400, jungmin shin wrote:
 Does anybody know what Postgres does to optimize queries that contain
 UDFs?

The optimizer considers function volatility to avoid reevaluating UDFs
needlessly, and to use index scans on predicates involving a function.
Also, functions defined in the SQL language will be inlined into the
function call site, when possible. That's all that comes to mind at the
moment...
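
As a rough illustration (the function and table names are made up): a
simple SQL-language function marked IMMUTABLE can be inlined into the
calling query, and an expression index on it lets the planner use an
index scan for predicates on the function's result.

    -- A trivial SQL function; being IMMUTABLE and written in SQL, its body
    -- can be inlined into the calling query.
    CREATE FUNCTION normalized_name(t text) RETURNS text AS
        'SELECT lower(trim($1))'
    LANGUAGE sql IMMUTABLE;

    -- An expression index on the function allows an index scan for
    -- predicates like the one below ('people' is a hypothetical table).
    CREATE INDEX people_norm_name_idx ON people (normalized_name(name));
    SELECT * FROM people WHERE normalized_name(name) = 'shin';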

Notably, the optimizer doesn't have support for realistic costing of
UDFs: it can't tell how expensive evaluating a UDF is going to be, nor
the number and distribution of the rows that will be produced by a
set-returning function. The Berkeley release of Postgres supported
expensive function optimization (a.k.a. xfunc), which was an
implementation of Hellerstein and Stonebraker's work on Predicate
Migration[1]. That code is no longer in the current Postgres source.

BTW, I think it would make sense to implement a limited subset of the
xfunc ideas: add options to CREATE FUNCTION to allow cost information to
be specified, and then take advantage of this information instead of
using the existing constant kludges. This would be a tangible
improvement, and would have minimal impact on the planner. A further
improvement would be predicate migration: the xfunc code modified the
optimizer to consider pulling predicates up above joins in the query
plan. The traditional assumption is that all predicates are cheap, so
they are always pushed down in the hopes that this will reduce the size
of intermediate result sets. Naturally, this is not always wise with
expensive UDFs. After the basics have been done, it might be interesting
to implement this, provided it can be done without increasing the
planner's search space too much.
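
Purely as a sketch of the first idea (none of these options exist in
CREATE FUNCTION as of this writing, and the function and table names are
made up for illustration), the annotations might look something like this:

    -- Hypothetical syntax: COST would give an estimated per-call execution
    -- cost, ROWS an expected result-set size for a set-returning function.
    CREATE FUNCTION matching_ids(pattern text) RETURNS SETOF integer AS
        'SELECT id FROM documents WHERE body ~ $1'
    LANGUAGE sql STABLE
    COST 2500
    ROWS 100;

With numbers like these available, the planner could decide where in the
plan to evaluate the function instead of assuming every qual is cheap.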

-Neil

[1] http://db.cs.berkeley.edu/jmh/miscpapers/sigmod93.pdf 

