Josh Berkus wrote:
Mark,


This would only seem to work for trivial functions.  Most functions that
I write are themselves dependent on underlying tables, and without any
idea how many rows are in the tables, and without any idea of the
statistical distribution of those rows, I can't really say anything like
"average rows returned = 5".

What I have wanted for some time is a function pairing system.  For each
set returning function F() I create, I would have the option of creating
a statistics function S() which returns a single integer which
represents the guess of how many rows will be returned.  S() would be
called by the planner, and the return value of S() would be used to
decide the plan.  S() would need access to the table statistics
information.  I imagine that the system would want to prevent S() from
running queries, and only allow it to call certain defined table
statistics functions and some internal math functions, thereby avoiding
any infinite recursion in the planner.  (If S() ran any queries, those
queries would go yet again to the planner, and on down the infinite
recursion you might go.)

Of course, some (possibly most) people could chose not to write an S()
for their F(), and the default of 1000 rows would continue to be used. As such, this new extension to the system would be backwards compatible
to functions which don't have an S() defined.


I think this is a fine idea, and I think I endorsed it the first time. However, even a static "function returns #" would be better than what we have now, and I think the S() method could take quite a bit of engineering to work out (for example, what if F() is being called in a JOIN or correlated subquery?). So I'm worried that shooting for the S() idea only could result in us not doing *anything* for several more versions.

What I'd like to do is implement the constant method for 8.2, and work on doing the S() method later on. Does that make sense?


I have no objections to implementing the constant method sooner than the full version. It might be useful to implement it as a subsyntax to the full version though in preparation for later expansion. For instance, if there is a plstats language developed, you could limit the parser for it to just functions like:

CREATE FUNCTION S() RETURNS INTEGER PROVIDES FOR F() AS $$
        RETURN 5;
$$ LANGUAGE PLSTATS;

Then the language could be expanded later to allow calls to the table statistics functions.

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

              http://archives.postgresql.org

Reply via email to