[HACKERS] Common Table Expressions applied; some issues remain

Tom Lane Sat, 04 Oct 2008 15:23:02 -0700

I've applied the latest version of the CTE patch.  Congratulations on
making that happen!


There are still some loose ends that need to be considered, though.

1. As committed, the patch takes an extremely hard line about WITH
queries being evaluated independently of the main query and only once
per main query execution.  This could be seen as a good thing --- it
provides much more determinism for execution of volatile functions
within complex queries than was really available in the past.  It could
also be seen as a bad thing --- in particular, we won't push any
limiting qualifications from the main query into the WITH queries.
So for instance

        WITH q AS ( SELECT * FROM foo )
        SELECT * FROM q WHERE key = 42;

is going to be executed quite inefficiently; it won't use an index on
foo.key.  I think we don't have much choice about this in the case of
recursive WITH queries: it would be pretty difficult to determine
whether pushing a restriction into a recursive WITH would change the
results incorrectly.  However, for plain non-recursive WITHs it's all
a matter of definition.  I gather from
http://www.oracle-developer.net/display.php?id=212
that Oracle chooses to treat WITH-queries as if they were plain
sub-selects if they're non-recursive and only referenced once.
That is, Oracle would rewrite the above into

        SELECT * FROM ( SELECT * FROM foo ) AS q WHERE key = 42;

and then flatten the sub-select and optimize normally.  It would
not be hard to make Postgres do the same, but then we would lose
some guarantees about predictable execution of volatile functions.

I'm inclined to think that there is no reason to provide two
different syntaxes to do the same thing, and so having the WITH
syntax behave like this is okay.  But it could well result in
performance surprises for people who are used to Oracle.

Any thoughts on what to do?  One possibility is to flatten only
if the subquery doesn't contain any volatile functions.


2. The patch didn't touch the implicit-RTE code, which means that

        WITH q AS ( SELECT ... )
        SELECT q.*

will fail even if you've got add_missing_from enabled.  I'm inclined
to think that this violates the principle of least surprise.  On
the other hand, add_missing_from is certainly a legacy thing and maybe
we shouldn't bother expending any extra code to make it work with
new features.  Thoughts?


3. ruleutils.c's get_name_for_var_field() hasn't implemented the
RTE_CTE case, which means that it doesn't work to reverse-list
examples like this:

explain verbose with qq as (select x from (values(1,2),(3,4)) as x(c1,c2))
select * from (select (x).c2 from qq offset 0) ss;

The reason I let this go is that while poking into it I found out that 
get_name_for_var_field is pretty broken already; this fails in HEAD:

explain verbose select (x).c2 from
 (select x from (values(1,2),(3,4)) as x(c1,c2) offset 0) ss ;

and this fails even in the back branches:

explain select * from 
 (select x from (values(1,2),(3,4)) as x(c1,c2) offset 0) ss
 where (x).c2 > 0;

It seems we need some redesign in and around EXPLAIN to make that work
nicely, so I figured it would be reasonable to tackle that stuff as a
separate patch.

                        regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Common Table Expressions applied; some issues remain

Reply via email to