On 12/06/2013 03:33 PM, Kevin Grittner wrote:
> Antonin Houska <[email protected]> wrote:
>
>> SELECT *
>> FROM tab1 a
>> LEFT JOIN
>> tab2 b
>> ON a.i = ANY (
>> SELECT k
>> FROM tab3 c
>> WHERE k = a.i);
>
> This query works with k in any or all tables, but the semantics
> certainly vary depending on where k happens to be. It would help a
> lot if you showed SQL statements to create and populate the tables
> involved and/or if you qualified all referenced column names with
> the table alias to avoid ambiguity.
I used the DDLs attached (tables.ddl) for this query too, not only for
the queries in quaries.sql. Yes, if I had mentioned it and/or qualified
the 'k' column reference, it wouldn't have broken anything.
> If I assume that the k reference is supposed to be a column in
> tab3, what you have is a query where you always get all rows from
> tab1, and for each row from tab1 you either match it to all rows
> from tab2 or no rows from tab2 depending on whether the tab1 row
> has a match in tab3.
I concede this particular query is not useful. But the important thing
to consider here is which side of the LEFT JOIN the subquery references.
>> SELECT *
>> FROM tab1 a
>> LEFT JOIN
>> (
>> SELECT *
>> tab2 b
>> SEMI JOIN
>> ( SELECT k
>> FROM tab3 c
>> WHERE k = a.i
>> ) AS ANY_subquery
>> ON a.i = ANY_subquery.k
>> ) AS SJ_subquery
>> ON true;
>
> It is hard to see what you intend here, since this is not valid
> syntax.
This is what I - after having read the related source code - imagine to
happen internally when the ANY predicate of the first query is being
processed. In fact it should become something like this (also internal
stuff)
SELECT *
FROM tab1 a
LEFT JOIN
(
tab2 b
SEMI JOIN
( SELECT k
FROM tab3 c
WHERE k = a.i
) AS ANY_subquery
ON a.i = ANY_subquery.k
)
ON true;
that is, SEMI JOIN node inserted into the tree rather than a subquery
(SJ_subquery). I posted the construct with SJ_subquery to show how I
thought about the problem: I thought it's safe (even though not
necessarily beautiful) to wrap the SEMI JOIN into the SJ_subquery and
let the existing infrastructure decide whether it's legal to turn it
into a join node. I concluded that the subquery's references to the tab1
ensure that SJ_subquery won't be flattened, so the patch does nothing if
such a reference exists.
> PostgreSQL supports semi-joins; but that is an implementation detail
> for the EXISTS or IN syntax.
... and for ANY, see subselect.c:convert_ANY_sublink_to_join()
> Could you clarify your intent?
To get rid of a subplan in some cases that require it so far: when the
subquery references table exactly 1 level higher (i.e. the immediate
parent query).
(I got the idea while reading the source code, as opposed to query
tuning.)
// Antonin Houska (Tony)
> --
> Kevin Grittner
> EDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
--
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers