I have just noticed that 7.3 and CVS tip reject a query that was
accepted in earlier releases:

regression=# create table t1(f1 int, f2 int);
CREATE TABLE
regression=# create table t2(f2 int, f3 int);
CREATE TABLE
regression=# select t1.f1 from t1 join t2 on (t1.f2=t2.f2) group by f1;
ERROR:  Attribute t1.f1 must be GROUPed or used in an aggregate function

The difficulty here is that in 7.3, the unqualified name "f1" is treated
as a Var of the unnamed JOIN relation, which is not equal() to "t1.f1".

It's possible that we could avoid associating Vars with the JOIN
relation in the case of JOINs that have no alias, but I'm not convinced
this would work; it took a couple of iterations to get to the present
representation of JOIN variables and I no longer recall all the reasons
why simpler representations failed :-(

What I'm thinking of doing is applying flatten_join_alias_vars() in
parse_agg.c before checking for invalid grouping --- that replaces "f1"
with "t1.f1" so that the above works.

But first, I thought I'd ask if anyone thinks the above error message is
correct behavior?  The SQL spec seems quite unspecific about this ---
it says that the SELECT-list item must be "a reference to a grouping
column", but I couldn't find anywhere that specifically said whether
"f1" and "t1.f1" are to be considered the same column in this scenario.

There are related cases in which they are definitely not the same thing,
eg
        select ... from t1 right join t2 using (f1)
"f1" will never be NULL in the result of this FROM clause, but "t1.f1"
could be NULL.  ("f1" would really be "t2.f1" here.)

In the plain inner-join case, there isn't any semantic difference
between "f1" and "t1.f1", but does that mean we should treat them
as the same grouping column?  I'm not sure.  Anyone want to try these
cases on other DBMSes?

                        regards, tom lane


---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Reply via email to