Re: [HACKERS] Gsets: ROW expression semantic broken between 9.4 and 9.5
> "Jeevan" == Jeevan Chalke writes: Jeevan> Hi Jeevan> It looks like we have broken the ROW expression without Jeevan> explicit ROW keyword in GROUP BY. Andres has given the short version, but here's the long version: In the spec, GROUP BY ROW(a,b) is an error, while GROUP BY (a,b) is exactly equivalent to GROUP BY a,b. Previously, pg treated GROUP BY (a,b) as if it were GROUP BY ROW(a,b) since it was parsing it as an expression, and (a,b) in an expression is shorthand for ROW(a,b). However, the parens are significant in many contexts in the grouping set syntax, e.g. ROLLUP(a,(b,c)) is equivalent to GROUPING SETS ((a,b,c), (a), ()), and we have to be able to parse both GROUPING SETS (a,b) (which is two grouping sets) and GROUPING SETS ((a,b),(c,d)), which means that we can't actually use the grammar to distinguish expressions from parenthesized sublists. What the code therefore does is to explicitly distinguish (a,b) and ROW(a,b), and treat the first as a list and the second as a single expression. This is documented in the following NOTE in queries.sgml: The construct (a,b) is normally recognized in expressions as a row constructor. Within the GROUP BY clause, this does not apply at the top levels of expressions, and (a,b) is parsed as a list of expressions as described above. If for some reason you need a row constructor in a grouping expression, use ROW(a,b). http://www.postgresql.org/docs/9.5/static/queries-table-expressions.html#QUERIES-GROUPING-SETS Andres has suggested that this should be mentioned in an incompatibility note in the release notes. I'm not sure that's needed, since I don't believe there are any cases where previously valid queries change in behavior; a query such as select (a,b) from (values (1,2)) v(a,b) group by (a,b); previously evaluated the row constructor before grouping, while now it groups by a and b separately and evaluates the row constructor afterwards. If there's a way to make this change affect the result, I've not found it yet, even when using volatile functions. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Gsets: ROW expression semantic broken between 9.4 and 9.5
Hi, On 2015-07-22 23:53:26 +0530, Jeevan Chalke wrote: > It looks like we have broken the ROW expression without explicit > ROW keyword in GROUP BY. That was "intentional", and is actually standards required behaviour. GROUP BY (a, b) is the same as GROUP BY a,b. It'd otherwise be pretty confusing because parens in GROUPING SETS would mean something different than in GROUP BY. We should probably add a release note entry about it... Thanks for testing gsets! - Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Gsets: ROW expression semantic broken between 9.4 and 9.5
Hi It looks like we have broken the ROW expression without explicit ROW keyword in GROUP BY. I mean, after Grouping sets merge, if we have (c1, c2) in group by, we are treating it as ROW expression for grouping, but at the same time we are allowing individual column in the target list. However this was not true with PG9.4 where we error out saying "column "c1" must appear in the GROUP BY clause..". But if I use explicit ROW keyword, like ROW(c1, c2), then on PG95 it error outs for individual column reference in select list. Example may clear more: ON PG 9.5 (after grouping sets implementation) postgres=# create view gstest1(a,b,v) postgres-# as values (1,1,10),(1,1,11),(1,2,12),(1,2,13),(1,3,14), postgres-# (2,3,15), postgres-# (3,3,16),(3,4,17), postgres-# (4,1,18),(4,1,19); CREATE VIEW postgres=# postgres=# SELECT a, b, max(v) FROM gstest1 GROUP BY (a, b) ORDER BY 1, 2, 3 DESC; a | b | max ---+---+- 1 | 1 | 11 1 | 2 | 13 1 | 3 | 14 2 | 3 | 15 3 | 3 | 16 3 | 4 | 17 4 | 1 | 19 (7 rows) postgres=# SELECT a, b, max(v) FROM gstest1 GROUP BY ROW(a, b) ORDER BY 1, 2, 3 DESC; ERROR: column "gstest1.a" must appear in the GROUP BY clause or be used in an aggregate function LINE 1: SELECT a, b, max(v) FROM gstest1 GROUP BY ROW(a, b) ORDER BY... ^ In above example, you see that when we have only (a, b), it is working fine. But when we have ROW(a, b), it is throwing an error. On PG 9.4 both cases are failing. Here it is: postgres=# SELECT a, b, max(v) FROM gstest1 GROUP BY (a, b) ORDER BY 1, 2, 3 DESC; ERROR: column "gstest1.a" must appear in the GROUP BY clause or be used in an aggregate function LINE 1: SELECT a, b, max(v) FROM gstest1 GROUP BY (a, b) ORDER BY 1,... ^ postgres=# SELECT a, b, max(v) FROM gstest1 GROUP BY ROW(a, b) ORDER BY 1, 2, 3 DESC; ERROR: column "gstest1.a" must appear in the GROUP BY clause or be used in an aggregate function LINE 1: SELECT a, b, max(v) FROM gstest1 GROUP BY ROW(a, b) ORDER BY... ^ Do we broke ROW expression semantics in grouping sets implementation? Any idea why is this happening? Thanks -- Jeevan B Chalke Principal Software Engineer, Product Development EnterpriseDB Corporation The Enterprise PostgreSQL Company