"David G. Johnston" <david.g.johns...@gmail.com> writes: > Something does seem off here with the interaction between grouping sets and > order by.
Yeah. I think Geoff is correct to identify the use of subqueries in the grouping sets as the triggering factor. We can get some insight by explicitly printing the ordering values: SELECT GROUPING(test1.n) AS gp_n, GROUPING(concat(test1.n, (SELECT x FROM test2 WHERE seq=test1.seq))) AS gp_conc, test1.n, CONCAT(test1.n, (SELECT x FROM test2 WHERE seq=test1.seq)), CASE WHEN GROUPING(test1.n)=0 THEN test1.n ELSE NULL END as o1, CASE WHEN GROUPING(concat(test1.n, (SELECT x FROM test2 WHERE seq=test1.seq)))=0 THEN concat(test1.n, (SELECT x FROM test2 WHERE seq=test1.seq)) ELSE NULL END as o2 FROM test1 GROUP BY GROUPING SETS( (test1.n), (concat(test1.n, (SELECT x FROM test2 WHERE seq=test1.seq))) ) ORDER BY o1 NULLS FIRST, o2 NULLS FIRST; which produces gp_n | gp_conc | n | concat | o1 | o2 ------+---------+----+--------+----+---- 1 | 0 | | n5x1 | | x1 1 | 0 | | n4x2 | | x2 1 | 0 | | n3x3 | | x3 1 | 0 | | n2x4 | | x4 1 | 0 | | n1x5 | | x5 0 | 1 | n1 | | n1 | 0 | 1 | n2 | | n2 | 0 | 1 | n3 | | n3 | 0 | 1 | n4 | | n4 | 0 | 1 | n5 | | n5 | (10 rows) Those columns appear correctly sorted, so it's not the sort that is misbehaving. But how come the values don't match the "concat" column where they should? EXPLAIN VERBOSE gives a further clue: QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Sort (cost=53622.76..53623.76 rows=400 width=136) Output: (GROUPING(test1.n)), (GROUPING(concat(test1.n, (SubPlan 1)))), test1.n, (concat(test1.n, (SubPlan 2))), (CASE WHEN (GROUPING(test1.n) = 0) THEN test1.n ELSE NULL::text END), (CASE WHEN (GROUPING(concat(test1.n, (SubPlan 3))) = 0) THEN concat(test1.n, (SubPlan 4)) ELSE NULL::text END) Sort Key: (CASE WHEN (GROUPING(test1.n) = 0) THEN test1.n ELSE NULL::text END) NULLS FIRST, (CASE WHEN (GROUPING(concat(test1.n, (SubPlan 3))) = 0) THEN concat(test1.n, (SubPlan 4)) ELSE NULL::text END) NULLS FIRST -> HashAggregate (cost=32890.30..53605.48 rows=400 width=136) Output: GROUPING(test1.n), GROUPING(concat(test1.n, (SubPlan 1))), test1.n, (concat(test1.n, (SubPlan 2))), CASE WHEN (GROUPING(test1.n) = 0) THEN test1.n ELSE NULL::text END, CASE WHEN (GROUPING(concat(test1.n, (SubPlan 3))) = 0) THEN concat(test1.n, (SubPlan 4)) ELSE NULL::text END Hash Key: test1.n Hash Key: concat(test1.n, (SubPlan 2)) -> Seq Scan on public.test1 (cost=0.00..32887.12 rows=1270 width=68) Output: test1.n, concat(test1.n, (SubPlan 2)), test1.seq SubPlan 2 -> Seq Scan on public.test2 (cost=0.00..25.88 rows=6 width=32) Output: test2.x Filter: (test2.seq = test1.seq) SubPlan 4 -> Seq Scan on public.test2 test2_1 (cost=0.00..25.88 rows=6 width=32) Output: test2_1.x Filter: (test2_1.seq = test1.seq) (17 rows) We have ended up with four distinct SubPlans (two of which seem to have gotten dropped because they are inside GROUPING functions, which never really evaluate their arguments). What I think happened here is that the parser let the concat() expressions in the targetlist and ORDER BY through because they were syntactically identical to GROUPING SET expresions --- but later on, the planner expanded each of the sub-selects to a distinct SubPlan, and that meant that those subexpressions were no longer identical, and so most of them didn't get converted to references to the grouping key column output by the HashAggregate node. Because they didn't get converted, they fail to do the right thing in rows where they should go to NULL because they're from the wrong grouping set. The "concat" targetlist element did get converted, so it behaves correctly, and the GROUPING functions don't actually care because they have a different method for determining what they should output. But you get the wrong answer for the concat() inside the "o2" expression: it gets evaluated afresh using the nulled value of test1.n. I think this particular symptom might be new, but we've definitely seen related trouble reports before. I'm inclined to think that the right fix will require making the parser actually replace such expressions with Vars referencing a notional grouping output relation, so that there's not multiple instances of the sub-query in the parser output in the first place. That's a fairly big job and nobody's tackled it yet. In the meantime, what I'd suggest as a workaround is to put those subexpressions into a sub-select with an optimization fence (you could use OFFSET 0 or a materialized CTE), so that the grouping sets list in the outer query just has simple Vars as elements. [Digression: the SQL spec *requires* grouping set elements to be simple Vars, and I begin to see why when contemplating examples like this. It's a little weird that "concat(test1.n, ...)" can evaluate with a non-null value of test1.n in a row where test1.n alone would evaluate as null. However, we've dug this hole for ourselves and now we have to deal with the consequences.] regards, tom lane