> On Jun 29, 2026, at 15:20, Chao Li <[email protected]> wrote: > > Hi, > > While testing “[ef38a4d97] Add GROUP BY ALL”, I traced the code and found a > suspicious difference. In theory, GROUP BY ALL should behave the same as > spelling out all inferred grouping expressions, but I found that they go > through different code paths. > > After entering transformGroupClause(), if this is GROUP BY ALL, it > immediately enters a separate branch, loops over the target list, calls > addTargetToGroupList() for every TLE, and then returns. > > For non-ALL, it calls transformGroupClauseExpr() for every group clause. > Inside transformGroupClauseExpr(), there is logic that the ALL path misses: > ``` > /* > * If the GROUP BY tlist entry also appears in ORDER BY, copy operator > * info from the (first) matching ORDER BY item. This means that if > * you write something like "GROUP BY foo ORDER BY foo USING <<<", the > * GROUP BY operation silently takes on the equality semantics implied > * by the ORDER BY. There are two reasons to do this: it improves the > * odds that we can implement both GROUP BY and ORDER BY with a single > * sort step, and it allows the user to choose the equality semantics > * used by GROUP BY, should she be working with a datatype that has > * more than one equality operator. > * > * If we're in a grouping set, though, we force our requested ordering > * to be NULLS LAST, because if we have any hope of using a sorted agg > * for the job, we're going to be tacking on generated NULL values > * after the corresponding groups. If the user demands nulls first, > * another sort step is going to be inevitable, but that's the > * planner's problem. > */ > > foreach(sl, sortClause) > { > SortGroupClause *sc = (SortGroupClause *) lfirst(sl); > > if (sc->tleSortGroupRef == tle->ressortgroupref) > { > SortGroupClause *grpc = copyObject(sc); > > if (!toplevel) > grpc->nulls_first = false; > *flatresult = lappend(*flatresult, grpc); > found = true; > break; > } > } > ``` > > Based on this finding, I built a test using record_image_ops, where row(1.0) > and row(1.00) compare as distinct under record-image equality: > ``` > evantest=# CREATE TYPE t_rec AS (x numeric); > CREATE TYPE > evantest=# > evantest=# SELECT row(1.0)::t_rec OPERATOR(pg_catalog.*=) row(1.00)::t_rec; > ?column? > ---------- > f > (1 row) > ``` > > Here is the repro: > ``` > evantest=# create table t (a t_rec); > CREATE TABLE > evantest=# set enable_hashagg = 0; > SET > evantest=# insert into t values(row(1.0)::t_rec), (row(1.00)::t_rec); > INSERT 0 2 > evantest=# select a, count(a) from t group by a order by a using > operator(pg_catalog.*<); > a | count > --------+------- > (1.00) | 1 > (1.0) | 1 > (2 rows) > > evantest=# select a, count(a) from t group by all order by a using > operator(pg_catalog.*<); > a | count > -------+------- > (1.0) | 2 > (1 row) > ``` > > As we can see, "GROUP BY a" distinguishes the two rows because it uses the > equality semantics implied by ORDER BY ... USING, but "GROUP BY ALL" groups > the two rows together because it uses the default grouping semantics instead. > > The fix mostly refactors the existing logic so the GROUP BY ALL path also > handles the ORDER BY sort clause. See the attached patch for details. >
Oops! Once again, forgot attachment. Best regards, -- Chao Li (Evan) HighGo Software Co., Ltd. https://www.highgo.com/
v1-0001-Fix-GROUP-BY-ALL-handling-of-ORDER-BY-operator-se.patch
Description: Binary data
