On Wed, Jul 17, 2024 at 8:50 AM Paul George <p.a.georg...@gmail.com> wrote: > > Since a subquery is a volatile expression, each of its instances > should be evaluated separately.
I don't think this conclusion is correct. Look at: select random(), random() from t group by random(); random | random --------------------+-------------------- 0.7972330769936766 | 0.7972330769936766 (1 row) > This seems like a valid point, though "query 2" below which groups over a > RANDOM() column and outputs an additional RANDOM() column a potential, albeit > contrived, counter-example? [NOTE: this was done on Postgres 16.3] I've > included a few different combinations of GROUP BYs. Interesting. I looked into the scenarios with multiple instances of the same volatile grouping expressions and here is what I observed. create table t (a int, b int); insert into t select 1,1; -- on master, with plain volatile functions select random() as c1, random() as c2, random() as c3 from t t1 group by c1; c1 | c2 | c3 -------------------+-------------------+------------------- 0.567478050404431 | 0.567478050404431 | 0.567478050404431 (1 row) So the random() function is evaluated only once, even though it appears three times. -- on master, with subqueries that are 'volatile' select (select random() from t t2 where a = t1.a) as c1, (select random() from t t2 where a = t1.a) as c2, (select random() from t t2 where a = t1.a) as c3 from t t1 group by c1; c1 | c2 | c3 --------------------+--------------------+-------------------- 0.8420177313766823 | 0.2969648209746336 | 0.3499675329093421 (1 row) So on master the subquery is evaluated three times. Why isn't this consistent with the behavior of the first query? -- on patched, with subqueries that are 'volatile' select (select random() from t t2 where a = t1.a) as c1, (select random() from t t2 where a = t1.a) as c2, (select random() from t t2 where a = t1.a) as c3 from t t1 group by c1; c1 | c2 | c3 --------------------+--------------------+-------------------- 0.5203586066423254 | 0.5203586066423254 | 0.5203586066423254 (1 row) So on patched the subquery is evaluated only once, which is consistent with the behavior of the first query. Does this suggest that the patched version is more 'correct' for this case? Now let's look at the scenario with two grouping keys. -- on master, with plain volatile functions select random() as c1, random() as c2, random() as c3 from t t1 group by c1, c2; c1 | c2 | c3 --------------------+--------------------+-------------------- 0.9388558105069595 | 0.2900389441597979 | 0.9388558105069595 (1 row) So the first two random() functions are evaluated independently, and the third random() function references the result of the first one. -- on master, with subqueries that are 'volatile' select (select random() from t t2 where a = t1.a) as c1, (select random() from t t2 where a = t1.a) as c2, (select random() from t t2 where a = t1.a) as c3 from t t1 group by c1, c2; c1 | c2 | c3 ---------------------+--------------------+-------------------- 0.46275163300894073 | 0.5083760995112951 | 0.6752682696191123 (1 row) So on master the subquery is evaluated three times. -- on patched, with subqueries that are 'volatile' select (select random() from t t2 where a = t1.a) as c1, (select random() from t t2 where a = t1.a) as c2, (select random() from t t2 where a = t1.a) as c3 from t t1 group by c1, c2; c1 | c2 | c3 --------------------+--------------------+-------------------- 0.9887848690744176 | 0.9887848690744176 | 0.9887848690744176 (1 row) So on patched the subquery is evaluated only once. It seems that in this scenario, neither the master nor the patched version handles volatile subqueries in grouping expressions the same way as it handles plain volatile functions. I am confused. Does the SQL standard explicitly define or standardize the behavior of grouping by volatile expressions? Does anyone know about that? Thanks Richard