Omega359 commented on PR #22161:
URL: https://github.com/apache/datafusion/pull/22161#issuecomment-4546284431
The fix looks good. I wasn't able to find any other issues however codex did:
Incorrect pruning with volatile GROUP BY expressions in mod.rs.
can_remove_unused_unnest_for_exprs only checks whether required expressions
reference unnested output columns. For GROUP BY uuid() / GROUP BY random(),
there are no column refs, so the rule removes a row-multiplying UNNEST. That
changes the number of volatile evaluations and therefore the number of groups.
datafusion/sqllogictest/test_files/unnest.slt:
```
# Volatile GROUP BY expressions are evaluated once per input row, so
removing a
# row-multiplying UNNEST changes the number of groups even when elem is
unused.
query I
SELECT COUNT(*)
FROM (
SELECT 1
FROM (
SELECT id, UNNEST(make_array(1, 2, 3)) AS elem
FROM unused_unnest_pruning
)
GROUP BY uuid()
);
----
9
```
The second query is the failing repro. On the PR it returns 3, but it should
return 9 because UNNEST(make_array(1, 2, 3)) expands the 3 input rows to 9
rows, and uuid() should be evaluated per row before grouping.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]