kosiew opened a new pull request, #22074: URL: https://github.com/apache/datafusion/pull/22074

## Which issue does this PR close?

* Part of #20118

## Rationale for this change

This PR adds test-only coverage documenting the current optimization gap around unused `UNNEST` outputs.

The new tests capture a case where the unnested column becomes duplicate-insensitive under a `GROUP BY`, while also documenting counterexamples where removing `UNNEST` would incorrectly change row cardinality or null/empty-array semantics. These tests are intended to guide future optimizer work without changing current behavior.

## What changes are included in this PR?

* Added a regression dataset in `sqllogictest/test_files/unnest.slt`.
* Added a reproducer showing an unused `UNNEST` output under `GROUP BY`.
* Added `EXPLAIN` assertions documenting that the current logical and physical plans still contain `Unnest` / `UnnestExec`.
* Added counterexamples demonstrating cases where removing `UNNEST` would change result cardinality.
* Added coverage for empty and `NULL` array semantics to document current select-list `UNNEST` behavior.
* Added cleanup for the temporary test table.

## Are these changes tested?

Yes. This PR adds SQL logic tests in `datafusion/sqllogictest/test_files/unnest.slt`, including:

* A reproducer for unused `UNNEST` output below `GROUP BY`
* `EXPLAIN` plan assertions for `Unnest` and `UnnestExec`
* Cardinality-sensitive counterexamples
* Empty/NULL array semantic coverage

## Are there any user-facing changes?

No. This PR only adds tests and documentation of current behavior; it does not change optimizer behavior or query semantics.

## LLM-generated code disclosure

This PR includes LLM-generated code and comments. All LLM-generated content has been manually reviewed and tested.
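
## Illustrative sketch of the semantics

The distinction drawn in the rationale (an unused `UNNEST` output under `GROUP BY` versus the cardinality and empty/`NULL` counterexamples) can be modelled outside DataFusion. The following is a minimal Python sketch, not code from this PR or from the optimizer. The sample rows and the helper names `unnest`, `group_keys`, and `count_per_key` are hypothetical, and the sketch assumes that select-list `UNNEST` emits no rows for empty or `NULL` arrays.

```python
from collections import Counter

# Hypothetical sample data: (group key, array column). None models SQL NULL.
ROWS = [
    ("a", [1, 2]),
    ("a", [3]),
    ("b", [4, 5, 6]),
    ("c", []),    # empty array
    ("d", None),  # NULL array
]


def unnest(rows):
    """Model select-list UNNEST: one output row per array element.

    Assumption: rows whose array is empty or NULL produce no output rows.
    """
    return [(key, elem) for key, arr in rows for elem in (arr or [])]


def group_keys(rows):
    """Model `SELECT k ... GROUP BY k`: the distinct keys, sorted."""
    return sorted({row[0] for row in rows})


def count_per_key(rows):
    """Model `SELECT k, count(*) ... GROUP BY k`."""
    return dict(Counter(row[0] for row in rows))


# Rows whose array has at least one element.
non_empty = [row for row in ROWS if row[1]]

# Duplicate-insensitive case: only the distinct keys are observed, and every
# array is non-empty, so the unnested rows add nothing to the result.
same_keys = group_keys(unnest(non_empty)) == group_keys(non_empty)

# Counterexample 1: count(*) observes row cardinality, so UNNEST matters.
counts_with = count_per_key(unnest(non_empty))
counts_without = count_per_key(non_empty)

# Counterexample 2: empty/NULL arrays drop their rows (and their keys).
keys_with = group_keys(unnest(ROWS))
keys_without = group_keys(ROWS)
```

In this model, removing `unnest` is only safe when the aggregate ignores duplicates and no input array is empty or `NULL`; the two counterexamples show the result changing as soon as either condition fails.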
