kosiew opened a new pull request, #22074:
URL: https://github.com/apache/datafusion/pull/22074

   ## Which issue does this PR close?
   
   * Part of #20118
   
   ## Rationale for this change
   
   This PR adds test-only coverage documenting the current optimization gap 
around unused `UNNEST` outputs.
   
   The new tests capture a case where the unnested column becomes 
duplicate-insensitive under a `GROUP BY`, while also documenting 
counterexamples where removing `UNNEST` would incorrectly change row 
cardinality or null/empty-array semantics. These tests are intended to guide 
future optimizer work without changing current behavior.
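
The distinction between the duplicate-insensitive case and the counterexamples can be sketched with a small Python model. This is an illustration only, not DataFusion code or the PR's test data: the rows and the helper names (`unnest`, `group_keys`, `count_per_key`) are hypothetical, and the model assumes that select-list `UNNEST` emits no rows for empty or `NULL` arrays.

```python
from typing import Optional

# Hypothetical rows of (group key, array column); not the PR's dataset.
Row = tuple[str, Optional[list[int]]]

ROWS: list[Row] = [
    ("a", [1, 2, 2]),
    ("b", []),
    ("c", None),
]


def unnest(rows):
    """Model select-list UNNEST: one output row per array element.

    Assumption: empty and NULL arrays contribute zero output rows.
    """
    return [(key, elem) for key, arr in rows for elem in (arr or [])]


def group_keys(rows):
    """Model `SELECT key ... GROUP BY key`; duplicates cannot affect the result."""
    return sorted({row[0] for row in rows})


def count_per_key(rows):
    """Model `SELECT key, COUNT(*) ... GROUP BY key`; sensitive to row counts."""
    counts = {}
    for row in rows:
        counts[row[0]] = counts.get(row[0], 0) + 1
    return counts


# Rows that would survive UNNEST under the assumption above.
non_empty = [row for row in ROWS if row[1]]

with_unnest = group_keys(unnest(ROWS))          # keys seen after UNNEST
naive_removal = group_keys(ROWS)                # UNNEST simply dropped
filtered_removal = group_keys(non_empty)        # UNNEST replaced by a filter

count_with_unnest = count_per_key(unnest(ROWS))
count_without_unnest = count_per_key(non_empty)
```

In this model, dropping `UNNEST` outright changes the grouped keys because rows with empty or `NULL` arrays reappear, while replacing it with a non-empty filter preserves them. With `COUNT(*)`, even the filtered form differs from the original, which is the kind of cardinality-sensitive case the counterexamples guard against.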
   
   ## What changes are included in this PR?
   
   * Added a regression dataset in `datafusion/sqllogictest/test_files/unnest.slt`.
   * Added a reproducer showing an unused `UNNEST` output under `GROUP BY`.
   * Added `EXPLAIN` assertions documenting that the current logical and 
physical plans still contain `Unnest` / `UnnestExec`.
   * Added counterexamples demonstrating cases where removing `UNNEST` would 
change result cardinality.
   * Added coverage for empty and `NULL` array semantics to document current 
select-list `UNNEST` behavior.
   * Added cleanup for the temporary test table.
   
   ## Are these changes tested?
   
   Yes.
   
   This PR adds SQL logic tests in 
`datafusion/sqllogictest/test_files/unnest.slt`, including:
   
   * A reproducer for unused `UNNEST` output below `GROUP BY`
   * `EXPLAIN` plan assertions for `Unnest` and `UnnestExec`
   * Cardinality-sensitive counterexamples
   * Empty/NULL array semantic coverage
   
   ## Are there any user-facing changes?
   
   No. This PR only adds tests and documentation of current behavior; it does 
not change optimizer behavior or query semantics.
   
   ## LLM-generated code disclosure
   
   This PR includes LLM-generated code and comments. All LLM-generated content 
has been manually reviewed and tested.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

