kosiew opened a new pull request, #18866:
URL: https://github.com/apache/datafusion/pull/18866
## Which issue does this PR close?
* Closes #16276.
## Rationale for this change
This PR addresses failures in Substrait round‑trip mode where Virtual Tables
were limited to accepting only literals or aliased literals. Several
SQLLogicTests—especially those involving nested or scalar functions such as
`map()`—failed because expressions were rejected prematurely.
Allowing expression-based rows when exporting Virtual Tables is essential to
making Substrait → DataFusion → Substrait round‑trips stable and representative
of real workloads. Fixing this improves test coverage, unblocks Substrait
integration efforts, and eliminates a major source of round‑trip
inconsistencies.
### Before
```
❯ cargo test --test sqllogictests -- --substrait-round-trip map.slt:388
Finished `test` profile [unoptimized + debuginfo] target(s) in 0.66s
Running bin/sqllogictests.rs
(target/debug/deps/sqllogictests-d29eafa7e841495b)
Completed 1 test files in 0 seconds
External error: 2 errors in file
/Users/kosiew/GitHub/datafusion/datafusion/sqllogictest/test_files/map.slt
1. query failed: DataFusion error: This feature is not implemented:
Unsupported plan type: DescribeTable { schema: Schema ...
[SQL] describe data;
at
/Users/kosiew/GitHub/datafusion/datafusion/sqllogictest/test_files/map.slt:43
2. query failed: DataFusion error: Substrait error: Only literal types and
aliases are supported in Virtual Tables, got: ScalarFunction
[SQL] VALUES (MAP(['a'], [1])), (MAP(['b'], [2])), (MAP(['c', 'a'], [3, 1]))
at
/Users/kosiew/GitHub/datafusion/datafusion/sqllogictest/test_files/map.slt:388
```
### After
```
❯ cargo test --test sqllogictests -- --substrait-round-trip map.slt:388
Finished `test` profile [unoptimized + debuginfo] target(s) in 0.50s
Running bin/sqllogictests.rs
(target/debug/deps/sqllogictests-62df8e0045f6176f)
Completed 1 test files in 0 seconds
```
## What changes are included in this PR?
* Introduces `convert_literal_rows` and `convert_expression_rows` helpers to
properly distinguish literal-only rows from expression-based rows.
* Adds support for generating nested `Struct` expressions (Substrait
`NestedStruct`) for non-literal Virtual Table values.
* Updates `from_values` to dynamically choose between literal structs and
nested structs.
* Ensures schema consistency validation for expression-based rows.
* Adds new round‑trip test cases, including
`roundtrip_values_with_scalar_function`.
* Refactors tests to use a new `substrait_roundtrip` helper to reduce
duplication.
* Adds `LogicalPlan::DescribeTable(_)` handling in the round‑trip engine.
## Are these changes tested?
Yes. New test coverage includes:
* A dedicated round‑trip test for scalar functions inside `VALUES` clauses.
* Use of `substrait_roundtrip` across multiple existing tests to ensure
consistent behavior.
* Validation of expression-based row handling, alias stripping, and schema
equivalence.
## Are there any user-facing changes?
No direct API-breaking changes. The behavior of Substrait round‑trip
conversion improves and becomes more permissive regarding expressions in
Virtual Tables. This is internal to Substrait/DataFusion integration and does
not change SQL semantics or user-level APIs.
## LLM-generated code disclosure
This PR includes LLM-generated code and comments. All LLM-generated content
has been manually reviewed and tested.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]