kosiew opened a new pull request, #20771:
URL: https://github.com/apache/datafusion/pull/20771
## Which issue does this PR close?
- Part of #19950.
## Rationale for this change
`UPDATE ... FROM` planning was rejected entirely, which blocked the baseline
provider and planner work needed to support this statement shape at all. This
PR implements the first scoped slice for #19950: enable planning and execution
plumbing for baseline `UPDATE ... FROM` support without mixing in later
correctness fixes and cleanup refactors.
The goal of this PR is to land the foundational path first:
- allow `UPDATE ... FROM` through SQL planning
- preserve source-table qualifiers in joined assignments
- add a provider hook for multi-table updates
- route physical planning through that hook when the update input contains a
join
- add the minimum planner and SQLLogicTest coverage needed to prove the path
works
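
To illustrate why source-table qualifiers need to survive in joined assignments, here is a minimal, self-contained sketch. `ColumnRef` and `resolve` are hypothetical stand-ins invented for this example and are not DataFusion types or functions. It assumes a joined row in which the target and the source both expose a column named `b`.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a column reference (not DataFusion's `Column`).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ColumnRef {
    /// Table name or alias, e.g. `src` in `src.b`.
    pub relation: Option<String>,
    pub name: String,
}

impl ColumnRef {
    pub fn new(relation: Option<&str>, name: &str) -> Self {
        Self {
            relation: relation.map(str::to_string),
            name: name.to_string(),
        }
    }
}

/// Look up a column in a joined row keyed by (relation, column name).
/// An unqualified reference is an error if more than one relation has that column.
pub fn resolve(
    row: &HashMap<(String, String), i64>,
    col: &ColumnRef,
) -> Result<i64, String> {
    match &col.relation {
        Some(rel) => row
            .get(&(rel.clone(), col.name.clone()))
            .copied()
            .ok_or_else(|| format!("unknown column {}.{}", rel, col.name)),
        None => {
            let matches: Vec<i64> = row
                .iter()
                .filter(|((_, n), _)| n == &col.name)
                .map(|(_, v)| *v)
                .collect();
            match matches.as_slice() {
                [v] => Ok(*v),
                [] => Err(format!("unknown column {}", col.name)),
                _ => Err(format!("ambiguous column {}", col.name)),
            }
        }
    }
}
```

In this sketch, an assignment such as `SET b = src.b` resolves cleanly while the qualifier is kept, whereas the same reference with the qualifier stripped becomes ambiguous between the target and the source.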
Follow-up fixes for target-row image correctness and related edge cases are
intentionally left to later PRs.
## What changes are included in this PR?
- Enable `UPDATE ... FROM` in the SQL planner by removing the previous
not-implemented rejection in
`datafusion/sql/src/statement.rs`.
- Extend `TableProvider` with `update_from(...)` for multi-table update
execution in
`datafusion/catalog/src/table.rs`.
- Implement baseline `MemTable::update_from(...)` support in
`datafusion/catalog/src/memory/table.rs`.
- Update the physical planner in
`datafusion/core/src/physical_planner.rs`
to:
  - distinguish single-table `UPDATE` from `UPDATE ... FROM`
  - preserve source-table qualifiers in extracted joined assignments
  - call `update_from(...)` when the logical update input contains a join
  - resolve join partitioning before handing the physical input to providers
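
The routing rule described above (call `update_from(...)` when the update input contains a join) can be sketched in isolation. This is an illustration under stated assumptions, not the code in `physical_planner.rs`: `Plan`, `contains_join`, `UpdateRoute`, and `route_update` are hypothetical names, and `Plan` is a heavily simplified stand-in for a logical plan.

```rust
/// Hypothetical, heavily simplified logical plan (not DataFusion's `LogicalPlan`).
#[derive(Debug)]
pub enum Plan {
    Scan { table: String },
    Filter { input: Box<Plan> },
    Projection { input: Box<Plan> },
    Join { left: Box<Plan>, right: Box<Plan> },
}

/// Walk the update input and report whether any node is a join.
pub fn contains_join(plan: &Plan) -> bool {
    match plan {
        Plan::Scan { .. } => false,
        Plan::Filter { input } | Plan::Projection { input } => contains_join(input),
        Plan::Join { .. } => true,
    }
}

/// Which provider entry point the planner would pick in this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum UpdateRoute {
    /// Plain `UPDATE t SET ... WHERE ...`
    SingleTable,
    /// `UPDATE t SET ... FROM src WHERE ...`
    UpdateFrom,
}

/// Route to the multi-table hook only when the input contains a join.
pub fn route_update(input: &Plan) -> UpdateRoute {
    if contains_join(input) {
        UpdateRoute::UpdateFrom
    } else {
        UpdateRoute::SingleTable
    }
}
```

Under this model, a filter over a single scan routes to the single-table path, and a projection over a join of two scans routes to the `UPDATE ... FROM` path.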
- Add planner coverage in
`datafusion/core/tests/custom_sources_cases/dml_planning.rs`.
- Add SQLLogicTest coverage for alias handling and baseline `UPDATE ... FROM`
execution in
`datafusion/sqllogictest/test_files/update.slt`.
## Are these changes tested?
Yes.
This PR adds targeted coverage for:
- planner behavior for `UPDATE ... FROM`
- assignment extraction for joined updates and aliases
- provider routing through `update_from(...)`
- SQLLogicTest coverage for baseline `UPDATE ... FROM` statements
Validated with:
- `cargo test -p datafusion --test core_integration custom_sources_cases::dml_planning -- --nocapture`
- `cargo test -p datafusion-sqllogictest update -- --nocapture`
## Are there any user-facing changes?
Yes. DataFusion now accepts baseline `UPDATE ... FROM` statements, including
aliased forms covered by the new tests, instead of rejecting them as
unsupported.
There is also a public API addition for custom table providers:
`TableProvider::update_from(...)`.
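
For custom provider authors, one common way to add a hook like this without breaking existing implementors is a trait method with a default body that returns an error. The sketch below shows that pattern only. It is an assumption, not a statement of how this PR defines the method: the real `update_from(...)` signature is not given in this description, and `Provider`, `LegacyTable`, and `InMemoryTable` are hypothetical types with deliberately simplified arguments.

```rust
/// Hypothetical stand-in for a table provider trait (not DataFusion's `TableProvider`).
pub trait Provider {
    fn name(&self) -> &str;

    /// Multi-table update hook. `source` holds `(id, new_value)` pairs from the
    /// joined input. The default body reports the feature as unsupported, so
    /// providers that predate the hook keep compiling unchanged.
    fn update_from(&mut self, _source: &[(i64, i64)]) -> Result<usize, String> {
        Err(format!("UPDATE ... FROM is not supported by {}", self.name()))
    }
}

/// A provider that does not override the hook.
pub struct LegacyTable;

impl Provider for LegacyTable {
    fn name(&self) -> &str {
        "LegacyTable"
    }
}

/// A provider that opts in; rows are `(id, value)` pairs.
pub struct InMemoryTable {
    pub rows: Vec<(i64, i64)>,
}

impl Provider for InMemoryTable {
    fn name(&self) -> &str {
        "InMemoryTable"
    }

    /// Overwrite `value` for every row whose `id` appears in `source`
    /// and return the number of rows updated.
    fn update_from(&mut self, source: &[(i64, i64)]) -> Result<usize, String> {
        let mut updated = 0;
        for row in &mut self.rows {
            if let Some((_, v)) = source.iter().find(|(id, _)| *id == row.0) {
                row.1 = *v;
                updated += 1;
            }
        }
        Ok(updated)
    }
}
```

With this shape, calling `update_from` on `LegacyTable` yields an error, while `InMemoryTable` updates only the rows whose `id` matches a source row.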
### LLM-generated code disclosure
This PR includes LLM-generated code and comments. All LLM-generated content
has been manually reviewed and tested.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]