GitHub user schenksj added a comment to the discussion: Delta Contribution Review Plan

Quick update — rather than keep waiting on the tracking branch, I went ahead 
and started landing the split.

**Part 1 is open upstream** as #4700 (core SPI for contrib leaf scans, 
`CometScanWithPlanData`) — it's core-only and inert on default builds. **Parts 
2–9 are split out and staged as a stacked review-draft chain on my fork** (each 
part builds on the previous); they're ready to open here in dependency order as 
each base merges to `main`. Every part is carved, fully verified, and reviewed 
clean.

| Part | What | PR |
|---|---|---|
| 1 | Core SPI for contrib leaf scans (`CometScanWithPlanData`); core-only, inert on default builds | #4700 |
| 2 | Build gate + inert Delta wiring (Maven profile, Cargo feature, proto, `DeltaIntegration` bridge) | [schenksj#4](https://github.com/schenksj/datafusion-comet/pull/4) |
| 3a | Rust: driver-side planning (log replay, predicate pushdown, JNI) | [schenksj#5](https://github.com/schenksj/datafusion-comet/pull/5) |
| 3b | Rust: executor-side read path (kernel read + deletion vectors) | [schenksj#6](https://github.com/schenksj/datafusion-comet/pull/6) |
| 4a | Scala: claim/decline + scan marker | [schenksj#7](https://github.com/schenksj/datafusion-comet/pull/7) |
| 4b | Scala: serde + native exec (end-to-end native reads) | [schenksj#8](https://github.com/schenksj/datafusion-comet/pull/8) |
| 5 | Change Data Feed native reads | [schenksj#9](https://github.com/schenksj/datafusion-comet/pull/9) |
| 6 | Contrib test battery + CI workflow | [schenksj#10](https://github.com/schenksj/datafusion-comet/pull/10) |
| 7 | Delta own-suite regression harness | [schenksj#11](https://github.com/schenksj/datafusion-comet/pull/11) |
| 8 | Docs (design docs + user guide) | [schenksj#12](https://github.com/schenksj/datafusion-comet/pull/12) |
| 9 | `FAILED_READ_FILE` parity for Delta reads | [schenksj#13](https://github.com/schenksj/datafusion-comet/pull/13) |

Full tracking (sizes, dependencies, status) is in the #4366 description. A 
review of **#4700** would get the chain moving — keeping the tracking branch 
current is the time sink I mentioned.

Thanks again for all the guidance. @andygrove @mbutrovich @parthchandra

GitHub link: https://github.com/apache/datafusion-comet/discussions/4648#discussioncomment-17386495

----
This is an automatically sent email for [email protected].
To unsubscribe, please send an email to: 
[email protected]

