yihua opened a new issue, #18780:
URL: https://github.com/apache/hudi/issues/18780

   ## Summary
   
   This issue tracks the migration of the Trino-Hudi connector internals from 
Trino OSS (`plugin/trino-hudi`) into a Hudi-published Maven artifact 
`org.apache.hudi:hudi-trino`. The Trino side becomes a thin shim that registers 
the `Plugin` SPI entry point and pulls in the Hudi-published artifact; all 
connector logic (split generation, page source, metadata, indexes, MOR merging, 
storage bridge) lives in Hudi OSS.
   
   Detailed design will land as **RFC-105** in `apache/hudi`. This issue is the 
feature-tracking issue referenced from the RFC's `Status` field.
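   
   As a rough illustration of the thin-shim split described above, the sketch below shows a plugin class that does nothing except hand out a connector factory defined elsewhere. It is a sketch under assumptions, not the RFC design: the `Plugin`, `ConnectorFactory`, and `Connector` interfaces are simplified local stand-ins for the Trino SPI types so that the example compiles on its own (the real `create` method takes additional arguments), and `HudiConnectorFactory` is a hypothetical name for the class the Hudi-published artifact would provide.

```java
import java.util.List;
import java.util.Map;

// Simplified stand-ins for Trino SPI types so this sketch compiles alone.
// The real interfaces live in io.trino.spi and carry more methods and arguments.
interface Connector {}

interface ConnectorFactory {
    String getName();

    Connector create(String catalogName, Map<String, String> config);
}

interface Plugin {
    Iterable<ConnectorFactory> getConnectorFactories();
}

// Hypothetical: would ship in org.apache.hudi:hudi-trino under io.trino.plugin.hudi.
final class HudiConnectorFactory implements ConnectorFactory {
    @Override
    public String getName() {
        // Name used in the catalog properties: connector.name=hudi
        return "hudi";
    }

    @Override
    public Connector create(String catalogName, Map<String, String> config) {
        // Split generation, page sources, metadata, and indexes would be wired up here.
        return new Connector() {};
    }
}

// Hypothetical: the only connector class left in Trino's plugin/trino-hudi.
final class HudiPlugin implements Plugin {
    @Override
    public Iterable<ConnectorFactory> getConnectorFactories() {
        // Pure delegation: no connector logic lives on the Trino side.
        return List.of(new HudiConnectorFactory());
    }
}
```

   With this shape, the Trino-side module would contain only `HudiPlugin` plus its tests, and picking up new connector behavior would amount to bumping the `hudi-trino` dependency version.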
   
   ## Motivation
   
   - The Trino-Hudi connector currently lives in `trinodb/trino` at `plugin/trino-hudi`. Trino-side maintainer focus is on Iceberg and Delta Lake, so Hudi changes get limited review bandwidth there.
   - Four stacked Hudi-side improvement PRs were closed by Trino's stale-bot 
for lack of activity:
     - trinodb/trino#28518
     - trinodb/trino#28533
     - trinodb/trino#28644
     - trinodb/trino#28645
   - Trino maintainer @raunaqm proposed reducing the Trino-side Hudi connector to a thin shim around a Hudi-published library, and the Hudi community accepted. The condition from @raunaqm: "still maintain a comprehensive test suite on Trino side."
   - Hudi-side improvements ready to land via this mechanism: 
metadata-table-driven partition listing, index support (column-stats, 
partition-stats, RLI, secondary), MOR snapshot correctness fixes, file-system 
caching, snapshot isolation via lazy commit time on `HudiTableHandle`.
   
   ## Proposed Approach
   
   Two artifacts:
   
   | Artifact | Owner | Contents | Cadence |
   |---|---|---|---|
   | `io.trino:trino-hudi` | Trino OSS | `HudiPlugin` shim + test harness (smoke tests, query runner, MinIO tests) | Per Trino release |
   | `org.apache.hudi:hudi-trino` | Hudi OSS | Full connector logic at `io.trino.plugin.hudi.*` | Per Hudi release; first ship: Hudi 1.3.0 |
   
   Key design points:
   - **Single Hudi artifact** targets the latest Trino release; assumes Trino 
SPI backward compatibility (RevAPI-enforced) across recent Trino releases.
   - **Not shaded**: regular Maven jar; `hudi-common`, `hudi-io` as transitive 
deps; Trino's `trino-spi`, `trino-filesystem` as `provided`.
   - **Hudi-side GH Action** pulls latest `trinodb/trino` master and recompiles 
`hudi-trino` against it on every push, catching SPI drift early.
   - **Hudi-side GH Action** runs the full `hudi-trino` test suite (including 
the Trino-side smoke tests mirrored on Hudi side).
   - **Separate Maven build target on Hudi side** (excluded from the default `mvn install`): `hudi-trino` needs Java 25 to match Trino OSS, while the rest of Hudi targets a lower Java floor.
   - **Full test duplication**: smoke + integration tests live on both Trino 
side (per @raunaqm) and Hudi side.
   - **More frequent Hudi releases** going forward to keep the integration 
model responsive.
   
   ## Rollout Plan
   
   1. Hudi prepares the `hudi-trino` module (evolved from the current `hudi-trino-plugin/` work), with first publication in **Hudi 1.3.0**.
   2. Trino-side `plugin/trino-hudi` becomes a thin shim that depends on `org.apache.hudi:hudi-trino:1.3.0`, submitted as a small Trino PR.
   3. Subsequent Trino-Hudi work happens on the Hudi side; Trino picks up changes via version bumps.
   
   ## Prior Discussion
   
   Slack `#dev` thread (May 2026): alignment between Hudi (@yihua, @voonhous) 
and Trino (@raunaqm, @ebyhr, @mariusgrama, @findepi, @manfred-moser) on this 
approach.
   
   ## RFC
   
   Design details will live in **RFC-105** (PR forthcoming in 
`apache/hudi/rfc/rfc-105/`).
   
   ## Owners
   
   - @yihua
   - @voonhous
   
   cc @codope @bhasudha @vinothchandar


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
