rahil-c opened a new issue, #14127: URL: https://github.com/apache/hudi/issues/14127
### Feature Description

**What the feature achieves:** This feature enables Apache Hudi to use Lance as a storage file format, alongside the formats it already supports, such as Parquet and ORC. Hudi tables can then store and query multi-modal data, including structured, semi-structured, and unstructured data (e.g., embeddings, images, video), while still benefiting from Hudi’s transactional, incremental, and metadata management layers.

**Why this feature is needed:** Existing file formats like Parquet and ORC are optimized for tabular analytics, not for AI/ML workloads involving embeddings, tensors, or unstructured content. Modern data platforms increasingly need to manage hybrid data, where text, image, and vector data coexist with traditional tabular features.

### User Experience

**How users will use this feature:** For more details, please refer to the RFC: https://github.com/apache/hudi/pull/13924/files#diff-f05ae69c4f41edc32aabfbfc016a12ad1af72917314844f8ae52671234508c56R37

### Hudi RFC Requirements

**RFC PR link:** (if applicable) https://github.com/apache/hudi/pull/13924/files#diff-f05ae69c4f41edc32aabfbfc016a12ad1af72917314844f8ae52671234508c56R37

**Why RFC is/isn't needed:**
- Does this change public interfaces/APIs? (Yes/No) Yes
- Does this change storage format? (Yes/No) Yes
- Justification: We will be making changes to the storage format incrementally. There are two prerequisite RFCs: one introducing a new type system, [RFC-99](https://github.com/apache/hudi/pull/13743/files#diff-578d129aa7ddaddc5fac3c89de3d753f18419f85d96dd8281e91047e3ce49465), and one introducing the notion of a Column Group in Hudi, [RFC-80](https://github.com/apache/hudi/blob/master/rfc/rfc-80/rfc-80.md).
