GitHub user rahil-c created a discussion: RFC-100: Lance File Format support in 
Hudi

## ✅ Lance File Format Integration Tasks

See the following feature for more context: 
https://github.com/apache/hudi/issues/14127

For the new feature supporting unstructured data in Hudi via formats like 
Lance, here is the initial scope of what we are targeting (note that this list 
will continue to grow as we get deeper into the integration):

- [ ] Add base `HoodieFileWriter` for Lance with a Spark implementation
- [ ] Add base `HoodieFileReader` for Lance with a Spark implementation
- [ ] Add basic Avro → Arrow schema conversion  
- [ ] Add `SparkColumnarFileReader` implementation for Lance  
- [ ] Implement append-only validation (bulk insert)  
- [ ] Integrate Lance as a log file format  
- [ ] Implement insert / upsert / delete validation  
- [ ] Add predicate (filter) push-down  
- [ ] Support `ColumnarBatch` vectorized reading
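
To make the Avro → Arrow schema conversion item above more concrete, here is a minimal sketch of what a basic primitive-type mapping might look like. This is an illustration only and not the planned Hudi or Lance code: the function names `convert_field` and `convert_record` are hypothetical, Arrow types are represented as plain strings rather than real Arrow objects, and only Avro primitives and nullable unions of the form `["null", T]` are handled.

```python
# Hypothetical sketch, not Hudi's actual conversion logic.
# Maps Avro primitive type names to the names of their usual Arrow counterparts.
AVRO_TO_ARROW_PRIMITIVES = {
    "boolean": "bool",
    "int": "int32",      # Avro int is a 32-bit signed integer
    "long": "int64",     # Avro long is a 64-bit signed integer
    "float": "float32",
    "double": "float64",
    "bytes": "binary",
    "string": "utf8",
}


def convert_field(avro_field):
    """Return (name, arrow_type_name, nullable) for one Avro record field."""
    avro_type = avro_field["type"]
    nullable = False

    # An Avro union such as ["null", "string"] becomes a nullable Arrow field.
    if isinstance(avro_type, list):
        non_null = [t for t in avro_type if t != "null"]
        nullable = len(non_null) < len(avro_type)
        if len(non_null) != 1:
            raise ValueError(f"unsupported union for field {avro_field['name']!r}")
        avro_type = non_null[0]

    # Nested records, arrays, maps, and logical types are out of scope here.
    if not isinstance(avro_type, str) or avro_type not in AVRO_TO_ARROW_PRIMITIVES:
        raise ValueError(f"unsupported Avro type: {avro_type!r}")

    return (avro_field["name"], AVRO_TO_ARROW_PRIMITIVES[avro_type], nullable)


def convert_record(avro_schema):
    """Convert a flat Avro record schema (parsed JSON dict) to a list of fields."""
    if avro_schema.get("type") != "record":
        raise ValueError("top-level Avro schema must be a record")
    return [convert_field(f) for f in avro_schema["fields"]]


# Example: a made-up schema with a required key and two optional columns.
example_schema = {
    "type": "record",
    "name": "Example",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "caption", "type": ["null", "string"]},
        {"name": "image", "type": ["null", "bytes"]},
    ],
}
arrow_fields = convert_record(example_schema)
```

A real implementation would also need to cover nested records, arrays, maps, and Avro logical types such as timestamps and decimals, which this sketch deliberately rejects.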

GitHub link: https://github.com/apache/hudi/discussions/14128
