timsaucer opened a new issue, #1458: URL: https://github.com/apache/datafusion-python/issues/1458
## Summary

Several `SessionContext` methods for reading data sources, writing query results, and registering tables from upstream DataFusion v53 are not yet exposed in datafusion-python.

## Missing Methods

**Read methods:**

- [ ] `read_arrow`: read an Arrow IPC file into a DataFrame
- [ ] `read_batch`: read a single RecordBatch into a DataFrame
- [ ] `read_batches`: read multiple RecordBatches into a DataFrame
- [ ] `read_empty`: create an empty DataFrame

**Write methods:**

- [ ] `write_csv`: write query results to CSV directly from context
- [ ] `write_json`: write query results to JSON directly from context
- [ ] `write_parquet`: write query results to Parquet directly from context

**Registration:**

- [ ] `register_arrow`: register an Arrow IPC file as a table
- [ ] `register_batch`: register a single RecordBatch as a table

## Upstream Reference

- https://docs.rs/datafusion/53.0.0/datafusion/execution/context/struct.SessionContext.html

## Implementation

- Rust bindings: `crates/core/src/context.rs`
- Python wrappers: `python/datafusion/context.py`

> **Note:** This gap analysis was performed using an AI agent comparing upstream DataFusion v53 documentation against the current datafusion-python codebase.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
