andygrove opened a new issue, #35:
URL: https://github.com/apache/datafusion-java/issues/35
### Is your feature request related to a problem or challenge?
DataFusion 53.1 has built-in support for newline-delimited JSON
(`SessionContext::read_json` / `register_json`), but the Java binding
currently only exposes Parquet (#18) and CSV (#21). Users wanting to
query JSON files have to fall back to `CREATE EXTERNAL TABLE` via SQL,
which loses the typed-options ergonomics the Parquet/CSV bindings
already provide.
### Describe the solution you'd like
Mirror the existing reader pattern:
- Add an `NdJsonReadOptions` value class analogous to
`ParquetReadOptions` / `CsvReadOptions` (file extension, schema,
schema-infer-max-records, compression, etc.).
- Add a `proto/json_read_options.proto` and pass options through the
established proto-over-JNI convention (#29).
- Expose `SessionContext.registerJson(name, path[, options])` and
`readJson(path[, options])`.
- Cover with tests in the spirit of `SessionContextCsvTest` /
`ParquetReadOptionsTest`.
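
For discussion, here is a minimal sketch of what the options value class might look like. Everything in it is an assumption rather than existing API: the builder shape, the accessor names, the `Compression` enum, and the default values are placeholders, and the schema field is left out because it would depend on how the Parquet/CSV options represent Arrow schemas.

```java
import java.util.Objects;

// Hypothetical sketch only: this class does not exist in datafusion-java yet,
// and the real one should follow whatever ParquetReadOptions / CsvReadOptions do.
final class NdJsonReadOptions {

    /** Illustrative codec list; the real set would mirror the native side. */
    enum Compression { UNCOMPRESSED, GZIP, BZIP2, XZ, ZSTD }

    private final String fileExtension;
    private final long schemaInferMaxRecords;
    private final Compression compression;

    private NdJsonReadOptions(Builder b) {
        this.fileExtension = b.fileExtension;
        this.schemaInferMaxRecords = b.schemaInferMaxRecords;
        this.compression = b.compression;
    }

    static Builder builder() {
        return new Builder();
    }

    String fileExtension() { return fileExtension; }
    long schemaInferMaxRecords() { return schemaInferMaxRecords; }
    Compression compression() { return compression; }

    static final class Builder {
        // Defaults below are assumed placeholders, not confirmed values.
        private String fileExtension = ".json";
        private long schemaInferMaxRecords = 1000L;
        private Compression compression = Compression.UNCOMPRESSED;

        Builder fileExtension(String fileExtension) {
            this.fileExtension = Objects.requireNonNull(fileExtension, "fileExtension");
            return this;
        }

        Builder schemaInferMaxRecords(long maxRecords) {
            if (maxRecords < 0) {
                throw new IllegalArgumentException("schemaInferMaxRecords must be >= 0");
            }
            this.schemaInferMaxRecords = maxRecords;
            return this;
        }

        Builder compression(Compression compression) {
            this.compression = Objects.requireNonNull(compression, "compression");
            return this;
        }

        // Produces an immutable snapshot of the builder's current settings.
        NdJsonReadOptions build() {
            return new NdJsonReadOptions(this);
        }
    }
}
```

With something like this, a call site could read `ctx.registerJson("events", path, NdJsonReadOptions.builder().compression(NdJsonReadOptions.Compression.GZIP).build())`, where `ctx` and `path` are also illustrative.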
### Describe alternatives you've considered
Users can issue `CREATE EXTERNAL TABLE … STORED AS JSON` via
`SessionContext.sql`. This works but bypasses the typed builder pattern
and gives a less discoverable Java API.
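
To make the workaround concrete, this is roughly the string-building users have to do today. The helper class and method names are hypothetical, and the quoting rules are a simple assumption (double any embedded quote character); the resulting statement would then be passed to `SessionContext.sql`.

```java
// Hypothetical helper showing the SQL workaround; not part of datafusion-java.
final class JsonExternalTableSql {

    private JsonExternalTableSql() {}

    /**
     * Builds a CREATE EXTERNAL TABLE statement for a newline-delimited JSON file.
     * The table name is emitted as a double-quoted identifier and the location
     * as a single-quoted string literal, with embedded quotes doubled.
     */
    static String createExternalTable(String tableName, String location) {
        String quotedName = "\"" + tableName.replace("\"", "\"\"") + "\"";
        String quotedLocation = "'" + location.replace("'", "''") + "'";
        return "CREATE EXTERNAL TABLE " + quotedName
                + " STORED AS JSON LOCATION " + quotedLocation;
    }
}
```

Options such as compression or schema inference limits would have to be appended to this string by hand, which is the ergonomics gap a typed `registerJson` would close.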
### Additional context
DataFusion's JSON reader is in the default feature set, so no Cargo
feature flag changes are required on the native side.