alexrafferty-qoria opened a new issue, #17222:
URL: https://github.com/apache/datafusion/issues/17222
### Is your feature request related to a problem or challenge?
DataFusion can't read JSONL/ND-JSON files whose rows are top-level arrays
rather than objects. For example:
```
[1, 2, 3]
[4, 5, 6]
[7, 8, 9]
```
### Describe the solution you'd like
It would be great if DataFusion could directly consume ND-JSON files
containing top-level arrays, mapping each array to a single field with a
customisable name, like `data` or `items`.
The underlying arrow library appears to support this via
`datafusion::arrow::json::ReaderBuilder::new_with_field`, but DataFusion's JSON
`FileFormat` does not seem to expose it.
### Describe alternatives you've considered
At present I work around the issue with a pre-processing step that wraps
each line in a `{ "data": ... }` object, but this comes with a heavy
performance penalty.
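A rough sketch of that kind of pre-processing step, using only the standard library, might look like the following. The function name `wrap_rows` is hypothetical, and the sketch assumes the field name needs no JSON escaping.

```rust
use std::io::{self, BufRead, Write};

/// Wrap every non-empty ND-JSON line in `{"<field>": <line>}`.
/// (Hypothetical helper illustrating the workaround.)
fn wrap_rows<R: BufRead, W: Write>(input: R, mut output: W, field: &str) -> io::Result<()> {
    for line in input.lines() {
        let line = line?;
        let row = line.trim();
        if row.is_empty() {
            continue; // skip blank lines rather than emit an invalid object
        }
        // Assumption: `field` is a plain name such as `data` or `items`
        // that can be embedded in JSON without escaping.
        writeln!(output, "{{\"{field}\": {row}}}")?;
    }
    Ok(())
}
```

Every row is read, copied, and written again before DataFusion sees it, which is where the extra cost comes from.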
### Additional context
_No response_