0x26res opened a new issue, #44949:
URL: https://github.com/apache/arrow/issues/44949
### Describe the bug, including details regarding any error messages, version, and platform.
I mistakenly put a duplicate field in my schema when calling
`pyarrow.json.read_json` with `ParseOptions.explicit_schema`.
I got either a segfault or an error message that wasn't clear.
Given the nature of JSON, there shouldn't be duplicate fields in the schema.
`pyarrow.json.ParseOptions` should raise if there is any duplicate field in the
`explicit_schema`.
I can send a PR; I'm just wondering whether there is anywhere else where we do
similar checks?
```python
import io

import pyarrow as pa
import pyarrow.json

SCHEMA = pa.schema(
    [
        pa.field("foo", pa.bool_()),
        pa.field("foo", pa.bool_()),
    ]
)

with io.BytesIO(b'{"foo": true,"other": "bar"}') as buffer:
    buffer.seek(0)
    pyarrow.json.read_json(
        buffer,
        parse_options=pyarrow.json.ParseOptions(explicit_schema=SCHEMA),
    )
```
```
    pyarrow.json.read_json(
  File "pyarrow/_json.pyx", line 308, in pyarrow._json.read_json
  File "pyarrow/error.pxi", line 155, in pyarrow.lib.pyarrow_internal_check_status
  File "pyarrow/error.pxi", line 92, in pyarrow.lib.check_status
pyarrow.lib.ArrowInvalid: Failed to convert JSON to bool from dictionary<values=string, indices=int32, ordered=0>
```
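Until such validation exists, the proposed check could be approximated on the caller side. The following is only a sketch, not pyarrow's implementation: `duplicate_field_names` and `checked_schema` are hypothetical helper names, and the code relies solely on the `names` attribute of `pyarrow.Schema`, so it inspects top-level fields only and does not look inside nested struct types.

```python
from collections import Counter


def duplicate_field_names(schema):
    """Return the top-level field names that appear more than once.

    `schema` is expected to be a pyarrow.Schema; only its `names`
    attribute (the list of top-level field names) is used.
    """
    counts = Counter(schema.names)
    return [name for name, count in counts.items() if count > 1]


def checked_schema(schema):
    """Hypothetical helper: return `schema` unchanged, or raise on duplicates."""
    duplicates = duplicate_field_names(schema)
    if duplicates:
        raise ValueError(
            f"duplicate field names in explicit_schema: {duplicates}"
        )
    return schema
```

With this in place, the reproduction above would presumably fail early with a `ValueError` if the options were built as `pyarrow.json.ParseOptions(explicit_schema=checked_schema(SCHEMA))`, instead of reaching the JSON reader.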
### Component(s)
Python