[
https://issues.apache.org/jira/browse/ARROW-7703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Joris Van den Bossche updated ARROW-7703:
-----------------------------------------
Component/s: Python, C++ - Dataset
> [C++][Dataset] Give more informative error message for mismatching schemas
> for FileSystemSources
> ------------------------------------------------------------------------------------------------
>
> Key: ARROW-7703
> URL: https://issues.apache.org/jira/browse/ARROW-7703
> Project: Apache Arrow
> Issue Type: Bug
> Components: C++ - Dataset, Python
> Reporter: Joris Van den Bossche
> Priority: Major
>
> Currently, if you try to create a dataset from files with different schemas,
> you get this error:
> {code}
> ArrowInvalid: Unable to merge: Field a has incompatible types: int64 vs int32
> {code}
> When reading a directory of files, it would be very helpful if the
> error message indicated which files are involved (e.g. if you have a
> lot of files and only one has a mismatching schema).
> You can already inspect the schemas if you first create a SourceFactory
> manually, but that only gives a list of schemas, not mapped to the
> original files (this last item probably relates to ARROW-7608).
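To illustrate the kind of message requested above, here is a minimal sketch in plain Python. It is not Arrow's implementation and does not use the Arrow API: the helper names `find_schema_conflicts` and `format_conflicts`, the file paths, and the representation of a schema as a dict of field name to type name are all assumptions made for the example. In practice the per-file schemas would have to be collected from the files themselves.

```python
from collections import defaultdict


def find_schema_conflicts(schemas_by_file):
    """Return {field: {type_name: [files]}} for fields whose type differs.

    `schemas_by_file` maps a file path to a dict of field name -> type name.
    This layout is an assumption for the sketch, not an Arrow structure.
    """
    types_seen = defaultdict(lambda: defaultdict(list))
    for path, schema in schemas_by_file.items():
        for field, type_name in schema.items():
            types_seen[field][type_name].append(path)
    # Keep only the fields that were seen with more than one type.
    return {
        field: dict(by_type)
        for field, by_type in types_seen.items()
        if len(by_type) > 1
    }


def format_conflicts(conflicts):
    """Build an error message that names the files behind each type."""
    lines = []
    for field, by_type in sorted(conflicts.items()):
        parts = [
            f"{type_name} in {', '.join(sorted(files))}"
            for type_name, files in sorted(by_type.items())
        ]
        lines.append(
            f"Field {field} has incompatible types: " + " vs ".join(parts)
        )
    return "\n".join(lines)


# Hypothetical directory in which a single file disagrees on field "a".
schemas = {
    "data/part-0.parquet": {"a": "int64", "b": "string"},
    "data/part-1.parquet": {"a": "int64", "b": "string"},
    "data/part-2.parquet": {"a": "int32", "b": "string"},
}
conflicts = find_schema_conflicts(schemas)
print(format_conflicts(conflicts))
```

Grouping the files by the type they report for each field keeps the message short even when many files agree and only one deviates.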