hadrian-reppas commented on issue #46629: URL: https://github.com/apache/arrow/issues/46629#issuecomment-3019853367
I'll give it a shot. Instead of exposing `InspectOptions`, I think it could be better to just add two parameters to `FileSystemDatasetFactory.inspect`:

- `promote_options='default'`, which can also be set to `'permissive'` (same as [`unify_schemas`](https://arrow.apache.org/docs/python/generated/pyarrow.unify_schemas.html)).
- `fragments=None`, which can also be set to a non-negative integer.

C++ defaults to only checking one fragment for performance reasons, but this could lead to cases where `factory.inspect('permissive')` does not actually unify anything because it just looks at the first schema. There are certainly good reasons to have it default to 1, though.

What do you think?
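To make the concern concrete, here is a toy model in plain Python. It is not pyarrow code: schemas are plain dicts, and `unify`, `inspect`, and the `WIDENING` ladder are hypothetical stand-ins. It also assumes that `fragments=None` falls back to the one-fragment default described above.

```python
# Toy model only: a "schema" is a dict of field name -> type name.
# WIDENING is a made-up promotion ladder, not Arrow's actual rules.
WIDENING = ["int32", "int64", "float64"]


def unify(schemas, promote_options="default"):
    """Merge schemas field by field; widen types only when 'permissive'."""
    merged = {}
    for schema in schemas:
        for name, typ in schema.items():
            current = merged.get(name)
            if current is None or current == typ:
                merged[name] = typ
            elif (promote_options == "permissive"
                  and current in WIDENING and typ in WIDENING):
                # Keep whichever type sits higher on the ladder.
                merged[name] = max(current, typ, key=WIDENING.index)
            else:
                raise TypeError(f"cannot unify {current} and {typ} for {name!r}")
    return merged


def inspect(fragment_schemas, promote_options="default", fragments=None):
    """Unify the schemas of the first `fragments` fragments.

    Assumption: None means "inspect one fragment", mirroring the default
    the comment attributes to C++.
    """
    count = 1 if fragments is None else fragments
    return unify(fragment_schemas[:count], promote_options)


fragment_schemas = [
    {"id": "int32"},
    {"id": "int64", "name": "string"},
]

# Only the first fragment is read, so 'permissive' has nothing to unify.
first_only = inspect(fragment_schemas, "permissive")

# Reading both fragments widens `id` and picks up `name`.
both = inspect(fragment_schemas, "permissive", fragments=2)
```

With the default, `first_only` is just the first fragment's schema, which is the situation where `'permissive'` silently does nothing.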
