AlenkaF commented on issue #46151: URL: https://github.com/apache/arrow/issues/46151#issuecomment-2808500280
Thank you for opening the issue, @vadimkantorov.

Pandas is not a required dependency for PyArrow, but a lot of functionality is designed to work seamlessly when both PyArrow and Pandas are installed. The error you're encountering comes from the `Expression` class and the `scalar()` method, which are imported in `dataset.py` for backward compatibility: https://github.com/apache/arrow/blob/4937cf5721bd4912438964377361c4ec49fd5e80/python/pyarrow/dataset.py#L62-L63

In that part of the code, the Pandas API is used to check whether an object is array-like.

You can confirm that Pandas is not a required dependency by running the example you included in this issue without Pandas installed. In that case, you should see the expected `FileNotFoundError: non-existing.parquet`, which is the same error you'd get with an updated version of Pandas.

Regarding Pandas version compatibility, version 1.0 is currently listed as the minimum under PyArrow's [optional dependencies](https://arrow.apache.org/docs/python/install.html#dependencies). So it may be worth revisiting that minimum, or at least improving the warning message in the situation you described.

cc @raulcd.
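
For anyone who wants to check their own environment, here is a minimal sketch that reports whether Pandas is installed and how its version compares with the 1.0 minimum listed in the docs. It uses only the standard library; the helper names `installed_version`, `release_tuple`, and `meets_minimum` are made up for illustration and are not part of PyArrow or Pandas.

```python
import importlib.metadata
import re


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return importlib.metadata.version(package)
    except importlib.metadata.PackageNotFoundError:
        return None


def release_tuple(version, width=3):
    """Parse the leading numeric components, e.g. '0.24.2' -> (0, 24, 2).

    Shorter versions are zero-padded to `width` so '1.0' and '1.0.0' compare equal.
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    parts = [int(p) for p in match.group().split(".")] if match else []
    parts += [0] * (width - len(parts))
    return tuple(parts)


def meets_minimum(version, minimum="1.0"):
    # "1.0" is the minimum named in this comment; adjust if the docs change.
    return release_tuple(version) >= release_tuple(minimum)


if __name__ == "__main__":
    pandas_version = installed_version("pandas")
    if pandas_version is None:
        print("pandas is not installed; PyArrow does not require it")
    elif meets_minimum(pandas_version):
        print(f"pandas {pandas_version} meets the documented minimum of 1.0")
    else:
        print(f"pandas {pandas_version} is older than the documented minimum of 1.0")
```

This only inspects installed package metadata. It does not import Pandas or PyArrow, so it should be safe to run even in an environment where the import itself triggers the error.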