[ https://issues.apache.org/jira/browse/ARROW-5122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17662144#comment-17662144 ]
Rok Mihevc commented on ARROW-5122:
-----------------------------------

This issue has been migrated to [issue #21607|https://github.com/apache/arrow/issues/21607] on GitHub. Please see the [migration documentation|https://github.com/apache/arrow/issues/14542] for further details.

> [Python] pyarrow.parquet.read_table raises non-file path error when given a windows path to a directory
> -------------------------------------------------------------------------------------------------------
>
>                 Key: ARROW-5122
>                 URL: https://issues.apache.org/jira/browse/ARROW-5122
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 0.12.1
>         Environment: Windows
>            Reporter: Ilya Orson Sandoval
>            Priority: Minor
>              Labels: parquet
>             Fix For: 0.14.0
>
>
> I think this might be a small bug in the {{read_table}} interface when it is
> used to load a directory full of Parquet files on Windows. It works just fine
> if I use a ParquetDataset object directly to read the table represented by
> the directory, or if I use {{read_table}} in a Linux terminal.
> Apparently the problem comes from the {{_make_manifest()}} method in
> {{parquet.py}}, I think around line 1045. Either {{_is_path_like()}} or the
> FileSystem method {{isdir()}} fails to recognize the path as a valid directory
> (I tested with a raw Windows path and a {{pathlib.WindowsPath}} object).
> I hope this helps a little.
> P.S. Thank you for your effort developing this package!

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
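
For readers following the quoted description, here is a minimal standard-library sketch of the kind of check it refers to: deciding whether an argument is path-like and, if so, whether it names a directory. The helpers {{is_path_like}} and {{is_dataset_directory}} are hypothetical names made up for illustration. This is not pyarrow's {{_is_path_like()}} or its FileSystem {{isdir()}}, and it makes no claim about the actual cause of the bug.

```python
import io
import os
import pathlib
import tempfile


def is_path_like(path):
    """Hypothetical check: accept plain strings and any os.PathLike object."""
    return isinstance(path, (str, os.PathLike))


def is_dataset_directory(path):
    """Return True if `path` is path-like and names an existing directory."""
    if not is_path_like(path):
        return False
    # os.fspath() converts pathlib objects (WindowsPath or PosixPath) to a
    # plain string, so the same isdir() check runs for both kinds of input.
    return os.path.isdir(os.fspath(path))


with tempfile.TemporaryDirectory() as tmp:
    # The same directory, passed as a raw string and as a pathlib object.
    as_str = is_dataset_directory(tmp)
    as_pathlib = is_dataset_directory(pathlib.Path(tmp))

# A file-like object is not a path, so it is rejected before isdir() runs.
as_buffer = is_dataset_directory(io.BytesIO(b""))

print(as_str, as_pathlib, as_buffer)
```

Running a check like this on the affected machine with both the raw path string and the {{pathlib.WindowsPath}} object may help narrow down whether the path-type test or the directory test is the one that rejects the input.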