DB Tsai created ARROW-1830:
------------------------------
Summary: [Python] Error when loading all the files in a directory
Key: ARROW-1830
URL: https://issues.apache.org/jira/browse/ARROW-1830
Project: Apache Arrow
Issue Type: Bug
Components: Python
Affects Versions: 0.7.1
Environment: Python 2.7.11 (default, Jan 22 2016, 08:29:18) + pyarrow 0.7.1
Reporter: DB Tsai
I can read a single Parquet file, but when I try to read all the Parquet files
in a folder, I get an error.
{code:python}
>>> import pyarrow.parquet as pq
>>> data = pq.ParquetDataset('./aaa/part-00000-d8268e3a-4e65-41a3-a43e-01e0bf68ee86')
>>> data = pq.ParquetDataset('./aaa/')
Ignoring path: ./aaa//part-00000-d8268e3a-4e65-41a3-a43e-01e0bf68ee86
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/site-packages/pyarrow/parquet.py", line 638, in __init__
    self.validate_schemas()
  File "/usr/local/lib/python2.7/site-packages/pyarrow/parquet.py", line 647, in validate_schemas
    self.schema = self.pieces[0].get_metadata(open_file).schema
IndexError: list index out of range
>>>
{code}
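The "Ignoring path" line suggests that the file was skipped during directory discovery, which would leave self.pieces empty and explain the IndexError. A possible workaround, sketched here and not verified against 0.7.1, is to collect the file paths explicitly and pass them as a list. The helper name list_data_files is hypothetical, and the rule of skipping names that start with '_' or '.' is an assumption about which files are metadata rather than data.

```python
import os


def list_data_files(directory):
    """Return sorted paths of the regular files directly inside `directory`.

    Hypothetical helper: names starting with '_' or '.' are assumed to be
    metadata or hidden files and are skipped. Subdirectories are not visited.
    """
    paths = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path) and not name.startswith(('_', '.')):
            paths.append(path)
    return paths


# Assumed usage (requires pyarrow, and assumes ParquetDataset accepts a
# list of file paths as well as a single path):
#   import pyarrow.parquet as pq
#   data = pq.ParquetDataset(list_data_files('./aaa/'))
```

This only sidesteps the directory scan; it does not address why files without a recognized extension are ignored in the first place.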