Bob created ARROW-6876:
--------------------------

             Summary: Reading parquet file becomes really slow for 0.15.0
                 Key: ARROW-6876
                 URL: https://issues.apache.org/jira/browse/ARROW-6876
             Project: Apache Arrow
          Issue Type: Bug
          Components: Python
    Affects Versions: 0.15.0
         Environment: python3.7
            Reporter: Bob

Hi,

I just noticed that reading a Parquet file through pandas became much slower after I upgraded to 0.15.0 (roughly 11x in the example here):

*With 0.14.1*
In [4]: %timeit df = pd.read_parquet(path)
2.02 s ± 47.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

*With 0.15.0*
In [5]: %timeit df = pd.read_parquet(path)
22.9 s ± 478 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

The file is about 15 MB in size. I am testing on the same machine with the same versions of Python and pandas. Have you received similar complaints? What could be the issue here?

Thanks a lot.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
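
For anyone who wants to repeat the comparison outside IPython, here is a minimal sketch of a standalone timing script. It is not taken from the report: the helper names `time_call` and `benchmark_read_parquet` are made up for illustration, it assumes pandas is using the pyarrow engine, and its mean/standard deviation are computed with the `statistics` module, so the numbers will be close to but not identical with what `%timeit` prints.

```python
import statistics
import timeit


def time_call(fn, repeat=7):
    """Call fn() once per run for `repeat` runs; return (mean, stdev) in seconds."""
    runs = timeit.repeat(fn, repeat=repeat, number=1)
    return statistics.mean(runs), statistics.stdev(runs)


def benchmark_read_parquet(path, repeat=7):
    """Time pd.read_parquet(path) and print the library versions next to the result."""
    # Imported lazily so time_call() stays usable without pandas/pyarrow installed.
    # Assumption: pandas reads the file through the pyarrow engine.
    import pandas as pd
    import pyarrow as pa

    mean, stdev = time_call(lambda: pd.read_parquet(path), repeat=repeat)
    print(
        f"pyarrow {pa.__version__}, pandas {pd.__version__}: "
        f"{mean:.2f} s ± {stdev * 1000:.0f} ms per call ({repeat} runs)"
    )
    return mean, stdev
```

Running `benchmark_read_parquet("data.parquet")` (the path is a placeholder) once in an environment with 0.14.1 and once with 0.15.0 prints the installed versions alongside each timing, which makes the two results easy to attach to the issue.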