[jira] [Created] (ARROW-6876) Reading parquet file becomes really slow for 0.15.0

Bob (Jira) Mon, 14 Oct 2019 10:08:13 -0700

Bob created ARROW-6876:
--------------------------

             Summary: Reading parquet file becomes really slow for 0.15.0
                 Key: ARROW-6876
                 URL: https://issues.apache.org/jira/browse/ARROW-6876
             Project: Apache Arrow
          Issue Type: Bug
          Components: Python
    Affects Versions: 0.15.0
         Environment: python3.7
            Reporter: Bob



Hi,

 

I just noticed that reading a Parquet file through pandas became much slower 
after I upgraded to 0.15.0.

 

Example:

*With 0.14.1*
In [4]: %timeit df = pd.read_parquet(path)
2.02 s ± 47.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

*With 0.15.0*
In [5]: %timeit df = pd.read_parquet(path)
22.9 s ± 478 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

 

The file is about 15 MB in size. I am testing on the same machine, using the 
same versions of Python and pandas.
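
For anyone who wants to reproduce the comparison outside IPython, here is a rough sketch of a timing script. The helper names (time_call, benchmark_read_parquet) are made up for illustration, and it assumes that pandas is using the pyarrow engine. Run it once in each environment (0.14.1 and 0.15.0) against the same file and compare the printed numbers.

```python
import statistics
import timeit


def time_call(func, repeat=7):
    """Call func() once per run, `repeat` times; return (mean, stdev) in seconds."""
    runs = timeit.repeat(func, number=1, repeat=repeat)
    return statistics.mean(runs), statistics.stdev(runs)


def benchmark_read_parquet(path, repeat=7):
    """Time pd.read_parquet(path) and print the installed library versions."""
    # Imported lazily so the timing helper above works without pandas installed.
    import pandas as pd
    import pyarrow as pa

    # engine="pyarrow" is an assumption; the report does not say which engine is used.
    mean, stdev = time_call(lambda: pd.read_parquet(path, engine="pyarrow"), repeat)
    print(f"pyarrow {pa.__version__}, pandas {pd.__version__}: "
          f"{mean:.2f} s +/- {stdev:.3f} s per read ({repeat} runs)")
    return mean, stdev
```

Calling benchmark_read_parquet(path) with the path of the affected file prints one line per environment, which makes it easy to attach both results to the issue.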

 

Have you received similar complaints? What could be the issue here?

 

Thanks a lot.

 

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
