[jira] [Created] (ARROW-4143) Skip rows while reading parquet file

Sanchit (JIRA) Tue, 01 Jan 2019 15:33:48 -0800

Sanchit created ARROW-4143:
------------------------------

             Summary: Skip rows while reading parquet file
                 Key: ARROW-4143
                 URL: https://issues.apache.org/jira/browse/ARROW-4143
             Project: Apache Arrow
          Issue Type: Improvement
          Components: Developer Tools
            Reporter: Sanchit



Is there any functionality in pyarrow that allows reading a file partially? 
For example, I would like to read only the first 10 rows of a parquet file. 

I ran into this while trying the following:

`df = pd.read_parquet(path='filepath', nrows=10)`  # raised an error

I wanted to read just the first 10 rows into a pandas DataFrame using 
read_parquet (read_parquet can use pyarrow as one of its engines for reading 
parquet files). Since the parquet file is very large, loading all of it just to 
look at a few rows is wasteful. Could we add functionality to the engine for 
reading only the first n rows?
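
To make the request concrete, here is a minimal sketch of the behavior I have in mind. It assumes a pyarrow version that provides `ParquetFile.iter_batches`, which may not be available in the release this report was filed against. The names `take_first_rows` and `read_head` are hypothetical helpers for illustration, not existing pandas or pyarrow functions.

```python
def take_first_rows(batches, n):
    """Collect batches until n rows are gathered, trimming the last one.

    `batches` is any iterable of objects exposing `num_rows` and
    `slice(offset, length)`, such as pyarrow.RecordBatch.
    """
    taken = []
    remaining = n
    if remaining <= 0:
        return taken
    for batch in batches:
        # Keep the whole batch, or only as many rows as are still needed.
        piece = batch.slice(0, min(remaining, batch.num_rows))
        taken.append(piece)
        remaining -= piece.num_rows
        if remaining <= 0:
            break
    return taken


def read_head(path, n=10):
    """Read at most the first n rows of a parquet file into pandas.

    Hypothetical helper; assumes ParquetFile.iter_batches is available.
    """
    # Imported lazily so take_first_rows stays usable without pyarrow.
    import pyarrow as pa
    import pyarrow.parquet as pq

    parquet_file = pq.ParquetFile(path)
    # Stream record batches instead of materializing the whole file.
    batches = take_first_rows(parquet_file.iter_batches(batch_size=n), n)
    table = pa.Table.from_batches(batches, schema=parquet_file.schema_arrow)
    return table.to_pandas()
```

With something like this, `df = read_head('filepath', n=10)` would give the result I expected from `nrows=10`. Note that this only avoids loading the whole file into memory; the reader may still have to decode part of the first row group.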



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
