Sanchit created ARROW-4143:
------------------------------
Summary: Skip rows while reading parquet file
Key: ARROW-4143
URL: https://issues.apache.org/jira/browse/ARROW-4143
Project: Apache Arrow
Issue Type: Improvement
Components: Developer Tools
Reporter: Sanchit
Is there any functionality in pyarrow that allows reading a file partially,
for example reading only the first 10 rows of a parquet file?
I ran into this situation while trying the following:
`df = pd.read_parquet(path='filepath', nrows=10)`  # raised an error
I wanted to read just the first 10 rows into a pandas DataFrame using
read_parquet (read_parquet uses pyarrow as one of its engines for reading
parquet files). Since the parquet file is very large, is there any
functionality we could add to the engine to read only the first n rows?