[ https://issues.apache.org/jira/browse/ARROW-6469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Antoine Pitrou updated ARROW-6469:
----------------------------------
    Summary: [Python] HDFS documentation does not mention HDFS short circuit readings  (was: PyArrow HDFS documentation does not mention HDFS short circuit readings)

> [Python] HDFS documentation does not mention HDFS short circuit readings
> ------------------------------------------------------------------------
>
>                 Key: ARROW-6469
>                 URL: https://issues.apache.org/jira/browse/ARROW-6469
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>            Reporter: Paulo Roberto Cerioni
>            Priority: Minor
>              Labels: documentation
>
> Because PyArrow uses libhdfs underneath, files read from HDFS are expected
> to make use of short-circuit reads.
> However, the PyArrow documentation does not explain whether this feature is
> supported (and in which situations) or whether it works without any
> configuration.
> For instance, I'm interested in the use case in which we use the
> short-circuit feature to read some of the columns from a Parquet file located
> in HDFS into a dataframe.
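
To make the use case in the description concrete, here is a minimal sketch of reading a subset of Parquet columns from HDFS into a dataframe while asking the HDFS client for short-circuit reads. It is not taken from the issue and rests on several assumptions: it uses the newer `pyarrow.fs.HadoopFileSystem` interface and its `extra_conf` argument rather than whatever API the reporter was using; the helper names `short_circuit_conf` and `read_columns`, as well as the host, port, file path, column names and socket path in the usage comment, are hypothetical. The two property names are standard Hadoop client settings. Whether PyArrow actually ends up using short-circuit reads with them is exactly what the issue asks the documentation to state, so treat this as something to verify rather than as confirmed behavior.

```python
from typing import Dict, List


def short_circuit_conf(socket_path: str) -> Dict[str, str]:
    """Return the Hadoop client properties that request short-circuit reads.

    socket_path must match the DataNode's dfs.domain.socket.path setting.
    """
    return {
        "dfs.client.read.shortcircuit": "true",
        "dfs.domain.socket.path": socket_path,
    }


def read_columns(host: str, port: int, path: str,
                 columns: List[str], socket_path: str):
    """Read only the given columns of a Parquet file in HDFS into a dataframe."""
    # Imported lazily so short_circuit_conf() is usable without pyarrow.
    from pyarrow import fs
    import pyarrow.parquet as pq

    # Assumption: properties passed via extra_conf reach the libhdfs client.
    hdfs = fs.HadoopFileSystem(
        host, port, extra_conf=short_circuit_conf(socket_path)
    )
    table = pq.read_table(path, columns=columns, filesystem=hdfs)
    return table.to_pandas()


# Hypothetical usage (requires a reachable HDFS cluster and a Hadoop install):
# df = read_columns("namenode.example.org", 8020, "/data/events.parquet",
#                   ["user_id", "ts"], "/var/lib/hadoop-hdfs/dn_socket")
```

On the Hadoop side, short-circuit reads only take effect when the client runs on the same host as the DataNode holding the block, the DataNode is configured with the same domain socket path, and the native `libhadoop` library is available. None of that is controlled by PyArrow.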