kou commented on code in PR #44766:
URL: https://github.com/apache/arrow/pull/44766#discussion_r1864753423
##########
python/pyarrow/parquet/core.py:
##########
@@ -1169,7 +1169,13 @@ def _get_pandas_index_columns(keyvalues):
assumes directory names with key=value pairs like "/year=2009/month=11".
In addition, a scheme like "/2009/11" is also supported, in which case
you need to specify the field names or a full schema. See the
- ``pyarrow.dataset.partitioning()`` function for more details."""
+ ``pyarrow.dataset.partitioning()`` function for more details.
+partition_base_dir : str, optional
+ For the purposes of applying the partitioning, paths will be
+ stripped of the partition_base_dir. Files not matching the
+ partition_base_dir prefix will be skipped for partitioning discovery.
+ The ignored files will still be part of the Dataset, but will not
+ have partition information."""
Review Comment:
```suggestion
For the purposes of applying the partitioning, paths will be
stripped of the partition_base_dir. Files not matching the
partition_base_dir prefix will be skipped for partitioning discovery.
The ignored files will still be part of the Dataset, but will not
have partition information."""
```
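For illustration only, here is a rough sketch of the behavior the docstring describes, written in plain Python rather than against pyarrow itself. The helper name `hive_partition_info` and the example paths are hypothetical, and unlike pyarrow this sketch keeps partition values as strings; it only shows the prefix stripping and the "no partition information" case for files outside `partition_base_dir`.

```python
from pathlib import PurePosixPath


def hive_partition_info(path, partition_base_dir):
    """Parse key=value directory segments found below partition_base_dir.

    Hypothetical helper for illustration; this is not pyarrow's implementation.
    """
    try:
        # Strip the base directory before looking for partition segments.
        relative = PurePosixPath(path).relative_to(PurePosixPath(partition_base_dir))
    except ValueError:
        # The path does not start with partition_base_dir: the file is kept,
        # but it carries no partition information.
        return {}

    info = {}
    # Only directory segments can be partition fields; skip the file name.
    for segment in relative.parts[:-1]:
        key, sep, value = segment.partition("=")
        if sep:
            info[key] = value
    return info


# Made-up example paths, one inside and one outside the base directory.
paths = [
    "/data/sales/year=2009/month=11/part-0.parquet",
    "/data/other/year=2010/part-1.parquet",
]
for path in paths:
    print(path, hive_partition_info(path, "/data/sales"))
```

With `/data/sales` as the base directory, the first path yields `year` and `month` fields, while the second path falls outside the prefix and yields an empty mapping.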