[ https://issues.apache.org/jira/browse/ARROW-13317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Arun Joseph updated ARROW-13317:
--------------------------------
    Description:
The current documentation for [read_feather|https://arrow.apache.org/docs/python/generated/pyarrow.feather.read_feather.html] states the following:

*use_threads* (_bool__,_ _default True_) – Whether to parallelize reading using multiple threads.

if the underlying file uses compression, then multiple threads can still be spawned. The verbiage of the *use_threads* is ambiguous on whether the restriction on multiple threads is only for the conversion from pyarrow to the pandas dataframe vs the reading/decompression of the file itself which might spawn additional threads.

[set_cpu_count|http://arrow.apache.org/docs/python/generated/pyarrow.set_cpu_count.html#pyarrow.set_cpu_count] might be good to mention as a way to actually limit threads spawned

  was:
The current documentation for [read_feather|https://arrow.apache.org/docs/python/generated/pyarrow.feather.read_feather.html] states the following:

*use_threads* (_bool__,_ _default True_) – Whether to parallelize reading using multiple threads.

if the underlying file uses compression, then multiple threads can still be spawned. The verbiage of the *use_threads* is ambiguous on whether the restriction on multiple threads is only for the conversion from pyarrow to the pandas dataframe vs the reading/decompression of the file itself which might spawn additional threads.
[set_cpu_count|http://arrow.apache.org/docs/python/generated/pyarrow.set_cpu_count.html#pyarrow.set_cpu_count] might good to mention as well

> [Python] Improve documentation on what 'use_threads' does in 'read_feather'
> ---------------------------------------------------------------------------
>
>                 Key: ARROW-13317
>                 URL: https://issues.apache.org/jira/browse/ARROW-13317
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Python
>    Affects Versions: 4.0.1
>            Reporter: Arun Joseph
>            Priority: Trivial
>              Labels: documentation
>
> The current documentation for
> [read_feather|https://arrow.apache.org/docs/python/generated/pyarrow.feather.read_feather.html]
> states the following:
> *use_threads* (_bool__,_ _default True_) – Whether to parallelize reading
> using multiple threads.
> if the underlying file uses compression, then multiple threads can still be
> spawned. The verbiage of the *use_threads* is ambiguous on whether the
> restriction on multiple threads is only for the conversion from pyarrow to
> the pandas dataframe vs the reading/decompression of the file itself which
> might spawn additional threads.
> [set_cpu_count|http://arrow.apache.org/docs/python/generated/pyarrow.set_cpu_count.html#pyarrow.set_cpu_count]
> might be good to mention as a way to actually limit threads spawned

--
This message was sent by Atlassian Jira
(v8.3.4#803005)