[ 
https://issues.apache.org/jira/browse/ARROW-13317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17379430#comment-17379430
 ] 

Weston Pace commented on ARROW-13317:
-------------------------------------

Also, it probably isn't clear from my previous comment, but my motivation here is 
that, instead of improving the documentation, a better change might be to simply 
wire up use_threads so that use_threads=False does in fact control whether 
threads are used for the decompression.  Then the existing documentation 
would be fine.  This would be more in line with the other readers (e.g. parquet & 
CSV).
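
For illustration, a rough usage-level sketch (file names are hypothetical) of what 
that would mean for a caller, alongside the Parquet and CSV readers whose 
use_threads option already covers the read itself:

    import pyarrow.csv as csv
    import pyarrow.feather as feather
    import pyarrow.parquet as pq

    # Parquet and CSV readers already honour use_threads for the read:
    table = pq.read_table("data.parquet", use_threads=False)
    table = csv.read_csv("data.csv",
                         read_options=csv.ReadOptions(use_threads=False))

    # The proposal is that this call would likewise keep the read, the
    # decompression of compressed Feather buffers, and the conversion to
    # pandas on the calling thread:
    df = feather.read_feather("data.feather", use_threads=False)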

> [Python] Improve documentation on what 'use_threads' does in 'read_feather'
> ---------------------------------------------------------------------------
>
>                 Key: ARROW-13317
>                 URL: https://issues.apache.org/jira/browse/ARROW-13317
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Python
>    Affects Versions: 4.0.1
>            Reporter: Arun Joseph
>            Priority: Trivial
>              Labels: documentation
>
> The current documentation for 
> [read_feather|https://arrow.apache.org/docs/python/generated/pyarrow.feather.read_feather.html]
>  states the following:
> *use_threads* (bool, default True) – Whether to parallelize reading using 
> multiple threads.
> If the underlying file uses compression, multiple threads can still be 
> spawned. The wording of *use_threads* is ambiguous as to whether setting it to 
> False restricts threading only for the conversion from the pyarrow Table to 
> the pandas DataFrame, or also for the reading/decompression of the file 
> itself, which might spawn additional threads.
> [set_cpu_count|http://arrow.apache.org/docs/python/generated/pyarrow.set_cpu_count.html#pyarrow.set_cpu_count]
>  might be good to mention as a way to actually limit the threads spawned (see 
> the sketch below).
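
For reference, a minimal sketch of the set_cpu_count workaround mentioned above 
(the file name is hypothetical; the previous pool size is restored afterwards):

    import pyarrow as pa
    import pyarrow.feather as feather

    # Shrink Arrow's global CPU thread pool so decompression cannot fan out
    # across extra threads, then restore it after the read.
    original = pa.cpu_count()
    pa.set_cpu_count(1)
    try:
        df = feather.read_feather("data.feather", use_threads=False)
    finally:
        pa.set_cpu_count(original)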



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
