[ 
https://issues.apache.org/jira/browse/SPARK-51966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wojciech Szlachta updated SPARK-51966:
--------------------------------------
    Target Version/s:   (was: 4.0.0)

> Replace select.select() with select.poll() when running on POSIX os
> -------------------------------------------------------------------
>
>                 Key: SPARK-51966
>                 URL: https://issues.apache.org/jira/browse/SPARK-51966
>             Project: Spark
>          Issue Type: Improvement
>          Components: PySpark
>    Affects Versions: 4.0.0, 3.5.5
>            Reporter: Wojciech Szlachta
>            Priority: Minor
>
> On glibc-based Linux systems, {{select()}} can monitor only file descriptors
> numbered below {{FD_SETSIZE}} (1024).
> This is an unreasonably low limit for many modern applications.
> When running via {{pyspark}} we frequently observe:
> {code}
> Exception occurred during processing of request from ('127.0.0.1', 46334)
> Traceback (most recent call last):
>   File "/usr/lib/python3.11/socketserver.py", line 317, in _handle_request_noblock
>     self.process_request(request, client_address)
>   File "/usr/lib/python3.11/socketserver.py", line 348, in process_request
>     self.finish_request(request, client_address)
>   File "/usr/lib/python3.11/socketserver.py", line 361, in finish_request
>     self.RequestHandlerClass(request, client_address, self)
>   File "/usr/lib/python3.11/socketserver.py", line 755, in __init__
>     self.handle()
>   File "/usr/lib/python3.11/site-packages/pyspark/accumulators.py", line 293, in handle
>     poll(authenticate_and_accum_updates)
>   File "/usr/lib/python3.11/site-packages/pyspark/accumulators.py", line 266, in poll
>     r, _, _ = select.select([self.rfile], [], [], 1)
>               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> ValueError: filedescriptor out of range in select()
> {code}
> On POSIX systems {{poll()}} should be used instead of {{select()}}.
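
For illustration only, here is a minimal sketch of the kind of change the description suggests. It is not the actual patch to pyspark/accumulators.py; the helper name wait_readable and its parameters are hypothetical. It assumes the goal is to keep the one-second readiness check from the traceback while avoiding the FD_SETSIZE cap, falling back to select.select() on platforms where select.poll() is unavailable.

```python
import select
import socket


def wait_readable(fileobj, timeout_seconds):
    """Return True if fileobj has an event pending within timeout_seconds.

    Hypothetical helper, not part of PySpark.
    """
    if hasattr(select, "poll"):
        # poll() takes a list of descriptors rather than a fixed-size bitmap,
        # so descriptor numbers >= FD_SETSIZE are accepted.
        poller = select.poll()
        poller.register(fileobj, select.POLLIN)
        # poll() expects its timeout in milliseconds, unlike select().
        events = poller.poll(int(timeout_seconds * 1000))
        return len(events) > 0
    # Fallback for platforms without poll() (for example Windows).
    readable, _, _ = select.select([fileobj], [], [], timeout_seconds)
    return len(readable) > 0


if __name__ == "__main__":
    # A connected socket pair stands in for the accumulator connection.
    reader, writer = socket.socketpair()
    print(wait_readable(reader, 0.05))  # nothing has been sent yet
    writer.sendall(b"x")
    print(wait_readable(reader, 1))     # one byte is now waiting
    reader.close()
    writer.close()
```

Note that poll() may also report hang-up or error conditions on the descriptor, so a caller would still need to handle a read that returns no data.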


