[ https://issues.apache.org/jira/browse/SPARK-51966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon resolved SPARK-51966.
----------------------------------
Fix Version/s: 4.2.0
Resolution: Fixed
Issue resolved by pull request 53306
[https://github.com/apache/spark/pull/53306]
> Replace select.select() with select.poll() when running on POSIX os
> -------------------------------------------------------------------
>
> Key: SPARK-51966
> URL: https://issues.apache.org/jira/browse/SPARK-51966
> Project: Spark
> Issue Type: Improvement
> Components: PySpark
> Affects Versions: 3.5.5, 4.0.0
> Reporter: Wojciech Szlachta
> Assignee: Wojciech Szlachta
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.2.0
>
>
> On glibc-based Linux systems, {{select()}} can monitor only file descriptor
> numbers that are less than {{FD_SETSIZE}} (1024).
> This is an unreasonably low limit for many modern applications.
> When running via {{pyspark}} we frequently observe:
> {code}
> Exception occurred during processing of request from ('127.0.0.1', 46334)
> Traceback (most recent call last):
>   File "/usr/lib/python3.11/socketserver.py", line 317, in _handle_request_noblock
>     self.process_request(request, client_address)
>   File "/usr/lib/python3.11/socketserver.py", line 348, in process_request
>     self.finish_request(request, client_address)
>   File "/usr/lib/python3.11/socketserver.py", line 361, in finish_request
>     self.RequestHandlerClass(request, client_address, self)
>   File "/usr/lib/python3.11/socketserver.py", line 755, in __init__
>     self.handle()
>   File "/usr/lib/python3.11/site-packages/pyspark/accumulators.py", line 293, in handle
>     poll(authenticate_and_accum_updates)
>   File "/usr/lib/python3.11/site-packages/pyspark/accumulators.py", line 266, in poll
>     r, _, _ = select.select([self.rfile], [], [], 1)
>               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> ValueError: filedescriptor out of range in select()
> {code}
> On POSIX systems, {{poll()}} should be used instead of {{select()}}, since
> {{poll()}} is not subject to the {{FD_SETSIZE}} limit.
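
For illustration only, here is a minimal sketch of the kind of change the description suggests. It is not the code from the pull request; the helper name wait_readable and its timeout parameter are hypothetical. It assumes a readiness check with a one-second timeout, as in the select.select() call in the traceback, and falls back to select.select() on platforms where select.poll is unavailable.

```python
import select
import socket


def wait_readable(sock, timeout_s=1.0):
    """Return True if `sock` reports an event within `timeout_s` seconds.

    Hypothetical helper: prefers poll(), which takes file descriptors
    directly and so is not bound by FD_SETSIZE.
    """
    if hasattr(select, "poll"):
        poller = select.poll()
        poller.register(sock, select.POLLIN)
        # poll() expects its timeout in milliseconds, not seconds.
        events = poller.poll(int(timeout_s * 1000))
        # Any reported event (including hang-up or error) is treated as
        # "ready", so the caller's next read will surface the condition.
        return bool(events)
    # Fallback for platforms without poll(), for example Windows.
    readable, _, _ = select.select([sock], [], [], timeout_s)
    return bool(readable)
```

A caller could then replace a direct select.select([f], [], [], 1) check with wait_readable(f, 1.0); any object with a fileno() method can be registered with poll().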