[ https://issues.apache.org/jira/browse/SPARK-51966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wojciech Szlachta updated SPARK-51966:
--------------------------------------
    Target Version/s:   (was: 4.0.0)

> Replace select.select() with select.poll() when running on POSIX os
> -------------------------------------------------------------------
>
>                 Key: SPARK-51966
>                 URL: https://issues.apache.org/jira/browse/SPARK-51966
>             Project: Spark
>          Issue Type: Improvement
>          Components: PySpark
>    Affects Versions: 4.0.0, 3.5.5
>            Reporter: Wojciech Szlachta
>            Priority: Minor
>
> On glibc based Linux systems {{select()}} can monitor only file descriptor
> numbers that are less than {{FD_SETSIZE}} (1024).
> This is an unreasonably low limit for many modern applications.
> When running via {{pyspark}} we frequently observe:
> {code}
> Exception occurred during processing of request from ('127.0.0.1', 46334)
> Traceback (most recent call last):
>   File "/usr/lib/python3.11/socketserver.py", line 317, in _handle_request_noblock
>     self.process_request(request, client_address)
>   File "/usr/lib/python3.11/socketserver.py", line 348, in process_request
>     self.finish_request(request, client_address)
>   File "/usr/lib/python3.11/socketserver.py", line 361, in finish_request
>     self.RequestHandlerClass(request, client_address, self)
>   File "/usr/lib/python3.11/socketserver.py", line 755, in __init__
>     self.handle()
>   File "/usr/lib/python3.11/site-packages/pyspark/accumulators.py", line 293, in handle
>     poll(authenticate_and_accum_updates)
>   File "/usr/lib/python3.11/site-packages/pyspark/accumulators.py", line 266, in poll
>     r, _, _ = select.select([self.rfile], [], [], 1)
>               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> ValueError: filedescriptor out of range in select()
> {code}
> On POSIX systems {{poll()}} should be used instead of {{select()}}.
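A minimal sketch of the proposed change (not the actual Spark patch): the {{select.select()}} call in the accumulator server's poll loop can be replaced with a {{select.poll()}} object, which registers file descriptors individually and is therefore not bound by {{FD_SETSIZE}}. The helper name {{wait_readable}} is hypothetical, used here only for illustration; note that {{poll()}} takes its timeout in milliseconds, while {{select()}} takes seconds.

{code}
import select
import socket

def wait_readable(fileobj, timeout_s=1.0):
    """Wait until fileobj is readable, like
    `r, _, _ = select.select([fileobj], [], [], timeout_s)`,
    but using select.poll(), which has no FD_SETSIZE limit.
    Returns True if readable within the timeout."""
    poller = select.poll()
    poller.register(fileobj.fileno(), select.POLLIN)
    # poll() expects the timeout in milliseconds
    events = poller.poll(timeout_s * 1000)
    return bool(events)

# Demo with a connected socket pair (POSIX only)
a, b = socket.socketpair()
print(wait_readable(a, 0.1))  # nothing buffered yet
b.send(b"x")
print(wait_readable(a, 0.1))  # data is now available
a.close()
b.close()
{code}

{{select.poll()}} is available on all POSIX systems but not on Windows, so a fallback to {{select.select()}} (or a check via {{hasattr(select, "poll")}}) would still be needed on non-POSIX platforms.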
--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org