Neha Singla created SPARK-39920:
-----------------------------------

             Summary: Interrupt on running spark job doesn't kill underneath spark job
                 Key: SPARK-39920
                 URL: https://issues.apache.org/jira/browse/SPARK-39920
             Project: Spark
          Issue Type: Bug
          Components: PySpark
    Affects Versions: 3.2.2
            Reporter: Neha Singla

If I start a long-running Spark job in a Jupyter notebook with the PySpark kernel and hit interrupt while the job is running, I get a KeyboardInterrupt message, but the underlying Spark job keeps running. Attaching my notebook file.

I tried to create my own launcher and register a different signal handler for the interrupt, but that gives a Py4J error.

{quote}
from signal import signal, SIGINT, SIG_IGN
from ipykernel.ipkernel import IPythonKernel
from functools import partial
from pyspark.sql import SparkSession
import os


class PyKernel(IPythonKernel):

    def pre_handler_hook(self):
        """Hook to execute before calling message handler"""
        self.log.info("PyKernel: Registering PreHandler Hook")
        # Install the custom SIGINT handler and remember the previous one.
        self.saved_sigint_handler = signal(SIGINT, self.signal_handler)

    def post_handler_hook(self):
        """Hook to execute after calling message handler"""
        self.log.info("PyKernel: Registering PostHandler Hook")
        # Restore the handler that was active before the message was handled.
        signal(SIGINT, self.saved_sigint_handler)

    def signal_handler(self, signum, frame):
        # Bound method, so it needs self; signum avoids shadowing signal().
        self.log.info("PyKernel: Registering Interrupt handler for PyKernel")
        # Ask Spark to cancel all running jobs, then interrupt the cell.
        SparkSession.builder.getOrCreate().sparkContext.cancelAllJobs()
        raise KeyboardInterrupt()
{quote}

Here is the error message:

{quote}ERROR:root:Exception while sending command.
Traceback (most recent call last):
  File "/mnt/aci-spark/app/apache-spark/python/lib/py4j-0.10.9.5-src.zip/py4j/clientserver.py", line 511, in send_command
    answer = smart_decode(self.stream.readline()[:-1])
RuntimeError: reentrant call inside <_io.BufferedReader name=71>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/mnt/aci-spark/app/apache-spark/python/lib/py4j-0.10.9.5-src.zip/py4j/java_gateway.py", line 1038, in send_command
    response = connection.send_command(command)
  File "/mnt/aci-spark/app/apache-spark/python/lib/py4j-0.10.9.5-src.zip/py4j/clientserver.py", line 540, in send_command
    "Error while sending or receiving", e, proto.ERROR_ON_RECEIVE)
py4j.protocol.Py4JNetworkError: Error while sending or receiving
ERROR:root:Exception while sending command.
Traceback (most recent call last):
  File "/mnt/aci-spark/app/apache-spark/python/lib/py4j-0.10.9.5-src.zip/py4j/clientserver.py", line 511, in send_command
    answer = smart_decode(self.stream.readline()[:-1])
  File "/app/.runtimes/python/lib/python3.7/socket.py", line 589, in readinto
    return self._sock.recv_into(b)
  File "/app/.runtimes/python/lib/python3.7/site-packages/kernel/pykernel.py", line 32, in signal_handler
    SparkSession.builder.getOrCreate().sparkContext.cancelAllJobs()
  File "/mnt/aci-spark/app/apache-spark/python/lib/pyspark.zip/pyspark/context.py", line 1195, in cancelAllJobs
    self._jsc.sc().cancelAllJobs()
  File "/mnt/aci-spark/app/apache-spark/python/lib/py4j-0.10.9.5-src.zip/py4j/java_gateway.py", line 1322, in __call__
    answer, self.gateway_client, self.target_id, self.name)
  File "/mnt/aci-spark/app/apache-spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 111, in deco
    return f(*a, **kw)
  File "/mnt/aci-spark/app/apache-spark/python/lib/py4j-0.10.9.5-src.zip/py4j/protocol.py", line 336, in get_return_value
    format(target_id, ".", name))
py4j.protocol.Py4JError: An error occurred while calling o134.sc

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/mnt/aci-spark/app/apache-spark/python/lib/py4j-0.10.9.5-src.zip/py4j/java_gateway.py", line 1038, in send_command
    response = connection.send_command(command)
  File "/mnt/aci-spark/app/apache-spark/python/lib/py4j-0.10.9.5-src.zip/py4j/clientserver.py", line 540, in send_command
    "Error while sending or receiving", e, proto.ERROR_ON_RECEIVE)
py4j.protocol.Py4JNetworkError: Error while sending or receiving{quote}

{quote}
---------------------------------------------------------------------------
Py4JError                                 Traceback (most recent call last)
/tmp/ipykernel_177/3554383046.py in <module>
----> 1 df1.join(df2, df1.id1 == df2.id2, 'inner').show(5, False)

/mnt/aci-spark/app/apache-spark/python/lib/pyspark.zip/pyspark/sql/dataframe.py in show(self, n, truncate, vertical)
    500                 "Parameter 'truncate={}' should be either bool or int.".format(truncate))
    501
--> 502             print(self._jdf.showString(n, int_truncate, vertical))
    503
    504     def __repr__(self):

/mnt/aci-spark/app/apache-spark/python/lib/py4j-0.10.9.5-src.zip/py4j/java_gateway.py in __call__(self, *args)
   1320         answer = self.gateway_client.send_command(command)
   1321         return_value = get_return_value(
-> 1322             answer, self.gateway_client, self.target_id, self.name)
   1323
   1324         for temp_arg in temp_args:

/mnt/aci-spark/app/apache-spark/python/lib/pyspark.zip/pyspark/sql/utils.py in deco(*a, **kw)
    109     def deco(*a, **kw):
    110         try:
--> 111             return f(*a, **kw)
    112         except py4j.protocol.Py4JJavaError as e:
    113             converted = convert_exception(e.java_exception)

/mnt/aci-spark/app/apache-spark/python/lib/py4j-0.10.9.5-src.zip/py4j/protocol.py in get_return_value(answer, gateway_client, target_id, name)
    334                 raise Py4JError(
    335                     "An error occurred while calling {0}{1}{2}".
--> 336                     format(target_id, ".", name))
    337             else:
    338                 type = answer[1]

Py4JError: An error occurred while calling o247.showString{quote}
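
A possible workaround, offered as an untested sketch rather than a confirmed fix: the "reentrant call inside <_io.BufferedReader>" error suggests that the signal handler re-enters the Py4J socket read that the interrupted call was already blocked on. The handler could instead only enqueue a request and let a separate thread issue the cancel call. DeferredCanceller, cancel_fn, and install are hypothetical names. The sketch uses only the standard library; passing lambda: sc.cancelAllJobs() (with sc an existing SparkContext) as cancel_fn is an assumption that has not been verified against Spark 3.2.2.

```python
import queue
import signal
import threading


class DeferredCanceller:
    """Run a cancel callback on a worker thread instead of inside the signal handler."""

    def __init__(self, cancel_fn):
        # cancel_fn is a hypothetical zero-argument callable,
        # e.g. lambda: sc.cancelAllJobs() for an existing SparkContext sc.
        self._cancel_fn = cancel_fn
        # SimpleQueue.put() is documented as reentrant, so it can be
        # called from a signal handler without risking a deadlock.
        self._requests = queue.SimpleQueue()
        self.completed = threading.Event()  # set once a cancel call has returned
        self.last_error = None              # last exception raised by cancel_fn, if any
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        # Worker loop: block until the handler enqueues a request, then cancel.
        while True:
            self._requests.get()
            try:
                self._cancel_fn()
            except Exception as exc:
                self.last_error = exc
            finally:
                self.completed.set()

    def handler(self, signum, frame):
        # Only enqueue here; the cancel call itself runs on the worker thread.
        self._requests.put(signum)
        raise KeyboardInterrupt()

    def install(self):
        # Must be called from the main thread; returns the previous SIGINT handler.
        return signal.signal(signal.SIGINT, self.handler)
```

In the PyKernel class above, pre_handler_hook could call install() on one shared DeferredCanceller and post_handler_hook could restore the saved handler as before. Whether the worker thread's cancel call reaches the JVM while the main thread is still blocked depends on how Py4J assigns connections to threads, so this would need to be checked on the affected setup.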