Xi Chen created SPARK-49479:
-------------------------------

             Summary: Non-daemon Timer prevents Spark driver JVM from stopping
                 Key: SPARK-49479
                 URL: https://issues.apache.org/jira/browse/SPARK-49479
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 3.5.2
            Reporter: Xi Chen

When using [Spark Torch Distributor|https://spark.apache.org/docs/latest/api/python/reference/api/pyspark.ml.torch.distributor.TorchDistributor.html], the Spark driver pod can keep running after {_}spark.stop(){_} is called. Although the Spark Context has been shut down, the JVM is still running.

The cause is a non-daemon Timer thread named {_}BarrierCoordinator barrier epoch increment timer{_}, which prevents the driver JVM from exiting.

This issue is fixed in the master branch, as a side effect, by [https://github.com/apache/spark/pull/44718/files#diff-c2ca635ca0080bea12bcb5e25272a830019b3b150fc6c1cee0d268e0c12b82ceR82].

We should backport SPARK-46895 and SPARK-46698 to branch-3.5 to fix this.
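
For reference, here is a minimal standalone sketch of the mechanism described above. It is not Spark code: the class name {{TimerDaemonDemo}}, the helper {{timerThreadIsDaemon}} and the thread names are made up for illustration. It only uses {{java.util.Timer}}, whose worker thread starts when the Timer is constructed. A JVM exits on its own only once all non-daemon threads have finished, so a Timer created as non-daemon and never cancelled keeps the process alive.

```java
import java.util.Timer;

public class TimerDaemonDemo {

    // Hypothetical helper: creates a Timer with the requested daemon flag,
    // finds its worker thread by name, and reports whether that thread is a
    // daemon. The Timer is cancelled before returning so the demo can exit.
    public static boolean timerThreadIsDaemon(String threadName, boolean createAsDaemon) {
        // The Timer's worker thread is started by the constructor.
        Timer timer = new Timer(threadName, createAsDaemon);
        try {
            for (Thread t : Thread.getAllStackTraces().keySet()) {
                if (t.getName().equals(threadName)) {
                    return t.isDaemon();
                }
            }
            throw new IllegalStateException("timer thread not found: " + threadName);
        } finally {
            // Without cancel(), a non-daemon timer thread would keep the JVM alive.
            timer.cancel();
        }
    }

    public static void main(String[] args) {
        System.out.println("non-daemon timer: daemon="
                + timerThreadIsDaemon("demo-non-daemon-timer", false));
        System.out.println("daemon timer: daemon="
                + timerThreadIsDaemon("demo-daemon-timer", true));
    }
}
```

In general, either creating the Timer as a daemon or calling {{cancel()}} on shutdown lets the JVM exit. How the master branch actually addresses this is in the pull request linked above.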