[ https://issues.apache.org/jira/browse/SPARK-31549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Weichen Xu updated SPARK-31549:
-------------------------------
    Issue Type: Bug  (was: Improvement)

> Pyspark SparkContext.cancelJobGroup does not work correctly
> -----------------------------------------------------------
>
>                 Key: SPARK-31549
>                 URL: https://issues.apache.org/jira/browse/SPARK-31549
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 2.4.5, 3.0.0
>            Reporter: Weichen Xu
>            Priority: Critical
>
> Pyspark SparkContext.cancelJobGroup does not work correctly. This issue has
> existed for a long time. It occurs because PySpark threads are not pinned to
> JVM threads when invoking Java-side methods, so every PySpark API that
> relies on Java thread-local variables misbehaves (including
> `sc.setLocalProperty`, `sc.cancelJobGroup`, `sc.setJobDescription`, and so
> on).
> This is a serious issue. Spark 3.0 adds an experimental PySpark
> 'PIN_THREAD' mode that addresses it, but the 'PIN_THREAD' mode has two
> problems:
> * It is disabled by default; an additional environment variable must be set
> to enable it.
> * It has a memory leak issue that has not been addressed yet.
> A number of projects, such as hyperopt-spark and spark-joblib, rely on the
> `sc.cancelJobGroup` API to stop running jobs from their code, so it is
> critical to fix this issue, and we hope the fix works under the default
> PySpark mode. An alternative approach is to implement methods like
> `rdd.setGroupAndCollect`.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
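The root cause described above can be illustrated without Spark at all. The following minimal Python sketch (all names here are illustrative stand-ins, not PySpark's actual implementation) simulates JVM-side thread-local job-group properties: when the "set" call and a later call are dispatched to different threads, the property is silently lost, which is exactly why thread-pinning matters.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the JVM side's thread-local job properties.
jvm_local = threading.local()

def set_job_group(group_id):
    # Stores the group on whichever thread happens to run this call.
    jvm_local.group = group_id

def get_job_group():
    # Only sees a value if this exact thread previously set one.
    return getattr(jvm_local, "group", None)

# Unpinned case: the set and the later call run on different threads,
# modeled here as two single-worker pools.
t1 = ThreadPoolExecutor(max_workers=1)
t2 = ThreadPoolExecutor(max_workers=1)
t1.submit(set_job_group, "my-group").result()
unpinned = t2.submit(get_job_group).result()  # property was lost

# Pinned case: both calls are routed to the same thread, so the
# thread-local property survives.
pinned = t1.submit(get_job_group).result()
```

Here `unpinned` comes back `None` while `pinned` comes back `"my-group"`, mirroring why `sc.cancelJobGroup` cannot find the group set by `sc.setJobGroup` unless PySpark pins each Python thread to a dedicated JVM thread.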