[ 
https://issues.apache.org/jira/browse/SPARK-31549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiangrui Meng updated SPARK-31549:
----------------------------------
    Target Version/s: 3.0.0

> Pyspark SparkContext.cancelJobGroup do not work correctly
> ---------------------------------------------------------
>
>                 Key: SPARK-31549
>                 URL: https://issues.apache.org/jira/browse/SPARK-31549
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 2.4.5, 3.0.0
>            Reporter: Weichen Xu
>            Priority: Critical
>
> Pyspark SparkContext.cancelJobGroup do not work correctly. This is an issue 
> existing for a long time. This is because of pyspark thread didn't pinned to 
> jvm thread when invoking java side methods, which leads to all pyspark API 
> which related to java local thread variables do not work correctly. 
> (Including `sc.setLocalProperty`, `sc.cancelJobGroup`, `sc.setJobDescription` 
> and so on.)
> This is serious issue. Now there's an experimental pyspark 'PIN_THREAD' mode 
> added in spark-3.0 which address it, but the 'PIN_THREAD' mode exists two 
> issue:
> * It is disabled by default. We need to set additional environment variable 
> to enable it.
> * There's memory leak issue which haven't been addressed.
> Now there's a series of project like hyperopt-spark, spark-joblib which rely 
> on `sc.cancelJobGroup` API (use it to stop running jobs in their code). So it 
> is critical to address this issue and we hope it work under default pyspark 
> mode. An optional approach is implementing methods like 
> `rdd.setGroupAndCollect`.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

Reply via email to