[ 
https://issues.apache.org/jira/browse/SPARK-47488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-47488:
----------------------------------
    Affects Version/s: 4.0.0
                           (was: 3.2.0)

> [k8s]Driver stuck when thread pool is not shut down 
> ----------------------------------------------------
>
>                 Key: SPARK-47488
>                 URL: https://issues.apache.org/jira/browse/SPARK-47488
>             Project: Spark
>          Issue Type: Improvement
>          Components: k8s
>    Affects Versions: 4.0.0
>            Reporter: Zhou Tong
>            Priority: Major
>              Labels: pull-request-available
>
> The app example:
>  
> {code:java}
> import java.util.concurrent.Executors
> import org.apache.spark.sql.SparkSession
>
> object SparkTest {
>   def main(args: Array[String]): Unit = {
>     val spark = SparkSession.builder()
>       .appName("zt-test")
>       .config("spark.logConf", true)
>       .getOrCreate()
>     spark.sparkContext.setLogLevel("INFO")
>     // The pool's worker threads are non-daemon, and the pool is never shut down.
>     val threadPool = Executors.newFixedThreadPool(5)
>     for (i <- 0 until 10) {
>       threadPool.execute(new Task("Task " + i))
>     }
>     val rdd = spark.sparkContext.makeRDD(Seq(1, 2, 4))
>     val res = rdd.collect()
>   }
> }
>
> // Prints the task name and the worker thread that runs it.
> class Task(private var name: String) extends Runnable {
>   override def run(): Unit = {
>     System.out.println("Executing task: " + name + " by " + Thread.currentThread.getName)
>   }
> }{code}
>  
> When the app runs on YARN in cluster mode, the driver shuts down even if the 
> thread pool is not closed, so the container still stops. 
> However, when running on k8s, if the thread pool is not closed, the driver pod 
> gets stuck and does not release its resources. 
> In yarn-cluster mode, the ApplicationMaster's run is wrapped in 'System.exit', 
> like this:
> {code:java}
> ugi.doAs(new PrivilegedExceptionAction[Unit]() {
>   override def run(): Unit = System.exit(master.run()) 
> }) {code}
> So even when threads are parked, the exit code is still passed to System#exit. In 
> this situation, the AM can stop.
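
To illustrate why parked pool threads matter here: the worker threads created by `Executors.newFixedThreadPool` are non-daemon by default, so an idle pool keeps the JVM alive after `main` returns unless something calls `System.exit`. The following is a minimal sketch of that JVM behaviour only; the object and method names are hypothetical and are not part of Spark.

```scala
import java.util.concurrent.{Callable, Executors, TimeUnit}

// Hypothetical demo object (not part of Spark).
object NonDaemonPoolDemo {
  // Reports whether a worker thread of a default fixed thread pool is a daemon thread.
  def workerIsDaemon(): Boolean = {
    val pool = Executors.newFixedThreadPool(1)
    try {
      // The task runs on a pool worker and returns that worker's daemon flag.
      pool.submit(new Callable[Boolean] {
        override def call(): Boolean = Thread.currentThread().isDaemon
      }).get(5, TimeUnit.SECONDS)
    } finally {
      // Without this call the idle, non-daemon worker would keep the JVM alive.
      pool.shutdown()
    }
  }

  def main(args: Array[String]): Unit = {
    println(s"worker is daemon: ${workerIsDaemon()}")
  }
}
```

Because the flag is `false` for the default thread factory, a driver JVM that is not terminated through `System.exit` will wait for such threads indefinitely.
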
> When the driver runs on k8s in client mode, if it encounters an exception and the 
> thread pool is not closed, the driver pod may get stuck.
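
Until the driver itself handles this case, one possible application-side workaround is to make sure the pool is always shut down, including when an exception is thrown. The sketch below assumes a hypothetical helper named `withPool`; it is not the change proposed for this issue and is not a Spark API.

```scala
import java.util.concurrent.{ExecutorService, Executors, TimeUnit}

// Hypothetical helper object (not part of Spark).
object PoolCleanup {
  // Runs `body` with a fixed thread pool and always shuts the pool down afterwards,
  // even if `body` throws, so no non-daemon worker is left to keep the JVM alive.
  def withPool[A](size: Int)(body: ExecutorService => A): A = {
    val pool = Executors.newFixedThreadPool(size)
    try {
      body(pool)
    } finally {
      pool.shutdown()
      // Give already submitted tasks a bounded amount of time to finish.
      pool.awaitTermination(10, TimeUnit.SECONDS)
    }
  }
}
```

In the example app above, the loop that calls `threadPool.execute` would go inside the `withPool` body; an alternative with the same effect is to pass a `ThreadFactory` that marks its threads as daemon.
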


