[ 
https://issues.apache.org/jira/browse/SPARK-27927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16884675#comment-16884675
 ] 

Stavros Kontopoulos edited comment on SPARK-27927 at 7/14/19 1:50 PM:
----------------------------------------------------------------------

That call, among others, creates a SparkContext if one does not already exist. 
The SparkContext starts the DAG scheduler thread, which starts this eventLoop 
thread. We have the following facts: a) a non-daemon thread is running due to 
https://issues.apache.org/jira/browse/SPARK-27812; b) a daemon thread is 
blocked, which could cause issues 
([https://meteatamel.wordpress.com/2012/05/22/when-a-daemon-thread-is-not-so-daemon/]); 
c) no shutdown hook was run although main has exited, because the JVM cannot 
exit.
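
To make facts (a) and (c) concrete, here is a small analogy in Python rather than JVM or Spark code. CPython happens to treat daemon and non-daemon threads in a similar way at shutdown: it waits for non-daemon threads before running atexit handlers, much as the JVM waits for non-daemon threads before running shutdown hooks. The helper name `run_child` and the messages are made up for this sketch.

```python
import subprocess
import sys
import textwrap

# Child program (illustrative only): a worker thread outlives main, and an
# exit hook records when the interpreter is finally allowed to shut down.
CHILD = textwrap.dedent("""
    import atexit, sys, threading, time

    def worker():
        time.sleep(float(sys.argv[2]))
        print("worker finished", flush=True)

    atexit.register(lambda: print("exit hook ran", flush=True))
    threading.Thread(target=worker, daemon=(sys.argv[1] == "daemon")).start()
    print("main exited", flush=True)
""")


def run_child(kind, sleep_seconds):
    """Run the child with a 'daemon' or 'non-daemon' worker; return its output lines."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD, kind, str(sleep_seconds)],
        capture_output=True, text=True, timeout=30, check=True,
    )
    return result.stdout.splitlines()


if __name__ == "__main__":
    # Non-daemon worker: the exit hook has to wait until the worker is done.
    print(run_child("non-daemon", 0.5))
    # Daemon worker: the process exits without waiting for the worker.
    print(run_child("daemon", 5))
```

With a non-daemon worker the exit hook only runs after the worker finishes; if that worker never finished, the hook would never run, which is the situation described in (c). With a daemon worker the process exits right after main.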

I would start by commenting out 
[https://github.com/apache/spark/blob/v2.4.0/core/src/main/scala/org/apache/spark/util/EventLoop.scala#L47], 
then rebuild and re-run the job. Since this is a dummy job with no actions, the 
change does not matter. If the JVM still does not exit, then the only 
explanation is that https://issues.apache.org/jira/browse/SPARK-27812 prevents 
it from exiting. If it does exit, it could mean that for some reason in 2.4.0 
the EventLoop does not have time to block because things move faster (we can 
show that by adding logging).
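
Before rebuilding, one cheap check is to list the non-daemon threads in a thread dump, since those are the ones that can keep the JVM alive. The sketch below assumes the dump is in the usual HotSpot jstack text format, where Java thread headers look like `"name" #N [daemon] prio=...`; I have not verified that the attached driver_threads.log uses this format. The function name and the sample dump are invented for illustration and are not taken from the attachment.

```python
import re

# Java thread header in a HotSpot jstack-style dump, e.g.
#   "main" #1 prio=5 os_prio=0 tid=... nid=... runnable
# Daemon threads carry the word "daemon" right after the thread number.
HEADER = re.compile(r'^"(?P<name>.+)" #\d+ (?P<daemon>daemon )?prio=\d+')
STATE = re.compile(r'^\s*java\.lang\.Thread\.State: (?P<state>\w+)')


def non_daemon_threads(dump_text):
    """Return (name, state) for each non-daemon Java thread in the dump."""
    found = []
    current = None  # non-daemon thread whose state line is still expected
    for line in dump_text.splitlines():
        header = HEADER.match(line)
        if header:
            if current is not None:
                found.append((current, "UNKNOWN"))
            current = None if header.group("daemon") else header.group("name")
            continue
        state = STATE.match(line)
        if state and current is not None:
            found.append((current, state.group("state")))
            current = None
    if current is not None:
        found.append((current, "UNKNOWN"))
    return found


# Made-up sample, NOT copied from driver_threads.log.
SAMPLE_DUMP = '''\
"dag-scheduler-event-loop" #40 daemon prio=5 os_prio=0 tid=0x01 nid=0x10 waiting on condition
   java.lang.Thread.State: WAITING (parking)

"example-non-daemon-thread" #41 prio=5 os_prio=0 tid=0x02 nid=0x11 waiting on condition
   java.lang.Thread.State: TIMED_WAITING (parking)

"VM Thread" os_prio=0 tid=0x03 nid=0x12 runnable
'''

if __name__ == "__main__":
    for name, state in non_daemon_threads(SAMPLE_DUMP):
        print(name, state)
```

JVM-internal threads without a `#N` number (such as "VM Thread") are skipped on purpose, since they do not hold up JVM exit. To try it on the real attachment, read the file and pass its text to `non_daemon_threads`.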



> driver pod hangs with pyspark 2.4.3 and master on kubernetes
> ------------------------------------------------------------
>
>                 Key: SPARK-27927
>                 URL: https://issues.apache.org/jira/browse/SPARK-27927
>             Project: Spark
>          Issue Type: Bug
>          Components: Kubernetes, PySpark
>    Affects Versions: 3.0.0, 2.4.3
>         Environment: k8s 1.11.9
> spark 2.4.3 and master branch.
>            Reporter: Edwin Biemond
>            Priority: Major
>         Attachments: driver_threads.log, executor_threads.log
>
>
> When we run a simple pyspark script on Spark 2.4.3 or 3.0.0, the driver pod 
> hangs and never calls the shutdown hook. 
> {code:java}
> #!/usr/bin/env python
> from __future__ import print_function
> import os
> import os.path
> import sys
> # Are we really in Spark?
> from pyspark.sql import SparkSession
> spark = SparkSession.builder.appName('hello_world').getOrCreate()
> print('Our Spark version is {}'.format(spark.version))
> print('Spark context information: {} parallelism={} python version={}'.format(
>     str(spark.sparkContext),
>     spark.sparkContext.defaultParallelism,
>     spark.sparkContext.pythonVer
> ))
> {code}
> When we run this on Kubernetes, the driver and executor just hang. We do 
> see the output of this Python script: 
> {noformat}
> bash-4.2# cat stdout.log
> Our Spark version is 2.4.3
> Spark context information: <SparkContext 
> master=k8s://https://kubernetes.default.svc:443 appName=hello_world> 
> parallelism=2 python version=3.6{noformat}
> What works:
>  * a simple Python script with just a print works fine on 2.4.3 and 3.0.0
>  * the same setup on 2.4.0
>  * 2.4.3 spark-submit with the above pyspark script



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

