[ 
https://issues.apache.org/jira/browse/SPARK-27927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16884791#comment-16884791
 ] 

Stavros Kontopoulos edited comment on SPARK-27927 at 7/15/19 12:35 AM:
-----------------------------------------------------------------------

I was able to reproduce it easily with 2.4.3.

I used this tool [https://github.com/jglick/jkillthread] to kill the 
dag-scheduler-event-loop thread and then the remaining OkHttp thread:

19/07/15 00:12:06 INFO StateStoreCoordinatorRef: Registered StateStoreCoordinator endpoint
Killing "OkHttp https://kubernetes.default.svc/..."
Did not find "OkHttp https://kubernetes.default.svc/..."
Killing "dag-scheduler-event-loop"
Killing "OkHttp WebSocket https://kubernetes.default.svc/..."
Exception in thread "OkHttp WebSocket https://kubernetes.default.svc/..." java.lang.IllegalMonitorStateException
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.signal(AbstractQueuedSynchronizer.java:1939)
    at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1103)
    at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
    at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Killing "OkHttp WebSocket https://kubernetes.default.svc/..."
Exception in thread "OkHttp WebSocket https://kubernetes.default.svc/..." java.lang.IllegalMonitorStateException

Unfortunately I can't kill the latter, as another one is created. Anyway, that 
means that this is just another case of 
https://issues.apache.org/jira/browse/SPARK-27812 .

Calling spark.stop() stops the k8s client, and everything then finishes as expected.
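
A minimal sketch of that workaround, applied to the reproduction script from the issue: stop the session explicitly once the work is done, even if the work raises. The helper name `run_then_stop` and its `session_factory` and `job` parameters are hypothetical, introduced only for this illustration; the pyspark calls are the ones already used in the issue and in this comment. Whether an explicit stop is enough in every deployment is an assumption, not something verified here.

```python
def run_then_stop(session_factory, job):
    """Run job(spark), then always call spark.stop().

    session_factory and job are hypothetical parameters for this sketch.
    """
    spark = session_factory()
    try:
        return job(spark)
    finally:
        # Per the comment above, stopping the session also stops the
        # k8s client, which lets the driver finish.
        spark.stop()


def hello_world(spark):
    # Same first output line as the reproduction script in the issue.
    message = 'Our Spark version is {}'.format(spark.version)
    print(message)
    return message


def main():
    # Imported lazily so the helper above can be exercised without pyspark.
    from pyspark.sql import SparkSession

    run_then_stop(
        lambda: SparkSession.builder.appName('hello_world').getOrCreate(),
        hello_world,
    )
```

Calling `main()` at the end of the driver script would run the job and stop the session on the way out. The `try`/`finally` is there so that a failing job still stops the session.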



> driver pod hangs with pyspark 2.4.3 and master on kubernetes
> ------------------------------------------------------------
>
>                 Key: SPARK-27927
>                 URL: https://issues.apache.org/jira/browse/SPARK-27927
>             Project: Spark
>          Issue Type: Bug
>          Components: Kubernetes, PySpark
>    Affects Versions: 3.0.0, 2.4.3
>         Environment: k8s 1.11.9
> spark 2.4.3 and master branch.
>            Reporter: Edwin Biemond
>            Priority: Major
>         Attachments: driver_threads.log, executor_threads.log
>
>
> When we run a simple PySpark script on Spark 2.4.3 or 3.0.0, the driver pod 
> hangs and never calls the shutdown hook. 
> {code:python}
> #!/usr/bin/env python
> from __future__ import print_function
> import os
> import os.path
> import sys
> # Are we really in Spark?
> from pyspark.sql import SparkSession
> spark = SparkSession.builder.appName('hello_world').getOrCreate()
> print('Our Spark version is {}'.format(spark.version))
> print('Spark context information: {} parallelism={} python version={}'.format(
>     str(spark.sparkContext),
>     spark.sparkContext.defaultParallelism,
>     spark.sparkContext.pythonVer
> ))
> {code}
> When we run this on Kubernetes, the driver and executor just hang. We do 
> see the output of the Python script: 
> {noformat}
> bash-4.2# cat stdout.log
> Our Spark version is 2.4.3
> Spark context information: <SparkContext master=k8s://https://kubernetes.default.svc:443 appName=hello_world> parallelism=2 python version=3.6
> {noformat}
> What works:
>  * a simple Python script with only a print works fine on 2.4.3 and 3.0.0
>  * the same setup on 2.4.0
>  * 2.4.3 spark-submit with the above PySpark script