Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21356#discussion_r189702478
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala ---
    @@ -130,7 +129,11 @@ private class AsyncEventQueue(val name: String, conf: SparkConf, metrics: LiveLi
           eventCount.incrementAndGet()
           eventQueue.put(POISON_PILL)
         }
    -    dispatchThread.join()
    +    // this thread might be trying to stop itself as part of error handling -- we can't join
    +    // in that case.
    +    if (Thread.currentThread() != dispatchThread) {
    --- End diff --
    
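    It does still happen, so we need this check.  We see the interrupt in `postToAll`, which runs on the queue's own dispatch thread.  If posting to a listener fails, we call `removeListenerOnError`.  If that leaves the queue empty, we stop the queue from that same thread, so `stop()` ends up calling `join()` on the thread it is running on.  The thread dump below shows the queue thread stuck in exactly that `join()`.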
    
    
    ```
    "spark-listener-group-eventLog" #20 daemon prio=5 os_prio=31 tid=0x00007f831379e800 nid=0x6303 in Object.wait() [0x0000000129226000]
       java.lang.Thread.State: WAITING (on object monitor)
            at java.lang.Object.wait(Native Method)
            - waiting on <0x000000078047ae28> (a org.apache.spark.scheduler.AsyncEventQueue$$anon$1)
            at java.lang.Thread.join(Thread.java:1245)
            - locked <0x000000078047ae28> (a org.apache.spark.scheduler.AsyncEventQueue$$anon$1)
            at java.lang.Thread.join(Thread.java:1319)
            at org.apache.spark.scheduler.AsyncEventQueue.stop(AsyncEventQueue.scala:135)
            at org.apache.spark.scheduler.LiveListenerBus$$anonfun$removeListener$2.apply(LiveListenerBus.scala:123)
            at org.apache.spark.scheduler.LiveListenerBus$$anonfun$removeListener$2.apply(LiveListenerBus.scala:121)
            at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
            at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
            at org.apache.spark.scheduler.LiveListenerBus.removeListener(LiveListenerBus.scala:121)
            - locked <0x0000000780475fe8> (a org.apache.spark.scheduler.LiveListenerBus)
            at org.apache.spark.scheduler.AsyncEventQueue.removeListenerOnError(AsyncEventQueue.scala:196)
            at org.apache.spark.scheduler.AsyncEventQueue.removeListenerOnError(AsyncEventQueue.scala:37)
            at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:101)
    ```
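
    For anyone who wants to see the self-join in isolation, here is a minimal sketch of the guard outside of Spark.  `SketchQueue`, `stopQueue`, and `awaitFinished` are made-up names for illustration, not Spark's classes or methods; the only thing taken from the diff is the `Thread.currentThread() != dispatchThread` check.  The dispatch thread calls the stop method on itself, roughly the way error handling does in the trace above.

    ```scala
    import java.util.concurrent.{CountDownLatch, TimeUnit}

    // Hypothetical stand-in for a queue that owns its dispatch thread (not Spark's class).
    final class SketchQueue {
      private val finished = new CountDownLatch(1)

      private val dispatchThread: Thread = new Thread(new Runnable {
        override def run(): Unit = {
          // Simulates error handling on the dispatch thread that ends in a stop call.
          stopQueue()
          finished.countDown()
        }
      }, "sketch-dispatch")
      dispatchThread.setDaemon(true)

      def start(): Unit = dispatchThread.start()

      def stopQueue(): Unit = {
        // join() on the current thread waits for that thread to die, which cannot
        // happen while it is blocked inside join(), so skip the join in that case.
        if (Thread.currentThread() != dispatchThread) {
          dispatchThread.join()
        }
      }

      // True if the dispatch thread ran to completion within the timeout.
      def awaitFinished(timeoutMs: Long): Boolean =
        finished.await(timeoutMs, TimeUnit.MILLISECONDS)
    }

    object SelfJoinSketch {
      def main(args: Array[String]): Unit = {
        val queue = new SketchQueue
        queue.start()
        println(s"dispatch thread finished: ${queue.awaitFinished(2000)}")
        // Stopping from another thread still joins as before.
        queue.stopQueue()
      }
    }
    ```

    If the `if` is dropped from `stopQueue`, the dispatch thread in this sketch should block in `join()` indefinitely and `awaitFinished` should time out, which matches the WAITING state in the dump.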


---
