GitHub user markgrover commented on the pull request:

    https://github.com/apache/spark/pull/8093#issuecomment-130167519
  
    > I guess we could do that. My concern is that the race is probably always 
going to be won by the executor disconnect message (instead of the explicit 
RemoveExecutor message), which means that most of the time these messages will 
still not show up in the driver UI...
    
    @vanzin regarding the above, I gathered some data to help us out. I ran a 
job in yarn client mode that allocated a lot of ByteBuffers and had 1000 tasks. 
72 of these tasks failed. For 70 of them the race was won by the onDisconnected 
event, so the UI displayed a generic message. For the other 2 it was won by the 
RemoveExecutor event, and the UI showed the error about YARN killing the 
container. So, you are right.
    
    However, I still think that even showing the generic message in the UI 
`Remote Rpc client disassociated. Likely due to containers exceeding 
thresholds, or network issues. Check driver logs for WARNings` is better than 
the status quo. So opening a separate JIRA for the race condition and exploring 
the best way to proceed there makes the most sense to me.
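
    To make the race concrete, here is a minimal sketch of the behavior 
described above. It is not Spark's actual code: `ExecutorTracker`, its methods, 
the executor IDs, and the `SPECIFIC_REASON` string are all hypothetical. The 
only assumption is that whichever message reaches the driver first decides the 
reason that gets recorded, and the later message is ignored.

```python
import threading

# Generic message quoted in the comment above.
GENERIC_REASON = (
    "Remote Rpc client disassociated. Likely due to containers exceeding "
    "thresholds, or network issues. Check driver logs for WARNings"
)
# Hypothetical placeholder for the more specific YARN reason.
SPECIFIC_REASON = "Container killed by YARN (illustrative text only)"


class ExecutorTracker:
    """Hypothetical driver-side bookkeeping; first removal message wins."""

    def __init__(self):
        self._lock = threading.Lock()
        self._live = set()
        self.loss_reasons = {}

    def register(self, executor_id):
        with self._lock:
            self._live.add(executor_id)

    def _remove(self, executor_id, reason):
        # Record the reason only if the executor is still considered live.
        with self._lock:
            if executor_id not in self._live:
                return False  # the other message already won the race
            self._live.remove(executor_id)
            self.loss_reasons[executor_id] = reason
            return True

    def on_disconnected(self, executor_id):
        # The disconnect carries no detail, so only the generic text is known.
        return self._remove(executor_id, GENERIC_REASON)

    def on_remove_executor(self, executor_id, reason):
        # The explicit message carries the specific reason.
        return self._remove(executor_id, reason)


tracker = ExecutorTracker()
tracker.register("exec-1")
tracker.register("exec-2")

# exec-1: the disconnect arrives first, so the specific reason is dropped.
disconnect_won = tracker.on_disconnected("exec-1")
late_remove = tracker.on_remove_executor("exec-1", SPECIFIC_REASON)

# exec-2: the explicit message arrives first, so the specific reason is kept.
remove_won = tracker.on_remove_executor("exec-2", SPECIFIC_REASON)
late_disconnect = tracker.on_disconnected("exec-2")
```

    In this sketch `exec-1` ends up with the generic message and `exec-2` with 
the specific one, which mirrors the 70 versus 2 split in the run above.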


