GitHub user markgrover commented on the pull request:

https://github.com/apache/spark/pull/8093#issuecomment-130167519

> guess we could do that. My concern is that the race is probably always going to be won by the executor disconnect message (instead of the explicit RemoveExecutor message), which means that most of the time these messages will still not show up in the driver UI...

@vanzin regarding the above, I gathered some data to help us out. I ran a job in YARN client mode that allocated a lot of ByteBuffers and had 1000 tasks. 72 of these tasks failed. For 70 of them the onDisconnected event won the race, so the UI displayed a generic message; for the other 2 the RemoveExecutor event won, and the UI showed the error about YARN killing the container.

So, you are right. However, I still think that showing even the generic message in the UI

`Remote Rpc client disassociated. Likely due to containers exceeding thresholds, or network issues. Check driver logs for WARNings`

is better than the status quo. Opening a separate JIRA for the race condition and exploring the best way to proceed there makes the most sense to me.
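
To make the race concrete, here is a minimal Python sketch. It is not Spark's actual code: it assumes the driver keeps only the first loss reason it receives for an executor, and the names `ExecutorLossTracker`, `on_disconnected`, `remove_executor`, and `generic_share` are hypothetical stand-ins for the onDisconnected and RemoveExecutor handling discussed above.

```python
# Toy model of the race described above; this is NOT Spark's implementation.
# Assumption: whichever loss event reaches the driver first fixes the reason
# shown in the UI, and later events for the same executor are ignored.

GENERIC_REASON = (
    "Remote Rpc client disassociated. Likely due to containers exceeding "
    "thresholds, or network issues. Check driver logs for WARNings"
)


class ExecutorLossTracker:
    """Hypothetical driver-side bookkeeping for executor loss reasons."""

    def __init__(self):
        self._reasons = {}

    def on_disconnected(self, executor_id):
        # A disconnect carries no details, so only the generic message is available.
        self._reasons.setdefault(executor_id, GENERIC_REASON)

    def remove_executor(self, executor_id, reason):
        # An explicit removal carries a specific reason (e.g. reported by YARN).
        self._reasons.setdefault(executor_id, reason)

    def reason_for(self, executor_id):
        # Returns None if no loss event has been recorded for this executor.
        return self._reasons.get(executor_id)


def generic_share(disconnect_wins, remove_wins):
    """Fraction of failures that end up showing only the generic message."""
    return disconnect_wins / (disconnect_wins + remove_wins)


if __name__ == "__main__":
    tracker = ExecutorLossTracker()
    tracker.on_disconnected("exec-1")  # disconnect arrives first
    tracker.remove_executor("exec-1", "hypothetical YARN kill reason")
    print(tracker.reason_for("exec-1") == GENERIC_REASON)  # True
    print(f"{generic_share(70, 2):.1%}")  # 97.2%
```

With the counts reported above (70 and 2), roughly 97% of the failed tasks would show only the generic message under this model, which is why the ordering of the two events matters for what the UI can display.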