Github user markgrover commented on the pull request:

    https://github.com/apache/spark/pull/8093#issuecomment-129709754
  
    This pull request is meant to achieve two goals:
    1. Show in the driver logs, primarily in YARN client mode, when YARN kills containers because one or more thresholds (say, of physical or virtual memory) are exceeded.
    2. Display the above reason in the Spark UI.
    
    Here's some more context on how the two goals above were achieved:
    For (1) above, I considered two options: (a) adding a new RPC message, ContainerRemoved, sent from the YarnAllocator to the YarnSchedulerBackend whenever a container is killed by YARN, or (b) simply extending the RemoveExecutor message that was already being passed from the YarnAllocator to the YarnSchedulerBackend. While I did implement (a), I ended up [reverting it](https://github.com/markgrover/spark/commit/47c20c0f794d654bc4c7f08809373274cc16b7be) and going with (b) because of its simplicity.
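    Roughly, the idea behind (b) looks like this (a simplified, self-contained sketch, not the actual Spark classes): map the YARN container exit status to a human-readable reason and attach it to the message that the YarnAllocator already sends when an executor's container goes away.

        // Sketch only: simplified stand-ins, not the real Spark RPC message classes.
        object ContainerLossReasonSketch {
          // Stand-in for the existing RemoveExecutor message, now carrying a reason.
          case class RemoveExecutor(executorId: String, reason: String)

          // YARN exit statuses for memory kills
          // (ContainerExitStatus.KILLED_EXCEEDED_PMEM / KILLED_EXCEEDED_VMEM).
          private val KilledExceededPmem = -104
          private val KilledExceededVmem = -103

          // Turn a completed container's exit status and diagnostics into a readable reason.
          def reasonFor(exitStatus: Int, diagnostics: String): String = exitStatus match {
            case KilledExceededPmem =>
              s"Container killed by YARN for exceeding physical memory limits. $diagnostics"
            case KilledExceededVmem =>
              s"Container killed by YARN for exceeding virtual memory limits. $diagnostics"
            case other =>
              s"Container exited with a non-zero exit status $other. $diagnostics"
          }

          def main(args: Array[String]): Unit = {
            val msg = RemoveExecutor("42", reasonFor(-104, "2.5 GB of 2 GB physical memory used"))
            println(msg.reason)
          }
        }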
    
    For (2) above, I extended the ExecutorLostFailure case class, which the scheduler sends down the ListenerBus whenever an executor is lost. That ends up being picked up by the JobProgressListener and finally shows up in the UI. I've attached a [picture on the JIRA](https://issues.apache.org/jira/secure/attachment/12749771/error_showing_in_UI.png) of what the error message in the UI looks like.
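    As a rough illustration of the UI path (again a simplified sketch, not the exact Spark class), the extension boils down to letting ExecutorLostFailure carry an optional reason string and rendering it in the error string that the UI shows for the failed tasks:

        // Sketch only: simplified version of the task-end reason carrying the cause.
        case class ExecutorLostFailure(execId: String, reason: Option[String] = None) {
          // This is the string that listener-based consumers surface in the UI.
          def toErrorString: String = {
            val detail = reason.map(r => s" Reason: $r").getOrElse("")
            s"ExecutorLostFailure (executor $execId lost).$detail"
          }
        }

        // e.g. ExecutorLostFailure("7", Some("Container killed by YARN for exceeding memory limits")).toErrorString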
    
    **Testing**
    While I have updated the unit tests impacted by this change, I also had to find a deterministic way of getting YARN to kill the container. For that, I set spark.yarn.executor.memoryOverhead to a very low value and used an app that allocates a lot of [ByteBuffers](http://docs.oracle.com/javase/7/docs/api/java/nio/ByteBuffer.html#allocateDirect(int)).
    The code for this app can be found at https://github.com/markgrover/spark-app. It's simply a Pi program, much like the default Spark Pi app, except that it creates a bunch of direct ByteBuffers while it's at it. It can be invoked like:
        spark-submit --class com.markgrover.spark.ModifiedPi --master yarn --deploy-mode client ~/spark-app/target/my-spark-app-0.0.1-SNAPSHOT.jar 1000
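    The real code is in the repo above; as a rough sketch of the idea (class name and buffer sizes here are illustrative, not the actual app), the trick is to hold on to direct ByteBuffers on each executor so off-heap usage grows until YARN's physical-memory check kills the container:

        import java.nio.ByteBuffer
        import scala.collection.mutable.ArrayBuffer
        import scala.util.Random
        import org.apache.spark.{SparkConf, SparkContext}

        // Per-JVM holder so buffers allocated on the executors are never garbage collected.
        object BufferHolder {
          val retained = ArrayBuffer[ByteBuffer]()
        }

        object ModifiedPiSketch {
          def main(args: Array[String]): Unit = {
            val sc = new SparkContext(new SparkConf().setAppName("ModifiedPiSketch"))
            val slices = if (args.nonEmpty) args(0).toInt else 2
            val n = 100000 * slices
            val count = sc.parallelize(1 to n, slices).map { _ =>
              // Direct buffers live off-heap, so they count against the container's
              // physical memory (i.e. spark.yarn.executor.memoryOverhead), not the JVM heap.
              BufferHolder.retained.synchronized {
                BufferHolder.retained += ByteBuffer.allocateDirect(1024 * 1024)
              }
              val x = Random.nextDouble() * 2 - 1
              val y = Random.nextDouble() * 2 - 1
              if (x * x + y * y < 1) 1 else 0
            }.reduce(_ + _)
            println(s"Pi is roughly ${4.0 * count / n}")
            sc.stop()
          }
        }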

