GitHub user wangyum opened a pull request:

    https://github.com/apache/spark/pull/20735

    [MINOR][YARN] Add disable yarn.nodemanager.vmem-check-enabled option to 
memLimitExceededLogMessage

    ## What changes were proposed in this pull request?
    My Spark application sometimes fails with `Container killed by YARN for 
exceeding memory limits`.
    Even after I increased `spark.yarn.executor.memoryOverhead` to 10G, this 
error still happens.  The latest config:
    <img width="685" alt="memory-config" 
src="https://user-images.githubusercontent.com/5399861/36975716-f5c548d2-20b5-11e8-95e5-b228d50917b9.png">
    
    And the error message:
    ```
    ExecutorLostFailure (executor 121 exited caused by one of the running 
tasks) Reason: Container killed by YARN for exceeding memory limits. 30.7 GB of 
30 GB physical memory used. Consider boosting 
spark.yarn.executor.memoryOverhead.
    ```
    
    This is because of [Linux glibc >= 2.10 (RHEL 6) malloc may show excessive 
virtual memory 
usage](https://www.ibm.com/developerworks/community/blogs/kevgrig/entry/linux_glibc_2_10_rhel_6_malloc_may_show_excessive_virtual_memory_usage?lang=en).
 So disabling `yarn.nodemanager.vmem-check-enabled` looks like a good option, as 
[MapR mentions](https://mapr.com/blog/best-practices-yarn-resource-management).
    
    This PR adds a suggestion to disable `yarn.nodemanager.vmem-check-enabled` 
to memLimitExceededLogMessage.
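
To illustrate the shape of the change, here is a minimal Python sketch of how such a message could be assembled. It is not the Scala code in this patch: the function name `mem_limit_exceeded_message`, the regular expression, and the exact wording of the hint are assumptions modelled on the error message quoted above.

```python
import re

# Assumed shape of the memory figures in YARN's diagnostics text,
# modelled on "30.7 GB of 30 GB physical memory used" from the error above.
MEM = r"[0-9.]+ [KMG]B"
PMEM_EXCEEDED = re.compile(rf"{MEM} of {MEM} physical memory used")


def mem_limit_exceeded_message(diagnostics: str) -> str:
    """Build the executor-loss message with both tuning hints (hypothetical wording)."""
    match = PMEM_EXCEEDED.search(diagnostics)
    # Include the usage figures only when the diagnostics contain them.
    usage = f" {match.group(0)}." if match else ""
    return (
        "Container killed by YARN for exceeding memory limits."
        f"{usage} Consider boosting spark.yarn.executor.memoryOverhead"
        " or disabling yarn.nodemanager.vmem-check-enabled (see YARN-4714)."
    )


# Hypothetical diagnostics string, for illustration only.
print(mem_limit_exceeded_message(
    "Current usage: 30.7 GB of 30 GB physical memory used."
))
```

The second hint is appended unconditionally in this sketch, so it appears whether or not the usage figures can be extracted from the diagnostics.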
    
    More details:
    https://issues.apache.org/jira/browse/YARN-4714
    https://stackoverflow.com/a/31450291
    https://stackoverflow.com/a/42091255
    
    After this PR:
    <img width="898" alt="yarn" 
src="https://user-images.githubusercontent.com/5399861/36975949-c8e7bbbe-20b6-11e8-9513-9f903b868d8d.png">
    
    ## How was this patch tested?
    
    N/A


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/wangyum/spark YARN-4714

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20735.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #20735
    
----
commit 3fc05b4f8599ee65e8c4f808aee238d212c22b17
Author: Yuming Wang <yumwang@...>
Date:   2018-03-05T12:38:21Z

    Update memLimitExceededLogMessage

----


---

