GitHub user jeanlyn opened a pull request:

    https://github.com/apache/spark/pull/11779

    [SPARK-13845][CORE]Using onBlockUpdated to replace onTaskEnd avioding 
driver OOM

    ## What changes were proposed in this pull request?
    
    We have a streaming job using `FlumePollInputStream` always driver OOM 
after few days, here is some driver heap dump before OOM
    ```
     num     #instances         #bytes  class name
    ----------------------------------------------
       1:      13845916      553836640  org.apache.spark.storage.BlockStatus
       2:      14020324      336487776  org.apache.spark.storage.StreamBlockId
       3:      13883881      333213144  scala.collection.mutable.DefaultEntry
       4:          8907       89043952  [Lscala.collection.mutable.HashEntry;
       5:         62360       65107352  [B
       6:        163368       24453904  [Ljava.lang.Object;
       7:        293651       20342664  [C
    ...
    ```
    `BlockStatus` and `StreamBlockId` keep on growing, and the driver OOM in 
the end.
    After investigated, i found the `executorIdToStorageStatus` in 
`StorageStatusListener` seems never remove the blocks from `StorageStatus`. 
    In order to fix the issue, i try to use `onBlockUpdated` replace `onTaskEnd 
` , so we can update the block informations(add blocks, drop the block from 
memory to disk and delete the blocks) in time.
    
    
    ## How was this patch tested?
    
    Existing unit tests and manual tests


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jeanlyn/spark fix_driver_oom

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/11779.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #11779
    
----
commit f1f6df685b416b1f7fa47f9419dc5a0300e63070
Author: jeanlyn <jeanly...@gmail.com>
Date:   2016-03-12T08:01:39Z

    using onBlockUpdated to replace onTaskEnd when the block updated

commit 5255067c40b49efab597dfbb13ccfa9b6f7b0833
Author: jeanlyn <jeanly...@gmail.com>
Date:   2016-03-12T12:52:04Z

    fix unit test

commit ab4a863b71ee28791b43daea1af58453906af8c0
Author: jeanlyn <jeanly...@gmail.com>
Date:   2016-03-15T01:01:53Z

    fix scala style

commit 59150927dfe7c1eacd244a359fc5e8d040d4dc07
Author: jeanlyn <jeanly...@gmail.com>
Date:   2016-03-17T06:24:01Z

    fix HistoryServerSuite

commit 2e2c319cd0dafe3ee8ae8d4fb8fe958ae5ddcd4e
Author: jeanlyn <jeanly...@gmail.com>
Date:   2016-03-17T09:16:39Z

    fix typos

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

Reply via email to