GitHub user daroo opened a pull request:

    https://github.com/apache/spark/pull/19789

    [SPARK-22562][Streaming] CachedKafkaConsumer unsafe eviction from cache

    ## What changes were proposed in this pull request?
    Fixes a problem that occurs when one thread wants to add a new consumer to an
already full cache while another thread is still using the cached consumer that
gets marked for eviction. In such cases the underlying KafkaConsumer throws a
ConcurrentModificationException.
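
    For context, the unsafe pattern looks roughly like the sketch below. This is
a simplified illustration under assumptions (a LinkedHashMap-style cache keyed by
a plain String, with a made-up `maxCapacity`), not the actual Spark code: the
eldest consumer is closed as soon as the cache is full, even though another task
thread may still be polling it, and KafkaConsumer is not thread-safe.

```scala
import java.{util => ju}
import org.apache.kafka.clients.consumer.KafkaConsumer

// Simplified illustration of the pre-fix eviction behaviour described above.
object UnsafeConsumerCache {
  private val maxCapacity = 64

  private val cache =
    new ju.LinkedHashMap[String, KafkaConsumer[Array[Byte], Array[Byte]]](
      maxCapacity, 0.75f, true) {
      override def removeEldestEntry(
          entry: ju.Map.Entry[String, KafkaConsumer[Array[Byte], Array[Byte]]]): Boolean = {
        if (size() > maxCapacity) {
          // Unsafe: another task thread may still be in the middle of poll()
          // on this consumer; closing it concurrently makes KafkaConsumer
          // throw ConcurrentModificationException.
          entry.getValue.close()
          true
        } else {
          false
        }
      }
    }
}
```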
    
    My solution is to always remove the eldest consumer from the cache, but, when
necessary, delay calling its close() method (in a separate thread) until it is no
longer used (i.e. released) by KafkaRDDIterator. See the sketch below.
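
    A minimal sketch of that deferred-close idea follows. The names (`inUse`,
`markedForClose`, `acquire`, `release`, `evict`) are illustrative, not the actual
patch, and unlike the PR, which hands the deferred close to a separate thread,
this sketch simply performs the close in whichever thread releases the consumer
last.

```scala
import org.apache.kafka.clients.consumer.KafkaConsumer

// Wrapper that defers close() until the consumer has been released by the
// iterator that was still using it when eviction happened.
class DeferredCloseConsumer[K, V](val consumer: KafkaConsumer[K, V]) {
  private var inUse = false
  private var markedForClose = false

  // Called when the cached consumer is handed out to a KafkaRDD iterator.
  def acquire(): Unit = synchronized { inUse = true }

  // Called by the iterator once the task has finished with the consumer;
  // performs the deferred close if the entry was evicted in the meantime.
  def release(): Unit = synchronized {
    inUse = false
    if (markedForClose) consumer.close()
  }

  // Called by the cache on eviction: the entry is always removed from the
  // map, but the underlying consumer is only closed immediately if idle.
  def evict(): Unit = synchronized {
    if (inUse) markedForClose = true else consumer.close()
  }
}
```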
    
    ## How was this patch tested?
    Any ideas on how to write a good unit test to cover this are more than
welcome. In the meantime I'll run the code on our DEV environment for a longer
period of time.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/daroo/spark SPARK-22562

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19789.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19789
    
----
commit 9b16ddd723bc3dc324c33bd39a4aa2b065e926b1
Author: Dariusz Szablinski <dariusz.szablin...@ig.com>
Date:   2017-11-20T16:47:29Z

    [SPARK-22562][Streaming] CachedKafkaConsumer unsafe eviction from cache

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
