GitHub user daroo opened a pull request: https://github.com/apache/spark/pull/19789
[SPARK-22562][Streaming] CachedKafkaConsumer unsafe eviction from cache

## What changes were proposed in this pull request?

Fixes a problem that occurs when one thread wants to add a new consumer to a full cache while another thread is still using the cached consumer instance that is marked for eviction. In such cases the underlying KafkaConsumer throws a ConcurrentModificationException. My solution is to always remove the eldest consumer from the cache, but to sometimes delay calling its close() method (in a separate thread) until it is no longer used (i.e. has been released) by KafkaRDDIterator.

## How was this patch tested?

Any ideas on how to write a good unit test to cover this are more than welcome. In the meantime I'll try to run the code on our DEV environment for a longer period of time.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/daroo/spark SPARK-22562

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19789.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #19789

----

commit 9b16ddd723bc3dc324c33bd39a4aa2b065e926b1
Author: Dariusz Szablinski <dariusz.szablin...@ig.com>
Date: 2017-11-20T16:47:29Z

    [SPARK-22562][Streaming] CachedKafkaConsumer unsafe eviction from cache

----

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
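The eviction scheme described in the PR (always remove the eldest consumer from the cache, but defer `close()` until the borrowing task has released it) can be sketched roughly as below. This is an illustrative Java sketch, not the actual Spark patch: the names `SafeConsumerCache`, `FakeConsumer`, `acquire`, `release`, and `evict` are hypothetical stand-ins for `CachedKafkaConsumer` and its cache, and the deferred close here runs synchronously on release rather than in a separate thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of "evict eldest, but defer close() while in use".
// All names are illustrative, not the real Spark classes.
public class SafeConsumerCache {

    // Stand-in for CachedKafkaConsumer: tracks whether a task is
    // currently using it, and whether it was evicted while in use.
    static class FakeConsumer {
        private boolean inUse = false;
        private boolean closeOnRelease = false;
        boolean closed = false;

        synchronized void acquire() { inUse = true; }

        // Called by the task when it is done with the consumer;
        // runs the deferred close if the cache evicted us meanwhile.
        synchronized void release() {
            inUse = false;
            if (closeOnRelease) close();
        }

        // Called by the cache on eviction: close immediately only if
        // idle, otherwise mark for close-on-release.
        synchronized void evict() {
            if (inUse) closeOnRelease = true;
            else close();
        }

        private void close() { closed = true; }
    }

    private final int capacity;
    private final LinkedHashMap<String, FakeConsumer> cache;

    public SafeConsumerCache(int capacity) {
        this.capacity = capacity;
        this.cache = new LinkedHashMap<String, FakeConsumer>(16, 0.75f, false) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, FakeConsumer> eldest) {
                if (size() > SafeConsumerCache.this.capacity) {
                    eldest.getValue().evict(); // may defer close()
                    return true;               // always drop the map entry
                }
                return false;
            }
        };
    }

    // Borrow a consumer for a topic-partition, creating one if absent.
    public synchronized FakeConsumer get(String topicPartition) {
        FakeConsumer c = cache.get(topicPartition);
        if (c == null) {
            c = new FakeConsumer();
            cache.put(topicPartition, c); // may trigger eldest eviction
        }
        c.acquire();
        return c;
    }
}
```

The key point matching the PR description: the evicted entry always leaves the map (so a new consumer can be cached), but the underlying `close()` is postponed until `release()`, so no thread ever calls `close()` on a KafkaConsumer that another thread is still polling.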