Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22138

I just applied a new approach: "separation of concerns". This approach pools both Kafka consumers and fetched data. Both pools support eviction of idle objects, which helps close stale objects whose topic partition is no longer assigned to any task. It also allows applying different policies to each pool, so pooling can be tuned per pool.

We were concerned about multiple tasks pointing to the same topic partition with the same group id; the existing code can't handle this, so excess seeks and fetches could happen. The new approach handles that case properly. It also makes the code always safe to leverage the cache, so there is no need to maintain the reuseCache parameter.

@koeninger @tdas @zsxwing @arunmahadevan Could you please take a look at the new approach? I think it solves multiple issues the existing code has.
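To illustrate the idea (not Spark's actual classes — `SimpleKeyedPool`, `borrow`, `giveBack`, and `evictIdle` are hypothetical names for this sketch), a keyed pool with idle-object eviction could look roughly like this. The same structure would back both pools: consumers keyed by (group id, topic partition), and fetched-data buffers keyed analogously. Eviction returns the stale objects so the caller can close them (e.g. `KafkaConsumer#close`):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of a keyed object pool with idle eviction.
// Keys would be (group id, topic partition) for the consumer pool,
// and an analogous key for the fetched-data pool.
final class SimpleKeyedPool<K, V> {
    private static final class Entry<V> {
        final V value;
        final long lastReturnedAt;
        Entry(V value, long lastReturnedAt) {
            this.value = value;
            this.lastReturnedAt = lastReturnedAt;
        }
    }

    private final Function<K, V> factory;
    private final Map<K, Deque<Entry<V>>> idle = new HashMap<>();

    SimpleKeyedPool(Function<K, V> factory) { this.factory = factory; }

    // Borrow an object for the key, creating one if none is idle.
    synchronized V borrow(K key) {
        Deque<Entry<V>> q = idle.get(key);
        if (q != null && !q.isEmpty()) return q.pollFirst().value;
        return factory.apply(key);
    }

    // Return an object to the pool, recording when it became idle.
    synchronized void giveBack(K key, V value, long now) {
        idle.computeIfAbsent(key, k -> new ArrayDeque<>())
            .addLast(new Entry<>(value, now));
    }

    // Evict objects idle longer than maxIdleMillis and return them,
    // so the caller can close the underlying resources.
    synchronized List<V> evictIdle(long maxIdleMillis, long now) {
        List<V> evicted = new ArrayList<>();
        Iterator<Map.Entry<K, Deque<Entry<V>>>> it = idle.entrySet().iterator();
        while (it.hasNext()) {
            Deque<Entry<V>> q = it.next().getValue();
            q.removeIf(e -> {
                if (now - e.lastReturnedAt > maxIdleMillis) {
                    evicted.add(e.value);
                    return true;
                }
                return false;
            });
            if (q.isEmpty()) it.remove(); // drop keys with no idle objects left
        }
        return evicted;
    }

    synchronized int idleCount() {
        return idle.values().stream().mapToInt(Deque::size).sum();
    }
}
```

With per-pool policies meaning, for instance, a different idle timeout or capacity for the consumer pool versus the fetched-data pool. A production version would more likely build on Apache Commons Pool's `GenericKeyedObjectPool`, which provides this behavior (idle eviction, per-pool config) out of the box.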