Github user lvdongr commented on the issue:

    https://github.com/apache/spark/pull/17203
  
    In our case, we deployed a streaming application whose data sources are 20
    Kafka topics with 30 partitions each, on a 3-broker Kafka cluster. With cached
    consumers, the number of connections to Kafka grew very large (up to a
    thousand), and the consumers sometimes received no messages from Kafka, which
    caused some jobs to fail. When we replaced the cached consumers with uncached
    ones, the connection count dropped and no jobs failed. We are still not sure
    whether the large number of connections to Kafka was the cause of the job
    failures, but given our test results we want to use uncached consumers so that
    our streaming jobs keep running successfully. So we think there are occasions
    where the cached consumer should not be used, and the developer should be able
    to choose which behavior to use.
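
    For reference, a minimal sketch of how an application could opt out of
    consumer caching, assuming the switch is the
    spark.streaming.kafka.consumer.cache.enabled key documented in the
    spark-streaming-kafka-0-10 integration guide; the broker addresses, group id,
    and topic names below are placeholders for our setup:

        import org.apache.kafka.common.serialization.StringDeserializer
        import org.apache.spark.SparkConf
        import org.apache.spark.streaming.{Seconds, StreamingContext}
        import org.apache.spark.streaming.kafka010._

        val conf = new SparkConf()
          .setAppName("UncachedKafkaConsumers")
          // Disable the executor-side consumer cache so each fetch uses a
          // short-lived consumer instead of a long-lived cached one.
          .set("spark.streaming.kafka.consumer.cache.enabled", "false")

        val ssc = new StreamingContext(conf, Seconds(10))

        val kafkaParams = Map[String, Object](
          "bootstrap.servers" -> "broker1:9092,broker2:9092,broker3:9092",
          "key.deserializer" -> classOf[StringDeserializer],
          "value.deserializer" -> classOf[StringDeserializer],
          "group.id" -> "example-group",
          "auto.offset.reset" -> "latest"
        )

        // 20 hypothetical topics, matching the scale described above.
        val topics = (1 to 20).map(i => s"topic-$i")

        val stream = KafkaUtils.createDirectStream[String, String](
          ssc,
          LocationStrategies.PreferConsistent,
          ConsumerStrategies.Subscribe[String, String](topics, kafkaParams)
        )

        stream.foreachRDD(rdd => println(s"batch size: ${rdd.count()}"))
        ssc.start()
        ssc.awaitTermination()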

