Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/17203
I see, thanks. So the real issue is that KafkaConsumer does not support multi-threaded access.
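For context, KafkaConsumer guards every public method with an ownership check and raises a `ConcurrentModificationException` ("KafkaConsumer is not safe for multi-threaded access") when a second thread calls into it. A minimal Scala sketch of that failure mode; the broker address, group id, and topic name are placeholders:

```scala
import java.util.{Collections, Properties}
import org.apache.kafka.clients.consumer.KafkaConsumer

object SingleThreadedConsumerDemo {
  def main(args: Array[String]): Unit = {
    val props = new Properties()
    props.put("bootstrap.servers", "localhost:9092") // placeholder broker
    props.put("group.id", "demo-group")              // placeholder group id
    props.put("key.deserializer",
      "org.apache.kafka.common.serialization.StringDeserializer")
    props.put("value.deserializer",
      "org.apache.kafka.common.serialization.StringDeserializer")

    val consumer = new KafkaConsumer[String, String](props)
    consumer.subscribe(Collections.singletonList("demo-topic")) // placeholder topic

    // Both threads poll the same consumer instance. KafkaConsumer checks
    // ownership on every call, so the thread that loses the race gets a
    // ConcurrentModificationException instead of silently corrupting state.
    val poller = new Runnable {
      override def run(): Unit = while (true) consumer.poll(100L)
    }
    new Thread(poller).start()
    new Thread(poller).start()
  }
}
```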
Github user lvdongr commented on the issue:
https://github.com/apache/spark/pull/17203
You can see this issue, which is a problem with the cached KafkaConsumer:
https://issues.apache.org/jira/browse/SPARK-19185. A commenter there suggests
the same approach of not using the cached Kafka consumer.
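For reference, the DStream Kafka integration later gained a switch to bypass the consumer cache entirely. Whether the key below exists depends on the spark-streaming-kafka-0-10 version in use, so treat it as an assumption to verify against your release:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Sketch: disable the per-executor KafkaConsumer cache so each task creates
// and closes its own consumer. The config key is assumed to be present in
// your spark-streaming-kafka-0-10 build; verify before relying on it.
val conf = new SparkConf()
  .setAppName("kafka-uncached-sketch") // hypothetical app name
  .set("spark.streaming.kafka.consumer.cache.enabled", "false")

val ssc = new StreamingContext(conf, Seconds(10))
```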
Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/17203
With the cached KafkaConsumer, the largest number of connections per executor
is 64 by default; in your case 64 may not be enough. Your fix does not look
solid, and it cannot explain why shifting to the uncached consumer fixes the
problem.
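That 64 is the default of `spark.streaming.kafka.consumer.cache.maxCapacity`, so an alternative to abandoning the cache is raising its ceiling. A sketch; the value 1024 is illustrative, not a recommendation:

```scala
import org.apache.spark.SparkConf

// Raise the cached-consumer ceiling instead of switching to uncached
// consumers. 64 is the default capacity; 1024 here is an illustrative
// value sized for several hundred topic-partitions per executor.
val conf = new SparkConf()
  .set("spark.streaming.kafka.consumer.cache.maxCapacity", "1024")
```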
Github user lvdongr commented on the issue:
https://github.com/apache/spark/pull/17203
In our case, we deploy a streaming application whose data sources are 20
topics with 30 partitions each in a Kafka cluster (3 brokers). The number of
connections to Kafka then becomes very large, up to a thousand.
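A rough count of what that deployment implies, assuming one cached consumer per topic-partition on the executor that processes it; the executor count is hypothetical:

```scala
val topics = 20
val partitionsPerTopic = 30
val totalPartitions = topics * partitionsPerTopic // 600 topic-partitions
val executors = 10                                // hypothetical cluster size

// With even assignment each executor caches ~60 consumers, already close
// to the default cache capacity of 64; since each consumer may hold TCP
// connections to more than one of the 3 brokers, the total connection
// count can exceed the partition count and approach a thousand.
val consumersPerExecutor = totalPartitions / executors
```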
Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/17203
What is the purpose of this change? Did you see any problem when using
cached consumers?
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/17203
Can one of the admins verify this patch?