Re: Spark 2.0 with Kafka 0.10 exception

2016-10-21 Thread Cody Koeninger
That's a good point... the dstreams package is still on 0.10.0.1 though. I'll make a ticket to update it. On Fri, Oct 21, 2016 at 1:02 PM, Srikanth wrote: > The Kafka 0.10.1 release separates poll() from heartbeat, so session.timeout.ms & max.poll.interval.ms can be set
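For readers following along, a small build-file sketch of where the client version Cody mentions comes from; the Spark artifact version shown is an assumption for illustration, not something stated in the thread.

```scala
// build.sbt sketch (artifact versions are illustrative assumptions)
libraryDependencies += "org.apache.spark" %% "spark-streaming-kafka-0-10" % "2.0.1"
// At the time of this thread the module above transitively pulled in
// org.apache.kafka:kafka-clients:0.10.0.1, which predates the KIP-62
// poll()/heartbeat split, so max.poll.interval.ms is not yet available.
// The resolved version can be checked with the sbt-dependency-graph plugin
// (`sbt dependencyTree`) or `mvn dependency:tree`.
```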

Re: Spark 2.0 with Kafka 0.10 exception

2016-10-21 Thread Srikanth
The Kafka 0.10.1 release separates poll() from heartbeat, so session.timeout.ms & max.poll.interval.ms can be set differently. I'll leave it to you to decide how to add this to the docs! On Thu, Oct 20, 2016 at 1:41 PM, Cody Koeninger wrote: > Right on, I put in a PR to make a note of
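A minimal sketch of what the decoupled timeouts might look like in the kafkaParams map, assuming a 0.10.1+ kafka-clients on the classpath; the broker address, group id, and numeric values are placeholders, not settings from this thread.

```scala
import org.apache.kafka.common.serialization.StringDeserializer

// Sketch only: with a 0.10.1+ client (KIP-62), heartbeats come from a background
// thread, so session.timeout.ms no longer has to cover long gaps between poll()
// calls; max.poll.interval.ms bounds processing time instead. Values are placeholders.
val kafkaParams = Map[String, Object](
  "bootstrap.servers"     -> "broker1:9092",
  "key.deserializer"      -> classOf[StringDeserializer],
  "value.deserializer"    -> classOf[StringDeserializer],
  "group.id"              -> "example-group",
  "max.poll.interval.ms"  -> "300000", // budget for processing between polls
  "session.timeout.ms"    -> "30000",  // liveness via background heartbeats
  "heartbeat.interval.ms" -> "10000"   // typically well under session.timeout.ms
)
```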

Re: Spark 2.0 with Kafka 0.10 exception

2016-10-20 Thread Cody Koeninger
Right on, I put in a PR to make a note of that in the docs. On Thu, Oct 20, 2016 at 12:13 PM, Srikanth wrote: > Yeah, setting those params helped. > On Wed, Oct 19, 2016 at 1:32 PM, Cody Koeninger wrote: >> 60 seconds for a batch is above the

Re: Spark 2.0 with Kafka 0.10 exception

2016-10-20 Thread Srikanth
Yeah, setting those params helped. On Wed, Oct 19, 2016 at 1:32 PM, Cody Koeninger wrote: > 60 seconds for a batch is above the default Kafka heartbeat timeout settings, so that might be the cause. Have you tried tweaking session.timeout.ms,

Re: Spark 2.0 with Kafka 0.10 exception

2016-10-19 Thread Cody Koeninger
60 seconds for a batch is above the default Kafka heartbeat timeout settings, so that might be the cause. Have you tried tweaking session.timeout.ms, heartbeat.interval.ms, or related configs? On Wed, Oct 19, 2016 at 12:22 PM, Srikanth wrote: > Bringing this
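As a concrete illustration of the tweak being suggested, and assuming the 0.10.0-era consumer where heartbeats are only sent from poll(), the timeouts would need headroom above a 60-second batch. The numbers below are illustrative, and raising session.timeout.ms beyond the broker's group.max.session.timeout.ms also requires a broker-side change.

```scala
// Sketch for a 0.10.0-era consumer: heartbeats are sent from poll(), so the
// session timeout must exceed the time between polls (here, a 60s batch).
// Values are illustrative; merge them into the kafkaParams passed to
// KafkaUtils.createDirectStream.
val timeoutTweaks = Map[String, Object](
  "session.timeout.ms"    -> "90000",  // above the 60s batch interval
  "heartbeat.interval.ms" -> "30000",  // conventionally about a third of the session timeout
  "request.timeout.ms"    -> "120000"  // the consumer requires this to exceed session.timeout.ms
)
```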

Re: Spark 2.0 with Kafka 0.10 exception

2016-10-19 Thread Srikanth
Bringing this thread back, as I'm seeing this exception on a production Kafka cluster. I have two Spark Streaming apps reading the same topic. App1 has a batch interval of 2 seconds and app2 has 60 seconds. Both apps are running on the same cluster on similar hardware. I see this exception only in app2 and

Re: Spark 2.0 with Kafka 0.10 exception

2016-09-07 Thread Cody Koeninger
It's a really noticeable overhead; without the cache you're basically pulling every message twice due to prefetching. On Wed, Sep 7, 2016 at 3:23 PM, Srikanth wrote: > Yea, disabling the cache was not going to be my permanent solution either. I was going to ask how big an

Re: Spark 2.0 with Kafka 0.10 exception

2016-09-07 Thread Srikanth
Yea, disabling the cache was not going to be my permanent solution either. I was going to ask how big an overhead that is. It happens intermittently, and each time it happens the retry is successful. Srikanth On Wed, Sep 7, 2016 at 3:55 PM, Cody Koeninger wrote: > That's not what I

Re: Spark 2.0 with Kafka 0.10 exception

2016-09-07 Thread Cody Koeninger
That's not what I would have expected to happen with a lower cache setting, but in general disabling the cache isn't something you want to do with the new Kafka consumer. As for the original issue, are you seeing those polling errors intermittently, or consistently? From your description, it

Re: Spark 2.0 with Kafka 0.10 exception

2016-09-07 Thread Srikanth
Setting those two results in the exception below. The number of executors is less than the number of partitions. Could that be triggering this?
16/09/07 15:33:13 ERROR Executor: Exception in task 2.0 in stage 2.0 (TID 9)
java.util.ConcurrentModificationException: KafkaConsumer is not safe for multi-threaded access at

Re: Spark 2.0 with Kafka 0.10 exception

2016-09-07 Thread Cody Koeninger
You could try setting spark.streaming.kafka.consumer.cache.initialCapacity and spark.streaming.kafka.consumer.cache.maxCapacity to 1. On Wed, Sep 7, 2016 at 2:02 PM, Srikanth wrote: > I had a look at the executor logs and noticed that this exception happens only when using
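A sketch of where those two properties go: they are Spark settings rather than Kafka consumer settings, so they belong on the SparkConf (or a spark-submit --conf); the app name is a placeholder and the value of 1 is just the experiment Cody suggests.

```scala
import org.apache.spark.SparkConf

// Sketch: shrink the per-executor consumer cache to a single entry, per the
// suggestion above. These are Spark properties, not entries in kafkaParams.
val sparkConf = new SparkConf()
  .setAppName("kafka-010-cache-test") // placeholder app name
  .set("spark.streaming.kafka.consumer.cache.initialCapacity", "1")
  .set("spark.streaming.kafka.consumer.cache.maxCapacity", "1")
```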

Re: Spark 2.0 with Kafka 0.10 exception

2016-09-07 Thread Srikanth
I had a look at the executor logs and noticed that this exception happens only when using the cached consumer. Every retry is successful. This is consistent. One possibility is that the cached consumer is causing the failure, since the retry clears it. Is there a way to disable the cache and test this? Again,

Re: Spark 2.0 with Kafka 0.10 exception

2016-08-24 Thread Srikanth
Thanks Cody. Setting the poll timeout helped. Our network is fine, but the brokers are not fully provisioned in the test cluster. Still, there isn't enough load to max out broker capacity. It's curious that kafkacat running on the same node doesn't have any issues. Srikanth On Tue, Aug 23, 2016 at 9:52 PM, Cody

Re: Spark 2.0 with Kafka 0.10 exception

2016-08-23 Thread Cody Koeninger
You can set that poll timeout higher with spark.streaming.kafka.consumer.poll.ms, but half a second is fairly generous. I'd try to take a look at what's going on with your network or Kafka broker during that time. On Tue, Aug 23, 2016 at 4:44 PM, Srikanth wrote: > Hello,
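For illustration, a sketch of raising that poll timeout; it is a Spark property on the SparkConf, and the 10-second value here is an arbitrary example rather than a recommendation from the thread.

```scala
import org.apache.spark.SparkConf

// Sketch: give the consumer more than the roughly half-second default to return
// records from a poll. 10000 ms is an arbitrary illustrative value.
val sparkConf = new SparkConf()
  .setAppName("kafka-010-poll-timeout") // placeholder app name
  .set("spark.streaming.kafka.consumer.poll.ms", "10000")
```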

Spark 2.0 with Kafka 0.10 exception

2016-08-23 Thread Srikanth
Hello, I'm getting the exception below when testing Spark 2.0 with Kafka 0.10.
16/08/23 16:31:01 INFO AppInfoParser: Kafka version : 0.10.0.0
16/08/23 16:31:01 INFO AppInfoParser: Kafka commitId : b8642491e78c5a13
16/08/23 16:31:01 INFO CachedKafkaConsumer: Initial fetch for
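To ground the log lines above, here is a minimal sketch of the general Spark 2.0 + Kafka 0.10 direct-stream pattern being tested; the broker list, topic, group id, and trivial foreachRDD body are placeholders rather than a reconstruction of the actual job.

```scala
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

// Minimal sketch of the Spark 2.0 + Kafka 0.10 direct-stream pattern under
// discussion; broker, topic, and group id are placeholders.
object Kafka010Example {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("kafka-010-example")
    val ssc  = new StreamingContext(conf, Seconds(2))

    val kafkaParams = Map[String, Object](
      "bootstrap.servers"  -> "broker1:9092",
      "key.deserializer"   -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id"           -> "example-group",
      "auto.offset.reset"  -> "latest",
      "enable.auto.commit" -> (false: java.lang.Boolean)
    )

    val stream = KafkaUtils.createDirectStream[String, String](
      ssc,
      PreferConsistent,
      Subscribe[String, String](Seq("example-topic"), kafkaParams)
    )

    stream.foreachRDD { rdd =>
      // Placeholder processing; count() forces the fetch that the
      // CachedKafkaConsumer "Initial fetch" log line above refers to.
      println(s"records in batch: ${rdd.count()}")
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```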