Post the actual stacktrace you're getting
On Thu, Sep 10, 2015 at 12:21 AM, Shushant Arora
wrote:

> Executors in spark streaming 1.3 fetch messages from kafka in batches, and
> what happens when an executor takes longer to complete a fetch batch?
>
> say in
My bad, I got that exception in the driver code of the same job, not in the
executor. But it only reports a socket close exception:

org.apache.spark.SparkException: ArrayBuffer(java.io.EOFException: Received
-1 when reading from channel, socket has likely been closed.,
org.apache.spark.SparkException: Couldn't
Again, that looks like you lost a kafka broker. Executors will retry
failed tasks automatically, up to the max failures.

spark.streaming.kafka.maxRetries controls the number of times the driver
will retry when attempting to get offsets.

If your broker isn't up / rebalance hasn't finished after N
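The division of labor described here (executors retry failed tasks up to the max failures; the driver retries offset lookups spark.streaming.kafka.maxRetries times, sleeping between attempts, as the "sleeping for 200ms" log line shows) boils down to a bounded retry-with-backoff loop. A minimal standalone sketch, not Spark's actual code; the class and method names are invented for illustration:

```java
import java.util.concurrent.Callable;

// Toy model of driver-side retries when fetching offsets from a Kafka
// leader: each failed attempt sleeps briefly, then retries; only after
// exhausting the retry budget does the failure surface to the job.
public class BoundedRetry {

    // maxRetries counts retries after the first attempt, so the attempt
    // runs at most maxRetries + 1 times.
    static <T> T withRetries(Callable<T> attempt, int maxRetries, long backoffMs)
            throws Exception {
        Exception last = null;
        for (int i = 0; i <= maxRetries; i++) {
            try {
                return attempt.call();
            } catch (Exception e) {
                last = e;                // e.g. NotLeaderForPartitionException
                Thread.sleep(backoffMs); // "sleeping for 200ms" in the log
            }
        }
        throw last;                      // retry budget exhausted
    }

    public static void main(String[] args) throws Exception {
        final int[] calls = {0};
        // Fails twice, then succeeds: mimics a leader re-election finishing
        // while the driver is still retrying.
        long offset = withRetries(() -> {
            if (++calls[0] < 3) throw new RuntimeException("lost leader");
            return 42L;
        }, 3, 10L);
        System.out.println(offset + " after " + calls[0] + " attempts");
    }
}
```

With a retry budget of 3 the fetch survives a leader re-election that resolves within a few attempts; a broker that stays down exhausts the budget and the last exception propagates, which matches the SparkException showing up in the driver above.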
Stack trace is:

15/09/09 22:49:52 ERROR kafka.KafkaRDD: Lost leader for topic topicname
partition 99, sleeping for 200ms
kafka.common.NotLeaderForPartitionException
        at sun.reflect.GeneratedConstructorAccessor26.newInstance(Unknown Source)
        at
NotLeaderForPartitionException means you lost a kafka broker or had a
rebalance... so why did you say "I am getting Connection tmeout in my code."?

You've asked questions about this exact same situation before; the answer
remains the same.
On Thu, Sep 10, 2015 at 9:44 AM, Shushant Arora wrote:

Executors in spark streaming 1.3 fetch messages from kafka in batches, and
what happens when an executor takes longer to complete a fetch batch?

say in

directKafkaStream.foreachRDD(new Function<JavaRDD<byte[][]>, Void>() {
  @Override
  public Void call(JavaRDD<byte[][]> v1) throws Exception {
    v1.foreachPartition(new VoidFunction<Iterator<byte[][]>>() {
      @Override
      public void call(Iterator<by
Hi

In spark streaming 1.3 with kafka - when does the driver bring the latest
offsets of this run: at the start of each batch, or at the time the batch
gets queued?

Say a few of my batches take longer to complete than their batch interval,
so some batches will go into the queue. Will the driver wait for the queued
>>>> I can't do this in the compute method, since compute would have been
>>>> called at the current batch's queue time - but the condition is set at
>>>> the previous batch's run time.
>>>>
>>>> On Tue, Sep 1, 2015

>>>>> public void call(Iterator<byte[][]> t) throws Exception {
>>>>> }});}});
>>>>>
>>>>> change directKafkaStream's RDD's offset range (fromOffset).
Shushant Arora <shushantaror...@gmail.com>
> wrote:
>
>> Hi
>>
>> In spark streaming 1.3 with kafka- when does driver bring latest offsets
>> of this run - at start of each batch or at time when batch gets queued ?
>>
>> Say few of my batches take longer

>> It's at the time compute() gets called, which should be near the time the
>> batch should have been queued.
>>
>> On Tue, Sep 1, 2015 at 8:02 AM, Shushant Arora <shushantaror...@gmail.com
>> > wrote:
It's at the time compute() gets called, which should be near the time the
batch should have been queued.
On Tue, Sep 1, 2015 at 8:02 AM, Shushant Arora <shushantaror...@gmail.com>
wrote:
> Hi
>
> In spark streaming 1.3 with kafka- when does driver bring latest offsets
> of this run - at start of each batch or at time when batch gets queued ?
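The compute()-time semantics in the answer above can be illustrated with a tiny standalone simulation (invented for illustration, not Spark's scheduler code): offsets are captured when the batch is generated at its interval, so a batch that starts late because earlier batches overran still processes exactly the offset range that was pinned at queue time.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model: batches are generated every intervalMs and queue up when
// processing (procMs) takes longer than the interval. The offset range is
// fixed at generation time, not at execution time.
public class BatchQueueModel {

    // Returns {queueTime, startTime, fromOffset, untilOffset} per batch.
    static List<long[]> schedule(int batches, long intervalMs, long procMs,
                                 long msgsPerInterval) {
        List<long[]> out = new ArrayList<>();
        long latestOffset = 0, busyUntil = 0;
        for (int i = 0; i < batches; i++) {
            long queueTime = i * intervalMs;
            long from = latestOffset;            // pinned when compute() runs
            latestOffset += msgsPerInterval;     // pretend this many new messages
            long start = Math.max(queueTime, busyUntil); // delayed start, same offsets
            busyUntil = start + procMs;
            out.add(new long[]{queueTime, start, from, latestOffset});
        }
        return out;
    }

    public static void main(String[] args) {
        // 1s interval, 2.5s processing: batches fall further and further behind,
        // but each still covers the range captured at its own queue time.
        for (long[] b : schedule(4, 1000, 2500, 100))
            System.out.printf("queued@%dms start@%dms offsets [%d,%d)%n",
                              b[0], b[1], b[2], b[3]);
    }
}
```

In the run above, batch 1 is queued at 1000ms but only starts at 2500ms, yet it still reads offsets [100,200): the delay changes when the batch runs, not which offsets it covers.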
>>> On Tue, Sep 1, 2015 at 7:09 PM, Cody Koeninger <c...@koeninger.org>
>>> wrote:
>>>
>>>> It's at the time compute() gets called, which should be near the time
>>>> the batch should have been queued.