Great! :)

On 5/16/17 2:31 AM, Sameer Kumar wrote:
> I see now that my Kafka cluster is very stable, and these errors no longer
> occur.
> 
> -Sameer.
> 
> On Fri, May 5, 2017 at 7:53 AM, Sameer Kumar <sam.kum.w...@gmail.com> wrote:
> 
>> Yes, I have upgraded both my cluster and my clients to version 0.10.2.1 and
>> am currently monitoring the situation.
>> I will report back if I find any errors. Thanks for the help.
>>
>> -Sameer.
>>
>> On Fri, May 5, 2017 at 3:37 AM, Matthias J. Sax <matth...@confluent.io>
>> wrote:
>>
>>> Did you see Eno's reply?
>>>
>>> Please try out Streams 0.10.2.1 -- this should be fixed there. If not,
>>> please report back.
>>>
>>> I would also recommend subscribing to the list. It's self-service:
>>> http://kafka.apache.org/contact
>>>
>>>
>>> -Matthias
>>>
>>> On 5/3/17 10:49 PM, Sameer Kumar wrote:
>>>> My brokers are on version 0.10.1.0 and my clients are on version 0.10.2.0.
>>>> Also, please reply to all; I am currently not subscribed to the list.
>>>>
>>>> -Sameer.
>>>>
>>>> On Wed, May 3, 2017 at 6:34 PM, Sameer Kumar <sam.kum.w...@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>>
>>>>>
>>>>> I ran two nodes in my streams compute cluster. They ran fine for a few
>>>>> hours before failing with "Failed to rebalance" errors.
>>>>>
>>>>> I couldn't understand why this happened, but I saw one strange behaviour...
>>>>>
>>>>> At 16:53 on node1, I saw a "Failed to lock the state directory" error; this
>>>>> might have caused the partitions to be reassigned and hence the error.
>>>>>
>>>>>
>>>>>
>>>>> I am attaching detailed logs for both the nodes, please see if you can
>>>>> help.
>>>>>
>>>>>
>>>>>
>>>>> Here are some of the logs for quick reference.
>>>>>
>>>>>
>>>>>
>>>>> 2017-05-03 16:53:53 ERROR Kafka10Base:44 - Exception caught in thread StreamThread-2
>>>>> org.apache.kafka.streams.errors.StreamsException: stream-thread [StreamThread-2] Failed to rebalance
>>>>>     at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:612)
>>>>>     at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:368)
>>>>> Caused by: org.apache.kafka.streams.errors.StreamsException: stream-thread [StreamThread-2] failed to suspend stream tasks
>>>>>     at org.apache.kafka.streams.processor.internals.StreamThread.suspendTasksAndState(StreamThread.java:488)
>>>>>     at org.apache.kafka.streams.processor.internals.StreamThread.access$1200(StreamThread.java:69)
>>>>>     at org.apache.kafka.streams.processor.internals.StreamThread$1.onPartitionsRevoked(StreamThread.java:259)
>>>>>     at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinPrepare(ConsumerCoordinator.java:396)
>>>>>     at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:329)
>>>>>     at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:303)
>>>>>     at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:286)
>>>>>     at org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1030)
>>>>>     at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:995)
>>>>>     at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:582)
>>>>>     ... 1 more
>>>>>
>>>>> Caused by: org.apache.kafka.clients.consumer.CommitFailedException: Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. This means that the time between subsequent calls to poll() was longer than the configured max.poll.interval.ms, which typically implies that the poll loop is spending too much time message processing. You can address this either by increasing the session timeout or by reducing the maximum size of batches returned in poll() with max.poll.records.
>>>>>     at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.sendOffsetCommitRequest(ConsumerCoordinator.java:698)
>>>>>     at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.commitOffsetsSync(ConsumerCoordinator.java:577)
>>>>>     at org.apache.kafka.clients.consumer.KafkaConsumer.commitSync(KafkaConsumer.java:1125)
>>>>>     at org.apache.kafka.streams.processor.internals.StreamTask.commitOffsets(StreamTask.java:296)
>>>>>     at org.apache.kafka.streams.processor.internals.StreamThread$3.apply(StreamThread.java:535)
>>>>>     at org.apache.kafka.streams.processor.internals.StreamThread.performOnAllTasks(StreamThread.java:503)
>>>>>     at org.apache.kafka.streams.processor.internals.StreamThread.commitOffsets(StreamThread.java:531)
>>>>>     at org.apache.kafka.streams.processor.internals.StreamThread.suspendTasksAndState(StreamThread.java:480)
>>>>>     ... 10 more
>>>>>
>>>>>
>>>>>
>>>>> 2017-05-03 16:53:57 WARN  StreamThread:1184 - Could not create task 1_38. Will retry.
>>>>> org.apache.kafka.streams.errors.LockException: task [1_38] Failed to lock the state directory: /data/streampoc/LIC2-5/1_38
>>>>>     at org.apache.kafka.streams.processor.internals.ProcessorStateManager.<init>(ProcessorStateManager.java:102)
>>>>>     at org.apache.kafka.streams.processor.internals.AbstractTask.<init>(AbstractTask.java:73)
>>>>>     at org.apache.kafka.streams.processor.internals.StreamTask.<init>(StreamTask.java:108)
>>>>>     at org.apache.kafka.streams.processor.internals.StreamThread.createStreamTask(StreamThread.java:834)
>>>>>     at org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.createTask(StreamThread.java:1207)
>>>>>     at org.apache.kafka.streams.processor.internals.StreamThread$AbstractTaskCreator.retryWithBackoff(StreamThread.java:1180)
>>>>>     at org.apache.kafka.streams.processor.internals.StreamThread.addStreamTasks(StreamThread.java:937)
>>>>>     at org.apache.kafka.streams.processor.internals.StreamThread.access$500(StreamThread.java:69)
>>>>>     at org.apache.kafka.streams.processor.internals.StreamThread$1.onPartitionsAssigned(StreamThread.java:236)
>>>>>
>>>>>
>>>>> Regards,
>>>>>
>>>>> -Sameer.
>>>>>
>>>>
>>>
>>>
>>
> 
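
For anyone landing on this thread with the same CommitFailedException: the
message quoted above names two consumer settings, max.poll.interval.ms and
max.poll.records. The following is only a sketch of how they could be set,
using plain java.util.Properties so that it compiles without the Kafka jars.
The class name, the method name, and both values are assumptions made for
illustration; they do not come from Sameer's setup, and in this thread it was
the upgrade to 0.10.2.1 that made the errors go away.

```java
import java.util.Properties;

public class StreamsPollConfigSketch {

    // Builds the two consumer settings named in the CommitFailedException
    // message. The values are illustrative assumptions, not recommendations.
    static Properties pollSettings() {
        Properties props = new Properties();
        // Allow more time between two poll() calls before the consumer is
        // considered stalled and its partitions are reassigned.
        props.put("max.poll.interval.ms", "600000"); // assumed: 10 minutes
        // Return fewer records per poll() so each batch is processed sooner.
        props.put("max.poll.records", "100");        // assumed batch size
        return props;
    }

    public static void main(String[] args) {
        Properties props = pollSettings();
        // Print the settings so they can be checked before use.
        for (String name : props.stringPropertyNames()) {
            System.out.println(name + "=" + props.getProperty(name));
        }
    }
}
```

These entries would then be merged into the Properties object handed to the
Streams application, next to the usual application id and bootstrap servers
settings.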
