Great! :)

On 5/16/17 2:31 AM, Sameer Kumar wrote:
> I see now that my Kafka cluster is very stable, and these errors no longer
> occur.
>
> -Sameer.
>
> On Fri, May 5, 2017 at 7:53 AM, Sameer Kumar <sam.kum.w...@gmail.com> wrote:
>
>> Yes, I have upgraded both my cluster and my clients to version 0.10.2.1 and
>> am currently monitoring the situation.
>> Will report back in case I find any errors. Thanks for the help though.
>>
>> -Sameer.
>>
>> On Fri, May 5, 2017 at 3:37 AM, Matthias J. Sax <matth...@confluent.io>
>> wrote:
>>
>>> Did you see Eno's reply?
>>>
>>> Please try out Streams 0.10.2.1 -- this should be fixed there. If not,
>>> please report back.
>>>
>>> I would also recommend subscribing to the list. It's self-service:
>>> http://kafka.apache.org/contact
>>>
>>>
>>> -Matthias
>>>
>>> On 5/3/17 10:49 PM, Sameer Kumar wrote:
>>>> My brokers are on version 0.10.1.0 and my clients are on version 0.10.2.0.
>>>> Also, please reply to all; I am currently not subscribed to the list.
>>>>
>>>> -Sameer.
>>>>
>>>> On Wed, May 3, 2017 at 6:34 PM, Sameer Kumar <sam.kum.w...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I ran two nodes in my streams compute cluster. They ran fine for a
>>>>> few hours before failing with "Failed to rebalance" errors.
>>>>>
>>>>> I couldn't understand why this happened, but I saw one strange
>>>>> behaviour:
>>>>>
>>>>> At 16:53 on node1, I saw a "Failed to lock the state directory" error;
>>>>> this might have caused the partitions to relocate and hence the error.
>>>>>
>>>>> I am attaching detailed logs for both nodes; please see if you can
>>>>> help.
>>>>>
>>>>> Some of the logs, for quick reference:
>>>>> >>>>> >>>>> >>>>> 2017-05-03 16:53:53 ERROR Kafka10Base:44 - Exception caught in thread >>>>> StreamThread-2 >>>>> >>>>> org.apache.kafka.streams.errors.StreamsException: stream-thread >>>>> [StreamThread-2] Failed to rebalance >>>>> >>>>> at org.apache.kafka.streams.proce >>>>> ssor.internals.StreamThread.runLoop(StreamThread.java:612) >>>>> >>>>> at org.apache.kafka.streams.proce >>>>> ssor.internals.StreamThread.run(StreamThread.java:368) >>>>> >>>>> Caused by: org.apache.kafka.streams.errors.StreamsException: >>>>> stream-thread [StreamThread-2] failed to suspend stream tasks >>>>> >>>>> at org.apache.kafka.streams.proce >>>>> ssor.internals.StreamThread.suspendTasksAndState(StreamThrea >>> d.java:488) >>>>> >>>>> at org.apache.kafka.streams.proce >>>>> ssor.internals.StreamThread.access$1200(StreamThread.java:69) >>>>> >>>>> at org.apache.kafka.streams.proce >>>>> ssor.internals.StreamThread$1.onPartitionsRevoked(StreamThre >>> ad.java:259) >>>>> >>>>> at org.apache.kafka.clients.consu >>>>> mer.internals.ConsumerCoordinator.onJoinPrepare(ConsumerCoor >>>>> dinator.java:396) >>>>> >>>>> at org.apache.kafka.clients.consu >>>>> mer.internals.AbstractCoordinator.joinGroupIfNeeded(Abstract >>>>> Coordinator.java:329) >>>>> >>>>> at org.apache.kafka.clients.consu >>>>> mer.internals.AbstractCoordinator.ensureActiveGroup(Abstract >>>>> Coordinator.java:303) >>>>> >>>>> at org.apache.kafka.clients.consu >>>>> mer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:286) >>>>> >>>>> at org.apache.kafka.clients.consu >>>>> mer.KafkaConsumer.pollOnce(KafkaConsumer.java:1030) >>>>> >>>>> at org.apache.kafka.clients.consu >>>>> mer.KafkaConsumer.poll(KafkaConsumer.java:995) >>>>> >>>>> at org.apache.kafka.streams.proce >>>>> ssor.internals.StreamThread.runLoop(StreamThread.java:582) >>>>> >>>>> ... 
1 more >>>>> >>>>> Caused by: org.apache.kafka.clients.consumer.CommitFailedException: >>>>> Commit cannot be completed since the group has already rebalanced and >>>>> assigned the partitions to another member. This means that the time >>> between >>>>> subsequent calls to poll() was longer than the configured >>>>> max.poll.interval.ms, which typically implies that the poll loop is >>>>> spending too much time message processing. You can address this either >>> by >>>>> increasing the session timeout or by reducing the maximum size of >>> batches >>>>> returned in poll() with max.poll.records. >>>>> >>>>> at org.apache.kafka.clients.consu >>>>> mer.internals.ConsumerCoordinator.sendOffsetCommitRequest(Co >>>>> nsumerCoordinator.java:698) >>>>> >>>>> at org.apache.kafka.clients.consu >>>>> mer.internals.ConsumerCoordinator.commitOffsetsSync(Consumer >>>>> Coordinator.java:577) >>>>> >>>>> at org.apache.kafka.clients.consu >>>>> mer.KafkaConsumer.commitSync(KafkaConsumer.java:1125) >>>>> >>>>> at org.apache.kafka.streams.proce >>>>> ssor.internals.StreamTask.commitOffsets(StreamTask.java:296) >>>>> >>>>> at org.apache.kafka.streams.proce >>>>> ssor.internals.StreamThread$3.apply(StreamThread.java:535) >>>>> >>>>> at org.apache.kafka.streams.proce >>>>> ssor.internals.StreamThread.performOnAllTasks(StreamThread.java:503) >>>>> >>>>> at org.apache.kafka.streams.proce >>>>> ssor.internals.StreamThread.commitOffsets(StreamThread.java:531) >>>>> >>>>> at org.apache.kafka.streams.proce >>>>> ssor.internals.StreamThread.suspendTasksAndState(StreamThrea >>> d.java:480) >>>>> >>>>> ... 10 more >>>>> >>>>> >>>>> >>>>> 2017-05-03 16:53:57 WARN StreamThread:1184 - Could not create task >>> 1_38. >>>>> Will retry. 
>>>>>
>>>>> org.apache.kafka.streams.errors.LockException: task [1_38] Failed to lock the state directory: /data/streampoc/LIC2-5/1_38
>>>>>     at org.apache.kafka.streams.processor.internals.ProcessorStateManager.<init>(ProcessorStateManager.java:102)
>>>>>     at org.apache.kafka.streams.processor.internals.AbstractTask.<init>(AbstractTask.java:73)
>>>>>     at org.apache.kafka.streams.processor.internals.StreamTask.<init>(StreamTask.java:108)
>>>>>     at org.apache.kafka.streams.processor.internals.StreamThread.createStreamTask(StreamThread.java:834)
>>>>>     at org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.createTask(StreamThread.java:1207)
>>>>>     at org.apache.kafka.streams.processor.internals.StreamThread$AbstractTaskCreator.retryWithBackoff(StreamThread.java:1180)
>>>>>     at org.apache.kafka.streams.processor.internals.StreamThread.addStreamTasks(StreamThread.java:937)
>>>>>     at org.apache.kafka.streams.processor.internals.StreamThread.access$500(StreamThread.java:69)
>>>>>     at org.apache.kafka.streams.processor.internals.StreamThread$1.onPartitionsAssigned(StreamThread.java:236)
>>>>>
>>>>> Regards,
>>>>>
>>>>> -Sameer.
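
For reference, the CommitFailedException message quoted in the original report names two consumer settings, max.poll.interval.ms and max.poll.records. The thread itself was resolved by upgrading to 0.10.2.1, not by tuning these, so what follows is only a minimal sketch of how such settings might be expressed as client properties. The class name, helper method, application id, broker address, and numeric values are all hypothetical and chosen purely for illustration; only the property keys are real Kafka configuration names.

```java
import java.util.Properties;

public class PollTuningSketch {

    // Hypothetical helper: collects the two settings named in the
    // CommitFailedException message. All values are illustrative assumptions.
    public static Properties buildConfig() {
        Properties props = new Properties();
        // Placeholder identifiers, not taken from the thread.
        props.setProperty("application.id", "example-streams-app");
        props.setProperty("bootstrap.servers", "localhost:9092");
        // Allow up to 10 minutes between poll() calls before the consumer is
        // considered failed and the group rebalances.
        props.setProperty("max.poll.interval.ms", String.valueOf(10 * 60 * 1000));
        // Return fewer records per poll() so each batch is processed sooner.
        props.setProperty("max.poll.records", "100");
        return props;
    }

    public static void main(String[] args) {
        Properties props = buildConfig();
        // Print the resulting configuration for inspection.
        for (String name : props.stringPropertyNames()) {
            System.out.println(name + "=" + props.getProperty(name));
        }
    }
}
```

In a real application, entries like these would go into the Properties object handed to the Kafka Streams or consumer client; suitable values depend on how long processing a batch actually takes.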