Hi,

Aren't you using storm-kafka-client rather than storm-kafka? If so, STORM-1455 isn't relevant, since it's filed against a different component than the one you're using.
You might consider enabling trace logging for the spout and the KafkaConsumer. It'll produce a lot of output, but it will also let you see which requests the consumer sends. More logging is probably your best bet for figuring out what's going wrong. If the offsets really are being rewound to earliest, it's also worth double-checking how the spout decides where to start reading (see the first sketch after the quoted message below).

For example, after adding the following to storm/log4j2/worker.xml:

<Logger name="org.apache.kafka" level="TRACE">
    <appender-ref ref="A1"/>
</Logger>

I get logs like these:

2017-09-07 19:02:21.772 o.a.k.c.c.i.ConsumerCoordinator Thread-16-kafka_spout-executor[4, 4] [DEBUG] Group kafkaSpoutTestGroup committed offset 2446 for partition kafka-spout-test-0
2017-09-07 19:02:21.779 o.a.k.c.c.i.Fetcher Thread-16-kafka_spout-executor[4, 4] [TRACE] Returning fetched records at offset 2447 for assigned partition kafka-spout-test-0 and update position to 2448
2017-09-07 19:02:21.779 o.a.k.c.c.i.Fetcher Thread-16-kafka_spout-executor[4, 4] [TRACE] Added fetch request for partition kafka-spout-test-1-0 at offset 2452

Regarding the NotLeaderForPartitionException: each partition has a leader node, which is the broker responsible for reads and writes for that partition. The exception is telling you that the producer sent a request to the wrong Kafka broker. This can happen when leadership for a partition passes from one broker to another, e.g. because one of the brokers was temporarily down or unreachable. The producer will try to figure out who the new leader is after that exception, so it's probably not a concern unless the exceptions happen very frequently. In that case you'd want to investigate why leadership is changing so often. (There's a producer-config sketch for this after the quoted message as well.)

2017-09-07 1:06 GMT+02:00 pradeep s <sreekumar.prad...@gmail.com>:

> Hi,
> Can you please confirm whether the bug below is fixed in Storm 1.1.0:
> https://issues.apache.org/jira/browse/STORM-1455
>
> We are seeing that the consumer offset is getting reset to the earliest
> offset for a few topics in the group.
>
> This was observed in the prod environment, and there were only info logs;
> there was not much we could figure out from them. Any suggestions on how
> to replicate the issue?
>
> One issue we noticed with the Kafka cluster is that we were getting
> producer errors like the one below:
>
> ERROR 2017-09-05 12:39:49,735 [kafka-producer-network-thread | producer-1]
> A failure occurred sending a message to Kafka.
> org.apache.kafka.common.errors.NotLeaderForPartitionException: This
> server is not the leader for that topic-partition.
>
> Thanks
> Pradeep
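Here's the first sketch I mentioned. In storm-kafka-client, the spout's FirstPollOffsetStrategy controls where it starts reading depending on whether a committed offset exists for the group. This is a minimal sketch assuming the 1.x KafkaSpoutConfig builder API; the broker list and topic name are placeholders, so double-check the method names against the exact version you're running:

import org.apache.storm.kafka.spout.KafkaSpout;
import org.apache.storm.kafka.spout.KafkaSpoutConfig;
import org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy;

// "localhost:9092" and "kafka-spout-test" are placeholders for your setup.
KafkaSpoutConfig<String, String> spoutConfig =
    KafkaSpoutConfig.builder("localhost:9092", "kafka-spout-test")
        // UNCOMMITTED_EARLIEST (the default) only falls back to the earliest
        // offset when no committed offset exists for the group, e.g. because
        // the committed offsets expired on the broker or the group id changed.
        // EARLIEST, by contrast, rewinds on every (re)deployment.
        .setFirstPollOffsetStrategy(FirstPollOffsetStrategy.UNCOMMITTED_EARLIEST)
        .build();

KafkaSpout<String, String> spout = new KafkaSpout<>(spoutConfig);

If you're already on UNCOMMITTED_EARLIEST and still seeing rewinds, one thing to check is whether those low-traffic topics went longer than the broker's offsets.retention.minutes without a commit; the broker deletes expired offsets, and the consumer then falls back to the earliest offset.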
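And here's the producer-config sketch: NotLeaderForPartitionException is a retriable error, so giving the producer a few retries lets it refresh its metadata and resend to the new leader instead of surfacing the exception to your send callback. The broker list, topic, and retry counts below are placeholders, not recommendations:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
// Retriable errors such as NotLeaderForPartitionException are retried
// internally up to this many times before being handed to the callback.
props.put(ProducerConfig.RETRIES_CONFIG, 5);
// Pause between retries, giving leader election time to finish.
props.put(ProducerConfig.RETRY_BACKOFF_MS_CONFIG, 1000);

try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
    producer.send(new ProducerRecord<>("kafka-spout-test", "key", "value"),
        (metadata, exception) -> {
            if (exception != null) {
                // Only non-retriable errors, or retriable ones that exhausted
                // the retries above, show up here.
                exception.printStackTrace();
            }
        });
}

One caveat: with retries > 0, a retried batch can land after a later one, so if ordering within a partition matters to you, also set max.in.flight.requests.per.connection to 1.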