Re: BufferOverflowException on rolling new segment after upgrading Kafka from 1.1.0 to 2.3.1
Bumping this up with a new update: I've investigated another occurrence of this exception. For the analysis, I used:

1) a memory dump taken from the broker
2) the kafka log file
3) the kafka state-change log
4) the log, index and time-index files of the failed segment
5) the Kafka source code, versions 2.3.1 and 1.1.0

Here's how the exception looks in the kafka log:

2019/11/19 16:03:00 INFO [ProducerStateManager partition=ad_group_metrics-62] Writing producer snapshot at offset 13886052 (kafka.log.ProducerStateManager)
2019/11/19 16:03:00 INFO [Log partition=ad_group_metrics-62, dir=/mnt/kafka] Rolled new log segment at offset 13886052 in 1 ms. (kafka.log.Log)
2019/11/19 16:03:00 ERROR [ReplicaManager broker=17] Error processing append operation on partition ad_group_metrics-62 (kafka.server.ReplicaManager)
2019/11/19 16:03:00 java.nio.BufferOverflowException
2019/11/19 16:03:00 at java.nio.Buffer.nextPutIndex(Buffer.java:527)
2019/11/19 16:03:00 at java.nio.DirectByteBuffer.putLong(DirectByteBuffer.java:797)
2019/11/19 16:03:00 at kafka.log.TimeIndex.$anonfun$maybeAppend$1(TimeIndex.scala:134)
2019/11/19 16:03:00 at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
2019/11/19 16:03:00 at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:253)
2019/11/19 16:03:00 at kafka.log.TimeIndex.maybeAppend(TimeIndex.scala:114)
2019/11/19 16:03:00 at kafka.log.LogSegment.onBecomeInactiveSegment(LogSegment.scala:520)
2019/11/19 16:03:00 at kafka.log.Log.$anonfun$roll$8(Log.scala:1690)
2019/11/19 16:03:00 at kafka.log.Log.$anonfun$roll$8$adapted(Log.scala:1690)
2019/11/19 16:03:00 at scala.Option.foreach(Option.scala:407)
2019/11/19 16:03:00 at kafka.log.Log.$anonfun$roll$2(Log.scala:1690)
2019/11/19 16:03:00 at kafka.log.Log.maybeHandleIOException(Log.scala:2085)
2019/11/19 16:03:00 at kafka.log.Log.roll(Log.scala:1654)
2019/11/19 16:03:00 at kafka.log.Log.maybeRoll(Log.scala:1639)
2019/11/19 16:03:00 at kafka.log.Log.$anonfun$append$2(Log.scala:966)
2019/11/19 16:03:00 at kafka.log.Log.maybeHandleIOException(Log.scala:2085)
2019/11/19 16:03:00 at kafka.log.Log.append(Log.scala:850)
2019/11/19 16:03:00 at kafka.log.Log.appendAsLeader(Log.scala:819)
2019/11/19 16:03:00 at kafka.cluster.Partition.$anonfun$appendRecordsToLeader$1(Partition.scala:772)
2019/11/19 16:03:00 at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:253)
2019/11/19 16:03:00 at kafka.utils.CoreUtils$.inReadLock(CoreUtils.scala:259)
2019/11/19 16:03:00 at kafka.cluster.Partition.appendRecordsToLeader(Partition.scala:759)
2019/11/19 16:03:00 at kafka.server.ReplicaManager.$anonfun$appendToLocalLog$2(ReplicaManager.scala:763)
2019/11/19 16:03:00 at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:238)
2019/11/19 16:03:00 at scala.collection.mutable.HashMap.$anonfun$foreach$1(HashMap.scala:149)
2019/11/19 16:03:00 at scala.collection.mutable.HashTable.foreachEntry(HashTable.scala:237)
2019/11/19 16:03:00 at scala.collection.mutable.HashTable.foreachEntry$(HashTable.scala:230)
2019/11/19 16:03:00 at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:44)
2019/11/19 16:03:00 at scala.collection.mutable.HashMap.foreach(HashMap.scala:149)
2019/11/19 16:03:00 at scala.collection.TraversableLike.map(TraversableLike.scala:238)
2019/11/19 16:03:00 at scala.collection.TraversableLike.map$(TraversableLike.scala:231)
2019/11/19 16:03:00 at scala.collection.AbstractTraversable.map(Traversable.scala:108)
2019/11/19 16:03:00 at kafka.server.ReplicaManager.appendToLocalLog(ReplicaManager.scala:751)
2019/11/19 16:03:00 at kafka.server.ReplicaManager.appendRecords(ReplicaManager.scala:492)
2019/11/19 16:03:00 at kafka.server.KafkaApis.handleProduceRequest(KafkaApis.scala:544)
2019/11/19 16:03:00 at kafka.server.KafkaApis.handle(KafkaApis.scala:113)
2019/11/19 16:03:00 at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:69)
2019/11/19 16:03:00 at java.lang.Thread.run(Thread.java:748)
...

What we see here is that a new segment was rolled at offset 13886052, and then the exception was thrown while trying to mark *some* segment as inactive (`onBecomeInactiveSegment`) during an append of new messages to the Log. The timing between rolling the new segment and appending new messages doesn't play a role; there are many other similar exceptions where the failure occurs a few seconds after a new segment was rolled.

I managed to find the `LogSegment` object for offset 13886052 in the memory dump. Following the source code logic and checking the LogSegment state against the Kafka logs, I found that the `TimeIndex` object had somehow ended up in a state with 0 entries and 0 maximum possible entries (and an empty memory map). Having 0 entries is normal for a TimeIndex or OffsetIndex, even when the Log contains records, until their size passes some threshold. But having 0 maximum possible entries along with 0 entries made the TimeIndex be considered full (0 entries == 0 max entries), which kept triggering the roll of a new segment. The Log was
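To make the failure mode concrete, here is a minimal sketch of the index logic described above, modeled on Kafka 2.3.1's AbstractIndex/TimeIndex (simplified: the real index wraps a memory-mapped .timeindex file, which is replaced here by a zero-capacity direct buffer standing in for the empty memory map found in the dump):

    import java.nio.ByteBuffer

    // Simplified from kafka.log.AbstractIndex / kafka.log.TimeIndex (2.3.1).
    // The real index wraps a memory-mapped .timeindex file; a zero-capacity
    // direct buffer stands in here for the empty memory map seen in the dump.
    class TimeIndexSketch(mmap: ByteBuffer) {
      val entrySize = 12                        // 8-byte timestamp + 4-byte relative offset
      var entries = 0
      val maxEntries = mmap.limit() / entrySize // 0 when the map is empty

      // With 0 entries and 0 maxEntries this is true, so the segment keeps
      // reporting its index as full on every append, triggering a roll.
      def isFull: Boolean = entries >= maxEntries

      // onBecomeInactiveSegment passes skipFullCheck = true, so the write
      // bypasses the guard and goes straight to the zero-capacity buffer.
      def maybeAppend(timestamp: Long, offset: Int, skipFullCheck: Boolean = false): Unit = {
        if (!skipFullCheck)
          require(!isFull, "Attempt to append to a full time index")
        mmap.putLong(timestamp) // java.nio.BufferOverflowException, as in the trace
        mmap.putInt(offset)
        entries += 1
      }
    }

    object TimeIndexSketch extends App {
      val index = new TimeIndexSketch(ByteBuffer.allocateDirect(0))
      println(index.isFull) // true: 0 entries >= 0 maxEntries
      index.maybeAppend(System.currentTimeMillis(), 0, skipFullCheck = true) // throws
    }

Running this prints true for isFull and then throws java.nio.BufferOverflowException from putLong, matching the stack trace above.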
Re: BufferOverflowException on rolling new segment after upgrading Kafka from 1.1.0 to 2.3.1
Filed JIRA bug: https://issues.apache.org/jira/browse/KAFKA-9213

On Tue, Nov 19, 2019 at 2:58 PM Ismael Juma wrote:
> Can you please file a JIRA?
>
> Ismael
>
> On Tue, Nov 19, 2019 at 11:52 AM Daniyar Kulakhmetov <dkulakhme...@liftoff.io> wrote:
> > [quoted original report and stack trace trimmed; see the original message at the bottom of this thread]
Re: BufferOverflowException on rolling new segment after upgrading Kafka from 1.1.0 to 2.3.1
Can you please file a JIRA?

Ismael

On Tue, Nov 19, 2019 at 11:52 AM Daniyar Kulakhmetov <dkulakhme...@liftoff.io> wrote:
> [quoted original report and stack trace trimmed; see the original message at the bottom of this thread]
Re: BufferOverflowException on rolling new segment after upgrading Kafka from 1.1.0 to 2.3.1
Hi,

We followed the upgrade instructions (https://kafka.apache.org/documentation/#upgrade) up to step 2, and, as step 3 says, "Once the cluster's behavior and performance has been verified, bump the protocol version by editing"; we were still verifying the cluster's behavior.

Thanks,

On Tue, Nov 19, 2019 at 12:19 PM M. Manna wrote:
> Hi,
>
> Is there any reason why you haven't performed the upgrade based on the official docs? Or, is this something you're planning to do now?
>
> Thanks,
>
> On Tue, 19 Nov 2019 at 19:52, Daniyar Kulakhmetov wrote:
> > [quoted original report and stack trace trimmed; see the original message at the bottom of this thread]
Re: BufferOverflowException on rolling new segment after upgrading Kafka from 1.1.0 to 2.3.1
Hi,

Is there any reason why you haven't performed the upgrade based on the official docs? Or, is this something you're planning to do now?

Thanks,

On Tue, 19 Nov 2019 at 19:52, Daniyar Kulakhmetov wrote:
> [quoted original report and stack trace trimmed; see the original message at the bottom of this thread]
BufferOverflowException on rolling new segment after upgrading Kafka from 1.1.0 to 2.3.1
Hi Kafka users,

We updated our Kafka cluster from version 1.1.0 to 2.3.1. The message format and inter-broker protocol versions were left the same:

inter.broker.protocol.version=1.1
log.message.format.version=1.1

After upgrading, we started to get some occasional exceptions:

2019/11/19 05:30:53 INFO [ProducerStateManager partition=matchmaker_retry_clicks_15m-2] Writing producer snapshot at offset 788532 (kafka.log.ProducerStateManager)
2019/11/19 05:30:53 INFO [Log partition=matchmaker_retry_clicks_15m-2, dir=/mnt/kafka] Rolled new log segment at offset 788532 in 1 ms. (kafka.log.Log)
2019/11/19 05:31:01 ERROR [ReplicaManager broker=0] Error processing append operation on partition matchmaker_retry_clicks_15m-2 (kafka.server.ReplicaManager)
2019/11/19 05:31:01 java.nio.BufferOverflowException
2019/11/19 05:31:01 at java.nio.Buffer.nextPutIndex(Buffer.java:527)
2019/11/19 05:31:01 at java.nio.DirectByteBuffer.putLong(DirectByteBuffer.java:797)
2019/11/19 05:31:01 at kafka.log.TimeIndex.$anonfun$maybeAppend$1(TimeIndex.scala:134)
2019/11/19 05:31:01 at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
2019/11/19 05:31:01 at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:253)
2019/11/19 05:31:01 at kafka.log.TimeIndex.maybeAppend(TimeIndex.scala:114)
2019/11/19 05:31:01 at kafka.log.LogSegment.onBecomeInactiveSegment(LogSegment.scala:520)
2019/11/19 05:31:01 at kafka.log.Log.$anonfun$roll$8(Log.scala:1690)
2019/11/19 05:31:01 at kafka.log.Log.$anonfun$roll$8$adapted(Log.scala:1690)
2019/11/19 05:31:01 at scala.Option.foreach(Option.scala:407)
2019/11/19 05:31:01 at kafka.log.Log.$anonfun$roll$2(Log.scala:1690)
2019/11/19 05:31:01 at kafka.log.Log.maybeHandleIOException(Log.scala:2085)
2019/11/19 05:31:01 at kafka.log.Log.roll(Log.scala:1654)
2019/11/19 05:31:01 at kafka.log.Log.maybeRoll(Log.scala:1639)
2019/11/19 05:31:01 at kafka.log.Log.$anonfun$append$2(Log.scala:966)
2019/11/19 05:31:01 at kafka.log.Log.maybeHandleIOException(Log.scala:2085)
2019/11/19 05:31:01 at kafka.log.Log.append(Log.scala:850)
2019/11/19 05:31:01 at kafka.log.Log.appendAsLeader(Log.scala:819)
2019/11/19 05:31:01 at kafka.cluster.Partition.$anonfun$appendRecordsToLeader$1(Partition.scala:772)
2019/11/19 05:31:01 at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:253)
2019/11/19 05:31:01 at kafka.utils.CoreUtils$.inReadLock(CoreUtils.scala:259)
2019/11/19 05:31:01 at kafka.cluster.Partition.appendRecordsToLeader(Partition.scala:759)
2019/11/19 05:31:01 at kafka.server.ReplicaManager.$anonfun$appendToLocalLog$2(ReplicaManager.scala:763)
2019/11/19 05:31:01 at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:238)
2019/11/19 05:31:01 at scala.collection.mutable.HashMap.$anonfun$foreach$1(HashMap.scala:149)
2019/11/19 05:31:01 at scala.collection.mutable.HashTable.foreachEntry(HashTable.scala:237)
2019/11/19 05:31:01 at scala.collection.mutable.HashTable.foreachEntry$(HashTable.scala:230)
2019/11/19 05:31:01 at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:44)
2019/11/19 05:31:01 at scala.collection.mutable.HashMap.foreach(HashMap.scala:149)
2019/11/19 05:31:01 at scala.collection.TraversableLike.map(TraversableLike.scala:238)
2019/11/19 05:31:01 at scala.collection.TraversableLike.map$(TraversableLike.scala:231)
2019/11/19 05:31:01 at scala.collection.AbstractTraversable.map(Traversable.scala:108)
2019/11/19 05:31:01 at kafka.server.ReplicaManager.appendToLocalLog(ReplicaManager.scala:751)
2019/11/19 05:31:01 at kafka.server.ReplicaManager.appendRecords(ReplicaManager.scala:492)
2019/11/19 05:31:01 at kafka.server.KafkaApis.handleProduceRequest(KafkaApis.scala:544)
2019/11/19 05:31:01 at kafka.server.KafkaApis.handle(KafkaApis.scala:113)
2019/11/19 05:31:01 at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:69)
2019/11/19 05:31:01 at java.lang.Thread.run(Thread.java:748)

This error persists until the broker gets restarted (or leadership gets moved to another broker).

What could be the issue and how can we solve it?

Thank you!

Best regards,
Daniyar.
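For anyone who wants to reproduce the failing frame in isolation: the trace bottoms out in DirectByteBuffer.putLong on the time index's memory map, and, assuming the map ended up with zero capacity (as the follow-up investigation at the top of this thread found), a few lines of Scala raise the same exception:

    import java.nio.ByteBuffer

    // A zero-capacity direct buffer, standing in for an empty .timeindex
    // memory map, fails exactly like the trace above: putLong calls
    // Buffer.nextPutIndex, which throws because position == limit == 0.
    object OverflowRepro extends App {
      val mmap = ByteBuffer.allocateDirect(0)
      mmap.putLong(System.currentTimeMillis()) // java.nio.BufferOverflowException
    }

This doesn't explain how the index reached that state, but it confirms the exception itself is just a write into an empty buffer.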