wangliucheng created KAFKA-15851:
------------------------------------
Summary: broker under replicated due to error
java.nio.BufferOverflowException
Key: KAFKA-15851
URL: https://issues.apache.org/jira/browse/KAFKA-15851
Project: Kafka
Issue Type: Bug
Affects Versions: 3.3.2
Environment: Kafka Version: 3.3.2
Deployment mode: zookeeper
Reporter: wangliucheng
Attachments: p1.png, server.log
In my kafka cluster, kafka update 2.0 to 3.3.2 version
{*}first start failed{*}, because the same directory was configured
The error is as follows:
{code:java}
[2023-11-16 10:04:09,952] ERROR (main kafka.Kafka$ 159) Exiting Kafka due to
fatal exception during startup.
java.lang.IllegalStateException: Duplicate log directories for
skydas_sc_tdevirsec-12 are found in both
/data01/kafka/log/skydas_sc_tdevirsec-12 and
/data07/kafka/log/skydas_sc_tdevirsec-12. It is likely because log directory
failure happened while broker was replacing current replica with future
replica. Recover broker from this failure by manually deleting one of the two
directories for this partition. It is recommended to delete the partition in
the log directory that is known to have failed recently.
at kafka.log.LogManager.loadLog(LogManager.scala:305)
at kafka.log.LogManager.$anonfun$loadLogs$14(LogManager.scala:403)
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
[2023-11-16 10:04:09,953] INFO (kafka-shutdown-hook kafka.server.KafkaServer
66) [KafkaServer id=1434] shutting down {code}
*second,* remove /data07/kafka/log in log.dirs and start kafka also reported
an error :
{code:java}
[2023-11-16 10:13:10,713] INFO (ReplicaFetcherThread-3-1008
kafka.log.UnifiedLog 66) [UnifiedLog partition=ty_udp_full-60,
dir=/data04/kafka/log] Rolling new log segment (log_size =
755780551/1073741824}, offset_index_size = 2621440/2621440, time_index_size =
1747626/1747626, inactive_time_ms = 2970196/604800000).
[2023-11-16 10:13:10,714] ERROR (ReplicaFetcherThread-3-1008
kafka.server.ReplicaFetcherThread 76) [ReplicaFetcher replicaId=1434,
leaderId=1008, fetcherId=3] Unexpected error occurred while processing data for
partition ty_udp_full-60 at offset 2693467479
java.nio.BufferOverflowException
at java.nio.Buffer.nextPutIndex(Buffer.java:555)
at java.nio.DirectByteBuffer.putLong(DirectByteBuffer.java:794)
at kafka.log.TimeIndex.$anonfun$maybeAppend$1(TimeIndex.scala:135)
at kafka.log.TimeIndex.maybeAppend(TimeIndex.scala:114)
at kafka.log.LogSegment.onBecomeInactiveSegment(LogSegment.scala:510)
at kafka.log.LocalLog.$anonfun$roll$9(LocalLog.scala:529)
at kafka.log.LocalLog.$anonfun$roll$9$adapted(LocalLog.scala:529)
at scala.Option.foreach(Option.scala:437)
at kafka.log.LocalLog.$anonfun$roll$2(LocalLog.scala:529)
at kafka.log.LocalLog.roll(LocalLog.scala:786)
at kafka.log.UnifiedLog.roll(UnifiedLog.scala:1537)
at kafka.log.UnifiedLog.maybeRoll(UnifiedLog.scala:1523)
at kafka.log.UnifiedLog.append(UnifiedLog.scala:919)
at kafka.log.UnifiedLog.appendAsFollower(UnifiedLog.scala:778)
at
kafka.cluster.Partition.doAppendRecordsToFollowerOrFutureReplica(Partition.scala:1121)
at
kafka.cluster.Partition.appendRecordsToFollowerOrFutureReplica(Partition.scala:1128)
at
kafka.server.ReplicaFetcherThread.processPartitionData(ReplicaFetcherThread.scala:121)
at
kafka.server.AbstractFetcherThread.$anonfun$processFetchRequest$7(AbstractFetcherThread.scala:336)
at scala.Option.foreach(Option.scala:437)
at
kafka.server.AbstractFetcherThread.$anonfun$processFetchRequest$6(AbstractFetcherThread.scala:325)
at
kafka.server.AbstractFetcherThread.$anonfun$processFetchRequest$6$adapted(AbstractFetcherThread.scala:324)
at
kafka.utils.Implicits$MapExtensionMethods$.$anonfun$forKeyValue$1(Implicits.scala:62)
at
scala.collection.convert.JavaCollectionWrappers$JMapWrapperLike.foreachEntry(JavaCollectionWrappers.scala:359)
at
scala.collection.convert.JavaCollectionWrappers$JMapWrapperLike.foreachEntry$(JavaCollectionWrappers.scala:355)
at
scala.collection.convert.JavaCollectionWrappers$AbstractJMapWrapper.foreachEntry(JavaCollectionWrappers.scala:309)
at
kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:324)
at
kafka.server.AbstractFetcherThread.$anonfun$maybeFetch$3(AbstractFetcherThread.scala:124)
at
kafka.server.AbstractFetcherThread.$anonfun$maybeFetch$3$adapted(AbstractFetcherThread.scala:123)
at scala.Option.foreach(Option.scala:437)
at
kafka.server.AbstractFetcherThread.maybeFetch(AbstractFetcherThread.scala:123)
at
kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:106)
at
kafka.server.ReplicaFetcherThread.doWork(ReplicaFetcherThread.scala:97)
at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:96)
[2023-11-16 10:13:10,714] WARN (ReplicaFetcherThread-3-1008
kafka.server.ReplicaFetcherThread 70) [ReplicaFetcher replicaId=1434,
leaderId=1008, fetcherId=3] Partition ty_udp_full-60 marked as failed
{code}
start again and finally run normally but log rolled
I want to know how this error occurred, I suspect that the status of the
timeindex file is incorrect,
Thanks
The detailed log information can be found in server.log
--
This message was sent by Atlassian Jira
(v8.20.10#820010)