[jira] [Created] (KAFKA-16114) Fix partition not retained after cancelling intra-broker alter log dir task
wangliucheng created KAFKA-16114:

Summary: Fix partition not retained after cancelling intra-broker alter log dir task
Key: KAFKA-16114
URL: https://issues.apache.org/jira/browse/KAFKA-16114
Project: Kafka
Issue Type: Bug
Components: log
Affects Versions: 3.6.1, 3.3.2
Reporter: wangliucheng

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-16080) partition not retained after executing ALTER_REPLICA_LOG_DIRS and LEADER_AND_ISR requests at the same time
wangliucheng created KAFKA-16080:

Summary: partition not retained after executing ALTER_REPLICA_LOG_DIRS and LEADER_AND_ISR requests at the same time
Key: KAFKA-16080
URL: https://issues.apache.org/jira/browse/KAFKA-16080
Project: Kafka
Issue Type: Bug
Affects Versions: 3.3.2
Reporter: wangliucheng

Hi, I found a reproducible problem. While the server is running an ALTER_REPLICA_LOG_DIRS task, e.g. moving test01-1 from /data01/kafka/log/test01-1 to /data02/kafka/log/test01-1.xxx-future, then:

1) The kafka-log-retention thread works on neither /data01/kafka/log/test01-1 nor /data02/kafka/log/test01-1.xxx-future, so the data is not subject to retention while the task is running.

Analysis: the kafka-log-retention thread stops working on test01-1 after logManager.abortAndPauseCleaning(topicPartition) is invoked.

2) If a LEADER_AND_ISR request arrives while the ALTER_REPLICA_LOG_DIRS task is running, then after the task ends the data in /data02/kafka/log/test01-1 is never deleted.

Analysis: logManager.abortAndPauseCleaning(topicPartition) is invoked twice, but cleaning is resumed only once.

How can this problem be fixed? Thanks
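The imbalance described in analysis 2) can be sketched as a per-partition pause counter. This is a hypothetical simplification (the class and method names below are illustrative, not Kafka's real LogManager API): each abortAndPauseCleaning() increments the count, each resumeCleaning() decrements it, and retention only runs while the count is zero, so pausing twice but resuming once leaves the partition paused forever.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch of pause/resume bookkeeping for log cleaning.
// CleaningState, pauseCounts, and retentionAllowed are hypothetical names.
class CleaningState {
    private final Map<String, Integer> pauseCounts = new ConcurrentHashMap<>();

    // Called e.g. when an ALTER_REPLICA_LOG_DIRS move starts, and again
    // when a LEADER_AND_ISR request arrives for the same partition.
    void abortAndPauseCleaning(String topicPartition) {
        pauseCounts.merge(topicPartition, 1, Integer::sum);
    }

    // Decrements the pause count; the entry is removed only at zero.
    void resumeCleaning(String topicPartition) {
        pauseCounts.computeIfPresent(topicPartition, (tp, n) -> n > 1 ? n - 1 : null);
    }

    // Retention may run only while the partition is not paused at all.
    boolean retentionAllowed(String topicPartition) {
        return !pauseCounts.containsKey(topicPartition);
    }
}
```

With two pauses and a single resume, retentionAllowed() stays false, which matches the reported symptom that the future-replica data is never deleted after the task ends.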
[jira] [Created] (KAFKA-15851) broker under-replicated due to java.nio.BufferOverflowException
wangliucheng created KAFKA-15851:

Summary: broker under-replicated due to java.nio.BufferOverflowException
Key: KAFKA-15851
URL: https://issues.apache.org/jira/browse/KAFKA-15851
Project: Kafka
Issue Type: Bug
Affects Versions: 3.3.2
Environment: Kafka Version: 3.3.2 Deployment mode: zookeeper
Reporter: wangliucheng
Attachments: p1.png, server.log

In my Kafka cluster I upgraded Kafka from 2.0 to 3.3.2. {*}First, startup failed{*} because the same directory was configured. The error is as follows:

{code:java}
[2023-11-16 10:04:09,952] ERROR (main kafka.Kafka$ 159) Exiting Kafka due to fatal exception during startup.
java.lang.IllegalStateException: Duplicate log directories for skydas_sc_tdevirsec-12 are found in both /data01/kafka/log/skydas_sc_tdevirsec-12 and /data07/kafka/log/skydas_sc_tdevirsec-12. It is likely because log directory failure happened while broker was replacing current replica with future replica. Recover broker from this failure by manually deleting one of the two directories for this partition. It is recommended to delete the partition in the log directory that is known to have failed recently.
        at kafka.log.LogManager.loadLog(LogManager.scala:305)
        at kafka.log.LogManager.$anonfun$loadLogs$14(LogManager.scala:403)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
[2023-11-16 10:04:09,953] INFO (kafka-shutdown-hook kafka.server.KafkaServer 66) [KafkaServer id=1434] shutting down
{code}

{*}Second{*}, after removing /data07/kafka/log from log.dirs and starting Kafka, it also reported an error:

{code:java}
[2023-11-16 10:13:10,713] INFO (ReplicaFetcherThread-3-1008 kafka.log.UnifiedLog 66) [UnifiedLog partition=ty_udp_full-60, dir=/data04/kafka/log] Rolling new log segment (log_size = 755780551/1073741824}, offset_index_size = 2621440/2621440, time_index_size = 1747626/1747626, inactive_time_ms = 2970196/60480).
[2023-11-16 10:13:10,714] ERROR (ReplicaFetcherThread-3-1008 kafka.server.ReplicaFetcherThread 76) [ReplicaFetcher replicaId=1434, leaderId=1008, fetcherId=3] Unexpected error occurred while processing data for partition ty_udp_full-60 at offset 2693467479
java.nio.BufferOverflowException
        at java.nio.Buffer.nextPutIndex(Buffer.java:555)
        at java.nio.DirectByteBuffer.putLong(DirectByteBuffer.java:794)
        at kafka.log.TimeIndex.$anonfun$maybeAppend$1(TimeIndex.scala:135)
        at kafka.log.TimeIndex.maybeAppend(TimeIndex.scala:114)
        at kafka.log.LogSegment.onBecomeInactiveSegment(LogSegment.scala:510)
        at kafka.log.LocalLog.$anonfun$roll$9(LocalLog.scala:529)
        at kafka.log.LocalLog.$anonfun$roll$9$adapted(LocalLog.scala:529)
        at scala.Option.foreach(Option.scala:437)
        at kafka.log.LocalLog.$anonfun$roll$2(LocalLog.scala:529)
        at kafka.log.LocalLog.roll(LocalLog.scala:786)
        at kafka.log.UnifiedLog.roll(UnifiedLog.scala:1537)
        at kafka.log.UnifiedLog.maybeRoll(UnifiedLog.scala:1523)
        at kafka.log.UnifiedLog.append(UnifiedLog.scala:919)
        at kafka.log.UnifiedLog.appendAsFollower(UnifiedLog.scala:778)
        at kafka.cluster.Partition.doAppendRecordsToFollowerOrFutureReplica(Partition.scala:1121)
        at kafka.cluster.Partition.appendRecordsToFollowerOrFutureReplica(Partition.scala:1128)
        at kafka.server.ReplicaFetcherThread.processPartitionData(ReplicaFetcherThread.scala:121)
        at kafka.server.AbstractFetcherThread.$anonfun$processFetchRequest$7(AbstractFetcherThread.scala:336)
        at scala.Option.foreach(Option.scala:437)
        at kafka.server.AbstractFetcherThread.$anonfun$processFetchRequest$6(AbstractFetcherThread.scala:325)
        at kafka.server.AbstractFetcherThread.$anonfun$processFetchRequest$6$adapted(AbstractFetcherThread.scala:324)
        at kafka.utils.Implicits$MapExtensionMethods$.$anonfun$forKeyValue$1(Implicits.scala:62)
        at scala.collection.convert.JavaCollectionWrappers$JMapWrapperLike.foreachEntry(JavaCollectionWrappers.scala:359)
        at scala.collection.convert.JavaCollectionWrappers$JMapWrapperLike.foreachEntry$(JavaCollectionWrappers.scala:355)
        at scala.collection.convert.JavaCollectionWrappers$AbstractJMapWrapper.foreachEntry(JavaCollectionWrappers.scala:309)
        at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:324)
        at kafka.server.AbstractFetcherThread.$anonfun$maybeFetch$3(AbstractFetcherThread.scala:124)
        at
{code}
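The INFO line above shows the time index already at capacity (time_index_size = 1747626/1747626) when the segment rolls, and the trace fails inside TimeIndex.maybeAppend on a DirectByteBuffer.putLong. A minimal sketch of that failure mode, assuming a TimeIndex-like fixed-capacity direct buffer (TimeIndexSketch and appendEntry are illustrative names, not Kafka code; a real time-index entry is 8 bytes of timestamp plus 4 bytes of relative offset):

```java
import java.nio.BufferOverflowException;
import java.nio.ByteBuffer;

// Illustrative sketch: appending a 12-byte time-index entry to a direct
// buffer that has no room left raises java.nio.BufferOverflowException,
// just as in the trace above.
class TimeIndexSketch {
    static final int ENTRY_SIZE = 12; // 8-byte timestamp + 4-byte relative offset

    // Returns true if the entry was written, false if the buffer was full.
    static boolean appendEntry(ByteBuffer index, long timestamp, int relativeOffset) {
        try {
            index.putLong(timestamp).putInt(relativeOffset);
            return true;
        } catch (BufferOverflowException e) {
            return false;
        }
    }
}
```

With a buffer sized for exactly one entry, the first append succeeds and the second hits BufferOverflowException, which the fetcher thread in the trace does not catch, leaving the partition under-replicated.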
[jira] [Created] (KAFKA-15506) follower receives KafkaStorageException before leader raises disk error
wangliucheng created KAFKA-15506:

Summary: follower receives KafkaStorageException before leader raises disk error
Key: KAFKA-15506
URL: https://issues.apache.org/jira/browse/KAFKA-15506
Project: Kafka
Issue Type: Bug
Components: core
Affects Versions: 3.3.2
Environment: Kafka Version: 3.3.2 Jdk Version: jdk1.8.0_301 Deployment mode: kraft
Reporter: wangliucheng

In my Kafka environment the topic has 2 replicas; both leader and follower became unavailable when the leader's disk failed. The follower detected the disk error before the leader did. Here are the logs:

*follower receives KafkaStorageException:*
{code:java}
[2023-08-17 08:40:15,516] ERROR [ReplicaFetcher replicaId=4, leaderId=1, fetcherId=10] Error for partition __consumer_offsets-37 at offset 305860652 (kafka.server.ReplicaFetcherThread)
org.apache.kafka.common.errors.KafkaStorageException: Disk error when trying to access log file on the disk.
{code}

*ISR shrinks from 4,1 to 1:*
{code:java}
[2023-08-17 08:41:49,953] INFO [Partition __consumer_offsets-37 broker=1] Shrinking ISR from 4,1 to 1. Leader: (highWatermark: 305860652, endOffset: 305860653). Out of sync replicas: (brokerId: 4, endOffset: 305860652). (kafka.cluster.Partition)
{code}

*broker marks dir offline:*
{code:java}
[2023-08-17 08:41:50,188] ERROR Error while appending records to eb_raw_legendsec_flow_2-33 in dir /data09/kafka/log (kafka.server.LogDirFailureChannel)
java.io.IOException: Read-only file system
        at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
        at sun.nio.ch.FileDispatcherImpl.write(FileDispatcherImpl.java:60)
        at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
        at sun.nio.ch.IOUtil.write(IOUtil.java:65)
        at sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:211)
        at org.apache.kafka.common.record.MemoryRecords.writeFullyTo(MemoryRecords.java:92)
        at org.apache.kafka.common.record.FileRecords.append(FileRecords.java:188)
        at kafka.log.LogSegment.append(LogSegment.scala:158)
        at kafka.log.LocalLog.append(LocalLog.scala:436)
        at kafka.log.UnifiedLog.append(UnifiedLog.scala:949)
        at kafka.log.UnifiedLog.appendAsFollower(UnifiedLog.scala:778)
        at kafka.cluster.Partition.doAppendRecordsToFollowerOrFutureReplica(Partition.scala:1121)
        at kafka.cluster.Partition.appendRecordsToFollowerOrFutureReplica(Partition.scala:1128)
        at kafka.server.ReplicaFetcherThread.processPartitionData(ReplicaFetcherThread.scala:121)
        at kafka.server.AbstractFetcherThread.$anonfun$processFetchRequest$7(AbstractFetcherThread.scala:336)
        at scala.Option.foreach(Option.scala:437)
        at kafka.server.AbstractFetcherThread.$anonfun$processFetchRequest$6(AbstractFetcherThread.scala:325)
        at kafka.server.AbstractFetcherThread.$anonfun$processFetchRequest$6$adapted(AbstractFetcherThread.scala:324)
        at kafka.utils.Implicits$MapExtensionMethods$.$anonfun$forKeyValue$1(Implicits.scala:62)
        at scala.collection.convert.JavaCollectionWrappers$JMapWrapperLike.foreachEntry(JavaCollectionWrappers.scala:359)
        at scala.collection.convert.JavaCollectionWrappers$JMapWrapperLike.foreachEntry$(JavaCollectionWrappers.scala:355)
        at scala.collection.convert.JavaCollectionWrappers$AbstractJMapWrapper.foreachEntry(JavaCollectionWrappers.scala:309)
        at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:324)
        at kafka.server.AbstractFetcherThread.$anonfun$maybeFetch$3(AbstractFetcherThread.scala:124)
        at kafka.server.AbstractFetcherThread.$anonfun$maybeFetch$3$adapted(AbstractFetcherThread.scala:123)
        at scala.Option.foreach(Option.scala:437)
        at kafka.server.AbstractFetcherThread.maybeFetch(AbstractFetcherThread.scala:123)
        at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:106)
        at kafka.server.ReplicaFetcherThread.doWork(ReplicaFetcherThread.scala:97)
        at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:96)
{code}
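The last log line is emitted through kafka.server.LogDirFailureChannel, whose general pattern can be sketched as follows. This is a rough, hypothetical simplification (not Kafka's actual code): the first IOException observed for a log dir wins and enqueues that dir for the offline handler, while later errors for the same dir are deduplicated. The timeline in this report shows the follower's fetch already failing with KafkaStorageException about 90 seconds before the leader's own append hits the read-only file system and reaches this channel.

```java
import java.io.IOException;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch of a log-dir failure channel: record the first error
// per dir and hand the dir to a single handler exactly once.
class DirFailureChannel {
    private final ConcurrentHashMap<String, IOException> offline = new ConcurrentHashMap<>();
    private final BlockingQueue<String> queue = new ArrayBlockingQueue<>(10);

    // Called from any thread that hits an IOException while writing to dir.
    void maybeAddOfflineDir(String dir, IOException cause) {
        if (offline.putIfAbsent(dir, cause) == null) {
            queue.add(dir); // only the first error for this dir is enqueued
        }
    }

    // Non-blocking variant for illustration; a handler thread would
    // normally block on queue.take() and mark the dir's logs offline.
    String pollNextOffline() {
        return queue.poll();
    }
}
```

Duplicate errors for the same dir are absorbed, so the offline handler runs once per failed directory no matter how many appends fail concurrently.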