[jira] [Created] (KAFKA-16114) Fix partition retention not resuming after cancelling an alter intra-broker log dir task

2024-01-10 Thread wangliucheng (Jira)
wangliucheng created KAFKA-16114:


 Summary: Fix partition retention not resuming after cancelling an alter 
intra-broker log dir task 
 Key: KAFKA-16114
 URL: https://issues.apache.org/jira/browse/KAFKA-16114
 Project: Kafka
  Issue Type: Bug
  Components: log
Affects Versions: 3.6.1, 3.3.2
Reporter: wangliucheng






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-16080) partition retention not running after ALTER_REPLICA_LOG_DIRS and LEADER_AND_ISR requests are executed at the same time

2024-01-04 Thread wangliucheng (Jira)
wangliucheng created KAFKA-16080:


 Summary: partition retention not running after ALTER_REPLICA_LOG_DIRS and 
LEADER_AND_ISR requests are executed at the same time
 Key: KAFKA-16080
 URL: https://issues.apache.org/jira/browse/KAFKA-16080
 Project: Kafka
  Issue Type: Bug
Affects Versions: 3.3.2
Reporter: wangliucheng


Hi,
I found a reproducible problem.

When the server is running an ALTER_REPLICA_LOG_DIRS task, e.g. moving test01-1 
from /data01/kafka/log/test01-1 to /data02/kafka/log/test01-1.xxx-future, then:

1) The kafka-log-retention thread works on neither /data01/kafka/log/test01-1 
nor /data02/kafka/log/test01-1.xxx-future, so the data is not cleaned up by 
retention while the task is running.
   Analysis: the kafka-log-retention thread stops working on test01-1 after 
logManager.abortAndPauseCleaning(topicPartition) is invoked.

2) If a LEADER_AND_ISR request arrives while the ALTER_REPLICA_LOG_DIRS task is 
running, then after the task ends the data in /data02/kafka/log/test01-1 is 
never deleted.
   Analysis: logManager.abortAndPauseCleaning(topicPartition) is invoked twice, 
but cleaning is resumed only once.

How can this problem be fixed?
Thanks 
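The unbalanced pause/resume in 2) can be modeled with a small pause counter. This is a simplified sketch, not Kafka's actual LogManager/LogCleanerManager code; the class and method names here only mirror the ones in the report to show why pausing twice but resuming once leaves the partition paused forever:

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model (not Kafka source) of a per-partition pause counter:
// cleaning stays paused until resume has been called as many times as pause.
class CleaningState {
    private final Map<String, Integer> pauseCount = new HashMap<>();

    void abortAndPauseCleaning(String tp) {
        // Each pause increments the counter for the partition.
        pauseCount.merge(tp, 1, Integer::sum);
    }

    void resumeCleaning(String tp) {
        // Each resume decrements; the entry is removed when it reaches zero.
        pauseCount.computeIfPresent(tp, (k, v) -> v > 1 ? v - 1 : null);
    }

    boolean isPaused(String tp) {
        return pauseCount.getOrDefault(tp, 0) > 0;
    }

    public static void main(String[] args) {
        CleaningState s = new CleaningState();
        s.abortAndPauseCleaning("test01-1"); // ALTER_REPLICA_LOG_DIRS pauses cleaning
        s.abortAndPauseCleaning("test01-1"); // concurrent LEADER_AND_ISR pauses again
        s.resumeCleaning("test01-1");        // finishing the move resumes only once
        System.out.println(s.isPaused("test01-1")); // prints "true": retention stays paused
    }
}
```

Under this model the fix would be to make each pause path responsible for exactly one matching resume.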





[jira] [Created] (KAFKA-15851) broker under-replicated due to java.nio.BufferOverflowException

2023-11-16 Thread wangliucheng (Jira)
wangliucheng created KAFKA-15851:


 Summary: broker under-replicated due to 
java.nio.BufferOverflowException
 Key: KAFKA-15851
 URL: https://issues.apache.org/jira/browse/KAFKA-15851
 Project: Kafka
  Issue Type: Bug
Affects Versions: 3.3.2
 Environment: Kafka Version: 3.3.2

Deployment mode: zookeeper

Reporter: wangliucheng
 Attachments: p1.png, server.log

In my Kafka cluster, Kafka was upgraded from 2.0 to 3.3.2.

{*}The first start failed{*} because the same directory was configured.

The error is as follows:

{code:java}
[2023-11-16 10:04:09,952] ERROR (main kafka.Kafka$ 159) Exiting Kafka due to fatal exception during startup.
java.lang.IllegalStateException: Duplicate log directories for skydas_sc_tdevirsec-12 are found in both /data01/kafka/log/skydas_sc_tdevirsec-12 and /data07/kafka/log/skydas_sc_tdevirsec-12. It is likely because log directory failure happened while broker was replacing current replica with future replica. Recover broker from this failure by manually deleting one of the two directories for this partition. It is recommended to delete the partition in the log directory that is known to have failed recently.
        at kafka.log.LogManager.loadLog(LogManager.scala:305)
        at kafka.log.LogManager.$anonfun$loadLogs$14(LogManager.scala:403)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
[2023-11-16 10:04:09,953] INFO (kafka-shutdown-hook kafka.server.KafkaServer 66) [KafkaServer id=1434] shutting down {code}
 

 

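The error message asks the operator to find which of the two directories holds the stale copy. A hypothetical helper, not part of Kafka, that scans the configured log.dirs for partition directories present in more than one dir might look like this (the dir paths in main are only examples):

```java
import java.io.File;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not Kafka code): list partition directories that
// appear under more than one configured log dir, which is the condition
// LogManager.loadLog rejects at startup.
public class DuplicateLogDirScan {
    public static Map<String, List<String>> findDuplicates(String[] logDirs) {
        Map<String, List<String>> byPartition = new HashMap<>();
        for (String dir : logDirs) {
            File[] entries = new File(dir).listFiles(File::isDirectory);
            if (entries == null) continue; // dir missing or unreadable
            for (File partitionDir : entries) {
                byPartition.computeIfAbsent(partitionDir.getName(),
                        k -> new ArrayList<>()).add(partitionDir.getPath());
            }
        }
        // Keep only partition names found in two or more log dirs.
        byPartition.values().removeIf(paths -> paths.size() < 2);
        return byPartition;
    }

    public static void main(String[] args) {
        Map<String, List<String>> dups = findDuplicates(new String[]{
                "/data01/kafka/log", "/data07/kafka/log"}); // example dirs
        dups.forEach((p, dirs) -> System.out.println(p + " -> " + dirs));
    }
}
```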
*Second,* after removing /data07/kafka/log from log.dirs and starting Kafka 
again, another error was reported:

{code:java}
[2023-11-16 10:13:10,713] INFO (ReplicaFetcherThread-3-1008 kafka.log.UnifiedLog 66) [UnifiedLog partition=ty_udp_full-60, dir=/data04/kafka/log] Rolling new log segment (log_size = 755780551/1073741824}, offset_index_size = 2621440/2621440, time_index_size = 1747626/1747626, inactive_time_ms = 2970196/60480).
[2023-11-16 10:13:10,714] ERROR (ReplicaFetcherThread-3-1008 kafka.server.ReplicaFetcherThread 76) [ReplicaFetcher replicaId=1434, leaderId=1008, fetcherId=3] Unexpected error occurred while processing data for partition ty_udp_full-60 at offset 2693467479
java.nio.BufferOverflowException
        at java.nio.Buffer.nextPutIndex(Buffer.java:555)
        at java.nio.DirectByteBuffer.putLong(DirectByteBuffer.java:794)
        at kafka.log.TimeIndex.$anonfun$maybeAppend$1(TimeIndex.scala:135)
        at kafka.log.TimeIndex.maybeAppend(TimeIndex.scala:114)
        at kafka.log.LogSegment.onBecomeInactiveSegment(LogSegment.scala:510)
        at kafka.log.LocalLog.$anonfun$roll$9(LocalLog.scala:529)
        at kafka.log.LocalLog.$anonfun$roll$9$adapted(LocalLog.scala:529)
        at scala.Option.foreach(Option.scala:437)
        at kafka.log.LocalLog.$anonfun$roll$2(LocalLog.scala:529)
        at kafka.log.LocalLog.roll(LocalLog.scala:786)
        at kafka.log.UnifiedLog.roll(UnifiedLog.scala:1537)
        at kafka.log.UnifiedLog.maybeRoll(UnifiedLog.scala:1523)
        at kafka.log.UnifiedLog.append(UnifiedLog.scala:919)
        at kafka.log.UnifiedLog.appendAsFollower(UnifiedLog.scala:778)
        at kafka.cluster.Partition.doAppendRecordsToFollowerOrFutureReplica(Partition.scala:1121)
        at kafka.cluster.Partition.appendRecordsToFollowerOrFutureReplica(Partition.scala:1128)
        at kafka.server.ReplicaFetcherThread.processPartitionData(ReplicaFetcherThread.scala:121)
        at kafka.server.AbstractFetcherThread.$anonfun$processFetchRequest$7(AbstractFetcherThread.scala:336)
        at scala.Option.foreach(Option.scala:437)
        at kafka.server.AbstractFetcherThread.$anonfun$processFetchRequest$6(AbstractFetcherThread.scala:325)
        at kafka.server.AbstractFetcherThread.$anonfun$processFetchRequest$6$adapted(AbstractFetcherThread.scala:324)
        at kafka.utils.Implicits$MapExtensionMethods$.$anonfun$forKeyValue$1(Implicits.scala:62)
        at scala.collection.convert.JavaCollectionWrappers$JMapWrapperLike.foreachEntry(JavaCollectionWrappers.scala:359)
        at scala.collection.convert.JavaCollectionWrappers$JMapWrapperLike.foreachEntry$(JavaCollectionWrappers.scala:355)
        at scala.collection.convert.JavaCollectionWrappers$AbstractJMapWrapper.foreachEntry(JavaCollectionWrappers.scala:309)
        at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:324)
        at kafka.server.AbstractFetcherThread.$anonfun$maybeFetch$3(AbstractFetcherThread.scala:124)
        ... {code}

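In the rolling log line above, the time index is exactly full (time_index_size = 1747626/1747626), so the putLong inside TimeIndex.maybeAppend lands past the end of the index's fixed-size buffer. A minimal illustration of the resulting java.nio.BufferOverflowException, with hypothetical sizes and not Kafka's actual index code:

```java
import java.nio.BufferOverflowException;
import java.nio.ByteBuffer;

// Illustration only (not Kafka source): a time index entry is a 12-byte
// (timestamp: long, relativeOffset: int) pair appended to a fixed-capacity
// buffer. Appending once the buffer is full throws BufferOverflowException,
// matching the DirectByteBuffer.putLong frame in the stack trace above.
public class TimeIndexOverflow {
    public static void main(String[] args) {
        ByteBuffer index = ByteBuffer.allocateDirect(12); // room for one entry
        index.putLong(1700000000000L); // timestamp of the first entry
        index.putInt(42);              // relative offset -> buffer is now full
        try {
            index.putLong(1700000000001L); // a second entry does not fit
        } catch (BufferOverflowException e) {
            System.out.println("BufferOverflowException: index buffer full");
        }
    }
}
```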
[jira] [Created] (KAFKA-15506) follower receives KafkaStorageException before leader raises disk error

2023-09-26 Thread wangliucheng (Jira)
wangliucheng created KAFKA-15506:


 Summary: follower receives KafkaStorageException before leader 
raises disk error 
 Key: KAFKA-15506
 URL: https://issues.apache.org/jira/browse/KAFKA-15506
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 3.3.2
 Environment: Kafka Version: 3.3.2
Jdk Version: jdk1.8.0_301
Deployment mode: kraft 
Reporter: wangliucheng


In my Kafka environment, the topic has 2 replicas. Both leader and follower 
become unavailable when the leader's disk fails, because the follower detects 
the disk error before the leader does.
Here are the logs:

*Follower receives KafkaStorageException:*
{code:java}
[2023-08-17 08:40:15,516] ERROR [ReplicaFetcher replicaId=4, leaderId=1, fetcherId=10] Error for partition __consumer_offsets-37 at offset 305860652 (kafka.server.ReplicaFetcherThread)
org.apache.kafka.common.errors.KafkaStorageException: Disk error when trying to access log file on the disk.
 {code}
*ISR shrinks from 4,1 to 1:*
{code:java}
[2023-08-17 08:41:49,953] INFO [Partition __consumer_offsets-37 broker=1] Shrinking ISR from 4,1 to 1. Leader: (highWatermark: 305860652, endOffset: 305860653). Out of sync replicas: (brokerId: 4, endOffset: 305860652). (kafka.cluster.Partition)
 {code}
*Broker marks the log dir offline:*
{code:java}
[2023-08-17 08:41:50,188] ERROR Error while appending records to eb_raw_legendsec_flow_2-33 in dir /data09/kafka/log (kafka.server.LogDirFailureChannel)
java.io.IOException: Read-only file system
        at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
        at sun.nio.ch.FileDispatcherImpl.write(FileDispatcherImpl.java:60)
        at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
        at sun.nio.ch.IOUtil.write(IOUtil.java:65)
        at sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:211)
        at org.apache.kafka.common.record.MemoryRecords.writeFullyTo(MemoryRecords.java:92)
        at org.apache.kafka.common.record.FileRecords.append(FileRecords.java:188)
        at kafka.log.LogSegment.append(LogSegment.scala:158)
        at kafka.log.LocalLog.append(LocalLog.scala:436)
        at kafka.log.UnifiedLog.append(UnifiedLog.scala:949)
        at kafka.log.UnifiedLog.appendAsFollower(UnifiedLog.scala:778)
        at kafka.cluster.Partition.doAppendRecordsToFollowerOrFutureReplica(Partition.scala:1121)
        at kafka.cluster.Partition.appendRecordsToFollowerOrFutureReplica(Partition.scala:1128)
        at kafka.server.ReplicaFetcherThread.processPartitionData(ReplicaFetcherThread.scala:121)
        at kafka.server.AbstractFetcherThread.$anonfun$processFetchRequest$7(AbstractFetcherThread.scala:336)
        at scala.Option.foreach(Option.scala:437)
        at kafka.server.AbstractFetcherThread.$anonfun$processFetchRequest$6(AbstractFetcherThread.scala:325)
        at kafka.server.AbstractFetcherThread.$anonfun$processFetchRequest$6$adapted(AbstractFetcherThread.scala:324)
        at kafka.utils.Implicits$MapExtensionMethods$.$anonfun$forKeyValue$1(Implicits.scala:62)
        at scala.collection.convert.JavaCollectionWrappers$JMapWrapperLike.foreachEntry(JavaCollectionWrappers.scala:359)
        at scala.collection.convert.JavaCollectionWrappers$JMapWrapperLike.foreachEntry$(JavaCollectionWrappers.scala:355)
        at scala.collection.convert.JavaCollectionWrappers$AbstractJMapWrapper.foreachEntry(JavaCollectionWrappers.scala:309)
        at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:324)
        at kafka.server.AbstractFetcherThread.$anonfun$maybeFetch$3(AbstractFetcherThread.scala:124)
        at kafka.server.AbstractFetcherThread.$anonfun$maybeFetch$3$adapted(AbstractFetcherThread.scala:123)
        at scala.Option.foreach(Option.scala:437)
        at kafka.server.AbstractFetcherThread.maybeFetch(AbstractFetcherThread.scala:123)
        at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:106)
        at kafka.server.ReplicaFetcherThread.doWork(ReplicaFetcherThread.scala:97)
        at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:96) 
{code}


