[jira] [Commented] (KAFKA-17244) java.base/java.lang.VirtualThread$VThreadContinuation.onPinned
[ https://issues.apache.org/jira/browse/KAFKA-17244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17870479#comment-17870479 ] Jianbin Chen commented on KAFKA-17244:
--
# jdk21, Rocky Linux 8.4, Intel
# The issue reliably occurs once you invoke {{KafkaProducer#send}} from a virtual thread. You can add the {{-Djdk.tracePinnedThreads=full}} JVM option to your test program to observe the pinning.

Thanks for your attention to this issue. I think it is closely related to the following code: if a virtual thread blocks while holding a synchronized monitor, it is pinned to its carrier thread.
{code:java}
synchronized (dq) {
    // After taking the lock, validate that the partition hasn't changed and retry.
    if (partitionChanged(topic, topicInfo, partitionInfo, dq, nowMs, cluster))
        continue;

    RecordAppendResult appendResult = appendNewBatch(topic, effectivePartition, dq, timestamp, key, value, headers, callbacks, buffer, nowMs);
    // Set buffer to null, so that deallocate doesn't return it back to free pool, since it's used in the batch.
    if (appendResult.newBatchCreated)
        buffer = null;

    // If queue has incomplete batches we disable switch (see comments in updatePartitionInfo).
    boolean enableSwitch = allBatchesFull(dq);
    topicInfo.builtInPartitioner.updatePartitionInfo(partitionInfo, appendResult.appendedBytes, cluster, enableSwitch);
    return appendResult;
}
{code}
Should we replace the synchronized block on dq with a ReentrantLock?

> java.base/java.lang.VirtualThread$VThreadContinuation.onPinned
> --
>
> Key: KAFKA-17244
> URL: https://issues.apache.org/jira/browse/KAFKA-17244
> Project: Kafka
> Issue Type: Wish
> Components: clients, producer
> Affects Versions: 3.7.1
> Reporter: Jianbin Chen
> Priority: Major
>
> {code:java}
> Thread[#121,ForkJoinPool-1-worker-2,5,CarrierThreads]
> java.base/java.lang.VirtualThread$VThreadContinuation.onPinned(VirtualThread.java:183)
> java.base/jdk.internal.vm.Continuation.onPinned0(Continuation.java:393)
> java.base/java.lang.VirtualThread.tryYield(VirtualThread.java:756)
> java.base/java.lang.Thread.yield(Thread.java:443)
> java.base/java.util.concurrent.ConcurrentHashMap.initTable(ConcurrentHashMap.java:2295)
> java.base/java.util.concurrent.ConcurrentHashMap.putVal(ConcurrentHashMap.java:1017)
> java.base/java.util.concurrent.ConcurrentHashMap.putIfAbsent(ConcurrentHashMap.java:1541)
> org.apache.kafka.common.record.CompressionRatioEstimator.getAndCreateEstimationIfAbsent(CompressionRatioEstimator.java:96)
> org.apache.kafka.common.record.CompressionRatioEstimator.estimation(CompressionRatioEstimator.java:59)
> org.apache.kafka.clients.producer.internals.ProducerBatch.<init>(ProducerBatch.java:95)
> org.apache.kafka.clients.producer.internals.ProducerBatch.<init>(ProducerBatch.java:83)
> org.apache.kafka.clients.producer.internals.RecordAccumulator.appendNewBatch(RecordAccumulator.java:399)
> org.apache.kafka.clients.producer.internals.RecordAccumulator.append(RecordAccumulator.java:350) <== monitors:1
> org.apache.kafka.clients.producer.KafkaProducer.doSend(KafkaProducer.java:1025)
> org.apache.kafka.clients.producer.KafkaProducer.send(KafkaProducer.java:947)
> {code}
> Because the {{RecordAccumulator.append}} method holds a synchronized monitor, the virtual thread gets pinned ({{onPinned}}). If this is considered worth optimizing, please assign it to me and I will try to fix it.
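For anyone trying to observe the pin, here is a minimal sketch of the reproduction described in point 2, assuming JDK 21, a broker reachable at localhost:9092, and an existing topic named test-topic (both illustrative); run it with {{-Djdk.tracePinnedThreads=full}}. Per the stack trace quoted above, the pin is reported when {{ConcurrentHashMap.initTable}} calls {{Thread.yield}} while the monitor taken in {{RecordAccumulator.append}} is still held.
{code:java}
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class PinnedVirtualThreadRepro {
    public static void main(String[] args) throws InterruptedException {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // illustrative address
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // The first send from a virtual thread enters the synchronized block in
            // RecordAccumulator.append; any yield/park inside it pins the carrier thread.
            Thread vt = Thread.ofVirtual().start(() ->
                    producer.send(new ProducerRecord<>("test-topic", "key", "value")));
            vt.join();
        }
    }
}
{code}
With the tracing flag enabled, the JVM prints a stack trace like the one quoted above each time a pinned virtual thread fails to yield.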
[jira] [Created] (KAFKA-17244) java.base/java.lang.VirtualThread$VThreadContinuation.onPinned
Jianbin Chen created KAFKA-17244:
--
Summary: java.base/java.lang.VirtualThread$VThreadContinuation.onPinned
Key: KAFKA-17244
URL: https://issues.apache.org/jira/browse/KAFKA-17244
Project: Kafka
Issue Type: Wish
Affects Versions: 3.7.1
Reporter: Jianbin Chen

{code:java}
Thread[#121,ForkJoinPool-1-worker-2,5,CarrierThreads]
java.base/java.lang.VirtualThread$VThreadContinuation.onPinned(VirtualThread.java:183)
java.base/jdk.internal.vm.Continuation.onPinned0(Continuation.java:393)
java.base/java.lang.VirtualThread.tryYield(VirtualThread.java:756)
java.base/java.lang.Thread.yield(Thread.java:443)
java.base/java.util.concurrent.ConcurrentHashMap.initTable(ConcurrentHashMap.java:2295)
java.base/java.util.concurrent.ConcurrentHashMap.putVal(ConcurrentHashMap.java:1017)
java.base/java.util.concurrent.ConcurrentHashMap.putIfAbsent(ConcurrentHashMap.java:1541)
org.apache.kafka.common.record.CompressionRatioEstimator.getAndCreateEstimationIfAbsent(CompressionRatioEstimator.java:96)
org.apache.kafka.common.record.CompressionRatioEstimator.estimation(CompressionRatioEstimator.java:59)
org.apache.kafka.clients.producer.internals.ProducerBatch.<init>(ProducerBatch.java:95)
org.apache.kafka.clients.producer.internals.ProducerBatch.<init>(ProducerBatch.java:83)
org.apache.kafka.clients.producer.internals.RecordAccumulator.appendNewBatch(RecordAccumulator.java:399)
org.apache.kafka.clients.producer.internals.RecordAccumulator.append(RecordAccumulator.java:350) <== monitors:1
org.apache.kafka.clients.producer.KafkaProducer.doSend(KafkaProducer.java:1025)
org.apache.kafka.clients.producer.KafkaProducer.send(KafkaProducer.java:947)
{code}
Because the {{RecordAccumulator.append}} method holds a synchronized monitor, the virtual thread gets pinned ({{onPinned}}). If this is considered worth optimizing, please assign it to me and I will try to fix it.
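For context on the change being proposed, here is a minimal sketch (not the actual patch) of the substitution the report asks about: guarding the per-partition deque with a {{java.util.concurrent.locks.ReentrantLock}} instead of a synchronized monitor. On JDK 21, a virtual thread blocked on a j.u.c lock can unmount from its carrier, whereas blocking inside synchronized pins it (JEP 444). Class and field names below are illustrative, not the real RecordAccumulator internals.
{code:java}
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative stand-in for the per-partition deque guarded in RecordAccumulator.
class LockedBatchQueue<T> {
    private final Deque<T> dq = new ArrayDeque<>();
    private final ReentrantLock lock = new ReentrantLock();

    void append(T batch) {
        lock.lock(); // a virtual thread blocked here can unmount from its carrier
        try {
            dq.addLast(batch);
        } finally {
            lock.unlock(); // always release, mirroring monitor-exit semantics
        }
    }
}
{code}
The trade-off is that a lock object must be threaded through every site that currently does synchronized (dq), since the same deque has to be guarded by the same lock everywhere.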
[jira] [Commented] (KAFKA-17068) Failed to modify controller IP under Raft mode
[ https://issues.apache.org/jira/browse/KAFKA-17068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17862667#comment-17862667 ] Jianbin Chen commented on KAFKA-17068:
--
[~showuon] I'll go check out this KIP. Thanks for your response.
[jira] [Commented] (KAFKA-17068) Failed to modify controller IP under Raft mode
[ https://issues.apache.org/jira/browse/KAFKA-17068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17862666#comment-17862666 ] Jianbin Chen commented on KAFKA-17068:
--
I found the reason: I forgot to stop the node at 192.168.1.123 after I had already pointed the other two brokers at 192.168.1.126, so starting 192.168.1.126 failed. Is there any way to improve this situation? Otherwise the behavior is misleading and can cause confusion during troubleshooting.
[jira] [Created] (KAFKA-17068) Failed to modify controller IP under Raft mode
Jianbin Chen created KAFKA-17068:
--
Summary: Failed to modify controller IP under Raft mode
Key: KAFKA-17068
URL: https://issues.apache.org/jira/browse/KAFKA-17068
Project: Kafka
Issue Type: Wish
Affects Versions: 3.7.1
Reporter: Jianbin Chen

{code:java}
controller.quorum.voters=1@192.168.1.123:9093,2@192.168.1.124:9093,3@192.168.1.125:9093{code}
change to
{code:java}
controller.quorum.voters=1@192.168.1.126:9093,2@192.168.1.124:9093,3@192.168.1.125:9093{code}
192.168.1.126 log:
{code:java}
[2024-07-03 14:05:22,236] INFO [ControllerRegistrationManager id=1 incarnation=ULBsL0bbRvG7iXKCKtCQgg] RegistrationResponseHandler: controller acknowledged ControllerRegistrationRequest. (kafka.server.ControllerRegistrationManager)
[2024-07-03 14:05:22,708] INFO [ControllerRegistrationManager id=1 incarnation=ULBsL0bbRvG7iXKCKtCQgg] Our registration has been persisted to the metadata log. (kafka.server.ControllerRegistrationManager)
[2024-07-03 14:05:22,816] INFO [AdminClient clientId=adminclient-1] Node -1 disconnected. (org.apache.kafka.clients.NetworkClient)
[2024-07-03 14:05:22,816] WARN [AdminClient clientId=adminclient-1] Connection to node -1 (/192.168.1.126:9092) could not be established. Node may not be available. (org.apache.kafka.clients.NetworkClient){code}
Then the node at 192.168.1.126 crashed:
{code:java}
[2024-07-03 14:06:55,134] INFO [MetadataLoader id=1] beginShutdown: shutting down event queue. (org.apache.kafka.queue.KafkaEventQueue)
[2024-07-03 14:06:55,134] INFO [SnapshotGenerator id=1] beginShutdown: shutting down event queue. (org.apache.kafka.queue.KafkaEventQueue)
[2024-07-03 14:06:55,134] INFO [SnapshotGenerator id=1] closed event queue. (org.apache.kafka.queue.KafkaEventQueue)
[2024-07-03 14:06:55,135] INFO [MetadataLoader id=1] closed event queue. (org.apache.kafka.queue.KafkaEventQueue)
[2024-07-03 14:06:55,135] INFO [SnapshotGenerator id=1] closed event queue. (org.apache.kafka.queue.KafkaEventQueue)
[2024-07-03 14:06:55,136] INFO Metrics scheduler closed (org.apache.kafka.common.metrics.Metrics)
[2024-07-03 14:06:55,136] INFO Closing reporter org.apache.kafka.common.metrics.JmxReporter (org.apache.kafka.common.metrics.Metrics)
[2024-07-03 14:06:55,137] INFO Metrics reporters closed (org.apache.kafka.common.metrics.Metrics)
[2024-07-03 14:06:55,137] INFO App info kafka.server for 1 unregistered (org.apache.kafka.common.utils.AppInfoParser)
[2024-07-03 14:06:55,137] INFO App info kafka.server for 1 unregistered (org.apache.kafka.common.utils.AppInfoParser)
{code}
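Related to the root cause found in the comments (a stale voter still running with the old address), one way to double-check the live quorum before editing {{controller.quorum.voters}} is the {{Admin#describeMetadataQuorum}} API (added in Kafka 3.3 via KIP-836). A minimal sketch; the bootstrap address is illustrative:
{code:java}
import java.util.Properties;

import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.QuorumInfo;

public class DescribeQuorum {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // illustrative address; point at any reachable broker in the cluster
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "192.168.1.124:9092");

        try (Admin admin = Admin.create(props)) {
            QuorumInfo info = admin.describeMetadataQuorum().quorumInfo().get();
            System.out.println("leader id = " + info.leaderId());
            // Each voter's replica id and log end offset, as seen by the current leader.
            info.voters().forEach(v ->
                    System.out.println("voter " + v.replicaId() + " logEndOffset=" + v.logEndOffset()));
        }
    }
}
{code}
If a voter id still reports progress while it is supposed to have moved to a new host, the old node is likely still running and needs to be stopped before the address change can take effect.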
[jira] [Commented] (KAFKA-17020) After enabling tiered storage, occasional residual logs are left in the replica
[ https://issues.apache.org/jira/browse/KAFKA-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17859890#comment-17859890 ] Jianbin Chen commented on KAFKA-17020:
--
[~showuon] My cluster has been running for over half a month, and this is the first time this issue has occurred, so it is intermittent and I cannot reproduce it on demand. I have reassigned the partitions of the topics that had residual log files, and the issue has not reoccurred since; I suspect it may take a long period of operation before it appears again.
[jira] [Commented] (KAFKA-17020) After enabling tiered storage, occasional residual logs are left in the replica
[ https://issues.apache.org/jira/browse/KAFKA-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17859884#comment-17859884 ] Jianbin Chen commented on KAFKA-17020:
--
[~showuon] I still appreciate your response. The screenshots of the log folders for the affected topic partitions on my leader and replicas illustrate the issue. My configuration keeps logs locally for only 10 minutes and limits segments to 512MB. This segment filled up to 512MB more than 2 hours ago and long since passed the 10-minute local retention, and the leader has already uploaded the corresponding segment to remote storage. However, the replicas still retain this log file for several hours, and a simple restart does not resolve the issue. Currently there are a few temporary workarounds:
# Stop the replica processes, delete the topic-partition folders with residual logs, and then restart the corresponding broker nodes.
# Perform a topic partition reassignment; after the partition leader is re-elected, the issue is also resolved (see the sketch after this comment).

However, these are only temporary fixes. I still do not understand why this issue suddenly appeared after running smoothly for half a month.
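A hedged sketch of workaround 2 above, using the {{Admin#alterPartitionReassignments}} API (KIP-455) rather than the CLI; the topic name, partition number, target broker ids, and bootstrap address are all illustrative:
{code:java}
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Properties;

import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewPartitionReassignment;
import org.apache.kafka.common.TopicPartition;

public class ReassignStuckPartition {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // illustrative

        try (Admin admin = Admin.create(props)) {
            // Move partition 0 of topic "xx" onto brokers 2 and 3 (illustrative ids).
            TopicPartition tp = new TopicPartition("xx", 0);
            admin.alterPartitionReassignments(
                    Map.of(tp, Optional.of(new NewPartitionReassignment(List.of(2, 3)))))
                .all().get();
        }
    }
}
{code}
The first replica in the target list becomes the preferred leader of the new assignment, which is what triggers the leader change described in the workaround.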
[jira] [Commented] (KAFKA-17020) After enabling tiered storage, occasional residual logs are left in the replica
[ https://issues.apache.org/jira/browse/KAFKA-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17859618#comment-17859618 ] Jianbin Chen commented on KAFKA-17020:
--
[~showuon] Can you tell me what kind of logs you need as evidence? This issue has been ongoing for several days, so I can look for the relevant logs, but first I need the keywords to search for.
[jira] [Commented] (KAFKA-17020) After enabling tiered storage, occasional residual logs are left in the replica
[ https://issues.apache.org/jira/browse/KAFKA-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17859617#comment-17859617 ] Jianbin Chen commented on KAFKA-17020:
--
[~showuon] The issue is that after the local log segments are uploaded to remote storage and deleted on the leader side, the replicas fail to delete their local copies.
[jira] [Updated] (KAFKA-17020) After enabling tiered storage, occasional residual logs are left in the replica
[ https://issues.apache.org/jira/browse/KAFKA-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jianbin Chen updated KAFKA-17020:
--
Attachment: image-2024-06-22-21-47-00-230.png
            image-2024-06-22-21-46-42-917.png
            image-2024-06-22-21-46-26-530.png
            image-2024-06-22-21-46-12-371.png
            image-2024-06-22-21-45-43-815.png
External issue URL: https://github.com/Aiven-Open/tiered-storage-for-apache-kafka/issues/562
Description:
After enabling tiered storage, occasional residual logs are left in the replica.
Based on the observed behavior, the index values of the rolled segments generated by the replica and the leader are not the same. As a result, the segments uploaded to S3 do not line up with the corresponding log files on the replica side, making it impossible to delete the local logs.
!image-2024-06-22-21-45-43-815.png!
leader config:
{code:java}
num.partitions=3
default.replication.factor=2
delete.topic.enable=true
auto.create.topics.enable=false
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=3
transaction.state.log.replication.factor=2
transaction.state.log.min.isr=1
offsets.retention.minutes=4320
log.roll.ms=8640
log.local.retention.ms=60
log.segment.bytes=536870912
num.replica.fetchers=1
log.retention.ms=1581120
remote.log.manager.thread.pool.size=4
remote.log.reader.threads=4
remote.log.metadata.topic.replication.factor=3
remote.log.storage.system.enable=true
remote.log.metadata.topic.retention.ms=18000
rsm.config.fetch.chunk.cache.class=io.aiven.kafka.tieredstorage.fetch.cache.DiskChunkCache
rsm.config.fetch.chunk.cache.path=/data01/kafka-tiered-storage-cache
# Pick some cache size, 16 GiB here:
rsm.config.fetch.chunk.cache.size=34359738368
rsm.config.fetch.chunk.cache.retention.ms=120
# Prefetching size, 16 MiB here:
rsm.config.fetch.chunk.cache.prefetch.max.size=33554432
rsm.config.storage.backend.class=io.aiven.kafka.tieredstorage.storage.s3.S3Storage
rsm.config.storage.s3.bucket.name=
rsm.config.storage.s3.region=us-west-1
rsm.config.storage.aws.secret.access.key=
rsm.config.storage.aws.access.key.id=
rsm.config.chunk.size=8388608
remote.log.storage.manager.class.path=/home/admin/core-0.0.1-SNAPSHOT/:/home/admin/s3-0.0.1-SNAPSHOT/
remote.log.storage.manager.class.name=io.aiven.kafka.tieredstorage.RemoteStorageManager
remote.log.metadata.manager.class.name=org.apache.kafka.server.log.remote.metadata.storage.TopicBasedRemoteLogMetadataManager
remote.log.metadata.manager.listener.name=PLAINTEXT
rsm.config.upload.rate.limit.bytes.per.second=31457280
{code}
replica config:
{code:java}
num.partitions=3
default.replication.factor=2
delete.topic.enable=true
auto.create.topics.enable=false
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=3
transaction.state.log.replication.factor=2
transaction.state.log.min.isr=1
offsets.retention.minutes=4320
log.roll.ms=8640
log.local.retention.ms=60
log.segment.bytes=536870912
num.replica.fetchers=1
log.retention.ms=1581120
remote.log.manager.thread.pool.size=4
remote.log.reader.threads=4
remote.log.metadata.topic.replication.factor=3
remote.log.storage.system.enable=true
#remote.log.metadata.topic.retention.ms=18000
rsm.config.fetch.chunk.cache.class=io.aiven.kafka.tieredstorage.fetch.cache.DiskChunkCache
rsm.config.fetch.chunk.cache.path=/data01/kafka-tiered-storage-cache
# Pick some cache size, 16 GiB here:
rsm.config.fetch.chunk.cache.size=34359738368
rsm.config.fetch.chunk.cache.retention.ms=120
# Prefetching size, 16 MiB here:
rsm.config.fetch.chunk.cache.prefetch.max.size=33554432
rsm.config.storage.backend.class=io.aiven.kafka.tieredstorage.storage.s3.S3Storage
rsm.config.storage.s3.bucket.name=
rsm.config.storage.s3.region=us-west-1
rsm.config.storage.aws.secret.access.key=
rsm.config.storage.aws.access.key.id=
rsm.config.chunk.size=8388608
remote.log.storage.manager.class.path=/home/admin/core-0.0.1-SNAPSHOT/*:/home/admin/s3-0.0.1-SNAPSHOT/*
remote.log.storage.manager.class.name=io.aiven.kafka.tieredstorage.RemoteStorageManager
remote.log.metadata.manager.class.name=org.apache.kafka.server.log.remote.metadata.storage.TopicBasedRemoteLogMetadataManager
remote.log.metadata.manager.listener.name=PLAINTEXT
rsm.config.upload.rate.limit.bytes.per.second=31457280
{code}
topic config:
{code:java}
Dynamic configs for topic xx are:
local.retention.ms=60 sensitive=false synonyms={DYNAMIC_TOPIC_CONFIG:local.retention.ms=60, STATIC_BROKER_CONFIG:log.local.retention.ms=60, DEFAULT_CONFIG:log.local.retention.ms=-2}
remote.storage.enable=true sensitive=false synonyms={DYNAMIC_TOPIC_CONFIG:remote.storage.enable=true}
retention.ms=1581120 sensitive=false synonyms={DYNAMIC_TOPIC_CONFIG:retention.ms=1581120, STATIC_BROKER_CONFIG:log.retention.ms=1581120, DEFAULT_CON
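The "Dynamic configs for topic xx" block above is describe output; for completeness, here is a minimal sketch of setting the same dynamic topic configs programmatically with {{Admin#incrementalAlterConfigs}} (values copied from the block above, bootstrap address illustrative):
{code:java}
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

public class EnableTieredStorage {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // illustrative

        try (Admin admin = Admin.create(props)) {
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "xx");
            // Values mirror the dynamic topic config shown in the issue description.
            Collection<AlterConfigOp> ops = List.of(
                new AlterConfigOp(new ConfigEntry("remote.storage.enable", "true"), AlterConfigOp.OpType.SET),
                new AlterConfigOp(new ConfigEntry("local.retention.ms", "60"), AlterConfigOp.OpType.SET));
            admin.incrementalAlterConfigs(Map.of(topic, ops)).all().get();
        }
    }
}
{code}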
[jira] (KAFKA-17020) After enabling tiered storage, occasional residual logs are left in the replica
[ https://issues.apache.org/jira/browse/KAFKA-17020 ] Jianbin Chen deleted comment on KAFKA-17020:
--
was (Author: jianbin): Restarting does not resolve this issue. The only solution is to delete the log folder corresponding to the replica where the log segment anomaly occurred and then resynchronize from the leader.
![image](https://github.com/Aiven-Open/tiered-storage-for-apache-kafka/assets/19943636/7256c156-6e90-4799-b0cf-a48c247c5b51)
[jira] [Updated] (KAFKA-17020) After enabling tiered storage, occasional residual logs are left in the replica
[ https://issues.apache.org/jira/browse/KAFKA-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jianbin Chen updated KAFKA-17020: - Description: After enabling tiered storage, occasional residual logs are left in the replica. Based on the observed phenomenon, the index values of the rolled-out logs generated by the replica and the leader are not the same. As a result, the logs uploaded to S3 at the same time do not include the corresponding log files on the replica side, making it impossible to delete the local logs. [!https://private-user-images.githubusercontent.com/19943636/341939158-d0b87a7d-aca1-4700-b3e1-fceff0530c79.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MTkwMzY1OTIsIm5iZiI6MTcxOTAzNjI5MiwicGF0aCI6Ii8xOTk0MzYzNi8zNDE5MzkxNTgtZDBiODdhN2QtYWNhMS00NzAwLWIzZTEtZmNlZmYwNTMwYzc5LnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA2MjIlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNjIyVDA2MDQ1MlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWY3ZDQ2OGIxMmE3OGI2Njc2YzdkNzkwMzlhNmM5MzAxNjY0MWZiMzA2ZjgwNzgzM2JlYTMxMzM4Njk1NGI5MDYmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.Sdsvwn0dUi_p1dG0W_AvQY6Iqeimy_UZ8VldKUS1Q0E!|https://private-user-images.githubusercontent.com/19943636/341939158-d0b87a7d-aca1-4700-b3e1-fceff0530c79.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MTkwMzY1OTIsIm5iZiI6MTcxOTAzNjI5MiwicGF0aCI6Ii8xOTk0MzYzNi8zNDE5MzkxNTgtZDBiODdhN2QtYWNhMS00NzAwLWIzZTEtZmNlZmYwNTMwYzc5LnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA2MjIlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNjIyVDA2MDQ1MlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWY3ZDQ2OGIxMmE3OGI2Njc2YzdkNzkwMzlhNmM5MzAxNjY0MWZiMzA2ZjgwNzgzM2JlYTMxMzM4Njk1NGI5MDYmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.Sdsvwn0dUi_p1dG0W_AvQY6Iqeimy_UZ8VldKUS1Q0E] leader config: num.partitions=3 default.replication.factor=2 delete.topic.enable=true auto.create.topics.enable=false num.recovery.threads.per.data.dir=1 offsets.topic.replication.factor=3 transaction.state.log.replication.factor=2 transaction.state.log.min.isr=1 offsets.retention.minutes=4320 log.roll.ms=8640 log.local.retention.ms=60 log.segment.bytes=536870912 num.replica.fetchers=1 log.retention.ms=1581120 remote.log.manager.thread.pool.size=4 remote.log.reader.threads=4 remote.log.metadata.topic.replication.factor=3 remote.log.storage.system.enable=true remote.log.metadata.topic.retention.ms=18000 rsm.config.fetch.chunk.cache.class=io.aiven.kafka.tieredstorage.fetch.cache.DiskChunkCache rsm.config.fetch.chunk.cache.path=/data01/kafka-tiered-storage-cache # Pick some cache size, 16 GiB here: rsm.config.fetch.chunk.cache.size=34359738368 rsm.config.fetch.chunk.cache.retention.ms=120 # # # Prefetching size, 16 MiB here: rsm.config.fetch.chunk.cache.prefetch.max.size=33554432 rsm.config.storage.backend.class=io.aiven.kafka.tieredstorage.storage.s3.S3Storage rsm.config.storage.s3.bucket.name= rsm.config.storage.s3.region=us-west-1 rsm.config.storage.aws.secret.access.key= rsm.config.storage.aws.access.key.id= rsm.config.chunk.size=8388608 
remote.log.storage.manager.class.path=/home/admin/core-0.0.1-SNAPSHOT/*:/home/admin/s3-0.0.1-SNAPSHOT/* remote.log.storage.manager.class.name=io.aiven.kafka.tieredstorage.RemoteStorageManager remote.log.metadata.manager.class.name=org.apache.kafka.server.log.remote.metadata.storage.TopicBasedRemoteLogMetadataManager remote.log.metadata.manager.listener.name=PLAINTEXT rsm.config.upload.rate.limit.bytes.per.second=31457280 replica config: num.partitions=3 default.replication.factor=2 delete.topic.enable=true auto.create.topics.enable=false num.recovery.threads.per.data.dir=1 offsets.topic.replication.factor=3 transaction.state.log.replication.factor=2 transaction.state.log.min.isr=1 offsets.retention.minutes=4320 log.roll.ms=8640 log.local.retention.ms=60 log.segment.bytes=536870912 num.replica.fetchers=1 log.retention.ms=1581120 remote.log.manager.thread.pool.size=4 remote.log.reader.threads=4 remote.log.metadata.topic.replication.factor=3 remote.log.storage.system.enable=true #remote.log.metadata.topic.retention.ms=18000 rsm.config.fetch.chunk.cache.class=io.aiven.kafka.tieredstorage.fetch.cache.DiskChunkCache rsm.config.fetch.chunk.cache.path=/data01/kafka-tiered-storage-cache # Pick some cache size, 16 GiB here: rsm.config.fetch.chunk.cache.size=34359738368 rsm.config.fetch.chunk.cache.retention.ms=120 # # # Prefetching size, 16 MiB here: rsm.config.fetch.chunk.cache.prefetch.max.size=33554432 rsm.config.storage.backend.class=i
[jira] [Commented] (KAFKA-17020) After enabling tiered storage, occasional residual logs are left in the replica
[ https://issues.apache.org/jira/browse/KAFKA-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17856903#comment-17856903 ] Jianbin Chen commented on KAFKA-17020: -- Restarting does not resolve this issue. The only solution is to delete the log folder corresponding to the replica where the log segment anomaly occurred and then resynchronize from the leader. ![image](https://github.com/Aiven-Open/tiered-storage-for-apache-kafka/assets/19943636/7256c156-6e90-4799-b0cf-a48c247c5b51) > After enabling tiered storage, occasional residual logs are left in the > replica > --- > > Key: KAFKA-17020 > URL: https://issues.apache.org/jira/browse/KAFKA-17020 > Project: Kafka > Issue Type: Wish >Affects Versions: 3.7.0 >Reporter: Jianbin Chen >Priority: Major > > After enabling tiered storage, occasional residual logs are left in the > replica. > Based on the observed phenomenon, the index values of the rolled-out logs > generated by the replica and the leader are not the same. As a result, the > logs uploaded to S3 at the same time do not include the corresponding log > files on the replica side, making it impossible to delete the local logs. > [!https://private-user-images.githubusercontent.com/19943636/341939158-d0b87a7d-aca1-4700-b3e1-fceff0530c79.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MTkwMzY1OTIsIm5iZiI6MTcxOTAzNjI5MiwicGF0aCI6Ii8xOTk0MzYzNi8zNDE5MzkxNTgtZDBiODdhN2QtYWNhMS00NzAwLWIzZTEtZmNlZmYwNTMwYzc5LnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA2MjIlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNjIyVDA2MDQ1MlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWY3ZDQ2OGIxMmE3OGI2Njc2YzdkNzkwMzlhNmM5MzAxNjY0MWZiMzA2ZjgwNzgzM2JlYTMxMzM4Njk1NGI5MDYmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.Sdsvwn0dUi_p1dG0W_AvQY6Iqeimy_UZ8VldKUS1Q0E!|https://private-user-images.githubusercontent.com/19943636/341939158-d0b87a7d-aca1-4700-b3e1-fceff0530c79.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MTkwMzY1OTIsIm5iZiI6MTcxOTAzNjI5MiwicGF0aCI6Ii8xOTk0MzYzNi8zNDE5MzkxNTgtZDBiODdhN2QtYWNhMS00NzAwLWIzZTEtZmNlZmYwNTMwYzc5LnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA2MjIlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNjIyVDA2MDQ1MlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWY3ZDQ2OGIxMmE3OGI2Njc2YzdkNzkwMzlhNmM5MzAxNjY0MWZiMzA2ZjgwNzgzM2JlYTMxMzM4Njk1NGI5MDYmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.Sdsvwn0dUi_p1dG0W_AvQY6Iqeimy_UZ8VldKUS1Q0E] > leader config: > num.partitions=3 > default.replication.factor=2 > delete.topic.enable=true > auto.create.topics.enable=false > num.recovery.threads.per.data.dir=1 > offsets.topic.replication.factor=3 > transaction.state.log.replication.factor=2 > transaction.state.log.min.isr=1 > offsets.retention.minutes=4320 > log.roll.ms=8640 > log.local.retention.ms=60 > log.segment.bytes=536870912 > num.replica.fetchers=1 > log.retention.ms=1581120 > remote.log.manager.thread.pool.size=4 > remote.log.reader.threads=4 > remote.log.metadata.topic.replication.factor=3 > remote.log.storage.system.enable=true > remote.log.metadata.topic.retention.ms=18000 > 
rsm.config.fetch.chunk.cache.class=io.aiven.kafka.tieredstorage.fetch.cache.DiskChunkCache > rsm.config.fetch.chunk.cache.path=/data01/kafka-tiered-storage-cache > # Pick some cache size, 16 GiB here: > rsm.config.fetch.chunk.cache.size=34359738368 > rsm.config.fetch.chunk.cache.retention.ms=120 > # # # Prefetching size, 16 MiB here: > rsm.config.fetch.chunk.cache.prefetch.max.size=33554432 > rsm.config.storage.backend.class=io.aiven.kafka.tieredstorage.storage.s3.S3Storage > rsm.config.storage.s3.bucket.name= > rsm.config.storage.s3.region=us-west-1 > rsm.config.storage.aws.secret.access.key= > rsm.config.storage.aws.access.key.id= > rsm.config.chunk.size=8388608 > remote.log.storage.manager.class.path=/home/admin/core-0.0.1-SNAPSHOT/*:/home/admin/s3-0.0.1-SNAPSHOT/* > remote.log.storage.manager.class.name=io.aiven.kafka.tieredstorage.RemoteStorageManager > remote.log.metadata.manager.class.name=org.apache.kafka.server.log.remote.metadata.storage.TopicBasedRemoteLogMetadataManager > remote.log.metadata.manager.listener.name=PLAINTEXT > rsm.config.upload.rate.limit.bytes.per.second=31457280 > replica config: > num.partitions=3 > default.replication.factor=2 > delete.topic.enable=true > auto.create.topics.enable=false > num.recovery.threads.per.data.dir=1 > offs
[jira] [Created] (KAFKA-17020) After enabling tiered storage, occasional residual logs are left in the replica
Jianbin Chen created KAFKA-17020: Summary: After enabling tiered storage, occasional residual logs are left in the replica Key: KAFKA-17020 URL: https://issues.apache.org/jira/browse/KAFKA-17020 Project: Kafka Issue Type: Wish Affects Versions: 3.7.0 Reporter: Jianbin Chen After enabling tiered storage, occasional residual logs are left in the replica. Based on the observed phenomenon, the index values of the rolled-out logs generated by the replica and the leader are not the same. As a result, the logs uploaded to S3 at the same time do not include the corresponding log files on the replica side, making it impossible to delete the local logs. [!https://private-user-images.githubusercontent.com/19943636/341939158-d0b87a7d-aca1-4700-b3e1-fceff0530c79.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MTkwMzY1OTIsIm5iZiI6MTcxOTAzNjI5MiwicGF0aCI6Ii8xOTk0MzYzNi8zNDE5MzkxNTgtZDBiODdhN2QtYWNhMS00NzAwLWIzZTEtZmNlZmYwNTMwYzc5LnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA2MjIlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNjIyVDA2MDQ1MlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWY3ZDQ2OGIxMmE3OGI2Njc2YzdkNzkwMzlhNmM5MzAxNjY0MWZiMzA2ZjgwNzgzM2JlYTMxMzM4Njk1NGI5MDYmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.Sdsvwn0dUi_p1dG0W_AvQY6Iqeimy_UZ8VldKUS1Q0E!|https://private-user-images.githubusercontent.com/19943636/341939158-d0b87a7d-aca1-4700-b3e1-fceff0530c79.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MTkwMzY1OTIsIm5iZiI6MTcxOTAzNjI5MiwicGF0aCI6Ii8xOTk0MzYzNi8zNDE5MzkxNTgtZDBiODdhN2QtYWNhMS00NzAwLWIzZTEtZmNlZmYwNTMwYzc5LnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA2MjIlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNjIyVDA2MDQ1MlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWY3ZDQ2OGIxMmE3OGI2Njc2YzdkNzkwMzlhNmM5MzAxNjY0MWZiMzA2ZjgwNzgzM2JlYTMxMzM4Njk1NGI5MDYmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.Sdsvwn0dUi_p1dG0W_AvQY6Iqeimy_UZ8VldKUS1Q0E] leader config: num.partitions=3 default.replication.factor=2 delete.topic.enable=true auto.create.topics.enable=false num.recovery.threads.per.data.dir=1 offsets.topic.replication.factor=3 transaction.state.log.replication.factor=2 transaction.state.log.min.isr=1 offsets.retention.minutes=4320 log.roll.ms=8640 log.local.retention.ms=60 log.segment.bytes=536870912 num.replica.fetchers=1 log.retention.ms=1581120 remote.log.manager.thread.pool.size=4 remote.log.reader.threads=4 remote.log.metadata.topic.replication.factor=3 remote.log.storage.system.enable=true remote.log.metadata.topic.retention.ms=18000 rsm.config.fetch.chunk.cache.class=io.aiven.kafka.tieredstorage.fetch.cache.DiskChunkCache rsm.config.fetch.chunk.cache.path=/data01/kafka-tiered-storage-cache # Pick some cache size, 16 GiB here: rsm.config.fetch.chunk.cache.size=34359738368 rsm.config.fetch.chunk.cache.retention.ms=120 # # # Prefetching size, 16 MiB here: rsm.config.fetch.chunk.cache.prefetch.max.size=33554432 rsm.config.storage.backend.class=io.aiven.kafka.tieredstorage.storage.s3.S3Storage rsm.config.storage.s3.bucket.name= rsm.config.storage.s3.region=us-west-1 rsm.config.storage.aws.secret.access.key= rsm.config.storage.aws.access.key.id= rsm.config.chunk.size=8388608 
remote.log.storage.manager.class.path=/home/admin/core-0.0.1-SNAPSHOT/*:/home/admin/s3-0.0.1-SNAPSHOT/* remote.log.storage.manager.class.name=io.aiven.kafka.tieredstorage.RemoteStorageManager remote.log.metadata.manager.class.name=org.apache.kafka.server.log.remote.metadata.storage.TopicBasedRemoteLogMetadataManager remote.log.metadata.manager.listener.name=PLAINTEXT rsm.config.upload.rate.limit.bytes.per.second=31457280 replica config: num.partitions=3 default.replication.factor=2 delete.topic.enable=true auto.create.topics.enable=false num.recovery.threads.per.data.dir=1 offsets.topic.replication.factor=3 transaction.state.log.replication.factor=2 transaction.state.log.min.isr=1 offsets.retention.minutes=4320 log.roll.ms=8640 log.local.retention.ms=60 log.segment.bytes=536870912 num.replica.fetchers=1 log.retention.ms=1581120 remote.log.manager.thread.pool.size=4 remote.log.reader.threads=4 remote.log.metadata.topic.replication.factor=3 remote.log.storage.system.enable=true #remote.log.metadata.topic.retention.ms=18000 rsm.config.fetch.chunk.cache.class=io.aiven.kafka.tieredstorage.fetch.cache.DiskChunkCache rsm.config.fetch.chunk.cache.path=/data01/kafka-tiered-storage-cache # Pick some cache size, 16 GiB here: rsm.config.fetch.chunk.cache.size=34359738368
[jira] [Updated] (KAFKA-16834) add the reason for the failure of PartitionRegistration#toRecord
[ https://issues.apache.org/jira/browse/KAFKA-16834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jianbin Chen updated KAFKA-16834: - Description: Change the message to the following output, which makes it easier for users to understand and identify the cause of the problem. {code:java} options.handleLoss("the directory " + (directory == DirectoryId.UNASSIGNED ? "unassigned" : "lost") + " state of one or more replicas");{code} was: Change the message to the following output, which makes it easier for users to understand and identify the cause of the problem. {code:java} options.handleLoss("the directory " + directory + " state of one or more replicas");{code} > add the reason for the failure of PartitionRegistration#toRecord > > > Key: KAFKA-16834 > URL: https://issues.apache.org/jira/browse/KAFKA-16834 > Project: Kafka > Issue Type: Wish >Affects Versions: 3.7.0 >Reporter: Jianbin Chen >Priority: Minor > > Change the message to the following output, which makes it easier for users to > understand and identify the cause of the problem. > {code:java} > options.handleLoss("the directory " + (directory == DirectoryId.UNASSIGNED ? > "unassigned" : "lost") > + " state of one or more replicas");{code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
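For context, the proposed line replaces the raw directory UUID in the lost-metadata message with a human-readable reason. A minimal illustration of the two messages it can produce, reusing only the names from the snippet above (this is a sketch, not a complete method):

{code:java}
// Sketch based on the snippet above; `directory` and `options` are assumed
// to be in scope as in PartitionRegistration#toRecord.
String reason = "the directory "
    + (directory == DirectoryId.UNASSIGNED ? "unassigned" : "lost")
    + " state of one or more replicas";
// Produces "the directory unassigned state of one or more replicas" for an
// UNASSIGNED replica directory, and "the directory lost state ..." otherwise,
// instead of the old "the directory <uuid> state of one or more replicas".
options.handleLoss(reason);
{code}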
[jira] [Updated] (KAFKA-16834) add the reason for the failure of PartitionRegistration#toRecord
[ https://issues.apache.org/jira/browse/KAFKA-16834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jianbin Chen updated KAFKA-16834: - Summary: add the reason for the failure of PartitionRegistration#toRecord (was: add PartitionRegistration#toRecord loss info) > add the reason for the failure of PartitionRegistration#toRecord > > > Key: KAFKA-16834 > URL: https://issues.apache.org/jira/browse/KAFKA-16834 > Project: Kafka > Issue Type: Wish >Affects Versions: 3.7.0 >Reporter: Jianbin Chen >Priority: Minor > > Change the message to the following output, which makes it easier for users to > understand and identify the cause of the problem. > {code:java} > options.handleLoss("the directory " + directory + " state of one or more > replicas");{code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-16834) add PartitionRegistration#toRecord loss info
Jianbin Chen created KAFKA-16834: Summary: add PartitionRegistration#toRecord loss info Key: KAFKA-16834 URL: https://issues.apache.org/jira/browse/KAFKA-16834 Project: Kafka Issue Type: Wish Affects Versions: 3.7.0 Reporter: Jianbin Chen Change the message to the following output, which makes it easier for users to understand and identify the cause of the problem. {code:java} options.handleLoss("the directory " + directory + " state of one or more replicas");{code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-16583) Update from 3.4.0 to 3.7.0 image write failed in Kraft mode
[ https://issues.apache.org/jira/browse/KAFKA-16583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17849145#comment-17849145 ] Jianbin Chen commented on KAFKA-16583: -- I would like to know when this PR can be merged; I am badly affected by this bug! > Update from 3.4.0 to 3.7.0 image write failed in Kraft mode > --- > > Key: KAFKA-16583 > URL: https://issues.apache.org/jira/browse/KAFKA-16583 > Project: Kafka > Issue Type: Bug > Components: kraft >Affects Versions: 3.7.0 >Reporter: HanXu >Assignee: HanXu >Priority: Major > Original Estimate: 6h > Remaining Estimate: 6h > > How to reproduce: > 1. Launch a 3.4.0 controller and a 3.4.0 broker (Broker A) in Kraft mode; > 2. Create a topic with 1 partition; > 3. Launch a 3.4.0 broker (Broker B) in Kraft mode and reassign the partition from step 2 > to Broker B; > 4. Upgrade Broker B to 3.7.0; > Broker B will keep logging the following error: > {code:java} > [2024-04-18 14:46:54,144] ERROR Encountered metadata loading fault: Unhandled > error initializing new publishers > (org.apache.kafka.server.fault.LoggingFaultHandler) > org.apache.kafka.image.writer.UnwritableMetadataException: Metadata has been > lost because the following could not be represented in metadata version > 3.4-IV0: the directory assignment state of one or more replicas > at > org.apache.kafka.image.writer.ImageWriterOptions.handleLoss(ImageWriterOptions.java:94) > at > org.apache.kafka.metadata.PartitionRegistration.toRecord(PartitionRegistration.java:391) > at org.apache.kafka.image.TopicImage.write(TopicImage.java:71) > at org.apache.kafka.image.TopicsImage.write(TopicsImage.java:84) > at org.apache.kafka.image.MetadataImage.write(MetadataImage.java:155) > at > org.apache.kafka.image.loader.MetadataLoader.initializeNewPublishers(MetadataLoader.java:295) > at > org.apache.kafka.image.loader.MetadataLoader.lambda$scheduleInitializeNewPublishers$0(MetadataLoader.java:266) > at > org.apache.kafka.queue.KafkaEventQueue$EventContext.run(KafkaEventQueue.java:127) > at > org.apache.kafka.queue.KafkaEventQueue$EventHandler.handleEvents(KafkaEventQueue.java:210) > at > org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:181) > at java.base/java.lang.Thread.run(Thread.java:840) > {code} > Bug: > - When reassigning a partition, PartitionRegistration#merge sets the new > replicas' directories to UNASSIGNED; > - But in metadata version 3.4.0, PartitionRegistration#toRecord only allows the > MIGRATING directory; > {code:java} > if (options.metadataVersion().isDirectoryAssignmentSupported()) { > record.setDirectories(Uuid.toList(directories)); > } else { > for (Uuid directory : directories) { > if (!DirectoryId.MIGRATING.equals(directory)) { > options.handleLoss("the directory assignment state of one > or more replicas"); > break; > } > } > } > {code} > Solution: > - PartitionRegistration#toRecord should allow both MIGRATING and UNASSIGNED -- This message was sent by Atlassian Jira (v8.20.10#820010)
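The proposed solution is a small relaxation of the version check quoted above. A minimal sketch of the adjusted branch in PartitionRegistration#toRecord, based only on the snippet and solution described in this issue (not necessarily the merged patch):

{code:java}
// Sketch of the proposed fix; surrounding fields are assumed to be as in the
// quoted snippet.
if (options.metadataVersion().isDirectoryAssignmentSupported()) {
    record.setDirectories(Uuid.toList(directories));
} else {
    for (Uuid directory : directories) {
        // UNASSIGNED is set by PartitionRegistration#merge during reassignment;
        // like MIGRATING, it carries no real assignment state, so writing it at
        // an older metadata version loses no information.
        if (!DirectoryId.MIGRATING.equals(directory)
                && !DirectoryId.UNASSIGNED.equals(directory)) {
            options.handleLoss("the directory assignment state of one or more replicas");
            break;
        }
    }
}
{code}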
[jira] [Commented] (KAFKA-16662) UnwritableMetadataException: Metadata has been lost
[ https://issues.apache.org/jira/browse/KAFKA-16662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17848436#comment-17848436 ] Jianbin Chen commented on KAFKA-16662: -- After I deleted the entire __cluster_metadata-0 directory, the problem no longer occurred when I started the cluster, but all of my topic information was lost. Fortunately, this is just an offline test environment cluster. Based on this behavior, it is clear that an incompatibility between the 3.5 and 3.7 metadata versions caused this problem. This makes me reluctant to attempt a rolling upgrade of the cluster. In the past, when using ZooKeeper, upgrading brokers never caused similar problems! > UnwritableMetadataException: Metadata has been lost > --- > > Key: KAFKA-16662 > URL: https://issues.apache.org/jira/browse/KAFKA-16662 > Project: Kafka > Issue Type: Bug >Affects Versions: 3.7.0 > Environment: Docker Image (bitnami/kafka:3.7.0) > via Docker Compose >Reporter: Tobias Bohn >Priority: Major > Attachments: log.txt > > > Hello, > First of all: I am new to this Jira and apologize if anything is set or > specified incorrectly. Feel free to advise me. > We currently have an error in our test system, which unfortunately I can't > solve, because I couldn't find anything related to it. No solution could be > found via the mailing list either. > The error occurs when we want to start up a node. The node runs using Kraft > and is both a controller and a broker. The following error message appears at > startup: > {code:java} > kafka | [2024-04-16 06:18:13,707] ERROR Encountered fatal fault: Unhandled > error initializing new publishers > (org.apache.kafka.server.fault.ProcessTerminatingFaultHandler) > kafka | org.apache.kafka.image.writer.UnwritableMetadataException: Metadata > has been lost because the following could not be represented in metadata > version 3.5-IV2: the directory assignment state of one or more replicas > kafka | at > org.apache.kafka.image.writer.ImageWriterOptions.handleLoss(ImageWriterOptions.java:94) > kafka | at > org.apache.kafka.metadata.PartitionRegistration.toRecord(PartitionRegistration.java:391) > kafka | at org.apache.kafka.image.TopicImage.write(TopicImage.java:71) > kafka | at > org.apache.kafka.image.TopicsImage.write(TopicsImage.java:84) > kafka | at > org.apache.kafka.image.MetadataImage.write(MetadataImage.java:155) > kafka | at > org.apache.kafka.image.loader.MetadataLoader.initializeNewPublishers(MetadataLoader.java:295) > kafka | at > org.apache.kafka.image.loader.MetadataLoader.lambda$scheduleInitializeNewPublishers$0(MetadataLoader.java:266) > kafka | at > org.apache.kafka.queue.KafkaEventQueue$EventContext.run(KafkaEventQueue.java:127) > kafka | at > org.apache.kafka.queue.KafkaEventQueue$EventHandler.handleEvents(KafkaEventQueue.java:210) > kafka | at > org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:181) > kafka | at java.base/java.lang.Thread.run(Thread.java:840) > kafka exited with code 0 {code} > We use Docker to operate the cluster. The error occurred while we were trying > to restart a node. All other nodes in the cluster are still running correctly. > If you need further information, please let us know. The complete log is > attached to this issue. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-16662) UnwritableMetadataException: Metadata has been lost
[ https://issues.apache.org/jira/browse/KAFKA-16662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17848434#comment-17848434 ] Jianbin Chen commented on KAFKA-16662: -- When I executed ./bin/kafka-features.sh --bootstrap-server 10.58.16.231:9092 upgrade --metadata 3.7, it continuously output the following exceptions {panel:title=Broker logs} [2024-05-22 11:30:36,491] INFO [UnifiedLog partition=remote-test-5, dir=/data01/kafka-logs-351] Incremented log start offset to 26267689 due to leader offset increment (kafka.log.UnifiedLog) [2024-05-22 11:30:36,497] INFO [UnifiedLog partition=remote-test2-0, dir=/data01/kafka-logs-351] Incremented log start offset to 3099360 due to leader offset increment (kafka.log.UnifiedLog) [2024-05-22 11:30:37,149] ERROR Failed to propagate directory assignments because the Controller returned error STALE_BROKER_EPOCH (org.apache.kafka.server.AssignmentsManager) [2024-05-22 11:30:38,064] ERROR Failed to propagate directory assignments because the Controller returned error STALE_BROKER_EPOCH (org.apache.kafka.server.AssignmentsManager) [2024-05-22 11:30:39,376] ERROR Failed to propagate directory assignments because the Controller returned error STALE_BROKER_EPOCH (org.apache.kafka.server.AssignmentsManager) [2024-05-22 11:30:41,486] ERROR Failed to propagate directory assignments because the Controller returned error STALE_BROKER_EPOCH (org.apache.kafka.server.AssignmentsManager) [2024-05-22 11:30:43,794] INFO [BrokerLifecycleManager id=3] Unable to register broker 3 because the controller returned error INVALID_REGISTRATION (kafka.server.BrokerLifecycleManager) [2024-05-22 11:30:45,224] ERROR Failed to propagate directory assignments because the Controller returned error STALE_BROKER_EPOCH (org.apache.kafka.server.AssignmentsManager) {panel} controller logs: {code:java} java.util.concurrent.CompletionException: org.apache.kafka.common.errors.StaleBrokerEpochException: Expected broker epoch 41885255, but got broker epoch -1 at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:332) at java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:347) at java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:636) at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510) at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2194) at org.apache.kafka.controller.QuorumController$ControllerWriteEvent.complete(QuorumController.java:880) at org.apache.kafka.controller.QuorumController$ControllerWriteEvent.handleException(QuorumController.java:871) at org.apache.kafka.queue.KafkaEventQueue$EventContext.completeWithException(KafkaEventQueue.java:148) at org.apache.kafka.queue.KafkaEventQueue$EventContext.run(KafkaEventQueue.java:137) at org.apache.kafka.queue.KafkaEventQueue$EventHandler.handleEvents(KafkaEventQueue.java:210) at org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:181) at java.base/java.lang.Thread.run(Thread.java:1583) Caused by: org.apache.kafka.common.errors.StaleBrokerEpochException: Expected broker epoch 41885255, but got broker epoch -1{code} > UnwritableMetadataException: Metadata has been lost > --- > > Key: KAFKA-16662 > URL: https://issues.apache.org/jira/browse/KAFKA-16662 > Project: Kafka > Issue Type: Bug >Affects Versions: 3.7.0 > Environment: Docker Image (bitnami/kafka:3.7.0) > via Docker Compose >Reporter: 
Tobias Bohn >Priority: Major > Attachments: log.txt > > > Hello, > First of all: I am new to this Jira and apologize if anything is set or > specified incorrectly. Feel free to advise me. > We currently have an error in our test system, which unfortunately I can't > solve, because I couldn't find anything related to it. No solution could be > found via the mailing list either. > The error occurs when we want to start up a node. The node runs using Kraft > and is both a controller and a broker. The following error message appears at > startup: > {code:java} > kafka | [2024-04-16 06:18:13,707] ERROR Encountered fatal fault: Unhandled > error initializing new publishers > (org.apache.kafka.server.fault.ProcessTerminatingFaultHandler) > kafka | org.apache.kafka.image.writer.UnwritableMetadataException: Metadata > has been lost because the following could not be represented in metadata > version 3.5-IV2: the directory assignment state of one or more replicas > kafka | at > org.apache.kafka.image.writer.ImageWriterOptions.handleLoss(ImageWriterOptions.java:94) > kafka | at > org.apache.kafka.metadata.PartitionRegistration.toRecord(Pa
[jira] [Commented] (KAFKA-16662) UnwritableMetadataException: Metadata has been lost
[ https://issues.apache.org/jira/browse/KAFKA-16662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17848431#comment-17848431 ] Jianbin Chen commented on KAFKA-16662: -- Could someone please pay attention to this issue and help me out? {code:java} [admin@kafka-dev-d-010058016231 kafka]$ ./bin/kafka-features.sh --bootstrap-server 10.58.16.231:9092 describe Feature: metadata.version SupportedMinVersion: 3.0-IV1 SupportedMaxVersion: 3.7-IV4 FinalizedVersionLevel: 3.5-IV2 Epoch: 41885646{code} > UnwritableMetadataException: Metadata has been lost > --- > > Key: KAFKA-16662 > URL: https://issues.apache.org/jira/browse/KAFKA-16662 > Project: Kafka > Issue Type: Bug >Affects Versions: 3.7.0 > Environment: Docker Image (bitnami/kafka:3.7.0) > via Docker Compose >Reporter: Tobias Bohn >Priority: Major > Attachments: log.txt > > > Hello, > First of all: I am new to this Jira and apologize if anything is set or > specified incorrectly. Feel free to advise me. > We currently have an error in our test system, which unfortunately I can't > solve, because I couldn't find anything related to it. No solution could be > found via the mailing list either. > The error occurs when we want to start up a node. The node runs using Kraft > and is both a controller and a broker. The following error message appears at > startup: > {code:java} > kafka | [2024-04-16 06:18:13,707] ERROR Encountered fatal fault: Unhandled > error initializing new publishers > (org.apache.kafka.server.fault.ProcessTerminatingFaultHandler) > kafka | org.apache.kafka.image.writer.UnwritableMetadataException: Metadata > has been lost because the following could not be represented in metadata > version 3.5-IV2: the directory assignment state of one or more replicas > kafka | at > org.apache.kafka.image.writer.ImageWriterOptions.handleLoss(ImageWriterOptions.java:94) > kafka | at > org.apache.kafka.metadata.PartitionRegistration.toRecord(PartitionRegistration.java:391) > kafka | at org.apache.kafka.image.TopicImage.write(TopicImage.java:71) > kafka | at > org.apache.kafka.image.TopicsImage.write(TopicsImage.java:84) > kafka | at > org.apache.kafka.image.MetadataImage.write(MetadataImage.java:155) > kafka | at > org.apache.kafka.image.loader.MetadataLoader.initializeNewPublishers(MetadataLoader.java:295) > kafka | at > org.apache.kafka.image.loader.MetadataLoader.lambda$scheduleInitializeNewPublishers$0(MetadataLoader.java:266) > kafka | at > org.apache.kafka.queue.KafkaEventQueue$EventContext.run(KafkaEventQueue.java:127) > kafka | at > org.apache.kafka.queue.KafkaEventQueue$EventHandler.handleEvents(KafkaEventQueue.java:210) > kafka | at > org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:181) > kafka | at java.base/java.lang.Thread.run(Thread.java:840) > kafka exited with code 0 {code} > We use Docker to operate the cluster. The error occurred while we were trying > to restart a node. All other nodes in the cluster are still running correctly. > If you need further information, please let us know. The complete log is > attached to this issue. -- This message was sent by Atlassian Jira (v8.20.10#820010)
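The finalized {{metadata.version}} shown by kafka-features.sh describe can also be checked programmatically, which is handy when verifying an upgrade across brokers. A minimal sketch using the public Admin client; the bootstrap address is the same one used in the commands above:

{code:java}
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.FeatureMetadata;
import org.apache.kafka.clients.admin.FinalizedVersionRange;

public class DescribeMetadataVersion {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "10.58.16.231:9092");
        try (Admin admin = Admin.create(props)) {
            // Same data as `kafka-features.sh describe`: the cluster's
            // supported and finalized feature version levels.
            FeatureMetadata features = admin.describeFeatures().featureMetadata().get();
            FinalizedVersionRange mv = features.finalizedFeatures().get("metadata.version");
            if (mv == null) {
                System.out.println("metadata.version is not finalized");
            } else {
                System.out.println("Finalized metadata.version level: " + mv.maxVersionLevel());
            }
        }
    }
}
{code}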
[jira] [Comment Edited] (KAFKA-16662) UnwritableMetadataException: Metadata has been lost
[ https://issues.apache.org/jira/browse/KAFKA-16662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17848424#comment-17848424 ] Jianbin Chen edited comment on KAFKA-16662 at 5/22/24 3:10 AM: --- I have encountered the same issue. Can anyone help me with this? I upgraded from 3.5.1 to 3.7.0, and I have already changed inter.broker.protocol.version to 3.7 and ran it for some time. But I have never executed {code:java} ./bin/kafka-features.sh upgrade --metadata 3.7 {code} The last time I restarted the cluster, I found that it could not be started anymore. The last line of the log is as follows: {code:java} [2024-05-22 11:01:41,087] INFO [MetadataLoader id=3] maybePublishMetadata(LOG_DELTA): The loader is still catching up because we have loaded up to offset 41872530, but the high water mark is 41872532 (org.apache.kafka.image.loader.MetadataLoader) [2024-05-22 11:01:41,088] INFO [MetadataLoader id=3] maybePublishMetadata(LOG_DELTA): The loader finished catching up to the current high water mark of 41872532 (org.apache.kafka.image.loader.MetadataLoader) [2024-05-22 11:01:41,092] INFO [BrokerLifecycleManager id=3] The broker has caught up. Transitioning from STARTING to RECOVERY. (kafka.server.BrokerLifecycleManager) [2024-05-22 11:01:41,092] INFO [BrokerServer id=3] Finished waiting for the controller to acknowledge that we are caught up (kafka.server.BrokerServer) [2024-05-22 11:01:41,092] INFO [BrokerServer id=3] Waiting for the initial broker metadata update to be published (kafka.server.BrokerServer) [2024-05-22 11:01:41,095] ERROR Encountered fatal fault: Unhandled error initializing new publishers (org.apache.kafka.server.fault.ProcessTerminatingFaultHandler) org.apache.kafka.image.writer.UnwritableMetadataException: Metadata has been lost because the following could not be represented in metadata version 3.5-IV2: the directory assignment state of one or more replicas at org.apache.kafka.image.writer.ImageWriterOptions.handleLoss(ImageWriterOptions.java:94) at org.apache.kafka.metadata.PartitionRegistration.toRecord(PartitionRegistration.java:391) at org.apache.kafka.image.TopicImage.write(TopicImage.java:71) at org.apache.kafka.image.TopicsImage.write(TopicsImage.java:84) at org.apache.kafka.image.MetadataImage.write(MetadataImage.java:155) at org.apache.kafka.image.loader.MetadataLoader.initializeNewPublishers(MetadataLoader.java:295) at org.apache.kafka.image.loader.MetadataLoader.lambda$scheduleInitializeNewPublishers$0(MetadataLoader.java:266) at org.apache.kafka.queue.KafkaEventQueue$EventContext.run(KafkaEventQueue.java:127) at org.apache.kafka.queue.KafkaEventQueue$EventHandler.handleEvents(KafkaEventQueue.java:210) at org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:181) at java.base/java.lang.Thread.run(Thread.java:1583) {code} was (Author: jianbin): I have encountered the same issue. Can anyone help me with this? I upgraded from 3.5.1 to 3.7.0, and I have already changed inter.broker.protocol.version to 3.7 and ran it for some time. But I have never executed `./bin/kafka-features.sh upgrade --metadata 3.7` The last time I restarted the cluster, I found that it could not be started anymore. 
The last line of the log is as follows: ``` [2024-05-22 11:01:41,087] INFO [MetadataLoader id=3] maybePublishMetadata(LOG_DELTA): The loader is still catching up because we have loaded up to offset 41872530, but the high water mark is 41872532 (org.apache.kafka.image.loader.MetadataLoader) [2024-05-22 11:01:41,088] INFO [MetadataLoader id=3] maybePublishMetadata(LOG_DELTA): The loader finished catching up to the current high water mark of 41872532 (org.apache.kafka.image.loader.MetadataLoader) [2024-05-22 11:01:41,092] INFO [BrokerLifecycleManager id=3] The broker has caught up. Transitioning from STARTING to RECOVERY. (kafka.server.BrokerLifecycleManager) [2024-05-22 11:01:41,092] INFO [BrokerServer id=3] Finished waiting for the controller to acknowledge that we are caught up (kafka.server.BrokerServer) [2024-05-22 11:01:41,092] INFO [BrokerServer id=3] Waiting for the initial broker metadata update to be published (kafka.server.BrokerServer) [2024-05-22 11:01:41,095] ERROR Encountered fatal fault: Unhandled error initializing new publishers (org.apache.kafka.server.fault.ProcessTerminatingFaultHandler) org.apache.kafka.image.writer.UnwritableMetadataException: Metadata has been lost because the following could not be represented in metadata version 3.5-IV2: the directory assignment state of one or more replicas at org.apache.kafka.image.writer.ImageWriterOptions.handleLoss(ImageWriterOptions.java:94) at org.apache.kafka.metadata.PartitionRegistration.toRecord(PartitionRegistration.java:391) at org.apache.kafka.image.TopicImage.write(TopicImage.java:71)
[jira] [Commented] (KAFKA-16662) UnwritableMetadataException: Metadata has been lost
[ https://issues.apache.org/jira/browse/KAFKA-16662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17848424#comment-17848424 ] Jianbin Chen commented on KAFKA-16662: -- I have encountered the same issue. Can anyone help me with this? I upgraded from 3.5.1 to 3.7.0, and I have already changed inter.broker.protocol.version to 3.7 and ran it for some time. But I have never executed `./bin/kafka-features.sh upgrade --metadata 3.7` The last time I restarted the cluster, I found that it could not be started anymore. The last line of the log is as follows: ``` [2024-05-22 11:01:41,087] INFO [MetadataLoader id=3] maybePublishMetadata(LOG_DELTA): The loader is still catching up because we have loaded up to offset 41872530, but the high water mark is 41872532 (org.apache.kafka.image.loader.MetadataLoader) [2024-05-22 11:01:41,088] INFO [MetadataLoader id=3] maybePublishMetadata(LOG_DELTA): The loader finished catching up to the current high water mark of 41872532 (org.apache.kafka.image.loader.MetadataLoader) [2024-05-22 11:01:41,092] INFO [BrokerLifecycleManager id=3] The broker has caught up. Transitioning from STARTING to RECOVERY. (kafka.server.BrokerLifecycleManager) [2024-05-22 11:01:41,092] INFO [BrokerServer id=3] Finished waiting for the controller to acknowledge that we are caught up (kafka.server.BrokerServer) [2024-05-22 11:01:41,092] INFO [BrokerServer id=3] Waiting for the initial broker metadata update to be published (kafka.server.BrokerServer) [2024-05-22 11:01:41,095] ERROR Encountered fatal fault: Unhandled error initializing new publishers (org.apache.kafka.server.fault.ProcessTerminatingFaultHandler) org.apache.kafka.image.writer.UnwritableMetadataException: Metadata has been lost because the following could not be represented in metadata version 3.5-IV2: the directory assignment state of one or more replicas at org.apache.kafka.image.writer.ImageWriterOptions.handleLoss(ImageWriterOptions.java:94) at org.apache.kafka.metadata.PartitionRegistration.toRecord(PartitionRegistration.java:391) at org.apache.kafka.image.TopicImage.write(TopicImage.java:71) at org.apache.kafka.image.TopicsImage.write(TopicsImage.java:84) at org.apache.kafka.image.MetadataImage.write(MetadataImage.java:155) at org.apache.kafka.image.loader.MetadataLoader.initializeNewPublishers(MetadataLoader.java:295) at org.apache.kafka.image.loader.MetadataLoader.lambda$scheduleInitializeNewPublishers$0(MetadataLoader.java:266) at org.apache.kafka.queue.KafkaEventQueue$EventContext.run(KafkaEventQueue.java:127) at org.apache.kafka.queue.KafkaEventQueue$EventHandler.handleEvents(KafkaEventQueue.java:210) at org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:181) at java.base/java.lang.Thread.run(Thread.java:1583) ``` > UnwritableMetadataException: Metadata has been lost > --- > > Key: KAFKA-16662 > URL: https://issues.apache.org/jira/browse/KAFKA-16662 > Project: Kafka > Issue Type: Bug >Affects Versions: 3.7.0 > Environment: Docker Image (bitnami/kafka:3.7.0) > via Docker Compose >Reporter: Tobias Bohn >Priority: Major > Attachments: log.txt > > > Hello, > First of all: I am new to this Jira and apologize if anything is set or > specified incorrectly. Feel free to advise me. > We currently have an error in our test system, which unfortunately I can't > solve, because I couldn't find anything related to it. No solution could be > found via the mailing list either. > The error occurs when we want to start up a node. 
The node runs using Kraft > and is both a controller and a broker. The following error message appears at > startup: > {code:java} > kafka | [2024-04-16 06:18:13,707] ERROR Encountered fatal fault: Unhandled > error initializing new publishers > (org.apache.kafka.server.fault.ProcessTerminatingFaultHandler) > kafka | org.apache.kafka.image.writer.UnwritableMetadataException: Metadata > has been lost because the following could not be represented in metadata > version 3.5-IV2: the directory assignment state of one or more replicas > kafka | at > org.apache.kafka.image.writer.ImageWriterOptions.handleLoss(ImageWriterOptions.java:94) > kafka | at > org.apache.kafka.metadata.PartitionRegistration.toRecord(PartitionRegistration.java:391) > kafka | at org.apache.kafka.image.TopicImage.write(TopicImage.java:71) > kafka | at > org.apache.kafka.image.TopicsImage.write(TopicsImage.java:84) > kafka | at > org.apache.kafka.image.MetadataImage.write(MetadataImage.java:155) > kafka | at > org.apache.kafka.image.loader.MetadataLoader.initializeNewPublishers(MetadataLoader.java:295) > kafka | at > org.apache.kafka.image.loader.MetadataLoader.
[jira] [Resolved] (KAFKA-16378) Under tiered storage, deleting local logs does not free disk space
[ https://issues.apache.org/jira/browse/KAFKA-16378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jianbin Chen resolved KAFKA-16378. -- Resolution: Fixed > Under tiered storage, deleting local logs does not free disk space > -- > > Key: KAFKA-16378 > URL: https://issues.apache.org/jira/browse/KAFKA-16378 > Project: Kafka > Issue Type: Bug > Components: Tiered-Storage >Affects Versions: 3.7.0 >Reporter: Jianbin Chen >Priority: Major > Attachments: image-2024-03-15-09-33-13-903.png > > > This is an occasional phenomenon: whenever a tiered-storage topic triggers > the deletion of local log segments, there is a chance of residual file > references, even though the files themselves can no longer be found on the > local disk! > I use the following implementation: [Aiven-Open/tiered-storage-for-apache-kafka: > RemoteStorageManager for Apache Kafka® Tiered Storage > (github.com)|https://github.com/Aiven-Open/tiered-storage-for-apache-kafka] > I also filed an issue in their community, which contains a full > description of the problem: > [Disk space not released · Issue #513 · > Aiven-Open/tiered-storage-for-apache-kafka > (github.com)|https://github.com/Aiven-Open/tiered-storage-for-apache-kafka/issues/513] > !image-2024-03-15-09-33-13-903.png! > You can clearly see in this figure that Kafka has already logged the > segment-deletion operation, but the deleted files are still referenced and > the disk space has not been released -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-16378) Under tiered storage, deleting local logs does not free disk space
[ https://issues.apache.org/jira/browse/KAFKA-16378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jianbin Chen updated KAFKA-16378: - Component/s: Tiered-Storage > Under tiered storage, deleting local logs does not free disk space > -- > > Key: KAFKA-16378 > URL: https://issues.apache.org/jira/browse/KAFKA-16378 > Project: Kafka > Issue Type: Bug > Components: Tiered-Storage >Affects Versions: 3.7.0 >Reporter: Jianbin Chen >Priority: Major > Attachments: image-2024-03-15-09-33-13-903.png > > > This is an occasional phenomenon: whenever a tiered-storage topic triggers > the deletion of local log segments, there is a chance of residual file > references, even though the files themselves can no longer be found on the > local disk! > I use the following implementation: [Aiven-Open/tiered-storage-for-apache-kafka: > RemoteStorageManager for Apache Kafka® Tiered Storage > (github.com)|https://github.com/Aiven-Open/tiered-storage-for-apache-kafka] > I also filed an issue in their community, which contains a full > description of the problem: > [Disk space not released · Issue #513 · > Aiven-Open/tiered-storage-for-apache-kafka > (github.com)|https://github.com/Aiven-Open/tiered-storage-for-apache-kafka/issues/513] > !image-2024-03-15-09-33-13-903.png! > You can clearly see in this figure that Kafka has already logged the > segment-deletion operation, but the deleted files are still referenced and > the disk space has not been released -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-16378) Under tiered storage, deleting local logs does not free disk space
[ https://issues.apache.org/jira/browse/KAFKA-16378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jianbin Chen updated KAFKA-16378: - Issue Type: Bug (was: Wish) > Under tiered storage, deleting local logs does not free disk space > -- > > Key: KAFKA-16378 > URL: https://issues.apache.org/jira/browse/KAFKA-16378 > Project: Kafka > Issue Type: Bug >Affects Versions: 3.7.0 >Reporter: Jianbin Chen >Priority: Major > Attachments: image-2024-03-15-09-33-13-903.png > > > This is an occasional phenomenon: whenever a tiered-storage topic triggers > the deletion of local log segments, there is a chance of residual file > references, even though the files themselves can no longer be found on the > local disk! > I use the following implementation: [Aiven-Open/tiered-storage-for-apache-kafka: > RemoteStorageManager for Apache Kafka® Tiered Storage > (github.com)|https://github.com/Aiven-Open/tiered-storage-for-apache-kafka] > I also filed an issue in their community, which contains a full > description of the problem: > [Disk space not released · Issue #513 · > Aiven-Open/tiered-storage-for-apache-kafka > (github.com)|https://github.com/Aiven-Open/tiered-storage-for-apache-kafka/issues/513] > !image-2024-03-15-09-33-13-903.png! > You can clearly see in this figure that Kafka has already logged the > segment-deletion operation, but the deleted files are still referenced and > the disk space has not been released -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-16378) Under tiered storage, deleting local logs does not free disk space
Jianbin Chen created KAFKA-16378: Summary: Under tiered storage, deleting local logs does not free disk space Key: KAFKA-16378 URL: https://issues.apache.org/jira/browse/KAFKA-16378 Project: Kafka Issue Type: Wish Affects Versions: 3.7.0 Reporter: Jianbin Chen Attachments: image-2024-03-15-09-33-13-903.png This is an occasional phenomenon: whenever a tiered-storage topic triggers the deletion of local log segments, there is a chance of residual file references, even though the files themselves can no longer be found on the local disk! I use the following implementation: [Aiven-Open/tiered-storage-for-apache-kafka: RemoteStorageManager for Apache Kafka® Tiered Storage (github.com)|https://github.com/Aiven-Open/tiered-storage-for-apache-kafka] I also filed an issue in their community, which contains a full description of the problem: [Disk space not released · Issue #513 · Aiven-Open/tiered-storage-for-apache-kafka (github.com)|https://github.com/Aiven-Open/tiered-storage-for-apache-kafka/issues/513] !image-2024-03-15-09-33-13-903.png! You can clearly see in this figure that Kafka has already logged the segment-deletion operation, but the deleted files are still referenced and the disk space has not been released -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-16340) Replication factor: 3 larger than available brokers: 1.
[ https://issues.apache.org/jira/browse/KAFKA-16340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jianbin Chen updated KAFKA-16340: - Description: Setting remote.log.metadata.topic.replication.factor is invalid {code:java} broker.id=1 log.cleanup.policy=delete log.cleaner.enable=true log.cleaner.delete.retention.ms=30 socket.send.buffer.bytes=102400 socket.receive.buffer.bytes=102400 socket.request.max.bytes=104857600 message.max.bytes=5242880 replica.fetch.max.bytes=5242880 log.dirs=/data01/kafka110-logs num.partitions=2 default.replication.factor=1 delete.topic.enable=true auto.create.topics.enable=true num.recovery.threads.per.data.dir=1 offsets.topic.replication.factor=1 transaction.state.log.replication.factor=1 transaction.state.log.min.isr=1 offsets.retention.minutes=1440 log.retention.minutes=10 log.local.retention.ms=30 log.segment.bytes=104857600 log.retention.check.interval.ms=30 remote.log.metadata.topic.replication.factor=1 remote.log.storage.system.enable=true remote.log.metadata.topic.retention.ms=-1{code} !image-2024-03-05-09-31-35-058.png! {code:java} [2024-03-05 09:27:49,672] ERROR Encountered error while creating __remote_log_metadata topic. (org.apache.kafka.server.log.remote.metadata.storage.TopicBasedRemoteLogMetadataManager) java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.InvalidReplicationFactorException: Replication factor: 3 larger than available brokers: 1. at java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:396) at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2073) at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:165) at org.apache.kafka.server.log.remote.metadata.storage.TopicBasedRemoteLogMetadataManager.createTopic(TopicBasedRemoteLogMetadataManager.java:509) at org.apache.kafka.server.log.remote.metadata.storage.TopicBasedRemoteLogMetadataManager.initializeResources(TopicBasedRemoteLogMetadataManager.java:396) at java.base/java.lang.Thread.run(Thread.java:1589) Caused by: org.apache.kafka.common.errors.InvalidReplicationFactorException: Replication factor: 3 larger than available brokers: 1.{code} was: Setting remote.log.metadata .topic.replication.factor is invalid {code:java} broker.id=1 log.cleanup.policy=delete log.cleaner.enable=true log.cleaner.delete.retention.ms=30 socket.send.buffer.bytes=102400 socket.receive.buffer.bytes=102400 socket.request.max.bytes=104857600 message.max.bytes=5242880 replica.fetch.max.bytes=5242880 log.dirs=/data01/kafka110-logs num.partitions=2 default.replication.factor=1 delete.topic.enable=true auto.create.topics.enable=true num.recovery.threads.per.data.dir=1 offsets.topic.replication.factor=1 transaction.state.log.replication.factor=1 transaction.state.log.min.isr=1 offsets.retention.minutes=1440 log.retention.minutes=10 log.local.retention.ms=30 log.segment.bytes=104857600 log.retention.check.interval.ms=30 remote.log.metadata.topic.replication.factor=1 remote.log.storage.system.enable=true remote.log.metadata.topic.retention.ms=-1{code} !image-2024-03-05-09-31-35-058.png! {code:java} [2024-03-05 09:27:49,672] ERROR Encountered error while creating __remote_log_metadata topic. (org.apache.kafka.server.log.remote.metadata.storage.TopicBasedRemoteLogMetadataManager) java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.InvalidReplicationFactorException: Replication factor: 3 larger than available brokers: 1. 
at java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:396) at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2073) at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:165) at org.apache.kafka.server.log.remote.metadata.storage.TopicBasedRemoteLogMetadataManager.createTopic(TopicBasedRemoteLogMetadataManager.java:509) at org.apache.kafka.server.log.remote.metadata.storage.TopicBasedRemoteLogMetadataManager.initializeResources(TopicBasedRemoteLogMetadataManager.java:396) at java.base/java.lang.Thread.run(Thread.java:1589) Caused by: org.apache.kafka.common.errors.InvalidReplicationFactorException: Replication factor: 3 larger than available brokers: 1.{code} > Replication factor: 3 larger than available brokers: 1. > > > Key: KAFKA-16340 > URL: https://issues.apache.org/jira/browse/KAFKA-16340 > Project: Kafka > Issue Type: Wish >Affects Versions: 3.7.0 >Reporter: Jianbin Chen >Priority: Major > Attachments: image-2024-03-05-09-31-35-058.png > > > Setting remote.log.metadata.topic.replication.factor is invalid > {code:java} > broker.id=1 > log.cleanup.policy=delete > log.cleaner.enable=true > log.
[jira] [Updated] (KAFKA-16340) Replication factor: 3 larger than available brokers: 1.
[ https://issues.apache.org/jira/browse/KAFKA-16340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jianbin Chen updated KAFKA-16340: - Description: Setting remote.log.metadata .topic.replication.factor is invalid {code:java} broker.id=1 log.cleanup.policy=delete log.cleaner.enable=true log.cleaner.delete.retention.ms=30 socket.send.buffer.bytes=102400 socket.receive.buffer.bytes=102400 socket.request.max.bytes=104857600 message.max.bytes=5242880 replica.fetch.max.bytes=5242880 log.dirs=/data01/kafka110-logs num.partitions=2 default.replication.factor=1 delete.topic.enable=true auto.create.topics.enable=true num.recovery.threads.per.data.dir=1 offsets.topic.replication.factor=1 transaction.state.log.replication.factor=1 transaction.state.log.min.isr=1 offsets.retention.minutes=1440 log.retention.minutes=10 log.local.retention.ms=30 log.segment.bytes=104857600 log.retention.check.interval.ms=30 remote.log.metadata.topic.replication.factor=1 remote.log.storage.system.enable=true remote.log.metadata.topic.retention.ms=-1{code} !image-2024-03-05-09-31-35-058.png! {code:java} [2024-03-05 09:27:49,672] ERROR Encountered error while creating __remote_log_metadata topic. (org.apache.kafka.server.log.remote.metadata.storage.TopicBasedRemoteLogMetadataManager) java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.InvalidReplicationFactorException: Replication factor: 3 larger than available brokers: 1. at java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:396) at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2073) at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:165) at org.apache.kafka.server.log.remote.metadata.storage.TopicBasedRemoteLogMetadataManager.createTopic(TopicBasedRemoteLogMetadataManager.java:509) at org.apache.kafka.server.log.remote.metadata.storage.TopicBasedRemoteLogMetadataManager.initializeResources(TopicBasedRemoteLogMetadataManager.java:396) at java.base/java.lang.Thread.run(Thread.java:1589) Caused by: org.apache.kafka.common.errors.InvalidReplicationFactorException: Replication factor: 3 larger than available brokers: 1.{code} was: Setting remote.log.metadata.topic.replication.factor is invalid {code:java} broker.id=1 log.cleanup.policy=delete log.cleaner.enable=true log.cleaner.delete.retention.ms=30 socket.send.buffer.bytes=102400 socket.receive.buffer.bytes=102400 socket.request.max.bytes=104857600 message.max.bytes=5242880 replica.fetch.max.bytes=5242880 log.dirs=/data01/kafka110-logs num.partitions=2 default.replication.factor=1 delete.topic.enable=true auto.create.topics.enable=true num.recovery.threads.per.data.dir=1 offsets.topic.replication.factor=1 transaction.state.log.replication.factor=1 transaction.state.log.min.isr=1 offsets.retention.minutes=1440 log.retention.minutes=10 log.local.retention.ms=30 log.segment.bytes=104857600 log.retention.check.interval.ms=30 remote.log.metadata.topic.replication.factor=1 remote.log.storage.system.enable=true remote.log.metadata.topic.retention.ms=-1{code} !image-2024-03-05-09-31-35-058.png! {code:java} [2024-03-05 09:27:49,672] ERROR Encountered error while creating __remote_log_metadata topic. (org.apache.kafka.server.log.remote.metadata.storage.TopicBasedRemoteLogMetadataManager) java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.InvalidReplicationFactorException: Replication factor: 3 larger than available brokers: 1. 
at java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:396) at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2073) at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:165) at org.apache.kafka.server.log.remote.metadata.storage.TopicBasedRemoteLogMetadataManager.createTopic(TopicBasedRemoteLogMetadataManager.java:509) at org.apache.kafka.server.log.remote.metadata.storage.TopicBasedRemoteLogMetadataManager.initializeResources(TopicBasedRemoteLogMetadataManager.java:396) at java.base/java.lang.Thread.run(Thread.java:1589) Caused by: org.apache.kafka.common.errors.InvalidReplicationFactorException: Replication factor: 3 larger than available brokers: 1.{code} > Replication factor: 3 larger than available brokers: 1. > > > Key: KAFKA-16340 > URL: https://issues.apache.org/jira/browse/KAFKA-16340 > Project: Kafka > Issue Type: Wish >Affects Versions: 3.7.0 >Reporter: Jianbin Chen >Priority: Major > Attachments: image-2024-03-05-09-31-35-058.png > > > Setting remote.log.metadata .topic.replication.factor is invalid > {code:java} > broker.id=1 > log.cleanup.policy=delete > log.cleaner.enable=true > log.cleaner.de
[jira] [Updated] (KAFKA-16340) Replication factor: 3 larger than available brokers: 1.
[ https://issues.apache.org/jira/browse/KAFKA-16340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jianbin Chen updated KAFKA-16340: - Description: Setting remote.log.metadata.topic.replication.factor has no effect {code:java} broker.id=1 log.cleanup.policy=delete log.cleaner.enable=true log.cleaner.delete.retention.ms=30 socket.send.buffer.bytes=102400 socket.receive.buffer.bytes=102400 socket.request.max.bytes=104857600 message.max.bytes=5242880 replica.fetch.max.bytes=5242880 log.dirs=/data01/kafka110-logs num.partitions=2 default.replication.factor=1 delete.topic.enable=true auto.create.topics.enable=true num.recovery.threads.per.data.dir=1 offsets.topic.replication.factor=1 transaction.state.log.replication.factor=1 transaction.state.log.min.isr=1 offsets.retention.minutes=1440 log.retention.minutes=10 log.local.retention.ms=30 log.segment.bytes=104857600 log.retention.check.interval.ms=30 remote.log.metadata.topic.replication.factor=1 remote.log.storage.system.enable=true remote.log.metadata.topic.retention.ms=-1{code} !image-2024-03-05-09-31-35-058.png! {code:java} [2024-03-05 09:27:49,672] ERROR Encountered error while creating __remote_log_metadata topic. (org.apache.kafka.server.log.remote.metadata.storage.TopicBasedRemoteLogMetadataManager) java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.InvalidReplicationFactorException: Replication factor: 3 larger than available brokers: 1. at java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:396) at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2073) at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:165) at org.apache.kafka.server.log.remote.metadata.storage.TopicBasedRemoteLogMetadataManager.createTopic(TopicBasedRemoteLogMetadataManager.java:509) at org.apache.kafka.server.log.remote.metadata.storage.TopicBasedRemoteLogMetadataManager.initializeResources(TopicBasedRemoteLogMetadataManager.java:396) at java.base/java.lang.Thread.run(Thread.java:1589) Caused by: org.apache.kafka.common.errors.InvalidReplicationFactorException: Replication factor: 3 larger than available brokers: 1.{code} was: When testing tiered storage, I found that setting remote.log.metadata.topic.replication.factor has no effect {code:java} broker.id=1 log.cleanup.policy=delete log.cleaner.enable=true log.cleaner.delete.retention.ms=30 socket.send.buffer.bytes=102400 socket.receive.buffer.bytes=102400 socket.request.max.bytes=104857600 message.max.bytes=5242880 replica.fetch.max.bytes=5242880 log.dirs=/data01/kafka110-logs num.partitions=2 default.replication.factor=1 delete.topic.enable=true auto.create.topics.enable=true num.recovery.threads.per.data.dir=1 offsets.topic.replication.factor=1 transaction.state.log.replication.factor=1 transaction.state.log.min.isr=1 offsets.retention.minutes=1440 log.retention.minutes=10 log.local.retention.ms=30 log.segment.bytes=104857600 log.retention.check.interval.ms=30 remote.log.metadata.topic.replication.factor=1 remote.log.storage.system.enable=true remote.log.metadata.topic.retention.ms=-1{code} !image-2024-03-05-09-31-35-058.png! {code:java} [2024-03-05 09:27:49,672] ERROR Encountered error while creating __remote_log_metadata topic. (org.apache.kafka.server.log.remote.metadata.storage.TopicBasedRemoteLogMetadataManager) java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.InvalidReplicationFactorException: Replication factor: 3 larger than available brokers: 1. 
at java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:396) at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2073) at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:165) at org.apache.kafka.server.log.remote.metadata.storage.TopicBasedRemoteLogMetadataManager.createTopic(TopicBasedRemoteLogMetadataManager.java:509) at org.apache.kafka.server.log.remote.metadata.storage.TopicBasedRemoteLogMetadataManager.initializeResources(TopicBasedRemoteLogMetadataManager.java:396) at java.base/java.lang.Thread.run(Thread.java:1589) Caused by: org.apache.kafka.common.errors.InvalidReplicationFactorException: Replication factor: 3 larger than available brokers: 1.{code} > Replication factor: 3 larger than available brokers: 1. > > > Key: KAFKA-16340 > URL: https://issues.apache.org/jira/browse/KAFKA-16340 > Project: Kafka > Issue Type: Wish >Affects Versions: 3.7.0 >Reporter: Jianbin Chen >Priority: Major > Attachments: image-2024-03-05-09-31-35-058.png > > > Setting remote.log.metadata.topic.replication.factor has no effect > {code:java} > broker.id=1 > log.cleanup.policy=delete > lo
[jira] [Updated] (KAFKA-16340) Replication factor: 3 larger than available brokers: 1.
[ https://issues.apache.org/jira/browse/KAFKA-16340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jianbin Chen updated KAFKA-16340: - Description: When testing tiered storage, I found that setting remote.log.metadata.topic.replication.factor has no effect {code:java} broker.id=1 log.cleanup.policy=delete log.cleaner.enable=true log.cleaner.delete.retention.ms=30 socket.send.buffer.bytes=102400 socket.receive.buffer.bytes=102400 socket.request.max.bytes=104857600 message.max.bytes=5242880 replica.fetch.max.bytes=5242880 log.dirs=/data01/kafka110-logs num.partitions=2 default.replication.factor=1 delete.topic.enable=true auto.create.topics.enable=true num.recovery.threads.per.data.dir=1 offsets.topic.replication.factor=1 transaction.state.log.replication.factor=1 transaction.state.log.min.isr=1 offsets.retention.minutes=1440 log.retention.minutes=10 log.local.retention.ms=30 log.segment.bytes=104857600 log.retention.check.interval.ms=30 remote.log.metadata.topic.replication.factor=1 remote.log.storage.system.enable=true remote.log.metadata.topic.retention.ms=-1{code} !image-2024-03-05-09-31-35-058.png! {code:java} [2024-03-05 09:27:49,672] ERROR Encountered error while creating __remote_log_metadata topic. (org.apache.kafka.server.log.remote.metadata.storage.TopicBasedRemoteLogMetadataManager) java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.InvalidReplicationFactorException: Replication factor: 3 larger than available brokers: 1. at java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:396) at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2073) at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:165) at org.apache.kafka.server.log.remote.metadata.storage.TopicBasedRemoteLogMetadataManager.createTopic(TopicBasedRemoteLogMetadataManager.java:509) at org.apache.kafka.server.log.remote.metadata.storage.TopicBasedRemoteLogMetadataManager.initializeResources(TopicBasedRemoteLogMetadataManager.java:396) at java.base/java.lang.Thread.run(Thread.java:1589) Caused by: org.apache.kafka.common.errors.InvalidReplicationFactorException: Replication factor: 3 larger than available brokers: 1.{code} was: When testing tiered storage, I ran into the problem that setting remote.log.metadata.topic.replication.factor has no effect {code:java} broker.id=1 log.cleanup.policy=delete log.cleaner.enable=true log.cleaner.delete.retention.ms=30 socket.send.buffer.bytes=102400 socket.receive.buffer.bytes=102400 socket.request.max.bytes=104857600 message.max.bytes=5242880 replica.fetch.max.bytes=5242880 log.dirs=/data01/kafka110-logs num.partitions=2 default.replication.factor=1 delete.topic.enable=true auto.create.topics.enable=true num.recovery.threads.per.data.dir=1 offsets.topic.replication.factor=1 transaction.state.log.replication.factor=1 transaction.state.log.min.isr=1 offsets.retention.minutes=1440 log.retention.minutes=10 log.local.retention.ms=30 log.segment.bytes=104857600 log.retention.check.interval.ms=30 remote.log.metadata.topic.replication.factor=1 remote.log.storage.system.enable=true remote.log.metadata.topic.retention.ms=-1{code} !image-2024-03-05-09-31-35-058.png! {code:java} [2024-03-05 09:27:49,672] ERROR Encountered error while creating __remote_log_metadata topic. (org.apache.kafka.server.log.remote.metadata.storage.TopicBasedRemoteLogMetadataManager) java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.InvalidReplicationFactorException: Replication factor: 3 larger than available brokers: 1. 
at java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:396) at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2073) at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:165) at org.apache.kafka.server.log.remote.metadata.storage.TopicBasedRemoteLogMetadataManager.createTopic(TopicBasedRemoteLogMetadataManager.java:509) at org.apache.kafka.server.log.remote.metadata.storage.TopicBasedRemoteLogMetadataManager.initializeResources(TopicBasedRemoteLogMetadataManager.java:396) at java.base/java.lang.Thread.run(Thread.java:1589) Caused by: org.apache.kafka.common.errors.InvalidReplicationFactorException: Replication factor: 3 larger than available brokers: 1.{code} > Replication factor: 3 larger than available brokers: 1. > > > Key: KAFKA-16340 > URL: https://issues.apache.org/jira/browse/KAFKA-16340 > Project: Kafka > Issue Type: Wish >Affects Versions: 3.7.0 >Reporter: Jianbin Chen >Priority: Major > Attachments: image-2024-03-05-09-31-35-058.png > > > When testing tiered storage, I found that setting > remote.log.metadata.topic.replication.factor has no effect
[jira] [Created] (KAFKA-16340) Replication factor: 3 larger than available brokers: 1.
Jianbin Chen created KAFKA-16340: Summary: Replication factor: 3 larger than available brokers: 1. Key: KAFKA-16340 URL: https://issues.apache.org/jira/browse/KAFKA-16340 Project: Kafka Issue Type: Wish Affects Versions: 3.7.0 Reporter: Jianbin Chen Attachments: image-2024-03-05-09-31-35-058.png When testing tiered storage, I ran into the problem that setting remote.log.metadata.topic.replication.factor has no effect {code:java} broker.id=1 log.cleanup.policy=delete log.cleaner.enable=true log.cleaner.delete.retention.ms=30 socket.send.buffer.bytes=102400 socket.receive.buffer.bytes=102400 socket.request.max.bytes=104857600 message.max.bytes=5242880 replica.fetch.max.bytes=5242880 log.dirs=/data01/kafka110-logs num.partitions=2 default.replication.factor=1 delete.topic.enable=true auto.create.topics.enable=true num.recovery.threads.per.data.dir=1 offsets.topic.replication.factor=1 transaction.state.log.replication.factor=1 transaction.state.log.min.isr=1 offsets.retention.minutes=1440 log.retention.minutes=10 log.local.retention.ms=30 log.segment.bytes=104857600 log.retention.check.interval.ms=30 remote.log.metadata.topic.replication.factor=1 remote.log.storage.system.enable=true remote.log.metadata.topic.retention.ms=-1{code} !image-2024-03-05-09-31-35-058.png! {code:java} [2024-03-05 09:27:49,672] ERROR Encountered error while creating __remote_log_metadata topic. (org.apache.kafka.server.log.remote.metadata.storage.TopicBasedRemoteLogMetadataManager) java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.InvalidReplicationFactorException: Replication factor: 3 larger than available brokers: 1. at java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:396) at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2073) at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:165) at org.apache.kafka.server.log.remote.metadata.storage.TopicBasedRemoteLogMetadataManager.createTopic(TopicBasedRemoteLogMetadataManager.java:509) at org.apache.kafka.server.log.remote.metadata.storage.TopicBasedRemoteLogMetadataManager.initializeResources(TopicBasedRemoteLogMetadataManager.java:396) at java.base/java.lang.Thread.run(Thread.java:1589) Caused by: org.apache.kafka.common.errors.InvalidReplicationFactorException: Replication factor: 3 larger than available brokers: 1.{code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-16060) Some questions about tiered storage capabilities
[ https://issues.apache.org/jira/browse/KAFKA-16060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17805362#comment-17805362 ] Jianbin Chen commented on KAFKA-16060: -- Thank you for your replies. > Some questions about tiered storage capabilities > > > Key: KAFKA-16060 > URL: https://issues.apache.org/jira/browse/KAFKA-16060 > Project: Kafka > Issue Type: Wish > Components: core >Affects Versions: 3.6.1 >Reporter: Jianbin Chen >Priority: Major > > # If a topic has 3 replicas, when the local retention time is reached, will > all 3 replicas transfer their logs to remote storage (HDFS, S3), or will only > the leader in the ISR transfer them? > # Regarding topics that do not support compaction: do you mean topics with > log.cleanup.policy=compact? -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-16060) Some questions about tiered storage capabilities
[ https://issues.apache.org/jira/browse/KAFKA-16060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17803449#comment-17803449 ] Jianbin Chen commented on KAFKA-16060: -- Thank you both for your replies. Allow me to ask an additional question about JBOD disk mounts, i.e. log.dirs=/data01,/data02: are there any plans to support tiered storage with this layout in the future? > Some questions about tiered storage capabilities > > > Key: KAFKA-16060 > URL: https://issues.apache.org/jira/browse/KAFKA-16060 > Project: Kafka > Issue Type: Wish > Components: core >Affects Versions: 3.6.1 >Reporter: Jianbin Chen >Priority: Major > > # If a topic has 3 replicas, when the local retention time is reached, will > all 3 replicas transfer their logs to remote storage (HDFS, S3), or will only > the leader in the ISR transfer them? > # Regarding topics that do not support compaction: do you mean topics with > log.cleanup.policy=compact? -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-16060) Some questions about tiered storage capabilities
Jianbin Chen created KAFKA-16060: Summary: Some questions about tiered storage capabilities Key: KAFKA-16060 URL: https://issues.apache.org/jira/browse/KAFKA-16060 Project: Kafka Issue Type: Wish Components: core Affects Versions: 3.6.1 Reporter: Jianbin Chen # If a topic has 3 replicas, when the local retention time is reached, will all 3 replicas transfer their logs to remote storage (HDFS, S3), or will only the leader in the ISR transfer them? # Regarding topics that do not support compaction: do you mean topics with log.cleanup.policy=compact? -- This message was sent by Atlassian Jira (v8.20.10#820010)
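The second question likely hinges on the difference between compaction and compression, which are two unrelated topic-level settings in Kafka; as a quick reference (an editorial sketch of standard topic configs, not an answer given in this thread):
{code:java}
# cleanup.policy controls log compaction -- this is what the
# tiered-storage restriction on "compacted topics" refers to:
cleanup.policy=compact

# compression.type controls message compression, a separate setting
# ("producer" means the broker keeps whatever codec the producer used):
compression.type=producer
{code}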
[jira] [Updated] (KAFKA-16039) RecordHeaders supports the addAll method
[ https://issues.apache.org/jira/browse/KAFKA-16039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jianbin Chen updated KAFKA-16039: - External issue URL: https://github.com/apache/kafka/pull/15034 > RecordHeaders supports the addAll method > > > Key: KAFKA-16039 > URL: https://issues.apache.org/jira/browse/KAFKA-16039 > Project: Kafka > Issue Type: Improvement > Components: clients >Reporter: Jianbin Chen >Priority: Minor > > Why not provide an addAll method in RecordHeaders? It would reduce the > amount of code required to copy headers from one record to another -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-16039) RecordHeaders supports the addAll method
Jianbin Chen created KAFKA-16039: Summary: RecordHeaders supports the addAll method Key: KAFKA-16039 URL: https://issues.apache.org/jira/browse/KAFKA-16039 Project: Kafka Issue Type: Improvement Components: clients Reporter: Jianbin Chen Why not provide an addAll method in RecordHeaders? It would reduce the amount of code required to copy headers from one record to another -- This message was sent by Atlassian Jira (v8.20.10#820010)
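To make the motivation concrete, here is a minimal sketch of the copy loop the report describes, using the existing public Headers API; the addAll call at the end is the proposed, hypothetical method (its exact signature is not specified in this issue; see the pull request linked above):
{code:java}
import java.nio.charset.StandardCharsets;

import org.apache.kafka.common.header.Header;
import org.apache.kafka.common.header.internals.RecordHeaders;

public class HeaderCopyExample {
    public static void main(String[] args) {
        RecordHeaders source = new RecordHeaders();
        source.add("trace-id", "abc-123".getBytes(StandardCharsets.UTF_8));
        source.add("tenant", "team-a".getBytes(StandardCharsets.UTF_8));

        // Today: copying headers from one record to another
        // requires an explicit loop, since Headers is Iterable<Header>.
        RecordHeaders target = new RecordHeaders();
        for (Header header : source) {
            target.add(header);
        }

        // With the proposed addAll (hypothetical), the loop collapses to:
        // target.addAll(source);
    }
}
{code}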