[jira] [Commented] (KAFKA-6762) log-cleaner thread terminates due to java.lang.IllegalStateException
[ https://issues.apache.org/jira/browse/KAFKA-6762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17003098#comment-17003098 ]

Fangbin Sun commented on KAFKA-6762:
------------------------------------

[~ijuma] This happens in our production environment, which runs version 0.10.2.0, and the workaround described above by [~ricbartm] didn't work. How can we fix this? Any advice?

> log-cleaner thread terminates due to java.lang.IllegalStateException
> ---------------------------------------------------------------------
>
>                 Key: KAFKA-6762
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6762
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.0.0
>         Environment: os: GNU/Linux
> arch: x86_64
> Kernel: 4.9.77
> jvm: OpenJDK 1.8.0
>            Reporter: Ricardo Bartolome
>            Priority: Major
>         Attachments: __consumer_offsets-9_.tar.xz
>
>
> We are experiencing some problems with the Kafka log-cleaner thread on Kafka 1.0.0. We have planned to update this cluster to 1.1.0 by next week in order to fix KAFKA-6683, but until then we can only confirm that it happens in 1.0.0.
> The log-cleaner thread crashes after a while with the following error:
> {code:java}
> [2018-03-28 11:14:40,199] INFO Cleaner 0: Beginning cleaning of log __consumer_offsets-31. (kafka.log.LogCleaner)
> [2018-03-28 11:14:40,199] INFO Cleaner 0: Building offset map for __consumer_offsets-31... (kafka.log.LogCleaner)
> [2018-03-28 11:14:40,218] INFO Cleaner 0: Building offset map for log __consumer_offsets-31 for 16 segments in offset range [1612869, 14282934). (kafka.log.LogCleaner)
> [2018-03-28 11:14:58,566] INFO Cleaner 0: Offset map for log __consumer_offsets-31 complete. (kafka.log.LogCleaner)
> [2018-03-28 11:14:58,566] INFO Cleaner 0: Cleaning log __consumer_offsets-31 (cleaning prior to Tue Mar 27 09:25:09 GMT 2018, discarding tombstones prior to Sat Feb 24 11:04:21 GMT 2018)... (kafka.log.LogCleaner)
> [2018-03-28 11:14:58,567] INFO Cleaner 0: Cleaning segment 0 in log __consumer_offsets-31 (largest timestamp Fri Feb 23 11:40:54 GMT 2018) into 0, discarding deletes. (kafka.log.LogCleaner)
> [2018-03-28 11:14:58,570] INFO Cleaner 0: Growing cleaner I/O buffers from 262144bytes to 524288 bytes. (kafka.log.LogCleaner)
> [2018-03-28 11:14:58,576] INFO Cleaner 0: Growing cleaner I/O buffers from 524288bytes to 1000012 bytes. (kafka.log.LogCleaner)
> [2018-03-28 11:14:58,593] ERROR [kafka-log-cleaner-thread-0]: Error due to (kafka.log.LogCleaner)
> java.lang.IllegalStateException: This log contains a message larger than maximum allowable size of 1000012.
>     at kafka.log.Cleaner.growBuffers(LogCleaner.scala:622)
>     at kafka.log.Cleaner.cleanInto(LogCleaner.scala:574)
>     at kafka.log.Cleaner.cleanSegments(LogCleaner.scala:459)
>     at kafka.log.Cleaner.$anonfun$doClean$6(LogCleaner.scala:396)
>     at kafka.log.Cleaner.$anonfun$doClean$6$adapted(LogCleaner.scala:395)
>     at scala.collection.immutable.List.foreach(List.scala:389)
>     at kafka.log.Cleaner.doClean(LogCleaner.scala:395)
>     at kafka.log.Cleaner.clean(LogCleaner.scala:372)
>     at kafka.log.LogCleaner$CleanerThread.cleanOrSleep(LogCleaner.scala:263)
>     at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:243)
>     at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:64)
> [2018-03-28 11:14:58,601] INFO [kafka-log-cleaner-thread-0]: Stopped (kafka.log.LogCleaner)
> [2018-04-04 14:25:12,773] INFO The cleaning for partition __broker-11-health-check-0 is aborted and paused (kafka.log.LogCleaner)
> [2018-04-04 14:25:12,773] INFO Compaction for partition __broker-11-health-check-0 is resumed (kafka.log.LogCleaner)
> [2018-04-04 14:25:12,774] INFO The cleaning for partition __broker-11-health-check-0 is aborted (kafka.log.LogCleaner)
> [2018-04-04 14:25:22,850] INFO Shutting down the log cleaner. (kafka.log.LogCleaner)
> [2018-04-04 14:25:22,850] INFO [kafka-log-cleaner-thread-0]: Shutting down (kafka.log.LogCleaner)
> [2018-04-04 14:25:22,850] INFO [kafka-log-cleaner-thread-0]: Shutdown completed (kafka.log.LogCleaner)
> {code}
> What we know so far is:
> * We are unable to reproduce it yet in a consistent manner.
> * It only happens in the PRO cluster and not in the PRE cluster for the same customer (whose message payloads are very similar).
> * Checking our Kafka logs, it only happened on the internal topics *__consumer_offsets-**.
> * When we restart the broker process the log-cleaner starts working again, but it can take between 3 minutes and some hours to die again.
> * We work around it by temporarily increasing the message.max.bytes and replica.fetch.max.bytes values to 10485760 (10MB) from the default 1000012 (~1MB).
> ** Before message.max.bytes = 1
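For context on the failure mode: the cleaner doubles its I/O buffers until they reach the broker's maximum message size and throws once a record still does not fit. Below is a minimal sketch of that check under simplified assumptions; it is not the actual kafka.log.Cleaner code (the real growBuffers takes no arguments and works on internal state), and maxIoBufferSize stands in for the limit derived from message.max.bytes (default 1000012).
{code:java}
// Minimal sketch of the buffer-growth check behind the exception above.
// Illustrative only: names and the signature are simplified.
def growBuffers(currentSize: Int, maxIoBufferSize: Int): Int = {
  if (currentSize >= maxIoBufferSize)
    throw new IllegalStateException(
      s"This log contains a message larger than maximum allowable size of $maxIoBufferSize.")
  // Double the buffer, but never beyond the configured cap:
  // 262144 -> 524288 -> 1000012, after which the next oversized record throws.
  math.min(currentSize * 2, maxIoBufferSize)
}
{code}
This also explains why the reporter's workaround helps: raising message.max.bytes and replica.fetch.max.bytes to 10485760 raises the cap, so the buffers can grow past the oversized record.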
[jira] [Comment Edited] (KAFKA-6582) Partitions get underreplicated, with a single ISR, and doesn't recover. Other brokers do not take over and we need to manually restart the broker.
[ https://issues.apache.org/jira/browse/KAFKA-6582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16930261#comment-16930261 ]

Fangbin Sun edited comment on KAFKA-6582 at 9/16/19 6:55 AM:
-------------------------------------------------------------

Someone encountered a similar issue in version 2.1.1 (KAFKA-7870). Is the issue indeed resolved in 2.1.1?

was (Author: fangbin):
Someone encountered a similar issue in version 2.1.1, KAFKA-7870. Is the issue indeed resolved in 2.1.1?

> Partitions get underreplicated, with a single ISR, and doesn't recover. Other brokers do not take over and we need to manually restart the broker.
> ---------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-6582
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6582
>             Project: Kafka
>          Issue Type: Bug
>          Components: network
>    Affects Versions: 1.0.0
>         Environment: Ubuntu 16.04
> Linux kafka04 4.4.0-109-generic #132-Ubuntu SMP Tue Jan 9 19:52:39 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
> java version "9.0.1"
> Java(TM) SE Runtime Environment (build 9.0.1+11)
> Java HotSpot(TM) 64-Bit Server VM (build 9.0.1+11, mixed mode)
> but also tried with the latest JVM 8 before with the same result.
>            Reporter: Jurriaan Pruis
>            Priority: Major
>         Attachments: Screenshot 2019-01-18 at 13.08.17.png, Screenshot 2019-01-18 at 13.16.59.png
>
>
> Partitions get underreplicated, with a single ISR, and don't recover. Other brokers do not take over and we need to manually restart the 'single ISR' broker (if you describe the partitions of a replicated topic it is clear that some partitions are only in sync on this broker).
> This bug resembles KAFKA-4477 a lot, but since that issue is marked as resolved this is probably something else, though similar.
> We have the same issue (or at least it looks pretty similar) on Kafka 1.0. Since upgrading to Kafka 1.0 in November 2017 we've had these issues (we upgraded from Kafka 0.10.2.1).
> This happens almost every 24-48 hours on a random broker. This is why we currently have a cronjob which restarts every broker every 24 hours.
> During this issue the ISR shows the following server log:
> {code:java}
> [2018-02-20 12:02:08,342] WARN Attempting to send response via channel for which there is no open connection, connection id 10.132.0.32:9092-10.14.148.20:56352-96708 (kafka.network.Processor)
> [2018-02-20 12:02:08,364] WARN Attempting to send response via channel for which there is no open connection, connection id 10.132.0.32:9092-10.14.150.25:54412-96715 (kafka.network.Processor)
> [2018-02-20 12:02:08,349] WARN Attempting to send response via channel for which there is no open connection, connection id 10.132.0.32:9092-10.14.149.18:35182-96705 (kafka.network.Processor)
> [2018-02-20 12:02:08,379] WARN Attempting to send response via channel for which there is no open connection, connection id 10.132.0.32:9092-10.14.150.25:54456-96717 (kafka.network.Processor)
> [2018-02-20 12:02:08,448] WARN Attempting to send response via channel for which there is no open connection, connection id 10.132.0.32:9092-10.14.159.20:36388-96720 (kafka.network.Processor)
> [2018-02-20 12:02:08,683] WARN Attempting to send response via channel for which there is no open connection, connection id 10.132.0.32:9092-10.14.157.110:41922-96740 (kafka.network.Processor)
> {code}
> Also on the ISR broker, the controller log shows this:
> {code:java}
> [2018-02-20 12:02:14,927] INFO [Controller-3-to-broker-3-send-thread]: Controller 3 connected to 10.132.0.32:9092 (id: 3 rack: null) for sending state change requests (kafka.controller.RequestSendThread)
> [2018-02-20 12:02:14,927] INFO [Controller-3-to-broker-0-send-thread]: Controller 3 connected to 10.132.0.10:9092 (id: 0 rack: null) for sending state change requests (kafka.controller.RequestSendThread)
> [2018-02-20 12:02:14,928] INFO [Controller-3-to-broker-1-send-thread]: Controller 3 connected to 10.132.0.12:9092 (id: 1 rack: null) for sending state change requests (kafka.controller.RequestSendThread){code}
> And the non-ISR brokers show these kinds of errors:
> {code:java}
> [2018-02-20 12:02:29,204] WARN [ReplicaFetcher replicaId=1, leaderId=3, fetcherId=0] Error in fetch to broker 3, request (type=FetchRequest, replicaId=1, maxWait=500, minBytes=1, maxBytes=10485760, fetchData={..}, isolationLevel=READ_UNCOMMITTED) (kafka.server.ReplicaFetcherThread)
> java.io.IOException: Connection to 3 was disconnected before the response was read
>     at org.apache.kafka.clients.NetworkClientUtils.sendAndReceive(NetworkClientUtils.java:95)
>     at kafka.server.ReplicaFetcherBlockingSend.sendRequest(ReplicaFetcherBlockingSend.scala:96)
>     at kafka.server.ReplicaFetcherThread.fetch(ReplicaFetcherThread.scala:205)
>     at kafka.server.ReplicaFetcherThread.fetch(Rep
[jira] [Created] (KAFKA-7735) StateChangeLogMerger tool can not work due to incorrect topic regular matches
Fangbin Sun created KAFKA-7735:
-----------------------------------

             Summary: StateChangeLogMerger tool can not work due to incorrect topic regular matches
                 Key: KAFKA-7735
                 URL: https://issues.apache.org/jira/browse/KAFKA-7735
             Project: Kafka
          Issue Type: Bug
          Components: tools
    Affects Versions: 2.0.0
            Reporter: Fangbin Sun

When the StateChangeLogMerger tool tries to obtain a topic's state-change log, it returns nothing.
{code:java}
bin/kafka-run-class.sh com.cmss.kafka.api.StateChangeLogMerger --logs state-change.log --topic test{code}
The tool uses a topic-partition regex as follows:
{code:java}
val topicPartitionRegex = new Regex("\\[(" + Topic.LEGAL_CHARS + "+),( )*([0-9]+)\\]"){code}
However, the state-change log no longer prints entries in that format. In 0.10.2.0, for example, some state-change logs were printed via the case class TopicAndPartition, whose toString was overridden as follows:
{code:java}
override def toString = "[%s,%d]".format(topic, partition){code}
In newer versions (e.g. 1.0.0+) most state-change logs are printed in the form "partition $topic-$partition". As a workaround, one can modify the topic-partition regex like:
{code:java}
val topicPartitionRegex = new Regex("(partition " + Topic.LEGAL_CHARS + "+)-([0-9]+)"){code}
and extract the topic with "matcher.group(1).substring(10)" (stripping the "partition " prefix), although some of the matched state-change output may be a little redundant.
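To make the mismatch concrete, here is a hedged, self-contained check of the two patterns. The object name and sample log lines are hypothetical, and legalChars only approximates Topic.LEGAL_CHARS using Kafka's documented legal topic characters:
{code:java}
import scala.util.matching.Regex

object StateChangeRegexCheck {
  // Stand-in for Topic.LEGAL_CHARS: alphanumerics, '.', '_' and '-'.
  val legalChars = "[a-zA-Z0-9\\._\\-]"

  // Old pattern: matches "[topic,partition]" as printed by TopicAndPartition.toString.
  val oldRegex = new Regex("\\[(" + legalChars + "+),( )*([0-9]+)\\]")

  // Workaround pattern from the report: matches "partition topic-partition".
  val newRegex = new Regex("(partition " + legalChars + "+)-([0-9]+)")

  def main(args: Array[String]): Unit = {
    val oldStyle = "TRACE Broker 1 state change for partition [test,0]"
    val newStyle = "TRACE Broker 1 became leader for partition test-0"

    // Old format: topic and partition come out of groups 1 and 3.
    println(oldRegex.findFirstMatchIn(oldStyle).map(m => (m.group(1), m.group(3)))) // Some((test,0))

    // The old regex finds nothing in new-style lines, which is the reported bug.
    println(oldRegex.findFirstMatchIn(newStyle)) // None

    // With the workaround, group(1) is "partition test"; stripping the
    // 10-character "partition " prefix recovers the topic name.
    newRegex.findFirstMatchIn(newStyle).foreach { m =>
      println((m.group(1).substring(10), m.group(2))) // (test,0)
    }
  }
}
{code}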
[jira] [Comment Edited] (KAFKA-7262) Ability to support heterogeneous storage in Kafka
[ https://issues.apache.org/jira/browse/KAFKA-7262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16574282#comment-16574282 ]

fangbin sun edited comment on KAFKA-7262 at 8/9/18 5:29 AM:
------------------------------------------------------------

In addition, what happens when one of the HDDs fails? Could Kafka itself move the general topic-partitions to SSD-mounted directories? Is there an internal request or mechanism in Kafka that might lead to this situation? Many thanks!

was (Author: fangbin):
In addition, what happens when one of the HDDs fails? Could Kafka itself move the general topic-partitions to SSD-mounted directories? Is it possible that any requests or mechanisms in Kafka could lead to this situation? Many thanks!

> Ability to support heterogeneous storage in Kafka
> -------------------------------------------------
>
>                 Key: KAFKA-7262
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7262
>             Project: Kafka
>          Issue Type: Improvement
>          Components: log
>    Affects Versions: 2.0.0
>            Reporter: fangbin sun
>            Priority: Major
>
> Currently we have a batch of servers; each broker has one SSD (1.5T) and ten HDDs (10T). As far as I know, Kafka itself doesn't know much about the underlying hardware: it chooses the directory with the least number of partitions when creating a topic-partition.
> Is it possible to deploy a heterogeneous cluster to take advantage of SSDs with smaller disk capacity? Hope you all consider this. Any insights or guidance would be greatly appreciated.
[jira] [Commented] (KAFKA-7262) Ability to support heterogeneous storage in Kafka
[ https://issues.apache.org/jira/browse/KAFKA-7262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16574281#comment-16574281 ]

fangbin sun commented on KAFKA-7262:
------------------------------------

[~huxi_2b] Maybe we need "topic isolation": key topic-partitions go to SSD-mounted directories and the others to HDD-mounted directories. Since Kafka ensures that the number of partitions is evenly distributed across the disks, a large number of partitions would have to be moved back and forth between the SSD and the HDDs (see the reassignment sketch after the quoted description below). It's a huge amount of work, especially with numerous brokers and partitions.

> Ability to support heterogeneous storage in Kafka
> -------------------------------------------------
>
>                 Key: KAFKA-7262
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7262
>             Project: Kafka
>          Issue Type: Improvement
>          Components: log
>    Affects Versions: 2.0.0
>            Reporter: fangbin sun
>            Priority: Major
>
> Currently we have a batch of servers; each broker has one SSD (1.5T) and ten HDDs (10T). As far as I know, Kafka itself doesn't know much about the underlying hardware: it chooses the directory with the least number of partitions when creating a topic-partition.
> Is it possible to deploy a heterogeneous cluster to take advantage of SSDs with smaller disk capacity? Hope you all consider this. Any insights or guidance would be greatly appreciated.
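For context, a hedged sketch of what such a manual move involves. Since Kafka 1.1 (KIP-113) the reassignment JSON accepted by kafka-reassign-partitions.sh can carry a log_dirs entry per replica, pinning a partition to a specific log directory; the topic name, paths, and endpoints below are illustrative:
{code:java}
# move-to-ssd.json -- pin one key partition to broker 1's SSD log directory.
# log_dirs must have one entry per replica; "any" leaves placement to the broker.
# {
#   "version": 1,
#   "partitions": [
#     { "topic": "key-topic", "partition": 0,
#       "replicas": [1], "log_dirs": ["/ssd/kafka-logs"] }
#   ]
# }
bin/kafka-reassign-partitions.sh --zookeeper zk:2181 --bootstrap-server broker:9092 \
    --reassignment-json-file move-to-ssd.json --execute
{code}
Repeating this per partition across many brokers is exactly the "huge amount of work" the comment describes.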
[jira] [Created] (KAFKA-7262) Ability to support heterogeneous storage in Kafka
fangbin sun created KAFKA-7262:
-----------------------------------

             Summary: Ability to support heterogeneous storage in Kafka
                 Key: KAFKA-7262
                 URL: https://issues.apache.org/jira/browse/KAFKA-7262
             Project: Kafka
          Issue Type: Improvement
          Components: log
    Affects Versions: 2.0.0
            Reporter: fangbin sun

Currently we have a batch of servers; each broker has one SSD (1.5T) and ten HDDs (10T). As far as I know, Kafka itself doesn't know much about the underlying hardware: it chooses the directory with the least number of partitions when creating a topic-partition (see the sketch after this message).
Is it possible to deploy a heterogeneous cluster to take advantage of SSDs with smaller disk capacity? Hope you all consider this. Any insights or guidance would be greatly appreciated.
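For illustration, a minimal model of the placement rule described above, assuming the simplest possible reading; the names are illustrative and this is not the actual kafka.log.LogManager API:
{code:java}
object LogDirSelection extends App {
  // Minimal model of least-partitions log-directory selection; the real
  // LogManager tracks live log dirs and their partition counts internally.
  def selectLogDir(partitionsPerDir: Map[String, Int]): String =
    partitionsPerDir.minBy(_._2)._1

  val dirs = Map(
    "/ssd/kafka-logs"  -> 120, // small, fast disk
    "/hdd1/kafka-logs" -> 80,  // large, slow disks
    "/hdd2/kafka-logs" -> 95
  )

  // The HDD wins purely on partition count; disk type is never considered,
  // which is why the SSD cannot be preferred for key topics today.
  println(selectLogDir(dirs)) // /hdd1/kafka-logs
}
{code}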