[jira] [Commented] (KAFKA-6762) log-cleaner thread terminates due to java.lang.IllegalStateException

2019-12-24 Thread Fangbin Sun (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-6762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17003098#comment-17003098
 ] 

Fangbin Sun commented on KAFKA-6762:


[~ijuma] This happens in our production environment, which runs version 
0.10.2.0, and the workaround described above by [~ricbartm] didn't work. How 
can we fix this? Any advice?

> log-cleaner thread terminates due to java.lang.IllegalStateException
> --
>
> Key: KAFKA-6762
> URL: https://issues.apache.org/jira/browse/KAFKA-6762
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.0.0
> Environment: os: GNU/Linux 
> arch: x86_64 
> Kernel: 4.9.77 
> jvm: OpenJDK 1.8.0
>Reporter: Ricardo Bartolome
>Priority: Major
> Attachments: __consumer_offsets-9_.tar.xz
>
>
> We are experiencing some problems with kafka log-cleaner thread on Kafka 
> 1.0.0. We have planned to update this cluster to 1.1.0 by next week in order 
> to fix KAFKA-6683, but until then we can only confirm that it happens in 
> 1.0.0.
> log-cleaner thread crashes after a while with the following error:
> {code:java}
> [2018-03-28 11:14:40,199] INFO Cleaner 0: Beginning cleaning of log 
> __consumer_offsets-31. (kafka.log.LogCleaner)
> [2018-03-28 11:14:40,199] INFO Cleaner 0: Building offset map for 
> __consumer_offsets-31... (kafka.log.LogCleaner)
> [2018-03-28 11:14:40,218] INFO Cleaner 0: Building offset map for log 
> __consumer_offsets-31 for 16 segments in offset range [1612869, 14282934). 
> (kafka.log.LogCleaner)
> [2018-03-28 11:14:58,566] INFO Cleaner 0: Offset map for log 
> __consumer_offsets-31 complete. (kafka.log.LogCleaner)
> [2018-03-28 11:14:58,566] INFO Cleaner 0: Cleaning log __consumer_offsets-31 
> (cleaning prior to Tue Mar 27 09:25:09 GMT 2018, discarding tombstones prior 
> to Sat Feb 24 11:04:21 GMT 2018
> )... (kafka.log.LogCleaner)
> [2018-03-28 11:14:58,567] INFO Cleaner 0: Cleaning segment 0 in log 
> __consumer_offsets-31 (largest timestamp Fri Feb 23 11:40:54 GMT 2018) into 
> 0, discarding deletes. (kafka.log.LogCleaner)
> [2018-03-28 11:14:58,570] INFO Cleaner 0: Growing cleaner I/O buffers from 
> 262144bytes to 524288 bytes. (kafka.log.LogCleaner)
> [2018-03-28 11:14:58,576] INFO Cleaner 0: Growing cleaner I/O buffers from 
> 524288bytes to 1000012 bytes. (kafka.log.LogCleaner)
> [2018-03-28 11:14:58,593] ERROR [kafka-log-cleaner-thread-0]: Error due to 
> (kafka.log.LogCleaner)
> java.lang.IllegalStateException: This log contains a message larger than 
> maximum allowable size of 1000012.
> at kafka.log.Cleaner.growBuffers(LogCleaner.scala:622)
> at kafka.log.Cleaner.cleanInto(LogCleaner.scala:574)
> at kafka.log.Cleaner.cleanSegments(LogCleaner.scala:459)
> at kafka.log.Cleaner.$anonfun$doClean$6(LogCleaner.scala:396)
> at kafka.log.Cleaner.$anonfun$doClean$6$adapted(LogCleaner.scala:395)
> at scala.collection.immutable.List.foreach(List.scala:389)
> at kafka.log.Cleaner.doClean(LogCleaner.scala:395)
> at kafka.log.Cleaner.clean(LogCleaner.scala:372)
> at 
> kafka.log.LogCleaner$CleanerThread.cleanOrSleep(LogCleaner.scala:263)
> at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:243)
> at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:64)
> [2018-03-28 11:14:58,601] INFO [kafka-log-cleaner-thread-0]: Stopped 
> (kafka.log.LogCleaner)
> [2018-04-04 14:25:12,773] INFO The cleaning for partition 
> __broker-11-health-check-0 is aborted and paused (kafka.log.LogCleaner)
> [2018-04-04 14:25:12,773] INFO Compaction for partition 
> __broker-11-health-check-0 is resumed (kafka.log.LogCleaner)
> [2018-04-04 14:25:12,774] INFO The cleaning for partition 
> __broker-11-health-check-0 is aborted (kafka.log.LogCleaner)
> [2018-04-04 14:25:22,850] INFO Shutting down the log cleaner. 
> (kafka.log.LogCleaner)
> [2018-04-04 14:25:22,850] INFO [kafka-log-cleaner-thread-0]: Shutting down 
> (kafka.log.LogCleaner)
> [2018-04-04 14:25:22,850] INFO [kafka-log-cleaner-thread-0]: Shutdown 
> completed (kafka.log.LogCleaner)
> {code}
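> For context on the exception above: the cleaner doubles its I/O buffers until 
> they reach the maximum message size, and gives up once a record batch still 
> does not fit. A simplified paraphrase of the failing check (not the verbatim 
> kafka.log.Cleaner source, whose growBuffers also reallocates the buffers):
> {code:java}
> // Sketch of the buffer-growth logic behind the IllegalStateException above.
> // maxIoBufferSize is derived from message.max.bytes (default 1000012), which
> // is why the cleaner thread dies when a record batch exceeds that limit.
> def growBuffers(currentCapacity: Int, maxIoBufferSize: Int): Int = {
>   if (currentCapacity >= maxIoBufferSize)
>     throw new IllegalStateException(
>       s"This log contains a message larger than maximum allowable size of $maxIoBufferSize.")
>   math.min(currentCapacity * 2, maxIoBufferSize) // double, but never past the cap
> }
> {code}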
> What we know so far is:
>  * We are unable to reproduce it yet in a consistent manner.
>  * It only happens in the PRO cluster and not in the PRE cluster for the same 
> customer (whose message payloads are very similar)
>  * Checking our Kafka logs, it only happened on the internal topics 
> *__consumer_offsets-**
>  * When we restart the broker process, the log-cleaner starts working again, 
> but it can take anywhere from 3 minutes to several hours to die again.
>  * We work around it by temporarily increasing the message.max.bytes and 
> replica.fetch.max.bytes values to 10485760 (10MB) from the default 1000012 (~1MB).
> ** Before message.max.bytes = 1
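> Where a broker restart is not an option, the message-size half of that 
> workaround can also be applied per topic at runtime: max.message.bytes is the 
> topic-level override of message.max.bytes. A hedged sketch using the 
> AdminClient API (available since 0.11; the bootstrap address is a placeholder):
> {code:java}
> import java.util.{Collections, Properties}
> import org.apache.kafka.clients.admin.{AdminClient, Config, ConfigEntry}
> import org.apache.kafka.common.config.ConfigResource
>
> object RaiseMaxMessageBytes extends App {
>   val props = new Properties()
>   props.put("bootstrap.servers", "broker1:9092") // placeholder address
>   val admin = AdminClient.create(props)
>   // Topic-level override of message.max.bytes, mirroring the 10485760 value above.
>   val resource = new ConfigResource(ConfigResource.Type.TOPIC, "__consumer_offsets")
>   val config = new Config(
>     Collections.singletonList(new ConfigEntry("max.message.bytes", "10485760")))
>   admin.alterConfigs(Collections.singletonMap(resource, config)).all().get()
>   admin.close()
> }
> {code}
> Note that replica.fetch.max.bytes is a broker-level setting, so the second 
> half of the workaround still requires a broker configuration change.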

[jira] [Comment Edited] (KAFKA-6582) Partitions get underreplicated, with a single ISR, and doesn't recover. Other brokers do not take over and we need to manually restart the broker.

2019-09-15 Thread Fangbin Sun (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-6582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16930261#comment-16930261
 ] 

Fangbin Sun edited comment on KAFKA-6582 at 9/16/19 6:55 AM:
-

Someone encountered a similar issue in version 2.1.1 (KAFKA-7870). Is the issue 
indeed resolved in 2.1.1?


was (Author: fangbin):
Someone encountered similar issue in version 2.1.1, KAFKA-7870, is the issue 
indeed resolved in 2.1.1?

> Partitions get underreplicated, with a single ISR, and doesn't recover. Other 
> brokers do not take over and we need to manually restart the broker.
> --
>
> Key: KAFKA-6582
> URL: https://issues.apache.org/jira/browse/KAFKA-6582
> Project: Kafka
>  Issue Type: Bug
>  Components: network
>Affects Versions: 1.0.0
> Environment: Ubuntu 16.04
> Linux kafka04 4.4.0-109-generic #132-Ubuntu SMP Tue Jan 9 19:52:39 UTC 2018 
> x86_64 x86_64 x86_64 GNU/Linux
> java version "9.0.1"
> Java(TM) SE Runtime Environment (build 9.0.1+11)
> Java HotSpot(TM) 64-Bit Server VM (build 9.0.1+11, mixed mode) 
> but also tried with the latest JVM 8 before with the same result.
>Reporter: Jurriaan Pruis
>Priority: Major
> Attachments: Screenshot 2019-01-18 at 13.08.17.png, Screenshot 
> 2019-01-18 at 13.16.59.png
>
>
> Partitions get underreplicated, with a single ISR, and don't recover. Other 
> brokers do not take over, and we need to manually restart the 'single ISR' 
> broker (if you describe the partitions of a replicated topic, it is clear that 
> some partitions are in sync only on this broker).
> This bug closely resembles KAFKA-4477, but since that issue is marked as 
> resolved, this is probably something else, though similar.
> We have the same issue (or at least it looks pretty similar) on Kafka 1.0. 
> We've had these issues since upgrading to Kafka 1.0 in November 2017 (we 
> upgraded from Kafka 0.10.2.1).
> This happens almost every 24-48 hours on a random broker, which is why we 
> currently have a cronjob that restarts every broker every 24 hours. 
> During this issue the ISR shows the following server log: 
> {code:java}
> [2018-02-20 12:02:08,342] WARN Attempting to send response via channel for 
> which there is no open connection, connection id 
> 10.132.0.32:9092-10.14.148.20:56352-96708 (kafka.network.Processor)
> [2018-02-20 12:02:08,364] WARN Attempting to send response via channel for 
> which there is no open connection, connection id 
> 10.132.0.32:9092-10.14.150.25:54412-96715 (kafka.network.Processor)
> [2018-02-20 12:02:08,349] WARN Attempting to send response via channel for 
> which there is no open connection, connection id 
> 10.132.0.32:9092-10.14.149.18:35182-96705 (kafka.network.Processor)
> [2018-02-20 12:02:08,379] WARN Attempting to send response via channel for 
> which there is no open connection, connection id 
> 10.132.0.32:9092-10.14.150.25:54456-96717 (kafka.network.Processor)
> [2018-02-20 12:02:08,448] WARN Attempting to send response via channel for 
> which there is no open connection, connection id 
> 10.132.0.32:9092-10.14.159.20:36388-96720 (kafka.network.Processor)
> [2018-02-20 12:02:08,683] WARN Attempting to send response via channel for 
> which there is no open connection, connection id 
> 10.132.0.32:9092-10.14.157.110:41922-96740 (kafka.network.Processor)
> {code}
> Also on the ISR broker, the controller log shows this:
> {code:java}
> [2018-02-20 12:02:14,927] INFO [Controller-3-to-broker-3-send-thread]: 
> Controller 3 connected to 10.132.0.32:9092 (id: 3 rack: null) for sending 
> state change requests (kafka.controller.RequestSendThread)
> [2018-02-20 12:02:14,927] INFO [Controller-3-to-broker-0-send-thread]: 
> Controller 3 connected to 10.132.0.10:9092 (id: 0 rack: null) for sending 
> state change requests (kafka.controller.RequestSendThread)
> [2018-02-20 12:02:14,928] INFO [Controller-3-to-broker-1-send-thread]: 
> Controller 3 connected to 10.132.0.12:9092 (id: 1 rack: null) for sending 
> state change requests (kafka.controller.RequestSendThread){code}
> And the non-ISR brokers show these kind of errors:
>  
> {code:java}
> [2018-02-20 12:02:29,204] WARN [ReplicaFetcher replicaId=1, leaderId=3, 
> fetcherId=0] Error in fetch to broker 3, request (type=FetchRequest, 
> replicaId=1, maxWait=500, minBytes=1, maxBytes=10485760, 
> fetchData={..}, isolationLevel=READ_UNCOMMITTED) 
> (kafka.server.ReplicaFetcherThread)
> java.io.IOException: Connection to 3 was disconnected before the response was 
> read
>  at 
> org.apache.kafka.clients.NetworkClientUtils.sendAndReceive(NetworkClientUtils.java:95)
>  at 
> kafka.server.ReplicaFetcherBlockingSend.sendRequest(ReplicaFetcherBlockingS

[jira] [Comment Edited] (KAFKA-6582) Partitions get underreplicated, with a single ISR, and doesn't recover. Other brokers do not take over and we need to manually restart the broker.

2019-09-15 Thread Fangbin Sun (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-6582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16930261#comment-16930261
 ] 

Fangbin Sun edited comment on KAFKA-6582 at 9/16/19 6:54 AM:
-

Someone encountered a similar issue in version 2.1.1 (KAFKA-7870). Is the issue 
indeed resolved in 2.1.1?


was (Author: fangbin):
Someone encountered similar issue in 
[KAFKA-7870|https://issues.apache.org/jira/browse/KAFKA-7870], is the issue 
indeed resolved in 2.1.1?


[jira] [Commented] (KAFKA-6582) Partitions get underreplicated, with a single ISR, and doesn't recover. Other brokers do not take over and we need to manually restart the broker.

2019-09-15 Thread Fangbin Sun (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-6582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16930261#comment-16930261
 ] 

Fangbin Sun commented on KAFKA-6582:


Someone encountered a similar issue in 
[KAFKA-7870|https://issues.apache.org/jira/browse/KAFKA-7870]. Is the issue 
indeed resolved in 2.1.1?


[jira] [Created] (KAFKA-7735) StateChangeLogMerger tool can not work due to incorrect topic regular matches

2018-12-14 Thread Fangbin Sun (JIRA)
Fangbin Sun created KAFKA-7735:
--

 Summary: StateChangeLogMerger tool can not work due to incorrect 
topic regular matches
 Key: KAFKA-7735
 URL: https://issues.apache.org/jira/browse/KAFKA-7735
 Project: Kafka
  Issue Type: Bug
  Components: tools
Affects Versions: 2.0.0
Reporter: Fangbin Sun


When the StateChangeLogMerger tool tries to obtain a topic's state-change log, 
it returns nothing.
{code:java}
bin/kafka-run-class.sh com.cmss.kafka.api.StateChangeLogMerger --logs 
state-change.log --topic test{code}
This tool uses a topic partition regex as follows:
{code:java}
val topicPartitionRegex = new Regex("\\[(" + Topic.LEGAL_CHARS + "+),( 
)*([0-9]+)\\]"){code}
However, the state-change log is no longer written in the above format. For 
example, in 0.10.2.0 some state-change logs were printed via the case class 
TopicAndPartition, whose toString is overridden as follows:
{code:java}
override def toString = "[%s,%d]".format(topic, partition){code}
In newer versions (e.g. 1.0.0+) most state-change logs are printed in the form 
of "partition $topic-$partition". As a workaround, one can modify the topic 
partition regex like:
{code:java}
val topicPartitionRegex = new Regex("(partition " + Topic.LEGAL_CHARS + 
"+)-([0-9]+)"){code}
and extract the topic with "matcher.group(1).substring(10)"; however, some of 
the state-change output might be slightly redundant.
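
For illustration, a self-contained check of that workaround against a 1.0-style 
log line (the Topic.LEGAL_CHARS pattern is inlined here as an assumption, and 
the sample line is made up):
{code:java}
import scala.util.matching.Regex

object TopicPartitionMatch extends App {
  val legalChars = "[a-zA-Z0-9\\._\\-]" // assumed equivalent of Topic.LEGAL_CHARS
  val topicPartitionRegex = new Regex("(partition " + legalChars + "+)-([0-9]+)")
  // Illustrative 1.0-style state-change line.
  val line = "TRACE Broker 1 started fetcher to new leader for partition test-0"
  topicPartitionRegex.findFirstMatchIn(line).foreach { m =>
    val topic = m.group(1).substring(10) // drop the 10-character "partition " prefix
    val partition = m.group(2).toInt
    println(s"topic=$topic, partition=$partition") // prints: topic=test, partition=0
  }
}
{code}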





[jira] [Comment Edited] (KAFKA-7262) Ability to support heterogeneous storage in Kafka

2018-08-08 Thread fangbin sun (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16574282#comment-16574282
 ] 

fangbin sun edited comment on KAFKA-7262 at 8/9/18 5:29 AM:


In addition, what happens when one of the HDDs fails? Could Kafka itself move 
the general topic-partitions to SSD-mounted directories? Is there an internal 
request or mechanism in Kafka that might lead to this situation? Many thanks!


was (Author: fangbin):
In addition, how it goes when one of HDD fails, could Kafka itself move the 
general topic-partitions to SSD-mounted directories? Is it possible that any 
requests or mechanisms in Kafka could lead to this situation? Many thanks!

> Ability to support heterogeneous storage in Kafka
> -
>
> Key: KAFKA-7262
> URL: https://issues.apache.org/jira/browse/KAFKA-7262
> Project: Kafka
>  Issue Type: Improvement
>  Components: log
>Affects Versions: 2.0.0
>Reporter: fangbin sun
>Priority: Major
>
> Currently we have a batch of servers; each broker has one SSD (1.5T) and ten 
> HDDs (10T). As far as I know, Kafka itself doesn't know much about the 
> underlying hardware: it chooses the directory with the least number of 
> partitions when creating a topic-partition.
> Is it possible to deploy a heterogeneous cluster that takes advantage of the 
> SSDs despite their smaller capacity? I hope you will all consider this. Any 
> insights or guidance would be greatly appreciated.
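> For reference, the placement rule described above is roughly the following (a 
> simplified paraphrase of LogManager's directory selection, not the verbatim 
> source; note that disk type never enters the decision):
> {code:java}
> import java.io.File
>
> // Every configured log.dirs entry is a candidate; the directory currently
> // holding the fewest partition directories wins, whether it is SSD or HDD.
> def nextLogDir(liveLogDirs: Seq[File], logCountByDir: Map[String, Int]): File =
>   liveLogDirs.minBy(dir => logCountByDir.getOrElse(dir.getPath, 0))
> {code}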





[jira] [Commented] (KAFKA-7262) Ability to support heterogeneous storage in Kafka

2018-08-08 Thread fangbin sun (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16574282#comment-16574282
 ] 

fangbin sun commented on KAFKA-7262:


In addition, what happens when one of the HDDs fails? Could Kafka itself move 
the general topic-partitions to SSD-mounted directories? Is it possible that 
any requests or mechanisms in Kafka could lead to this situation? Many thanks!






[jira] [Commented] (KAFKA-7262) Ability to support heterogeneous storage in Kafka

2018-08-08 Thread fangbin sun (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16574281#comment-16574281
 ] 

fangbin sun commented on KAFKA-7262:


[~huxi_2b] Maybe we need "topic isolation": key topic-partitions would be 
placed in SSD-mounted directories and the others in HDD-mounted directories. 
Since Kafka ensures that the number of partitions is evenly distributed across 
the disks, a large number of partitions would have to be moved back and forth 
between the SSD and the HDDs. That is a huge amount of work, especially with 
numerous brokers and partitions.
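
To make the proposal concrete, a purely hypothetical sketch of such a placement 
policy (Kafka has nothing like this built in; keyTopics, ssdDirs and hddDirs 
are made-up inputs):
{code:java}
import java.io.File

// Hypothetical "topic isolation": key topics land in SSD-backed log dirs, the
// rest in HDD-backed ones; within a tier, the fewest-partitions rule is kept.
def pickLogDir(topic: String,
               ssdDirs: Seq[File], hddDirs: Seq[File],
               keyTopics: Set[String],
               partitionCount: File => Int): File = {
  val tier = if (keyTopics.contains(topic)) ssdDirs else hddDirs
  tier.minBy(partitionCount)
}
{code}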






[jira] [Created] (KAFKA-7262) Ability to support heterogeneous storage in Kafka

2018-08-08 Thread fangbin sun (JIRA)
fangbin sun created KAFKA-7262:
--

 Summary: Ability to support heterogeneous storage in Kafka
 Key: KAFKA-7262
 URL: https://issues.apache.org/jira/browse/KAFKA-7262
 Project: Kafka
  Issue Type: Improvement
  Components: log
Affects Versions: 2.0.0
Reporter: fangbin sun


Currently we have a batch of servers; each broker has one SSD (1.5T) and ten 
HDDs (10T). As far as I know, Kafka itself doesn't know much about the 
underlying hardware: it chooses the directory with the least number of 
partitions when creating a topic-partition.

Is it possible to deploy a heterogeneous cluster that takes advantage of the 
SSDs despite their smaller capacity? I hope you will all consider this. Any 
insights or guidance would be greatly appreciated.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)