Re: [DISCUSS] KIP-49: Fair Partition Assignment Strategy
Thanks for the feedback. I have added a concrete example to the document that I think illustrates the benefit relatively well.

The observation about scaling the workload of individual consumers is certainly valid. I had not really considered this. Our primary concern is being able to gradually roll out consumption configuration changes in a minimally disruptive fashion, including load-balancing. If the round robin strategy can be enhanced to adequately handle that use case, we would be happy. Is there a Jira open for the "flaw" that you mentioned?

On 2/26/16, 7:22 PM, "Joel Koshy" <jjkosh...@gmail.com> wrote:

>Hi Andrew,
>
>Thanks for the wiki. Just a couple of comments:
>
> - The disruptive config change issue that you mentioned is pretty much a
>   non-issue in the new consumer due to central assignment.
> - Optional: but it may be helpful to add a concrete example.
> - More of an orthogonal observation than a comment: with heavily skewed
>   subscriptions fairness is sort of moot. i.e., people would generally scale
>   up or down subscription counts with the express purpose of
>   reducing/increasing load on those instances.
> - WRT roundrobin we later realized a significant flaw in the way we lay
>   out partitions: we originally wanted to randomize the partition layout to
>   reduce the likelihood of most partitions of the same topic from ending up
>   on a given consumer which is important if you have a few very large topics.
>   Unfortunately we used hashCode - which does a splendid job of clumping
>   partitions from the same topic together :( We can probably just "fix" that
>   in the new consumer's roundrobin assignor.
>
>Thanks,
>
>Joel
>
>
>On Fri, Feb 26, 2016 at 2:32 PM, Olson,Andrew <aols...@cerner.com> wrote:
>
>> Here is a proposal for a new partition assignment strategy,
>>
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-49+-+Fair+Partition+Assignment+Strategy
>>
>> This KIP corresponds to these two pending pull requests,
>> https://github.com/apache/kafka/pull/146
>> https://github.com/apache/kafka/pull/979
>>
>> thanks,
>> Andrew
>>
>> CONFIDENTIALITY NOTICE This message and any included attachments are from
>> Cerner Corporation and are intended only for the addressee. The information
>> contained in this message is confidential and may constitute inside or
>> non-public information under international, federal, or state securities
>> laws. Unauthorized forwarding, printing, copying, distribution, or use of
>> such information is strictly prohibited and may be unlawful. If you are not
>> the addressee, please promptly delete this message and notify the sender of
>> the delivery error by e-mail or you may call Cerner's corporate offices in
>> Kansas City, Missouri, U.S.A at (+1) (816)221-1024.
>>
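To make the roundrobin layout concern in the quoted message concrete, here is a toy Python sketch. It is not Kafka's assignor code. It assumes, purely for illustration, that each consumer owns several adjacent slots in the round-robin order (for example one slot per consumer thread); the names round_robin, per_topic_counts, slots, and the topics "big" and "small" are all hypothetical.

```python
import random
from collections import Counter


def round_robin(partitions, slots):
    """Deal partitions to slots in order; each slot is a (consumer, thread) pair."""
    owner = {}
    for i, tp in enumerate(partitions):
        consumer, _thread = slots[i % len(slots)]
        owner[tp] = consumer
    return owner


def per_topic_counts(owner, topic):
    """Count how many partitions of `topic` each consumer received."""
    return Counter(c for (t, _p), c in owner.items() if t == topic)


# Assumption: two consumers with four adjacent slots each.
slots = [(c, t) for c in ("c1", "c2") for t in range(4)]

# A layout that keeps each topic's partitions next to each other ("clumped").
partitions = [(topic, p) for topic in ("big", "small") for p in range(4)]
clumped = round_robin(partitions, slots)
print("clumped :", dict(per_topic_counts(clumped, "big")))

# A shuffled layout usually mixes topics, although any one shuffle can still be uneven.
shuffled = partitions[:]
random.Random(0).shuffle(shuffled)
mixed = round_robin(shuffled, slots)
print("shuffled:", dict(per_topic_counts(mixed, "big")))
```

In this toy setup the clumped layout hands every partition of "big" to c1, which is the kind of outcome the randomized layout was meant to make less likely.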
[DISCUSS] KIP-49: Fair Partition Assignment Strategy
Here is a proposal for a new partition assignment strategy,

https://cwiki.apache.org/confluence/display/KAFKA/KIP-49+-+Fair+Partition+Assignment+Strategy

This KIP corresponds to these two pending pull requests,
https://github.com/apache/kafka/pull/146
https://github.com/apache/kafka/pull/979

thanks,
Andrew
[jira] [Commented] (KAFKA-2189) Snappy compression of message batches less efficient in 0.8.2.1
[ https://issues.apache.org/jira/browse/KAFKA-2189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14540351#comment-14540351 ]

Olson,Andrew commented on KAFKA-2189:
-------------------------------------

I have verified that the issue [1] was introduced in snappy-java 1.1.1.2 and has already been fixed [2], in snappy-java 1.1.1.7.

[1] https://github.com/xerial/snappy-java/issues/100
[2] https://github.com/xerial/snappy-java/commit/dc2dd27f85e5167961883f71ac2681b73b33e5df

Snappy compression of message batches less efficient in 0.8.2.1
---------------------------------------------------------------

Key: KAFKA-2189
URL: https://issues.apache.org/jira/browse/KAFKA-2189
Project: Kafka
Issue Type: Bug
Components: log
Affects Versions: 0.8.2.1
Reporter: Olson,Andrew
Assignee: Jay Kreps

We are using snappy compression and noticed a fairly substantial increase (about 2.25x) in log filesystem space consumption after upgrading a Kafka cluster from 0.8.1.1 to 0.8.2.1. We found that this is caused by messages being seemingly recompressed individually (or possibly with a much smaller buffer or dictionary?) instead of as a batch as sent by producers. We eventually tracked down the change in compression ratio/scope to this [1] commit that updated the snappy version from 1.0.5 to 1.1.1.3. The Kafka client version does not appear to be relevant as we can reproduce this with both the 0.8.1.1 and 0.8.2.1 Producer.

Here are the log files from our troubleshooting that contain the same set of 1000 messages, for batch sizes of 1, 10, 100, and 1000. f9d9b was the last commit with 0.8.1.1-like behavior prior to f5ab8 introducing the issue.

{noformat}
-rw-rw-r-- 1 kafka kafka 404967 May 12 11:45 /var/kafka2/f9d9b-batch-1-0/.log
-rw-rw-r-- 1 kafka kafka 119951 May 12 11:45 /var/kafka2/f9d9b-batch-10-0/.log
-rw-rw-r-- 1 kafka kafka 89645 May 12 11:45 /var/kafka2/f9d9b-batch-100-0/.log
-rw-rw-r-- 1 kafka kafka 88279 May 12 11:45 /var/kafka2/f9d9b-batch-1000-0/.log
-rw-rw-r-- 1 kafka kafka 402837 May 12 11:41 /var/kafka2/f5ab8-batch-1-0/.log
-rw-rw-r-- 1 kafka kafka 382437 May 12 11:41 /var/kafka2/f5ab8-batch-10-0/.log
-rw-rw-r-- 1 kafka kafka 364791 May 12 11:41 /var/kafka2/f5ab8-batch-100-0/.log
-rw-rw-r-- 1 kafka kafka 380693 May 12 11:41 /var/kafka2/f5ab8-batch-1000-0/.log
{noformat}

[1] https://github.com/apache/kafka/commit/f5ab8e1780cf80f267906e3259ad4f9278c32d28

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
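As a quick reading aid for the listing above, the following Python sketch computes how much larger each f5ab8 log is than the f9d9b log for the same batch size. The byte counts are copied from the listing; the names sizes, growth, and ratios are only illustrative.

```python
# Sizes in bytes, copied from the listing: batch size -> (f9d9b, f5ab8).
sizes = {
    1: (404967, 402837),
    10: (119951, 382437),
    100: (89645, 364791),
    1000: (88279, 380693),
}


def growth(before, after):
    """Return how many times larger `after` is than `before`."""
    return after / before


ratios = {batch: growth(*pair) for batch, pair in sizes.items()}
for batch, ratio in ratios.items():
    print(f"batch {batch:>4}: f5ab8 log is {ratio:.2f}x the f9d9b log")
```

With these numbers the batch-1 logs are nearly identical in size, while the f5ab8 logs for batches of 10, 100, and 1000 come out roughly 3.2x, 4.1x, and 4.3x larger. That fits the report that batching stopped improving the compression ratio after f5ab8.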
[jira] [Created] (KAFKA-2189) Snappy compression of message batches less efficient in 0.8.2.1
Olson,Andrew created KAFKA-2189:
--------------------------------

Summary: Snappy compression of message batches less efficient in 0.8.2.1
Key: KAFKA-2189
URL: https://issues.apache.org/jira/browse/KAFKA-2189
Project: Kafka
Issue Type: Bug
Components: log
Affects Versions: 0.8.2.1
Reporter: Olson,Andrew
Assignee: Jay Kreps
[jira] [Updated] (KAFKA-2189) Snappy compression of message batches less efficient in 0.8.2.1
[ https://issues.apache.org/jira/browse/KAFKA-2189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Olson,Andrew updated KAFKA-2189:
-------------------------------

Component/s: compression
             build

Snappy compression of message batches less efficient in 0.8.2.1
---------------------------------------------------------------

Key: KAFKA-2189
URL: https://issues.apache.org/jira/browse/KAFKA-2189
Project: Kafka
Issue Type: Bug
Components: build, compression, log
Affects Versions: 0.8.2.1
Reporter: Olson,Andrew
Assignee: Jay Kreps
[jira] [Commented] (KAFKA-493) High CPU usage on inactive server
[ https://issues.apache.org/jira/browse/KAFKA-493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14311032#comment-14311032 ]

Olson,Andrew commented on KAFKA-493:
------------------------------------

I will be out of the office until Monday, 3/2/2015.

Andrew Olson | Sr. Software Architect | Cerner Corporation | 816.201.3825 | aols...@cerner.com | www.cerner.com

High CPU usage on inactive server
---------------------------------

Key: KAFKA-493
URL: https://issues.apache.org/jira/browse/KAFKA-493
Project: Kafka
Issue Type: Bug
Components: core
Affects Versions: 0.8.0
Reporter: Jay Kreps
Fix For: 0.9.0
Attachments: Kafka-2014-11-10.snapshot.zip, Kafka-sampling1.zip, Kafka-sampling2.zip, Kafka-sampling3.zip, Kafka-trace1.zip, Kafka-trace2.zip, Kafka-trace3.zip, backtraces.txt, stacktrace.txt

I've been playing with the 0.8 branch of Kafka and noticed that idle CPU usage is fairly high (13% of a core). Is that to be expected? I did look at the stack, but didn't see anything obvious. A background task?

I wanted to mention how I am getting into this state. I've set up two machines with the latest 0.8 code base and am using a replication factor of 2. On starting the brokers there is no idle CPU activity. Then I run a test that essentially does 10k publish operations followed by immediate consume operations (I was measuring latency).
Once this has run the kafka nodes seem to consistently be consuming CPU essentially forever.

hprof results:

THREAD START (obj=53ae, id = 24, name=RMI TCP Accept-0, group=system)
THREAD START (obj=53ae, id = 25, name=RMI TCP Accept-, group=system)
THREAD START (obj=53ae, id = 26, name=RMI TCP Accept-0, group=system)
THREAD START (obj=53ae, id = 21, name=main, group=main)
THREAD START (obj=53ae, id = 27, name=Thread-2, group=main)
THREAD START (obj=53ae, id = 28, name=Thread-3, group=main)
THREAD START (obj=53ae, id = 29, name=kafka-processor-9092-0, group=main)
THREAD START (obj=53ae, id = 200010, name=kafka-processor-9092-1, group=main)
THREAD START (obj=53ae, id = 200011, name=kafka-acceptor, group=main)
THREAD START (obj=574b, id = 200012, name=ZkClient-EventThread-20-localhost:2181, group=main)
THREAD START (obj=576e, id = 200014, name=main-SendThread(), group=main)
THREAD START (obj=576d, id = 200013, name=main-EventThread, group=main)
THREAD START (obj=53ae, id = 200015, name=metrics-meter-tick-thread-1, group=main)
THREAD START (obj=53ae, id = 200016, name=metrics-meter-tick-thread-2, group=main)
THREAD START (obj=53ae, id = 200017, name=request-expiration-task, group=main)
THREAD START (obj=53ae, id = 200018, name=request-expiration-task, group=main)
THREAD START (obj=53ae, id = 200019, name=kafka-request-handler-0, group=main)
THREAD START (obj=53ae, id = 200020, name=kafka-request-handler-1, group=main)
THREAD START (obj=53ae, id = 200021, name=Thread-6, group=main)
THREAD START (obj=53ae, id = 200022, name=Thread-7, group=main)
THREAD START (obj=5899, id = 200023, name=ReplicaFetcherThread-0-2 on broker 1, , group=main)
THREAD START (obj=5899, id = 200024, name=ReplicaFetcherThread-0-3 on broker 1, , group=main)
THREAD START (obj=5899, id = 200025, name=ReplicaFetcherThread-0-0 on broker 1, , group=main)
THREAD START (obj=5899, id = 200026, name=ReplicaFetcherThread-0-1 on broker 1, , group=main)
THREAD START (obj=53ae, id = 200028, name=SIGINT handler, group=system)
THREAD START (obj=53ae, id = 200029, name=Thread-5, group=main)
THREAD START (obj=574b, id = 200030, name=Thread-1, group=main)
THREAD START (obj=574b, id = 200031, name=Thread-0, group=main)
THREAD END (id = 200031)
THREAD END (id = 200029)
THREAD END (id = 200020)
THREAD END (id = 200019)
THREAD END (id = 28)
THREAD END (id = 200021)
THREAD END (id = 27)
THREAD END (id = 200022)
THREAD END (id = 200018)
THREAD END (id = 200017)
THREAD END (id = 200012)
THREAD END (id = 200013)
THREAD END (id = 200014)
THREAD END (id = 200025)
THREAD END
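For readers who want to see which threads in a trace like the one above never logged an end record, here is a small Python sketch. It assumes only the two record shapes visible in the trace, THREAD START (...) and THREAD END (id = N); the function name live_threads and the sample string are illustrative. The trace in this message is truncated, so running it on that text would give an incomplete answer.

```python
import re

# Patterns for the two record shapes that appear in the trace.
START = re.compile(r"THREAD START \(obj=\w+, id = (\d+), name=(.*?), group=(\w+)\)")
END = re.compile(r"THREAD END \(id = (\d+)\)")


def live_threads(hprof_text):
    """Return {thread id: name} for threads with a START record but no END record."""
    started = {
        int(m.group(1)): m.group(2).rstrip(" ,")  # some names carry a trailing ", "
        for m in START.finditer(hprof_text)
    }
    ended = {int(m.group(1)) for m in END.finditer(hprof_text)}
    return {tid: name for tid, name in started.items() if tid not in ended}


# A few records copied from the trace, joined the way they appear in the message.
sample = (
    "THREAD START (obj=53ae, id = 21, name=main, group=main) "
    "THREAD START (obj=53ae, id = 27, name=Thread-2, group=main) "
    "THREAD START (obj=5899, id = 200023, "
    "name=ReplicaFetcherThread-0-2 on broker 1, , group=main) "
    "THREAD END (id = 27)"
)
print(live_threads(sample))
```

A record without an id, such as the cut-off final THREAD END in the message, is simply ignored by the END pattern.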
[jira] [Commented] (KAFKA-982) Logo for Kafka
[ https://issues.apache.org/jira/browse/KAFKA-982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13719319#comment-13719319 ]

Olson,Andrew commented on KAFKA-982:
------------------------------------

I will be out of the office without access to email until Tuesday, 8/13/2013. For urgent issues please contact Greg Whitsitt.

Andrew Olson | Sr. Software Architect | Cerner Corporation | 816.201.3825 | aols...@cerner.com | www.cerner.com

Logo for Kafka
--------------

Key: KAFKA-982
URL: https://issues.apache.org/jira/browse/KAFKA-982
Project: Kafka
Issue Type: Improvement
Reporter: Jay Kreps
Attachments: 289.jpeg, 294.jpeg, 296.png, 298.jpeg, 301.png, 302.png

We should have a logo for kafka.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira