Re: librdkafka 0.8.0 released
The following tests were run using a single producer application (examples/rdkafka_performance):

* Test1: 2 brokers, 2 partitions, required.acks=2, 100 byte messages: 850,000 messages/second, 85 MB/second
* Test2: 1 broker, 1 partition, required.acks=0, 100 byte messages: 710,000 messages/second, 71 MB/second
* Test3: 2 brokers, 2 partitions, required.acks=2, 100 byte messages, snappy compression: 300,000 messages/second, 30 MB/second
* Test4: 2 brokers, 2 partitions, required.acks=2, 100 byte messages, gzip compression: 230,000 messages/second, 23 MB/second

The log.flush broker configuration was increased to avoid the disk being the bottleneck.

/Magnus

2013/11/24 Neha Narkhede neha.narkh...@gmail.com

So, a single producer's throughput is 80 MB/s? That seems pretty high. What was the acks setting? Thanks for sharing these numbers.

On Sunday, November 24, 2013, Magnus Edenhill wrote:

Hi Neha, these tests were done using 100 byte messages. More information about the producer performance tests can be found here:
https://github.com/edenhill/librdkafka/blob/master/INTRODUCTION.md#performance-numbers

The tests are indicative at best and in no way scientific, but I must say that the Kafka broker performance is impressive.

Regards, Magnus

2013/11/22 Neha Narkhede neha.narkh...@gmail.com

Thanks for sharing this! What is the message size for the throughput numbers stated below?

Thanks, Neha

On Nov 22, 2013 6:59 AM, Magnus Edenhill mag...@edenhill.se wrote:

This announces the 0.8.0 release of librdkafka - the Apache Kafka client C library - now with 0.8 protocol support.

Features:
* Producer (~800K msgs/s)
* Consumer (~3M msgs/s)
* Compression (Snappy, gzip)
* Proper failover and leader re-election support - no message is ever lost.
* Configuration properties compatible with official Apache Kafka.
* Stabilized ABI-safe API
* Mainline Debian package submitted
* Production quality

Home: https://github.com/edenhill/librdkafka
Introduction and performance numbers: https://github.com/edenhill/librdkafka/blob/master/INTRODUCTION.md

Have fun.

Regards, Magnus

P.S. Check out Wikimedia Foundation's varnishkafka daemon for a use case - varnish log forwarding over Kafka: https://github.com/wikimedia/varnishkafka
[jira] [Created] (KAFKA-1144) commitOffsets can be passed the offsets to commit
Imran Rashid created KAFKA-1144:
-----------------------------------

Summary: commitOffsets can be passed the offsets to commit
Key: KAFKA-1144
URL: https://issues.apache.org/jira/browse/KAFKA-1144
Project: Kafka
Issue Type: Improvement
Components: consumer
Reporter: Imran Rashid
Assignee: Neha Narkhede

This adds another version of commitOffsets that takes the offsets to commit as a parameter. Without this change, getting correct user code is very hard. Despite Kafka's at-least-once guarantees, most user code doesn't actually have that guarantee, and is almost certainly wrong if it does batch processing. Getting it right requires some very careful synchronization between all consumer threads, which is both 1) painful to get right and 2) slow, because of the need to stop all workers during a commit. This small change simplifies a lot of this.

This was discussed extensively on the user mailing list, on the thread "are kafka consumer apps guaranteed to see msgs at least once?". You can also see an example implementation of a user API that makes use of this to get proper at-least-once guarantees in user code, even for batches: https://github.com/quantifind/kafka-utils/pull/1

I'm open to any suggestions on how to add unit tests for this.

--
This message was sent by Atlassian JIRA (v6.1#6144)
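For readers coming from the existing 0.8 API: ConsumerConnector.commitOffsets takes no arguments and commits the latest consumed position of every claimed partition. The sketch below only illustrates the kind of overload being proposed here and how batch-processing user code might drive it; the exact signature and the ProcessedOffsets helper are hypothetical, not taken from the attached patches.

  import kafka.common.TopicAndPartition

  // Hypothetical shape of the proposed overload (the attached patches may differ):
  //   def commitOffsets(offsets: Map[TopicAndPartition, Long]): Unit
  //
  // Illustrative helper: worker threads record the highest offset they have fully
  // processed per partition, and only those offsets are ever committed. The committing
  // thread never has to pause the workers, because anything not yet processed is
  // simply absent from the map it commits.
  class ProcessedOffsets {
    private val done = scala.collection.mutable.Map.empty[TopicAndPartition, Long]

    def markProcessed(tp: TopicAndPartition, offset: Long): Unit = synchronized {
      if (done.getOrElse(tp, -1L) < offset) done(tp) = offset
    }

    def snapshot(): Map[TopicAndPartition, Long] = synchronized { done.toMap }
  }

  // A commit thread could then periodically call, e.g.:
  //   connector.commitOffsets(processed.snapshot())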
[jira] [Updated] (KAFKA-1144) commitOffsets can be passed the offsets to commit
[ https://issues.apache.org/jira/browse/KAFKA-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Imran Rashid updated KAFKA-1144:
--------------------------------
Attachment: 0002-add-protection-against-backward-commits.patch
            0001-allow-committing-of-arbitrary-offsets-to-facilitate-.patch

commitOffsets can be passed the offsets to commit
-------------------------------------------------

Key: KAFKA-1144
URL: https://issues.apache.org/jira/browse/KAFKA-1144
Project: Kafka
Issue Type: Improvement
Components: consumer
Affects Versions: 0.8
Reporter: Imran Rashid
Assignee: Neha Narkhede
Attachments: 0001-allow-committing-of-arbitrary-offsets-to-facilitate-.patch, 0002-add-protection-against-backward-commits.patch

This adds another version of commitOffsets that takes the offsets to commit as a parameter. Without this change, getting correct user code is very hard. Despite Kafka's at-least-once guarantees, most user code doesn't actually have that guarantee, and is almost certainly wrong if it does batch processing. Getting it right requires some very careful synchronization between all consumer threads, which is both 1) painful to get right and 2) slow, because of the need to stop all workers during a commit. This small change simplifies a lot of this.

This was discussed extensively on the user mailing list, on the thread "are kafka consumer apps guaranteed to see msgs at least once?". You can also see an example implementation of a user API that makes use of this to get proper at-least-once guarantees in user code, even for batches: https://github.com/quantifind/kafka-utils/pull/1

I'm open to any suggestions on how to add unit tests for this.

--
This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (KAFKA-1144) commitOffsets can be passed the offsets to commit
[ https://issues.apache.org/jira/browse/KAFKA-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Imran Rashid updated KAFKA-1144:
--------------------------------
Affects Version/s: 0.8
Status: Patch Available (was: Open)

commitOffsets can be passed the offsets to commit
-------------------------------------------------

Key: KAFKA-1144
URL: https://issues.apache.org/jira/browse/KAFKA-1144
Project: Kafka
Issue Type: Improvement
Components: consumer
Affects Versions: 0.8
Reporter: Imran Rashid
Assignee: Neha Narkhede
Attachments: 0001-allow-committing-of-arbitrary-offsets-to-facilitate-.patch, 0002-add-protection-against-backward-commits.patch

This adds another version of commitOffsets that takes the offsets to commit as a parameter. Without this change, getting correct user code is very hard. Despite Kafka's at-least-once guarantees, most user code doesn't actually have that guarantee, and is almost certainly wrong if it does batch processing. Getting it right requires some very careful synchronization between all consumer threads, which is both 1) painful to get right and 2) slow, because of the need to stop all workers during a commit. This small change simplifies a lot of this.

This was discussed extensively on the user mailing list, on the thread "are kafka consumer apps guaranteed to see msgs at least once?". You can also see an example implementation of a user API that makes use of this to get proper at-least-once guarantees in user code, even for batches: https://github.com/quantifind/kafka-utils/pull/1

I'm open to any suggestions on how to add unit tests for this.

--
This message was sent by Atlassian JIRA (v6.1#6144)
kafka pull request: commitOffsets can be passed the offsets to commit
Github user squito closed the pull request at: https://github.com/apache/kafka/pull/10
Re: librdkafka 0.8.0 released
Thanks for sharing the results. Was the topic created with a replication factor of 2? Could you test acks=-1 as well?

Thanks,
Jun

On Mon, Nov 25, 2013 at 4:30 AM, Magnus Edenhill mag...@edenhill.se wrote:

The following tests were run using a single producer application (examples/rdkafka_performance):

* Test1: 2 brokers, 2 partitions, required.acks=2, 100 byte messages: 850,000 messages/second, 85 MB/second
* Test2: 1 broker, 1 partition, required.acks=0, 100 byte messages: 710,000 messages/second, 71 MB/second
* Test3: 2 brokers, 2 partitions, required.acks=2, 100 byte messages, snappy compression: 300,000 messages/second, 30 MB/second
* Test4: 2 brokers, 2 partitions, required.acks=2, 100 byte messages, gzip compression: 230,000 messages/second, 23 MB/second

The log.flush broker configuration was increased to avoid the disk being the bottleneck.

/Magnus

2013/11/24 Neha Narkhede neha.narkh...@gmail.com

So, a single producer's throughput is 80 MB/s? That seems pretty high. What was the acks setting? Thanks for sharing these numbers.

On Sunday, November 24, 2013, Magnus Edenhill wrote:

Hi Neha, these tests were done using 100 byte messages. More information about the producer performance tests can be found here:
https://github.com/edenhill/librdkafka/blob/master/INTRODUCTION.md#performance-numbers

The tests are indicative at best and in no way scientific, but I must say that the Kafka broker performance is impressive.

Regards, Magnus

2013/11/22 Neha Narkhede neha.narkh...@gmail.com

Thanks for sharing this! What is the message size for the throughput numbers stated below?

Thanks, Neha

On Nov 22, 2013 6:59 AM, Magnus Edenhill mag...@edenhill.se wrote:

This announces the 0.8.0 release of librdkafka - the Apache Kafka client C library - now with 0.8 protocol support.

Features:
* Producer (~800K msgs/s)
* Consumer (~3M msgs/s)
* Compression (Snappy, gzip)
* Proper failover and leader re-election support - no message is ever lost.
* Configuration properties compatible with official Apache Kafka.
* Stabilized ABI-safe API
* Mainline Debian package submitted
* Production quality

Home: https://github.com/edenhill/librdkafka
Introduction and performance numbers: https://github.com/edenhill/librdkafka/blob/master/INTRODUCTION.md

Have fun.

Regards, Magnus

P.S. Check out Wikimedia Foundation's varnishkafka daemon for a use case - varnish log forwarding over Kafka: https://github.com/wikimedia/varnishkafka
[jira] [Commented] (KAFKA-1144) commitOffsets can be passed the offsets to commit
[ https://issues.apache.org/jira/browse/KAFKA-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13831608#comment-13831608 ]

Jun Rao commented on KAFKA-1144:
--------------------------------

Thanks for the patch. A couple of things.

1. We have a jira that tries to move the storage of offsets off ZK (https://issues.apache.org/jira/browse/KAFKA-1000), since ZK is not really designed for that. So, we may not be able to do conditional updates for offsets in the future.

2. We will be rewriting the consumer client (https://cwiki.apache.org/confluence/display/KAFKA/Client+Rewrite). In the new API, we will add a callback during consumer rebalances. Do you think that addresses your issue as well?

commitOffsets can be passed the offsets to commit
-------------------------------------------------

Key: KAFKA-1144
URL: https://issues.apache.org/jira/browse/KAFKA-1144
Project: Kafka
Issue Type: Improvement
Components: consumer
Affects Versions: 0.8
Reporter: Imran Rashid
Assignee: Neha Narkhede
Attachments: 0001-allow-committing-of-arbitrary-offsets-to-facilitate-.patch, 0002-add-protection-against-backward-commits.patch

This adds another version of commitOffsets that takes the offsets to commit as a parameter. Without this change, getting correct user code is very hard. Despite Kafka's at-least-once guarantees, most user code doesn't actually have that guarantee, and is almost certainly wrong if it does batch processing. Getting it right requires some very careful synchronization between all consumer threads, which is both 1) painful to get right and 2) slow, because of the need to stop all workers during a commit. This small change simplifies a lot of this.

This was discussed extensively on the user mailing list, on the thread "are kafka consumer apps guaranteed to see msgs at least once?". You can also see an example implementation of a user API that makes use of this to get proper at-least-once guarantees in user code, even for batches: https://github.com/quantifind/kafka-utils/pull/1

I'm open to any suggestions on how to add unit tests for this.

--
This message was sent by Atlassian JIRA (v6.1#6144)
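As background on the "conditional updates" mentioned in point 1: offsets in 0.8 live in ZooKeeper znodes, and a conditional update is a version-checked write. A rough zkclient sketch of such a write follows; it only illustrates the mechanism under discussion, is not the code in the attached patches, and omits znode creation and error handling.

  import org.I0Itec.zkclient.ZkClient
  import org.I0Itec.zkclient.exception.ZkBadVersionException
  import org.apache.zookeeper.data.Stat

  // Read the current offset together with its znode version, then write the new offset
  // only if it moves forward and the znode has not changed underneath us.
  def commitIfNotBackward(zk: ZkClient, offsetPath: String, newOffset: Long): Boolean = {
    val stat = new Stat()
    val current = zk.readData[String](offsetPath, stat).toLong
    if (newOffset < current) {
      false                                   // refuse to move the committed offset backwards
    } else {
      try {
        zk.writeData(offsetPath, newOffset.toString, stat.getVersion) // fails if the version changed
        true
      } catch {
        case _: ZkBadVersionException => false // a concurrent commit won; the caller can retry
      }
    }
  }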
[jira] [Commented] (KAFKA-1144) commitOffsets can be passed the offsets to commit
[ https://issues.apache.org/jira/browse/KAFKA-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13831658#comment-13831658 ]

Imran Rashid commented on KAFKA-1144:
-------------------------------------

Thanks for the quick response, Jun.

(1) is unfortunate, though it doesn't technically break this. It just increases the number of messages that would get processed multiple times after a rebalance + crash (I'm pretty sure that's the only way you could end up with a lasting backwards commit without the conditional update). I thought the move off of ZK wasn't until 0.9, sorry -- I think this patch can go forward without the conditional update. (Should I update the patches to remove it, or just leave it in and let it go away when there is no more ZK?)

(2) Notification on rebalances does not eliminate the desire for this patch. (It would, however, eliminate the need for conditional updates!) Even without rebalances, with the current API you really need to stop all worker threads before doing a commit if you want to guarantee that your app has seen all the messages. This is especially true with batch processing. Again, the patch isn't necessary, but it's a small change that makes it so much easier to get user code right, not to mention more efficient.

Maybe other changes in the 0.9 API will make this unnecessary, I don't know, but I think this is useful for 0.8 in the meantime. And I'd hope the client rewrite would also make it easy to write batch consumers, like the API I put together in the other repo. (I'd happily submit that directly to Kafka if it was desired, though it's very Scala-y, and I guess the user API is going to be Java-only?)

commitOffsets can be passed the offsets to commit
-------------------------------------------------

Key: KAFKA-1144
URL: https://issues.apache.org/jira/browse/KAFKA-1144
Project: Kafka
Issue Type: Improvement
Components: consumer
Affects Versions: 0.8
Reporter: Imran Rashid
Assignee: Neha Narkhede
Attachments: 0001-allow-committing-of-arbitrary-offsets-to-facilitate-.patch, 0002-add-protection-against-backward-commits.patch

This adds another version of commitOffsets that takes the offsets to commit as a parameter. Without this change, getting correct user code is very hard. Despite Kafka's at-least-once guarantees, most user code doesn't actually have that guarantee, and is almost certainly wrong if it does batch processing. Getting it right requires some very careful synchronization between all consumer threads, which is both 1) painful to get right and 2) slow, because of the need to stop all workers during a commit. This small change simplifies a lot of this.

This was discussed extensively on the user mailing list, on the thread "are kafka consumer apps guaranteed to see msgs at least once?". You can also see an example implementation of a user API that makes use of this to get proper at-least-once guarantees in user code, even for batches: https://github.com/quantifind/kafka-utils/pull/1

I'm open to any suggestions on how to add unit tests for this.

--
This message was sent by Atlassian JIRA (v6.1#6144)
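To make the "stop all worker threads" point concrete, here is a rough sketch (not from the patch or the linked repo) of the choreography the no-argument commitOffsets forces on a multi-threaded consumer today: because it commits everything the iterators have handed out, every in-flight message must be finished before the call.

  import kafka.consumer.ConsumerConnector

  // Sketch only: drainWorkers/resumeWorkers stand in for whatever coordination the
  // application uses (latches, pausing a thread pool, etc.).
  def stopTheWorldCommit(connector: ConsumerConnector,
                         drainWorkers: () => Unit,
                         resumeWorkers: () => Unit): Unit = {
    drainWorkers()           // block until every worker has finished its in-flight messages
    connector.commitOffsets  // only now is "consumed" the same as "processed"
    resumeWorkers()          // let the workers pull from the iterators again
  }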
[jira] [Resolved] (KAFKA-394) update site with steps and notes for doing a release under developer
[ https://issues.apache.org/jira/browse/KAFKA-394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joe Stein resolved KAFKA-394.
-----------------------------
Resolution: Fixed

https://cwiki.apache.org/confluence/display/KAFKA/Release+Process

update site with steps and notes for doing a release under developer
---------------------------------------------------------------------

Key: KAFKA-394
URL: https://issues.apache.org/jira/browse/KAFKA-394
Project: Kafka
Issue Type: Bug
Reporter: Joe Stein
Assignee: Joe Stein

steps in release process including updating the dist directory

--
This message was sent by Atlassian JIRA (v6.1#6144)
[jira] Subscription: outstanding kafka patches
Issue Subscription
Filter: outstanding kafka patches (75 issues)
The list of outstanding kafka patches
Subscriber: kafka-mailing-list

Key         Summary
KAFKA-1144  commitOffsets can be passed the offsets to commit  https://issues.apache.org/jira/browse/KAFKA-1144
KAFKA-1142  Patch review tool should take diff with origin from last divergent point  https://issues.apache.org/jira/browse/KAFKA-1142
KAFKA-1140  Move the decoding logic from ConsumerIterator.makeNext to next  https://issues.apache.org/jira/browse/KAFKA-1140
KAFKA-1130  log.dirs is a confusing property name  https://issues.apache.org/jira/browse/KAFKA-1130
KAFKA-1116  Need to upgrade sbt-assembly to compile on scala 2.10.2  https://issues.apache.org/jira/browse/KAFKA-1116
KAFKA-1110  Unable to produce messages with snappy/gzip compression  https://issues.apache.org/jira/browse/KAFKA-1110
KAFKA-1109  Need to fix GC log configuration code, not able to override KAFKA_GC_LOG_OPTS  https://issues.apache.org/jira/browse/KAFKA-1109
KAFKA-1106  HighwaterMarkCheckpoint failure puting broker into a bad state  https://issues.apache.org/jira/browse/KAFKA-1106
KAFKA-1093  Log.getOffsetsBefore(t, …) does not return the last confirmed offset before t  https://issues.apache.org/jira/browse/KAFKA-1093
KAFKA-1086  Improve GetOffsetShell to find metadata automatically  https://issues.apache.org/jira/browse/KAFKA-1086
KAFKA-1082  zkclient dies after UnknownHostException in zk reconnect  https://issues.apache.org/jira/browse/KAFKA-1082
KAFKA-1079  Liars in PrimitiveApiTest that promise to test api in compression mode, but don't do this actually  https://issues.apache.org/jira/browse/KAFKA-1079
KAFKA-1074  Reassign partitions should delete the old replicas from disk  https://issues.apache.org/jira/browse/KAFKA-1074
KAFKA-1049  Encoder implementations are required to provide an undocumented constructor.  https://issues.apache.org/jira/browse/KAFKA-1049
KAFKA-1032  Messages sent to the old leader will be lost on broker GC resulted failure  https://issues.apache.org/jira/browse/KAFKA-1032
KAFKA-1020  Remove getAllReplicasOnBroker from KafkaController  https://issues.apache.org/jira/browse/KAFKA-1020
KAFKA-1012  Implement an Offset Manager and hook offset requests to it  https://issues.apache.org/jira/browse/KAFKA-1012
KAFKA-1011  Decompression and re-compression on MirrorMaker could result in messages being dropped in the pipeline  https://issues.apache.org/jira/browse/KAFKA-1011
KAFKA-1005  kafka.perf.ConsumerPerformance not shutting down consumer  https://issues.apache.org/jira/browse/KAFKA-1005
KAFKA-1004  Handle topic event for trivial whitelist topic filters  https://issues.apache.org/jira/browse/KAFKA-1004
KAFKA-998   Producer should not retry on non-recoverable error codes  https://issues.apache.org/jira/browse/KAFKA-998
KAFKA-997   Provide a strict verification mode when reading configuration properties  https://issues.apache.org/jira/browse/KAFKA-997
KAFKA-996   Capitalize first letter for log entries  https://issues.apache.org/jira/browse/KAFKA-996
KAFKA-984   Avoid a full rebalance in cases when a new topic is discovered but container/broker set stay the same  https://issues.apache.org/jira/browse/KAFKA-984
KAFKA-976   Order-Preserving Mirror Maker Testcase  https://issues.apache.org/jira/browse/KAFKA-976
KAFKA-967   Use key range in ProducerPerformance  https://issues.apache.org/jira/browse/KAFKA-967
KAFKA-917   Expose zk.session.timeout.ms in console consumer  https://issues.apache.org/jira/browse/KAFKA-917
KAFKA-885   sbt package builds two kafka jars  https://issues.apache.org/jira/browse/KAFKA-885
KAFKA-881   Kafka broker not respecting log.roll.hours  https://issues.apache.org/jira/browse/KAFKA-881
KAFKA-873   Consider replacing zkclient with curator (with zkclient-bridge)  https://issues.apache.org/jira/browse/KAFKA-873
KAFKA-868   System Test - add test case for rolling controlled shutdown  https://issues.apache.org/jira/browse/KAFKA-868
KAFKA-863   System Test - update 0.7 version of kafka-run-class.sh for Migration Tool test cases  https://issues.apache.org/jira/browse/KAFKA-863
KAFKA-859   support basic auth protection of mx4j console  https://issues.apache.org/jira/browse/KAFKA-859
KAFKA-855   Ant+Ivy build for Kafka  https://issues.apache.org/jira/browse/KAFKA-855
KAFKA-854   Upgrade dependencies for 0.8  https://issues.apache.org/jira/browse/KAFKA-854
KAFKA-815   Improve SimpleConsumerShell to take in a max messages config option  https://issues.apache.org/jira/browse/KAFKA-815
KAFKA-745   Remove getShutdownReceive() and other kafka specific
[VOTE] Apache Kafka Release 0.8.0 - Candidate 4
This is the fourth candidate for release of Apache Kafka 0.8.0.

This release resolves https://issues.apache.org/jira/browse/KAFKA-1131 and https://issues.apache.org/jira/browse/KAFKA-1133

Release Notes for the 0.8.0 release:
http://people.apache.org/~joestein/kafka-0.8.0-candidate4/RELEASE_NOTES.html

*** Please download, test and vote by Monday, December 2nd, 12pm PDT

Kafka's KEYS file containing PGP keys we use to sign the release, in addition to the md5 and sha1 checksums:
http://svn.apache.org/repos/asf/kafka/KEYS

* Release artifacts to be voted upon (source and binary):
http://people.apache.org/~joestein/kafka-0.8.0-candidate4/

* Maven artifacts to be voted upon prior to release:
https://repository.apache.org/content/groups/staging/

(i.e. in sbt land, this can be added to build.sbt to use the Kafka release candidate:
resolvers += "Apache Staging" at "https://repository.apache.org/content/groups/staging/"
libraryDependencies += "org.apache.kafka" % "kafka_2.10" % "0.8.0" )

* The tag to be voted upon (off the 0.8 branch) is the 0.8.0 tag:
https://git-wip-us.apache.org/repos/asf?p=kafka.git;a=tag;h=2c20a71a010659e25af075a024cbd692c87d4c89

/*******************************************
 Joe Stein
 Founder, Principal Consultant
 Big Data Open Source Security LLC
 http://www.stealth.ly
 Twitter: @allthingshadoop http://www.twitter.com/allthingshadoop
********************************************/
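For anyone testing the staged Maven artifacts from sbt, the two lines quoted above drop into a build.sbt along these lines; the project name and Scala version are placeholders, and only the resolver and dependency lines come from the email itself.

  // Minimal build.sbt sketch for pulling the 0.8.0 release candidate from the Apache
  // staging repository.
  name := "kafka-0.8.0-rc4-test"

  scalaVersion := "2.10.1"

  resolvers += "Apache Staging" at "https://repository.apache.org/content/groups/staging/"

  libraryDependencies += "org.apache.kafka" % "kafka_2.10" % "0.8.0"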
Re: Review Request 15805: KAFKA-1140.v2: addressed Jun's comments
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15805/
-----------------------------------------------------------

(Updated Nov. 25, 2013, 8:53 p.m.)

Review request for kafka.

Summary (updated)
-----------------
KAFKA-1140.v2: addressed Jun's comments

Bugs: KAFKA-1140
    https://issues.apache.org/jira/browse/KAFKA-1140

Repository: kafka

Description
-----------
KAFKA-1140.v1
Dummy

Diffs (updated)
---------------
  core/src/main/scala/kafka/consumer/ConsumerIterator.scala a4227a49684c7de08e07cb1f3a10d2f76ba28da7

Diff: https://reviews.apache.org/r/15805/diff/

Testing
-------

Thanks,
Guozhang Wang
[jira] [Commented] (KAFKA-1140) Move the decoding logic from ConsumerIterator.makeNext to next
[ https://issues.apache.org/jira/browse/KAFKA-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13831892#comment-13831892 ]

Guozhang Wang commented on KAFKA-1140:
---------------------------------------

Updated reviewboard https://reviews.apache.org/r/15805/ against branch origin/trunk

Move the decoding logic from ConsumerIterator.makeNext to next
--------------------------------------------------------------

Key: KAFKA-1140
URL: https://issues.apache.org/jira/browse/KAFKA-1140
Project: Kafka
Issue Type: Bug
Reporter: Guozhang Wang
Assignee: Guozhang Wang
Fix For: 0.8.1
Attachments: KAFKA-1140.patch, KAFKA-1140_2013-11-25_12:53:17.patch

Usually people will write code around the consumer like:

  while (iter.hasNext()) {
    try {
      msg = iter.next()
      // do something
    } catch {
    }
  }

However, the iter.hasNext() call itself can throw exceptions due to decoding failures. It would be better to move the decoding to the next() function call.

--
This message was sent by Atlassian JIRA (v6.1#6144)
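A rough sketch of the failure mode being described: the try/catch in the loop above guards next(), but because decoding currently happens inside makeNext(), a corrupt message surfaces as an exception from hasNext() and kills the loop outside the catch. The snippet below is illustrative only; stream setup and decoder wiring are omitted.

  import kafka.consumer.{ConsumerIterator, KafkaStream}

  def consume(stream: KafkaStream[String, String]): Unit = {
    val iter: ConsumerIterator[String, String] = stream.iterator()
    while (iter.hasNext()) {            // <-- today a decoding failure is thrown from here
      try {
        val msg = iter.next().message   // after the change, decoding failures surface here instead
        // process msg ...
      } catch {
        case _: Exception => // log/skip the bad message and keep consuming
      }
    }
  }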
[jira] [Updated] (KAFKA-1140) Move the decoding logic from ConsumerIterator.makeNext to next
[ https://issues.apache.org/jira/browse/KAFKA-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Guozhang Wang updated KAFKA-1140:
---------------------------------
Attachment: KAFKA-1140_2013-11-25_12:53:17.patch

Move the decoding logic from ConsumerIterator.makeNext to next
--------------------------------------------------------------

Key: KAFKA-1140
URL: https://issues.apache.org/jira/browse/KAFKA-1140
Project: Kafka
Issue Type: Bug
Reporter: Guozhang Wang
Assignee: Guozhang Wang
Fix For: 0.8.1
Attachments: KAFKA-1140.patch, KAFKA-1140_2013-11-25_12:53:17.patch

Usually people will write code around the consumer like:

  while (iter.hasNext()) {
    try {
      msg = iter.next()
      // do something
    } catch {
    }
  }

However, the iter.hasNext() call itself can throw exceptions due to decoding failures. It would be better to move the decoding to the next() function call.

--
This message was sent by Atlassian JIRA (v6.1#6144)
Re: Review Request 15805: KAFKA-1140.v2: addressed Jun's comments
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15805/
-----------------------------------------------------------

(Updated Nov. 25, 2013, 8:55 p.m.)

Review request for kafka.

Bugs: KAFKA-1140
    https://issues.apache.org/jira/browse/KAFKA-1140

Repository: kafka

Description (updated)
---------------------
KAFKA-1140.v2
KAFKA-1140.v1
Dummy

Diffs (updated)
---------------
  core/src/main/scala/kafka/consumer/ConsumerIterator.scala a4227a49684c7de08e07cb1f3a10d2f76ba28da7
  core/src/test/scala/unit/kafka/consumer/ConsumerIteratorTest.scala ef1de8321c713cd9d27ef937216f5b76a5d8c574

Diff: https://reviews.apache.org/r/15805/diff/

Testing
-------

Thanks,
Guozhang Wang
[jira] [Commented] (KAFKA-1140) Move the decoding logic from ConsumerIterator.makeNext to next
[ https://issues.apache.org/jira/browse/KAFKA-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13831893#comment-13831893 ]

Guozhang Wang commented on KAFKA-1140:
---------------------------------------

Updated reviewboard https://reviews.apache.org/r/15805/ against branch origin/trunk

Move the decoding logic from ConsumerIterator.makeNext to next
--------------------------------------------------------------

Key: KAFKA-1140
URL: https://issues.apache.org/jira/browse/KAFKA-1140
Project: Kafka
Issue Type: Bug
Reporter: Guozhang Wang
Assignee: Guozhang Wang
Fix For: 0.8.1
Attachments: KAFKA-1140.patch, KAFKA-1140_2013-11-25_12:53:17.patch, KAFKA-1140_2013-11-25_12:55:34.patch

Usually people will write code around the consumer like:

  while (iter.hasNext()) {
    try {
      msg = iter.next()
      // do something
    } catch {
    }
  }

However, the iter.hasNext() call itself can throw exceptions due to decoding failures. It would be better to move the decoding to the next() function call.

--
This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (KAFKA-1140) Move the decoding logic from ConsumerIterator.makeNext to next
[ https://issues.apache.org/jira/browse/KAFKA-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Guozhang Wang updated KAFKA-1140:
---------------------------------
Attachment: KAFKA-1140_2013-11-25_12:55:34.patch

Move the decoding logic from ConsumerIterator.makeNext to next
--------------------------------------------------------------

Key: KAFKA-1140
URL: https://issues.apache.org/jira/browse/KAFKA-1140
Project: Kafka
Issue Type: Bug
Reporter: Guozhang Wang
Assignee: Guozhang Wang
Fix For: 0.8.1
Attachments: KAFKA-1140.patch, KAFKA-1140_2013-11-25_12:53:17.patch, KAFKA-1140_2013-11-25_12:55:34.patch

Usually people will write code around the consumer like:

  while (iter.hasNext()) {
    try {
      msg = iter.next()
      // do something
    } catch {
    }
  }

However, the iter.hasNext() call itself can throw exceptions due to decoding failures. It would be better to move the decoding to the next() function call.

--
This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (KAFKA-1144) commitOffsets can be passed the offsets to commit
[ https://issues.apache.org/jira/browse/KAFKA-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13831993#comment-13831993 ]

Guozhang Wang commented on KAFKA-1144:
---------------------------------------

Imran, regarding the client rewrite, your proposal is also being discussed as part of that project. Basically, the consumer commitOffset function can be something like:

  void commit(List[String, Int, Long])  // topic, partition-id, offset

The wiki page will be updated soon so that you can take a look then.

Guozhang

commitOffsets can be passed the offsets to commit
-------------------------------------------------

Key: KAFKA-1144
URL: https://issues.apache.org/jira/browse/KAFKA-1144
Project: Kafka
Issue Type: Improvement
Components: consumer
Affects Versions: 0.8
Reporter: Imran Rashid
Assignee: Neha Narkhede
Attachments: 0001-allow-committing-of-arbitrary-offsets-to-facilitate-.patch, 0002-add-protection-against-backward-commits.patch

This adds another version of commitOffsets that takes the offsets to commit as a parameter. Without this change, getting correct user code is very hard. Despite Kafka's at-least-once guarantees, most user code doesn't actually have that guarantee, and is almost certainly wrong if it does batch processing. Getting it right requires some very careful synchronization between all consumer threads, which is both 1) painful to get right and 2) slow, because of the need to stop all workers during a commit. This small change simplifies a lot of this.

This was discussed extensively on the user mailing list, on the thread "are kafka consumer apps guaranteed to see msgs at least once?". You can also see an example implementation of a user API that makes use of this to get proper at-least-once guarantees in user code, even for batches: https://github.com/quantifind/kafka-utils/pull/1

I'm open to any suggestions on how to add unit tests for this.

--
This message was sent by Atlassian JIRA (v6.1#6144)
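Reading "List[String, Int, Long]" as a list of (topic, partition, offset) triples, a Scala rendering of the proposed call might look like the sketch below. This is a guess at the shape implied by the comment above, not the actual client-rewrite interface, and the type names are made up.

  // Hypothetical rendering of the proposed new-consumer commit call: one entry per
  // partition, giving the exact offset the application wants recorded.
  case class PartitionOffset(topic: String, partition: Int, offset: Long)

  trait OffsetCommitApiSketch {
    def commit(offsets: List[PartitionOffset]): Unit
  }

  // e.g. after a batch has been fully processed:
  //   consumer.commit(List(PartitionOffset("clicks", 0, 4211L), PartitionOffset("clicks", 1, 3977L)))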
[jira] [Commented] (KAFKA-1144) commitOffsets can be passed the offsets to commit
[ https://issues.apache.org/jira/browse/KAFKA-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13832019#comment-13832019 ]

Imran Rashid commented on KAFKA-1144:
-------------------------------------

Yes, that addition to the commit API would do the trick. (That is the main point of this patch; I just happened to choose a different signature.) With rebalance notifications, that removes the need for conditional updates.

Glad to hear this will all be in 0.9 -- but does that mean this patch is off the table for 0.8.x? That's too bad; it would be a big help now. I could change the signature to match what it will be in 0.9. If not, I suppose I can always just make the ZK updates myself directly, under the covers of the consumer API wrapper I'm writing.

commitOffsets can be passed the offsets to commit
-------------------------------------------------

Key: KAFKA-1144
URL: https://issues.apache.org/jira/browse/KAFKA-1144
Project: Kafka
Issue Type: Improvement
Components: consumer
Affects Versions: 0.8
Reporter: Imran Rashid
Assignee: Neha Narkhede
Attachments: 0001-allow-committing-of-arbitrary-offsets-to-facilitate-.patch, 0002-add-protection-against-backward-commits.patch

This adds another version of commitOffsets that takes the offsets to commit as a parameter. Without this change, getting correct user code is very hard. Despite Kafka's at-least-once guarantees, most user code doesn't actually have that guarantee, and is almost certainly wrong if it does batch processing. Getting it right requires some very careful synchronization between all consumer threads, which is both 1) painful to get right and 2) slow, because of the need to stop all workers during a commit. This small change simplifies a lot of this.

This was discussed extensively on the user mailing list, on the thread "are kafka consumer apps guaranteed to see msgs at least once?". You can also see an example implementation of a user API that makes use of this to get proper at-least-once guarantees in user code, even for batches: https://github.com/quantifind/kafka-utils/pull/1

I'm open to any suggestions on how to add unit tests for this.

--
This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (KAFKA-1135) Code cleanup - use Json.encode() to write json data to zk
[ https://issues.apache.org/jira/browse/KAFKA-1135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13832026#comment-13832026 ]

David Lao commented on KAFKA-1135:
-----------------------------------

Hi. This patch seems to have undone all the KAFKA-1112 changes. Can you verify?

Code cleanup - use Json.encode() to write json data to zk
----------------------------------------------------------

Key: KAFKA-1135
URL: https://issues.apache.org/jira/browse/KAFKA-1135
Project: Kafka
Issue Type: Bug
Reporter: Swapnil Ghike
Assignee: Swapnil Ghike
Fix For: 0.8.1
Attachments: KAFKA-1135.patch, KAFKA-1135_2013-11-18_19:17:54.patch, KAFKA-1135_2013-11-18_19:20:58.patch

--
This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (KAFKA-1135) Code cleanup - use Json.encode() to write json data to zk
[ https://issues.apache.org/jira/browse/KAFKA-1135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13832037#comment-13832037 ]

Swapnil Ghike commented on KAFKA-1135:
---------------------------------------

Thanks for catching this, David! Jun, it seems that the diff on the reviewboard and the one attached to this JIRA are different. Can you please revert commit 9b0776d157afd9eacddb84a99f2420fa9c0d505b, download the diff from the reviewboard, and commit it?

Code cleanup - use Json.encode() to write json data to zk
----------------------------------------------------------

Key: KAFKA-1135
URL: https://issues.apache.org/jira/browse/KAFKA-1135
Project: Kafka
Issue Type: Bug
Reporter: Swapnil Ghike
Assignee: Swapnil Ghike
Fix For: 0.8.1
Attachments: KAFKA-1135.patch, KAFKA-1135_2013-11-18_19:17:54.patch, KAFKA-1135_2013-11-18_19:20:58.patch

--
This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (KAFKA-1135) Code cleanup - use Json.encode() to write json data to zk
[ https://issues.apache.org/jira/browse/KAFKA-1135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13832040#comment-13832040 ]

Swapnil Ghike commented on KAFKA-1135:
---------------------------------------

[~jjkoshy], does the above issue look similar to KAFKA-1142?

Code cleanup - use Json.encode() to write json data to zk
----------------------------------------------------------

Key: KAFKA-1135
URL: https://issues.apache.org/jira/browse/KAFKA-1135
Project: Kafka
Issue Type: Bug
Reporter: Swapnil Ghike
Assignee: Swapnil Ghike
Fix For: 0.8.1
Attachments: KAFKA-1135.patch, KAFKA-1135_2013-11-18_19:17:54.patch, KAFKA-1135_2013-11-18_19:20:58.patch

--
This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (KAFKA-1135) Code cleanup - use Json.encode() to write json data to zk
[ https://issues.apache.org/jira/browse/KAFKA-1135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13832067#comment-13832067 ]

Jun Rao commented on KAFKA-1135:
--------------------------------

Thanks for pointing this out. Recommitted KAFKA-1112.

Code cleanup - use Json.encode() to write json data to zk
----------------------------------------------------------

Key: KAFKA-1135
URL: https://issues.apache.org/jira/browse/KAFKA-1135
Project: Kafka
Issue Type: Bug
Reporter: Swapnil Ghike
Assignee: Swapnil Ghike
Fix For: 0.8.1
Attachments: KAFKA-1135.patch, KAFKA-1135_2013-11-18_19:17:54.patch, KAFKA-1135_2013-11-18_19:20:58.patch

--
This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (KAFKA-1144) commitOffsets can be passed the offsets to commit
[ https://issues.apache.org/jira/browse/KAFKA-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Imran Rashid updated KAFKA-1144:
--------------------------------
Attachment: 0003-dont-do-conditional-update-check-if-the-path-doesnt-.patch

Just discovered a small bug -- the conditional update doesn't work if the path doesn't already exist. Figured I'd update this just in case it's still in consideration ...

commitOffsets can be passed the offsets to commit
-------------------------------------------------

Key: KAFKA-1144
URL: https://issues.apache.org/jira/browse/KAFKA-1144
Project: Kafka
Issue Type: Improvement
Components: consumer
Affects Versions: 0.8
Reporter: Imran Rashid
Assignee: Neha Narkhede
Attachments: 0001-allow-committing-of-arbitrary-offsets-to-facilitate-.patch, 0002-add-protection-against-backward-commits.patch, 0003-dont-do-conditional-update-check-if-the-path-doesnt-.patch

This adds another version of commitOffsets that takes the offsets to commit as a parameter. Without this change, getting correct user code is very hard. Despite Kafka's at-least-once guarantees, most user code doesn't actually have that guarantee, and is almost certainly wrong if it does batch processing. Getting it right requires some very careful synchronization between all consumer threads, which is both 1) painful to get right and 2) slow, because of the need to stop all workers during a commit. This small change simplifies a lot of this.

This was discussed extensively on the user mailing list, on the thread "are kafka consumer apps guaranteed to see msgs at least once?". You can also see an example implementation of a user API that makes use of this to get proper at-least-once guarantees in user code, even for batches: https://github.com/quantifind/kafka-utils/pull/1

I'm open to any suggestions on how to add unit tests for this.

--
This message was sent by Atlassian JIRA (v6.1#6144)
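Sketched with zkclient (illustrative, not the attached patch): a versioned write throws if the znode is missing, so the very first commit for a group/partition has to create the path and write unconditionally instead.

  import org.I0Itec.zkclient.ZkClient
  import org.I0Itec.zkclient.exception.ZkNoNodeException

  // Fall back to creating the path on the first commit, since a conditional (versioned)
  // write cannot target a znode that does not exist yet. Error handling kept minimal.
  def commitOffsetAt(zk: ZkClient, offsetPath: String, offset: Long, expectedVersion: Int): Unit = {
    try {
      zk.writeData(offsetPath, offset.toString, expectedVersion)
    } catch {
      case _: ZkNoNodeException =>
        zk.createPersistent(offsetPath, true)     // create the znode (and parents) first
        zk.writeData(offsetPath, offset.toString) // unconditional write for the first commit
    }
  }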
[jira] [Updated] (KAFKA-1004) Handle topic event for trivial whitelist topic filters
[ https://issues.apache.org/jira/browse/KAFKA-1004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joel Koshy updated KAFKA-1004:
------------------------------
Resolution: Fixed
Status: Resolved (was: Patch Available)

This was fixed in KAFKA-1103.

Handle topic event for trivial whitelist topic filters
-------------------------------------------------------

Key: KAFKA-1004
URL: https://issues.apache.org/jira/browse/KAFKA-1004
Project: Kafka
Issue Type: Bug
Reporter: Guozhang Wang
Assignee: Guozhang Wang
Fix For: 0.8.1, 0.7
Attachments: KAFKA-1004.v1.patch, KAFKA-1004.v2.patch

Today the consumer's TopicEventWatcher is not subscribed with trivial whitelist topic names. Hence, if the topic is not registered in ZK when the consumer is started, its later creation will not trigger a rebalance of consumers, and the topic will not be consumed even though it is in the whitelist. A proposed fix would be to always subscribe the TopicEventWatcher for all whitelist consumers.

--
This message was sent by Atlassian JIRA (v6.1#6144)
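For context on what a "trivial whitelist topic filter" means in user code: a Whitelist whose pattern is just a literal topic name, with no wildcards. A rough sketch of such a consumer is below; the ZooKeeper address and group id are placeholders, and error handling and shutdown are omitted.

  import java.util.Properties
  import kafka.consumer.{Consumer, ConsumerConfig, Whitelist}

  object TrivialWhitelistConsumerSketch {
    def main(args: Array[String]): Unit = {
      val props = new Properties()
      props.put("zookeeper.connect", "localhost:2181") // placeholder
      props.put("group.id", "whitelist-demo")          // placeholder

      val connector = Consumer.create(new ConsumerConfig(props))
      // "my-topic" is a trivial whitelist: a literal name rather than a regex. The issue
      // above is about making sure that if this topic does not exist yet, its later
      // creation still triggers a rebalance so the stream starts receiving messages.
      val streams = connector.createMessageStreamsByFilter(new Whitelist("my-topic"), 1)
      // streams.head can then be iterated like any other KafkaStream.
    }
  }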
[jira] [Closed] (KAFKA-1004) Handle topic event for trivial whitelist topic filters
[ https://issues.apache.org/jira/browse/KAFKA-1004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joel Koshy closed KAFKA-1004.
-----------------------------

Handle topic event for trivial whitelist topic filters
-------------------------------------------------------

Key: KAFKA-1004
URL: https://issues.apache.org/jira/browse/KAFKA-1004
Project: Kafka
Issue Type: Bug
Reporter: Guozhang Wang
Assignee: Guozhang Wang
Fix For: 0.7, 0.8.1
Attachments: KAFKA-1004.v1.patch, KAFKA-1004.v2.patch

Today the consumer's TopicEventWatcher is not subscribed with trivial whitelist topic names. Hence, if the topic is not registered in ZK when the consumer is started, its later creation will not trigger a rebalance of consumers, and the topic will not be consumed even though it is in the whitelist. A proposed fix would be to always subscribe the TopicEventWatcher for all whitelist consumers.

--
This message was sent by Atlassian JIRA (v6.1#6144)