Re: Number of kafka topics/partitions supported per cluster of n nodes

2015-07-27 Thread Darion Yaphet
Kafka stores its metadata in a ZooKeeper cluster, so evaluating "how many total number of topics and partitions can be created in a cluster" may be much the same as testing ZooKeeper's scalability and disk I/O performance. 2015-07-28 13:51 GMT+08:00 Prabhjot Bharaj : > Hi, > > I'm looking forward to a bench

Re: [DISCUSS] KIP-28 - Add a transform client for data processing

2015-07-27 Thread Yi Pan
Hi, Jay, {quote} 1. Yeah we are going to try to generalize the partition management stuff. We'll get a wiki/JIRA up for that. I think that gives what you want in terms of moving partitioning to the client side. {quote} Great! I am looking forward to that. {quote} I think the key observation is th

Re: [DISCUSS] KIP-28 - Add a transform client for data processing

2015-07-27 Thread Guozhang Wang
Hi Adi, Just to clarify, the cmdline tool would be used, as stated in the wiki page, to run the client library "as a process", which is still far away from a "service". It is just like what we have for kafka-console-producer, kafka-console-consumer, kafka-mirror-maker, etc today. Guozhang On Mon

Re: [DISCUSS] KIP-28 - Add a transform client for data processing

2015-07-27 Thread Yi Pan
Hi, Neha, {quote} We do hope to include a DSL since that is the most natural way of expressing stream processing operations on top of the processor client. The DSL layer should be equivalent to that provided by Spark streaming or Flink in terms of expressiveness though there will be differences in

Re: [DISCUSS] KIP-28 - Add a transform client for data processing

2015-07-27 Thread Neha Narkhede
Adi, How far away are we from having a prototype patch to play with? > We are working to share a prototype next week. Though the code will evolve to match the APIs and design as it shapes up, it would be great if people could take a look and provide feedback. Couple of observations: >

Re: [DISCUSS] KIP-28 - Add a transform client for data processing

2015-07-27 Thread Yi Pan
Hi, Aditya, {quote} - The KIP states that cmd line tools will be provided to deploy as a separate service. Is the proposed scope limited to providing a library which makes it possible to build stream-processing-as-a-service, or does it extend to providing such a service within Kafka itself? {quote} There has alrea

Re: [DISCUSS] KIP-28 - Add a transform client for data processing

2015-07-27 Thread Neha Narkhede
Gwen, We have a compilation of notes from comparison with other systems. They might be missing details that folks who worked on that system might be able to point out. We can share that and discuss further on the KIP call. We do hope to include a DSL since that is the most natural way of expressi

Number of kafka topics/partitions supported per cluster of n nodes

2015-07-27 Thread Prabhjot Bharaj
Hi, I'm looking forward to a benchmark which can explain how many total number of topics and partitions can be created in a cluster of n nodes, given that the message size varies between x and y bytes, how that number varies with heap size, and how it affects system performance. e.g. the resu

Re: [DISCUSS] KIP-28 - Add a transform client for data processing

2015-07-27 Thread Aditya Auradkar
+1 on comparison with existing solutions. At a high level, it seems nice to have a transform library inside Kafka; a lot of the building blocks needed for a stream processing framework are already there. However, the details are tricky to get right. I think this discussion will get a lot more interest

[jira] [Updated] (KAFKA-2381) Possible ConcurrentModificationException while unsubscribing from a topic in new consumer

2015-07-27 Thread Ashish K Singh (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashish K Singh updated KAFKA-2381: -- Attachment: KAFKA-2381_2015-07-27_21:59:41.patch > Possible ConcurrentModificationException whil

[jira] [Commented] (KAFKA-2381) Possible ConcurrentModificationException while unsubscribing from a topic in new consumer

2015-07-27 Thread Ashish K Singh (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643847#comment-14643847 ] Ashish K Singh commented on KAFKA-2381: --- Updated reviewboard https://reviews.apache.

Re: Review Request 36871: Patch for KAFKA-2381

2015-07-27 Thread Ashish Singh
> On July 28, 2015, 1:15 a.m., Aditya Auradkar wrote: > > core/src/test/scala/integration/kafka/api/ConsumerTest.scala, line 233 > > > > > > consider closing this in a finally. A failing test can cause incorrect > >

Re: Review Request 36871: Patch for KAFKA-2381

2015-07-27 Thread Ashish Singh
> On July 28, 2015, 1:11 a.m., Jason Gustafson wrote: > > Ouch. Hard to believe this wasn't caught yet. It is. Thanks for the review. Addressed your concern. - Ashish --- This is an automatically generated e-mail. To reply, visit: https

Re: Review Request 36871: Patch for KAFKA-2381

2015-07-27 Thread Ashish Singh
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36871/ --- (Updated July 28, 2015, 4:59 a.m.) Review request for kafka. Bugs: KAFKA-2381

[jira] [Updated] (KAFKA-2381) Possible ConcurrentModificationException while unsubscribing from a topic in new consumer

2015-07-27 Thread Ashish K Singh (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashish K Singh updated KAFKA-2381: -- Attachment: KAFKA-2381_2015-07-27_21:56:06.patch > Possible ConcurrentModificationException whil

[jira] [Commented] (KAFKA-2381) Possible ConcurrentModificationException while unsubscribing from a topic in new consumer

2015-07-27 Thread Ashish K Singh (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643843#comment-14643843 ] Ashish K Singh commented on KAFKA-2381: --- Updated reviewboard https://reviews.apache.

Re: Review Request 36871: Patch for KAFKA-2381

2015-07-27 Thread Ashish Singh
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36871/ --- (Updated July 28, 2015, 4:56 a.m.) Review request for kafka. Bugs: KAFKA-2381

[jira] [Commented] (KAFKA-2026) Logging of unused options always shows null for the value and is misleading if the option is used by serializers

2015-07-27 Thread Xuan Gong (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643820#comment-14643820 ] Xuan Gong commented on KAFKA-2026: -- I think that the warning messages here are used to re

Re: [DISCUSS] KIP-28 - Add a transform client for data processing

2015-07-27 Thread Gwen Shapira
Hi, Since we will be discussing KIP-28 in the call tomorrow, can you update the KIP with the feature-comparison with existing solutions? I admit that I do not see a need for single-event-producer-consumer pair (AKA Flume Interceptor). I've seen tons of people implement such apps in the last year,

[jira] [Updated] (KAFKA-2360) The kafka-consumer-perf-test.sh script help information print useless parameters.

2015-07-27 Thread Bo Wang (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bo Wang updated KAFKA-2360: --- Description: Run kafka-consumer-perf-test.sh --help to show help information, but found 3 parameters useless

[jira] [Updated] (KAFKA-2360) The kafka-consumer-perf-test.sh script help information print useless parameters.

2015-07-27 Thread Bo Wang (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bo Wang updated KAFKA-2360: --- Description: Run kafka-consumer-perf-test.sh --help to show help information, but found 3 parameters useless

Re: Review Request 36871: Patch for KAFKA-2381

2015-07-27 Thread Aditya Auradkar
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36871/#review93215 --- core/src/test/scala/integration/kafka/api/ConsumerTest.scala (line

Re: Review Request 36871: Patch for KAFKA-2381

2015-07-27 Thread Jason Gustafson
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36871/#review93213 --- Ouch. Hard to believe this wasn't caught yet. core/src/test/scala/

[jira] [Commented] (KAFKA-313) Add JSON/CSV output and looping options to ConsumerGroupCommand

2015-07-27 Thread Ashish K Singh (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643716#comment-14643716 ] Ashish K Singh commented on KAFKA-313: -- [~gwenshap] need help with getting this KIP to

[jira] [Commented] (KAFKA-2301) Deprecate ConsumerOffsetChecker

2015-07-27 Thread Ashish K Singh (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643714#comment-14643714 ] Ashish K Singh commented on KAFKA-2301: --- [~junrao], [~gwenshap] can any of you help

[jira] [Commented] (KAFKA-2275) Add a ListTopics() API to the new consumer

2015-07-27 Thread Ashish K Singh (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643710#comment-14643710 ] Ashish K Singh commented on KAFKA-2275: --- [~guozhang] I think this is in good shape n

[jira] [Commented] (KAFKA-2381) Possible ConcurrentModificationException while unsubscribing from a topic in new consumer

2015-07-27 Thread Ashish K Singh (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643703#comment-14643703 ] Ashish K Singh commented on KAFKA-2381: --- [~gwenshap] could you take a look when you

[jira] [Updated] (KAFKA-2381) Possible ConcurrentModificationException while unsubscribing from a topic in new consumer

2015-07-27 Thread Ashish K Singh (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashish K Singh updated KAFKA-2381: -- Attachment: KAFKA-2381_2015-07-27_17:56:00.patch > Possible ConcurrentModificationException whil

[jira] [Commented] (KAFKA-2381) Possible ConcurrentModificationException while unsubscribing from a topic in new consumer

2015-07-27 Thread Ashish K Singh (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643699#comment-14643699 ] Ashish K Singh commented on KAFKA-2381: --- Updated reviewboard https://reviews.apache.

Re: Review Request 36871: Patch for KAFKA-2381

2015-07-27 Thread Ashish Singh
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36871/ --- (Updated July 28, 2015, 12:56 a.m.) Review request for kafka. Bugs: KAFKA-238

[jira] [Commented] (KAFKA-2381) Possible ConcurrentModificationException while unsubscribing from a topic in new consumer

2015-07-27 Thread Ashish K Singh (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643693#comment-14643693 ] Ashish K Singh commented on KAFKA-2381: --- Created reviewboard https://reviews.apache.

[jira] [Updated] (KAFKA-2381) Possible ConcurrentModificationException while unsubscribing from a topic in new consumer

2015-07-27 Thread Ashish K Singh (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashish K Singh updated KAFKA-2381: -- Status: Patch Available (was: Open) > Possible ConcurrentModificationException while unsubscrib

[jira] [Updated] (KAFKA-2381) Possible ConcurrentModificationException while unsubscribing from a topic in new consumer

2015-07-27 Thread Ashish K Singh (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashish K Singh updated KAFKA-2381: -- Attachment: KAFKA-2381.patch > Possible ConcurrentModificationException while unsubscribing from

Review Request 36871: Patch for KAFKA-2381

2015-07-27 Thread Ashish Singh
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36871/ --- Review request for kafka. Bugs: KAFKA-2381 https://issues.apache.org/jira/b

[jira] [Commented] (KAFKA-1690) new java producer needs ssl support as a client

2015-07-27 Thread Sourabh Chandak (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643658#comment-14643658 ] Sourabh Chandak commented on KAFKA-1690: Awesome! When will this be integrated to

[jira] [Created] (KAFKA-2381) Possible ConcurrentModificationException while unsubscribing from a topic in new consumer

2015-07-27 Thread Ashish K Singh (JIRA)
Ashish K Singh created KAFKA-2381: - Summary: Possible ConcurrentModificationException while unsubscribing from a topic in new consumer Key: KAFKA-2381 URL: https://issues.apache.org/jira/browse/KAFKA-2381

[jira] [Commented] (KAFKA-2130) Resource leakage in AppInfo.scala during initialization

2015-07-27 Thread Xuan Gong (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643625#comment-14643625 ] Xuan Gong commented on KAFKA-2130: -- move {code} stream.close(); {code} to the finally bl

Re: [DISCUSS] KIP-27 - Conditional Publish

2015-07-27 Thread Jiangjie Qin
@Ewen, good point about batching. Yes, it would be tricky if we want to do a per-key conditional produce. My understanding is that the prerequisites of this KIP are: 1. Single producer for each partition. 2. Acks=-1, max.in.flight.requests.per.connection=1, retries=SOME_BIG_NUMBER The major problem i
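The producer-side prerequisites listed in this message can be sketched as a small configuration check. This is only an illustration, not code from the KIP: the dictionary and the helper `meets_prerequisites` are hypothetical names, the keys follow the Java producer's spelling `max.in.flight.requests.per.connection`, and the largest 32-bit integer stands in for SOME_BIG_NUMBER as an assumption.

```python
# Hypothetical sketch of the producer settings named in the message above.
MAX_INT = 2**31 - 1  # assumption: stands in for SOME_BIG_NUMBER

conditional_publish_config = {
    "acks": "-1",                                # wait for all in-sync replicas
    "max.in.flight.requests.per.connection": 1,  # one outstanding request at a time
    "retries": MAX_INT,                          # keep retrying instead of dropping
}


def meets_prerequisites(config):
    """Return True if the config matches the three settings listed above."""
    return (
        str(config.get("acks")) in ("-1", "all")
        and config.get("max.in.flight.requests.per.connection") == 1
        and config.get("retries", 0) > 0
    )
```

The single-producer-per-partition prerequisite is a deployment property and cannot be checked from configuration alone, so the sketch leaves it out.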

[jira] [Commented] (KAFKA-1690) new java producer needs ssl support as a client

2015-07-27 Thread Sriharsha Chintalapani (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643597#comment-14643597 ] Sriharsha Chintalapani commented on KAFKA-1690: --- [~sourabh0612] Yes. It incl

Re: error while high level consumer

2015-07-27 Thread Mayuresh Gharat
Try bouncing the host that appears in the stored data section. Thanks, Mayuresh On Mon, Jul 27, 2015 at 3:41 PM, Jiangjie Qin wrote: > This happens because the ZooKeeper path storing the previous owner's info hasn't > been deleted yet. If the rebalance completes after a retry, it > should be f

[jira] [Commented] (KAFKA-2268) New producer logs WARN if serializer supplied directly to constructor

2015-07-27 Thread Xuan Gong (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643573#comment-14643573 ] Xuan Gong commented on KAFKA-2268: -- Looks like this is duplicate with https://issues.apa

Re: Review Request 36858: Patch for KAFKA-2120

2015-07-27 Thread Jason Gustafson
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36858/#review93189 --- Looks pretty good overall. Found mostly trivial stuff. clients/src

Re: error while high level consumer

2015-07-27 Thread Jiangjie Qin
This happens because the ZooKeeper path storing the previous owner's info hasn't been deleted yet. If the rebalance completes after a retry, it should be fine. Jiangjie (Becket) Qin On Fri, Jul 24, 2015 at 6:54 PM, Kris K wrote: > Hi, > > I started seeing these errors in the logs continuously w

[jira] [Updated] (KAFKA-2120) Add a request timeout to NetworkClient

2015-07-27 Thread Mayuresh Gharat (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mayuresh Gharat updated KAFKA-2120: --- Attachment: KAFKA-2120_2015-07-27_15:31:19.patch > Add a request timeout to NetworkClient > --

[jira] [Commented] (KAFKA-2120) Add a request timeout to NetworkClient

2015-07-27 Thread Mayuresh Gharat (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643533#comment-14643533 ] Mayuresh Gharat commented on KAFKA-2120: Updated reviewboard https://reviews.apach

Re: Review Request 36858: Patch for KAFKA-2120

2015-07-27 Thread Mayuresh Gharat
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36858/ --- (Updated July 27, 2015, 10:32 p.m.) Review request for kafka. Bugs: KAFKA-212

[jira] [Commented] (KAFKA-1690) new java producer needs ssl support as a client

2015-07-27 Thread Sourabh Chandak (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643532#comment-14643532 ] Sourabh Chandak commented on KAFKA-1690: [~sriharsha] Will this patch unblock the

Re: Review Request 36858: Patch for KAFKA-2120

2015-07-27 Thread Mayuresh Gharat
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36858/ --- (Updated July 27, 2015, 10:31 p.m.) Review request for kafka. Bugs: KAFKA-212

Re: [DISCUSS] Partitioning in Kafka

2015-07-27 Thread Gwen Shapira
I guess it depends on whether the original producer did any "map" tasks or simply wrote raw data. We usually advocate writing raw data, and since we need to write it anyway, the partitioner doesn't introduce any extra "hops". It's definitely useful to look at use-cases and I need to think a bit mor

[jira] [Comment Edited] (KAFKA-2350) Add KafkaConsumer pause capability

2015-07-27 Thread Jiangjie Qin (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643418#comment-14643418 ] Jiangjie Qin edited comment on KAFKA-2350 at 7/27/15 9:45 PM: --

[jira] [Commented] (KAFKA-2350) Add KafkaConsumer pause capability

2015-07-27 Thread Jason Gustafson (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643442#comment-14643442 ] Jason Gustafson commented on KAFKA-2350: [~becket_qin] I think that we're on the s

Re: [DISCUSS] Partitioning in Kafka

2015-07-27 Thread Ewen Cheslack-Postava
Gwen - this is really like two steps of map reduce though, right? The first step does the partial shuffle to two partitions per key, second step does partial reduce + final full shuffle, final step does the final reduce. This strikes me as similar to partition assignment strategies in the consumer

[jira] [Commented] (KAFKA-2350) Add KafkaConsumer pause capability

2015-07-27 Thread Jiangjie Qin (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643418#comment-14643418 ] Jiangjie Qin commented on KAFKA-2350: - [~hachikuji], I am with [~guozhang] that it is

Re: Review Request 36858: Patch for KAFKA-2120

2015-07-27 Thread Mayuresh Gharat
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36858/ --- (Updated July 27, 2015, 9:09 p.m.) Review request for kafka. Bugs: KAFKA-2120

[jira] [Updated] (KAFKA-2120) Add a request timeout to NetworkClient

2015-07-27 Thread Mayuresh Gharat (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mayuresh Gharat updated KAFKA-2120: --- Attachment: KAFKA-2120.patch > Add a request timeout to NetworkClient > --

[jira] [Updated] (KAFKA-2120) Add a request timeout to NetworkClient

2015-07-27 Thread Mayuresh Gharat (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mayuresh Gharat updated KAFKA-2120: --- Status: Patch Available (was: Open) > Add a request timeout to NetworkClient > --

Review Request 36858: Patch for KAFKA-2120

2015-07-27 Thread Mayuresh Gharat
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36858/ --- Review request for kafka. Bugs: KAFKA-2120 https://issues.apache.org/jira/b

[jira] [Commented] (KAFKA-2120) Add a request timeout to NetworkClient

2015-07-27 Thread Mayuresh Gharat (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643399#comment-14643399 ] Mayuresh Gharat commented on KAFKA-2120: Created reviewboard https://reviews.apach

[jira] [Commented] (KAFKA-2350) Add KafkaConsumer pause capability

2015-07-27 Thread Jason Gustafson (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643349#comment-14643349 ] Jason Gustafson commented on KAFKA-2350: There's one interesting implementation no

Re: Kafka Consumer thoughts

2015-07-27 Thread Jason Gustafson
I think if we recommend a longer session timeout, then we should expose the heartbeat frequency in configuration since this generally controls how long normal rebalances will take. I think it's currently hard-coded to 3 heartbeats per session timeout. It could also be nice to have an explicit Leave
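The relationship described here between session timeout and heartbeat frequency is simple arithmetic, sketched below. The figure of 3 heartbeats per session timeout is taken from the message, which itself hedges it with "I think"; the function name and the configurable parameter are hypothetical and only illustrate what exposing the frequency might look like.

```python
# Assumption from the message above: 3 heartbeats per session timeout.
HEARTBEATS_PER_SESSION_TIMEOUT = 3


def heartbeat_interval_ms(session_timeout_ms,
                          heartbeats_per_timeout=HEARTBEATS_PER_SESSION_TIMEOUT):
    """Interval between heartbeats for a given session timeout, in milliseconds."""
    return session_timeout_ms // heartbeats_per_timeout
```

With a 30000 ms session timeout the fixed ratio gives a heartbeat every 10000 ms, while a hypothetical setting of 10 heartbeats per timeout would give one every 3000 ms without shortening the session timeout.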

Re: Kafka Consumer thoughts

2015-07-27 Thread Ewen Cheslack-Postava
Kartik, on your second point about timeouts with poll() and heartbeats, the consumer now handles this properly. KAFKA-2123 introduced a DelayedTaskQueue and that is used internally to handle processing events at the right time even if poll() is called with a large timeout. The same mechanism is use

Re: Kafka Consumer thoughts

2015-07-27 Thread Jay Kreps
Hey Kartik, Totally agree we don't want people tuning timeouts in the common case. However there are two ways to avoid this: 1. Default the timeout high 2. Put the heartbeat in a separate thread When we were doing the consumer design we discussed this tradeoff and I think the conclusion we came

Re: Review Request 33620: Patch for KAFKA-1690

2015-07-27 Thread Dong Lin
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/33620/#review93177 --- clients/src/main/java/org/apache/kafka/common/config/AbstractConfig

Re: Kafka Consumer thoughts

2015-07-27 Thread Kartik Paramasivam
adding the open source alias. This email started off as a broader discussion around the new consumer. I was zooming into only the aspect of poll() being the only mechanism for driving the heartbeats. Yes the lag is the effect of the problem (not the problem). Monitoring the lag is important as

Re: [DISCUSS] Partitioning in Kafka

2015-07-27 Thread Gwen Shapira
If you are used to map-reduce patterns, this sounds like a perfectly natural way to process streams of data. Call the first consumer "map-combine-log", the topic "shuffle-log" and the second consumer "reduce-log" :) I like that a lot. It works well for either "embarrassingly parallel" cases, or "s

Re: [DISCUSS] Partitioning in Kafka

2015-07-27 Thread Jason Gustafson
For a little background, the difference between this partitioner and the default one is that it breaks the deterministic mapping from key to partition. Instead, messages for a given key can end up in either of two partitions. This means that the consumer generally won't see all messages for a given
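A minimal sketch of the idea described here, in which a key maps to two candidate partitions rather than one. It is not the partitioner proposed in KAFKA-2092: the class name, the choice of CRC32 and Adler-32 as the two hash functions, and the "pick the less loaded candidate" rule are all assumptions made for illustration.

```python
import zlib
from collections import Counter


class TwoChoicePartitioner:
    """Hypothetical sketch: each key has two candidate partitions."""

    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        # Messages this producer has sent to each partition so far.
        self.sent = [0] * num_partitions

    def candidates(self, key):
        data = key.encode("utf-8")
        first = zlib.crc32(data) % self.num_partitions
        second = zlib.adler32(data) % self.num_partitions
        # Make sure the two candidates differ when more than one partition exists.
        if second == first and self.num_partitions > 1:
            second = (first + 1) % self.num_partitions
        return first, second

    def partition(self, key):
        first, second = self.candidates(key)
        # Send to whichever candidate has received fewer messages so far.
        chosen = first if self.sent[first] <= self.sent[second] else second
        self.sent[chosen] += 1
        return chosen


partitioner = TwoChoicePartitioner(num_partitions=8)
placements = Counter(partitioner.partition("hot-key") for _ in range(1000))
```

In this sketch a single hot key is spread evenly over its two candidate partitions, which is why a consumer reading one partition no longer sees every message for that key.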

[jira] [Commented] (KAFKA-2350) Add KafkaConsumer pause capability

2015-07-27 Thread Guozhang Wang (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643249#comment-14643249 ] Guozhang Wang commented on KAFKA-2350: -- [~becket_qin], I was not considering the impl

Jenkins build is back to normal : KafkaPreCommit #164

2015-07-27 Thread Apache Jenkins Server
See

Re: Review Request 36652: Patch for KAFKA-2351

2015-07-27 Thread Jiangjie Qin
> On July 24, 2015, 4:13 p.m., Jun Rao wrote: > > core/src/main/scala/kafka/network/SocketServer.scala, line 264 > > > > > > Not sure if it's better to keep the thread alive on any throwable. For > > unexpected exce

[jira] [Commented] (KAFKA-2364) Improve documentation for contributing to docs

2015-07-27 Thread Ismael Juma (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643179#comment-14643179 ] Ismael Juma commented on KAFKA-2364: Coincidentally a CONTRIBUTING.md was added today

[jira] [Commented] (KAFKA-2092) New partitioning for better load balancing

2015-07-27 Thread Jason Gustafson (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643164#comment-14643164 ] Jason Gustafson commented on KAFKA-2092: [~azaroth] Haha, I thought we'd get more

[jira] [Commented] (KAFKA-2303) Fix for KAFKA-2235 LogCleaner offset map overflow causes another compaction failures

2015-07-27 Thread Alexander Demidko (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643136#comment-14643136 ] Alexander Demidko commented on KAFKA-2303: -- I think in our case we had too many u

[jira] [Updated] (KAFKA-2349) `contributing` website page should link to "Contributing Code Changes" wiki page

2015-07-27 Thread Guozhang Wang (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guozhang Wang updated KAFKA-2349: - Resolution: Fixed Fix Version/s: 0.8.3 Status: Resolved (was: Patch Available)

[jira] [Commented] (KAFKA-2349) `contributing` website page should link to "Contributing Code Changes" wiki page

2015-07-27 Thread Guozhang Wang (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643129#comment-14643129 ] Guozhang Wang commented on KAFKA-2349: -- Committed to the repo, thanks! > `contributi

Re: New Producer and "acks" configuration

2015-07-27 Thread Ewen Cheslack-Postava
If only we had some sort of system test framework with a producer performance test that we could parameterize with the different acks settings to validate these performance differences... wrt out of order: yes, with > 1 in flight requests with retries, messages can get out of order. Becket had a g
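The out-of-order point can be illustrated with a toy model. This is not Kafka's producer logic: `deliver` is a hypothetical function that sends batches in windows of `max_in_flight`, treats the batches named in `fails_first_attempt` as failing once, and puts failed batches at the front of the queue for retry.

```python
def deliver(batches, max_in_flight, fails_first_attempt):
    """Toy model: return the order in which batches land in the log."""
    log = []
    pending = list(batches)
    remaining_failures = set(fails_first_attempt)
    while pending:
        # Send up to max_in_flight batches without waiting for acknowledgements.
        window, pending = pending[:max_in_flight], pending[max_in_flight:]
        retry = []
        for batch in window:
            if batch in remaining_failures:
                remaining_failures.discard(batch)  # fails once, succeeds on retry
                retry.append(batch)
            else:
                log.append(batch)
        # Failed batches are retried before anything not yet sent.
        pending = retry + pending
    return log
```

In this model a failure of batch A with one request in flight still yields A, B, C, because nothing else is sent until A succeeds. With two requests in flight, B is appended while A is awaiting its retry, so the log order becomes B, A, C.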

[jira] [Commented] (KAFKA-2321) Introduce CONTRIBUTING.md

2015-07-27 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643107#comment-14643107 ] ASF GitHub Bot commented on KAFKA-2321: --- Github user asfgit closed the pull request

[jira] [Updated] (KAFKA-2321) Introduce CONTRIBUTING.md

2015-07-27 Thread Guozhang Wang (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guozhang Wang updated KAFKA-2321: - Resolution: Fixed Fix Version/s: 0.8.3 Status: Resolved (was: Patch Available)

[GitHub] kafka pull request: KAFKA-2321; Introduce CONTRIBUTING.md

2015-07-27 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/kafka/pull/97 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enable

Re: Best practices - Using kafka (with http server) as source-of-truth

2015-07-27 Thread Ewen Cheslack-Postava
Hi Prabhjot, Confluent has a REST proxy with docs that may give some guidance: http://docs.confluent.io/1.0/kafka-rest/docs/intro.html The new producer that it uses is very efficient, so you should be able to get pretty good throughput. You take a bit of a hit due to the overhead of sending data t

[jira] [Commented] (KAFKA-2365) Copycat checklist

2015-07-27 Thread Neha Narkhede (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643087#comment-14643087 ] Neha Narkhede commented on KAFKA-2365: -- Worth discussing a process for including a co

[jira] [Updated] (KAFKA-2350) Add KafkaConsumer pause capability

2015-07-27 Thread Jason Gustafson (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson updated KAFKA-2350: --- Description: There are some use cases in stream processing where it is helpful to be able to

[jira] [Commented] (KAFKA-2260) Allow specifying expected offset on produce

2015-07-27 Thread Mayuresh Gharat (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643065#comment-14643065 ] Mayuresh Gharat commented on KAFKA-2260: I think, when 2 producers are trying to p

[jira] [Updated] (KAFKA-2372) Copycat distributed config storage

2015-07-27 Thread Ewen Cheslack-Postava (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ewen Cheslack-Postava updated KAFKA-2372: - Component/s: copycat > Copycat distributed config storage > --

[jira] [Updated] (KAFKA-2377) Add copycat system tests

2015-07-27 Thread Ewen Cheslack-Postava (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ewen Cheslack-Postava updated KAFKA-2377: - Component/s: copycat > Add copycat system tests > > >

[jira] [Updated] (KAFKA-2378) Add Copycat embedded API

2015-07-27 Thread Ewen Cheslack-Postava (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ewen Cheslack-Postava updated KAFKA-2378: - Component/s: copycat > Add Copycat embedded API > > >

[jira] [Updated] (KAFKA-2379) Add Copycat documentation

2015-07-27 Thread Ewen Cheslack-Postava (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ewen Cheslack-Postava updated KAFKA-2379: - Component/s: copycat > Add Copycat documentation > - > >

[jira] [Updated] (KAFKA-2370) Add pause/unpause connector support

2015-07-27 Thread Ewen Cheslack-Postava (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ewen Cheslack-Postava updated KAFKA-2370: - Component/s: copycat > Add pause/unpause connector support > -

[jira] [Updated] (KAFKA-2374) Implement Copycat log/file connector

2015-07-27 Thread Ewen Cheslack-Postava (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ewen Cheslack-Postava updated KAFKA-2374: - Component/s: copycat > Implement Copycat log/file connector >

[jira] [Updated] (KAFKA-2375) Implement elasticsearch Copycat sink connector

2015-07-27 Thread Ewen Cheslack-Postava (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ewen Cheslack-Postava updated KAFKA-2375: - Component/s: copycat > Implement elasticsearch Copycat sink connector > --

[jira] [Updated] (KAFKA-2369) Add Copycat REST API

2015-07-27 Thread Ewen Cheslack-Postava (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ewen Cheslack-Postava updated KAFKA-2369: - Component/s: copycat > Add Copycat REST API > > >

[jira] [Updated] (KAFKA-2371) Add distributed coordinator implementation for Copycat

2015-07-27 Thread Ewen Cheslack-Postava (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ewen Cheslack-Postava updated KAFKA-2371: - Component/s: copycat > Add distributed coordinator implementation for Copycat > --

[jira] [Updated] (KAFKA-2376) Add Copycat metrics

2015-07-27 Thread Ewen Cheslack-Postava (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ewen Cheslack-Postava updated KAFKA-2376: - Component/s: copycat > Add Copycat metrics > --- > >

[jira] [Updated] (KAFKA-2373) Copycat distributed offset storage

2015-07-27 Thread Ewen Cheslack-Postava (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ewen Cheslack-Postava updated KAFKA-2373: - Component/s: copycat > Copycat distributed offset storage > --

[jira] [Updated] (KAFKA-2366) Initial patch for Copycat

2015-07-27 Thread Ewen Cheslack-Postava (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ewen Cheslack-Postava updated KAFKA-2366: - Component/s: copycat > Initial patch for Copycat > - > >

[jira] [Updated] (KAFKA-2368) Add Copycat standalone CLI

2015-07-27 Thread Ewen Cheslack-Postava (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ewen Cheslack-Postava updated KAFKA-2368: - Component/s: copycat > Add Copycat standalone CLI > -- > >

[jira] [Updated] (KAFKA-2367) Add Copycat runtime data API

2015-07-27 Thread Ewen Cheslack-Postava (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ewen Cheslack-Postava updated KAFKA-2367: - Component/s: copycat > Add Copycat runtime data API >

[jira] [Updated] (KAFKA-2365) Copycat checklist

2015-07-27 Thread Ewen Cheslack-Postava (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ewen Cheslack-Postava updated KAFKA-2365: - Component/s: copycat > Copycat checklist > - > > K

[jira] [Commented] (KAFKA-2365) Copycat checklist

2015-07-27 Thread Ewen Cheslack-Postava (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643036#comment-14643036 ] Ewen Cheslack-Postava commented on KAFKA-2365: -- Thanks, auto assignment works

[jira] [Commented] (KAFKA-2365) Copycat checklist

2015-07-27 Thread Gwen Shapira (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643016#comment-14643016 ] Gwen Shapira commented on KAFKA-2365: - BTW. Two connectors that appeared in the KIP di

[jira] [Commented] (KAFKA-2365) Copycat checklist

2015-07-27 Thread Gwen Shapira (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-2365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643014#comment-14643014 ] Gwen Shapira commented on KAFKA-2365: - I added a component, added you as component lea
