How many partition can one single machine handle in Kafka?

2014-10-21 Thread Xiaobin She
Hello everyone, I'm new to Kafka. I'm wondering what's the max number of partitions a single machine can handle in Kafka? Is there a suggested number? Thanks. xiaobinshe

Re: taking broker down and returning it does not restore cluster state (nor rebalance)

2014-10-21 Thread Shlomi Hazan
Trying to reproduce failed: after several long minutes I noticed that the partition leaders had regained balance, and the only issue left is that the preferred replica was not balanced as it was before taking the broker down. That is, the output of the topic description shows broker 1 (out

Re: How to produce and consume events in 2 DCs?

2014-10-21 Thread Erik van oosten
Thanks Neha, Unfortunately, the maintenance overhead of 2 more clusters is not acceptable to us. Would you accept a pull request on mirror maker that would rename topics on the fly? For example by accepting the parameter rename: --rename src1/dest1,src2/dest2 or, extended with RE support:
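To make the proposal concrete, here is a rough sketch of how a `src1/dest1,src2/dest2` rename spec could be parsed and applied. This is only an illustration of the idea in the message above: `TopicRenameSketch`, `parse` and `rename` are hypothetical names, and nothing here is an existing MirrorMaker option or Kafka API. The regular-expression variant is not covered.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: not part of MirrorMaker or any Kafka API.
final class TopicRenameSketch {

    // Parses "src1/dest1,src2/dest2" into a source -> destination map.
    static Map<String, String> parse(String spec) {
        Map<String, String> renames = new LinkedHashMap<>();
        for (String pair : spec.split(",")) {
            String[] parts = pair.trim().split("/", 2);
            if (parts.length != 2 || parts[0].isEmpty() || parts[1].isEmpty()) {
                throw new IllegalArgumentException("Expected src/dest but got: " + pair);
            }
            renames.put(parts[0], parts[1]);
        }
        return renames;
    }

    // Returns the destination topic, or the topic unchanged when no rule matches.
    static String rename(Map<String, String> renames, String topic) {
        return renames.getOrDefault(topic, topic);
    }

    public static void main(String[] args) {
        Map<String, String> renames = parse("src1/dest1,src2/dest2");
        System.out.println(rename(renames, "src1"));
        System.out.println(rename(renames, "other"));
    }
}
```

Topics without a matching rule pass through unchanged, which would keep the default mirroring behaviour for everything not listed.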

Clean Kafka Queue

2014-10-21 Thread Eduardo Costa Alfaia
Hi guys, is there a way to clean a Kafka queue after the consumer has consumed the messages? Thanks -- Privacy notice: http://www.unibs.it/node/8155

Re: Sending Same Message to Two Topics on Same Broker Cluster

2014-10-21 Thread Neha Narkhede
I'm not sure I understood your concern about invoking send() twice, once with each topic. Are you worried about the network overhead? Whether Kafka does this transparently or not, sending messages to different topics will carry some overhead. I think the design of the API is much more intuitive

Re: Performance issues

2014-10-21 Thread Mohit Anchlia
I have a Java test that produces messages which a consumer then consumes. Consumers are active all the time. There is 1 consumer for 1 producer. I am measuring the time between the message being successfully written to the queue and the consumer picking it up. On Tue, Oct 21, 2014 at 8:32 AM, Neha

Re: Clean Kafka Queue

2014-10-21 Thread Harsha
You can use log.retention.hours or log.retention.bytes to prune the log. More info on those configs is here: https://kafka.apache.org/08/configuration.html If you want to delete a message after the consumer has processed it, there is no API for that. -Harsha On Tue, Oct 21, 2014, at 08:00 AM, Eduardo

Re: Sending Same Message to Two Topics on Same Broker Cluster

2014-10-21 Thread Bhavesh Mistry
Hi Neha, All, What I am saying is that if the same byte[] or data has to go to two topics, then I have to call send twice and the same data has to be transferred over the wire twice (assuming the partitions for the two topics are on the same broker, this is not efficient). If Kafka Protocol allows to set multiple

Re: Clean Kafka Queue

2014-10-21 Thread Joe Stein
The concept of truncate topic comes up a lot. I will add it as an item to https://issues.apache.org/jira/browse/KAFKA-1694 It is a scary feature though, it might be best to wait until authorizations are in place before we release it. With 0.8.2 you can delete topics so at least you can start

Re: Clean Kafka Queue

2014-10-21 Thread Eduardo Costa Alfaia
Ok guys, thanks for the help. Regards On Oct 21, 2014, at 18:30, Joe Stein joe.st...@stealth.ly wrote: The concept of truncate topic comes up a lot. I will add it as an item to https://issues.apache.org/jira/browse/KAFKA-1694 It is a scary feature though, it might be best to wait until

Re: Sending Same Message to Two Topics on Same Broker Cluster

2014-10-21 Thread Jay Kreps
Hey Bhavesh, This would only work if both topics happened to be on the same machine, which generally they wouldn't. -Jay On Tue, Oct 21, 2014 at 9:14 AM, Bhavesh Mistry mistry.p.bhav...@gmail.com wrote: Hi Neha, All, What I am saying is that if the same byte[] or data has to go to two topics, then

Re: Performance issues

2014-10-21 Thread Mohit Anchlia
This is the version I am using: kafka_2.10-0.8.1.1. I think this is a fairly recent version. On Tue, Oct 21, 2014 at 10:57 AM, Jay Kreps jay.kr...@gmail.com wrote: What version of Kafka is this? Can you try the same test against trunk? We fixed a couple of latency related bugs which may be the

Sizing Cluster

2014-10-21 Thread Pete Wright
Hi There, I have a question regarding sizing disk for kafka brokers. Let's say I have systems capable of providing 10TB of storage, and they act as Kafka brokers. If I were to deploy two of these nodes, and enable replication in Kafka, would I actually have 10TB available for my

0.8.1.2

2014-10-21 Thread Shlomi Hazan
Hi All, Will version 0.8.1.2 happen? Shlomi

Re: Sizing Cluster

2014-10-21 Thread István
One thing that you have to keep in mind is that moving 10T between nodes takes a long time. If you have a node failure and you need to rebuild (resync) the data, your system is going to be vulnerable to a second node failure. You could mitigate this by using RAID. I think generally speaking

Re: Performance issues

2014-10-21 Thread Jay Kreps
There was a bug that could lead to the fetch request from the consumer hitting its timeout instead of being immediately triggered by the produce request. To see if you are affected by that, set your consumer max wait time to 1 ms and see if the latency drops to 1 ms (or, alternately, try with trunk

Re: Performance issues

2014-10-21 Thread Mohit Anchlia
Is this a parameter I need to set on the Kafka server or on the client side? Also, can you help point out which one exactly is the consumer max wait time in this list? https://kafka.apache.org/08/configuration.html On Tue, Oct 21, 2014 at 11:35 AM, Jay Kreps jay.kr...@gmail.com wrote: There was a

frequent periods of ~1500 replicas not in sync

2014-10-21 Thread Neil Harkins
Hi. I've got a 5 node cluster running Kafka 0.8.1, with 4697 partitions (2 replicas each) across 564 topics. I'm sending it about 1% of our total messaging load now, and several times a day there is a period where 1~1500 partitions have one replica not in sync. Is this normal? If a consumer is

Re: frequent periods of ~1500 replicas not in sync

2014-10-21 Thread Gwen Shapira
Consumers always read from the leader replica, which is always in sync by definition. So you are good there. The concern would be if the leader crashes during this period. On Tue, Oct 21, 2014 at 2:56 PM, Neil Harkins nhark...@gmail.com wrote: Hi. I've got a 5 node cluster running Kafka 0.8.1,

Re: frequent periods of ~1500 replicas not in sync

2014-10-21 Thread Guozhang Wang
Neil, what you are seeing could probably be KAFKA-1407 https://issues.apache.org/jira/browse/KAFKA-1407. On Tue, Oct 21, 2014 at 12:03 PM, Gwen Shapira gshap...@cloudera.com wrote: Consumers always read from the leader replica, which is always in sync by definition. So you are good there. The

Re: Performance issues

2014-10-21 Thread Guozhang Wang
This is a consumer config: fetch.wait.max.ms On Tue, Oct 21, 2014 at 11:39 AM, Mohit Anchlia mohitanch...@gmail.com wrote: Is this a parameter I need to set on the Kafka server or on the client side? Also, can you help point out which one exactly is the consumer max wait time in this list?

Re: Sizing Cluster

2014-10-21 Thread Pete Wright
Thanks Istvan - I think I understand what you are saying here - although I was under the impression that if I ensured each topic was being replicated N+1 times a two node cluster would ensure each node has a copy of the entire contents of the message bus at any given time. I agree with your

Re: How many partition can one single machine handle in Kafka?

2014-10-21 Thread Guozhang Wang
Xiaobin, This FAQ may give you some hints: https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-HowdoIchoosethenumberofpartitionsforatopic ? On Tue, Oct 21, 2014 at 12:15 AM, Xiaobin She xiaobin...@gmail.com wrote: hello, everyone I'm new to kafka, I'm wondering what's the max num of

Re: Strange behavior during un-clean leader election

2014-10-21 Thread Guozhang Wang
Bryan, Did you take down some brokers in your cluster while hitting KAFKA-1028? If yes, you may be hitting KAFKA-1647 also. Guozhang On Mon, Oct 20, 2014 at 1:18 PM, Bryan Baugher bjb...@gmail.com wrote: Hi everyone, We run a 3 Kafka cluster using 0.8.1.1 with all topics having a

Partition and Replica assignment for a Topic

2014-10-21 Thread Jonathan Creasy
I'd like to be able to see a little more detail for a topic. What is the best way to get this information?
Topic   Partition  Replica  Broker
topic1  1          1        3
topic1  1          2        4
topic1  1          3        1
topic1  2          1        1
topic1  2

Re: Partition and Replica assignment for a Topic

2014-10-21 Thread Gwen Shapira
Anything missing in the output of: kafka-topics.sh --describe --zookeeper localhost:2181 ? On Tue, Oct 21, 2014 at 4:29 PM, Jonathan Creasy jonathan.cre...@turn.com wrote: I'd like to be able to see a little more detail for a topic. What is the best way to get this information? Topic

Re: Strange behavior during un-clean leader election

2014-10-21 Thread Bryan Baugher
Yes the cluster was to a degree restarted in a rolling fashion but due to some other events causing the brokers to be rather confused the ISR for a number of partitions became empty and a new controller was elected. KAFKA-1647 sounds exactly like the problem I encountered. Thank you. On Tue, Oct

Re: [DISCUSS] Release 0.8.2-beta before 0.8.2?

2014-10-21 Thread Olson,Andrew
https://issues.apache.org/jira/browse/KAFKA-1647 sounds serious enough to include in 0.8.2-beta if possible.

Re: How many partition can one single machine handle in Kafka?

2014-10-21 Thread Todd Palino
As far as the number of partitions a single broker can handle, we've set our cap at 4000 partitions (including replicas). Above that we've seen some performance and stability issues. -Todd On Tue, Oct 21, 2014 at 12:15 AM, Xiaobin She xiaobin...@gmail.com wrote: hello, everyone I'm new to
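As a rough way to compare a cluster against a cap like this, one can estimate the average number of partition replicas per broker. The sketch below assumes replicas are spread evenly across brokers (an assumption, not something Kafka guarantees) and plugs in the figures Neil posted in the "replicas not in sync" thread: 4697 partitions, 2 replicas each, 5 brokers. `PartitionLoadSketch` and its members are hypothetical names.

```java
// Hypothetical back-of-the-envelope helper; not a Kafka API.
final class PartitionLoadSketch {

    // Cap quoted in the message above (partitions per broker, including replicas).
    static final int CAP_PER_BROKER = 4000;

    // Average partition replicas per broker, rounded up, assuming an even spread.
    static int replicasPerBroker(int partitions, int replicationFactor, int brokers) {
        long total = (long) partitions * replicationFactor;
        return (int) ((total + brokers - 1) / brokers);
    }

    public static void main(String[] args) {
        int perBroker = replicasPerBroker(4697, 2, 5);
        System.out.println(perBroker + " replicas per broker, cap " + CAP_PER_BROKER);
    }
}
```

With those numbers the estimate is 9394 replicas over 5 brokers, about 1879 per broker, which sits below the 4000 figure.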

Re: Performance issues

2014-10-21 Thread Mohit Anchlia
I set the property to 1 in the consumer properties that are passed to createJavaConsumerConnector, but it didn't seem to help: props.put("fetch.wait.max.ms", fetchMaxWait); On Tue, Oct 21, 2014 at 1:21 PM, Guozhang Wang wangg...@gmail.com wrote: This is a consumer config: fetch.wait.max.ms On
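For reference, a minimal sketch of building the consumer properties with `fetch.wait.max.ms` set. It assumes the 0.8 high-level consumer and its `zookeeper.connect`, `group.id` and `fetch.wait.max.ms` keys; the host, group name and the `ConsumerPropsSketch` class are placeholders. The value is stored as a String because `java.util.Properties.getProperty` returns null for non-String values. Whether that is related to the behaviour reported above is not established in this thread.

```java
import java.util.Properties;

// Hypothetical helper that only builds the Properties object; no Kafka classes are used.
final class ConsumerPropsSketch {

    static Properties build(String zookeeper, String groupId, int fetchMaxWaitMs) {
        Properties props = new Properties();
        props.put("zookeeper.connect", zookeeper);
        props.put("group.id", groupId);
        // Store the number as a String so getProperty() can read it back.
        props.put("fetch.wait.max.ms", Integer.toString(fetchMaxWaitMs));
        return props;
    }

    public static void main(String[] args) {
        Properties props = build("localhost:2181", "groupA", 1);
        System.out.println(props.getProperty("fetch.wait.max.ms"));
        // With the Kafka 0.8 client on the classpath, these properties would be passed to
        // kafka.consumer.Consumer.createJavaConsumerConnector(new ConsumerConfig(props)).
    }
}
```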

Re: [DISCUSS] Release 0.8.2-beta before 0.8.2?

2014-10-21 Thread Joe Stein
It doesn't look like a showstopper (all replicas for a partition going down is rare and bigger issue if it happens) but it is good for folks to know about it going in, definitely! In either case I changed the fix version for that ticket to 0.8.2 so it shows up now it is a blocker for final I

Re: Partition and Replica assignment for a Topic

2014-10-21 Thread Jonathan Creasy
Heh, I think I was mis-interpreting that output. Taking this output for example:
Topic:REPL-atl1-us PartitionCount:256 ReplicationFactor:1 Configs:
    Topic: REPL-atl1-us Partition: 0 Leader: 32 Replicas: 32 Isr: 32
    Topic: REPL-atl1-us Partition: 1

Re: Performance issues

2014-10-21 Thread Mohit Anchlia
Most of the consumer threads seem to be waiting:
ConsumerFetcherThread-groupA_ip-10-38-19-230-1413925671158-3cc3e22f-0-0 prio=10 tid=0x7f0aa84db800 nid=0x5be9 runnable [0x7f0a5a618000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)

Re: How many partition can one single machine handle in Kafka?

2014-10-21 Thread Neil Harkins
On Tue, Oct 21, 2014 at 2:10 PM, Todd Palino tpal...@gmail.com wrote: As far as the number of partitions a single broker can handle, we've set our cap at 4000 partitions (including replicas). Above that we've seen some performance and stability issues. How many brokers? I'm curious: what kinds

Re: How to produce and consume events in 2 DCs?

2014-10-21 Thread Steven Wu
I think it doesn't have to be two more clusters. can be just two more topics. MirrorMaker can copy from source topics in both regions into one aggregate topic. On Tue, Oct 21, 2014 at 1:54 AM, Erik van oosten e.vanoos...@grons.nl.invalid wrote: Thanks Neha, Unfortunately, the maintenance

Re: taking broker down and returning it does not restore cluster state (nor rebalance)

2014-10-21 Thread Jun Rao
To balance the leaders, you can run the tool in http://kafka.apache.org/documentation.html#basic_ops_leader_balancing In the upcoming 0.8.2 release, we have fixed the auto leader balancing logic. So leaders will be balanced automatically. Thanks, Jun On Tue, Oct 21, 2014 at 12:19 AM, Shlomi

Re: 0.8.1.2

2014-10-21 Thread Jun Rao
We are voting on a 0.8.2 beta release right now. Thanks, Jun On Tue, Oct 21, 2014 at 11:17 AM, Shlomi Hazan shl...@viber.com wrote: Hi All, Will version 0.8.1.2 happen? Shlomi

Re: How many partition can one single machine handle in Kafka?

2014-10-21 Thread Xiaobin She
Todd, Actually I'm wondering how Kafka handles so many partitions. With one partition there is at least one file on disk, so with 4000 partitions there will be at least 4000 files. When all these partitions have write requests, how does Kafka make the write operations on the disk sequential

Re: Sizing Cluster

2014-10-21 Thread István
Hi Pete, Yes you are right, both nodes have all of the data. I was just wondering what the scenario is for losing one node; in production it might not fly. If this is for testing only, you are good. Answering your question, I think retention policy (log.retention.hours) is for controlling the
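The arithmetic behind "both nodes have all of the data" can be sketched as follows. This is a simplified estimate under stated assumptions: every topic uses the same replication factor, data is spread evenly, and no headroom is reserved for retention, the OS or rebuilds. `ReplicatedCapacitySketch` and `usableTb` are hypothetical names.

```java
// Hypothetical sizing helper; not a Kafka API.
final class ReplicatedCapacitySketch {

    // Usable capacity in TB when every message is stored replicationFactor times.
    static double usableTb(int brokers, double tbPerBroker, int replicationFactor) {
        if (replicationFactor < 1 || replicationFactor > brokers) {
            throw new IllegalArgumentException("replicationFactor must be between 1 and brokers");
        }
        return brokers * tbPerBroker / replicationFactor;
    }

    public static void main(String[] args) {
        // Two 10 TB brokers with every partition replicated on both nodes.
        System.out.println(usableTb(2, 10.0, 2) + " TB usable");
    }
}
```

For the two-node case in this thread, 2 x 10 TB of raw disk with a replication factor of 2 works out to 10 TB of unique data.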