What is the data format of Kafka's data nodes in ZooKeeper?

2015-04-07 Thread Andrei
I'm trying to read data from ZooKeeper nodes that were written by different Kafka components. As a specific example (just one of many), I'm trying to read the current offset for a specific group, topic and partition. As far as I understand, it is stored under the path

Increasing replication factor of existing topics

2015-04-07 Thread Navneet Gupta (Tech - BLR)
Hi, I found a method to increase the replication factor of topics here: https://kafka.apache.org/081/ops.html However, I was wondering whether it's possible to do it by altering some nodes in ZooKeeper. Thoughts/suggestions welcome. -- Thanks Regards, Navneet Gupta

Re: Increasing replication factor of existing topics

2015-04-07 Thread Todd Palino
The partition reassignment is started by writing a ZooKeeper node in the admin tree. While it's possible to kick off the partition reassignment by directly writing the ZooKeeper node that controls it, you have to be very careful about doing this, making sure that the format is perfect and you
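Since the format has to be exact, it may be safer to generate the reassignment JSON programmatically and hand it to the reassignment tool described on the ops page linked earlier in this thread, rather than writing the ZooKeeper node by hand. Below is a minimal sketch in Python. The function name and the sample topic, partitions and broker ids are made up for illustration, and the JSON layout (`version`, `partitions`, `topic`, `partition`, `replicas`) is what I recall from the Kafka ops documentation, so please verify it against that page for your Kafka version.

```python
import json


def build_reassignment(topic, current_assignment, brokers, new_rf):
    """Build a reassignment plan that grows each partition to new_rf replicas.

    current_assignment maps partition number -> list of broker ids.
    Hypothetical helper; the JSON layout is an assumption to verify
    against the Kafka ops documentation.
    """
    partitions = []
    for partition, replicas in sorted(current_assignment.items()):
        new_replicas = list(replicas)  # keep the existing replicas first
        candidates = [b for b in brokers if b not in new_replicas]
        if candidates:
            # Rotate the candidates by partition number so the added
            # replicas do not all land on the same broker.
            shift = partition % len(candidates)
            candidates = candidates[shift:] + candidates[:shift]
        needed = max(new_rf - len(new_replicas), 0)
        if needed > len(candidates):
            raise ValueError("not enough brokers for replication factor %d" % new_rf)
        new_replicas.extend(candidates[:needed])
        partitions.append(
            {"topic": topic, "partition": partition, "replicas": new_replicas}
        )
    return {"version": 1, "partitions": partitions}


if __name__ == "__main__":
    # Made-up example: topic "foo" with two single-replica partitions.
    plan = build_reassignment("foo", {0: [1], 1: [2]}, brokers=[1, 2, 3], new_rf=2)
    print(json.dumps(plan, indent=2))
```

The printed JSON can be saved to a file and reviewed before it is passed to the tool, which keeps a human check between the generated plan and the cluster.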

Empty topic metadata returned during/shortly after server startup

2015-04-07 Thread David Corley
Hey all, We're trying to write some integration tests around a Ruby-based Kafka client we're developing that leverages both the poseidon and poseidon_cluster gems. We're running Kafka 0.8.0 in a single-node config with a single ZK instance supporting it on the same machine. The basic test is as

Re: What is the data format of Kafka's data nodes in ZooKeeper?

2015-04-07 Thread Patrick Dignan
You need to create the ZkClient with kafka.utils.ZKStringSerializer as the serializer. On Tue, Apr 7, 2015 at 9:40 AM, Andrei faithlessfri...@gmail.com wrote: I'm trying to read data from ZooKeeper nodes that were written by different Kafka components. As a specific example (just one from a

Re: What is the data format of Kafka's data nodes in ZooKeeper?

2015-04-07 Thread Andrei
Thanks a lot, ZKStringSerializer works like a charm! For those googling the same question, here's a gist that instantiates ZkClient and sets the proper serializer: https://gist.github.com/jjkoshy/3842975 On Tue, Apr 7, 2015 at 6:49 PM,

Re: Increasing replication factor of existing topics

2015-04-07 Thread Harsha
Hi Navneet, Is there any reason you are looking to modify the ZK nodes directly to increase the topic partitions? If you are looking for an API to do this, there is AdminUtils.addPartitions. -- Harsha On April 7, 2015 at 6:45:40 AM, Navneet Gupta (Tech - BLR)

Re: New kafka client for Go (golang)

2015-04-07 Thread Piotr Husiatyński
Sorry if this mail is not sent properly, but I have no idea how to reply to a mailing list from Gmail without having the original message in my mailbox. How does it compare to Sarama? There are several important differences. It's been about two months since I looked at the Sarama API and I know that

Re: Kafka question

2015-04-07 Thread Guozhang Wang
Jack, Okay I see your point now. I was originally thinking that in each run, you 1) first create the topic, 2) start producing to the topic, 3) start consuming from the topic, and then 4) delete the topic, stop producers / consumers before complete, but it sounds like you actually only create the

Re: Consumer Group Lag Reporting

2015-04-07 Thread Kyle Banker
Thanks, Otis. I actually already have a reporting and alerting infrastructure. I mainly wanted to confirm that parsing the output of the offset checker is the recommended practice for reporting consumer group offsets. Is this the case? If so, I wanted to find out if any work is under way to make

Re: What is the data format of Kafka's data nodes in ZooKeeper?

2015-04-07 Thread Guozhang Wang
Andrei, Kafka uses string serialization when writing data to ZK; you can find its implementation in kafka.utils.ZKStringSerializer. Guozhang On Tue, Apr 7, 2015 at 6:40 AM, Andrei faithlessfri...@gmail.com wrote: I'm trying to read data from ZooKeeper nodes that were written by different
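For readers using a client outside the JVM, string serialization here means, as far as I know, that the node payload is plain UTF-8 text. The sketch below shows the equivalent encode/decode step in Python. The function names are hypothetical, the sample bytes are invented, and it assumes the raw bytes have already been fetched with whatever ZooKeeper client you use. It also assumes an offset node holds a decimal number as text, which is worth checking against your own cluster.

```python
def deserialize_znode(data):
    """Decode raw znode bytes as UTF-8 text; a missing payload stays None."""
    if data is None:
        return None
    return data.decode("utf-8")


def serialize_znode(value):
    """Encode a string as UTF-8 bytes before writing it to a znode."""
    return value.encode("utf-8")


if __name__ == "__main__":
    raw = b"42"  # invented sample payload, not taken from a real cluster
    offset = int(deserialize_znode(raw))  # assumes the node stores a decimal string
    print(offset)
```

Reading the same bytes with a binary or Java-object serializer is the usual reason the data looks garbled, which is why setting the string serializer on the client fixes it.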

Re: Problem with node after restart no partitions?

2015-04-07 Thread Jason Rosenberg
Thunder, thanks for the detailed info. I can confirm that our incident had the same (or similar) sequence of messages when the first upgraded broker restarted (after having undergone an unclean shutdown). I think it makes sense at this point to file a JIRA issue to track it. (Could mostly

Re: Number of Partitions and Performance

2015-04-07 Thread Jay Kreps
I think the blog post was giving that as an upper bound, not a recommended size. I think that blog goes through some of the trade-offs of having more or fewer partitions. -Jay On Tue, Apr 7, 2015 at 10:13 AM, François Méthot fmetho...@gmail.com wrote: Hi, We initially had configured our

Re: Number of Partitions and Performance

2015-04-07 Thread Todd Palino
Going to stand with Jay here :) I just posted an email yesterday about how we size clusters and topics. Basically, have at least as many partitions as you have consumers in your consumer group (preferably a multiple). If you want to balance it across the cluster, also have it be a multiple of the

Number of Partitions and Performance

2015-04-07 Thread François Méthot
Hi, We initially had configured our topics to have between 8 and 16 partitions each on a cluster of 10 brokers (VMs with 2 cores, 16 MB RAM, a few TB of SAN disk). Then I came across the rule of thumb formula *100 x b x r.* (

Re: Kafka question

2015-04-07 Thread Jack
That would be really useful. Thanks for your writing, Guozhang. I will give it a shot and let you know. On Tue, Apr 7, 2015 at 10:06 AM, Guozhang Wang wangg...@gmail.com wrote: Jack, Okay I see your point now. I was originally thinking that in each run, you 1) first create the topic, 2)

Re: Kafka server relocation

2015-04-07 Thread nitin sharma
Hi, sorry for the late response. I have been able to fix the issue; the problem was in my approach. I got confused between my source and target systems while defining the consumer and producer property files. It is fixed now. Now a new issue: the rate at which data is migrated is very, very slow... I mean

Re: Number of Partitions and Performance

2015-04-07 Thread François Méthot
Thanks guys for the clarification about the rule of thumb formula. I will stick with a reasonably small set of partitions but add a few to make them a multiple of the number of brokers. Todd, I read your post yesterday as well, very helpful. On Tue, Apr 7, 2015 at 1:42 PM, Todd Palino
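The sizing advice in this thread (at least as many partitions as consumers in the group, preferably a multiple, and a multiple of the number of brokers if you want the load balanced across the cluster) can be turned into a small calculation. This is only a sketch: the function name and the `minimum` parameter are my own, and requiring a common multiple of both numbers is one strict reading of the advice that may suggest more partitions than you actually want.

```python
from math import gcd


def suggest_partition_count(consumers, brokers, minimum=1):
    """Smallest count >= minimum that is a multiple of both inputs.

    Hypothetical helper: treats the thread's advice as "use a common
    multiple of the consumer count and the broker count".
    """
    if consumers < 1 or brokers < 1:
        raise ValueError("consumers and brokers must be positive")
    step = consumers * brokers // gcd(consumers, brokers)  # least common multiple
    multiples = -(-max(minimum, 1) // step)  # ceiling division
    return step * multiples


if __name__ == "__main__":
    # Made-up numbers: 4 consumers in the group, 10 brokers, at least 16 partitions.
    print(suggest_partition_count(consumers=4, brokers=10, minimum=16))
```

When the two counts share few factors the common multiple grows quickly, so in practice it may be more reasonable to satisfy only the broker multiple and accept a slightly uneven spread across consumers.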

Re: Is there a complete Kafka 0.8.* replication design document

2015-04-07 Thread Jun Rao
Yes, the wiki is a bit old. You can find out more about replication in the following links. http://kafka.apache.org/documentation.html#replication http://www.slideshare.net/junrao/kafka-replication-apachecon2013 #1, #2, #8. See the ZK layout in

Request for adding us to the Powered By list

2015-04-07 Thread Anuj Goyal
Dear Kafka team, Could you add us at https://cwiki.apache.org/confluence/display/KAFKA/Powered+By? Here is the blurb: *IFTTT (http://www.ifttt.com/) - We use Kafka to ingest real-time log and tracking data for analytics, dashboards, and machine learning.*

Re: question about Kafka

2015-04-07 Thread Jiangjie Qin
Yes, you might need to write some code to read from the log end and send it to Kafka using Kafka’s producer. On 4/6/15, 2:39 PM, Sun, Joey joey@emc.com wrote: Thanks for your info, Becket. Does it mean I should write a program for it? Is there any other app that can gracefully glue access_log to Kafka's

Re: Kafka server relocation

2015-04-07 Thread tao xiao
You may need to look into the consumer metrics and producer metrics to identify the root cause. Metrics in the kafka.consumer and kafka.producer categories will help you find the problems. This link gives instructions on how to read the metrics: http://kafka.apache.org/documentation.html#monitoring