KAFKA-1499 compression.type

2016-01-14 Thread Elias Levy
The description of the compression.type config property in the documentation is somewhat confusing. It begins with "Specify the final compression type for a given topic.", yet it is defined as a broker configuration property and it is not listed under topic-level configuration properties. Reading
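For context, the property in question exists at both the broker and the topic level; a hedged sketch (topic name and codec choice are illustrative):

```properties
# server.properties (broker-level default):
# "producer" retains whatever codec the producer used;
# gzip/snappy/lz4/uncompressed force the broker to recompress.
compression.type=producer
```

A per-topic override can be applied with the topics tool, e.g. `kafka-topics.sh --zookeeper localhost:2181 --alter --topic my-topic --config compression.type=gzip`.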

trouble upgrading from 0.8.2.1 to 0.9.0.0: invalid message

2016-01-14 Thread Dave Peterson
I was trying to upgrade an 0.8.2.1 broker cluster to 0.9.0.0 by following the instructions here: http://kafka.apache.org/documentation.html#upgrade After upgrading one broker, with inter.broker.protocol.version=0.8.2.X set, I get ACK error 2 (InvalidMessage) when I try to send produce request
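The documented rolling upgrade hinges on that one property; a sketch of the intended server.properties states during the procedure (values per the upgrade guide):

```properties
# Step 1: upgrade the broker binaries one at a time, but keep the
# old inter-broker wire format until the whole cluster runs 0.9.0.0.
inter.broker.protocol.version=0.8.2.X

# Step 2 (only after every broker is on 0.9.0.0): bump the protocol
# version and restart the brokers one by one.
# inter.broker.protocol.version=0.9.0.0
```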

Re: Encryption on disk

2016-01-14 Thread Jim Hoagland
We did a proof of concept on end-to-end encryption using an approach which sounds similar to what you describe. We blogged about it here: http://www.symantec.com/connect/blogs/end-end-encryption-though-kafka-our-proof-concept You might want to review what is there to see how it differs from w

Encryption on disk

2016-01-14 Thread Bruno Rassaerts
Hello, In our project we have a very strong requirement to protect all data, all the time. Even when the data is “in-rest” on disk, it needs to be protected. We’ve been trying to figure out how to do this with Kafka, and hit some obstacles. One thing we’ve tried to do is to encrypt every message we
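The per-message approach Bruno describes amounts to encrypting each payload before it ever reaches the broker, so the log segments on disk hold only ciphertext. A toy sketch of that shape in Python (the XOR keystream here is NOT a real cipher; a production version would use a vetted AEAD such as AES-GCM, and the key/nonce handling here is purely illustrative):

```python
import hashlib

def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy SHA-256 counter-mode keystream. Illustration only --
    a real deployment should use a vetted AEAD cipher."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def encrypt(key: bytes, nonce: bytes, plaintext: bytes) -> bytes:
    # XOR the plaintext with the keystream; the same call decrypts.
    return bytes(a ^ b for a, b in zip(plaintext, keystream(key, nonce, len(plaintext))))

decrypt = encrypt  # an XOR stream cipher is its own inverse

# The producer would send `ciphertext` as the message value, so the
# broker's on-disk log never sees the plaintext.
secret = b"account=12345;balance=100"
ciphertext = encrypt(b"k" * 32, b"n" * 12, secret)
```

Consumers holding the key decrypt after fetching; the broker itself never needs the key, which is the main appeal of the end-to-end approach over disk-level encryption.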

Possible WAN Replication Setup

2016-01-14 Thread Jason J. W. Williams
Hello, We historically have been a RabbitMQ environment, but we're looking at using Kafka for a new project and I'm wondering if the following topology/setup would work well in Kafka (for RMQ we'd use federation): * Multiple remote datacenters consisting each of a single server running an HTTP ap

Re: kafka-streams

2016-01-14 Thread Guozhang Wang
We do not have a concrete target date for transactional messaging yet, but we plan to make it this year. Guozhang On Thu, Jan 14, 2016 at 9:02 AM, Tom Dearman wrote: > Thanks. Presumably, if transactions are added to kafka then you get > exactly-once semantics anyway. Any word on transactions

Re: Kafka + ZooKeeper on the same hardware?

2016-01-14 Thread Todd Palino
I’d say it depends on load and usage. It can definitely be done, and we’ve done it here in places, though we don’t anymore. Part of the luxury of being able to get the hardware we want. In general, it’s probably easier to do with 0.9 and Kafka-committed offsets, because the consumers don’t need to

Re: Controlled shutdown not relinquishing leadership of all partitions

2016-01-14 Thread Luke Steensen
I don't have broker logs at the moment, but I'll work on getting some I can share. We are running 0.9.0.0 for both the brokers and producer in this case. I've pasted some bits from the producer log below in case that's helpful. Of particular note is how long it takes for the second disconnect to oc

Re: Kafka + ZooKeeper on the same hardware?

2016-01-14 Thread Gwen Shapira
It depends on load :) As long as there is no contention, you are fine. On Thu, Jan 14, 2016 at 6:06 AM, Erik Forsberg wrote: > Hi! > > Pondering how to configure Kafka clusters and avoid having too many > machines to manage.. Would it be recommended to run say a 3 node kafka > cluster where you

Re: Partition rebalancing after broker removal

2016-01-14 Thread Luke Steensen
No worries, glad to have the functionality! Thanks for your help. Luke On Thu, Jan 14, 2016 at 10:58 AM, Gwen Shapira wrote: > Yep. That tool is not our best documented :( > > On Thu, Jan 14, 2016 at 11:49 AM, Luke Steensen < > luke.steen...@braintreepayments.com> wrote: > > > Is the preferred

Re: kafka-streams

2016-01-14 Thread Tom Dearman
Thanks. Presumably, if transactions are added to kafka then you get exactly-once semantics anyway. Any word on transactions and release date for that? > On 14 Jan 2016, at 16:58, Guozhang Wang wrote: > > Hello Tom, > > There is no specific release date planned yet, but we are shooting for >

Re: kafka-streams

2016-01-14 Thread Guozhang Wang
Hello Tom, There is no specific release date planned yet, but we are shooting for adding kafka-streams in the next major release of Kafka. Regarding exactly-once semantics, the first version of kafka-streams may not yet have this feature implemented but we do have designs and target to add it in

Re: Partition rebalancing after broker removal

2016-01-14 Thread Gwen Shapira
Yep. That tool is not our best documented :( On Thu, Jan 14, 2016 at 11:49 AM, Luke Steensen < luke.steen...@braintreepayments.com> wrote: > Is the preferred leader the first replica in the list passed to the > reassignment tool? I don't see it specifically called out in the json file > format. >

Re: Partition rebalancing after broker removal

2016-01-14 Thread Luke Steensen
Is the preferred leader the first replica in the list passed to the reassignment tool? I don't see it specifically called out in the json file format. On Thu, Jan 14, 2016 at 10:42 AM, Gwen Shapira wrote: > Ah, got it! > > There's no easy way to transfer leadership on command, but you could use

Re: Controlled shutdown not relinquishing leadership of all partitions

2016-01-14 Thread Gwen Shapira
Do you happen to have broker-logs and state-change logs from the controlled shutdown attempt? In theory, the producer should not really see a disconnect - it should get NotALeader exception (because leaders are re-assigned before the shutdown) that will cause it to get the metadata. I am guessing

Re: Partition rebalancing after broker removal

2016-01-14 Thread Gwen Shapira
Ah, got it! There's no easy way to transfer leadership on command, but you could use the reassignment tool to change the preferred leader (and nothing else) and then trigger preferred leader election. Gwen On Thu, Jan 14, 2016 at 11:30 AM, Luke Steensen < luke.steen...@braintreepayments.com> wro
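The "change the preferred leader (and nothing else)" step Gwen describes means listing the desired leader first in each replicas array, since the first replica is the preferred one. An illustrative reassignment JSON (topic and broker ids are made up):

```json
{
  "version": 1,
  "partitions": [
    {"topic": "my-topic", "partition": 0, "replicas": [2, 1]},
    {"topic": "my-topic", "partition": 1, "replicas": [1, 2]}
  ]
}
```

Once the reassignment completes, running `kafka-preferred-replica-election.sh` triggers the preferred leader election she mentions.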

Re: Partition rebalancing after broker removal

2016-01-14 Thread Luke Steensen
Hi Gwen, 1. I sent a message to this list a couple days ago with the subject "Controlled shutdown not relinquishing leadership of all partitions" describing the issue I saw. Sorry there's not a lot of detail on the controlled shutdown part, but I've had trouble reproducing outside of our specific

Re: Partition rebalancing after broker removal

2016-01-14 Thread Gwen Shapira
Hi, 1. If you had problems with controlled shutdown, we need to know. Maybe open a thread to discuss? 2. Controlled shutdown is only used to reduce the downtime involved in large number of leader elections. New leaders will get elected in any case. 3. Controlled (or uncontrolled shutdown) does not

Re: Partition rebalancing after broker removal

2016-01-14 Thread Luke Steensen
Hello, For #3, I assume this relies on controlled shutdown to transfer leadership gracefully? Or is there some way to use partition reassignment to set the preferred leader of each partition? I ask because we've run into some problems relying on controlled shutdown and having a separate verifiable

Re: Partition rebalancing after broker removal

2016-01-14 Thread Gwen Shapira
Hi, There was a Jira to add "remove broker" option to the partition-reassignment tool. I think it died in a long discussion trying to solve a harder problem... To your work-around - it is an acceptable work-around. Few improvements: 1. Manually edit the resulting assignment json to avoid unneces

Re: Kafka Consumer and Topic Partition

2016-01-14 Thread Stephen Powis
I think you need to have unique consumer group ids for each consumer (if you want both consumers to receive ALL msgs) or multiple partitions setup on your topic with your consumers sharing the same consumer group (each consumer would then get ~half of all the messages) On Thu, Jan 14, 2016 at 8:46
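The rule Stephen describes can be sketched as a tiny assignment function (a hypothetical helper, not Kafka's actual assignor code): within one consumer group each partition is owned by exactly one consumer, so a one-partition topic leaves the second group member idle.

```python
def assign(partitions, group_members):
    """Round-robin style assignment within a single consumer group:
    each partition goes to exactly one member; surplus members get nothing."""
    assignment = {member: [] for member in group_members}
    for i, partition in enumerate(partitions):
        assignment[group_members[i % len(group_members)]].append(partition)
    return assignment

# One partition, two consumers in the same group -> one consumer sits idle.
same_group = assign(["topic-0"], ["consumer-a", "consumer-b"])

# Two partitions -> the work is split between the members.
split = assign(["topic-0", "topic-1"], ["consumer-a", "consumer-b"])
```

Giving each consumer a distinct group id is the other option Stephen mentions: independent groups each receive every message rather than sharing the partitions.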

Kafka Consumer and Topic Partition

2016-01-14 Thread Joe San
Kafka Users, I have been trying out a simple consumer example that is supposed to read messages from a specific partition of a topic. I'm not able to get two consumer instances up and running. The second consumer instance is idle! Here is the original post that I created: http://stackoverflow.co

Re: Kafka + ZooKeeper on the same hardware?

2016-01-14 Thread Kamal C
Yes, it can sustain one failure. Misunderstood your question.. On Thu, Jan 14, 2016 at 5:14 PM, Erik Forsberg wrote: > > > On 2016-01-14 12:42, Kamal C wrote: > >> It's a single point of failure. You may lose high-availability. >> > > In this case I would like to protect myself from 1 machine

Re: Kafka + ZooKeeper on the same hardware?

2016-01-14 Thread Erik Forsberg
On 2016-01-14 12:42, Kamal C wrote: It's a single point of failure. You may lose high-availability. In this case I would like to protect myself from 1 machine going down, and my replication factor for Kafka would be 2. So in the case of one machine going down, Zookeeper cluster would still

Re: Kafka + ZooKeeper on the same hardware?

2016-01-14 Thread Kamal C
It's a single point of failure. You may lose high-availability. On Thu, Jan 14, 2016 at 4:36 PM, Erik Forsberg wrote: > Hi! > > Pondering how to configure Kafka clusters and avoid having too many > machines to manage.. Would it be recommended to run say a 3 node kafka > cluster where you also ru

kafka-streams

2016-01-14 Thread Tom Dearman
Can anyone tell me whether the kafka-streams project will resolve the issue noted in KIP-28: “After processor writes to a store instance, it first sends the change message to its corresponding changelog topic partition. When user calls commit() in his processor, KStream needs to flush both t
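The ordering Tom quotes from KIP-28 (store write, then changelog record, then flush both on commit) can be modeled in a few lines; the names here are hypothetical, not the real kafka-streams API:

```python
class ChangelogStore:
    """Toy model of KIP-28's commit ordering: every store write also
    queues a changelog record, and commit() flushes the changelog
    before the input offset is committed."""

    def __init__(self):
        self.store = {}           # local state store
        self.pending = []         # changelog records not yet flushed
        self.changelog = []       # stands in for the changelog topic
        self.committed_offset = -1
        self.last_offset = -1

    def put(self, key, value, offset):
        self.store[key] = value
        self.pending.append((key, value))
        self.last_offset = offset

    def commit(self):
        # Flush the changelog first, then commit the consumed offset,
        # so a crash can never leave a committed offset whose state
        # change was lost.
        self.changelog.extend(self.pending)
        self.pending.clear()
        self.committed_offset = self.last_offset
```

The point of the flush-before-commit order is exactly the failure window the KIP discusses: until transactional messaging lands, duplicates after a crash are possible, but state loss ahead of a committed offset is not.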

Kafka + ZooKeeper on the same hardware?

2016-01-14 Thread Erik Forsberg
Hi! Pondering how to configure Kafka clusters and avoid having too many machines to manage.. Would it be recommended to run say a 3 node kafka cluster where you also run your 3 node zookeeper cluster on the same machines? I guess the answer is that "it depends on load", but would be interest

Re: How to chose the size of a Kafka broker

2016-01-14 Thread Jens Rantil
Hi Vladoiu, I am by no means a Kafka expert, but what are you optimizing for? - Cost could be a variable. - Time to bring on a new broker could be another variable. For large machines that could take longer since they need to stream more data. Cheers, Jens On Wed, Jan 13, 2016 at 1:09