Re: Kafka and Zookeeper node removal from two nodes Kafka cluster

2013-10-15 Thread Jason Rosenberg
Yeah, so it would seem a workaround could be to defer full replica assignment until adequate brokers are available, but in the meantime, allow topic creation to proceed. With respect to Joel's point around the possibility for imbalanced partition assignment if not all replicas are available, this

Re: Handling consumer rebalance when implementing synchronous auto-offset commit

2013-10-15 Thread Jason Rosenberg
Jun, Yes, sorry, I think that was the basis for my question. When auto commit is enabled, special care is taken to make sure things are auto-committed during a rebalance. This is needed because when a topic moves off of a consumer thread (since it is being rebalanced to another one), it's as if

Re: Flush configuration per topic

2013-10-15 Thread Jun Rao
In 0.8, we do have "log.flush.interval.ms.per.topic" (see http://kafka.apache.org/documentation.html#brokerconfigs for details). Thanks, Jun On Tue, Oct 15, 2013 at 5:47 PM, Simon Hørup Eskildsen wrote: > Do you mean that it's possible to override log configurations per topic in > trunk? > > Y
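For reference, a sketch of what such an override might look like in server.properties. The only key taken from the thread is `log.flush.interval.ms.per.topic`; the `topic:value` list syntax is assumed from the 0.8-era per-topic settings and should be checked against the broker config docs for your version:

```properties
# Assumed syntax: comma-separated topic:interval-ms pairs.
# Flush "payments" and "audit" very aggressively; other topics keep defaults.
log.flush.interval.ms.per.topic=payments:1,audit:1
```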

Re: Handling consumer rebalance when implementing synchronous auto-offset commit

2013-10-15 Thread Jun Rao
If auto commit is disabled, the consumer connector won't call commitOffsets during rebalancing. Thanks, Jun On Tue, Oct 15, 2013 at 4:16 PM, Jason Rosenberg wrote: > I'm looking at implementing a synchronous auto offset commit solution. > People have discussed the need for this in previous >

Re: Kafka and Zookeeper node removal from two nodes Kafka cluster

2013-10-15 Thread Jun Rao
When creating a new topic, we require # live brokers to be equal to or larger than # replicas. Without enough brokers, we can't complete the replica assignment, since we can't assign more than 1 replica to the same broker. Thanks, Jun On Tue, Oct 15, 2013 at 1:47 PM, Jason Rosenberg wrote: > Is t
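The constraint Jun describes can be sketched in a few lines. This is illustrative pseudologic, not Kafka's actual assignment code: each replica of a partition must land on a distinct broker, so assignment fails when the replication factor exceeds the number of live brokers.

```python
# Illustrative sketch (not Kafka's actual code) of why topic creation needs
# at least as many live brokers as the replication factor.

def assign_replicas(live_brokers, num_partitions, replication_factor):
    """Round-robin-style assignment; raises if brokers are insufficient."""
    if replication_factor > len(live_brokers):
        raise ValueError(
            "replication factor %d larger than available brokers %d"
            % (replication_factor, len(live_brokers)))
    n = len(live_brokers)
    assignment = {}
    for p in range(num_partitions):
        # Pick replication_factor distinct brokers, rotating the start
        # so leadership spreads across the cluster.
        assignment[p] = [live_brokers[(p + r) % n]
                         for r in range(replication_factor)]
    return assignment
```

With 3 live brokers and replication factor 2, each partition gets two distinct brokers; with 1 live broker and replication factor 2, the assignment is rejected, which mirrors the behavior discussed in this thread.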

Re: partition reassignment

2013-10-15 Thread Kane Kane
I thought if I have all replicas in sync, leader change should be much faster? On Tue, Oct 15, 2013 at 5:12 PM, Joel Koshy wrote: > Depending on how much data there is in those partitions it can take a > while for reassignment to actually complete. You will need to use the > --status-check-json

Re: Flush configuration per topic

2013-10-15 Thread Simon Hørup Eskildsen
Do you mean that it's possible to override log configurations per topic in trunk? Yeah, you're right. :-) I wasn't sure what to call it if not consistency, even though I know that sort of has another meaning in this context. On Tue, Oct 15, 2013 at 6:53 PM, Jay Kreps wrote: > Yeah, looks like

Re: partition reassignment

2013-10-15 Thread Joel Koshy
Depending on how much data there is in those partitions it can take a while for reassignment to actually complete. You will need to use the --status-check-json-file option of the reassign partitions command to determine whether partition reassignment has completed or not. Joel On Tue, Oct 15, 20

Handling consumer rebalance when implementing synchronous auto-offset commit

2013-10-15 Thread Jason Rosenberg
I'm looking at implementing a synchronous auto offset commit solution. People have discussed the need for this in previous threads. Basically, in my consumer loop, I want to make sure a message has been actually processed before allowing its offset to be committed. But I don't want to commit
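The ordering being asked for can be sketched as a process-then-commit loop. The consumer and offset interfaces below are hypothetical stand-ins, not the Kafka 0.8 high-level consumer API; the point is only that the offset is recorded after processing succeeds, never before.

```python
# Sketch of "commit only after processing". InMemoryLog is a toy stand-in
# for a partition plus its committed offset, not a Kafka API.

class InMemoryLog:
    def __init__(self, messages):
        self.messages = list(messages)
        self.committed = 0  # offset of the next unprocessed message

    def poll(self):
        """Return (offset, message) or None when fully consumed."""
        if self.committed < len(self.messages):
            return self.committed, self.messages[self.committed]
        return None

    def commit(self, offset):
        self.committed = offset

def consume_synchronously(log, process):
    while True:
        record = log.poll()
        if record is None:
            break
        offset, message = record
        process(message)        # fully process the message first...
        log.commit(offset + 1)  # ...only then mark its offset as committed
```

If `process` raises partway through, the committed offset still points at the failed message, so it would be redelivered rather than silently skipped, which is the guarantee this thread is after.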

Re: Kafka and Zookeeper node removal from two nodes Kafka cluster

2013-10-15 Thread Joel Koshy
That's a good question. Off the top of my head I don't remember any fundamentally good reason why we don't allow it - apart from: - broker registration paths are ephemeral so topic creation cannot succeed when there are insufficient brokers available - it may be confusing to some users to successfu

Re: Flush configuration per topic

2013-10-15 Thread Jay Kreps
Yeah, looks like you are right, we don't have the per-topic override in 0.8 :-( All log configurations are overridable in trunk which will be 0.8.1. Just to be totally clear this setting does not impact consistency (i.e. all replicas will have the same messages in the same order), nor even durabi

partition reassignment

2013-10-15 Thread Kane Kane
I have 3 brokers and a topic with replication factor of 3. Somehow all partitions ended up being on the same broker. I've created the topic with 3 brokers alive, and they didn't die since then. Even when I try to reassign it: bin/kafka-reassign-partitions.sh --zookeeper 10.80.42.147:2181 --broker-list
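For context, the reassignment tool consumes a JSON file describing the desired replica placement. A sketch of generating one follows; the field names (`partitions`, `topic`, `partition`, `replicas`) are assumed from the 0.8-era tooling and are worth double-checking against the version you run.

```python
import json

# Sketch of building the JSON file passed to kafka-reassign-partitions.sh.
# Field names are assumed from the 0.8-era tool; verify against your version.

def reassignment_json(topic, replicas_by_partition):
    """replicas_by_partition: {partition_id: [broker_id, ...]}"""
    partitions = [
        {"topic": topic, "partition": p, "replicas": replicas}
        for p, replicas in sorted(replicas_by_partition.items())
    ]
    return json.dumps({"partitions": partitions}, indent=2)
```

For Kane's case (3 partitions, replication factor 3, spreading leadership across brokers), `reassignment_json("mytopic", {0: [1, 2, 3], 1: [2, 3, 1], 2: [3, 1, 2]})` would produce a file to hand to the tool.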

Re: Flush configuration per topic

2013-10-15 Thread Simon Hørup Eskildsen
0.8, we're not on master, but we definitely can be. On Tue, Oct 15, 2013 at 5:03 PM, Jay Kreps wrote: > Hey Simon, > > What version of Kafka are you using? > > -Jay > > > On Tue, Oct 15, 2013 at 9:56 AM, Simon Hørup Eskildsen > wrote: > > > Hi Kafkas! > > > > Reading through the documentation a

Re: Flush configuration per topic

2013-10-15 Thread Jay Kreps
Hey Simon, What version of Kafka are you using? -Jay On Tue, Oct 15, 2013 at 9:56 AM, Simon Hørup Eskildsen wrote: > Hi Kafkas! > > Reading through the documentation and code of Kafka, it seems there is no > feature to set flushing interval (messages/time) for a specific topic. I am > interest

Re: KafkaStream bug?

2013-10-15 Thread Joel Koshy
This is probably because KafkaStream is a scala iterable - toString on an iterable. Per the scala-doc: "returns a string representation of this collection. By default this string consists of the stringPrefix of this immutable iterable collection, followed by all elements separated by commas and enc
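The hazard Joel describes can be shown in miniature: rendering a one-shot iterator as a string consumes it, so the elements are gone when you later iterate "for real". KafkaStream itself is Scala; this Python analogy only mirrors the behavior.

```python
# Python analogy of the KafkaStream/toString pitfall: materializing a
# one-shot iterator for display drains it, so nothing is left to consume.

stream = iter(["m1", "m2", "m3"])  # stand-in for a message stream

rendered = ", ".join(stream)  # like toString on a Scala iterable: drains it
remaining = list(stream)      # the stream is now empty
```

The same surprise hits anyone who logs a KafkaStream directly (e.g. via string interpolation), since toString walks the underlying iterator.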

Re: Kafka and Zookeeper node removal from two nodes Kafka cluster

2013-10-15 Thread Jason Rosenberg
Is there a fundamental reason for not allowing creation of new topics while in an under-replicated state? For systems that use automatic topic creation, it seems like losing a node in this case is akin to the cluster being unavailable, if one of the nodes goes down, etc. On Tue, Oct 15, 2013 at

Re: Kafka and Zookeeper node removal from two nodes Kafka cluster

2013-10-15 Thread Joel Koshy
Steve - that's right. I think Monika wanted clarification on what would happen if replication factor is two and only one broker is available. In that case, you won't be able to create new topics with replication factor two (you should see an AdministrationException saying the replication factor is

Re: Kafka and Zookeeper node removal from two nodes Kafka cluster

2013-10-15 Thread Steve Morin
If you have a double broker failure with replication factor of 2 and only have 2 brokers in the cluster, wouldn't every partition be unavailable? On Tue, Oct 15, 2013 at 8:48 AM, Jun Rao wrote: > If you have double broker failures with a replication factor of 2, some > partitions will not be

Re: Kafka and Zookeeper node removal from two nodes Kafka cluster

2013-10-15 Thread Monika Garg
Thanks for replying..:) What if the second broker never comes? On Oct 15, 2013 3:48 PM, "Jun Rao" wrote: > If you have double broker failures with a replication factor of 2, some > partitions will not be available. When one of the brokers comes back, the > partition is made available again (there

Flush configuration per topic

2013-10-15 Thread Simon Hørup Eskildsen
Hi Kafkas! Reading through the documentation and code of Kafka, it seems there is no feature to set flushing interval (messages/time) for a specific topic. I am interested in this to get consistency for certain topics by flushing after every message, while having eventual consistency for other top

Re: Kafka and Zookeeper node removal from two nodes Kafka cluster

2013-10-15 Thread Jun Rao
If you have double broker failures with a replication factor of 2, some partitions will not be available. When one of the brokers comes back, the partition is made available again (there is potential data loss), but in an under replicated mode. After the second broker comes back, it will catch up f

Kafka and Zookeeper node removal from two nodes Kafka cluster

2013-10-15 Thread Monika Garg
I have a 2-node Kafka cluster with default.replication.factor=2 set in the server.properties file. I removed one node: in removing that node, I killed the Kafka process and removed all the kafka-logs and the bundle from that node. Then I stopped my remaining running node in the cluster and started it again (default.

Re: Messages from producer are immediately going to /tmp/logs in kafka

2013-10-15 Thread Monika Garg
Thanks for clearing my doubt, Jay... I was mixing the Kafka log.flush policy with that of the OS, in the sense explained below: I read the below property of Kafka: log.flush.interval.messages - The number of messages written to a log partition before we force an fsync on the log. So I thought