Re: Replacing brokers in a cluster (0.8)

2013-07-22 Thread Glenn Nethercutt
This seems like the type of behavior I'd ultimately want from the 
controlled shutdown tool.


Currently, I believe ShutdownBroker causes new leaders to be 
selected for any partition the dying node was leading, but I don't think 
it explicitly forces a rebalance for topics in which the dying node was 
just a member of the ISR (in-sync replica set). Ostensibly, leadership 
elections are what we want to avoid, due to the ZooKeeper chattiness 
that would ensue for ensembles with lots of partitions, but I'd wager 
we'd benefit from a reduction in rebalances too.  The preferred 
replica election tool also seems to offer a similar level of 
control (manual selection of the preferred replicas), but still doesn't 
let you add/remove brokers from the ISR directly.  I know the 
kafka-reassign-partitions tool lets you specify a full list of 
partitions and replica assignments, but I don't know how easily that 
will integrate with the lifecycle you described.
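
For concreteness, the reassignment flow I'm picturing looks roughly 
like this. The broker ids and topic name are made up, and the flag 
names are from memory and vary between 0.8 point releases, so check 
the tool's --help output first:

```shell
# Rough sketch of a manual reassignment (topic "events" and broker ids
# 3,4 are made up for illustration).
cat > reassign.json <<'EOF'
{"version": 1,
 "partitions": [{"topic": "events", "partition": 0, "replicas": [3, 4]}]}
EOF

# On a broker host, kick it off and then poll until it reports completion:
#   bin/kafka-reassign-partitions.sh --zookeeper zk1:2181 \
#     --reassignment-json-file reassign.json --execute
#   bin/kafka-reassign-partitions.sh --zookeeper zk1:2181 \
#     --reassignment-json-file reassign.json --verify
echo "wrote $(wc -c < reassign.json) bytes to reassign.json"
```

The --verify pass would also be how I'd confirm a partition has fully 
migrated before touching the next broker.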


Anyone know if controlled shutdown is the right tool for this? Our 
devops team will certainly be interested in the canonical answer as well.
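
As a partial answer to the "when is it safe" question below, one 
hypothetical check is to grep the topic-state listing for the old 
broker's id. The sample lines here stand in for real output from the 
0.8 state tool (kafka-list-topic.sh), whose exact formatting may differ:

```shell
# Hypothetical safety check: broker 1 can be stopped once it appears in
# no partition's ISR. Real cluster: state=$(bin/kafka-list-topic.sh
# --zookeeper zk1:2181); sample lines used here instead.
state='topic: events partition: 0 leader: 3 replicas: 3,4 isr: 3,4
topic: events partition: 1 leader: 4 replicas: 4,1 isr: 4,1'
if echo "$state" | grep -Eq 'isr: ([0-9]+,)*1(,|$)'; then
  echo "broker 1 still in an ISR - wait"
else
  echo "broker 1 clear - safe to stop"
fi
# For this sample it prints the wait message (partition 1's isr is 4,1).
```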


--glenn

On 07/22/2013 05:14 AM, Jason Rosenberg wrote:

I'm planning to upgrade a 0.8 cluster from 2 old nodes, to 3 new ones
(better hardware).  I'm using a replication factor of 2.

I'm thinking the plan should be to spin up the 3 new nodes, and operate as
a 5 node cluster for a while.  Then first remove 1 of the old nodes, and
wait for the partitions on the removed node to get replicated to the other
nodes.  Then, do the same for the other old node.

Does this sound sensible?

How does the cluster decide when to re-replicate partitions that are on a
node that is no longer available?  Does it only happen if/when new messages
arrive for that partition?  Is it on a partition by partition basis?

Or is it a cluster-level decision that a broker is no longer valid, in
which case all affected partitions would immediately get replicated to new
brokers as needed?

I'm just wondering how I will know when it will be safe to take down my
second old node, after the first one is removed, etc.

Thanks,

Jason





Re: Setting up kafka consumer to send logs to s3

2013-06-26 Thread Glenn Nethercutt

I did the same search about a month ago.

I forked QubitProducts/kafka-s3-consumer to upfit it for the 
0.8 api changes.
You'll have to futz with the local repo/ivy to get your kafka jar in the 
right place for maven, but it's functional and straightforward.
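
In case it helps, the jar shuffle looks roughly like this. The jar 
path and Maven coordinates below are my guesses -- substitute whatever 
your kafka build produced and whatever the pom actually declares:

```shell
# Sketch of the local-repo dance: install a locally built kafka jar so
# the consumer's pom can resolve it. Path and coordinates are guesses.
JAR=core/build/libs/kafka_2.8.0-0.8.0.jar
CMD="mvn install:install-file -Dfile=$JAR \
  -DgroupId=org.apache.kafka -DartifactId=kafka_2.8.0 \
  -Dversion=0.8.0 -Dpackaging=jar"
echo "$CMD"   # run it for real with: eval "$CMD"
```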


Planning on sending a pull request once 0.8.0 is formally released.

--glenn

On 06/26/2013 12:08 AM, Florin Trofin wrote:

Quick search for "kafka s3 consumer" brings up a bunch of Github projects.
If you don't like any, I would write a kafka consumer in java that writes
to s3. Probably less than 200 lines of code.

F.

On 6/25/13 1:50 PM, "Alan Everdeen" wrote:


In my application, I have logs that are sent as kafka messages, and I need
a way to save these logs to an s3 bucket using a kafka sink.

I was wondering what the recommended approach would be to accomplish this
task, assuming I am using Kafka 0.8.


Thank you for your time,
Alan Everdeen