Is the kafka-reassign-partitions tool something I can experiment with now (this will only be staging data, in the first go-round)? How does it work? Do I have to manually specify each replica I want to move? That would be cumbersome, as I have on the order of hundreds of topics. Or does the tool have the ability to target all the replicas on a particular broker at once? Also, how can I easily check whether a partition has all of its replicas in the ISR?
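From poking around the docs, I'm imagining a flow roughly like the sketch below. This is purely a guess on my part; the --generate/--execute/--verify flags, the kafka-topics.sh script, and all the file names and broker ids here are assumptions that may not match what's actually in the 0.8 snapshots, so please correct me if the real tool works differently:

    # topics.json names the topics to move (hypothetical file):
    # {"version": 1, "topics": [{"topic": "events"}]}

    # Ask the tool to propose an assignment that spreads replicas
    # across the new brokers (ids 3,4,5 are made up):
    bin/kafka-reassign-partitions.sh --zookeeper zk1:2181 \
      --topics-to-move-json-file topics.json \
      --broker-list "3,4,5" --generate
    # ...then copy the proposed-assignment JSON it prints into plan.json

    # Kick off the move, and re-run --verify until every partition
    # reports the reassignment as completed:
    bin/kafka-reassign-partitions.sh --zookeeper zk1:2181 \
      --reassignment-json-file plan.json --execute
    bin/kafka-reassign-partitions.sh --zookeeper zk1:2181 \
      --reassignment-json-file plan.json --verify

    # For the ISR question, I was hoping something like this would
    # print only partitions whose ISR is smaller than the replica set:
    bin/kafka-topics.sh --zookeeper zk1:2181 --describe \
      --under-replicated-partitions

If that last command (or an equivalent) exists, empty output would presumably mean every partition has its full replica set in the ISR, which is exactly the signal I'd want before taking down the next old node.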
For some reason, I had thought there would be a default behavior whereby a replica could automatically be declared dead after a configurable timeout period.

Re-assigning broker ids would not be ideal, since I currently have a scheme whereby broker ids are auto-generated from a hostname/ip, etc. I could make it work, but it's not my preference to override that!

Jason

On Mon, Jul 22, 2013 at 11:50 AM, Jun Rao <jun...@gmail.com> wrote:

> A replica's data won't be automatically moved to another broker when there
> are failures. This is because we don't know if the failure is transient or
> permanent. The right tool to use is the kafka-reassign-partitions tool. It
> hasn't been thoroughly tested though. We hope to harden it in the final
> 0.8.0 release.
>
> You can also replace a broker with a new server by keeping the same broker
> id. When the new server starts up, it will replicate data from the leader.
> You know the data is fully replicated when both replicas are in ISR.
>
> Thanks,
>
> Jun
>
>
> On Mon, Jul 22, 2013 at 2:14 AM, Jason Rosenberg <j...@squareup.com> wrote:
>
> > I'm planning to upgrade a 0.8 cluster from 2 old nodes to 3 new ones
> > (better hardware). I'm using a replication factor of 2.
> >
> > I'm thinking the plan should be to spin up the 3 new nodes and operate
> > as a 5-node cluster for a while. Then remove 1 of the old nodes, and
> > wait for the partitions on the removed node to get replicated to the
> > other nodes. Then do the same for the other old node.
> >
> > Does this sound sensible?
> >
> > How does the cluster decide when to re-replicate partitions that are on
> > a node that is no longer available? Does it only happen if/when new
> > messages arrive for that partition? Is it on a partition-by-partition
> > basis?
> >
> > Or is it a cluster-level decision that a broker is no longer valid, in
> > which case all affected partitions would immediately get replicated to
> > new brokers as needed?
> >
> > I'm just wondering how I will know when it will be safe to take down my
> > second old node, after the first one is removed, etc.
> >
> > Thanks,
> >
> > Jason
> >
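P.S. If I do end up pinning the broker id as Jun suggests, I assume it's just a matter of overriding my generated config on the replacement machine, something like the snippet below (all values made up, and I may be missing settings that matter):

    # server.properties on the replacement server (hypothetical values)
    broker.id=1                  # reuse the old broker's id instead of deriving it
    log.dirs=/data/kafka-logs    # fresh disk; replicas re-sync from the leaders
    zookeeper.connect=zk1:2181

Then, as I understand Jun's note, I'd just watch for both replicas of each affected partition to show up in the ISR again before declaring the swap done.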