Thanks Istvan, I think I understand what you are saying here, although I was under the impression that if I ensured each topic was replicated N+1 times, a two-node cluster would ensure each node holds a copy of the entire contents of the message bus at any given time.
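
To make the arithmetic behind my question concrete, here is a rough sketch of how I am reasoning about usable space. It assumes partitions are spread evenly across brokers and that every topic uses the same replication factor. The function name and the `headroom` parameter (a fraction of disk kept free for indexes, filesystem overhead and resyncs) are made up for illustration and are not part of Kafka.

```python
def usable_capacity_tb(brokers, disk_per_broker_tb, replication_factor, headroom=0.0):
    """Estimate how many TB of unique data producers can write.

    Every message is stored replication_factor times across the cluster,
    so the raw disk is divided by that factor. headroom is a hypothetical
    fraction of raw disk deliberately left unused.
    """
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    if replication_factor > brokers:
        # Each replica of a partition has to live on a different broker.
        raise ValueError("replication factor cannot exceed the number of brokers")
    if not 0.0 <= headroom < 1.0:
        raise ValueError("headroom must be in the range [0, 1)")

    raw_tb = brokers * disk_per_broker_tb
    return raw_tb * (1.0 - headroom) / replication_factor


if __name__ == "__main__":
    # Two 10TB brokers, every topic replicated to both of them.
    print(usable_capacity_tb(2, 10, 2))
    # Three 10TB brokers, still two copies of each partition.
    print(usable_capacity_tb(3, 10, 2))
```

Under those assumptions, two 10TB brokers with two copies of everything give roughly 10TB of unique data before any overhead, and a third broker at the same replication factor raises that to roughly 15TB.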
I agree with your assessment though that having 3 nodes is a more durable configuration, but was hoping others could explain how they calculate capacity and scaling issues on their storage subsystems.

Cheers,
-pete

On 10/21/14 11:28, István wrote:
> One thing that you have to keep in mind is that moving 10T between nodes
> takes long time. If you have a node failure and you need to rebuild
> (resync) the data your system is going to be vulnerable against the second
> node failure. You could mitigate this with using raid. I think generally
> speaking 3 node clusters are better for production purposes.
>
> I.
>
> On Tue, Oct 21, 2014 at 11:12 AM, Pete Wright <pwri...@rubiconproject.com>
> wrote:
>
>> Hi There,
>> I have a question regarding sizing disk for kafka brokers. Let's say I
>> have systems capable of providing 10TB of storage, and they act as Kafka
>> brokers. If I were to deploy two of these nodes, and enable replication
>> in Kafka, would I actually have 10TB available for my producers to write
>> to? Is there any overhead I should be concerned with?
>>
>> I guess I am just wanting to make sure that there are not any major
>> pitfalls in deploying a two-node cluster, versus say a 3-node cluster.
>>
>> Any advice or best-practices would be very helpful!
>>
>> Thanks in advance,
>> -pete
>>
>>
>> --
>> Pete Wright
>> Systems Architect
>> Rubicon Project
>> pwri...@rubiconproject.com
>> 310.309.9298
>>
>
>

--
Pete Wright
Systems Architect
Rubicon Project
pwri...@rubiconproject.com
310.309.9298