I think it would be nice if the recommended setup for Kafka were JBOD rather
than RAID, because:
* it makes it easy to "test" Kafka on an existing Hadoop/Spark cluster
* it allows co-location; for example, we co-locate Kafka and Spark Streaming
(our Spark Streaming app is aware of Kafka partition locations)

Ideally, Kafka would survive a disk failure and report only a partial loss,
just like an HDFS DataNode does. I realize this is a big ask...

On Tue, Jan 20, 2015 at 12:25 PM, Yang Fang <franklin.f...@gmail.com> wrote:

> I think the best way is RAID, not JBOD. If one disk in a JBOD setup fails,
> the broker shuts down, and it then takes a long time to recover. Brokers
> that run for a long time will become leaders of more and more partitions,
> so I/O pressure will be unbalanced.
> BTW, I use Kafka 0.8.0-beta1.
>
