On Tue, Aug 2, 2011 at 23:10, Jeremiah Jordan <jeremiah.jor...@morningstar.com> wrote:
> If you have RF=1, taking one node down is going to cause 25% of your
> data to be unavailable. If you want to tolerate a machines going down
> you need to have at least RF=2, if you want to use quorum and have a
> machine go down, you need at least RF=3.

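For reference, the arithmetic behind those numbers can be sketched roughly as below. This is only an illustration: the function names are made up for this message, and the 4-node cluster size is an assumption inferred from the 25% figure, with data assumed to be spread evenly across the ring.

```python
def quorum(rf: int) -> int:
    """Replicas that must respond for a QUORUM request: a majority of RF."""
    return rf // 2 + 1


def tolerated_failures(rf: int, use_quorum: bool) -> int:
    """Replicas of a range that can be down while requests still succeed."""
    # Without quorum, a single live replica is enough (consistency level ONE).
    needed = quorum(rf) if use_quorum else 1
    return rf - needed


def fraction_unavailable_rf1(nodes: int) -> float:
    """With RF=1 and an evenly balanced ring, one dead node hides 1/nodes of the data."""
    return 1 / nodes


if __name__ == "__main__":
    # Hypothetical 4-node cluster, matching the 25% mentioned above.
    print(fraction_unavailable_rf1(4))
    for rf in (1, 2, 3):
        print(rf, tolerated_failures(rf, use_quorum=False),
              tolerated_failures(rf, use_quorum=True))
```

So RF=2 survives one node down only when reading and writing at ONE, and RF=3 is the smallest value that survives one node down at QUORUM.
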
I know I can use RF > 1, but I have limited resources and I don't mind losing 25% of the data. RF > 1 basically means that if a node goes down I still have the data elsewhere; what I need is for a down node's range to simply be ignored. I can handle that in my own applications using Thrift, but the Hadoop MapReduce job can't. It just fails with:

Exception in thread "main" java.io.IOException: Could not get input splits

Is there a way to tell Hadoop to ignore this range?

Regards,
P.