See https://issues.apache.org/jira/browse/CASSANDRA-2388

On Wed, Aug 17, 2011 at 6:28 AM, Patrik Modesto
<patrik.mode...@gmail.com> wrote:
> Hi,
>
> while investigating this issue, I found that hadoop+cassandra doesn't
> work if you stop even just one node in the cluster. It doesn't depend
> on RF. ColumnFamilyRecordReader gets a list of nodes (according to
> the RF) but chooses just the local host, and if there is no cassandra
> running locally it throws a RuntimeException, which in turn marks
> the MapReduce task as failed.
>
> I've created a patch that makes ColumnFamilyRecordReader try the
> local node first and, if that fails, try the other nodes in its list.
> The patch is here: http://pastebin.com/0RdQ0HMx I think attachments
> are not allowed on this ML.
>
> Please test it and apply. It's for version 0.7.8.
>
> Regards,
> P.
>
>
> On Wed, Aug 3, 2011 at 13:59, aaron morton <aa...@thelastpickle.com> wrote:
>> If you want to take a look, o.a.c.hadoop.ColumnFamilyRecordReader.getSplits()
>> is the function that gets the splits.
>>
>>
>> Cheers
>> -----------------
>> Aaron Morton
>> Freelance Cassandra Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 3 Aug 2011, at 16:18, Patrik Modesto wrote:
>>
>>> On Tue, Aug 2, 2011 at 23:10, Jeremiah Jordan
>>> <jeremiah.jor...@morningstar.com> wrote:
>>>> If you have RF=1, taking one node down is going to cause 25% of your
>>>> data to be unavailable. If you want to tolerate a machine going down,
>>>> you need at least RF=2; if you want to use quorum and have a
>>>> machine go down, you need at least RF=3.
>>>
>>> I know I can have RF > 1, but I have limited resources and I don't mind
>>> losing 25% of the data. RF > 1 basically means that if a node goes down
>>> I have the data elsewhere, but what I need is to just ignore a node's
>>> range if that node goes down. I can handle it in my applications using
>>> thrift, but hadoop-mapreduce can't handle it. It just fails with
>>> "Exception in thread "main" java.io.IOException: Could not get input
>>> splits". Is there a way to tell hadoop to ignore this range?
>>>
>>> Regards,
>>> P.
>>
>>
>
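
For readers who cannot reach the pastebin link, the fallback described in
the quoted message (try the local node first, then the other replicas for
the split) might look roughly like the sketch below. This is not the
actual patch and not Cassandra's code: the class ReplicaFailover, the
Connector interface, and the method names are all hypothetical, and the
connector simply stands in for whatever opens a Thrift connection to a
host.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only; none of these names come from Cassandra.
final class ReplicaFailover {

    /** Hypothetical stand-in for "open a client connection to this host". */
    interface Connector<C> {
        C connect(String host) throws IOException;
    }

    /** Orders the replicas so that the local host (if listed) is tried first. */
    static List<String> localFirst(String localHost, List<String> replicas) {
        List<String> ordered = new ArrayList<>();
        if (replicas.contains(localHost)) {
            ordered.add(localHost);
        }
        for (String replica : replicas) {
            if (!replica.equals(localHost)) {
                ordered.add(replica);
            }
        }
        return ordered;
    }

    /** Returns the first connection that succeeds; fails only if every replica fails. */
    static <C> C connectToAny(String localHost, List<String> replicas,
                              Connector<C> connector) throws IOException {
        IOException last = null;
        for (String host : localFirst(localHost, replicas)) {
            try {
                return connector.connect(host);
            } catch (IOException e) {
                last = e; // remember the failure and move on to the next replica
            }
        }
        throw new IOException("no replica reachable among " + replicas, last);
    }
}
```

Note that this only helps when RF > 1. With RF=1 the list for a split
holds a single node, so there is nothing to fall back to.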

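The replication arithmetic in the oldest quoted message can be checked
with a few lines. This is a sketch under stated assumptions: the class
and method names are hypothetical, the 25% figure is taken to mean a
4-node cluster with evenly balanced ranges, and quorum is computed as
floor(RF / 2) + 1.

```java
// Hypothetical helper; the names are illustrative, not a Cassandra API.
final class ReplicationMath {

    /** Replicas that must respond for a quorum: floor(rf / 2) + 1. */
    static int quorum(int rf) {
        return rf / 2 + 1;
    }

    /** Replicas that may be down while `required` replicas can still answer. */
    static int tolerableFailures(int rf, int required) {
        return rf - required;
    }

    /** Share of data lost with RF=1 when one of `nodes` evenly balanced nodes is down. */
    static double unavailableFraction(int nodes) {
        return 1.0 / nodes;
    }

    public static void main(String[] args) {
        System.out.println(unavailableFraction(4));           // RF=1, 4 nodes, one down
        System.out.println(tolerableFailures(2, 1));          // RF=2, reading one replica
        System.out.println(tolerableFailures(3, quorum(3)));  // RF=3, reading at quorum
    }
}
```
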


-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
