Thanks for the reply, which confirms what I expected.

Let me explain why I asked. I have an application that my intuition says would 
be a good match to Riak, but I don't trust my intuition since I've never used 
Riak and I'm not sure I understand all of its failure modes. One thing I'm 
trying is a mental model-checking exercise (which I might eventually turn over 
to a real model checker), and it is making me wonder about all the things that 
can go wrong. A failed write that is visible anyway, either permanently or just 
for a while, is just one example.

In the long run, it would be great if Riak were documented perfectly and 
completely (and every other piece of software in the world too!), but in the 
meantime I'm just trying to build my own mental model. I'd prefer, of course, 
a mental model that does not depend on a detailed knowledge of Riak's internal 
workings, enumerating only the preconditions and postconditions of each 
operation. We'll see how far I can get....
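
As a starting point, the kind of toy model I have in mind could be sketched 
roughly as below. This is only a sketch under my own assumptions, not Riak's 
API or implementation: the names Replica, put, get and scenario are 
hypothetical, a plain integer version stands in for a vector clock, and read 
repair is modelled as happening immediately on every replica that responds. It 
replays the N=3/W=2 example from the reply quoted below.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class Replica:
    """One of the N copies of a key (toy model, not a real Riak vnode)."""
    up: bool = True
    value: Optional[str] = None
    version: int = 0  # assumption: an integer stands in for a vector clock


def put(replicas: List[Replica], value: str, version: int, w: int,
        failing: Sequence[int] = ()) -> bool:
    """Attempt the write on every replica; report success only with >= w acks.

    The N writes are not atomic: replicas that accepted the write keep the
    new value even when the overall request is reported as failed.
    """
    acks = 0
    for i, rep in enumerate(replicas):
        if rep.up and i not in failing:
            rep.value, rep.version = value, version
            acks += 1
    return acks >= w


def get(replicas: List[Replica], r: int, order: Sequence[int],
        repair: bool = True) -> Optional[str]:
    """Answer from the first r responders in `order`, then read-repair."""
    responders = [replicas[i] for i in order if replicas[i].up]
    if len(responders) < r:
        raise RuntimeError("not enough replicas reachable")
    # The client sees the newest value among the first r responders only.
    result = max(responders[:r], key=lambda rep: rep.version).value
    if repair:
        # Afterwards, every responder is brought up to the newest version.
        best = max(responders, key=lambda rep: rep.version)
        for rep in responders:
            if rep.version < best.version:
                rep.value, rep.version = best.value, best.version
    return result


def scenario(read_before_failure: bool) -> Optional[str]:
    """N=3, W=2: two writes fail, then the one updated replica goes down."""
    replicas = [Replica(value="old", version=1) for _ in range(3)]
    ok = put(replicas, "new", version=2, w=2, failing=(1, 2))
    assert not ok  # the client is told the write failed
    if read_before_failure:
        # This read itself returns "old", but it triggers read repair.
        get(replicas, r=1, order=[1, 2, 0])
    replicas[0].up = False  # the only replica that took the write is lost
    return get(replicas, r=1, order=[1, 2])


if __name__ == "__main__":
    print(scenario(read_before_failure=True))   # new
    print(scenario(read_before_failure=False))  # old
```

In this model the "failed" write survives the node failure only if some read 
happened to repair the other replicas first; otherwise the old value comes 
back. A real model checker would explore all interleavings rather than the two 
hand-picked ones here.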

Cheers,
John

On Jan 9, 2012, at 2:38 PM, John DeTreville wrote:

> Thank you very much for your reply. Longer response to follow.
> 
> Cheers,
> John
> 
> On Jan 9, 2012, at 2:33 PM, Ryan Zezeski wrote:
> 
>> John,
>> 
>> To your first question, yes, it is possible that the client may receive a 
>> failure response from Riak but the data could have persisted on some of the 
>> nodes.  This is because a single write to Riak is actually N writes to N 
>> different partitions inside of Riak.  These N writes are not atomic in 
>> relation to each other.
>> 
>> As for your second question, it depends on what happens between the time of 
>> the "failed" write and the time the node(s) with the replicas go down.  If 
>> some form of anti-entropy is employed before the node failure then the 
>> replicas should have been repaired and N copies should exist.  Riak's main 
>> form of anti-entropy is read repair that occurs at read time (we also have a 
>> form of active anti-entropy between Riak clusters in our enterprise 
>> offering).  If the object is read before node failure then read-repair will 
>> occur and repair all N replicas.
>> 
>> An example might help.  If N=3/W=2 and two partitions fail to write then the 
>> overall request will fail but the one remaining write succeeds.  If you perform 
>> a read after this "failed" write then you may or may not see the new value 
>> depending on the R value and which partitions respond to the coordinator 
>> first.  However, regardless what is returned by that read the coordinator 
>> will stay alive a while longer in an attempt to perform read-repair.  If 
>> read-repair is successful then you should have N copies and it will be like 
>> the write failure never occurred.  If you hadn't performed that read, the 
>> replicas hadn't been repaired, and the node containing the only updated 
>> replica went down, then a later read would return the old value or a 
>> not_found (depending on whether a value existed for that key before the write).
>> 
>> -Ryan
>> 
>> 
>> On Mon, Jan 9, 2012 at 12:32 AM, John DeTreville <[email protected]> wrote:
>> (An earlier post seems not to have gone through. My apologies in the 
>> eventual case of a duplicate.)
>> 
>> I'm thinking of using Riak to replace a large Oracle system, and I'm trying 
>> to understand its guarantees. I have a few introductory questions; this is 
>> the second of three.
>> 
>> Imagine I do a write, and the write fails because it could not contact 
>> enough hosts. Am I right to imagine that the write may actually have 
>> persisted, and that the data might later be available for reading? Am I also 
>> right to imagine that the data, once read, might later vanish due to host 
>> failure, because it was persisted to fewer hosts than expected?
>> 
>> Cheers,
>> John
>> _______________________________________________
>> riak-users mailing list
>> [email protected]
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>> 
> 

