Hello David,
This behaviour is quite expected if you think about how Riak works.
Assuming you use the default replication factor of n=3, each key is
stored on all of your three nodes. If you delete a key while one node
(let's call it A) is down, the key is deleted from the two nodes that
are still up (let's call them B and C), and remains on the downed node A.
Once node A is up again, the situation is indistinguishable from B and C
having had a hard drive crash and losing all their data, in that A has the
key and B and C know nothing about it.
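To make that concrete, here is a toy Python simulation of the scenario
(node names and values are made up; this is not real Riak code, just the
logic of a delete reaching only the live replicas):

```python
# Toy simulation: three replicas of key "k1"; a delete issued while
# node A is down removes the key only from B and C.
nodes = {
    "A": {"k1": "value"},   # A goes down before the delete
    "B": {"k1": "value"},
    "C": {"k1": "value"},
}

down = {"A"}

def delete(key):
    # The delete only reaches the nodes that are up.
    for name, store in nodes.items():
        if name not in down:
            store.pop(key, None)

delete("k1")
down.clear()  # node A comes back up

# A still has the key; B and C know nothing about it.
print(nodes)
```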
If you do a GET of the deleted key at this point, the result depends on
the r-value that you choose. For r>1 you will get a not_found on the
first GET. For r=1 you might get the data or a not_found, depending on
which node answers first (see
https://issues.basho.com/show_bug.cgi?id=992 for an explanation of
basic quorum). Also, at that point read repair will kick in and
re-replicate the key to all nodes, so subsequent GETs will always return
the original datum.
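Here is a rough Python model of that read behaviour (purely
illustrative, not Riak's actual read path; the "first r replies decide"
rule is a simplification of the quorum-with-basic-quorum behaviour
described above):

```python
import random

# After the delete, A still has the key; B and C return not_found.
replies = {"A": "value", "B": "not_found", "C": "not_found"}

def get(r):
    # The coordinator looks at the first r replies, in arrival order.
    order = random.sample(list(replies), 3)
    first = [replies[n] for n in order[:r]]
    # With r > 1 at least one of B or C is among the first r replies,
    # so the not_found wins; with r = 1 the single fastest reply
    # decides the result.
    if r > 1:
        return "not_found" if "not_found" in first else "value"
    return first[0]

print(get(2))  # always "not_found" here, since only A has the key
print(get(1))  # "value" or "not_found", depending on the fastest node
```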
Listing keys, on the other hand, does not use a quorum but simply takes
the set union of all keys on all nodes in your cluster. Essentially it
is equivalent to r=1 without basic quorum. To my knowledge the same is
true for map/reduce queries.
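In other words, key listing behaves like this toy model (again just a
sketch, not the real implementation):

```python
# Key listing as a set union across nodes, with no quorum involved.
keys_per_node = {
    "A": {"k1", "k2"},   # A missed the delete and still lists k1, k2
    "B": set(),
    "C": set(),
}

listed = set().union(*keys_per_node.values())
print(sorted(listed))  # the deleted keys reappear in the listing
```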
The essential problem is that a real physical delete is
indistinguishable from data loss (or from never having had the data in
the first place), even though those two things are logically different.
If you want to be sure that a key is deleted along with all its replicas,
you must delete it with a write quorum setting of w=n. You also need to
tell Riak not to count fallback vnodes toward your write quorum. This
feature is quite new and I believe it is only available in the head
revision. I have also forgotten the name of the parameter and don't know
whether it is even applicable to DELETEs.
Anyhow, if you do all this, your DELETEs will simply fail whenever any
of the nodes that hold a copy of the key is down (so in your case,
whenever any node is down).
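The effect of w=n is easy to see in a sketch (the ack-counting here is a
toy model of the behaviour described above, not client code):

```python
# A DELETE with w = n succeeds only if every replica acknowledges it.
N = 3  # replication factor

def delete_with_w(up_nodes, w):
    acks = len(up_nodes)  # only live replicas can acknowledge
    return "ok" if acks >= w else "error: insufficient replicas"

print(delete_with_w(up_nodes={"A", "B", "C"}, w=N))  # succeeds
print(delete_with_w(up_nodes={"B", "C"}, w=N))       # fails: A is down
```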
If you only want to delete logically, and don't care about freeing the
disk space and RAM used by the key, you should write a special tombstone
value which your application interprets as not found. That way you also
get proper conflict resolution between DELETEs and PUTs (say,
one client deletes a key while another one updates it).
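The tombstone pattern looks roughly like this (hypothetical application
code against an in-memory dict standing in for the store; the sentinel
name is made up, use whatever your application agrees on):

```python
# Application-level logical delete via a tombstone value.
TOMBSTONE = "__deleted__"   # made-up sentinel

store = {}

def put(key, value):
    store[key] = value

def logical_delete(key):
    # Instead of a physical DELETE, write the tombstone with a PUT.
    # Concurrent deletes and updates then resolve like any other
    # write/write conflict.
    put(key, TOMBSTONE)

def get(key):
    value = store.get(key, TOMBSTONE)
    return None if value == TOMBSTONE else value

put("k1", "value")
logical_delete("k1")
print(get("k1"))  # the application treats the tombstone as not found
```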
Cheers,
Nico
On 16.06.2011 00:55, David Mitchell wrote:
Erlang: R13B04
Riak: 0.14.2
I have a three node cluster, and while one node was down, I deleted
every key in a certain bucket. Then, I started the node that was
down, and it joined the cluster.
Now, when I do a listing of the keys in this bucket, I get the
entire list. I can also GET the values of the keys. However, when
I try to delete the keys, the keys are not deleted.
Can anyone help me get the nodes back in a consistent state? I have
tried restarting the nodes.
David
_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com