I need more time to examine the diagram, but this all looks as expected so far.

If a client sends no context then its write will be a sibling of whatever is 
stored at the coordinator. As you rightly point out, Riak treats an incoming 
clock that is less than the local clock as a sibling.
If the coordinator is configured not to store siblings, then the sibling value 
with the highest timestamp is stored. I recommend you run Riak with either 
allow_mult=true or LWW=true; allow_mult=false, in my view, should not be the 
default.
If two Riak nodes do the above and then replicate their values, the single 
value with the highest timestamp is stored. Isn’t this what you are seeing? If 
you depend on time to pick the latest, and nodes’ clocks are out of sync, this 
is the price.
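
To make the above concrete, here is a minimal Python sketch of the two rules 
involved: the "descends" check on vector clocks, and timestamp resolution when 
siblings are not kept. It is an illustrative model only, not Riak's Erlang 
code; the names descends, Value and coordinate_put are hypothetical, and it 
ignores details such as the coordinator incrementing the clock.

```python
from dataclasses import dataclass


def descends(a: dict, b: dict) -> bool:
    """True if clock `a` has seen every event recorded in clock `b`."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())


@dataclass
class Value:
    data: str
    timestamp: float  # wall-clock time assigned by the coordinating node


def coordinate_put(local_clock, local_values, incoming_clock, incoming, allow_mult):
    """Return the values kept after a put (hypothetical helper)."""
    if descends(incoming_clock, local_clock):
        # The client had seen the stored value, so this is a plain overwrite.
        return [incoming]
    # Otherwise the incoming value is a sibling of what is already stored.
    siblings = local_values + [incoming]
    if allow_mult:
        return siblings
    # Siblings not allowed: keep only the one with the highest timestamp.
    return [max(siblings, key=lambda v: v.timestamp)]


# A stored value whose coordinator clock runs 0.3 s ahead of the other node.
local_clock = {"node_a": 1}
first = Value("first write", timestamp=100.5)
# Written later in real time, but stamped by a node with a slower clock.
second = Value("second write", timestamp=100.2)

# An empty client context never descends a non-empty local clock.
print(descends({}, local_clock))

kept = coordinate_put(local_clock, [first], {}, second, allow_mult=False)
print(kept[0].data)  # the earlier write survives because of clock skew

both = coordinate_put(local_clock, [first], {}, second, allow_mult=True)
print([v.data for v in both])  # both siblings are kept for the client to resolve
```

With allow_mult=true in this model, both values survive and the client can 
resolve them; with timestamp resolution, the write stamped by the faster clock 
wins regardless of real-time order.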

Is this what you are seeing? Are you seeing results you didn’t expect, or 
non-deterministic results? Or both?

Regards

Russell

On 1 Oct 2015, at 12:58, Zuzana Zatrochova <zatroch...@gmail.com> wrote:

> Hi,
> 
> 
> 
> We are researching the client-centric consistency features of the Riak 
> database. We encountered a problem with the vector clock implementation: 
> vector clocks do not seem to work locally on a machine as expected. We would 
> like you to confirm whether this behavior is intended. First I will describe 
> the environment of our experiments, and then I will present the problem.
> 
> 
> 
> Environment:
> 
> 
>       • Our environment consists of six virtual machines:
>               • five machines form the Riak cluster, each running a single 
> Riak node
>               • one machine runs a Java application that simulates multiple 
> clients communicating with Riak
>       • The machines are VMware virtual machines whose clocks are slightly 
> shifted relative to each other (by no more than 1 second)
>       • We ran experiments with riak-1.4.8 and riak-2.1.1. In riak-1.4.8, 
> app.config contains vnode_vclocks = true (the default setting when 
> downloaded). In riak-2.1.1 we could not find a vnode vclocks setting in 
> riak.conf or in the advanced configuration documentation, so we assumed it 
> also defaults to true and can no longer be changed
>       • For each experiment, 500 clients concurrently send requests to a 
> random node in the cluster. There are 20000 requests per minute spread over 
> only 20 different keys (the load on a single key is 16 requests per second, 
> with a read:write ratio of 50:50)
>       • For the issue described here we used the quorums R = 1, W = 3; 
> R = 2, W = 2; and R = 3, W = 1
>       • All Riak settings are default apart from the IP and quorum settings. 
> We added interceptors from the riak_test module; they do not change the code 
> and are used only for logging (information about the states of nodes). 
> error.log is empty
> 
> Problem:
> 
> 
>       • It seems that Riak does not use vector clocks locally, only on a 
> global scale. When a data object is created on the client side and sent to 
> Riak, it has no vector clock assigned. More precisely, 
> riak_object:vclock(UpdObj) returns [], while riak_object:vclock(LocalObj) 
> returns the local vector clock of the stored object. Therefore the call 
> vclock:descends(NewObject, LocalObject) (in 2.1.1; 1.4.8 behaves similarly) 
> returns false in all our experiments, whatever the quorum, because an empty 
> vector clock cannot descend a non-empty one. This leads to a merge of 
> contents, i.e. the creation of siblings (or, when siblings are not allowed, 
> as in our configuration, resolution of the value by timestamp rather than by 
> vector clock)
>       • In our experiments, when the clocks on the VMs differ by up to 500 
> milliseconds, the situation shown in the attached picture issue.png arises. 
> Two objects with the same key are sent to two different coordinators whose 
> clocks are shifted, so the later object is assigned an earlier timestamp 
> than the object that was sent before it. As a result of Riak's vector clock 
> implementation, the later object is lost in the merge of contents, because 
> the other object's later timestamp (wrong because of the local clock shift) 
> is judged to be the latest.
> 
> The question:
> 
> 
> 
> Is this Riak's intended behavior? The problem is that even when the quorum 
> is set to prefer consistency and there are no partitions in the cluster, 
> clients still observe inconsistent results. By consistent we mean that any 
> read must return the value of the latest finished write or of a later 
> unfinished write. (We did not use the strong_consistency feature of 
> riak-2.1.1.)
> 
> 
> 
> Thank you,
> 
> Zuzana
> 
> <issue.png>
> _______________________________________________
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

