I've thought about this a bit more. The question I meant to ask was: As a
system developer, if I choose to rely on VNode vector clocks instead of
sending/setting client IDs, am I making a *tradeoff*, or am I only making
my life simpler?

My own analysis of this, so far:

With vnode vector clocks, the number of vnode entries on a key's vector
clock is, on an average day, very close to 1, thanks to the preference
list. In light of this, the only situation in which client IDs are
meaningfully better is a period of serious network flapping, during which
the vnode preference list is exhausted for frequently mutated keys over a
longish period. Good client IDs can help here, as they are stable and small
in number across any network flap. On the other hand, poorly chosen client
IDs (perhaps unluckily so) are known to cause huge vector clocks. It seems
that vnode vclocks provide some reliability, in the sense that they prevent
a developer from generating explosive vector clocks in the average case.
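
To make the growth argument concrete, here is a minimal sketch of a vector
clock as a plain map from actor ID to counter. It is a toy model, not Riak's
implementation: the actor names are hypothetical, every write is assumed to
descend from the previous one, and any server-side pruning is ignored. It
only shows that the clock gains one entry per distinct actor that ever
coordinates a write.

```python
from itertools import cycle


def increment(vclock, actor):
    """Return a copy of vclock with the given actor's counter bumped by one."""
    updated = dict(vclock)
    updated[actor] = updated.get(actor, 0) + 1
    return updated


def simulate(actors, writes):
    """Apply `writes` sequential updates, rotating through `actors`."""
    vclock = {}
    for _, actor in zip(range(writes), cycle(actors)):
        vclock = increment(vclock, actor)
    return vclock


# Hypothetical actor names, for illustration only.
vnode_actors = ["vnode-a"]                           # one coordinating vnode
stable_clients = ["client-1", "client-2"]            # small, stable ID set
random_clients = [f"conn-{i}" for i in range(1000)]  # fresh ID per connection

vnode_clock = simulate(vnode_actors, 1000)
stable_clock = simulate(stable_clients, 1000)
random_clock = simulate(random_clients, 1000)

print(len(vnode_clock), len(stable_clock), len(random_clock))
```

In this model the clock size tracks the number of distinct actors, not the
number of writes, which is why a fresh client ID per connection is the
pathological case.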

I'm thus conflicted. Old (pre-1.0) clusters and keysets excepted, there
doesn't seem to be an overwhelmingly strong case in favor of client IDs.
Does your experience support this?

The reason I'm looking for a position on this is that I am at the point of
implementing this (or not) for Riak-Cpp. If I implement client IDs, the
Protocol Buffers API forces me to reset a connection with a new client ID
(incurring a round-trip) before virtually every request. That's a lot of
wasted round-trips, and it also complicates the implementation of a
connection pool. If there is no practical use case that *requires*
client-ID vclocks, I'd just as soon forget it, or submit a pull request
with per-put-request client IDs later on.
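
The round-trip cost can be sketched with a stand-in connection that only
counts request/response exchanges. Everything here is hypothetical
(FakeConnection and both helper functions are made up for illustration and
are not part of Riak-Cpp or any Riak client); it assumes the worst case
described above, where a pooled connection may carry another caller's ID and
so must be reset before each put.

```python
class FakeConnection:
    """Hypothetical stand-in for one pooled socket; counts round trips."""

    def __init__(self):
        self.round_trips = 0
        self.client_id = None

    def set_client_id(self, client_id):
        # Models a separate "set client ID" request and its response.
        self.round_trips += 1
        self.client_id = client_id

    def put(self, key, value):
        # Models a single put request and its response.
        self.round_trips += 1


def put_with_client_id(conn, client_id, key, value):
    """Reset the connection's client ID, then store the value."""
    conn.set_client_id(client_id)
    conn.put(key, value)


def put_without_client_id(conn, key, value):
    """Store the value and leave vclock actors to the vnodes."""
    conn.put(key, value)


with_ids = FakeConnection()
without_ids = FakeConnection()
for i in range(100):
    put_with_client_id(with_ids, f"caller-{i % 4}", "key", i)
    put_without_client_id(without_ids, "key", i)

print(with_ids.round_trips, without_ids.round_trips)
```

A pool could skip the reset when the connection already carries the
caller's ID, but that requires tracking per-connection state, which is the
complication mentioned above.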

--
Andres

2011/12/18 Andres Jaan Tack <[email protected]>

> Sean,
>
> I understand that they are still used when provided. The more fundamental
> question is: as good Riak citizens, should we consider Client IDs
> deprecated, and stop using them in new projects?
>
> --
> Andres
>
>
> 2011/12/18 Sean Cribbs <[email protected]>
>
>> Andres
>>
>> Client IDs are still used if the riak_kv / vnode_vclocks setting is
>> false. If it is true, the client ID will simply be ignored. For
>> compatibility's sake, it's best to send/set them still.
>>
>> On Sun, Dec 18, 2011 at 11:12 AM, Andres Jaan Tack <
>> [email protected]> wrote:
>>
>>> I guess the fundamental question is: In new clusters unaffected by
>>> pre-1.0 shenanigans, should we *ever* be using client-provided
>>> ClientIds in our vector clocks, or is this method deprecated?
>>>
>>> I ask because I was deep into figuring out how to do this efficiently
>>> with the Riak PBC API for PUT requests, when I realized that the
>>> documentation for HTTP PUT no longer even mentions it
>>> <http://wiki.basho.com/HTTP-Store-Object.html>.
>>>
>>> If client-id isn't totally deprecated, I will probably start finding a
>>> way to shoehorn it into individual PBC PUT requests, as keeping a
>>> per-connection client id is very unwieldy. Please advise. :)
>>>
>>> --
>>> Andres
>>> _______________________________________________
>>> riak-users mailing list
>>> [email protected]
>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>>
>>>
>>
>>
>> --
>> Sean Cribbs <[email protected]>
>> Developer Advocate
>> Basho Technologies, Inc.
>> http://www.basho.com/
>>
>>
>
_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
