See inline: On Mon, Feb 22, 2016 at 10:31 AM, Alex Moore <amo...@basho.com> wrote:
> Hi Vanessa, > > You might have a problem with your delete function (depending on it's > return value). > What does the return value of the delete() function indicate? Right now > if an object existed, and was deleted, the function will return true, and > will only return false if the object didn't exist or only consisted of > tombstones. > That's the correct behaviour: it should return true iff a value was actually deleted. > If you never look at the object value returned by your fetchValue(key) > function, another potential optimization you could make is to only return the > HEAD / metadata: > > FetchValue fv = new FetchValue.Builder(new Location(new Namespace( > "some_bucket"), key)) > > .withOption(FetchValue.Option.HEAD, true) > .build(); > > This would be more efficient because Riak won't have to send you the > values over the wire, if you only need the metadata. > > Thanks, I'll clean that up. > If you do write this up somewhere, share the link! :) > Will do! Regards, Vanessa > > Thanks, > Alex > > > On Mon, Feb 22, 2016 at 6:23 AM, Vanessa Williams < > vanessa.willi...@thoughtwire.ca> wrote: > >> Hi Dmitri, this thread is old, but I read this part of your answer >> carefully: >> >> You can use the following strategies to prevent stale values, in >>> increasing order of security/preference: >>> 1) Use timestamps (and not pass in vector clocks/causal context). This >>> is ok if you're not editing objects, or you're ok with a bit of risk of >>> stale values. >>> 2) Use causal context correctly (which means, read-before-you-write -- >>> in fact, the Update operation in the java client does this for you, I >>> think). And if Riak can't determine which version is correct, it will fall >>> back on timestamps. >>> 3) Turn on siblings, for that bucket or bucket type. That way, Riak >>> will still try to use causal context to decide the right value. But if it >>> can't decide, it will store BOTH values, and give them back to you on the >>> next read, so that your application can decide which is the correct one. >> >> >> I decided on strategy #2. What I am hoping for is some validation that >> the code we use to "get", "put", and "delete" is correct in that context, >> or if it could be simplified in some cases. Not we are using delete-mode >> "immediate" and no duplicates. >> >> In their shortest possible forms, here are the three methods I'd like >> some feedback on (note, they're being used in production and haven't caused >> any problems yet, however we have very few writes in production so the lack >> of problems doesn't support the conclusion that the implementation is >> correct.) Note all argument-checking, exception-handling, and logging >> removed for clarity. *I'm mostly concerned about correct use of causal >> context and response.isNotFound and response.hasValues. *Is there >> anything I could/should have left out? >> >> public RiakClient(String name, com.basho.riak.client.api.RiakClient >> client) >> { >> this.name = name; >> this.namespace = new Namespace(name); >> this.client = client; >> } >> >> public byte[] get(String key) throws ExecutionException, >> InterruptedException { >> >> FetchValue.Response response = fetchValue(key); >> if (!response.isNotFound()) >> { >> RiakObject riakObject = response.getValue(RiakObject.class); >> return riakObject.getValue().getValue(); >> } >> return null; >> } >> >> public void put(String key, byte[] value) throws ExecutionException, >> InterruptedException { >> >> // fetch in order to get the causal context >> FetchValue.Response response = fetchValue(key); >> RiakObject storeObject = new >> >> RiakObject().setValue(BinaryValue.create(value)).setContentType("binary/octet-stream"); >> StoreValue.Builder builder = >> new StoreValue.Builder(storeObject).withLocation(new >> Location(namespace, key)); >> if (response.getVectorClock() != null) { >> builder = builder.withVectorClock(response.getVectorClock()); >> } >> StoreValue storeValue = builder.build(); >> client.execute(storeValue); >> } >> >> public boolean delete(String key) throws ExecutionException, >> InterruptedException { >> >> // fetch in order to get the causal context >> FetchValue.Response response = fetchValue(key); >> if (!response.isNotFound()) >> { >> DeleteValue deleteValue = new DeleteValue.Builder(new >> Location(namespace, key)).build(); >> client.execute(deleteValue); >> } >> return !response.isNotFound() || !response.hasValues(); >> } >> >> >> Any comments much appreciated. I want to provide a minimally correct >> example of simple client code somewhere (GitHub, blog post, something...) >> so I don't want to post this without review. >> >> Thanks, >> Vanessa >> >> ThoughtWire Corporation >> http://www.thoughtwire.com >> >> >> >> >> On Thu, Oct 8, 2015 at 8:45 AM, Dmitri Zagidulin <dzagidu...@basho.com> >> wrote: >> >>> Hi Vanessa, >>> >>> The thing to keep in mind about read repair is -- it happens >>> asynchronously on every GET, but /after/ the results are returned to the >>> client. >>> >>> So, when you issue a GET with r=1, the coordinating node only waits for >>> 1 of the replicas before responding to the client with a success, and only >>> afterwards triggers read-repair. >>> >>> It's true that with notfound_ok=false, it'll wait for the first >>> non-missing replica before responding. But if you edit or update your >>> objects at all, an R=1 still gives you a risk of stale values being >>> returned. >>> >>> For example, say you write an object with value A. And let's say your 3 >>> replicas now look like this: >>> >>> replica 1: A, replica 2: A, replica 3: notfound/missing >>> >>> A read with an R=1 and notfound_ok=false is just fine, here. (Chances >>> are, the notfound replica will arrive first, but the notfound_ok setting >>> will force the coordinator to wait for the first non-empty value, A, and >>> return it to the client. And then trigger read-repair). >>> >>> But what happens if you edit that same object, and give it a new value, >>> B? So, now, there's a chance that your replicas will look like this: >>> >>> replica 1: A, replica 2: B, replica 3: B. >>> >>> So now if you do a read with an R=1, there's a chance that replica 1, >>> with the old value of A, will arrive first, and that's the response that >>> will be returned to the client. >>> >>> Whereas, using R=2 eliminates that risk -- well, at least decreases it. >>> You still have the issue of -- how does Riak decide whether A or B is the >>> correct value? Are you using causal context/vclocks correctly? (That is, >>> reading the object before you update, to get the correct causal context?) >>> Or are you relying on timestamps? (This is an ok strategy, provided that >>> the edits are sufficiently far apart in time, and you don't have many >>> concurrent edits, AND you're ok with the small risk of occasionally the >>> timestamp being wrong). You can use the following strategies to prevent >>> stale values, in increasing order of security/preference: >>> >>> 1) Use timestamps (and not pass in vector clocks/causal context). This >>> is ok if you're not editing objects, or you're ok with a bit of risk of >>> stale values. >>> >>> 2) Use causal context correctly (which means, read-before-you-write -- >>> in fact, the Update operation in the java client does this for you, I >>> think). And if Riak can't determine which version is correct, it will fall >>> back on timestamps. >>> >>> 3) Turn on siblings, for that bucket or bucket type. That way, Riak >>> will still try to use causal context to decide the right value. But if it >>> can't decide, it will store BOTH values, and give them back to you on the >>> next read, so that your application can decide which is the correct one. >>> >>> >>> >>> >>> >>> >>> >>> On Thu, Oct 8, 2015 at 1:56 AM, Vanessa Williams < >>> vanessa.willi...@thoughtwire.ca> wrote: >>> >>>> Hi Dmitri, what would be the benefit of r=2, exactly? It isn't >>>> necessary to trigger read-repair, is it? If it's important I'd rather try >>>> it sooner than later... >>>> >>>> Regards, >>>> Vanessa >>>> >>>> >>>> >>>> On Wed, Oct 7, 2015 at 4:02 PM, Dmitri Zagidulin <dzagidu...@basho.com> >>>> wrote: >>>> >>>>> Glad you sorted it out! >>>>> >>>>> (I do want to encourage you to bump your R setting to at least 2, >>>>> though. Run some tests -- I think you'll find that the difference in speed >>>>> will not be noticeable, but you do get a lot more data resilience with 2.) >>>>> >>>>> On Wed, Oct 7, 2015 at 6:24 PM, Vanessa Williams < >>>>> vanessa.willi...@thoughtwire.ca> wrote: >>>>> >>>>>> Hi Dmitri, well...we solved our problem to our satisfaction but it >>>>>> turned out to be something unexpected. >>>>>> >>>>>> The keys were two properties mentioned in a blog post on "configuring >>>>>> Riak’s oft-subtle behavioral characteristics": >>>>>> http://basho.com/posts/technical/riaks-config-behaviors-part-4/ >>>>>> >>>>>> notfound_ok= false >>>>>> basic_quorum=true >>>>>> >>>>>> The 2nd one just makes things a little faster, but the first one is >>>>>> the one whose default value of true was killing us. >>>>>> >>>>>> With r=1 and notfound_ok=true (default) the first node to respond, if >>>>>> it didn't find the requested key, the authoritative answer was "this key >>>>>> is >>>>>> not found". Not what we were expecting at all. >>>>>> >>>>>> With the changed settings, it will wait for a quorum of responses and >>>>>> only if *no one* finds the key will "not found" be returned. Perfect. >>>>>> (Without this setting it would wait for all responses, not ideal.) >>>>>> >>>>>> Now there is only one snag, which is that if the Riak node the client >>>>>> connects to goes down, there will be no communication and we have a >>>>>> problem. This is easily solvable with a load-balancer, though for >>>>>> complicated reasons we actually don't need to do that right now. It's >>>>>> just >>>>>> acceptable for us temporarily. Later, we'll get the load-balancer working >>>>>> and even that won't be a problem. >>>>>> >>>>>> I *think* we're ok now. Thanks for your help! >>>>>> >>>>>> Regards, >>>>>> Vanessa >>>>>> >>>>>> >>>>>> >>>>>> On Wed, Oct 7, 2015 at 9:33 AM, Dmitri Zagidulin < >>>>>> dzagidu...@basho.com> wrote: >>>>>> >>>>>>> Yeah, definitely find out what the sysadmin's experience was, with >>>>>>> the load balancer. It could have just been a wrong configuration or >>>>>>> something. >>>>>>> >>>>>>> And yes, that's the documentation page I recommend - >>>>>>> http://docs.basho.com/riak/latest/ops/advanced/configs/load-balancing-proxy/ >>>>>>> Just set up HAProxy, and point your Java clients to its IP. >>>>>>> >>>>>>> The drawbacks to load-balancing on the java client side (yes, the >>>>>>> cluster object) instead of a standalone load balancer like HAProxy, are >>>>>>> the >>>>>>> following: >>>>>>> >>>>>>> 1) Adding node means code changes (or at very least, config file >>>>>>> changes) rolled out to all your clients. Which turns out to be a pretty >>>>>>> serious hassle. Instead, HAProxy allows you to add or remove nodes >>>>>>> without >>>>>>> changing any java code or config files. >>>>>>> >>>>>>> 2) Performance. We've ran many tests to compare performance, and >>>>>>> client-side load balancing results in significantly lower throughput >>>>>>> than >>>>>>> you'd have using haproxy (or nginx). (Specifically, you actually want to >>>>>>> use the 'leastconn' load balancing algorithm with HAProxy, instead of >>>>>>> round >>>>>>> robin). >>>>>>> >>>>>>> 3) The health check on the client side (so that the java load >>>>>>> balancer can tell when a remote node is down) is much less intelligent >>>>>>> than >>>>>>> a dedicated load balancer would provide. With something like HAProxy, >>>>>>> you >>>>>>> should be able to take down nodes with no ill effects for the client >>>>>>> code. >>>>>>> >>>>>>> Now, if you load balance on the client side and you take a node >>>>>>> down, it's not supposed to stop working completely. (I'm not sure why >>>>>>> it's >>>>>>> failing for you, we can investigate, but it'll be easier to just use a >>>>>>> load >>>>>>> balancer). It should throw an error or two, but then start working again >>>>>>> (on the retry). >>>>>>> >>>>>>> Dmitri >>>>>>> >>>>>>> On Wed, Oct 7, 2015 at 2:45 PM, Vanessa Williams < >>>>>>> vanessa.willi...@thoughtwire.ca> wrote: >>>>>>> >>>>>>>> Hi Dmitri, thanks for the quick reply. >>>>>>>> >>>>>>>> It was actually our sysadmin who tried the load balancer approach >>>>>>>> and had no success, late last evening. However I haven't discussed the >>>>>>>> gory >>>>>>>> details with him yet. The failure he saw was at the application level >>>>>>>> (i.e. >>>>>>>> failure to read a key), but I don't know a) how he set up the LB or b) >>>>>>>> what >>>>>>>> the Java exception was, if any. I'll find that out in an hour or two >>>>>>>> and >>>>>>>> report back. >>>>>>>> >>>>>>>> I did find this article just now: >>>>>>>> >>>>>>>> >>>>>>>> http://docs.basho.com/riak/latest/ops/advanced/configs/load-balancing-proxy/ >>>>>>>> >>>>>>>> So I suppose we'll give those suggestions a try this morning. >>>>>>>> >>>>>>>> What is the drawback to having the client connect to all 4 nodes >>>>>>>> (the cluster client, I assume you mean?) My understanding from reading >>>>>>>> articles I've found is that one of the nodes going away causes that >>>>>>>> client >>>>>>>> to fail as well. Is that what you mean, or are there other drawbacks as >>>>>>>> well? >>>>>>>> >>>>>>>> If there's anything else you can recommend, or links other than the >>>>>>>> one above you can point me to, it would be much appreciated. We expect >>>>>>>> both >>>>>>>> node failure and deliberate node removal for upgrade, repair, >>>>>>>> replacement, >>>>>>>> etc. >>>>>>>> >>>>>>>> Regards, >>>>>>>> Vanessa >>>>>>>> >>>>>>>> On Wed, Oct 7, 2015 at 8:29 AM, Dmitri Zagidulin < >>>>>>>> dzagidu...@basho.com> wrote: >>>>>>>> >>>>>>>>> Hi Vanessa, >>>>>>>>> >>>>>>>>> Riak is definitely meant to run behind a load balancer. (Or, at >>>>>>>>> the worst case, to be load-balanced on the client side. That is, all >>>>>>>>> clients connect to all 4 nodes). >>>>>>>>> >>>>>>>>> When you say "we did try putting all 4 Riak nodes behind a >>>>>>>>> load-balancer and pointing the clients at it, but it didn't help." -- >>>>>>>>> what >>>>>>>>> do you mean exactly, by "it didn't help"? What happened when you tried >>>>>>>>> using the load balancer? >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On Wed, Oct 7, 2015 at 1:57 PM, Vanessa Williams < >>>>>>>>> vanessa.willi...@thoughtwire.ca> wrote: >>>>>>>>> >>>>>>>>>> Hi all, we are still (for a while longer) using Riak 1.4 and the >>>>>>>>>> matching Java client. The client(s) connect to one node in the >>>>>>>>>> cluster >>>>>>>>>> (since that's all it can do in this client version). The cluster >>>>>>>>>> itself has >>>>>>>>>> 4 nodes (sorry, we can't use 5 in this scenario). There are 2 >>>>>>>>>> separate >>>>>>>>>> clients. >>>>>>>>>> >>>>>>>>>> We've tried both n_val = 3 and n_val=4. We achieve >>>>>>>>>> consistency-by-writes by setting w=all. Therefore, we only require >>>>>>>>>> one >>>>>>>>>> successful read (r=1). >>>>>>>>>> >>>>>>>>>> When all nodes are up, everything is fine. If one node fails, the >>>>>>>>>> clients can no longer read any keys at all. There's an exception >>>>>>>>>> like this: >>>>>>>>>> >>>>>>>>>> com.basho.riak.client.RiakRetryFailedException: >>>>>>>>>> java.net.ConnectException: Connection refused >>>>>>>>>> >>>>>>>>>> Now, it isn't possible that Riak can't operate when one node >>>>>>>>>> fails, so we're clearly missing something here. >>>>>>>>>> >>>>>>>>>> Note: we did try putting all 4 Riak nodes behind a load-balancer >>>>>>>>>> and pointing the clients at it, but it didn't help. >>>>>>>>>> >>>>>>>>>> Riak is a high-availability key-value store, so... why are we >>>>>>>>>> failing to achieve high-availability? Any suggestions greatly >>>>>>>>>> appreciated, >>>>>>>>>> and if more info is required I'll do my best to provide it. >>>>>>>>>> >>>>>>>>>> Thanks in advance, >>>>>>>>>> Vanessa >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> Vanessa Williams >>>>>>>>>> ThoughtWire Corporation >>>>>>>>>> http://www.thoughtwire.com >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> _______________________________________________ >>>>>>>>>> riak-users mailing list >>>>>>>>>> riak-users@lists.basho.com >>>>>>>>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com >>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> >> _______________________________________________ >> riak-users mailing list >> riak-users@lists.basho.com >> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com >> >> >
_______________________________________________ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com