Re: Riak-CS , S3cmd and A single Node outage on a 3 node cluster

Brady Wetherington Fri, 23 Aug 2013 10:36:00 -0700

Oh, I thought the rule was "if you have one more node than your n_val, then
you can be sure each replica will be on a distinct node" - is that not
correct?
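
To check my own intuition here, this is a toy sketch of how two adjacent partitions can end up on the same node. It assumes plain round-robin ownership (partition i goes to node i % nodes) and a ring size of 64; both are my assumptions for illustration, and this is not Riak's actual claim algorithm. The function names are made up.

```python
def assign_partitions(ring_size, nodes):
    """Toy ownership model: partition i is owned by node i % nodes."""
    return [i % nodes for i in range(ring_size)]


def colliding_preflists(ring_size, nodes, n_val):
    """Return the starting partitions whose n_val consecutive partitions
    (wrapping around the ring) do not all land on distinct nodes."""
    owners = assign_partitions(ring_size, nodes)
    bad = []
    for start in range(ring_size):
        preflist = [owners[(start + k) % ring_size] for k in range(n_val)]
        if len(set(preflist)) < n_val:
            bad.append(start)
    return bad


if __name__ == "__main__":
    for nodes in (3, 4, 5):
        print(nodes, "nodes:", colliding_preflists(64, nodes, 2))
```

Under this toy model, the collision shows up where the ring wraps around: with 3 nodes, 64 is not a multiple of 3, so the last partition and the first one share an owner.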


Also, with an n_val of 2, I wonder how you would handle r and w. My guess
is that you *could* set them to 1 each and just deal with possible
consistency issues if you lost a node. But it would be 'weird' in that you
wouldn't really have a quorum to make: it'd be "whoever is up right now
wins". And then, when the downed node comes back, you might get
inconsistent answers until AAE fixes up your data. Read-repair would *not*
end up fixing your data, because the r value of 1 would be satisfied.
Does that sound right?
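
The "no real quorum" part can be put as simple arithmetic: a read is only guaranteed to touch a replica that acknowledged the latest write when r + w > n_val. Here is a small sketch of that general rule, with a brute-force check over replica subsets. The function names are made up, and this models only the overlap condition, not Riak's read-repair or AAE behavior.

```python
from itertools import combinations


def quorums_overlap(n_val, r, w):
    """True when every read set must share a replica with every write set."""
    return r + w > n_val


def stale_read_possible(n_val, r, w):
    """Brute force: is there a write set and a read set with no replica
    in common? If so, a read could miss the latest write entirely."""
    replicas = range(n_val)
    return any(
        not set(write_set) & set(read_set)
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )


if __name__ == "__main__":
    for n_val, r, w in [(2, 1, 1), (2, 2, 1), (3, 2, 2)]:
        print(n_val, r, w, quorums_overlap(n_val, r, w))
```

With n_val=2 and r=w=1 the sum is 2, which is not greater than n_val, so the read and the write can land on different replicas.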

I'm asking not only for the original poster; but to make sure I understand
what I've got here for myself too!

-B.




On Thu, Aug 22, 2013 at 1:41 PM, Kelly McLaughlin <ke...@basho.com> wrote:

> Idan,
>
> Actually in the case you described of using a 3 node Riak cluster with
> n_val of 2 the behavior you see makes perfect sense. When using only three
> nodes Riak does not guarantee that all replicas of an object will be on
> distinct physical nodes. So if you have one node down you can hit a case
> where both of the replicas of an object stored in Riak live on the downed
> node. This explains the reason you see occasional failure to retrieve an
> object with s3cmd, but most of the time it works just fine. It is for this
> reason we strongly recommend using at least 5 nodes in production
> deployments. For testing, using at least 4 nodes in the cluster will
> likely reduce the chances of this situation occurring versus using
> only 3.
>
> Also just be very cautious about reducing the n_val to 2. You will save on
> storage space doing that, but there is a trade-off to be made in
> availability and performance in doing so.
>
> Hope that helps shed some light.
>
> Kelly
>
> On August 22, 2013 at 10:12:06 AM, Idan Shinberg (idan.shinb...@idomoo.com)
> wrote:
>
> Thanks for the aid
>
> This is a testing environment. No one accesses it aside from me.
>
> Prior to creating the objects via riak-cs, I set the cluster to n_val 2
> for the default bucket props.
> On that empty cluster, I started creating objects (around 500).
> Then I stopped and killed one of the nodes.
> Then the issues mentioned above were seen.
>
> It still doesn't make any sense...
>
> Regards,
>
> Idan Shinberg
>
>
> System Architect
>
> Idomoo Ltd.
>
>
>
> Mob +972.54.562.2072
>
> email idan.shinb...@idomoo.com
>
> web www.idomoo.com
>
>
> _______________________________________________
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
>

