Re: reads/writes during node replacement
Thank you Magnus.

On Mon, Nov 14, 2016 at 7:06 AM, Magnus Kessler wrote:
> On 12 November 2016 at 00:08, Johnny Tan wrote:
>
>> When doing a node replace (http://docs.basho.com/riak/1.4.12/ops/running/nodes/replacing/), after committing the plan, how does the cluster handle reads/writes? Do I include the new node in my app's config as soon as I commit, and let riak internally handle which node(s) will do the reads/writes? Or do I wait until ringready on the new node before being able to do reads/writes to it?
>>
>> johnny
>
> Hi Johnny,
>
> As soon as a node has been joined to the cluster it is capable of taking on requests. `riak-admin ringready` returns true after a join or leave operation once the new ring state has been communicated successfully to all nodes in the cluster.
>
> During a replacement operation, the leaving node will hand off [0] all its partitions to the joining node. Both nodes can handle requests during this phase and store data in the partitions they own. Once the leaving node has handed off all its partitions, it will automatically shut down. Please keep this in mind when configuring your clients or load balancers: clients should deal with nodes being temporarily or permanently unavailable.
>
> Kind Regards,
>
> Magnus
>
> [0]: http://docs.basho.com/riak/kv/2.1.4/using/reference/handoff/
>
> --
> Magnus Kessler
> Client Services Engineer
> Basho Technologies Limited
>
> Registered Office - 8 Lincoln's Inn Fields London WC2A 3BP Reg 07970431

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
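Magnus's advice that clients should deal with nodes being temporarily or permanently unavailable can be sketched as a small client-side failover loop. This is an illustrative sketch only, not the Riak client API; the host names and the `fake_get` operation are hypothetical stand-ins for a real fetch against a node.

```python
import random

def with_failover(nodes, operation, max_attempts=3):
    """Try `operation` against nodes in random order until one succeeds.

    `nodes` is a list of host strings; `operation` is a callable taking a
    host and raising ConnectionError when that node is unreachable.
    """
    candidates = random.sample(nodes, k=min(max_attempts, len(nodes)))
    last_error = None
    for host in candidates:
        try:
            return operation(host)
        except ConnectionError as exc:
            # Node down (e.g. the leaving node shut down after handoff);
            # fall through and try the next candidate.
            last_error = exc
    raise last_error

# Usage: a fake operation where only one node responds.
def fake_get(host):
    if host != "riak3.example.com":
        raise ConnectionError(host)
    return "value"

result = with_failover(
    ["riak1.example.com", "riak2.example.com", "riak3.example.com"],
    fake_get,
)
```

With this pattern, including the joining node in the client's node list as soon as the plan is committed is safe: requests that land on a node that has already shut down simply retry elsewhere.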
reads/writes during node replacement
When doing a node replace (http://docs.basho.com/riak/1.4.12/ops/running/nodes/replacing/), after committing the plan, how does the cluster handle reads/writes? Do I include the new node in my app's config as soon as I commit, and let riak internally handle which node(s) will do the reads/writes? Or do I wait until ringready on the new node before being able to do reads/writes to it?

johnny
Re: bitcask merges & deletions
Hm, definitely not a cronjob. I'll look at our app and see if there's anything that does something like that there.

On Wed, Jun 15, 2016 at 9:10 PM, Luke Bakken wrote:
> Hi Johnny,
>
> Since this seems to happen regularly on one node of your cluster (not necessarily the same node), do you have a repetitive process that performs a *lot* of updates or deletes on a single key that could be correlated to these merges?
> --
> Luke Bakken
> Engineer
> lbak...@basho.com
>
> On Wed, Jun 15, 2016 at 10:22 AM, Johnny Tan wrote:
> > We're running riak-1.4.2.
> >
> > Every few weeks, we have a riak node that starts to slowly fill up on disk space for several days, and then suddenly gains that space back again.
> >
> > In looking into this more today, I think I see what's going on.
> >
> > Per the console.log on a node that it's happening to right now, there is an unusually large number of merges happening right now. There are 6 total nodes in our cluster; it's only happening to this node today. (In previous weeks it's been other nodes, but it's always been one node at a time.)
> >
> > Normally, we get 50-70 merges per day per node (according to various nodes' console.log, including the node in question). Yesterday and today, the node in question has had several hundred merges.
> >
> > When I look inside the bitcask directory, I see a lot of files with this set of permissions:
> > -rwSrw-r--
> >
> > My understanding is that those are files marked for deletion after bitcask merging.
> >
> > The number of those files is currently growing, and from a spot-check, they indeed match up as the files that have been merged.
> >
> > So it seems the two are related: a lot of merges are happening, which then causes a large number of files to be marked for deletion, and those marked files are piling up and not getting deleted for some reason.
> >
> > If I don't do anything, those files eventually get deleted, and everything is good again for another couple of weeks until it happens to another node. But the disk usage does get high enough to alert us, and obviously we don't want it to get anywhere near 100%.
> >
> > I'm trying to figure out why this happens when it does. One thing I noticed is a difference in the merge log entries.
> >
> > Here's one from a "normal" day; nearly all the entries for that day are roughly this same length and take the same amount of time to merge:
> > 2016-06-10 05:27:39.426 UTC [info] <0.15230.160> Merged {["/var/lib/riak/bitcask/890602560248518965780370444936484965102833893376/84000.bitcask.data","/var/lib/riak/bitcask/890602560248518965780370444936484965102833893376/83999.bitcask.data"],[]} in 11.902028 seconds.
> >
> > But here's one from today on the problematic node:
> > 2016-06-15 17:13:40.626 UTC [info] <0.17903.500> Merged {["/var/lib/riak/bitcask/1233142006497949337234359077604363797834693083136/83633.bitcask.data", [quoted file list truncated in the archive; the full entry appears in the original message below]
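The `S` in the `-rwSrw-r--` permissions Johnny describes is the setuid bit on a file with no owner-execute bit, which is how Bitcask flags merged-out data files awaiting deferred deletion per the thread. A minimal sketch of finding such files, assuming that setuid-bit convention; the scratch directory and file names below are fabricated for the example, not a real Bitcask directory:

```python
import os
import stat
import tempfile

def pending_delete_files(bitcask_dir):
    """Return files under `bitcask_dir` whose setuid bit is set,
    i.e. the `S` in `-rwSrw-r--` (merged, pending deferred deletion)."""
    marked = []
    for root, _dirs, files in os.walk(bitcask_dir):
        for name in files:
            path = os.path.join(root, name)
            if os.stat(path).st_mode & stat.S_ISUID:
                marked.append(path)
    return marked

# Usage: create one marked and one unmarked file in a scratch directory.
scratch = tempfile.mkdtemp()
open(os.path.join(scratch, "1.bitcask.data"), "w").close()
marked_path = os.path.join(scratch, "2.bitcask.data")
open(marked_path, "w").close()
os.chmod(marked_path, 0o4644)  # setuid bit + rw-r--r--

print(pending_delete_files(scratch))
```

Counting these over time (e.g. from cron into a metrics system) would show the pile-up Johnny describes before it trips a disk alert.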
bitcask merges & deletions
We're running riak-1.4.2.

Every few weeks, we have a riak node that starts to slowly fill up on disk space for several days, and then suddenly gains that space back again.

In looking into this more today, I think I see what's going on.

Per the console.log on a node that it's happening to right now, there is an unusually large number of merges happening right now. There are 6 total nodes in our cluster; it's only happening to this node today. (In previous weeks it's been other nodes, but it's always been one node at a time.)

Normally, we get 50-70 merges per day per node (according to various nodes' console.log, including the node in question). Yesterday and today, the node in question has had several hundred merges.

When I look inside the bitcask directory, I see a lot of files with this set of permissions:

-rwSrw-r--

My understanding is that those are files marked for deletion after bitcask merging.

The number of those files is currently growing, and from a spot-check, they indeed match up as the files that have been merged.

So it seems the two are related: a lot of merges are happening, which then causes a large number of files to be marked for deletion, and those marked files are piling up and not getting deleted for some reason.

If I don't do anything, those files eventually get deleted, and everything is good again for another couple of weeks until it happens to another node. But the disk usage does get high enough to alert us, and obviously we don't want it to get anywhere near 100%.

I'm trying to figure out why this happens when it does. One thing I noticed is a difference in the merge log entries.

Here's one from a "normal" day; nearly all the entries for that day are roughly this same length and take the same amount of time to merge:

2016-06-10 05:27:39.426 UTC [info] <0.15230.160> Merged {["/var/lib/riak/bitcask/890602560248518965780370444936484965102833893376/84000.bitcask.data","/var/lib/riak/bitcask/890602560248518965780370444936484965102833893376/83999.bitcask.data"],[]} in 11.902028 seconds.

But here's one from today on the problematic node:

2016-06-15 17:13:40.626 UTC [info] <0.17903.500> Merged {["/var/lib/riak/bitcask/1233142006497949337234359077604363797834693083136/83633.bitcask.data","/var/lib/riak/bitcask/1233142006497949337234359077604363797834693083136/83632.bitcask.data","/var/lib/riak/bitcask/1233142006497949337234359077604363797834693083136/83631.bitcask.data","/var/lib/riak/bitcask/1233142006497949337234359077604363797834693083136/83630.bitcask.data","/var/lib/riak/bitcask/1233142006497949337234359077604363797834693083136/83629.bitcask.data","/var/lib/riak/bitcask/1233142006497949337234359077604363797834693083136/83628.bitcask.data","/var/lib/riak/bitcask/1233142006497949337234359077604363797834693083136/83627.bitcask.data","/var/lib/riak/bitcask/1233142006497949337234359077604363797834693083136/83626.bitcask.data","/var/lib/riak/bitcask/1233142006497949337234359077604363797834693083136/83625.bitcask.data","/var/lib/riak/bitcask/1233142006497949337234359077604363797834693083136/83624.bitcask.data","/var/lib/riak/bitcask/1233142006497949337234359077604363797834693083136/83623.bitcask.data","/var/lib/riak/bitcask/1233142006497949337234359077604363797834693083136/83622.bitcask.data","/var/lib/riak/bitcask/1233142006497949337234359077604363797834693083136/83621.bitcask.data","/var/lib/riak/bitcask/1233142006497949337234359077604363797834693083136/83620.bitcask.data","/var/lib/riak/bitcask/1233142006497949337234359077604363797834693083136/83619.bitcask.data","/var/lib/riak/bitcask/1233142006497949337234359077604363797834693083136/83618.bitcask.data","/var/lib/riak/bitcask/1233142006497949337234359077604363797834693083136/83617.bitcask.data","/var/lib/riak/bitcask/1233142006497949337234359077604363797834693083136/83616.bitcask.data","/var/lib/riak/bitcask/1233142006497949337234359077604363797834693083136/83615.bitcask.data","/var/lib/riak/bitcask/1233142006497949337234359077604363797834693083136/83614.bitcask.data","/var/lib/riak/bitcask/1233142006497949337234359077604363797834693083136/83613.bitcask.data","/var/lib/riak/bitcask/12331420064979493372343590776043637978346930...",...],...} in 220.186043 seconds.

It's not just that it takes 20x longer to merge; it also seems to be doing a lot more files at once. What is going on?

I'm not sure how much of the app.config is relevant, but I'll at least paste just the bitcask and merge sections for now:

{bitcask, [
           {data_root, "/var/lib/riak/bitcask"},
           {dead_bytes_merge_trigger, 268435456},
           {dead_bytes_threshold, 67108864},
           {frag_merge_trigger, 60},
           {frag_threshold, 40},
           {io_mode, erlang},
           {max_file_size, 1073741824},
           {small_file_threshold, 134217728}
          ]},
{merge_index, [
           {buffer_rollover_size, 1048576},
           {data_root, "/var/lib/riak/merge_index"},
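For reference, the two merge *triggers* in the app.config above can be read as: schedule a merge when a file's dead-key fragmentation reaches frag_merge_trigger percent, or when its dead bytes reach dead_bytes_merge_trigger. This is a simplified model; the real logic lives in Bitcask's Erlang code and also uses the *_threshold settings to decide which files participate in the merge.

```python
def triggers_merge(dead_keys, total_keys, dead_bytes,
                   frag_merge_trigger=60,          # percent, from app.config
                   dead_bytes_merge_trigger=268435456):  # 256 MiB, from app.config
    """Simplified model of when Bitcask schedules a merge for a data file:
    either the percentage of dead keys or the count of dead bytes crosses
    its configured trigger."""
    frag_pct = 100.0 * dead_keys / total_keys if total_keys else 0.0
    return (frag_pct >= frag_merge_trigger
            or dead_bytes >= dead_bytes_merge_trigger)

# A file that is 70% dead keys triggers a merge; 10% does not.
print(triggers_merge(dead_keys=700, total_keys=1000, dead_bytes=0))  # True
print(triggers_merge(dead_keys=100, total_keys=1000, dead_bytes=0))  # False
```

Under this model, Luke's suggestion makes sense: a burst of updates or deletes to a hot key drives dead keys and dead bytes up quickly, which can push many files over their triggers at once, producing the hundreds-of-merges days.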
Re: Changing ring size on 1.4 cluster
Thank you Luke.

On Wed, Jun 1, 2016 at 1:46 PM, Luke Bakken wrote:
> Hi Johnny,
>
> Yes, the latter two are your main options. For a 1.4-series Riak installation, your only option is to bring up a new cluster with the desired ring size and replicate data.
> --
> Luke Bakken
> Engineer
> lbak...@basho.com
>
> On Fri, May 27, 2016 at 12:11 PM, Johnny Tan wrote:
> > The docs http://docs.basho.com/riak/kv/2.1.4/configuring/basic/#ring-size seem to imply that there's no easy, non-destructive way to change a cluster's ring size live for Riak 1.4.x.
> >
> > I thought about replacing one node at a time, but you can't join a new node or replace an existing one with a node that has a different ring size.
> >
> > I was also thinking of bringing up a completely new cluster with the new ring size, replicating the data from the original cluster, and taking a quick maintenance window to fail over to the new cluster.
> >
> > One other alternative seems to be to upgrade to 2.0, and then use 2.x's ability to resize the ring.
> >
> > Are these latter two my main options?
> >
> > johnny
Changing ring size on 1.4 cluster
The docs http://docs.basho.com/riak/kv/2.1.4/configuring/basic/#ring-size seem to imply that there's no easy, non-destructive way to change a cluster's ring size live for Riak 1.4.x.

I thought about replacing one node at a time, but you can't join a new node or replace an existing one with a node that has a different ring size.

I was also thinking of bringing up a completely new cluster with the new ring size, replicating the data from the original cluster, and taking a quick maintenance window to fail over to the new cluster.

One other alternative seems to be to upgrade to 2.0, and then use 2.x's ability to resize the ring.

Are these latter two my main options?

johnny
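The reason ring size can't be changed in place is that a key's partition is determined by where its hash falls among ring_size equal slices of the 160-bit SHA-1 keyspace; change ring_size and most keys map to different partitions. A loose illustration follows; note that Riak actually hashes an Erlang term built from the bucket and key, so the `bucket + b"/" + key` join below is purely illustrative, as are the bucket/key values.

```python
import hashlib

RING_TOP = 2 ** 160  # the SHA-1 keyspace Riak's ring is built on

def partition(bucket, key, ring_size):
    """Map a bucket/key pair onto one of `ring_size` equal partitions,
    loosely mimicking Riak's consistent-hashing scheme."""
    digest = hashlib.sha1(bucket + b"/" + key).digest()
    return int.from_bytes(digest, "big") // (RING_TOP // ring_size)

# The same key lands on different partition indices under different ring
# sizes, which is why a live 1.4 cluster can't simply change ring_size:
# nearly all data would belong somewhere else.
p64 = partition(b"users", b"johnny", 64)
p256 = partition(b"users", b"johnny", 256)
print(p64, p256)
```

Riak 2.x's ring resizing automates exactly this remapping, handing data off partition by partition, which is why the upgrade path works where an in-place config change can't.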
Re: uneven disk distribution
To follow up: since we use chef (configuration management), the riak configs are the same across all our riak nodes (except for stuff like hostnames/IPs, etc.).

I ran riak_kv:repair and it looks like it fixed the problem on node 004, but then a _different_ node (002) started to throw a bunch of locking errors. (I thought I saved a copy of that log somewhere but can't seem to find it now, and the old ones have rotated out.) Nothing I did would stem those errors on 002, even though 004 seemed perfectly fine after the repair.

In the end, I rm'd 002's bitcask directory, rejoined it to the cluster, and it seems to be back in shape now. No errors, and the nodes are all relatively similar in size -- 002 lags a little behind the others, but not in a worrisome way.

I'm sure this was related to Bitcask merging; I just still haven't pinpointed what it was. But I appreciate the input and suggestions.

johnny

On Fri, May 15, 2015 at 4:48 PM, Charlie Voiselle wrote:
> Johnny:
>
> Something else to look for would be any errors in the console.log related to Bitcask merging. It would be interesting to see if the unusual disk utilization was related to a specific partition. If it is, you could consider removing that particular partition and running riak_kv:repair to restore the replicas from the adjacent partitions. I can provide more information if you find that to be the case.
>
> Regards,
> Charlie Voiselle
> Client Services, Basho
>
> On May 14, 2015 10:06 PM, "Engel Sanchez" wrote:
>> Hi Johnny. Make sure that the configuration on that node is not different from the others. For example, it could be configured to never merge Bitcask files, so that space could never be reclaimed.
>>
>> http://docs.basho.com/riak/latest/ops/advanced/backends/bitcask/#Configuring-Bitcask
>>
>> On Thu, May 14, 2015 at 4:31 PM, Johnny Tan wrote:
>>> We have a 6-node test riak cluster. One of the nodes seems to be using far more disk:
>>>
>>> staging-riak001.pp /dev/sda3 15G 6.3G 7.2G 47% /
>>> staging-riak002.pp /dev/sda3 15G 6.4G 7.1G 48% /
>>> staging-riak003.pp /dev/sda3 15G 6.1G 7.5G 45% /
>>> staging-riak004.pp /dev/sda3 15G  14G 266M 99% /
>>> staging-riak005.pp /dev/sda3 15G 5.8G 7.7G 44% /
>>> staging-riak006.pp /dev/sda3 15G 6.3G 7.3G 47% /
>>>
>>> Specifically, /var/lib/riak/bitcask is using up most of that space. It seems to have files in there that are much older than on any of the other nodes. We've done maintenance of various sorts on this cluster -- as the name indicates, we use it as a staging ground before we go to production. I don't recall a specific issue per se, but I wouldn't rule it out.
>>>
>>> Is there a way to figure out whether there's an underlying issue here, or whether some of this disk space is not really current and can somehow be purged?
>>>
>>> What info would help answer those questions?
>>>
>>> johnny
uneven disk distribution
We have a 6-node test riak cluster. One of the nodes seems to be using far more disk:

staging-riak001.pp /dev/sda3 15G 6.3G 7.2G 47% /
staging-riak002.pp /dev/sda3 15G 6.4G 7.1G 48% /
staging-riak003.pp /dev/sda3 15G 6.1G 7.5G 45% /
staging-riak004.pp /dev/sda3 15G  14G 266M 99% /
staging-riak005.pp /dev/sda3 15G 5.8G 7.7G 44% /
staging-riak006.pp /dev/sda3 15G 6.3G 7.3G 47% /

Specifically, /var/lib/riak/bitcask is using up most of that space. It seems to have files in there that are much older than on any of the other nodes. We've done maintenance of various sorts on this cluster -- as the name indicates, we use it as a staging ground before we go to production. I don't recall a specific issue per se, but I wouldn't rule it out.

Is there a way to figure out whether there's an underlying issue here, or whether some of this disk space is not really current and can somehow be purged?

What info would help answer those questions?

johnny
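The kind of imbalance in the df output above is easy to catch automatically by comparing each node's usage against the cluster median. A minimal sketch using the numbers from the post; the 1.5x factor is an arbitrary threshold chosen for illustration:

```python
import statistics

# Used space per node (GiB), taken from the df output in the post.
usage_gib = {
    "staging-riak001": 6.3,
    "staging-riak002": 6.4,
    "staging-riak003": 6.1,
    "staging-riak004": 14.0,
    "staging-riak005": 5.8,
    "staging-riak006": 6.3,
}

def outliers(usage, factor=1.5):
    """Flag nodes using more than `factor` times the cluster's median
    disk space; robust to a single runaway node skewing the baseline."""
    median = statistics.median(usage.values())
    return sorted(node for node, used in usage.items() if used > factor * median)

print(outliers(usage_gib))  # → ['staging-riak004']
```

Using the median rather than the mean matters here: node 004's 14 GiB would drag a mean-based baseline upward, while the median stays at the healthy nodes' level.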
loadbalancing
I assume the best practice is to use a virtual IP that is load-balanced across the members of a riak ring for reads/writes? Since there is no state, I assume stickiness is not an issue. Are there any other potential gotchas?

johnny

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
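Because Riak requests are stateless, a client can also skip the virtual IP and rotate across nodes itself. A minimal sketch of client-side round-robin (the host:port strings are hypothetical); one gotcha raised elsewhere in this archive is that nodes do go away, e.g. a leaving node shuts down after handoff during a replacement, so a real client should pair rotation with health checks or retries.

```python
import itertools

class RoundRobinBalancer:
    """Minimal client-side round-robin over Riak endpoints, standing in
    for a load-balanced virtual IP. No stickiness is needed because
    requests carry no session state."""

    def __init__(self, nodes):
        self._cycle = itertools.cycle(nodes)

    def next_node(self):
        return next(self._cycle)

lb = RoundRobinBalancer(["riak1:8098", "riak2:8098", "riak3:8098"])
picks = [lb.next_node() for _ in range(4)]
print(picks)  # → ['riak1:8098', 'riak2:8098', 'riak3:8098', 'riak1:8098']
```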