Russell, thank you for the answer.

> Maybe one of the ex-basho support people remembers? I’ll prod one in a
> back channel and see if they can help.

It would be great. Thank you once more.

On Wed, May 24, 2017 at 11:36 AM Russell Brown <russell.br...@icloud.com>
wrote:

> Also, this issue https://github.com/basho/riak_kv/issues/1188 suggests
> that adding the property `riak_kv.retry_put_coordinator_failure=false`
> may help in future. But it won’t help with your keys that already have
> too many siblings.
>
> On 24 May 2017, at 09:22, Russell Brown <russell.br...@icloud.com> wrote:
> >
> > > On 24 May 2017, at 09:11, Vladyslav Zakhozhai
> > > <v.zakhoz...@smartweb.com.ua> wrote:
> > >
> >> Hello,
> >>
> >> My Riak cluster still experiences "too many siblings" errors, and
> >> hinted handoffs are unable to finish completely. So "siblings will be
> >> resolved after hinted handoffs are finished" is unfortunately not my
> >> case.
> >>
> >> According to Basho's docs
> >> (http://docs.basho.com/riak/kv/2.2.3/learn/concepts/causal-context/#sibling-explosion)
> >> I need to enable the DVV conflict resolution mechanism. So here is a
> >> question:
> >>
> >> Is it safe to enable DVV on the default bucket type, and how does it
> >> affect existing data?
> >
> > It might not affect existing data enough. All the existing siblings are
> > “undotted” and would need a read-put cycle to resolve.
> >
> >> It may be a solution, may it not?
> >
> > You may require further action. I remember basho support helping someone
> > with a similar issue, and there was some manual intervention/scripted
> > solution, but I can’t remember what it was right now. I think those
> > objects (as logged) with the sibling issues need to be read and
> > resolved. Maybe one of the ex-basho support people remembers? I’ll prod
> > one in a back channel and see if they can help.
> >
> >> Why do I talk about the default bucket type? Because there is only one
> >> Riak client, Riak CS, and it does not manage the bucket types of the
> >> objects it PUTs (so the default bucket type is always used for PUTs).
> >> Is that correct?
> >
> > Yes.
> >
> >> Thank you in advance.
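
To make the "read-put cycle" mentioned above a bit more concrete, here is a minimal sketch in Python. It is only a toy model and rests on several assumptions: `ToyStore`, `ToyObject` and `read_resolve_write` are hypothetical names, the store is in-memory and is not a Riak client, and the integer `context` merely stands in for the opaque causal context a real read returns. The resolver (`max`) is a placeholder too; Riak CS manifests would need Riak CS's own merge logic rather than "pick one value".

```python
from dataclasses import dataclass, field


@dataclass
class ToyObject:
    """Hypothetical stored object: sibling values plus a context counter."""
    siblings: list = field(default_factory=list)
    context: int = 0  # stands in for the opaque causal context


class ToyStore:
    """In-memory toy, NOT a Riak client. It only models the rule that a write
    without the current context is kept as an extra sibling."""

    def __init__(self):
        self._objects = {}

    def get(self, key):
        obj = self._objects.setdefault(key, ToyObject())
        return list(obj.siblings), obj.context

    def put(self, key, value, context=None):
        obj = self._objects.setdefault(key, ToyObject())
        if context == obj.context:
            # The writer has seen every sibling, so its value replaces them.
            obj.siblings = [value]
        else:
            # Blind or stale write: stored next to the existing values.
            obj.siblings.append(value)
        obj.context += 1


def read_resolve_write(store, key, resolve):
    """Read all siblings, pick one value, write it back with the read context.
    Returns how many siblings were seen by the read."""
    siblings, context = store.get(key)
    if len(siblings) > 1:
        store.put(key, resolve(siblings), context=context)
    return len(siblings)


store = ToyStore()
for value in ("v1", "v2", "v3"):
    store.put("manifest-key", value)  # blind writes pile up as siblings

seen = read_resolve_write(store, "manifest-key", resolve=max)
print(seen, store.get("manifest-key")[0])
```

The point of the sketch is only the ordering: the write that collapses the siblings has to carry the context from a read that saw all of them, which is why changing a bucket property by itself would not shrink objects that already have hundreds of siblings.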
> >>
> >> On Fri, Jun 17, 2016 at 11:45 AM Vladyslav Zakhozhai
> >> <v.zakhoz...@smartweb.com.ua> wrote:
> >> Hi Russell,
> >>
> >> Thank you for your answer. I really appreciate your help.
> >>
> >> 2.1.3 is not actually the riak_kv version. It is the version of Basho's
> >> riak package. You can see the versions of the Riak subsystems below.
> >>
> >> Bucket properties:
> >> # riak-admin bucket-type list
> >> default (active)
> >>
> >> # riak-admin bucket-type status default
> >> default is active
> >>
> >> allow_mult: true
> >> basic_quorum: false
> >> big_vclock: 50
> >> chash_keyfun: {riak_core_util,chash_std_keyfun}
> >> dvv_enabled: false
> >> dw: quorum
> >> last_write_wins: false
> >> linkfun: {modfun,riak_kv_wm_link_walker,mapreduce_linkfun}
> >> n_val: 3
> >> notfound_ok: true
> >> old_vclock: 86400
> >> postcommit: []
> >> pr: 0
> >> precommit: []
> >> pw: 0
> >> r: quorum
> >> rw: quorum
> >> small_vclock: 50
> >> w: quorum
> >> write_once: false
> >> young_vclock: 20
> >>
> >> I did not mention that an upgrade from riak 1.5.4 took place a few
> >> months ago (about 6 months). As I understand it, DVV is disabled. Is it
> >> safe to migrate from vector clocks to DVV?
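
For anyone wanting to check the same properties across many nodes or bucket types, a small Python sketch follows. It assumes the `key: value` layout shown in the `riak-admin bucket-type status default` output quoted above; the function names and the trimmed `SAMPLE` text are made up for illustration. The check itself only restates what the thread says: with `allow_mult: true` and `dvv_enabled: false`, siblings are tracked with plain vector clocks rather than DVV.

```python
# Trimmed copy of the status output quoted in the thread (assumed layout).
SAMPLE = """\
default is active

allow_mult: true
chash_keyfun: {riak_core_util,chash_std_keyfun}
dvv_enabled: false
last_write_wins: false
n_val: 3
"""


def parse_bucket_type_status(text):
    """Turn 'key: value' lines into a dict; lines without a colon are skipped."""
    props = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # e.g. the "default is active" banner or a blank line
        props[key.strip()] = value.strip()
    return props


def uses_plain_vclocks(props):
    """True when siblings are allowed but DVV is switched off."""
    return props.get("allow_mult") == "true" and props.get("dvv_enabled") == "false"


props = parse_bucket_type_status(SAMPLE)
print(uses_plain_vclocks(props))
```

Values are kept as strings on purpose, since properties such as `chash_keyfun` are Erlang terms that this sketch makes no attempt to interpret.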
> >>
> >> Package versions:
> >> # dpkg -l | grep riak
> >> ii  riak     2.1.3-1  amd64  Riak is a distributed data store
> >> ii  riak-cs  2.1.0-1  amd64  Riak CS
> >>
> >> Subsystem versions:
> >> "clique_version" : "0.3.2-0-ge332c8f",
> >> "bitcask_version" : "1.7.2",
> >> "sys_driver_version" : "2.2",
> >> "riak_core_version" : "2.1.5-0-gb02ab53",
> >> "riak_kv_version" : "2.1.2-0-gf969bba",
> >> "riak_pipe_version" : "2.1.1-0-gb1ac2cf",
> >> "cluster_info_version" : "2.0.3-0-g76c73fc",
> >> "riak_auth_mods_version" : "2.1.0-0-g31b8b30",
> >> "erlydtl_version" : "0.7.0",
> >> "os_mon_version" : "2.2.13",
> >> "inets_version" : "5.9.6",
> >> "erlang_js_version" : "1.3.0-0-g07467d8",
> >> "riak_control_version" : "2.1.2-0-gab3f924",
> >> "xmerl_version" : "1.3.4",
> >> "protobuffs_version" : "0.8.1p5-0-gf88fc3c",
> >> "riak_sysmon_version" : "2.0.0",
> >> "compiler_version" : "4.9.3",
> >> "eleveldb_version" : "2.1.10-0-g0537ca9",
> >> "lager_version" : "2.1.1",
> >> "sasl_version" : "2.3.3",
> >> "riak_dt_version" : "2.1.1-0-ga2986bc",
> >> "runtime_tools_version" : "1.8.12",
> >> "yokozuna_version" : "2.1.2-0-g3520d11",
> >> "riak_search_version" : "2.1.1-0-gffe2113",
> >> "sys_system_version" : "Erlang R16B02_basho8 (erts-5.10.3) [source] [64-bit] [smp:4:4] [async-threads:64] [kernel-poll:true] [frame-pointer]",
> >> "basho_stats_version" : "1.0.3",
> >> "crypto_version" : "3.1",
> >> "merge_index_version" : "2.0.1-0-g0c8f77c",
> >> "kernel_version" : "2.16.3",
> >> "stdlib_version" : "1.19.3",
> >> "riak_pb_version" : "2.1.0.2-0-g620bc70",
> >> "syntax_tools_version" : "1.6.11",
> >> "goldrush_version" : "0.1.7",
> >> "ibrowse_version" : "4.0.2",
> >> "mochiweb_version" : "2.9.0",
> >> "exometer_core_version" : "1.0.0-basho2-0-gb47a5d6",
> >> "ssl_version" : "5.3.1",
> >> "public_key_version" : "0.20",
> >> "pbkdf2_version" : "2.0.0-0-g7076584",
> >> "sidejob_version" : "2.0.0-0-gc5aabba",
> >> "webmachine_version" : "1.10.8-0-g7677c24",
> >> "poolboy_version" : "0.8.1p3-0-g8bb45fb",
> >> "riak_api_version" : "2.1.2-0-gd8d510f",
> >> "asn1_version" : "2.0.3",
> >>
> >>
> >> On Fri, Jun 17, 2016 at 10:45 AM Russell Brown <russell.br...@me.com>
> >> wrote:
> >> What version of riak_kv is behind this riak_cs install, please? Is it
> >> really 2.1.3 as stated below? This looks and sounds like sibling
> >> explosion, which is fixed in riak 2.0 and above. Are you sure that you
> >> are using the DVV enabled setting for riak_cs bucket properties? Can
> >> you post your bucket properties please?
> >>
> >> On 16 Jun 2016, at 23:54, Vladyslav Zakhozhai
> >> <v.zakhoz...@smartweb.com.ua> wrote:
> >>
> >>> Hello.
> >>>
> >>> I see a very interesting and confusing thing.
> >>>
> >>> From my previous letter you can see that the sibling count on manifest
> >>> objects is about 100 (actually it is in the range 100-300).
> >>> Unfortunately, my problem is that almost all PUT requests are failing
> >>> with 500 Internal Server Error.
> >>>
> >>> Today I tried setting the max_siblings riak option to 500. There were
> >>> successful PUT requests, but not for long. Now I see the "max
> >>> siblings" error in the riak logs again, but the actual count is 500+
> >>> (earlier it was 100-300, as I mentioned).
> >>>
> >>> The period of time between setting max_siblings=500 and the errors
> >>> appearing in the log is about 30 minutes. I also want to draw your
> >>> attention to the fact that I have forbidden the PUT method on haproxy,
> >>> the frontend for riak cs.
> >>>
> >>>
> >>>
> >>> On Mon, Jun 6, 2016 at 1:17 AM Vladyslav Zakhozhai
> >>> <v.zakhoz...@smartweb.com.ua> wrote:
> >>> Hi, Luke.
> >>>
> >>> Thank you for your answer. I did not completely understand your point
> >>> about transfer-limit. How does it relate to my problem? The transfer
> >>> limit is a limit on concurrent data transfers from different nodes. Am
> >>> I right? Do you mean that riak can hand off one partition from several
> >>> nodes concurrently?
> >>>
> >>> Now I have transfer-limit 1 on all riak nodes.
> >>>
> >>> But I am not sure that my cluster will ever converge.
> >>> All nodes run low on memory and are periodically killed by the OOM
> >>> killer. I am trying to add new nodes to the cluster, but because of
> >>> the OOM killer problem this process is very, very slow.
> >>>
> >>> In the official docs I've read:
> >>>
> >>> "Sibling explosion occurs when an object rapidly collects siblings
> >>> that are not reconciled. This can lead to a variety of problems,
> >>> including degraded performance, especially if many objects in a
> >>> cluster suffer from siblings explosion. At the extreme, having an
> >>> enormous object in a node can cause reads of that object to crash the
> >>> entire node. Other issues include undue latency and out-of-memory
> >>> errors."
> >>>
> >>> I noticed that new nodes in the cluster do not experience such
> >>> problems (I mean running out of RAM).
> >>>
> >>> Regarding siblings, maybe you are right and this is a manifest object.
> >>> I can recognize the key name but not the bucket name. But more than
> >>> 100 siblings on many keys really confuses me. Each time I try to PUT
> >>> an object to Riak via the Riak CS S3 interface I get errors about
> >>> siblings.
> >>>
> >>> On Fri, Jun 3, 2016 at 6:43 PM Luke Bakken <lbak...@basho.com> wrote:
> >>> Hi Vladyslav,
> >>>
> >>> If you recognize the full name of the object raising the sibling
> >>> warning, it is most likely a manifest object. Sometimes, during hinted
> >>> handoff, you can see these messages. They should resolve after handoff
> >>> completes.
> >>>
> >>> Please see the documentation for the transfer-limit command as well:
> >>>
> >>> http://docs.basho.com/riak/kv/2.1.4/using/admin/riak-admin/#transfer-limit
> >>>
> >>> --
> >>> Luke Bakken
> >>> Engineer
> >>> lbak...@basho.com
> >>>
> >>>
> >>> On Fri, Jun 3, 2016 at 2:55 AM, Vladyslav Zakhozhai
> >>> <v.zakhoz...@smartweb.com.ua> wrote:
> >>>> Hi.
> >>>>
> >>>> I have trouble with PUTs to my Riak CS cluster.
> >>>> During this process I periodically see the following message in the
> >>>> Riak error.log:
> >>>>
> >>>> 2016-06-03 11:15:55.201 [error] <0.15536.142>@riak_kv_vnode:encode_and_put:2253 Put failure: too many siblings for object OBJECT_NAME (101)
> >>>>
> >>>> and also
> >>>>
> >>>> 2016-06-03 12:41:50.678 [error] <0.20448.515>@riak_api_pb_server:handle_info:331 Unrecognized message {7345880,{error,{too_many_siblings,101}}}
> >>>>
> >>>> Here OBJECT_NAME is the name of the object in Riak which has too many
> >>>> siblings.
> >>>>
> >>>> I am definitely sure that these objects are static. Nobody deletes
> >>>> them, nobody rewrites them. I have no idea how more than 100 siblings
> >>>> of such an object can occur.
> >>>>
> >>>> This issue has the following effects:
> >>>>
> >>>> - A great number of keys are loaded into RAM, and I am almost out of
> >>>>   RAM (does each sibling have its own key or a duplicate of the key?).
> >>>> - Nodes are slow, and adding new nodes is too slow.
> >>>> - The presence of "too many siblings" affects ownership handoffs.
> >>>>
> >>>> So I have several questions:
> >>>>
> >>>> - Can hinted or ownership handoffs affect the sibling count (I mean,
> >>>>   can siblings be created during ownership or hinted handoffs)?
> >>>> - Is there any workaround for this issue? Do I need to remove siblings
> >>>>   manually, or are they removed during merges, read repairs and so on?
> >>>>
> >>>>
> >>>> My configuration:
> >>>>
> >>>> - riak from basho's packages - 2.1.3-1
> >>>> - riak cs from basho's packages - 2.1.0-1
> >>>> - 24 riak/riak-cs nodes
> >>>> - 32 GB RAM per node
> >>>> - AAE is disabled
> >>>>
> >>>>
> >>>> I appreciate your help.
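
Since the affected objects are named in error.log, one way to get a list of keys to look at is to scan the log for the "too many siblings" message quoted above. The following Python sketch assumes each log entry sits on a single line in the format shown in this thread; `SIBLING_RE` and `worst_sibling_counts` are names made up for this example, and real object names may be formatted differently from the `OBJECT_NAME` placeholder.

```python
import re

# Matches the tail of the riak_kv_vnode message quoted in the thread:
#   "... too many siblings for object OBJECT_NAME (101)"
SIBLING_RE = re.compile(
    r"too many siblings for object (?P<key>.+) \((?P<count>\d+)\)\s*$"
)


def worst_sibling_counts(lines):
    """Map each logged object name to the highest sibling count seen for it."""
    worst = {}
    for line in lines:
        match = SIBLING_RE.search(line)
        if match is None:
            continue  # other messages, e.g. the riak_api_pb_server one
        key = match.group("key")
        count = int(match.group("count"))
        worst[key] = max(count, worst.get(key, 0))
    return worst


sample = [
    "2016-06-03 11:15:55.201 [error] <0.15536.142>@riak_kv_vnode:encode_and_put:2253 "
    "Put failure: too many siblings for object OBJECT_NAME (101)",
    "2016-06-03 12:41:50.678 [error] <0.20448.515>@riak_api_pb_server:handle_info:331 "
    "Unrecognized message {7345880,{error,{too_many_siblings,101}}}",
]
print(worst_sibling_counts(sample))
```

In practice the `lines` argument could simply be an open file handle on error.log, since the function only iterates over it.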
_______________________________________________ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com