Hi Guillaume,

From your bucket properties it looks like you are using Search, and I assume 
that is Search 2.0 (i.e., Yokozuna) and not the legacy Riak Search.

It is true that in the current 2.0 and 2.1 trunks the indexing into Solr via 
Yokozuna is synchronous with the vnode -- a long-running indexing operation can 
block the vnode until it completes, which can increase latency for ordinary 
Riak gets and puts.  You may be able to verify this by checking the mailbox 
lengths on your vnodes -- you may see messages enqueued while the Solr indexing 
completes. 
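
For example, one rough way to spot vnodes with growing mailboxes (the flag 
syntax may differ slightly between releases):

  riak-admin top -sort msg_q -lines 10

Sustained non-zero message queue lengths while Solr indexing is in flight 
would be consistent with the blocking described above.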

We are in the process of reviewing a PR [1] against the 2.0 trunk that adds 
batching and asynchronous indexing, decoupled from the Riak vnodes responsible 
for initiating the work.  It's a significant chunk of work, and we are doing 
everything we can to ensure it is stable and performant.  Please note that even 
with batching and asynchronous writes into Solr, you will experience some 
increased latency and diminished throughput, as more work needs to be done on 
each node to index (and possibly store) data in Solr.

This work will eventually be forward-merged into a post-2.0 branch.  Details 
will follow once we get the work into 2.0.

I hope that helps.

-Fred

[1] https://github.com/basho/yokozuna/pull/634

> On May 3, 2016, at 1:01 PM, Matthew Von-Maszewski <matth...@basho.com> wrote:
> 
> Guillaume,
> 
> I have reviewed the debug package for your riak1 server.  There are two 
> potential areas of follow-up:
> 
> 1.  You are running our most recent Riak 2.1.4, which includes eleveldb 2.0.17.  We 
> have seen a case where a recent feature in eleveldb 2.0.17 caused too much 
> cache flushing, impacting leveldb’s performance.  A discussion is here:
> 
>   https://github.com/basho/leveldb/wiki/mv-timed-grooming2
> 
> 2.  Yokozuna search was recently updated to address some timeout problems.  Those 
> updates are not yet in a public build.  One of our other engineers will likely 
> respond to you on that topic.
> 
> 
> eleveldb 2.0.18 is tagged and available via GitHub if you want to build it 
> yourself.  Otherwise, Basho may release prebuilt patches of eleveldb 
> 2.0.18 in the near future, but no date is currently set.
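> 
> A rough sketch of building it yourself, assuming the standard eleveldb 
> rebar/Makefile layout (the tag name below assumes it matches the version 
> string; adjust for your environment):
> 
>   git clone https://github.com/basho/eleveldb.git
>   cd eleveldb
>   git checkout 2.0.18
>   make          # builds the bundled leveldb and the Erlang NIF via rebar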
> 
> Matthew
> 
>> On May 3, 2016, at 10:50 AM, Luke Bakken <lbak...@basho.com> wrote:
>> 
>> Guillaume -
>> 
>> You said earlier "My data are stored on an openstack volume that
>> support up to 3000IOPS".  It is likely that your write load is
>> exceeding the capacity of your virtual environment, especially if some
>> Riak nodes are sharing physical disk or server infrastructure.
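>> 
>> One quick way to see whether the volume is saturated while the write load
>> is running (column names vary slightly between sysstat versions):
>> 
>>   iostat -x 1
>>   # watch the await and %util columns for the device backing the Riak data dir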
>> 
>> Some suggestions:
>> 
>> * If you're not using Riak Search, set "search = off" in riak.conf
>> 
>> * Be sure to carefully read and apply all of the tunings here:
>> http://docs.basho.com/riak/kv/2.1.4/using/performance/
>> 
>> * You may wish to increase the memory dedicated to leveldb (see the
>> riak.conf sketch after this list):
>> http://docs.basho.com/riak/kv/2.1.4/configuring/backend/#leveldb
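>> 
>> A minimal riak.conf sketch combining the first and last suggestions above
>> (the percentage is only an illustrative value, not a recommendation for
>> this workload; the leveldb key only applies when leveldb is your backend):
>> 
>>   ## disable search if you are not using it
>>   search = off
>>   ## cap the memory leveldb may use, as a percentage of total RAM
>>   leveldb.maximum_memory.percent = 70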
>> 
>> --
>> Luke Bakken
>> Engineer
>> lbak...@basho.com
>> 
>> 
>> On Tue, May 3, 2016 at 7:33 AM, Guillaume Boddaert
>> <guilla...@lighthouse-analytics.co> wrote:
>>> Hi,
>>> 
>>> Sorry for the delay; I've spent a lot of time trying to understand whether the
>>> problem was elsewhere.
>>> I've simplified my infrastructure to a simple layout that no longer relies on a
>>> load balancer, and I've also corrected some minor performance issues on my
>>> workers.
>>> 
>>> At the moment, I have up to 32 workers calling Riak for writes. Each of them
>>> is configured with:
>>> w=1
>>> dw=0
>>> timeout=1000
>>> using protobuf
>>> a timed-out attempt is retried 180s later
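>>> 
>>> For reference, here is roughly the same write expressed over the HTTP API
>>> with httpie (the key and payload are made-up examples; the workers really
>>> go through protobuf):
>>> 
>>>   http PUT localhost:8098/types/activity_fr/buckets/tweet/keys/example-key \
>>>       w==1 dw==0 timeout==1000 text="hello"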
>>> 
>>> From my application server's perspective, 23% of the calls are rejected by
>>> timeout (75446 tries, 57564 successes, 17578 timeouts).
>>> 
>>> Here is a sample riak-admin stat for one of my 5 hosts:
>>> 
>>> node_put_fsm_time_100 : 999331
>>> node_put_fsm_time_95 : 773682
>>> node_put_fsm_time_99 : 959444
>>> node_put_fsm_time_mean : 156242
>>> node_put_fsm_time_median : 20235
>>> vnode_put_fsm_time_100 : 5267527
>>> vnode_put_fsm_time_95 : 2437457
>>> vnode_put_fsm_time_99 : 4819538
>>> vnode_put_fsm_time_mean : 175567
>>> vnode_put_fsm_time_median : 6928
>>> 
>>> I am using leveldb, so I can't tune the bitcask backend as suggested.
>>> 
>>> I've changed the vm dirty settings and enabled them:
>>> 
>>> admin@riak1:~$ sudo sysctl -a | grep dirty
>>> vm.dirty_background_ratio = 0
>>> vm.dirty_background_bytes = 209715200
>>> vm.dirty_ratio = 40
>>> vm.dirty_bytes = 0
>>> vm.dirty_writeback_centisecs = 100
>>> vm.dirty_expire_centisecs = 200
>>> 
>>> I've seen less idle time between writes; iostat shows near-constant
>>> writes between 20 and 500 kB/s, with some surges around 4000 kB/s. That's
>>> better, but still not great.
>>> 
>>> Here is the current configuration for my "activity_fr" bucket type and
>>> "tweet" bucket:
>>> 
>>> 
>>> admin@riak1:~$ http localhost:8098/types/activity_fr/props
>>> HTTP/1.1 200 OK
>>> Content-Encoding: gzip
>>> Content-Length: 314
>>> Content-Type: application/json
>>> Date: Tue, 03 May 2016 14:30:21 GMT
>>> Server: MochiWeb/1.1 WebMachine/1.10.8 (that head fake, tho)
>>> Vary: Accept-Encoding
>>> {
>>>    "props": {
>>>        "active": true,
>>>        "allow_mult": false,
>>>        "basic_quorum": false,
>>>        "big_vclock": 50,
>>>        "chash_keyfun": {
>>>            "fun": "chash_std_keyfun",
>>>            "mod": "riak_core_util"
>>>        },
>>>        "claimant": "r...@riak2.lighthouse-analytics.co",
>>>        "dvv_enabled": false,
>>>        "dw": "quorum",
>>>        "last_write_wins": true,
>>>        "linkfun": {
>>>            "fun": "mapreduce_linkfun",
>>>            "mod": "riak_kv_wm_link_walker"
>>>        },
>>>        "n_val": 3,
>>>        "notfound_ok": true,
>>>        "old_vclock": 86400,
>>>        "postcommit": [],
>>>        "pr": 0,
>>>        "precommit": [],
>>>        "pw": 0,
>>>        "r": "quorum",
>>>        "rw": "quorum",
>>>        "search_index": "activity_fr.20160422104506",
>>>        "small_vclock": 50,
>>>        "w": "quorum",
>>>        "young_vclock": 20
>>>    }
>>> }
>>> 
>>> admin@riak1:~$ http localhost:8098/types/activity_fr/buckets/tweet/props
>>> HTTP/1.1 200 OK
>>> Content-Encoding: gzip
>>> Content-Length: 322
>>> Content-Type: application/json
>>> Date: Tue, 03 May 2016 14:30:02 GMT
>>> Server: MochiWeb/1.1 WebMachine/1.10.8 (that head fake, tho)
>>> Vary: Accept-Encoding
>>> 
>>> {
>>>    "props": {
>>>        "active": true,
>>>        "allow_mult": false,
>>>        "basic_quorum": false,
>>>        "big_vclock": 50,
>>>        "chash_keyfun": {
>>>            "fun": "chash_std_keyfun",
>>>            "mod": "riak_core_util"
>>>        },
>>>        "claimant": "r...@riak2.lighthouse-analytics.co",
>>>        "dvv_enabled": false,
>>>        "dw": "quorum",
>>>        "last_write_wins": true,
>>>        "linkfun": {
>>>            "fun": "mapreduce_linkfun",
>>>            "mod": "riak_kv_wm_link_walker"
>>>        },
>>>        "n_val": 3,
>>>        "name": "tweet",
>>>        "notfound_ok": true,
>>>        "old_vclock": 86400,
>>>        "postcommit": [],
>>>        "pr": 0,
>>>        "precommit": [],
>>>        "pw": 0,
>>>        "r": "quorum",
>>>        "rw": "quorum",
>>>        "search_index": "activity_fr.20160422104506",
>>>        "small_vclock": 50,
>>>        "w": "quorum",
>>>        "young_vclock": 20
>>>    }
>>> }
>>> 
>>> I really don't know what to do. Can you help?
>>> 
>>> Guillaume
>> 

_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
