Awesome! Yeah, Solr likes resources. If you're on 3 nodes now, consider
lowering your n_val from the default of 3 to 2. With the default ring_size
of 64, an n_val of 3, and a cluster of fewer than 5 nodes, you are not
guaranteed to have all copies of your data on distinct physical nodes. Some
nodes will receive 2 copies of the same data. Just be aware of that.
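
To make that concrete, here is a rough sketch of why copies can land on the
same node. It assumes a naive round-robin partition layout, which is NOT
Riak's real claim algorithm, and the helper names (naive_ring, preflist,
preflists_with_repeats) and node names are made up for illustration. It only
shows how the wrap-around at the end of a 64-partition ring can put two
replicas on one physical node when the cluster is small.

```python
def naive_ring(ring_size, nodes):
    """Assign partitions to nodes round-robin.

    Assumption for illustration only: Riak's actual claim algorithm is
    more involved than this.
    """
    return [nodes[i % len(nodes)] for i in range(ring_size)]


def preflist(ring, start, n_val):
    """Owners of the n_val consecutive partitions starting at `start`,
    wrapping around the end of the ring."""
    return [ring[(start + k) % len(ring)] for k in range(n_val)]


def preflists_with_repeats(ring_size, node_count, n_val):
    """Return (start_partition, owners) for every preference list in
    which at least one physical node appears more than once."""
    nodes = [f"riak{i + 1}" for i in range(node_count)]  # hypothetical names
    ring = naive_ring(ring_size, nodes)
    repeats = []
    for start in range(ring_size):
        owners = preflist(ring, start, n_val)
        if len(set(owners)) < n_val:
            repeats.append((start, owners))
    return repeats


if __name__ == "__main__":
    for node_count in (3, 5):
        bad = preflists_with_repeats(64, node_count, 3)
        print(f"{node_count} nodes, n_val=3: {len(bad)} preflists repeat a node")
        for start, owners in bad:
            print(f"  partition {start}: {owners}")
```

Because 64 is not a multiple of 3, the last partitions and the first one end
up next to each other on the ring with the same owner in this toy layout. The
real claim logic tries harder to spread partitions out, so treat this as an
intuition aid rather than a prediction for your cluster.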

On Friday, May 27, 2016, Guillaume Boddaert <
guilla...@lighthouse-analytics.co> wrote:

> A little follow-up for you guys, since I went offline for quite some time.
>
> As suggested, it was a Solr performance issue: we were able to prove that
> my old 5 hosts could handle the load without Solr/Yokozuna.
> The fact was that my hosts lacked CPU as well as RAM. Since Solr is
> pretty resource-hungry, I switched from:
> - 5 x 16 GB x 2 CPU hosts
> to
> - 3 x 120 GB x 8 CPU hosts
>
> And it now works like a charm.
>
> Thanks for the help (especially to Damien)
>
> Guillaume
>
> On 04/05/2016 15:17, Matthew Von-Maszewski wrote:
>
> Guillaume,
>
> Two points:
>
> 1.  You can send the “riak debug” from one server and I will verify that
> 2.0.18 is indicated in the LOG file.
>
> 2.  Your previous “riak debug” from server “riak1” indicated that only two
> CPU cores existed.  We performance test with eight, twelve, and twenty-four
> core servers, not two.  You have two heavy weight applications, Riak and
> Solr, competing for time on those two cores.  Actually, you have three
> applications due to leveldb’s background compaction operations.
>
> One leveldb compaction is CPU intensive.  The compaction reads a block
> from the disk, computes a CRC32 check of the block, decompresses the block,
> merges the keys of this block with one or more blocks from other files,
> then compresses the new block, computes a new CRC32, and finally writes the
> block to disk.  And there can be multiple compactions running
> simultaneously.  All of your CPU time could be periodically lost to leveldb
> compactions.
>
> There are some minor tunings we could do, like disabling compression in
> leveldb, that might help.  But I seriously doubt you are going to achieve
> your desired results with only two cores.  Adding a sixth server with two
> cores is not really going to help.
>
> Matthew
>
>
> On May 4, 2016, at 4:27 AM, Guillaume Boddaert <
> guilla...@lighthouse-analytics.co> wrote:
>
> Thanks, I've installed the new library as stated in the documentation
> using 2.0.18 files.
>
> I was unable to find the vnode LOG file from the documentation, as my
> vnodes look like files, not directories. So I can't verify that I am
> running the proper version of the library after my riak restart.
>
> Anyway, it has unfortunately no effect:
>
> http://www.awesomescreenshot.com/image/1219821/1b292613c051da86df5696034c114b14
>
> I think I'll try to add a 6th node that doesn't rely on network disks and
> see what happens.
>
> G.
>
>
> On 03/05/2016 22:47, Matthew Von-Maszewski wrote:
>
> Guillaume,
>
> A prebuilt eleveldb 2.0.18 for Debian 7 is found here:
>
> https://s3.amazonaws.com/downloads.basho.com/patches/eleveldb/2.0.18/eleveldb_2.0.18_debian7.tgz
>
>
> There are good instructions for applying an eleveldb patch here:
>
> http://docs.basho.com/community/productadvisories/leveldbsegfault/#patch-eleveldb-so
>
> Key points about the above web page:
>
> - use the eleveldb patch file link in this email, NOT links on the web page
>
> - the Debian directory listed on the web page will be slightly different
> than your Riak 2.1.4 installation:
>
>     /usr/lib/riak/lib/eleveldb-<something_different>/priv/
>
>
> Matthew
>
>
> On May 3, 2016, at 1:01 PM, Matthew Von-Maszewski <matth...@basho.com>
> wrote:
>
> Guillaume,
>
> I have reviewed the debug package for your riak1 server.  There are two
> potential areas of follow-up:
>
> 1.  You are running our most recent Riak 2.1.4 which has eleveldb 2.0.17.
> We have seen a case where a recent feature in eleveldb 2.0.17 caused too
> much cache flushing, impacting leveldb’s performance.  A discussion is here:
>
>   https://github.com/basho/leveldb/wiki/mv-timed-grooming2
>
> 2.  Yokozuna search was recently updated for some timeout problems.  Those
> updates are not yet in a public build.  One of our other engineers is
> likely to respond to you on that topic.
>
>
> An eleveldb 2.0.18 is tagged and available via github if you want to build
> it yourself.  Otherwise, Basho may be releasing prebuilt patches of
> eleveldb 2.0.18 in the near future.  But no date is currently set.
>
> Matthew
>
> On May 3, 2016, at 10:50 AM, Luke Bakken <lbak...@basho.com> wrote:
>
> Guillaume -
>
> You said earlier "My data are stored on an openstack volume that
> support up to 3000IOPS". There is a likelihood that your write load is
> exceeding the capacity of your virtual environment, especially if some
> Riak nodes are sharing physical disk or server infrastructure.
>
> Some suggestions:
>
> * If you're not using Riak Search, set "search = off" in riak.conf
>
> * Be sure to carefully read and apply all tunings:
> http://docs.basho.com/riak/kv/2.1.4/using/performance/
>
> * You may wish to increase the memory dedicated to leveldb:
> http://docs.basho.com/riak/kv/2.1.4/configuring/backend/#leveldb
>
> --
> Luke Bakken
> Engineer
> lbak...@basho.com
>
>
> On Tue, May 3, 2016 at 7:33 AM, Guillaume Boddaert
> <guilla...@lighthouse-analytics.co> wrote:
>
> Hi,
>
> Sorry for the delay, I've spent a lot of time trying to understand whether
> the problem was elsewhere.
> I've simplified my infrastructure to a simple layout that no longer relies
> on a load balancer, and also corrected some minor performance issues in
> my workers.
>
> At the moment, I have up to 32 workers calling Riak for writes,
> each of them set to:
> w=1
> dw=0
> timeout=1000
> using protobuf
> a timed-out attempt is retried 180s later
>
> From my application server's perspective, 23% of the calls are rejected by
> timeout (75446 tries, 57564 success, 17578 timeout).
>
> Here is a sample riak-admin stat for one of my 5 hosts:
>
> node_put_fsm_time_100 : 999331
> node_put_fsm_time_95 : 773682
> node_put_fsm_time_99 : 959444
> node_put_fsm_time_mean : 156242
> node_put_fsm_time_median : 20235
> vnode_put_fsm_time_100 : 5267527
> vnode_put_fsm_time_95 : 2437457
> vnode_put_fsm_time_99 : 4819538
> vnode_put_fsm_time_mean : 175567
> vnode_put_fsm_time_median : 6928
>
> I am using leveldb, so I can't tune the bitcask backend as suggested.
>
> I've changed the vm.dirty settings and enabled them:
> admin@riak1:~$ sudo sysctl -a | grep dirty
> vm.dirty_background_ratio = 0
> vm.dirty_background_bytes = 209715200
> vm.dirty_ratio = 40
> vm.dirty_bytes = 0
> vm.dirty_writeback_centisecs = 100
> vm.dirty_expire_centisecs = 200
>
> I now see less idle time between writes; iostat shows near-constant
> writes between 20 and 500 kb/s, with some surges around 4000 kb/s. That's
> better, but not that great.
>
> Here is the current configuration for my "activity_fr" bucket type and
> "tweet" bucket:
>
>
> admin@riak1:~$ http localhost:8098/types/activity_fr/props
> HTTP/1.1 200 OK
> Content-Encoding: gzip
> Content-Length: 314
> Content-Type: application/json
> Date: Tue, 03 May 2016 14:30:21 GMT
> Server: MochiWeb/1.1 WebMachine/1.10.8 (that head fake, tho)
> Vary: Accept-Encoding
> {
>    "props": {
>        "active": true,
>        "allow_mult": false,
>        "basic_quorum": false,
>        "big_vclock": 50,
>        "chash_keyfun": {
>            "fun": "chash_std_keyfun",
>            "mod": "riak_core_util"
>        },
>        "claimant": "r...@riak2.lighthouse-analytics.co",
>        "dvv_enabled": false,
>        "dw": "quorum",
>        "last_write_wins": true,
>        "linkfun": {
>            "fun": "mapreduce_linkfun",
>            "mod": "riak_kv_wm_link_walker"
>        },
>        "n_val": 3,
>        "notfound_ok": true,
>        "old_vclock": 86400,
>        "postcommit": [],
>        "pr": 0,
>        "precommit": [],
>        "pw": 0,
>        "r": "quorum",
>        "rw": "quorum",
>        "search_index": "activity_fr.20160422104506",
>        "small_vclock": 50,
>        "w": "quorum",
>        "young_vclock": 20
>    }
> }
>
> admin@riak1:~$ http localhost:8098/types/activity_fr/buckets/tweet/props
> HTTP/1.1 200 OK
> Content-Encoding: gzip
> Content-Length: 322
> Content-Type: application/json
> Date: Tue, 03 May 2016 14:30:02 GMT
> Server: MochiWeb/1.1 WebMachine/1.10.8 (that head fake, tho)
> Vary: Accept-Encoding
>
> {
>    "props": {
>        "active": true,
>        "allow_mult": false,
>        "basic_quorum": false,
>        "big_vclock": 50,
>        "chash_keyfun": {
>            "fun": "chash_std_keyfun",
>            "mod": "riak_core_util"
>        },
>        "claimant": "r...@riak2.lighthouse-analytics.co",
>        "dvv_enabled": false,
>        "dw": "quorum",
>        "last_write_wins": true,
>        "linkfun": {
>            "fun": "mapreduce_linkfun",
>            "mod": "riak_kv_wm_link_walker"
>        },
>        "n_val": 3,
>        "name": "tweet",
>        "notfound_ok": true,
>        "old_vclock": 86400,
>        "postcommit": [],
>        "pr": 0,
>        "precommit": [],
>        "pw": 0,
>        "r": "quorum",
>        "rw": "quorum",
>        "search_index": "activity_fr.20160422104506",
>        "small_vclock": 50,
>        "w": "quorum",
>        "young_vclock": 20
>    }
> }
>
> I really don't know what to do. Can you help?
>
> Guillaume
>
>
> _______________________________________________
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>

-- 


Alexander Sicular
Solutions Architect
Basho Technologies
9175130679
@siculars