Re: Riak 2.1.4 crashes with Out of Memory Error

2017-05-04 Thread Matthew Von-Maszewski
How much RAM is in this device?  What is the ring_size setting in riak.conf?

Thank you,
Matthew

Sent from my iPad

> On May 4, 2017, at 4:08 AM, Magnus Kessler  wrote:
> 
>> On 4 May 2017 at 09:56, Arulappan, Jerald (Jerald)  wrote:
>> Hi Magnus Kessler,
>> 
>>  
>> 
>> The configuration looks good.
>> 
>>  
>> 
>> [root@server205 bin]# ./riak config effective | grep "_dir"
>> 
>> anti_entropy.data_dir = $(platform_data_dir)/anti_entropy
>> 
>> bitcask.data_root = $(platform_data_dir)/bitcask
>> 
>> leveldb.data_root = $(platform_data_dir)/leveldb
>> 
>> log.console.file = $(platform_log_dir)/console.log
>> 
>> log.crash.file = $(platform_log_dir)/crash.log
>> 
>> log.error.file = $(platform_log_dir)/error.log
>> 
>> platform_bin_dir = ./bin
>> 
>> platform_data_dir = ./data
>> 
>> platform_etc_dir = ./etc
>> 
>> platform_lib_dir = ./lib
>> 
>> platform_log_dir = ./log
>> 
>> ring.state_dir = $(platform_data_dir)/ring
>> 
>> search.anti_entropy.data_dir = $(platform_data_dir)/yz_anti_entropy
>> 
>> search.root_dir = $(platform_data_dir)/yz
>> 
>> search.temp_dir = $(platform_data_dir)/yz_temp
>> 
>>  
>> 
>> Regards,
>> 
>> Jerald
>> 
>> 
>> 
> 
> Hi Jerald,
> 
> The errors that fill up the logs at a very high rate are due to the use of relative 
> file paths for platform_{bin,data,etc,lib,log}_dir. Those entries should 
> generally contain absolute file paths, such as /var/lib/riak, as init systems 
> may start the application from an arbitrary working directory. Please check 
> if the errors go away after adjusting platform_data_dir.
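> 
> For example (an illustrative sketch only; the exact paths depend on how the 
> package was installed on your platform):
> 
> platform_bin_dir  = /usr/lib/riak/bin
> platform_data_dir = /var/lib/riak
> platform_etc_dir  = /etc/riak
> platform_lib_dir  = /usr/lib/riak/lib
> platform_log_dir  = /var/log/riak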
> 
> Kind Regards,
> 
> Magnus
> 
> -- 
> Magnus Kessler
> Client Services Engineer
> Basho Technologies Limited
> 
> Registered Office - 8 Lincoln’s Inn Fields London WC2A 3BP Reg 07970431
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Object not found after successful PUT on S3 API

2017-05-02 Thread Matthew Von-Maszewski
Daniel,

1G of RAM is an interesting and untested challenge.  Others have succeeded in 
getting regular Riak KV to operate on a Raspberry Pi with 2G of RAM.

Here are the 3 things you can do in an attempt to drive down required RAM (in 
order of priority):

1.  reduce the ring size:  in riak.conf set “ring_size=16”.  It is currently 
128.  You will have to rebuild your dataset from scratch.

2.  reduce the memory model used by leveldb from “normal” to “developer”:

{multi_backend, [
    {be_default, riak_kv_eleveldb_backend, [
        {data_root, "/opt/data/ecrypt/riak/leveldb"},
        {limited_developer_mem, true}
    ]},
    {be_blocks, riak_kv_eleveldb_backend, [
        {data_root, "/opt/data/ecrypt/riak/blocks"},
        {limited_developer_mem, true}
    ]}
]}

3.  disable the active anti-entropy feature:  in riak.conf set "anti_entropy = 
passive"


The above 3 changes greatly reduce the runtime memory requirements.  I 
understand why your swappiness setting is helping.  It is pushing executable 
pages into swap in favor of data memory pages.  That is going to help in your 
situation.  Do not waste time going back to swappiness=0.

Matthew



> On May 2, 2017, at 8:39 AM, Daniel Miller  <mailto:dmil...@dimagi.com>> wrote:
> 
> Hi Mattew,
> 
> I have attached the file generated by riak-debug as requested. Thanks for 
> taking a look at this for me.
> 
> The node I ran this on has had swappiness temporarily set back to 0 as it was 
> when the load avg was spiking and the node was becoming unresponsive. I say 
> "temporarily" because I had changed swappiness to 20 on all nodes over the 
> weekend in an attempt to hopefully make the cluster more stable than it had 
> been with swappiness=0. Incidentally, setting swappiness to 20 seemed to calm 
> things down and the cluster has been stable with no issues over the weekend, 
> which is great news, although a little confusing. I did notice that memory 
> use is slowly increasing, so it's possible that once all of swap has been 
> consumed the cluster will become unstable again.
> 
> In case you haven't reviewed the history on this thread, this is not a 
> standard Riak CS configuration. I'm using leveldb for both be_blocks and 
> be_default as directed by Luke Bakken after he discovered that the previous 
> thing I had tried (storage_backend=leveldb with no advanced.config, and 
> therefore no multi_backend), could possibly result in silent loss of data due 
> to manifests being randomly overwritten.
> 
> The reason we're trying to use leveldb for both backends rather than bitcask 
> is to hopefully run Riak in a more RAM constrained environment than is 
> typical. As I understand it, bitcask keeps all keys in RAM while leveldb does 
> not. The tradeoff of using leveldb instead of bitcask is additional 
> latency, but so far this has not been a problem for us.
> 
> Daniel
> 
> 
> On Fri, Apr 28, 2017 at 10:15 AM, Matthew Von-Maszewski  <mailto:matth...@basho.com>> wrote:
> Daniel,
> 
> Something is wrong.  All instances of leveldb within a node share the total 
> memory configuration.  The memory is equally divided between all active 
> vnodes.  It is possible to create an OOM situation if total RAM is low and 
> vnodes count per node is high relative to RAM size.
> 
> The best next step would be for you to execute the riak-debug program on one 
> of the nodes known to experience OOM.  Send the resulting .tar.gz file 
> directly to me (no need to share that with the mailing list).  I will review 
> the memory situation and suggest options.
> 
> Matthew
> 
>> On Apr 28, 2017, at 8:22 AM, Daniel Miller > <mailto:dmil...@dimagi.com>> wrote:
>> 
>> Hi Luke,
>> 
>> I'm reviving this thread from March where we discussed a new backend 
>> configuration for our riak cluster. We have had a chance to test out the new 
>> recommended configuration, and so far we have not been successful in 
>> limiting the RAM usage of leveldb with multi_backend. We have tried various 
>> configurations to limit memory usage without success.
>> 
>> First try (default config).
>> riak.conf: leveldb.maximum_memory.percent = 70
>> 
>> Second try.
>> riak.conf: leveldb.maximum_memory.percent = 40
>> 
>> Third try
>> riak.conf: #leveldb.maximum_memory.percent = 40 (commented out)
>> advanced.config: [{eleveldb, [{total_leveldb_mem_percent, 30}]}, ...
>> 
>> In all cases (under load) riak consumes all available RAM and eventually 
>> becomes unresponsive, presumably due to OOM conditions. Is there a way to 
> limit the amount of RAM consumed by riak with the new multi_backend configuration?

Re: Object not found after successful PUT on S3 API

2017-04-28 Thread Matthew Von-Maszewski
Daniel,

Something is wrong.  All instances of leveldb within a node share the total 
memory configuration.  The memory is equally divided between all active vnodes. 
 It is possible to create an OOM situation if total RAM is low and vnodes count 
per node is high relative to RAM size.
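
As a rough illustration (numbers assumed for the example, not taken from your 
cluster): with 4 GB of RAM, total_leveldb_mem_percent = 70 and a ring_size of 64 
spread across 5 nodes (roughly 13 vnodes per node), each vnode gets about 
4 GB * 0.70 / 13, i.e. ~220 MB, well below the ~350 MB per vnode generally 
recommended for leveldb.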

The best next step would be for you to execute the riak-debug program on one of 
the nodes known to experience OOM.  Send the resulting .tar.gz file directly to 
me (no need to share that with the mailing list).  I will review the memory 
situation and suggest options.

Matthew

> On Apr 28, 2017, at 8:22 AM, Daniel Miller  wrote:
> 
> Hi Luke,
> 
> I'm reviving this thread from March where we discussed a new backend 
> configuration for our riak cluster. We have had a chance to test out the new 
> recommended configuration, and so far we have not been successful in limiting 
> the RAM usage of leveldb with multi_backend. We have tried various 
> configurations to limit memory usage without success.
> 
> First try (default config).
> riak.conf: leveldb.maximum_memory.percent = 70
> 
> Second try.
> riak.conf: leveldb.maximum_memory.percent = 40
> 
> Third try
> riak.conf: #leveldb.maximum_memory.percent = 40 (commented out)
> advanced.config: [{eleveldb, [{total_leveldb_mem_percent, 30}]}, ...
> 
> In all cases (under load) riak consumes all available RAM and eventually 
> becomes unresponsive, presumably due to OOM conditions. Is there a way to 
> limit the amount of RAM consumed by riak with the new multi_backend 
> configuration? For example, do we need to consider ring size or other 
> configuration parameters when calculating the value of 
> total_leveldb_mem_percent?
> 
> Notably, the old (storage_backend = leveldb in riak.conf, empty 
> advanced.config) clusters have had very good RAM and disk usage 
> characteristics. Is there any way we can make riak or riak cs avoid the rare 
> occasions where it overwrites the manifest file while using this (non-multi) 
> backend?
> 
> Thank you,
> Daniel Miller
> 
> 
> On Tue, Mar 7, 2017 at 3:58 PM, Luke Bakken  > wrote:
> Hi Daniel,
> 
> Thanks for providing all of that information.
> 
> You are missing important configuration for riak_kv that can only be provided 
> in an /etc/riak/advanced.config file. Please see the following document, 
> especially the section to which I link here:
> 
> http://docs.basho.com/riak/cs/2.1.1/cookbooks/configuration/riak-for-cs/#setting-up-the-proper-riak-backend
>  
> 
> 
> [
>     {riak_kv, [
>         % NOTE: double-check this path for your environment:
>         {add_paths, ["/usr/lib/riak-cs/lib/riak_cs-2.1.1/ebin"]},
>         {storage_backend, riak_cs_kv_multi_backend},
>         {multi_backend_prefix_list, [{<<"0b:">>, be_blocks}]},
>         {multi_backend_default, be_default},
>         {multi_backend, [
>             {be_default, riak_kv_eleveldb_backend, [
>                 {data_root, "/opt/data/ecryptfs/riak"}
>             ]},
>             {be_blocks, riak_kv_eleveldb_backend, [
>                 {data_root, "/opt/data/ecryptfs/riak_blocks"}
>             ]}
>         ]}
>     ]}
> ].
> 
> Your configuration will look like the above. The contents of this file are 
> merged with the contents of /etc/riak/riak.conf to produce the configuration 
> that Riak uses.
> 
> Notice that I chose riak_kv_eleveldb_backend twice because of the discussion 
> you had previously about RAM usage and bitcask 
> (http://lists.basho.com/pipermail/riak-users_lists.basho.com/2016-November/018801.html)
> 
> In your current configuration, you are not using the expected prefix for the 
> block data. My guess is that on very rare occasions your data happens to 
> overwrite the manifest for a file. You may also have corrupted files at this 
> point without noticing it at all.
> 
> IMPORTANT: you can't switch from your current configuration to this new one 
> without re-saving all of your data.
> 
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: How Riak Handle Request?

2017-04-18 Thread Matthew Von-Maszewski
"with a scheduler per thread"

Slight correction:  with a scheduler per logical CPU core.

If you have other applications on the same box, e.g. java code, you might want 
to reduce the number of schedulers.  See 
https://github.com/basho/leveldb/wiki/riak-tuning-2 for more details.
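
For example, on a 16-core box that also runs a JVM, the approach from that page can 
be expressed as an advanced.config entry like the following (a sketch; adjust the 
scheduler count to your own hardware):

[
  {vm_args, [{"+S", "14:14"}]}
].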

Matthew


> On Apr 18, 2017, at 9:34 AM, Christopher Meiklejohn 
>  wrote:
> 
> On Tue, Apr 18, 2017 at 3:16 PM, Carlo Pires  wrote:
>> Fred,
>> 
>> Are you saying that Riak doesn't start erlang runtime in SMP and a machine
>> with multiple processes will have only one dedicated to each Riak instance?
> 
> Hi Carlo,
> 
> Riak is started in SMP mode, with a scheduler per thread, but running
> within a single Unix process.
> 
> Thanks,
> - Christopher
> 
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: anti_entropy and solr taking up suspiciously large amounts of space

2017-04-03 Thread Matthew Von-Maszewski
Rohit,

My apologies for the delayed reply.  Too many conflicting demands on my time 
the past two weeks.

I reviewed the riak-debug package you shared.  I also discussed its contents 
with other Riak developers.

There does not appear to be anything unexpected.  The anti_entropy bloat is due 
to the bitcask backend not actively communicating TTL expirations to AAE.  This 
is a known issue.  Similarly, bitcask is not communicating TTL expirations to 
solr.  (The leveldb backend recently added expiry / TTL.  And it fails the same 
way in this scenario as bitcask.)

We have engineering designs in the works that will eventually correct this 
situation.  But the designs do not translate to code that you can use today.  
My apologies.

--> The 100% accurate approach today is to disable bitcask's TTL and create 
external jobs that prune your data via Delete operations.  Yes, this is going 
to create a bunch of extra disk operations.  But I am being honest.

--> You could reduce only the anti_entropy disk usage by changing the 
"anti_entropy" setting in riak.conf from "active" to "passive".  But this does 
nothing for solr.

Matthew

> On Mar 22, 2017, at 10:56 AM, Matthew Von-Maszewski  
> wrote:
> 
> Rohit,
> 
> Would you please run “riak-debug” on one server from the command line and 
> send the tar.gz file it creates directly to me.  Do not copy the email list.
> 
> Notes on its usage are here:  
> http://docs.basho.com/riak/kv/2.2.1/using/cluster-operations/inspecting-node/#riak-debug
> 
> The resulting debug package will give me and others at Basho a better picture 
> of the problem.  The alternative is about twenty rounds of “what about 
> this, oh, then what about that”.
> 
> Thanks,
> Matthew
> 
> 
> 
>> On Mar 21, 2017, at 9:53 PM, Rohit Sanbhadti > <mailto:sanbhadtiro...@vmware.com>> wrote:
>> 
>> Matthew,
>> 
>> To clarify, this happens on all nodes in our cluster (10+ nodes) although 
>> the exact size varies by 10s of GB. I’ve rolling restarted the cluster the 
>> last time this happened (last week) with no significant change in size, 
>> although the output from riak-admin aae-status and riak-admin search 
>> aae-status shows empty after the restart.  
>> 
>> -- 
>> Rohit S.
>> 
>> On 3/21/17, 5:25 PM, "Matthew Von-Maszewski" > <mailto:matth...@basho.com>> wrote:
>> 
>>Rohit,
>> 
>>If you restart the node does the elevated anti_entropy size decline after 
>> the restart?
>> 
>>Matthew
>> 
>> 
>>> On Mar 21, 2017, at 8:00 PM, Rohit Sanbhadti >> <mailto:sanbhadtiro...@vmware.com>> wrote:
>>> 
>>> Running Riak 2.2.0 on Ubuntu 16.04.1, we’ve noticed that anti_entropy is 
>>> taking up way too much space on all of our nodes. We use multi_backend with 
>>> mostly bitcask backends (relevant part of config pasted below). Has anyone 
>>> seen this before, or does anyone have an idea what might be causing this? 
>>> 
>>> storage_backend = multi
>>> multi_backend.bitcask_1day.storage_backend = bitcask
>>> multi_backend.bitcask_1day.bitcask.data_root = /var/lib/riak/bitcask_1day
>>> multi_backend.bitcask_1day.bitcask.expiry = 1d
>>> multi_backend.bitcask_1day.bitcask.expiry.grace_time = 1h
>>> multi_backend.bitcask_2day.storage_backend = bitcask
>>> multi_backend.bitcask_2day.bitcask.data_root = /var/lib/riak/bitcask_2day
>>> multi_backend.bitcask_2day.bitcask.expiry = 2d
>>> multi_backend.bitcask_2day.bitcask.expiry.grace_time = 1h
>>> multi_backend.bitcask_4day.storage_backend = bitcask
>>> multi_backend.bitcask_4day.bitcask.data_root = /var/lib/riak/bitcask_4day
>>> multi_backend.bitcask_4day.bitcask.expiry = 4d
>>> multi_backend.bitcask_4day.bitcask.expiry.grace_time = 1h
>>> multi_backend.bitcask_8day.storage_backend = bitcask
>>> multi_backend.bitcask_8day.bitcask.data_root = /var/lib/riak/bitcask_8day
>>> multi_backend.bitcask_8day.bitcask.expiry = 8d
>>> multi_backend.bitcask_8day.bitcask.expiry.grace_time = 1h
>>> multi_backend.leveldb_mult.storage_backend = leveldb
>>> multi_backend.leveldb_mult.leveldb.maximum_memory.percent = 30
>>> multi_backend.leveldb_mult.leveldb.data_root = /var/lib/riak/leveldb_mult
>>> multi_backend.bitcask_mult.storage_backend = bitcask
>>> multi_backend.bitcask_mult.bitcask.data_root = /var/lib/riak/bitcask_mult
>>> multi_backend.default = leveldb_mult
>>> 
>>> $ du -h -d 1 /var

Re: Unable to compile Riak on Raspberry Pi 3

2017-04-01 Thread Matthew Von-Maszewski
A fix for that problem in leveldb/util/perf_count.cc already exists on the 
"develop" branch of the github repository basho/leveldb.  Download it and 
rebuild. 
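
One possible sequence (an untested sketch; after the first build attempt the 
leveldb sources sit under deps/eleveldb/c_src/leveldb as a git checkout):

cd riak-2.2.1/deps/eleveldb/c_src/leveldb
git fetch origin && git checkout develop
cd -          # back to the riak-2.2.1 top level
make rel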

Sent from my iPad

> On Apr 1, 2017, at 4:08 AM, cocos  wrote:
> 
> Hello,
> 
> currently I'm trying to install Riak on the Raspberry Pi 3 for testing 
> purposes. I used the following instruction from basho:
> 
> http://docs.basho.com/riak/kv/2.2.2/setup/installing/source/
> 
> I'm having problems compiling it from source. I tried to compile it on 
> Raspbian Jessie and then switched to Ubuntu Server 16.04. Both times with the 
> same result. It is not compiling and aborts at a certain point. I don't know 
> what causes the problem since it only says: `recipe for target 
> 'util/perf_count.o' failed`. Searching Google and the mailing list from basho 
> weren't successful. 
> 
> The version of `gcc` is `gcc (Raspbian 4.9.2-10) 4.9.2`. The version of 
> `Erlang` is  `Erlang R16B02_basho8 (erts-5.10.3)`
> 
> The commands I used are the following:
> 
> Installing Erlang:
> 
> wget 
> http://s3.amazonaws.com/downloads.basho.com/erlang/otp_src_R16B02basho10.tar.gz
> 
> tar zxvf otp_src_R16B02-basho10.tar.gz
> 
> cd OTP_R16B02_basho10
> ./otp_build autoconf
> ./configure && make && sudo make install
> 
> Installing Riak:
> 
> wget 
> http://s3.amazonaws.com/downloads.basho.com/riak/2.2/2.2.1/riak-2.2.1.tar.gz
> 
> tar zxvf riak-2.2.1.tar.gz
> 
> cd riak-2.2.1
> make locked-deps
> make rel
> 
> Any suggestions are welcome.
> 
> ## Output: ##
> 
> `./include/leveldb/atomics.h:155:15: note: 
> template argument deduction/substitution failed util/perf_count.cc:439:40:
> note: deduced conflicting types for parameter ‘ValueT’ 
> (‘unsigned int’ and ‘int’ add_and_fetch(ptr_32, 1);`
> 
> 
> `Makefile:190: recipe for target 'util/perf_count.o' failed
> make[1]: *** [util/perf_count.o] Error 1
> make[1]: *** Waiting for unfinished jobs
> make[1]: Leaving directory 
> '/home/pi/Riak/riak/deps/eleveldb/c_src/leveldb'
> ERROR: Command [compile] failed!
> Makefile:23: recipe for target 'compile' failed
> make: *** [compile] Error 1`
> 
> Von meinem Samsung Galaxy Smartphone gesendet.
> 
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: anti_entropy and solr taking up suspiciously large amounts of space

2017-03-22 Thread Matthew Von-Maszewski
Rohit,

Would you please run “riak-debug” on one server from the command line and send 
the tar.gz file it creates directly to me.  Do not copy the email list.

Notes on its usage are here:  
http://docs.basho.com/riak/kv/2.2.1/using/cluster-operations/inspecting-node/#riak-debug

The resulting debug package will give me and others at Basho a better picture 
of the problem.  The alternative is about twenty rounds of “what about this, 
oh, then what about that”.

Thanks,
Matthew



> On Mar 21, 2017, at 9:53 PM, Rohit Sanbhadti  
> wrote:
> 
> Matthew,
> 
> To clarify, this happens on all nodes in our cluster (10+ nodes) although the 
> exact size varies by 10s of GB. I’ve rolling restarted the cluster the last 
> time this happened (last week) with no significant change in size, although 
> the output from riak-admin aae-status and riak-admin search aae-status shows 
> empty after the restart.  
> 
> -- 
> Rohit S.
> 
> On 3/21/17, 5:25 PM, "Matthew Von-Maszewski"  wrote:
> 
>Rohit,
> 
>If you restart the node does the elevated anti_entropy size decline after 
> the restart?
> 
>Matthew
> 
> 
>> On Mar 21, 2017, at 8:00 PM, Rohit Sanbhadti  
>> wrote:
>> 
>> Running Riak 2.2.0 on Ubuntu 16.04.1, we’ve noticed that anti_entropy is 
>> taking up way too much space on all of our nodes. We use multi_backend with 
>> mostly bitcask backends (relevant part of config pasted below). Has anyone 
>> seen this before, or does anyone have an idea what might be causing this? 
>> 
>> storage_backend = multi
>> multi_backend.bitcask_1day.storage_backend = bitcask
>> multi_backend.bitcask_1day.bitcask.data_root = /var/lib/riak/bitcask_1day
>> multi_backend.bitcask_1day.bitcask.expiry = 1d
>> multi_backend.bitcask_1day.bitcask.expiry.grace_time = 1h
>> multi_backend.bitcask_2day.storage_backend = bitcask
>> multi_backend.bitcask_2day.bitcask.data_root = /var/lib/riak/bitcask_2day
>> multi_backend.bitcask_2day.bitcask.expiry = 2d
>> multi_backend.bitcask_2day.bitcask.expiry.grace_time = 1h
>> multi_backend.bitcask_4day.storage_backend = bitcask
>> multi_backend.bitcask_4day.bitcask.data_root = /var/lib/riak/bitcask_4day
>> multi_backend.bitcask_4day.bitcask.expiry = 4d
>> multi_backend.bitcask_4day.bitcask.expiry.grace_time = 1h
>> multi_backend.bitcask_8day.storage_backend = bitcask
>> multi_backend.bitcask_8day.bitcask.data_root = /var/lib/riak/bitcask_8day
>> multi_backend.bitcask_8day.bitcask.expiry = 8d
>> multi_backend.bitcask_8day.bitcask.expiry.grace_time = 1h
>> multi_backend.leveldb_mult.storage_backend = leveldb
>> multi_backend.leveldb_mult.leveldb.maximum_memory.percent = 30
>> multi_backend.leveldb_mult.leveldb.data_root = /var/lib/riak/leveldb_mult
>> multi_backend.bitcask_mult.storage_backend = bitcask
>> multi_backend.bitcask_mult.bitcask.data_root = /var/lib/riak/bitcask_mult
>> multi_backend.default = leveldb_mult
>> 
>> $ du -h -d 1 /var/lib/riak
>> 
>> 4.0K    /var/lib/riak/riak_kv_exchange_fsm
>> 52K     /var/lib/riak/generated.configs
>> 224K    /var/lib/riak/cluster_meta
>> 424K    /var/lib/riak/ring
>> 1.1M    /var/lib/riak/kv_vnode
>> 2.1M    /var/lib/riak/bitcask_1day
>> 3.6M    /var/lib/riak/bitcask_4day
>> 33M     /var/lib/riak/yz_temp
>> 1.1G    /var/lib/riak/bitcask_2day
>> 5.9G    /var/lib/riak/yz_anti_entropy
>> 20G     /var/lib/riak/yz
>> 24G     /var/lib/riak/leveldb_mult
>> 25G     /var/lib/riak/bitcask_mult
>> 27G     /var/lib/riak/bitcask_8day
>> 139G    /var/lib/riak/anti_entropy
>> 240G    /var/lib/riak
>> 
>> 
>> -- 
>> Rohit S.
>> 
>> ___
>> riak-users mailing list
>> riak-users@lists.basho.com
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> 
> 
> 

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: anti_entropy and solr taking up suspiciously large amounts of space

2017-03-21 Thread Matthew Von-Maszewski
Rohit,

If you restart the node does the elevated anti_entropy size decline after the 
restart?

Matthew


> On Mar 21, 2017, at 8:00 PM, Rohit Sanbhadti  
> wrote:
> 
> Running Riak 2.2.0 on Ubuntu 16.04.1, we’ve noticed that anti_entropy is 
> taking up way too much space on all of our nodes. We use multi_backend with 
> mostly bitcask backends (relevant part of config pasted below). Has anyone 
> seen this before, or does anyone have an idea what might be causing this? 
> 
> storage_backend = multi
> multi_backend.bitcask_1day.storage_backend = bitcask
> multi_backend.bitcask_1day.bitcask.data_root = /var/lib/riak/bitcask_1day
> multi_backend.bitcask_1day.bitcask.expiry = 1d
> multi_backend.bitcask_1day.bitcask.expiry.grace_time = 1h
> multi_backend.bitcask_2day.storage_backend = bitcask
> multi_backend.bitcask_2day.bitcask.data_root = /var/lib/riak/bitcask_2day
> multi_backend.bitcask_2day.bitcask.expiry = 2d
> multi_backend.bitcask_2day.bitcask.expiry.grace_time = 1h
> multi_backend.bitcask_4day.storage_backend = bitcask
> multi_backend.bitcask_4day.bitcask.data_root = /var/lib/riak/bitcask_4day
> multi_backend.bitcask_4day.bitcask.expiry = 4d
> multi_backend.bitcask_4day.bitcask.expiry.grace_time = 1h
> multi_backend.bitcask_8day.storage_backend = bitcask
> multi_backend.bitcask_8day.bitcask.data_root = /var/lib/riak/bitcask_8day
> multi_backend.bitcask_8day.bitcask.expiry = 8d
> multi_backend.bitcask_8day.bitcask.expiry.grace_time = 1h
> multi_backend.leveldb_mult.storage_backend = leveldb
> multi_backend.leveldb_mult.leveldb.maximum_memory.percent = 30
> multi_backend.leveldb_mult.leveldb.data_root = /var/lib/riak/leveldb_mult
> multi_backend.bitcask_mult.storage_backend = bitcask
> multi_backend.bitcask_mult.bitcask.data_root = /var/lib/riak/bitcask_mult
> multi_backend.default = leveldb_mult
> 
> $ du -h -d 1 /var/lib/riak
> 
> 4.0K    /var/lib/riak/riak_kv_exchange_fsm
> 52K     /var/lib/riak/generated.configs
> 224K    /var/lib/riak/cluster_meta
> 424K    /var/lib/riak/ring
> 1.1M    /var/lib/riak/kv_vnode
> 2.1M    /var/lib/riak/bitcask_1day
> 3.6M    /var/lib/riak/bitcask_4day
> 33M     /var/lib/riak/yz_temp
> 1.1G    /var/lib/riak/bitcask_2day
> 5.9G    /var/lib/riak/yz_anti_entropy
> 20G     /var/lib/riak/yz
> 24G     /var/lib/riak/leveldb_mult
> 25G     /var/lib/riak/bitcask_mult
> 27G     /var/lib/riak/bitcask_8day
> 139G    /var/lib/riak/anti_entropy
> 240G    /var/lib/riak
> 
> 
> -- 
> Rohit S.
> 
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: AAE Off

2017-02-28 Thread Matthew Von-Maszewski
Performance gains on write intensive applications.
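
For reference (a sketch; the same setting is discussed in other threads on this 
list): the switch lives in riak.conf as

anti_entropy = passive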

> On Feb 28, 2017, at 11:18 AM, al so  wrote:
> 
> Why would anyone disable AAE in riak 2.x?
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Truncated bit-cask files

2017-02-14 Thread Matthew Von-Maszewski
Arun,

You are running out of RAM for the leveldb AAE.  There are several ways to fix 
that:

- reduce memory allocated to bitcask
- more memory per server
- more servers of same memory
- reduce the ring size from 64 to 8, and rebuild data within the cluster from 
scratch
- lie to leveldb and give it a bigger-than-real memory setting in riak.conf:
leveldb.maximum_memory=8G


The key LOG lines are:

Options.total_leveldb_mem: 2,901,766,963   <-- this is the total memory assigned
                                               to ALL of leveldb, but only 20% of
                                               it goes to AAE vnodes

File cache size: 5833527    <-- the first vnode says, cool enough memory for me
Block cache size: 7930679   <-- ditto

  ... but as more vnodes start:

File cache size: 0          <-- things are just not going to work well
Block cache size: 0

There are no actual file system error messages in your LOG files.  That 
supports that the real problem is memory unhappiness.

Matthew


> On Feb 14, 2017, at 3:34 PM, Arun Rajagopalan  
> wrote:
> 
> Hi Matthew, Magnus
> 
> I have attached the log files for your review
> 
> Thanks
> Arun
> 
> 
> On Tue, Feb 14, 2017 at 11:55 AM, Matthew Von-Maszewski  <mailto:matth...@basho.com>> wrote:
> Arun,
> 
> The AAE code uses leveldb for its storage of anti-entropy data, no matter 
> which backend holds the user data.  Therefore the error below suggests 
> corruption within leveldb files (which is not impossible, but becoming really 
> rare except with bad hardware or full disks).
> 
> Before wiping out the AAE directory, you should copy the LOG file within it.  
> There are likely more useful error messages within that file ... maybe put 
> the file in drop box or zip attach to a reply for us to review.
> 
> Matthew
> 
>> On Feb 14, 2017, at 10:42 AM, Magnus Kessler > <mailto:mkess...@basho.com>> wrote:
>> 
>> On 14 February 2017 at 14:46, Arun Rajagopalan > <mailto:arun.v.rajagopa...@gmail.com>> wrote:
>> Hi Magnus
>> 
>> Riak crashes on startup when I have a truncated bitcask file.
>> 
>> It also crashes when the AAE files are bad too I think. Example below
>> 
>> 2017-02-13 21:18:30 =CRASH REPORT
>>   crasher:
>> initial call: riak_kv_index_hashtree:init/1
>> pid: <0.6037.0>
>> registered_name: []
>> exception exit: {{{badmatch,{error,{db_open,"Corruption: truncated 
>> record at end of file"}}},[{hashtree,new_segment_
>> store,2,[{file,"src/hashtree.erl"},{line,675}]},{hashtree,new,2,[{file,"src/hashtree.erl"},{line,246}]},{riak_kv_index_h
>> ashtree,do_new_tree,3,[{file,"src/riak_kv_index_hashtree.erl"},{line,610}]},{lists,foldl,3,[{file,"lists.erl"},{line,124
>> 8}]},{riak_kv_index_hashtree,init_trees,3,[{file,"src/riak_kv_index_hashtree.erl"},{line,474}]},{riak_kv_index_hashtree,
>> init,1,[{file,"src/riak_kv_index_hashtree.erl"},{line,268}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,304}]}
>> ,{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,239}]}]},[{gen_server,init_it,6,[{file,"gen_server.erl"},{line
>> ,328}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,239}]}]}
>> ancestors: [<0.715.0>,riak_core_vnode_sup,riak_core_sup,<0.160.0>]
>> messages: []
>> links: []
>> dictionary: []
>> trap_exit: false
>> status: running
>> heap_size: 1598
>> stack_size: 27
>> reductions: 889
>>   neighbours:
>> 
>> 
>> Regards
>> Arun
>> 
>> 
>> Hi Arun,
>> 
>> The crash log you provided shows that there is a corrupted file in the AAE 
>> (anti_entropy) backend. Entries in console.log should have more information 
>> about which partition is affected. Please post output from the affected node 
>> at around 2017-02-13T21:18:30. As this is AAE data, it is safe to remove the 
>> directory named after the affected partition from the active_entropy 
>> directory before restarting the node. You may find that there is more than 
>> one affected partition, the next of which will be encountered after the 
>> attempted restart only. If this is the case, simply identify the next 
>> partition in the same way and remove it, too, until the node starts up 
>> successfully again.
>> 
>> Is there a reason why the nodes aren't shut down in the regular way?
>> 
>> Kind Regards,
>> 
>> Magnus
>> 
>> 
>> 
>> -- 
>> Magnus Kessler
>> Client Services Engineer
>> Basho Technologies Limited
>> 
>> Registered Office - 8 Lincoln’s Inn Fields London WC2A 3BP Reg 07970431
>> ___
>> riak-users mailing list
>> riak-users@lists.basho.com
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> 
> 
> 

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Truncated bit-cask files

2017-02-14 Thread Matthew Von-Maszewski
Arun,

The AAE code uses leveldb for its storage of anti-entropy data, no matter which 
backend holds the user data.  Therefore the error below suggests corruption 
within leveldb files (which is not impossible, but becoming really rare except 
with bad hardware or full disks).

Before wiping out the AAE directory, you should copy the LOG file within it.  
There are likely more useful error messages within that file ... maybe put the 
file in drop box or zip attach to a reply for us to review.

Matthew

> On Feb 14, 2017, at 10:42 AM, Magnus Kessler  wrote:
> 
> On 14 February 2017 at 14:46, Arun Rajagopalan  > wrote:
> Hi Magnus
> 
> Riak crashes on startup when I have a truncated bitcask file.
> 
> It also crashes when the AAE files are bad too I think. Example below
> 
> 2017-02-13 21:18:30 =CRASH REPORT
>   crasher:
> initial call: riak_kv_index_hashtree:init/1
> pid: <0.6037.0>
> registered_name: []
> exception exit: {{{badmatch,{error,{db_open,"Corruption: truncated record 
> at end of file"}}},[{hashtree,new_segment_
> store,2,[{file,"src/hashtree.erl"},{line,675}]},{hashtree,new,2,[{file,"src/hashtree.erl"},{line,246}]},{riak_kv_index_h
> ashtree,do_new_tree,3,[{file,"src/riak_kv_index_hashtree.erl"},{line,610}]},{lists,foldl,3,[{file,"lists.erl"},{line,124
> 8}]},{riak_kv_index_hashtree,init_trees,3,[{file,"src/riak_kv_index_hashtree.erl"},{line,474}]},{riak_kv_index_hashtree,
> init,1,[{file,"src/riak_kv_index_hashtree.erl"},{line,268}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,304}]}
> ,{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,239}]}]},[{gen_server,init_it,6,[{file,"gen_server.erl"},{line
> ,328}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,239}]}]}
> ancestors: [<0.715.0>,riak_core_vnode_sup,riak_core_sup,<0.160.0>]
> messages: []
> links: []
> dictionary: []
> trap_exit: false
> status: running
> heap_size: 1598
> stack_size: 27
> reductions: 889
>   neighbours:
> 
> 
> Regards
> Arun
> 
> 
> Hi Arun,
> 
> The crash log you provided shows that there is a corrupted file in the AAE 
> (anti_entropy) backend. Entries in console.log should have more information 
> about which partition is affected. Please post output from the affected node 
> at around 2017-02-13T21:18:30. As this is AAE data, it is safe to remove the 
> directory named after the affected partition from the active_entropy 
> directory before restarting the node. You may find that there is more than 
> one affected partition, the next of which will be encountered after the 
> attempted restart only. If this is the case, simply identify the next 
> partition in the same way and remove it, too, until the node starts up 
> successfully again.
> 
> Is there a reason why the nodes aren't shut down in the regular way?
> 
> Kind Regards,
> 
> Magnus
> 
> 
> 
> -- 
> Magnus Kessler
> Client Services Engineer
> Basho Technologies Limited
> 
> Registered Office - 8 Lincoln’s Inn Fields London WC2A 3BP Reg 07970431
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Reg:Continuous Periodic crashes after long operation

2017-01-26 Thread Matthew Von-Maszewski
FYI:  this is the function that is crashing:

get_uint32_measurement(Request, #internal{os_type = {unix, linux}}) ->
    {ok,F} = file:open("/proc/loadavg",[read,raw]),   %% <--- crash line
    {ok,D} = file:read(F,24),
    ok = file:close(F),
    {ok,[Load1,Load5,Load15,_PRun,PTotal],_} = io_lib:fread("~f ~f ~f ~d/~d", D),
    case Request of
        ?avg1  -> sunify(Load1);
        ?avg5  -> sunify(Load5);
        ?avg15 -> sunify(Load15);
        ?ping -> 4711;
        ?nprocs -> PTotal
    end;

Is there something unique about that open?

Matthew

> On Jan 26, 2017, at 10:37 AM, Luke Bakken  wrote:
> 
> Steven,
> 
> You may be able to get information via the lsof command as to what
> process(es) are using many file handles (if that is the cause).
> 
> I searched for that particular error and found this GH issue:
> https://github.com/emqtt/emqttd/issues/426
> 
> Which directed me to this page:
> https://github.com/emqtt/emqttd/wiki/linux-kernel-tuning
> 
> Basho also has a set of recommended tuning parameters:
> http://docs.basho.com/riak/kv/2.2.0/using/performance/
> 
> Do you have other error entries in any of Riak's logs at around the
> same time as these messages? Particularly crash.log.
> 
> --
> Luke Bakken
> Engineer
> lbak...@basho.com
> 
> On Thu, Jan 26, 2017 at 4:42 AM, Steven Joseph  wrote:
>> Hi Shaun,
>> 
>> I have already set this to a very high value
>> 
>> (r...@hawk1.streethawk.com)1> os:cmd("ulimit -n").
>> "2500\n"
>> (r...@hawk1.streethawk.com)2>
>> 
>> 
>> So the issue is not that the limit is low, but maybe a resource leak ? As I
>> mentioned our application processes continuously run queries on the cluster.
>> 
>> Kind Regards
>> 
>> Steven
> 
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: memory usage of Riak.

2017-01-26 Thread Matthew Von-Maszewski
Alex,

Which backend are you using?  Leveldb's memory usage does not show up within 
Erlang.  Maybe that is what you are experiencing?
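
One quick way to check (a sketch; adjust paths and options to your environment) is 
to compare the Erlang-side counters with the OS view of the beam.smp process; the 
gap is largely memory allocated by leveldb's C++ code outside the Erlang VM:

riak-admin status | grep -E 'mem_|memory_'     # Erlang-side accounting only
top -b -n 1 -p $(pgrep -f beam.smp)            # OS view: RSS includes leveldb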

Matthew

Sent from my iPad

> On Jan 26, 2017, at 5:47 AM, Alex Feng  wrote:
> 
> Hi Riak Users,
> 
> One of my riak nodes has 4G of memory. When I check the memory usage with 
> "free -m", I can see there are only around 150M left.  Then I check the 
> command "riak-admin status", and it shows around 415M (415594432) consumed by 
> Erlang.  But the "top" command shows Erlang taking 52.1% of memory, which is 
> about 2G.
> So my question is: where is the other ~1.6G (2G - 415M) of memory?
> 
> 
> dropped_vnode_requests_total : 0
> executing_mappers : 0
> gossip_received : 6
> handoff_timeouts : 0
> ignored_gossip_total : 0
> index_fsm_active : 0
> index_fsm_create : 0
> index_fsm_create_error : 0
> late_put_fsm_coordinator_ack : 0
> leveldb_read_block_error : 0
> list_fsm_active : 0
> list_fsm_create : 0
> list_fsm_create_error : 0
> list_fsm_create_error_total : 0
> list_fsm_create_total : 0
> map_actor_counts_100 : 0
> map_actor_counts_95 : 0
> map_actor_counts_99 : 0
> map_actor_counts_mean : 0
> map_actor_counts_median : 0
> mem_allocated : 3859660800
> mem_total : 3975495680
> memory_atom : 703377
> memory_atom_used : 677686
> memory_binary : 92813136
> memory_code : 16511098
> memory_ets : 6823184
> memory_processes : 287157280
> memory_processes_used : 287149824
> memory_system : 128437152
> memory_total : 415594432
> 
> 
> 
> KiB Swap:  4063228 total,   106496 used,  3956732 free.   296560 cached Mem
> 
>   PID USER  PR  NI  VIRT  RES  SHR  S  %CPU %MEM  TIME+  COMMAND
>   
> 1 riak  20   0 4844692 1.929g   1772 S   5.3 52.1   4539:13 beam.smp  
>
> 23510 riak  20   0 10.152g 1.131g  32292 S   4.3 30.6   2:07.28 java  
>
>72 riak  20   07492 92  0 S   0.0  0.0   0:15.22 epmd  
>
>   190 riak  20   04440224172 S   0.0  0.0   0:00.67 sh
>
>   191 riak  20   04324252192 S   0.0  0.0   0:08.97 memsup
>
>   193 riak  20   04324  0  0 S   0.0  0.0   0:00.00 cpu_sup   
>
>   285 riak  20   07460 28  0 S   0.0  0.0   0:29.51 
> inet_gethost 
>   286 riak  20   0   13780316236 S   0.0  0.0   0:55.30 
> inet_gethost 
>   287 riak  20   0   13780  8  0 S   0.0  0.0   0:00.00 
> inet_gethost 
> 22534 root  20   0   22320   1248756 S   0.0  0.0   0:00.07 bash  
>
> 24714 root  20   0   23992   1456   1052 R   0.0  0.0   0:00.00 top   
> 
> 
> Br,
> Alex
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Crash Log: yz_anti_entropy

2017-01-19 Thread Matthew Von-Maszewski
Damion,

I will explain what happened.  

ring_size = 8:  The default ring_size is 64.  It is based on the recommendation 
of five servers for a minimum cluster.  You stated you are using only one 
machine.  64 divided by 5 is 12.8 vnodes per server ... and ring size needs to 
be a power of 2.  So next smaller power of 2 from 12 is 8.

leveldb.limited_developer_mem = on:  leveldb allocates a certain memory buffer 
size per vnode (ring_size).  This setting reduces the size of those buffers by 
10x.  

The two settings squash the memory requirements to give you a better 
opportunity for happiness with both search and riak on a single server.

Matthew



> On Jan 19, 2017, at 9:15 AM, Junk, Damion A  wrote:
> 
> Matthew -
> 
> That did it! 
> 
> Actually, I tried with both settings, and also with just the ring_size 
> change. 
> 
> Setting ring_size to 8 got rid of crashing.  I'll have to do a bit more 
> reading on this setting I suppose. I have a much more memory-constrained 
> virtual machine running on my local desktop running with just the default 
> install settings and no crashing. 
> 
> Thanks!
> 
> Damion
> 
>> On Jan 19, 2017, at 7:57 AM, Matthew Von-Maszewski > <mailto:matth...@basho.com>> wrote:
>> 
>> Damion,
>> 
>> Add the following settings within riak.conf:
>> 
>> leveldb.limited_developer_mem = on
>> ring_size = 8
>> 
>> Erase all data / vnodes and start over.
>> 
>> Matthew
>> 
>> 
>>> On Jan 19, 2017, at 8:51 AM, Junk, Damion A >> <mailto:jun...@purdue.edu>> wrote:
>>> 
>>> Hi Magnus -
>>> 
>>> I've tried a wide range of parameters for leveldb.maximum_memory_percent 
>>> ranging from 5 to 70. I also tried the leveldb.maximum_memory setting in 
>>> bytes, ranging from 500MB to 4GB. I get the same results in the 
>>> crash/console log no matter what the settings. But the log messages seem to 
>>> indicate an issue with yokozuna, and not leveldb itself from what I can 
>>> tell.
>>> 
>>> I set the max (-Xmx) to 2G for SOLR as well.
>>> 
>>> From the log messages, it looks like it's not actually the KV leveldb 
>>> system that's crashing, but the yokozuna system. I'm not sure how to 
>>> control or set memory here:
>>> 
>>>> {badmatch,{error,{db_open,"IO error: lock 
>>>> /var/lib/riak/yz_anti_entropy/639406966332270026714112114313373821099470487552/LOCK:
>>>>  Cannot allocate memory"}
>>> 
>>> This is a development node, running as a single (non-clustered) riak node. 
>>> It has 14G memory, and at the time of trying changes with Riak, 9GB were 
>>> free. 
>>> 
>>> 
>>> To Recap:
>>> 
>>> There are no keys/values in the database at all. 
>>> The only default settings I changed were:
>>> 
>>> storage_backend = leveldb
>>> search = on
>>> 
>>> and when that didn't work, I started changing:
>>> 
>>> search.solr.jvm_options = -d64 -Xms1g -Xmx2g -XX:+UseStringCache 
>>> -XX:+UseCompressedOops
>>> leveldb.maximum_memory_percent = 5 .. 70 
>>> 
>>> and then when nothing seemed to change:
>>> 
>>> leveldb.maximum_memory =  100 ... 40
>>> 
>>> 
>>> Thanks for any assistance!
>>> 
>>> 
>>> Damion
>>> 
>>> 
>>>> On Jan 19, 2017, at 3:33 AM, Magnus Kessler >>> <mailto:mkess...@basho.com>> wrote:
>>>> 
>>>> Hi Damion,
>>>> 
>>>> Let me first state that AAE always uses leveldb, regardless of the storage 
>>>> backend chosen for Riak KV data. Could you please state how much physical 
>>>> memory your Riak nodes have, and what you have configured for 
>>>> "leveldb.maximum_memory.percent" in "riak.conf"? Have you changed the 
>>>> settings for "search.solr.jvm_options", in particular the memory allocated 
>>>> to Solr?
>>>> 
>>>> As a general rule, leveldb should have at least 350MB of memory available 
>>>> per partition, and performance has been shown to increase with up to 2GB 
>>>> (2.5 GB when also using Search and AAE) per partition. Please check that 
>>>> you have enough memory available in your system.
>>>> 
>>>> Kind Regards,
>>>> 
>>>> Magnus
>>>>  
>>>> -- 
>>>> Magnus Kessler
>>>> Client Services Engineer
>>>> Basho Technologies Limited
>>>> 
>>>> Registered Office - 8 Lincoln’s Inn Fields London WC2A 3BP Reg 07970431
>>> 
>>> ___
>>> riak-users mailing list
>>> riak-users@lists.basho.com
>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>> 
> 

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Crash Log: yz_anti_entropy

2017-01-19 Thread Matthew Von-Maszewski
Damion,

Add the following settings within riak.conf:

leveldb.limited_developer_mem = on
ring_size = 8

Erase all data / vnodes and start over.

Matthew


> On Jan 19, 2017, at 8:51 AM, Junk, Damion A  wrote:
> 
> Hi Magnus -
> 
> I've tried a wide range of parameters for leveldb.maximum_memory_percent 
> ranging from 5 to 70. I also tried the leveldb.maximum_memory setting in 
> bytes, ranging from 500MB to 4GB. I get the same results in the crash/console 
> log no matter what the settings. But the log messages seem to indicate an 
> issue with yokozuna, and not leveldb itself from what I can tell.
> 
> I set the max (-Xmx) to 2G for SOLR as well.
> 
> From the log messages, it looks like it's not actually the KV leveldb system 
> that's crashing, but the yokozuna system. I'm not sure how to control or set 
> memory here:
> 
>> {badmatch,{error,{db_open,"IO error: lock 
>> /var/lib/riak/yz_anti_entropy/639406966332270026714112114313373821099470487552/LOCK:
>>  Cannot allocate memory"}
> 
> This is a development node, running as a single (non-clustered) riak node. 
> It has 14G memory, and at the time of trying changes with Riak, 9GB were 
> free. 
> 
> 
> To Recap:
> 
> There are no keys/values in the database at all. 
> The only default settings I changed were:
> 
> storage_backend = leveldb
> search = on
> 
> and when that didn't work, I started changing:
> 
> search.solr.jvm_options = -d64 -Xms1g -Xmx2g -XX:+UseStringCache 
> -XX:+UseCompressedOops
> leveldb.maximum_memory_percent = 5 .. 70 
> 
> and then when nothing seemed to change:
> 
> leveldb.maximum_memory =  100 ... 40
> 
> 
> Thanks for any assistance!
> 
> 
> Damion
> 
> 
>> On Jan 19, 2017, at 3:33 AM, Magnus Kessler > > wrote:
>> 
>> Hi Damion,
>> 
>> Let me first state that AAE always uses leveldb, regardless of the storage 
>> backend chosen for Riak KV data. Could you please state how much physical 
>> memory your Riak nodes have, and what you have configured for 
>> "leveldb.maximum_memory.percent" in "riak.conf"? Have you changed the 
>> settings for "search.solr.jvm_options", in particular the memory allocated 
>> to Solr?
>> 
>> As a general rule, leveldb should have at least 350MB of memory available 
>> per partition, and performance has been shown to increase with up to 2GB 
>> (2.5 GB when also using Search and AAE) per partition. Please check that you 
>> have enough memory available in your system.
>> 
>> Kind Regards,
>> 
>> Magnus
>>  
>> -- 
>> Magnus Kessler
>> Client Services Engineer
>> Basho Technologies Limited
>> 
>> Registered Office - 8 Lincoln’s Inn Fields London WC2A 3BP Reg 07970431
> 
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Reaping Tombstones

2016-12-30 Thread Matthew Von-Maszewski
The current release of global expiry does not support retroactively dating and 
removing objects.  Only newly written objects get expiry.

The code for retroactively dating objects exists and will be part of a future 
release.

Matthew


> On Dec 30, 2016, at 10:30 AM, Arun Rajagopalan  
> wrote:
> 
> Thanks Matthew & Luca
> 
> Re: global expiry - will that option retroactively remove objects? That is 
> remove objects that became "unneeded" before the option was set ?
> Same question w.r.t delete_mode
> 
> Re: Map / Reduce - I am not sure the delete would remove the tombstone unless 
> I set the delete_mode to immediate AND there are no copies on non-primary 
> nodes. Or am I mistaken ?
> 
> 
> On Fri, Dec 30, 2016 at 10:11 AM, Luca Favatella 
>  <mailto:luca.favate...@erlang-solutions.com>> wrote:
> On 30 December 2016 at 15:06, Matthew Von-Maszewski  <mailto:matth...@basho.com>> wrote:
> > Greetings,
> >
> > I am not able to answer your tombstone questions.  That question needs a
> > better expert.
> >
> > Just wanted to point out that Riak now has global expiry in both the leveldb
> > and bitcask backends.  That might be a quicker solution for your frequent
> > delete operations:
> >
> >   http://basho.com/products/riak-kv/global-object-expiration/ 
> >
> > Technical details for the leveldb expiry are found here:
> >   https://github.com/basho/leveldb/wiki/mv-expiry 
> >
> > Matthew
> >
> > On Dec 30, 2016, at 9:55 AM, Arun Rajagopalan  > <mailto:arun.v.rajagopa...@gmail.com>>
> > wrote:
> >
> > Hello Riakers
> >
> > I wonder if there is anyway to delete all tombstones that were left behind
> > when delete_mode was set to 'keep'
> 
> Hi Arun,
> 
> Brute-force rate-limited map-reduce custom application code looking
> for `X-Riak-Deleted` in object metadata and then deleting? Refs:
> * http://docs.basho.com/riak/kv/2.2.0/developing/app-guide/advanced-mapreduce/#map-phase
> * https://github.com/basho/basho_docs/blob/6b2f42a7243bfded737bd62eed12c490376c67e2/content/riak/kv/2.2.0/developing/app-guide/advanced-mapreduce.md#map-phase
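> 
> A rough sketch of such a map function (untested, for illustration only; the 
> function name is mine, not from the docs):
> 
> map_find_tombstones({error, notfound}, _KeyData, _Arg) ->
>     [];
> map_find_tombstones(Obj, _KeyData, _Arg) ->
>     MD = riak_object:get_metadata(Obj),
>     case dict:is_key(<<"X-Riak-Deleted">>, MD) of
>         true  -> [{riak_object:bucket(Obj), riak_object:key(Obj)}];
>         false -> []
>     end.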
> 
> Interested to hear if you come up with a better way though.
> 
> 
> > Also how do I estimate how much space the tombstones take ? We have a ton of
> > large scale and frequent delete operations and I suspect some space is used
> > by Tombstones. I would like to find out if it is significant enough to
> > warrant some cleanup.
> 
> I understand this is a separate question - I am not sure about the answer.
> 
> 
> > If it does, how do I remove them?
> 
> I believe I proposed an answer above.
> 
> 
> Regards
> Luca
> 

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Reaping Tombstones

2016-12-30 Thread Matthew Von-Maszewski
Greetings,

I am not able to answer your tombstone questions.  That question needs a better 
expert.

Just wanted to point out that Riak now has global expiry in both the leveldb 
and bitcask backends.  That might be a quicker solution for your frequent 
delete operations:

  http://basho.com/products/riak-kv/global-object-expiration/

Technical details for the leveldb expiry are found here:
  https://github.com/basho/leveldb/wiki/mv-expiry 


Matthew

> On Dec 30, 2016, at 9:55 AM, Arun Rajagopalan  
> wrote:
> 
> Hello Riakers
> 
> I wonder if there is anyway to delete all tombstones that were left behind 
> when delete_mode was set to 'keep'
> 
> Also how do I estimate how much space the tombstones take ? We have a ton of 
> large scale and frequent delete operations and I suspect some space is used 
> by Tombstones. I would like to find out if it is significant enough to 
> warrant some cleanup. If it does, how do I remove them?
> 
> Thanks and wish you all Happy, Peaceful and Prosperous  2017 !
> 
> 
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: RiakTS TTL

2016-11-01 Thread Matthew Von-Maszewski
1.  The global expiry module is an external C++ module that is open source.  
There is no definition at this time for an Erlang callback, but the design 
supports it.  You can patch the open source code now.

2.  The TTL has two components:  when the record is written and number of 
minutes until expiry.  The write time goes into each record.  The number of 
minutes until expiry is loaded at start.  A change to the global minutes until 
expiry impacts the evaluation of all records.  A subsequent release will allow 
distinct TTL by table.
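
In other words (an illustrative sketch of the rule as described, not the actual 
module code): a record is treated as expired when
now() >= write_time + global_minutes_until_expiry, so changing the global setting 
changes the outcome of that test for every record already on disk.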

3.  Not my area of the code.

Matthew

> On Nov 1, 2016, at 3:46 PM, Joe Olson  wrote:
> 
> Two questions about the RiakTS TTL functionality (and its future direction):
> 
> 1. Is it possible to replace the standard delete upon TTL expiry with a user 
> defined delete?
> 2. Can the current global setting for the TTL timeout be changed? Will that 
> affect new records going forward?
> 
> Bonus question:
> 3. Are there any plans to implement general triggering in RiakTS?
> 
> 
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Seeking Backup solution for live nodes

2016-09-26 Thread Matthew Von-Maszewski
Neither.

The leveldb instance creates a snapshot of the current files and generates a 
working MANIFEST to go with them.  That means the snapshot is in “ready to run” 
condition.  This is based upon hard links for the .sst table files.

The user can then choose to copy that snapshot elsewhere, point a backup tool 
at it, and/or just leave it there for the five snapshot rotation.

One line cron job can kick off the snapshot.

Similar to using MySQL hot backup.  You have the system generate the backup 
data, then you decide how to manage / store it.

Matthew


> On Sep 26, 2016, at 10:07 AM, DeadZen  wrote:
> 
> there a backup tool that uses this yet? or is this meant more to be used with 
> snapshots provided through xfs/zfs?
> 
> On Monday, September 26, 2016, Matthew Von-Maszewski  <mailto:matth...@basho.com>> wrote:
> Here are notes on the new hot backup:
> 
> https://github.com/basho/leveldb/wiki/mv-hot-backup 
> 
> This sound like what you need?
> 
> Matthew
> 
> Sent from my iPad
> 
> > On Sep 26, 2016, at 5:39 AM, Niels Christian Sorensen 
> > > wrote:
> >
> > Hi,
> >
> > We use Riak-kv Enterprise Edition as base for Riak CS to store files in. 
> > Each customer has a separate bucket in the cluster(s) and all data is 
> > stored multi site in 3 copies. Thus the "i lost a node" situation is fully 
> > covered.
> >
> > I need however, a solution for providing customers with a "single instance" 
> > backup of their data.
> >
> > I am aware of the possibility of tar, cp, scp, what-ever-copy of the data - 
> > but this requires me to take the system off-line according to this:
> > https://docs.basho.com/riak/kv/2.1.4/using/cluster-operations/backing-up/ 
> >
> > Also I would still have to restore all customers data (and all involved 
> > nodes) - this is not trivial!
> >
> > Also I realize that the "riak-admin backup" is deprecated and should be 
> > avoided - It seemed like an easy solution to my problem but
> >
> > The s3cmd will allow me to pull out the data and I could most likely write 
> > a fantastic automatic script based system to use this - I do not have the 
> > time for that and am therefore looking for a commercial or "pre-build - 
> > adjustable" solution that will allow me to pull out a single copy of all 
> > data stored in a bucket and keep it elsewhere.
> >
> > Any ideas / solutions / quotes (as external consultant) on a solution to 
> > this problem?
> >
> > A plan for recovery is obviously also needed ;-)
> >
> > Thanks in advance
> >
> > /Christian
> >
> > ___
> > riak-users mailing list
> > riak-users@lists.basho.com 
> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com 
> 
> ___
> riak-users mailing list
> riak-users@lists.basho.com 
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com 

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Seeking Backup solution for live nodes

2016-09-26 Thread Matthew Von-Maszewski
Here are notes on the new hot backup:

https://github.com/basho/leveldb/wiki/mv-hot-backup

This sound like what you need?

Matthew

Sent from my iPad

> On Sep 26, 2016, at 5:39 AM, Niels Christian Sorensen 
>  wrote:
> 
> Hi,
> 
> We use Riak-kv Enterprise Edition as base for Riak CS to store files in. Each 
> customer has a separate bucket in the cluster(s) and all data is stored multi 
> site in 3 copies. Thus the "i lost a node" situation is fully covered.
> 
> I need however, a solution for providing customers with a "single instance" 
> backup of their data.
> 
> I am aware of the possibility of tar, cp, scp, what-ever-copy of the data - 
> but this requires me to take the system off-line according to this:
> https://docs.basho.com/riak/kv/2.1.4/using/cluster-operations/backing-up/
> 
> Also I would still have to restore all customers data (and all involved 
> nodes) - this is not trivial!
> 
> Also I realize that the "riak-admin backup" is deprecated and should be 
> avoided - It seemed like an easy solution to my problem but
> 
> The s3cmd will allow me to pull out the data and I could most likely write a 
> fantastic automatic script based system to use this - I do not have the time 
> for that and am therefore looking for a commercial or "pre-built - 
> adjustable" solution that will allow me to pull out a single copy of all data 
> stored in a bucket and keep it elsewhere.
> 
> Any ideas / solutions / quotes (as external consultant) on a solution to this 
> problem?
> 
> A plan for recovery is obviously also needed ;-)
> 
> Thanks in advance
> 
> /Christian
> 
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: speeding up bulk loading

2016-08-31 Thread Matthew Von-Maszewski
Travis,

I cannot address the crash error you see in the logs.  Someone else will have 
to review that problem.

I want to point you to this wiki article:  
https://github.com/basho/leveldb/wiki/riak-tuning-2

The article details how you can potentially increase Riak's throughput by 
restricting the number of schedulers (CPUs) assigned solely to Erlang.  Heavy 
bulk loading scenarios demonstrated 20 to 30% throughput gain.

Per the wiki article, you would create an (or add to your existing) 
advanced.config with the following based upon your 16 core machines:

[
{vm_args, [{"+S", "14:14"}]}
].
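
One way to sanity-check the setting after a restart -- just a sketch, assuming
you have shell access to run "riak attach" on a node (the node name in the
prompt below is illustrative):

riak attach
(riak@node)1> erlang:system_info(schedulers_online).
14
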
Matthew
> On Aug 31, 2016, at 11:41 AM, Travis Kirstine 
>  wrote:
> 
> Magnus
>  
> Thanks for your reply.  We are using the riack C client library for riak 
> (https://github.com/trifork/riack ) which 
> is used within an application called MapCache to store 256x256 px images with 
> a corresponding key within riak.  Currently we have 75 million images to 
> transfer from disk into riak which is being done concurrently.  Periodically 
> this transfer process will crash
>  
> Riak is setup using n=3 on 5 nodes with a leveldb backend.  Each server has 
> 45GB of memory and 16 cores with  standard hard drives.  We made no 
> significant modification to the riak.conf except upping the 
> leveldb.maximum_memory.percent to 70 and tweaking the sysctl.conf as follows
>  
> vm.swappiness = 0
> net.ipv4.tcp_max_syn_backlog = 4
> net.core.somaxconn = 4
> net.core.wmem_default = 8388608
> net.core.rmem_default = 8388608
> net.ipv4.tcp_sack = 1
> net.ipv4.tcp_window_scaling = 1
> net.ipv4.tcp_fin_timeout = 15
> net.ipv4.tcp_keepalive_intvl = 30
> net.ipv4.tcp_tw_reuse = 1
> net.ipv4.tcp_moderate_rcvbuf = 1
> # Increase the open file limit
> # fs.file-max = 65536 # current setting
>  
> I have seen this error in the logs
> 2016-08-30 22:26:07.180 [error] <0.20777.512> CRASH REPORT Process 
> <0.20777.512> with 0 neighbours crashed with reason: no function clause 
> matching webmachine_request:peer_from_peername({error,enotconn}, 
> {webmachine_request,{wm_reqstate,#Port<0.2817336>,[],undefined,undefined,undefined,{wm_reqdata,'GET',...},...}})
>  line 150
>  
> Regards
>  
> From: Magnus Kessler [mailto:mkess...@basho.com ] 
> Sent: August-31-16 4:08 AM
> To: Travis Kirstine  >
> Cc: riak-users@lists.basho.com 
> Subject: Re: speeding up bulk loading
>  
> On 26 August 2016 at 22:20, Travis Kirstine  > wrote:
> Is there any way to speed up bulk loading?  I'm wondering if I should be 
> tweaking the erlang, aae or other config options?
>  
>  
>  
> Hi Travis,
>  
> Excuse the late reply; your message had been stuck in the moderation queue. 
> Please consider subscribing to this list.
>  
> Without knowing more about how you perform bulk uploads, it's difficult to 
> recommend any changes. Are you using the HTTP REST API or one of the client 
> libraries, which use protocol buffers by default? What concerns do you have 
> about the upload performance? Please let us know a bit more about your setup.
>  
> Kind Regards,
>  
> Magnus
>  
>  
> -- 
> Magnus Kessler
> Client Services Engineer
> Basho Technologies Limited
>  
> Registered Office - 8 Lincoln’s Inn Fields London WC2A 3BP Reg 07970431
> ___
> riak-users mailing list
> riak-users@lists.basho.com 
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com 
> 
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Riak Time-Series and time to live, ttl

2016-08-09 Thread Matthew Von-Maszewski
Anatoly,

The first time series TTL feature is due in Riak TS 1.4.  That release is 
looking good for late August.

The 1.4 release establishes a global TTL (also known as expiry policy) for all 
data.  A future release will allow individual policies per Riak TS table (yeah, 
I am coding on that now).

You can find some technical details about the feature here:

https://github.com/basho/leveldb/wiki/mv-expiry 


This feature will also arrive soon in our Riak KV 2.2.x product line.
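
As a rough preview -- names may still change before the release ships, so
treat this strictly as a sketch rather than final syntax -- the global policy
is expected to be driven by riak.conf entries along these lines:

leveldb.expiration = on
leveldb.expiration.retention_time = 30d

The mv-expiry page above remains the authoritative description.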

Matthew


> On Aug 9, 2016, at 12:06 AM, Anatoly Smolyaninov  
> wrote:
> 
> Hello! 
> 
> Decided to take a look at Riak TS for our monitoring system.
> I went through the docs and wonder if it is possible to set TTL for metrics? For 
> example, in KairosDB with Cassandra, I could set TTL, which would cause 
> metric to be marked with a “tombstone” after this time passed and deleted in 
> the next clean-up tier.
> 
> Does Riak TS have similar feature for auto clean-up old time-series data? 
> 
> Thanks.
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Recovering Riak data if it can no longer load in memory

2016-07-12 Thread Matthew Von-Maszewski
You can further reduce memory used by leveldb with the following setting in 
riak.conf:

leveldb.threads = 5

The value "5" needs to be a prime number.  The system defaults to 71.  Many 
Linux implementations will allocate 8Mbytes per thread for stack.  So bunches 
of threads lead to bunches of memory reserved for stack.  That is fine on 
servers with higher memory.  But probably part of your problem on a small 
memory machine.
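
For a rough sense of scale: at 8Mbytes of stack per thread, the default of 71
threads can reserve roughly 71 x 8 = 568Mbytes just for stacks, while 5 threads
reserve only about 40Mbytes.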

The thread count is high to promote parallelism across vnodes on the same 
server, especially with "entropy = active".  So again, this setting is 
sacrificing performance to save memory.

Matthew

P.S.  You really want 8 CPU cores, 4 as a dirt minimum.  And review this for 
more cpu performance info:

https://github.com/basho/leveldb/wiki/riak-tuning-2



> On Jul 12, 2016, at 4:04 PM, Vikram Lalit  wrote:
> 
> Thanks much Matthew. Yes the server is low-memory given only development 
> right now - I'm using an AWS micro instance, so 1 GB RAM and 1 vCPU.
> 
> Thanks for the tip - let me try move the manifest file to a larger instance 
> and see how that works. More than reducing the memory footprint in dev, my 
> concern was more around reacting to a possible production scenario where the 
> db stops responding due to memory overload. Understood now that moving to a 
> larger instance should be possible. Thanks again.
> 
> On Tue, Jul 12, 2016 at 12:26 PM, Matthew Von-Maszewski  <mailto:matth...@basho.com>> wrote:
> It would be helpful if you described the physical characteristics of the 
> servers:  memory size, logical cpu count, etc.
> 
> Google created leveldb to be highly reliable in the face of crashes.  If it 
> is not restarting, that suggests to me that you have a low memory condition 
> that is not able to load leveldb's MANIFEST file.  That is easily fixed by 
> moving the dataset to a machine with larger memory.
> 
> There is also a special flag to reduce Riak's leveldb memory foot print 
> during development work.  The setting reduces the leveldb performance, but 
> lets you run with less memory.
> 
> In riak.conf, set:
> 
> leveldb.limited_developer_mem = true
> 
> Matthew
> 
> 
> > On Jul 12, 2016, at 11:56 AM, Vikram Lalit  > <mailto:vikramla...@gmail.com>> wrote:
> >
> > Hi - I've been testing a Riak cluster (of 3 nodes) with an ejabberd 
> > messaging cluster in front of it that writes data to the Riak nodes. Whilst 
> > load testing the platform (by creating 0.5 million ejabberd users via 
> > Tsung), I found that the Riak nodes suddenly crashed. My question is how do 
> > we recover from such a situation if it were to occur in production?
> >
> > To provide further context / details, the leveldb log files storing the 
> > data suddenly became too huge, thus making the AWS Riak instances not able 
> > to load them in memory anymore. So we get a core dump if 'riak start' is 
> > fired on those instances. I had an n_val = 2, and all 3 nodes went down 
> > almost simultaneously, so in such a scenario, we cannot even rely on a 2nd 
> > copy of the data. One way to of course prevent it in the first place would 
> > be to use auto-scaling, but I'm wondering if there is an ex post facto / 
> > post-the-event recovery that can be performed in such a scenario? Is it possible 
> > to simply copy the leveldb data to a larger memory instance, or to curtail 
> > the data further to allow loading in the same instance?
> >
> > Appreciate if you can provide inputs - a tad concerned as to how we could 
> > recover from such a situation if it were to happen in production (apart 
> > from leveraging auto-scaling as a preventive measure).
> >
> > Thanks!
> >
> > ___
> > riak-users mailing list
> > riak-users@lists.basho.com <mailto:riak-users@lists.basho.com>
> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com 
> > <http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com>
> 
> 

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Recovering Riak data if it can no longer load in memory

2016-07-12 Thread Matthew Von-Maszewski
It would be helpful if you described the physical characteristics of the 
servers:  memory size, logical cpu count, etc.

Google created leveldb to be highly reliable in the face of crashes.  If it is 
not restarting, that suggests to me that you have a low memory condition that 
is not able to load leveldb's MANIFEST file.  That is easily fixed by moving 
the dataset to a machine with larger memory.

There is also a special flag to reduce Riak's leveldb memory foot print during 
development work.  The setting reduces the leveldb performance, but lets you 
run with less memory.

In riak.conf, set:

leveldb.limited_developer_mem = true

Matthew


> On Jul 12, 2016, at 11:56 AM, Vikram Lalit  wrote:
> 
> Hi - I've been testing a Riak cluster (of 3 nodes) with an ejabberd messaging 
> cluster in front of it that writes data to the Riak nodes. Whilst load 
> testing the platform (by creating 0.5 million ejabberd users via Tsung), I 
> found that the Riak nodes suddenly crashed. My question is how do we recover 
> from such a situation if it were to occur in production?
> 
> To provide further context / details, the leveldb log files storing the data 
> suddenly became too huge, thus making the AWS Riak instances not able to load 
> them in memory anymore. So we get a core dump if 'riak start' is fired on 
> those instances. I had an n_val = 2, and all 3 nodes went down almost 
> simultaneously, so in such a scenario, we cannot even rely on a 2nd copy of 
> the data. One way to of course prevent it in the first place would be to use 
> auto-scaling, but I'm wondering is there a ex post facto / post the event 
> recovery that can be performed in such a scenario? Is it possible to simply 
> copy the leveldb data to a larger memory instance, or to curtail the data 
> further to allow loading in the same instance?
> 
> Appreciate if you can provide inputs - a tad concerned as to how we could 
> recover from such a situation if it were to happen in production (apart from 
> leveraging auto-scaling as a preventive measure).
> 
> Thanks!
> 
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


[ANN] leveldb expiry beta1 available

2016-07-07 Thread Matthew Von-Maszewski
Basho is releasing a beta test version of a very rudimentary implementation for 
expiry within leveldb today.  This code is NOT for production use.  Download 
and usage notes [1] exist on Basho's leveldb wiki [2].

Please email all feedback either to this mailing list [3] or directly to 
matth...@basho.com .  The mailing list is preferred.

Technical details of the actual code changes are also available for anyone 
interested [4].  Riak users should also consider the recent performance 
configuration of the +S parameter [5] as part of their testing.

Thank you,
Matthew

[1] https://github.com/basho/leveldb/wiki/expiry_beta1 

[2] https://github.com/basho/leveldb/wiki 

[3] riak-users@lists.basho.com 
[4] https://github.com/basho/leveldb/wiki/mv-expiry 

[5] https://github.com/basho/leveldb/wiki/riak-tuning-2 



___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: How to increase Riak write performance for sequential alpha-numeric keys

2016-05-05 Thread Matthew Von-Maszewski
I failed to mention these links that give additional details concerning the 
timed grooming feature:

discussion of the timed grooming feature is here:

https://github.com/basho/leveldb/wiki/mv-timed-grooming

discussion of the timed grooming fix and disable is here:

https://github.com/basho/leveldb/wiki/mv-timed-grooming2


> On May 5, 2016, at 7:59 PM, Matthew Von-Maszewski  wrote:
> 
> Alex,
> 
> I am making a guess.  I would be better able to support this guess with data 
> from leveldb LOG files of one server.
> 
> The performance difference between the sequential keys and the reverse of the 
> sequential keys is informative.  The reverse ordered keys essentially become 
> more like "random keys" within the leveldb key space.  Overtime, they 
> disperse to different .sst table files scattering the "focal point" of the 
> read-before-write operations.  The sequential keys are keeping all newly 
> written data within the same .sst table file, i.e. the same "focal point".
> 
> And there is a new tuning called "timed grooming" in the Riak 2.1.3 and 2.1.4 
> releases that has a bug.  The bug can cause more frequent compaction cycles 
> than intended for 2.1.3 and 2.1.4 under some work loads.  Your particular 
> sequential keys might be such a load.  This is what the LOG files would 
> indicate.  The biggest impact of more frequent compaction cycles is that the 
> Linux page cache and leveldb block caches get invalidated more often causing 
> longer read cycles to determine "nothing is there" in the read-before-write 
> operation of Riak.
> 
> Your higher ring size is likely not helping, but there is math that can prove 
> or disprove this assumption.  And again, the base numbers are within the LOG 
> files.
> 
> We do have an eleveldb patch release available.  You would have to manually 
> install it on each of the nodes.  The patch disables the timed grooming and 
> contains some critical bug fixes that led to the Riak 2.1.4 release.  You 
> would need to tell me which operating system package you originally downloaded 
> for Riak, then I can send an appropriate link.
> 
> Matthew
> 
> 
>> On May 5, 2016, at 10:11 AM, alexc155  wrote:
>> 
>> Hi,
>> 
>> Thanks for your reply.
>> 
>> I don't think that write_once is going to work for us as we have to
>> periodically update the data (although if we remove the data before
>> re-inserting it, would that work?)
>> 
>> Why does read-before-write slow down new writes so much?
>> 
>> Some new information we've found - it seems that if we write the data and
>> then update it, we get fast speeds too. It's just the initial write of the
>> data that is slow.
>> 
>> So why is writing sequential keys so much slower than updating them or
>> writing non-sequential keys?
>> 
>> 
>> 
>> --
>> View this message in context: 
>> http://riak-users.197444.n3.nabble.com/How-to-increase-Riak-write-performance-for-sequential-alpha-numeric-keys-tp4034219p4034225.html
>> Sent from the Riak Users mailing list archive at Nabble.com.
>> 
>> ___
>> riak-users mailing list
>> riak-users@lists.basho.com
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> 


___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: How to increase Riak write performance for sequential alpha-numeric keys

2016-05-05 Thread Matthew Von-Maszewski
Alex,

I am making a guess.  I would be better able to support this guess with data 
from leveldb LOG files of one server.

The performance difference between the sequential keys and the reverse of the 
sequential keys is informative.  The reverse ordered keys essentially become 
more like "random keys" within the leveldb key space.  Overtime, they disperse 
to different .sst table files scattering the "focal point" of the 
read-before-write operations.  The sequential keys are keeping all newly 
written data within the same .sst table file, i.e. the same "focal point".

And there is a new tuning called "timed grooming" in the Riak 2.1.3 and 2.1.4 
releases that has a bug.  The bug can cause more frequent compaction cycles 
than intended for 2.1.3 and 2.1.4 under some work loads.  Your particular 
sequential keys might be such a load.  This is what the LOG files would 
indicate.  The biggest impact of more frequent compaction cycles is that the 
Linux page cache and leveldb block caches get invalidated more often causing 
longer read cycles to determine "nothing is there" in the read-before-write 
operation of Riak.

Your higher ring size is likely not helping, but there is math that can prove 
or disprove this assumption.  And again, the base numbers are within the LOG 
files.

We do have an eleveldb patch release available.  You would have to manually 
install it on each of the nodes.  The patch disables the timed grooming and 
contains some critical bug fixes that led to the Riak 2.1.4 release.  You 
would need to tell me which operating system package you originally downloaded 
for Riak, then I can send an appropriate link.

Matthew


> On May 5, 2016, at 10:11 AM, alexc155  wrote:
> 
> Hi,
> 
> Thanks for your reply.
> 
> I don't think that write_once is going to work for us as we have to
> periodically update the data (although if we remove the data before
> re-inserting it, would that work?)
> 
> Why does read-before-write slow down new writes so much?
> 
> Some new information we've found - it seems that if we write the data and
> then update it, we get fast speeds too. It's just the initial write of the
> data that is slow.
> 
> So why is writing sequential keys so much slower than updating them or
> writing non-sequential keys?
> 
> 
> 
> --
> View this message in context: 
> http://riak-users.197444.n3.nabble.com/How-to-increase-Riak-write-performance-for-sequential-alpha-numeric-keys-tp4034219p4034225.html
> Sent from the Riak Users mailing list archive at Nabble.com.
> 
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: How to increase Riak write performance for sequential alpha-numeric keys

2016-05-05 Thread Matthew Von-Maszewski
Alex,

This is why I was requesting you run "riak-debug" on one server.  I would like 
to look at the leveldb LOG files before guessing and/or offering other 
solutions.  I have a guess but would like to confirm it with supporting 
evidence from the LOG file.

Matthew


> On May 5, 2016, at 10:11 AM, alexc155  wrote:
> 
> Hi,
> 
> Thanks for your reply.
> 
> I don't think that write_once is going to work for us as we have to
> periodically update the data (although if we remove the data before
> re-inserting it, would that work?)
> 
> Why does read-before-write slow down new writes so much?
> 
> Some new information we've found - it seems that if we write the data and
> then update it, we get fast speeds too. It's just the initial write of the
> data that is slow.
> 
> So why is writing sequential keys so much slower than updating them or
> writing non-sequential keys?
> 
> 
> 
> --
> View this message in context: 
> http://riak-users.197444.n3.nabble.com/How-to-increase-Riak-write-performance-for-sequential-alpha-numeric-keys-tp4034219p4034225.html
> Sent from the Riak Users mailing list archive at Nabble.com.
> 
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: How to increase Riak write performance for sequential alpha-numeric keys

2016-05-05 Thread Matthew Von-Maszewski
Alex,

The successor to "last_write_wins" is the "write_once" bucket type.  You can 
read about its characteristics and limitations here:

   http://docs.basho.com/riak/kv/2.1.4/developing/app-guide/write-once/ 


This bucket type eliminates Riak's typical read-before-write operation.  
Your experience with better performance by reversing the keys suggests to me 
that this bucket type might be what you need.
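
If you decide to experiment with it, the setup is roughly the following (the
bucket-type name "w1" here is just an example):

riak-admin bucket-type create w1 '{"props":{"write_once":true}}'
riak-admin bucket-type activate w1

then write your keys under that bucket type.  Do note the restrictions around
updating objects that the page above describes.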

Also, I would be willing to review your general environment and particularly 
leveldb's actions.  I would need you to run "riak-debug" on one of the servers 
then post the tar file someplace private such as dropbox.  There might be other 
insights I can share based upon leveldb's actions and your physical server 
configuration.

Matthew



> On May 5, 2016, at 8:32 AM, alexc155  wrote:
> 
> We're using Riak as a simple key value store and we're having write 
> performance problems which we think is due to the format of our keys which we 
> can't easily change because they're tied into different parts of the business 
> and systems.
> 
> 
> We're not trying to do anything complicated with Riak: No solr, secondary 
> indexes or map reducing - just simple keys to strings of around 10Kb of JSON.
> 
> 
> We've got upwards of 3 billion records to store so we've opted for LevelDb as 
> the backend.
> 
> 
> It's a 3 node cluster running on 3 dedicated Ubuntu VMs each with 16 cpu 
> cores and 12GB memory backed by SSDs on a 10Gb network.
> 
> 
> Using basho bench we know that it's capable of speeds upwards of 5000 rows 
> per sec when using randomised keys, but the problem comes when we use our 
> actual data.
> 
> 
> The keys are formatted using the following pattern:
> 
> 
> USC~1930~1~1~001
> USC~1930~1~1~002
> USC~1930~1~1~003
> USC~1930~1~1~004
> 
> Most of the long key stays the same with numbers at the end going up. (The 
> "~" are changeable - we can set them to whatever character. They're just 
> delimiters in the keys)
> 
> 
> Using these keys, write performance is a tenth of the speed at 400 rows per 
> sec.
> 
> 
> We don't need to worry about different versions of the data so we've set the 
> following in our bucket type:
> 
> 
> "allow_mult": false
> "last_write_wins": true
> "DW": 0
> "n_val": 2
> "w": 1
> "r": 1
> "basic_quorum": false
> 
> On the riak servers we've set the ulimit to:
> 
> 
> riak soft nofile 32768
> riak hard nofile 65536
> 
> and other settings like this:
> 
> 
> ring_size = 128
> protobuf.backlog = 1024
> anti_entropy = passive
> 
> We're using the v2 .net client from basho to do the putting which runs in an 
> API on 3 machines.
> 
> 
> We've checked all the usual bottlenecks: CPU, memory, network IO, disk IO and 
> throttles on the Riak servers and windows API servers.
> 
> 
> As a kicker, if we reverse the keys e.g.
> 
> 
> 100~1~1~0391~CSU
> speed goes up to over 3000 rows, but that's a dirty kludge.
> 
> 
> Can anyone explain why Riak doesn't like sequential alpha-numeric keys and 
> what we can change to improve performance?
> 
> 
> Thanks!
> 
> 
> View this message in context: How to increase Riak write performance for 
> sequential alpha-numeric keys 
> 
> Sent from the Riak Users mailing list archive at Nabble.com.
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Very slow acquisition time (99 percentile) while fast median times

2016-05-04 Thread Matthew Von-Maszewski
Guillaume,

Two points:

1.  You can send the “riak debug” from one server and I will verify that 2.0.18 
is indicated in the LOG file.

2.  Your previous “riak debug” from server “riak1” indicated that only two CPU 
cores existed.  We performance test with eight, twelve, and twenty-four core 
servers, not two.  You have two heavy weight applications, Riak and Solr, 
competing for time on those two cores.  Actually, you have three applications 
due to leveldb’s background compaction operations.

One leveldb compaction is CPU intensive.  The compaction reads a block from the 
disk, computes a CRC32 check of the block, decompresses the block, merges the 
keys of this block with one or more blocks from other files, then compresses 
the new block, computes a new CRC32, and finally writes the block to disk.  And 
there can be multiple compactions running simultaneously.  All of your CPU time 
could be periodically lost to leveldb compactions.

There are some minor tunings we could do, like disabling compression in 
leveldb, that might help.  But I seriously doubt you are going to achieve your 
desired results with only two cores.  Adding a sixth server with two cores is 
not really going to help.
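
(For reference, the compression tuning I mentioned is a riak.conf setting along
the lines of "leveldb.compression = off" on each node, followed by a restart;
confirm the exact knob against your riak.conf before changing it.)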

Matthew


> On May 4, 2016, at 4:27 AM, Guillaume Boddaert 
>  wrote:
> 
> Thanks, I've installed the new library as stated in the documentation using 
> 2.0.18 files.
> 
> I was unable to find the vnode LOG file from the documentation, as my vnodes 
> looks like file, not directories. So I can't verify that I run the proper 
> version of the library after my riak restart.
> 
> Anyway, it has unfortunately no effect:
> http://www.awesomescreenshot.com/image/1219821/1b292613c051da86df5696034c114b14
>  
> 
> I think i'll try to add a 6th node that don't rely on network disks and see 
> what's going on.
> 
> G.
> 
> 
> On 03/05/2016 22:47, Matthew Von-Maszewski wrote:
>> Guillaume,
>> 
>> A prebuilt eleveldb 2.0.18 for Debian 7 is found here:
>>
>> https://s3.amazonaws.com/downloads.basho.com/patches/eleveldb/2.0.18/eleveldb_2.0.18_debian7.tgz
>> 
>> There are good instructions for applying an eleveldb patch here:
>> 
>> http://docs.basho.com/community/productadvisories/leveldbsegfault/#patch-eleveldb-so
>> 
>> Key points about the above web page:
>> 
>> - use the eleveldb patch file link in this email, NOT links on the web page
>> 
>> - the Debian directory listed on the web page will be slightly different 
>> than your Riak 2.1.4 installation:
>> 
>> /usr/lib/riak/lib/eleveldb-/priv/
>> 
>> 
>> Matthew
>> 
>> 
>>> On May 3, 2016, at 1:01 PM, Matthew Von-Maszewski >> <mailto:matth...@basho.com>> wrote:
>>> 
>>> Guillaume,
>>> 
>>> I have reviewed the debug package for your riak1 server.  There are two 
>>> potential areas of follow-up:
>>> 
>>> 1.  You are running our most recent Riak 2.1.4 which has eleveldb 2.0.17.  
>>> We have seen a case where a recent feature in eleveldb 2.0.17 caused too 
>>> much cache flushing, impacting leveldb’s performance.  A discussion is here:
>>> 
>>>   https://github.com/basho/leveldb/wiki/mv-timed-grooming2 
>>> 
>>> 2.  Yokozuna search was recently updated for some timeout problems.  Those 
>>> updates are not yet in a public build.  One of our other engineers is 
>>> likely to respond to you on that topic.
>>> 
>>> 
>>> An eleveldb 2.0.18 is tagged and available via github if you want to build 
>>> it yourself.  Otherwise, Basho may be releasing prebuilt patches of 
>>> eleveldb 2.0.18 in the near future.  But no date is currently set.
>>> 
>>> Matthew
>>> 
>>>> On May 3, 2016, at 10:50 AM, Luke Bakken >>> <mailto:lbak...@basho.com>> wrote:
>>>> 
>>>> Guillaume -
>>>> 
>>>> You said earlier "My data are stored on an openstack volume that
>>>> support up to 3000IOPS". There is a likelihood that your writ

Re: Very slow acquisition time (99 percentile) while fast median times

2016-05-03 Thread Matthew Von-Maszewski
Guillaume,

A prebuilt eleveldb 2.0.18 for Debian 7 is found here:
   
https://s3.amazonaws.com/downloads.basho.com/patches/eleveldb/2.0.18/eleveldb_2.0.18_debian7.tgz

There are good instructions for applying an eleveldb patch here:

 
http://docs.basho.com/community/productadvisories/leveldbsegfault/#patch-eleveldb-so

Key points about the above web page:

- use the eleveldb patch file link in this email, NOT links on the web page

- the Debian directory listed on the web page will be slightly different than 
your Riak 2.1.4 installation:

/usr/lib/riak/lib/eleveldb-/priv/
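
A condensed sketch of the procedure on one node -- paths and the eleveldb
version directory will vary with your install, and you should check the
tarball contents with "tar tzf" before unpacking:

riak stop
cd /usr/lib/riak/lib/eleveldb-*/priv
cp eleveldb.so eleveldb.so.orig
tar xzf /path/to/eleveldb_2.0.18_debian7.tgz
riak start

Repeat on each node in turn.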


Matthew


> On May 3, 2016, at 1:01 PM, Matthew Von-Maszewski  wrote:
> 
> Guillaume,
> 
> I have reviewed the debug package for your riak1 server.  There are two 
> potential areas of follow-up:
> 
> 1.  You are running our most recent Riak 2.1.4 which has eleveldb 2.0.17.  We 
> have seen a case where a recent feature in eleveldb 2.0.17 caused too much 
> cache flushing, impacting leveldb’s performance.  A discussion is here:
> 
>   https://github.com/basho/leveldb/wiki/mv-timed-grooming2 
> 
> 2.  Yokozuna search was recently updated for some timeout problems.  Those 
> updates are not yet in a public build.  One of our other engineers is likely 
> to respond to you on that topic.
> 
> 
> An eleveldb 2.0.18 is tagged and available via github if you want to build it 
> yourself.  Otherwise, Basho may be releasing prebuilt patches of eleveldb 
> 2.0.18 in the near future.  But no date is currently set.
> 
> Matthew
> 
>> On May 3, 2016, at 10:50 AM, Luke Bakken > <mailto:lbak...@basho.com>> wrote:
>> 
>> Guillaume -
>> 
>> You said earlier "My data are stored on an openstack volume that
>> support up to 3000IOPS". There is a likelihood that your write load is
>> exceeding the capacity of your virtual environment, especially if some
>> Riak nodes are sharing physical disk or server infrastructure.
>> 
>> Some suggestions:
>> 
>> * If you're not using Riak Search, set "search = off" in riak.conf
>> 
>> * Be sure to carefully read and apply all tunings:
>> http://docs.basho.com/riak/kv/2.1.4/using/performance/ 
>> 
>> * You may wish to increase the memory dedicated to leveldb:
>> http://docs.basho.com/riak/kv/2.1.4/configuring/backend/#leveldb
>> 
>> --
>> Luke Bakken
>> Engineer
>> lbak...@basho.com
>> 
>> 
>> On Tue, May 3, 2016 at 7:33 AM, Guillaume Boddaert
>>  wrote:
>>> Hi,
>>> 
>>> Sorry for the delay, I've spent a lot of time trying to understand if the
>>> problem was elsewhere.
>>> I've simplified my infrastructure and got a simple layout that don't rely
>>> anymore on loadbalancer and also corrected some minor performance issue on
>>> my workers.
>>> 
>>> At the moment, i have up to 32 workers that are calling riak for writes,
>>> each of them are set to :
>>> w=1
>>> dw=0
>>> timeout=1000
>>> using protobuf
>>> a timeouted attempt is rerun 180s later
>>> 
>>> From my application server perspective, 23% of the calls are rejected by
>>> timeout (75446 tries, 57564 success, 17578 timeout).
>>> 
>>> Here is a sample riak-admin stat for one of my 5 hosts:
>>> 
>>> node_put_fsm_time_100 : 999331
>>> node_put_fsm_time_95 : 773682
>>> node_put_fsm_time_99 : 959444
>>> node_put_fsm_time_mean : 156242
>>> node_put_fsm_time_median : 20235
>>> vnode_put_fsm_time_100 : 5267527
>>> vnode_put_fsm_time_95 : 2437457
>>> vnode_put_fsm_time_99 : 4819538
>>> vnode_put_fsm_time_mean : 175567
>>> vnode_put_fsm_time_median : 6928
>>> 
>>> I am using leveldb, so i can't tune bitcask backend as suggested.
>>> 
>>> I've changed the vmdirty settings and enabled them:
>>> admin@riak1:~$ sudo sysctl -a | grep dirty
>>> vm.dirty_background_ratio = 0
>>> vm.dirty_background_bytes = 209715200
>>> vm.dirty_ratio = 40
>>> vm.dirty_bytes = 0
>>> vm.dirty_writeback_centisecs = 100
>>> vm.dirty_expire_centisecs = 200
>>> 
>>> I've seen less idle time between writes, iostat is showing near constant
>>> writes between 20 and 500 kb/s, with some surges around 4000 kb/s. That's
>>> better, but not that great.

Re: Very slow acquisition time (99 percentile) while fast median times

2016-05-03 Thread Matthew Von-Maszewski
Guillaume,

I have reviewed the debug package for your riak1 server.  There are two 
potential areas of follow-up:

1.  You are running our most recent Riak 2.1.4 which has eleveldb 2.0.17.  We 
have seen a case where a recent feature in eleveldb 2.0.17 caused too much 
cache flushing, impacting leveldb’s performance.  A discussion is here:

  https://github.com/basho/leveldb/wiki/mv-timed-grooming2 


2.  Yokozuna search was recently updated for some timeout problems.  Those 
updates are not yet in a public build.  One of our other engineers is likely to 
respond to you on that topic.


An eleveldb 2.0.18 is tagged and available via github if you want to build it 
yourself.  Otherwise, Basho may be releasing prebuilt patches of eleveldb 
2.0.18 in the near future.  But no date is currently set.

Matthew

> On May 3, 2016, at 10:50 AM, Luke Bakken  wrote:
> 
> Guillaume -
> 
> You said earlier "My data are stored on an openstack volume that
> support up to 3000IOPS". There is a likelihood that your write load is
> exceeding the capacity of your virtual environment, especially if some
> Riak nodes are sharing physical disk or server infrastructure.
> 
> Some suggestions:
> 
> * If you're not using Riak Search, set "search = off" in riak.conf
> 
> * Be sure to carefully read and apply all tunings:
> http://docs.basho.com/riak/kv/2.1.4/using/performance/
> 
> * You may wish to increase the memory dedicated to leveldb:
> http://docs.basho.com/riak/kv/2.1.4/configuring/backend/#leveldb
> 
> --
> Luke Bakken
> Engineer
> lbak...@basho.com
> 
> 
> On Tue, May 3, 2016 at 7:33 AM, Guillaume Boddaert
>  wrote:
>> Hi,
>> 
>> Sorry for the delay, I've spent a lot of time trying to understand if the
>> problem was elsewhere.
>> I've simplified my infrastructure and got a simple layout that don't rely
>> anymore on loadbalancer and also corrected some minor performance issue on
>> my workers.
>> 
>> At the moment, i have up to 32 workers that are calling riak for writes,
>> each of them are set to :
>> w=1
>> dw=0
>> timeout=1000
>> using protobuf
>> a timeouted attempt is rerun 180s later
>> 
>> From my application server perspective, 23% of the calls are rejected by
>> timeout (75446 tries, 57564 success, 17578 timeout).
>> 
>> Here is a sample riak-admin stat for one of my 5 hosts:
>> 
>> node_put_fsm_time_100 : 999331
>> node_put_fsm_time_95 : 773682
>> node_put_fsm_time_99 : 959444
>> node_put_fsm_time_mean : 156242
>> node_put_fsm_time_median : 20235
>> vnode_put_fsm_time_100 : 5267527
>> vnode_put_fsm_time_95 : 2437457
>> vnode_put_fsm_time_99 : 4819538
>> vnode_put_fsm_time_mean : 175567
>> vnode_put_fsm_time_median : 6928
>> 
>> I am using leveldb, so i can't tune bitcask backend as suggested.
>> 
>> I've changed the vmdirty settings and enabled them:
>> admin@riak1:~$ sudo sysctl -a | grep dirty
>> vm.dirty_background_ratio = 0
>> vm.dirty_background_bytes = 209715200
>> vm.dirty_ratio = 40
>> vm.dirty_bytes = 0
>> vm.dirty_writeback_centisecs = 100
>> vm.dirty_expire_centisecs = 200
>> 
>> I've seen less idle time between writes, iostat is showing near constant
>> writes between 20 and 500 kb/s, with some surges around 4000 kb/s. That's
>> better, but not that great.
>> 
>> Here is the current configuration for my "activity_fr" bucket type and
>> "tweet" bucket:
>> 
>> 
>> admin@riak1:~$ http localhost:8098/types/activity_fr/props
>> HTTP/1.1 200 OK
>> Content-Encoding: gzip
>> Content-Length: 314
>> Content-Type: application/json
>> Date: Tue, 03 May 2016 14:30:21 GMT
>> Server: MochiWeb/1.1 WebMachine/1.10.8 (that head fake, tho)
>> Vary: Accept-Encoding
>> {
>>"props": {
>>"active": true,
>>"allow_mult": false,
>>"basic_quorum": false,
>>"big_vclock": 50,
>>"chash_keyfun": {
>>"fun": "chash_std_keyfun",
>>"mod": "riak_core_util"
>>},
>>"claimant": "r...@riak2.lighthouse-analytics.co",
>>"dvv_enabled": false,
>>"dw": "quorum",
>>"last_write_wins": true,
>>"linkfun": {
>>"fun": "mapreduce_linkfun",
>>"mod": "riak_kv_wm_link_walker"
>>},
>>"n_val": 3,
>>"notfound_ok": true,
>>"old_vclock": 86400,
>>"postcommit": [],
>>"pr": 0,
>>"precommit": [],
>>"pw": 0,
>>"r": "quorum",
>>"rw": "quorum",
>>"search_index": "activity_fr.20160422104506",
>>"small_vclock": 50,
>>"w": "quorum",
>>"young_vclock": 20
>>}
>> }
>> 
>> admin@riak1:~$ http localhost:8098/types/activity_fr/buckets/tweet/props
>> HTTP/1.1 200 OK
>> Content-Encoding: gzip
>> Content-Length: 322
>> Content-Type: application/json
>> Date: Tue, 03 May 2016 14:30:02 GMT
>> Server: MochiWeb/1.1 WebMachine/1.10.8 (that head fake, tho)
>> Vary: Accept-Encoding
>> 
>> {
>>"props": {
>>"active": true,
>>"allow_mult": false,
>>"bas

Re: 2i indexes and keys request inconsistencies

2016-03-08 Thread Matthew Von-Maszewski
Is the database being actively modified during your queries?  

Queries can lock down a "snapshot" within leveldb.  The query operation can 
return keys that existed at the time of the snapshot, but have been 
subsequently deleted by normal operations.

In such a case, the query is correct in giving you the key and the 404 
afterward is also correct.  They represent two different versions of the 
database over time.

Not sure if this is a valid scenario for you or not.

Matthew


> On Mar 8, 2016, at 1:22 PM, Alexander Popov  wrote:
> 
> Noticed that sometimes 2i queries and all-keys requests return extra records, 
> ~2% of all records.
> 
> When these items are then fetched with a GET request, it returns 404, and 
> after that the key stops being returned in 2i and keys requests.
> 
> Is this normal, or is my database corrupted?
> 
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Ok, I am stumped. Losing data or riak stop

2016-02-28 Thread Matthew Von-Maszewski
All,

Thanks to Joe Olson for his detective work and write-up.  Further thanks to 
Russell Brown and Dan Brown for debug work provided yesterday and today.

I have updated the Riak ticket with the cause and fix discussion:

   https://github.com/basho/riak_kv/issues/1356

I will discuss this issue internally with relation to the upcoming Riak 2.0.7 
and 2.2.0 releases.  And will create a proper branch and pull request tomorrow.

This is a data loss scenario in tiered storage if Riak starts and stops prior 
to the first recovery log being translated into an .sst table file.  Once the 
first .log file becomes an .sst file, all subsequent recovery .log files go to 
the proper location and will be found upon next stop/start cycle.  Stated 
another way, the first 30Mbytes to 60Mbytes of data written to each vnode after 
a restart is subject to data loss if Riak restarted again quickly.

Matthew

> On Feb 26, 2016, at 11:19 AM, Joe Olson  wrote:
> 
> 
> 
> Negative.
> 
> I have ring size set to 8, leveldb split across two sets of drives ("fast" 
> and "slow", but meaningless on the test Vagrant box...just two separate 
> directories). I checked all of the ../leveldb/* directories. All LOG files 
> are identical, and no errors in any of them.
> 
> I will try to build another Vagrant machine with the default riak.conf and 
> see if I can get this to repeat. It is almost as if the KV pairs are not 
> persisting to disk at all.
> 
> 
> From: "Matthew Von-Maszewski" 
> To: "Joe Olson" 
> Cc: "riak-users" , "cmancini" 
> Sent: Friday, February 26, 2016 10:12:15 AM
> Subject: Re: Ok, I am stumped. Losing data or riak stop
> 
> Joe,
> 
> Are there any error messages in the leveldb LOG and/or LOG.old files?  These 
> files are located within each vnode's directory, likely 
> /var/lib/riak/data/leveldb/*/LOG* on your machine.
> 
> The LOG files are not to be confused with 000xxx.log files.  The lower case 
> *.log files are the recovery files that should contain the keys you are 
> missing.  If they are not loading properly, the LOG files should have clues.
> 
> Matthew
> 
> On Feb 26, 2016, at 11:04 AM, Christopher Mancini  <mailto:cmanc...@basho.com>> wrote:
> 
> Hey Joe,
> 
> I will do my best to help, but I am not the most experienced with Riak 
> operations. Your best bet to get to a solution as fast as possible is to 
> include the full users group, which I have added to the recipients of this 
> message.
> 
> 1. Are the Riak data directories within Vagrant shared directories between 
> the host and guest? I have had issues with OS file system caching before when 
> working with web server files.
> 
> 2. What version of Ubuntu are you using?
> 
> 3. How did you install Riak on Ubuntu?
> 
> 4. Have you tried restoring the original distribution riak.conf file and seen 
> if the issue persists? This would help you determine if the issue is your 
> config or something with your environment.
> 
> Chris
> 
> On Fri, Feb 26, 2016 at 10:55 AM Joe Olson  <mailto:technol...@nododos.com>> wrote:
> 
> Chris - 
> 
>  I cannot figure out what is going on. Here is my test case. Configuration 
> file attached. I am running a single node of Riak on a vagrant box with a 
> level DB back end. I don't even have to bring the box down, merely stopping 
> and restarting riak ('riak stop' and 'riak start', or 'riak restart') causes 
> all the keys to be lost. The riak node is set up on a Vagrant box. But 
> again, I do not have to bring the machine up or down to get this error.
> 
> I've also deleted the ring info in /var/lib/riak/ring, and deleted all the 
> leveldb files. In this case, the bucket type is just n_val = 1, and the ring 
> size is the minimum of 8. 
> 
> Is it possible Riak is not flushing RAM to disk after write? The keys only 
> reside in RAM?
> 
> My test procedure:
> 
> On a remote machine=
> 
> riak01@ubuntu:/etc$ curl -i http:// 
> <>:8098/types/n1/buckets/test/keys?keys=true
> HTTP/1.1 200 OK
> Vary: Accept-Encoding
> Server: MochiWeb/1.1 WebMachine/1.10.8 (that head fake, tho)
> Date: Fri, 26 Feb 2016 13:14:59 GMT
> Content-Type: application/json
> Content-Length: 17
> 
> {"keys":["test"]}
> 
> riak01@ubuntu:/etc$
> 
> 
> 
> On the single Riak node itself
> 
> [vagrant@i-2016022519 -9bb5c84f riak]$ sudo riak stop
> ok
> [vagrant@i-2016022519 -9bb5c84f riak]$ sudo riak start
> [vagrant@i-2016022519 -9bb5c84f riak]$ sudo riak ping
> pong
> 
> 
> 
> Back to the remote machine
> 
> riak01@ubuntu:/etc$ curl -i http:// 
> <>:8

Re: Ok, I am stumped. Losing data or riak stop

2016-02-26 Thread Matthew Von-Maszewski
What I failed to say was: make the copy after you populate and stop, but before 
you attempt to start Riak again.

Matthew

> On Feb 26, 2016, at 11:19 AM, Joe Olson  wrote:
> 
> 
> 
> Negative.
> 
> I have ring size set to 8, leveldb split across two sets of drives ("fast" 
> and "slow", but meaningless on the test Vagrant box...just two separate 
> directories). I checked all of the ../leveldb/* directories. All LOG files 
> are identical, and no errors in any of them.
> 
> I will try to build another Vagrant machine with the default riak.conf and 
> see if I can get this to repeat. It is almost as if the KV pairs are not 
> persisting to disk at all.
> 
> 
> From: "Matthew Von-Maszewski" 
> To: "Joe Olson" 
> Cc: "riak-users" , "cmancini" 
> Sent: Friday, February 26, 2016 10:12:15 AM
> Subject: Re: Ok, I am stumped. Losing data or riak stop
> 
> Joe,
> 
> Are there any error messages in the leveldb LOG and/or LOG.old files?  These 
> files are located within each vnode's directory, likely 
> /var/lib/riak/data/leveldb/*/LOG* on your machine.
> 
> The LOG files are not to be confused with 000xxx.log files.  The lower case 
> *.log files are the recovery files that should contain the keys you are 
> missing.  If they are not loading properly, the LOG files should have clues.
> 
> Matthew
> 
> On Feb 26, 2016, at 11:04 AM, Christopher Mancini  <mailto:cmanc...@basho.com>> wrote:
> 
> Hey Joe,
> 
> I will do my best to help, but I am not the most experienced with Riak 
> operations. Your best bet to get to a solution as fast as possible is to 
> include the full users group, which I have added to the recipients of this 
> message.
> 
> 1. Are the Riak data directories within Vagrant shared directories between 
> the host and guest? I have had issues with OS file system caching before when 
> working with web server files.
> 
> 2. What version of Ubuntu are you using?
> 
> 3. How did you install Riak on Ubuntu?
> 
> 4. Have you tried restoring the original distribution riak.conf file and seen 
> if the issue persists? This would help you determine if the issue is your 
> config or something with your environment.
> 
> Chris
> 
> On Fri, Feb 26, 2016 at 10:55 AM Joe Olson  <mailto:technol...@nododos.com>> wrote:
> 
> Chris - 
> 
>  I cannot figure out what is going on. Here is my test case. Configuration 
> file attached. I am running a single node of Riak on a vagrant box with a 
> level DB back end. I don't even have to bring the box down, merely stopping 
> and restarting riak ('riak stop' and 'riak start', or 'riak restart') causes 
> all the keys to be lost. The riak node is set up on a Vagrant box. But 
> again, I do not have to bring the machine up or down to get this error.
> 
> I've also deleted the ring info in /var/lib/riak/ring, and deleted all the 
> leveldb files. In this case, the bucket type is just n_val = 1, and the ring 
> size is the minimum of 8. 
> 
> Is it possible Riak is not flushing RAM to disk after write? The keys only 
> reside in RAM?
> 
> My test procedure:
> 
> On a remote machine=
> 
> riak01@ubuntu:/etc$ curl -i http:// 
> <>:8098/types/n1/buckets/test/keys?keys=true
> HTTP/1.1 200 OK
> Vary: Accept-Encoding
> Server: MochiWeb/1.1 WebMachine/1.10.8 (that head fake, tho)
> Date: Fri, 26 Feb 2016 13:14:59 GMT
> Content-Type: application/json
> Content-Length: 17
> 
> {"keys":["test"]}
> 
> riak01@ubuntu:/etc$
> 
> 
> 
> On the single Riak node itself
> 
> [vagrant@i-2016022519 -9bb5c84f riak]$ sudo riak stop
> ok
> [vagrant@i-2016022519 -9bb5c84f riak]$ sudo riak start
> [vagrant@i-2016022519 -9bb5c84f riak]$ sudo riak ping
> pong
> 
> 
> 
> Back to the remote machine
> 
> riak01@ubuntu:/etc$ curl -i http:// 
> <>:8098/types/n1/buckets/test/keys?keys=true
> HTTP/1.1 200 OK
> Vary: Accept-Encoding
> Server: MochiWeb/1.1 WebMachine/1.10.8 (that head fake, tho)
> Date: Fri, 26 Feb 2016 13:16:34 GMT
> Content-Type: application/json
> Content-Length: 11
> 
> {"keys":[]}
> 
> riak01@ubuntu:/etc$
> 
> 
> 
> -- 
> Sincerely,
> 
> Christopher Mancini
> -
> 
> employee = {
> purpose: solve problems with code,
> phone:7164625591,
> email: cmanc...@basho.com <mailto:cmanc...@basho.com>,
> github:http://www.github.com/christophermancini 
> }
> ___
> riak-users mailing list
> riak-users@lists.basho.com <mailto:riak-users@lists.basho.com>
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> 
> 

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Ok, I am stumped. Losing data or riak stop

2016-02-26 Thread Matthew Von-Maszewski
Joe,

If the sample data is not confidential, how about creating a tar file of the 
entire leveldb data directory and either emailing to me directly or posting 
somewhere I can download it?  No need to copy the entire mailing list on the 
file or download location.

Matthew

> On Feb 26, 2016, at 11:19 AM, Joe Olson  wrote:
> 
> 
> 
> Negative.
> 
> I have ring size set to 8, leveldb split across two sets of drives ("fast" 
> and "slow", but meaningless on the test Vagrant box...just two separate 
> directories). I checked all of the ../leveldb/* directories. All LOG files 
> are identical, and no errors in any of them.
> 
> I will try to build another Vagrant machine with the default riak.conf and 
> see if I can get this to repeat. It is almost as if the KV pairs are not 
> persisting to disk at all.
> 
> 
> From: "Matthew Von-Maszewski" 
> To: "Joe Olson" 
> Cc: "riak-users" , "cmancini" 
> Sent: Friday, February 26, 2016 10:12:15 AM
> Subject: Re: Ok, I am stumped. Losing data or riak stop
> 
> Joe,
> 
> Are there any error messages in the leveldb LOG and/or LOG.old files?  These 
> files are located within each vnode's directory, likely 
> /var/lib/riak/data/leveldb/*/LOG* on your machine.
> 
> The LOG files are not to be confused with 000xxx.log files.  The lower case 
> *.log files are the recovery files that should contain the keys you are 
> missing.  If they are not loading properly, the LOG files should have clues.
> 
> Matthew
> 
> On Feb 26, 2016, at 11:04 AM, Christopher Mancini  <mailto:cmanc...@basho.com>> wrote:
> 
> Hey Joe,
> 
> I will do my best to help, but I am not the most experienced with Riak 
> operations. Your best bet to get to a solution as fast as possible is to 
> include the full users group, which I have added to the recipients of this 
> message.
> 
> 1. Are the Riak data directories within Vagrant shared directories between 
> the host and guest? I have had issues with OS file system caching before when 
> working with web server files.
> 
> 2. What version of Ubuntu are you using?
> 
> 3. How did you install Riak on Ubuntu?
> 
> 4. Have you tried restoring the original distribution riak.conf file and seen 
> if the issue persists? This would help you determine if the issue is your 
> config or something with your environment.
> 
> Chris
> 
> On Fri, Feb 26, 2016 at 10:55 AM Joe Olson  <mailto:technol...@nododos.com>> wrote:
> 
> Chris - 
> 
>  I cannot figure out what is going on. Here is my test case. Configuration 
> file attached. I am running a single node of Riak on a vagrant box with a 
> level DB back end. I don't even have to bring the box down, merely stopping 
> and restarting riak ('riak stop' and 'riak start', or 'riak restart') causes 
> all the keys to be lost. The riak node is set up on a Vagrant box. But 
> again, I do not have to bring the machine up or down to get this error.
> 
> I've also deleted the ring info in /var/lib/riak/ring, and deleted all the 
> leveldb files. In this case, the bucket type is just n_val = 1, and the ring 
> size is the minimum of 8. 
> 
> Is it possible Riak is not flushing RAM to disk after write? The keys only 
> reside in RAM?
> 
> My test procedure:
> 
> On a remote machine=
> 
> riak01@ubuntu:/etc$ curl -i http:// 
> <>:8098/types/n1/buckets/test/keys?keys=true
> HTTP/1.1 200 OK
> Vary: Accept-Encoding
> Server: MochiWeb/1.1 WebMachine/1.10.8 (that head fake, tho)
> Date: Fri, 26 Feb 2016 13:14:59 GMT
> Content-Type: application/json
> Content-Length: 17
> 
> {"keys":["test"]}
> 
> riak01@ubuntu:/etc$
> 
> 
> 
> On the single Riak node itself
> 
> [vagrant@i-2016022519 -9bb5c84f riak]$ sudo riak stop
> ok
> [vagrant@i-2016022519 -9bb5c84f riak]$ sudo riak start
> [vagrant@i-2016022519 -9bb5c84f riak]$ sudo riak ping
> pong
> 
> 
> 
> Back to the remote machine
> 
> riak01@ubuntu:/etc$ curl -i http:// 
> <>:8098/types/n1/buckets/test/keys?keys=true
> HTTP/1.1 200 OK
> Vary: Accept-Encoding
> Server: MochiWeb/1.1 WebMachine/1.10.8 (that head fake, tho)
> Date: Fri, 26 Feb 2016 13:16:34 GMT
> Content-Type: application/json
> Content-Length: 11
> 
> {"keys":[]}
> 
> riak01@ubuntu:/etc$
> 
> 
> 
> -- 
> Sincerely,
> 
> Christopher Mancini
> -
> 
> employee = {
> purpose: solve problems with code,
> phone:7164625591,
> email: cmanc...@basho.com <mailto:cmanc...@basho.com>,
> github:http://www.github.com/christophermancini 
> <http://www.github.com/christophermancini>
> }
> ___
> riak-users mailing list
> riak-users@lists.basho.com <mailto:riak-users@lists.basho.com>
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> 
> 

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Ok, I am stumped. Losing data or riak stop

2016-02-26 Thread Matthew Von-Maszewski
Joe,

Are there any error messages in the leveldb LOG and/or LOG.old files?  These 
files are located within each vnode's directory, likely 
/var/lib/riak/data/leveldb/*/LOG* on your machine.

The LOG files are not to be confused with 000xxx.log files.  The lower case 
*.log files are the recovery files that should contain the keys you are 
missing.  If they are not loading properly, the LOG files should have clues.
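
A quick way to scan them all at once (substitute your configured data path if
it differs from the default):

grep -i -E "error|corrupt" /var/lib/riak/data/leveldb/*/LOG*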

Matthew

> On Feb 26, 2016, at 11:04 AM, Christopher Mancini  wrote:
> 
> Hey Joe,
> 
> I will do my best to help, but I am not the most experienced with Riak 
> operations. Your best bet to get to a solution as fast as possible is to 
> include the full users group, which I have added to the recipients of this 
> message.
> 
> 1. Are the Riak data directories within Vagrant shared directories between 
> the host and guest? I have had issues with OS file system caching before when 
> working with web server files.
> 
> 2. What version of Ubuntu are you using?
> 
> 3. How did you install Riak on Ubuntu?
> 
> 4. Have you tried restoring the original distribution riak.conf file and seen 
> if the issue persists? This would help you determine if the issue is your 
> config or something with your environment.
> 
> Chris
> 
> On Fri, Feb 26, 2016 at 10:55 AM Joe Olson  > wrote:
> 
> Chris - 
> 
>  I cannot figure out what is going on. Here is my test case. Configuration 
> file attached. I am running a single node of Riak on a vagrant box with a 
> level DB back end. I don't even have to bring the box down, merely stopping 
> and restarting riak ('riak stop' and 'riak start', or 'riak restart') causes 
> all the keys to be lost. The riak node is set up on a Vagrant box. But 
> again, I do not have to bring the machine up or down to get this error.
> 
> I've also deleted the ring info in /var/lib/riak/ring, and deleted all the 
> leveldb files. In this case, the bucket type is just n_val = 1, and the ring 
> size is the minimum of 8. 
> 
> Is it possible Riak is not flushing RAM to disk after write? The keys only 
> reside in RAM?
> 
> My test procedure:
> 
> On a remote machine=
> 
> riak01@ubuntu:/etc$ curl -i http:// 
> <>:8098/types/n1/buckets/test/keys?keys=true
> HTTP/1.1 200 OK
> Vary: Accept-Encoding
> Server: MochiWeb/1.1 WebMachine/1.10.8 (that head fake, tho)
> Date: Fri, 26 Feb 2016 13:14:59 GMT
> Content-Type: application/json
> Content-Length: 17
> 
> {"keys":["test"]}
> 
> riak01@ubuntu:/etc$
> 
> 
> 
> On the single Riak node itself
> 
> [vagrant@i-2016022519 -9bb5c84f riak]$ sudo riak stop
> ok
> [vagrant@i-2016022519 -9bb5c84f riak]$ sudo riak start
> [vagrant@i-2016022519 -9bb5c84f riak]$ sudo riak ping
> pong
> 
> 
> 
> Back to the remote machine
> 
> riak01@ubuntu:/etc$ curl -i http:// 
> <>:8098/types/n1/buckets/test/keys?keys=true
> HTTP/1.1 200 OK
> Vary: Accept-Encoding
> Server: MochiWeb/1.1 WebMachine/1.10.8 (that head fake, tho)
> Date: Fri, 26 Feb 2016 13:16:34 GMT
> Content-Type: application/json
> Content-Length: 11
> 
> {"keys":[]}
> 
> riak01@ubuntu:/etc$
> 
> 
> 
> -- 
> Sincerely,
> 
> Christopher Mancini
> -
> 
> employee = {
> purpose: solve problems with code,
> phone:7164625591,
> email: cmanc...@basho.com ,
> github:http://www.github.com/christophermancini 
> 
> }
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: riak crash

2016-02-22 Thread Matthew Von-Maszewski
Raviraj,

Please run 'riak-debug'.  This is in the bin directory along with 'riak start' 
and 'riak-admin'.

riak-debug will produce a file named similar to 
/home/user/r...@10.0.0.15-riak-debug.tar.gz 
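
For example (a rough sketch; run it as root or the riak user on the affected
node, and the archive is written to the directory you run it from):

    cd /tmp && sudo riak-debug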


You should email that file to me directly, or post it to dropbox or similar and 
send me a link.  You do not want to send that file to the entire mailing list.

I will review the file and suggest next steps.

Matthew

> On Feb 22, 2016, at 5:13 AM, Raviraj Vaishampayan  
> wrote:
> 
> Hi,
> 
> We have been using riak to gather our test data and analyze results after 
> test completes.
> Recently we have observed riak crash in riak console logs.
> This causes our tests failing to record data to riak and bailing out :-(
> 
> The crash logs are as follow:
> 2016-02-19 16:25:26.255 [error] <0.2160.0> gen_fsm <0.2160.0> in state active 
> terminated with reason: no function clause matching 
> riak_kv_vnode:handle_info({#Ref<0.0.482.161540>,{ok,<0.11042.842>}}, 
> {state,268322566228720457638957762256505085639956365312,riak_kv_eleveldb_backend,true,{state,<<>>,...},...})
>  line 1195
> 2016-02-19 16:25:26.260 [error] <0.2160.0> CRASH REPORT Process <0.2160.0> 
> with 2 neighbours exited with reason: no function clause matching 
> riak_kv_vnode:handle_info({#Ref<0.0.482.161540>,{ok,<0.11042.842>}}, 
> {state,268322566228720457638957762256505085639956365312,riak_kv_eleveldb_backend,true,{state,<<>>,...},...})
>  line 1195 in gen_fsm:terminate/7 line 622
> 2016-02-19 16:25:26.260 [error] <0.172.0> Supervisor riak_core_vnode_sup had 
> child undefined started with {riak_core_vnode,start_link,undefined} at 
> <0.2160.0> exit with reason no function clause matching 
> riak_kv_vnode:handle_info({#Ref<0.0.482.161540>,{ok,<0.11042.842>}}, 
> {state,268322566228720457638957762256505085639956365312,riak_kv_eleveldb_backend,true,{state,<<>>,...},...})
>  line 1195 in context child_terminated
> 2016-02-19 16:25:26.261 [error] <0.4319.0> gen_fsm <0.4319.0> in state ready 
> terminated with reason: no function clause matching 
> riak_kv_vnode:handle_info({#Ref<0.0.482.161540>,{ok,<0.11042.842>}}, 
> {state,268322566228720457638957762256505085639956365312,riak_kv_eleveldb_backend,true,{state,<<>>,...},...})
>  line 1195
> 2016-02-19 16:25:26.275 [error] <0.4319.0> CRASH REPORT Process <0.4319.0> 
> with 10 neighbours exited with reason: no function clause matching 
> riak_kv_vnode:handle_info({#Ref<0.0.482.161540>,{ok,<0.11042.842>}}, 
> {state,268322566228720457638957762256505085639956365312,riak_kv_eleveldb_backend,true,{state,<<>>,...},...})
>  line 1195 in gen_fsm:terminate/7 line 622
> 2016-02-19 16:25:26.278 [error] <0.4320.0> Supervisor 
> {<0.4320.0>,poolboy_sup} had child riak_core_vnode_worker started with 
> riak_core_vnode_worker:start_link([{worker_module,riak_core_vnode_worker},{worker_args,[268322566228720457638957762256505085639956365312,...]},...])
>  at undefined exit with reason no function clause matching 
> riak_kv_vnode:handle_info({#Ref<0.0.482.161540>,{ok,<0.11042.842>}}, 
> {state,268322566228720457638957762256505085639956365312,riak_kv_eleveldb_backend,true,{state,<<>>,...},...})
>  line 1195 in context shutdown_error
> 2016-02-19 16:25:26.278 [error] <0.4320.0> gen_server <0.4320.0> terminated 
> with reason: no function clause matching 
> riak_kv_vnode:handle_info({#Ref<0.0.482.161540>,{ok,<0.11042.842>}}, 
> {state,268322566228720457638957762256505085639956365312,riak_kv_eleveldb_backend,true,{state,<<>>,...},...})
>  line 1195
> 2016-02-19 16:25:26.278 [error] <0.4320.0> CRASH REPORT Process <0.4320.0> 
> with 0 neighbours exited with reason: no function clause matching 
> riak_kv_vnode:handle_info({#Ref<0.0.482.161540>,{ok,<0.11042.842>}}, 
> {state,268322566228720457638957762256505085639956365312,riak_kv_eleveldb_backend,true,{state,<<>>,...},...})
>  line 1195 in gen_server:terminate/6 line 744
> 2016-02-19 16:25:26.806 [error] <0.2157.0> gen_fsm <0.2157.0> in state active 
> terminated with reason: {timeout,{gen_server,call,[<0.5141.0>,stop]}}
> 2016-02-19 16:25:26.808 [error] <0.2157.0> CRASH REPORT Process <0.2157.0> 
> with 2 neighbours exited with reason: 
> {timeout,{gen_server,call,[<0.5141.0>,stop]}} in gen_fsm:terminate/7 line 600
> 2016-02-19 16:25:26.809 [error] <0.5450.0> gen_fsm <0.5450.0> in state ready 
> terminated with reason: {timeout,{gen_server,call,[<0.5141.0>,stop]}}
> 2016-02-19 16:25:26.809 [error] <0.172.0> Supervisor riak_core_vnode_sup had 
> child undefined started with {riak_core_vnode,start_link,undefined} at 
> <0.2157.0> exit with reason {timeout,{gen_server,call,[<0.5141.0>,stop]}} in 
> context child_terminated
> 2016-02-19 16:25:26.809 [error] <0.5450.0> CRASH REPORT Process <0.5450.0> 
> with 10 neighbours exited with reason: 
> {timeout,{gen_server,call,[<0.5141.0>,stop]}} in gen_fsm:terminate/7 line 622
> 2016-02-19 16:25:26.809 [error] <0.5451.0> Supervisor 
> {<0.5451.0>,poolboy_sup} had child riak_core_v

Re: Leveldb segfault during Riak startup

2016-01-19 Thread Matthew Von-Maszewski
riak-users,

Antti's issue has been addressed via a correction to leveldb.  Upcoming 
releases of Riak will of course contain this correction.  The fix for Riak 2.x 
is tagged 2.0.11 at github.com/basho/leveldb.  The same fix is also part of the 
leveldb "develop" branch.

Bug discussion is here:  
https://github.com/basho/leveldb/wiki/mv-startup-segfault

Matthew


> On Jan 1, 2016, at 3:25 PM, Antti Kuusela  wrote:
> 
> Hi Matthew,
> 
> This phenomenon started after upgrade to 2.1.2. I downgraded the servers to 
> 2.1.1, now waiting to see if it makes a difference.
> 
> I installed the binary packages from Basho repo.
> 
> 
> 
> ________
> From: Matthew Von-Maszewski 
> Sent: 31 December 2015 18:25
> To: Antti Kuusela
> Cc: Luke Bakken; riak-users
> Subject: Re: Leveldb segfault during Riak startup
> 
> I also failed to ask two basic questions:
> 
> 1.  did this failure start after your upgrade to 2.1.3, or happen prior to 
> upgrade also?
> 
> 2.  did you use a Basho package for Centos 7, or did you build from source 
> code?
> 
> Matthew
> 
> 
>> On Dec 31, 2015, at 6:06 AM, Antti Kuusela  
>> wrote:
>> 
>> Hi Luke,
>> 
>> We erased the btrfs file system, replaced it with xfs on lvm with 
>> thinly-provisioned volumes and continued testing with a new database from 
>> scratch. The same problem continues, though. From /var/log/messages:
>> 
>> Dec 31 03:35:31 storage5 riak[66419]: Starting up
>> Dec 31 03:35:45 storage5 kernel: traps: beam.smp[66731] general protection 
>> ip:7f02280b3f16 sp:7f0197ad8dd0 error:0 in eleveldb.so[7f0228066000+93000]
>> Dec 31 03:35:46 storage5 run_erl[66417]: Erlang closed the connection.
>> 
>> 
>> 
>> 18.12.2015, 16:45, Luke Bakken kirjoitti:
>>> Hi Antti,
>>> 
>>> Riak is not tested on btrfs and the file system is not officially
>>> supported. We recommend ext4 or xfs for Linux. ZFS is an option on
>>> Solaris derivatives and FreeBSD.
>>> 
>>> --
>>> Luke Bakken
>>> Engineer
>>> lbak...@basho.com
>>> 
>>> 
>>> On Fri, Dec 18, 2015 at 6:14 AM, Antti Kuusela
>>>  wrote:
>>>> Hi,
>>>> 
>>>> I have been testing Riak and Riak CS as a possible solution for our future
>>>> storage needs. I have a five server cluster running Centos 7. Riak version
>>>> is 2.1.3 (first installed as 2.1.1, updated twice via Basho repo) and Riak
>>>> CS version is 2.1.0. The servers each have 64GB RAM and six 4TB disks in
>>>> raid 6 using btrfs.
>>>> 
>>>> I have been pushing random data into Riak-CS via s3cmd to see how the 
>>>> system
>>>> behaves. Smallest objects have been 2000 bytes, largest 100MB. I have also
>>>> been making btrfs snapshots of the entire platform data dir nightly for
>>>> backup purposes. Stop Riak CS, wait 10 seconds, stop Riak, wait 10, make
>>>> snapshot, start Riak, wait 180 seconds, start Riak CS. This is performed on
>>>> each of the servers in turn with a five minute wait in between. I have 
>>>> added
>>>> the waits to try spread the startup load and allow the system time to get
>>>> things running. New data is constantly pushed to the S3 API but restarting
>>>> the nodes in rotation causes by far the highest stress on the system.
>>>> 
>>>> I have encountered one problem in particular. Quite often one of the Riak
>>>> nodes starts up but after a couple of minutes it just drops, all processes
>>>> exited except for epmd.
>>>> 
>>>> Following is from /var/log/riak/console, most of the lines skipped for sake
>>>> of brevity. Normal startup stuff, as far as I can see:
>>>> 
>>>> 2015-12-16 00:26:04.446 [info] <0.7.0> Application lager started on node
>>>> 'riak@192.168.50.32'
>>>> ...
>>>> 2015-12-16 00:26:04.490 [info] <0.72.0> alarm_handler:
>>>> {set,{system_memory_high_watermark,[]}}
>>>> ...
>>>> 2015-12-16 00:26:04.781 [info]
>>>> <0.206.0>@riak_core_capability:process_capability_changes:555 New
>>>> capability: {riak_core,vnode_routing} = proxy
>>>> ...
>>>> 2015-12-16 00:26:04.869 [info] <0.7.0> Application riak_core started on 
>>>> node
>>>> 'riak@192.168.50.32'
>>>> ...
>>>> 2015-12-16 00:26:04.969 [info] <0.407.0>@riak_kv_env:doc_env:46 Environment
>&

Re: Connection multiplexing in the Erlang client

2016-01-03 Thread Matthew Von-Maszewski
Paulo,

There is nothing to prevent you from "stacking" multiple protocol buffer 
request packets into the TCP stream.  None of our published clients support 
this usage, but it is known to work.  You will always get the response packets 
in the order you sent the request packets.  The benefit is that there is no 
round-trip wait between the response to the first request and the arrival of 
the second request at the server.

Start with stacking two packets at a time, then work up until throughput no 
longer changes.

Matthew

> On Jan 2, 2016, at 8:30 AM, Paulo Almeida  wrote:
> 
> Hi Russell,
> 
> Thanks for the confirmation.
> 
> I'll take a look at the riak_api. I suspect the protobuf messages would also 
> need to be changed to include a token/correlation id in order to allow async 
> request/replies.
> 
> Regards,
> 
> Paulo
> 
> On Thu, Dec 31, 2015 at 10:16 AM, Russell Brown  > wrote:
> Hi Paulo,
> You’ll need more than client work. If you’re interested in exploring the 
> server side code riak_api (https://github.com/basho/riak_api 
> ) is probably where you want to look.
> 
> I’m happy to help/advise on this if you want to get stuck in :D
> 
> Cheers
> 
> Russell
> 
> On 31 Dec 2015, at 08:47, Paulo Almeida  > wrote:
> 
> > Hi,
> >
> > A poolboy or other connection pool based solution still means 1 open tcp 
> > connection per worker. I'm exploring the option of having a single TCP 
> > connection (actually 1 for each Riak node) and then multiplex concurrent 
> > requests to a DB node in a single TCP connection (think HTTP/2 TCP 
> > connection multiplexing).
> >
> > Thanks.
> >
> > Paulo
> >
> > On Thu, Dec 31, 2015 at 2:59 AM, Bryan Hunt  > > wrote:
> > Paulo,
> >
> > You can find a list of community maintained PBC pooling ‎libraries in the 
> > Erlang sub-section of this page:
> >
> > ‎http://docs.basho.com/riak/latest/dev/using/libraries/#Community-Libraries 
> > 
> >
> > II was under the impression that Riak Erlang client ships with poolboy ‎so 
> > I'm uncertain of the distinction between the different libraries listed.
> >
> > Perhaps someone could comment to clarify?
> >
> > Bryan
> >
> >
> >
> > From: Paulo Almeida
> > Sent: Wednesday, December 30, 2015 10:40 PM
> > To: riak-users@lists.basho.com 
> > Subject: Connection multiplexing in the Erlang client
> >
> > Hi,
> >
> > Does the Erlang Riak client support multiplexing multiple concurrent calls 
> > in a single TCP connection? Specifically when using the PB interface 
> > (riakc_pb_socket:start_link).
> >
> > Thanks.
> >
> > Regards,
> >
> > Paulo
> >
> >
> > ___
> > riak-users mailing list
> > riak-users@lists.basho.com 
> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com 
> > 
> 
> 
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Leveldb segfault during Riak startup

2015-12-31 Thread Matthew Von-Maszewski
I also failed to ask two basic questions:

1.  did this failure start after your upgrade to 2.1.3, or happen prior to 
upgrade also?

2.  did you use a Basho package for Centos 7, or did you build from source code?

Matthew


> On Dec 31, 2015, at 6:06 AM, Antti Kuusela  wrote:
> 
> Hi Luke,
> 
> We erased the btrfs file system, replaced it with xfs on lvm with 
> thinly-provisioned volumes and continued testing with a new database from 
> scratch. The same problem continues, though. From /var/log/messages:
> 
> Dec 31 03:35:31 storage5 riak[66419]: Starting up
> Dec 31 03:35:45 storage5 kernel: traps: beam.smp[66731] general protection 
> ip:7f02280b3f16 sp:7f0197ad8dd0 error:0 in eleveldb.so[7f0228066000+93000]
> Dec 31 03:35:46 storage5 run_erl[66417]: Erlang closed the connection.
> 
> 
> 
> 18.12.2015, 16:45, Luke Bakken kirjoitti:
>> Hi Antti,
>> 
>> Riak is not tested on btrfs and the file system is not officially
>> supported. We recommend ext4 or xfs for Linux. ZFS is an option on
>> Solaris derivatives and FreeBSD.
>> 
>> --
>> Luke Bakken
>> Engineer
>> lbak...@basho.com
>> 
>> 
>> On Fri, Dec 18, 2015 at 6:14 AM, Antti Kuusela
>>  wrote:
>>> Hi,
>>> 
>>> I have been testing Riak and Riak CS as a possible solution for our future
>>> storage needs. I have a five server cluster running Centos 7. Riak version
>>> is 2.1.3 (first installed as 2.1.1, updated twice via Basho repo) and Riak
>>> CS version is 2.1.0. The servers each have 64GB RAM and six 4TB disks in
>>> raid 6 using btrfs.
>>> 
>>> I have been pushing random data into Riak-CS via s3cmd to see how the system
>>> behaves. Smallest objects have been 2000 bytes, largest 100MB. I have also
>>> been making btrfs snapshots of the entire platform data dir nightly for
>>> backup purposes. Stop Riak CS, wait 10 seconds, stop Riak, wait 10, make
>>> snapshot, start Riak, wait 180 seconds, start Riak CS. This is performed on
>>> each of the servers in turn with a five minute wait in between. I have added
>>> the waits to try spread the startup load and allow the system time to get
>>> things running. New data is constantly pushed to the S3 API but restarting
>>> the nodes in rotation causes by far the highest stress on the system.
>>> 
>>> I have encountered one problem in particular. Quite often one of the Riak
>>> nodes starts up but after a couple of minutes it just drops, all processes
>>> exited except for epmd.
>>> 
>>> Following is from /var/log/riak/console, most of the lines skipped for sake
>>> of brevity. Normal startup stuff, as far as I can see:
>>> 
>>> 2015-12-16 00:26:04.446 [info] <0.7.0> Application lager started on node
>>> 'riak@192.168.50.32'
>>> ...
>>> 2015-12-16 00:26:04.490 [info] <0.72.0> alarm_handler:
>>> {set,{system_memory_high_watermark,[]}}
>>> ...
>>> 2015-12-16 00:26:04.781 [info]
>>> <0.206.0>@riak_core_capability:process_capability_changes:555 New
>>> capability: {riak_core,vnode_routing} = proxy
>>> ...
>>> 2015-12-16 00:26:04.869 [info] <0.7.0> Application riak_core started on node
>>> 'riak@192.168.50.32'
>>> ...
>>> 2015-12-16 00:26:04.969 [info] <0.407.0>@riak_kv_env:doc_env:46 Environment
>>> and OS variables:
>>> 2015-12-16 00:26:05.124 [warning] <0.6.0> lager_error_logger_h dropped 9
>>> messages in the last second that exceeded the limit of 100 messages/sec
>>> 2015-12-16 00:26:05.124 [info] <0.407.0> riak_kv_env: Open file limit: 65536
>>> 2015-12-16 00:26:05.124 [warning] <0.407.0> riak_kv_env: Cores are disabled,
>>> this may hinder debugging
>>> 2015-12-16 00:26:05.124 [info] <0.407.0> riak_kv_env: Erlang process limit:
>>> 262144
>>> 2015-12-16 00:26:05.125 [info] <0.407.0> riak_kv_env: Erlang ports limit:
>>> 65536
>>> 2015-12-16 00:26:05.125 [info] <0.407.0> riak_kv_env: ETS table count limit:
>>> 256000
>>> 2015-12-16 00:26:05.125 [info] <0.407.0> riak_kv_env: Thread pool size: 64
>>> 2015-12-16 00:26:05.125 [info] <0.407.0> riak_kv_env: Generations before
>>> full sweep: 0
>>> 2015-12-16 00:26:05.125 [info] <0.407.0> riak_kv_env: Schedulers: 12 for 12
>>> cores
>>> 2015-12-16 00:26:05.125 [info] <0.407.0> riak_kv_env: sysctl vm.swappiness
>>> is 0 greater than or equal to 0)
>>> 2015-12-16 00:26:05.125 [info] <0.407.0> riak_kv_env: sysctl
>>> net.core.wmem_default is 8388608 lesser than or equal to 8388608)
>>> ...
>>> 2015-12-16 00:26:05.139 [info] <0.478.0>@riak_core:wait_for_service:504
>>> Waiting for service riak_kv to start (0 seconds)
>>> 2015-12-16 00:26:05.158 [info]
>>> <0.495.0>@riak_kv_entropy_manager:set_aae_throttle_limits:790 Setting AAE
>>> throttle limits: [{-1,0},{200,10},{500,50},{750,250},{900,1000},{1100,5000}]
>>> ...
>>> 2015-12-16 00:26:30.160 [info]
>>> <0.495.0>@riak_kv_entropy_manager:perhaps_log_throttle_change:853 Changing
>>> AAE throttle from undefined -> 5000 msec/key, based on maximum vnode mailbox
>>> size {unknown_mailbox_sizes,node_list,['riak@192.168.50.32']} from
>>> ['riak@192.168.50.32']
>>> 2015-12-16 00:27:12.053 [info] <0.478.0>@riak_core:wait_for_serv

Re: Leveldb segfault during Riak startup

2015-12-31 Thread Matthew Von-Maszewski
Would you forward the riak.conf for your setup and paste the first 30 lines 
from a LOG file from any leveldb vnode on a server that has experienced the 
crash?  Example lines from a LOG file are:

2015/12/27-11:56:53.559717 7f02a47b0700Version: 2.0.10
2015/12/27-11:56:53.559850 7f02a47b0700 Options.comparator: 
leveldb.InternalKeyComparator
2015/12/27-11:56:53.559862 7f02a47b0700  Options.create_if_missing: 1
2015/12/27-11:56:53.559869 7f02a47b0700Options.error_if_exists: 0
2015/12/27-11:56:53.559876 7f02a47b0700Options.paranoid_checks: 0
2015/12/27-11:56:53.559882 7f02a47b0700 Options.verify_compactions: 1
2015/12/27-11:56:53.559889 7f02a47b0700Options.env: 
0x7f03b40026e0
2015/12/27-11:56:53.559900 7f02a47b0700   Options.info_log: 
0x7f0278002830
2015/12/27-11:56:53.559912 7f02a47b0700  Options.write_buffer_size: 35628845
2015/12/27-11:56:53.559925 7f02a47b0700 Options.max_open_files: 1000
2015/12/27-11:56:53.559933 7f02a47b0700Options.block_cache: 
0x7f0278001840
2015/12/27-11:56:53.559940 7f02a47b0700 Options.block_size: 4096
2015/12/27-11:56:53.559947 7f02a47b0700   Options.block_size_steps: 16
2015/12/27-11:56:53.559953 7f02a47b0700 Options.block_restart_interval: 16
2015/12/27-11:56:53.559959 7f02a47b0700Options.compression: 1
2015/12/27-11:56:53.559971 7f02a47b0700  Options.filter_policy: 
leveldb.BuiltinBloomFilter2
2015/12/27-11:56:53.559982 7f02a47b0700  Options.is_repair: false
2015/12/27-11:56:53.559993 7f02a47b0700 Options.is_internal_db: false
2015/12/27-11:56:53.560005 7f02a47b0700  Options.total_leveldb_mem: 
23595048960
2015/12/27-11:56:53.560015 7f02a47b0700  Options.block_cache_threshold: 33554432
2015/12/27-11:56:53.560022 7f02a47b0700  Options.limited_developer_mem: false
2015/12/27-11:56:53.560029 7f02a47b0700  Options.mmap_size: 0
2015/12/27-11:56:53.560035 7f02a47b0700   Options.delete_threshold: 1000
2015/12/27-11:56:53.560043 7f02a47b0700   Options.fadvise_willneed: false
2015/12/27-11:56:53.560056 7f02a47b0700  Options.tiered_slow_level: 0
2015/12/27-11:56:53.560066 7f02a47b0700 Options.tiered_fast_prefix: 
/var/db/riak/data/leveldb/0
2015/12/27-11:56:53.560079 7f02a47b0700 Options.tiered_slow_prefix: 
/var/db/riak/data/leveldb/0
2015/12/27-11:56:53.560090 7f02a47b0700 crc32c: hardware
2015/12/27-11:56:53.560098 7f02a47b0700File cache size: 
4621898294
2015/12/27-11:56:53.560105 7f02a47b0700   Block cache size: 
329028150

Thanks,
Matthew

  Happy New Year.

> On Dec 31, 2015, at 6:06 AM, Antti Kuusela  wrote:
> 
> Hi Luke,
> 
> We erased the btrfs file system, replaced it with xfs on lvm with 
> thinly-provisioned volumes and continued testing with a new database from 
> scratch. The same problem continues, though. From /var/log/messages:
> 
> Dec 31 03:35:31 storage5 riak[66419]: Starting up
> Dec 31 03:35:45 storage5 kernel: traps: beam.smp[66731] general protection 
> ip:7f02280b3f16 sp:7f0197ad8dd0 error:0 in eleveldb.so[7f0228066000+93000]
> Dec 31 03:35:46 storage5 run_erl[66417]: Erlang closed the connection.
> 
> 
> 
> 18.12.2015, 16:45, Luke Bakken kirjoitti:
>> Hi Antti,
>> 
>> Riak is not tested on btrfs and the file system is not officially
>> supported. We recommend ext4 or xfs for Linux. ZFS is an option on
>> Solaris derivatives and FreeBSD.
>> 
>> --
>> Luke Bakken
>> Engineer
>> lbak...@basho.com
>> 
>> 
>> On Fri, Dec 18, 2015 at 6:14 AM, Antti Kuusela
>>  wrote:
>>> Hi,
>>> 
>>> I have been testing Riak and Riak CS as a possible solution for our future
>>> storage needs. I have a five server cluster running Centos 7. Riak version
>>> is 2.1.3 (first installed as 2.1.1, updated twice via Basho repo) and Riak
>>> CS version is 2.1.0. The servers each have 64GB RAM and six 4TB disks in
>>> raid 6 using btrfs.
>>> 
>>> I have been pushing random data into Riak-CS via s3cmd to see how the system
>>> behaves. Smallest objects have been 2000 bytes, largest 100MB. I have also
>>> been making btrfs snapshots of the entire platform data dir nightly for
>>> backup purposes. Stop Riak CS, wait 10 seconds, stop Riak, wait 10, make
>>> snapshot, start Riak, wait 180 seconds, start Riak CS. This is performed on
>>> each of the servers in turn with a five minute wait in between. I have added
>>> the waits to try spread the startup load and allow the system time to get
>>> things running. New data is constantly pushed to the S3 API but restarting
>>> the nodes in rotation causes by far the highest stress on the system.
>>> 
>>> I have encountered one problem in particular. Quite often one of the Riak
>>> nodes starts up but after a couple of minutes it just drops, all processes
>>> exited except for epmd.
>>> 
>>> Following is from /var/log/riak/console, most of the lines skipped for sake
>>> of brevity. Normal startup stuf

Re: Latency of Riak KV drops with small size server goes long

2015-11-08 Thread Matthew Von-Maszewski
Try a benchmark concurrency of 2 times the ring size.

Long-term growth in max latency could be due to disk overhead.
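
One quick way to check the disk side (assumes the sysstat package is
installed):

    iostat -x 5    # watch %util and await on the leveldb volume during the test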

Sent from my iPhone

> On Nov 8, 2015, at 5:51 PM, mtakahashi-ivi  
> wrote:
> 
> Could someone advise me?
> 
> I changed concurrency of benchmark from 300 to 16.
> 
>  
> 
> It seems 95th percentile latency is stable. But max latency is going up as
> time goes on.
> Should I use small concurrency with small server?
> Is there any way to suppress max latency?
> 
> 
> 
> --
> View this message in context: 
> http://riak-users.197444.n3.nabble.com/Latency-of-Riak-KV-drops-with-small-size-server-goes-long-tp4033651p4033669.html
> Sent from the Riak Users mailing list archive at Nabble.com.
> 
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Ownership handoff timed out

2015-10-29 Thread Matthew Von-Maszewski
I queried Basho’s Client Services team.  They tell me the upgrade / coexist 
should be no problem.

Matthew

> On Oct 29, 2015, at 1:38 PM, Vladyslav Zakhozhai 
>  wrote:
> 
> Matthew can you describe the bug more detail?
> 
> My plan was to migrate to eleveldb and only then to migrate to Riak 2.0. It 
> seems that I need to change my plans to migrate to Riak 2.0 first. It is sad.
> 
> Is it safe to migrate Riak 1.4.12/Riak CS 1.5.0 to Riak 2.0 on production 
> environment? According to official upgrade guides I can upgrade nodes one by 
> one in the same cluster. So Riak 2.0 and Riak 1.4.12 nodes can coexist in one 
> cluster. Am I right?
> 
> Thank you.
> 
> On Thu, Oct 29, 2015 at 7:04 PM Matthew Von-Maszewski  <mailto:matth...@basho.com>> wrote:
> Sad to say your LOG files suggest the same bug as seen elsewhere and fixed by 
> recent changes in the leveldb code.
> 
> The tougher issue is that the fixes are currently only available for our 2.0 
> product series.  A backport would be non-trivial due to the number of places 
> changed between 1.4 and 2.0 and the number of places the fix overlaps those 
> changes.  The corrected code is tagged “2.0.9” in eleveldb and leveldb.
> 
> The only path readily available to you is to have your receiving cluster 
> upgraded to 2.0 Riak CS and manually build/patch eleveldb to the 2.0.9 
> version. Then start your handoffs.   (eleveldb version 2.0.9 is not present 
> in any shipping version of Riak … yet). 
> 
> I will write again if I can think of an easier solution.  But nothing is 
> occurring to me or the team members I have queried.
> 
> Matthew
> 
> 
>> On Oct 29, 2015, at 12:14 PM, Vladyslav Zakhozhai 
>> mailto:v.zakhoz...@smartweb.com.ua>> wrote:
>> 
> 
>> Hi,
>> 
>> Matthew thank for you answer. eleveldb LOGs are attached.
>> Here is LOGs from 2 eleveldb nodes (eggeater was not restarted; what about 
>> rattlesnake I'm not sure).
>> 
>> On Thu, Oct 29, 2015 at 5:24 PM Matthew Von-Maszewski > <mailto:matth...@basho.com>> wrote:
>> Hi,
>> 
>> There was a known eleveldb bug with handoff receiving that could cause a 
>> timeout.  But it does not sound like bug fits your symptoms.  However, I am 
>> willing to verify my diagnosis.  I would need you to gather the LOG files 
>> from all vnodes on the RECEIVING side (or at least from the vnode that you 
>> are attempting and is failing).
>> 
>> I will check it for the symptoms of the known bug.
>> 
>> Note:  the LOG files reset on each restart of Riak.  So you must gather the 
>> LOG files right after the failure without restarting Riak.
>> 
>> Matthew
>> 
>> 
>>> On Oct 29, 2015, at 11:11 AM, Vladyslav Zakhozhai 
>>> mailto:v.zakhoz...@smartweb.com.ua>> wrote:
>>> 
>>> Hi,
>>> 
>>> I want to make small update. Jon your hint about problems on sender side is 
>>> correct. As I've already told there problems with available resources on 
>>> sender nodes. There are no enough available RAM which is a cause of 
>>> swapiness and load on disks. Restarting of sender nodes helps me (at least 
>>> temoprarily).
>>> 
>>> 
>>> On Thu, Oct 29, 2015 at 12:19 PM Vladyslav Zakhozhai 
>>> mailto:v.zakhoz...@smartweb.com.ua>> wrote:
>>> Hi,
>>> 
>>> Average size of objects in Riak - 300 Kb. This objects are images. This 
>>> data updates very very rearly (there almost no updates).
>>> 
>>> I have GC turned on and works:
>>> root@python:~# riak-cs-gc status
>>> There is no garbage collection in progress
>>>   The current garbage collection interval is: 900
>>>   The current garbage collection leeway time is: 86400
>>>   Last run started at: 20151029T100600Z
>>>   Next run scheduled for: 20151029T102100Z
>>> 
>>> Network misconfigurations were not detected. The result of your script 
>>> shows correct info.
>>> 
>>> But I see that almost all nodes with bitcask suffers from low free memory 
>>> and they swapped. I think that it can be an issue. But my question is, what 
>>> workaround is for this problem.
>>> 
>>> I've wrote in my first post that I tuned handoff_timeout and 
>>> handoff_receive_timeout (now this vaules are 30 and 60). But 
>>> situation is the same.
>>> 
>>> 
>>> On Tue, Oct 27, 2015 at 4:06 PM Jon Meredith >> <mailto:jmered...@basho.com>> wrote:
>>> Hi,
>>> 
>>> Handoff problems without obvious disk is

Re: Ownership handoff timed out

2015-10-29 Thread Matthew Von-Maszewski
Sad to say your LOG files suggest the same bug as seen elsewhere and fixed by 
recent changes in the leveldb code.

The tougher issue is that the fixes are currently only available for our 2.0 
product series.  A backport would be non-trivial due to the number of places 
changed between 1.4 and 2.0 and the number of places the fix overlaps those 
changes.  The corrected code is tagged “2.0.9” in eleveldb and leveldb.

The only path readily available to you is to have your receiving cluster 
upgraded to 2.0 Riak CS and manually build/patch eleveldb to the 2.0.9 version. 
Then start your handoffs.   (eleveldb version 2.0.9 is not present in any 
shipping version of Riak … yet). 
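
If you do go that route, the rough shape is something like the following (an
untested sketch; verify the paths against your actual package layout before
touching a production node):

    git clone https://github.com/basho/eleveldb.git
    cd eleveldb && git checkout 2.0.9
    make    # needs the same Erlang/OTP your Riak package was built with

    # then, with the node stopped, copy the freshly built ebin/ and priv/
    # over the packaged eleveldb under .../riak/lib/eleveldb-*/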

I will write again if I can think of an easier solution.  But nothing is 
occurring to me or the team members I have queried.

Matthew

> On Oct 29, 2015, at 12:14 PM, Vladyslav Zakhozhai 
>  wrote:
> 
> Hi,
> 
> Matthew thank for you answer. eleveldb LOGs are attached.
> Here is LOGs from 2 eleveldb nodes (eggeater was not restarted; what about 
> rattlesnake I'm not sure).
> 
> On Thu, Oct 29, 2015 at 5:24 PM Matthew Von-Maszewski  <mailto:matth...@basho.com>> wrote:
> Hi,
> 
> There was a known eleveldb bug with handoff receiving that could cause a 
> timeout.  But it does not sound like bug fits your symptoms.  However, I am 
> willing to verify my diagnosis.  I would need you to gather the LOG files 
> from all vnodes on the RECEIVING side (or at least from the vnode that you 
> are attempting and is failing).
> 
> I will check it for the symptoms of the known bug.
> 
> Note:  the LOG files reset on each restart of Riak.  So you must gather the 
> LOG files right after the failure without restarting Riak.
> 
> Matthew
> 
> 
>> On Oct 29, 2015, at 11:11 AM, Vladyslav Zakhozhai 
>> mailto:v.zakhoz...@smartweb.com.ua>> wrote:
>> 
>> Hi,
>> 
>> I want to make small update. Jon your hint about problems on sender side is 
>> correct. As I've already told there problems with available resources on 
>> sender nodes. There are no enough available RAM which is a cause of 
>> swapiness and load on disks. Restarting of sender nodes helps me (at least 
>> temoprarily).
>> 
>> 
>> On Thu, Oct 29, 2015 at 12:19 PM Vladyslav Zakhozhai 
>> mailto:v.zakhoz...@smartweb.com.ua>> wrote:
>> Hi,
>> 
>> Average size of objects in Riak - 300 Kb. This objects are images. This data 
>> updates very very rearly (there almost no updates).
>> 
>> I have GC turned on and works:
>> root@python:~# riak-cs-gc status
>> There is no garbage collection in progress
>>   The current garbage collection interval is: 900
>>   The current garbage collection leeway time is: 86400
>>   Last run started at: 20151029T100600Z
>>   Next run scheduled for: 20151029T102100Z
>> 
>> Network misconfigurations were not detected. The result of your script shows 
>> correct info.
>> 
>> But I see that almost all nodes with bitcask suffers from low free memory 
>> and they swapped. I think that it can be an issue. But my question is, what 
>> workaround is for this problem.
>> 
>> I've wrote in my first post that I tuned handoff_timeout and 
>> handoff_receive_timeout (now this vaules are 30 and 60). But 
>> situation is the same.
>> 
>> 
>> On Tue, Oct 27, 2015 at 4:06 PM Jon Meredith > <mailto:jmered...@basho.com>> wrote:
>> Hi,
>> 
>> Handoff problems without obvious disk issues can be due to the database 
>> containing large objects.  Do you frequently update objects in CS, and if so 
>> have you had garbage collection running?
>> 
>> The timeout is happening on the receiver side after not receiving any tcp 
>> data for handoff_receive_timeout *milli*seconds.  I know you said you 
>> increased it, but not how high.  I would bump that up to 30 to give the 
>> sender a chance to read larger objects off disk.
>> 
>> To check if the sender is transmitting, on the source node you could run
>>   redbug:start("riak_core_handoff_sender:visit_item", [{arity, 
>> true},{print_file,"/tmp/visit_item.log"},{time, 360},{msgs, 100}]).
>> 
>> That file should fill fairly fast with an entry for every object the sender 
>> tries to transmit.
>> 
>> There's a long shot it could be network misconfiguration. Run this from the 
>> source node having problems 
>> 
>> rpc:multicall(erlang, apply, [fun() -> TargetNode = node(), [_Name,Host] = 
>> string:tokens(atom_to_list(TargetNode), "@"), {ok, Port} = 
>> riak_core_gen_server:call({riak_core

Re: Ownership handoff timed out

2015-10-29 Thread Matthew Von-Maszewski
Hi,

There was a known eleveldb bug with handoff receiving that could cause a 
timeout.  But it does not sound like bug fits your symptoms.  However, I am 
willing to verify my diagnosis.  I would need you to gather the LOG files from 
all vnodes on the RECEIVING side (or at least from the vnode that you are 
attempting and is failing).

I will check it for the symptoms of the known bug.

Note:  the LOG files reset on each restart of Riak.  So you must gather the LOG 
files right after the failure without restarting Riak.
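
One way to capture them right after the failure, before any restart (default
data path assumed):

    tar czf /tmp/$(hostname)-leveldb-LOGs.tar.gz /var/lib/riak/data/leveldb/*/LOG*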

Matthew


> On Oct 29, 2015, at 11:11 AM, Vladyslav Zakhozhai 
>  wrote:
> 
> Hi,
> 
> I want to make small update. Jon your hint about problems on sender side is 
> correct. As I've already told there problems with available resources on 
> sender nodes. There are no enough available RAM which is a cause of swapiness 
> and load on disks. Restarting of sender nodes helps me (at least temoprarily).
> 
> 
> On Thu, Oct 29, 2015 at 12:19 PM Vladyslav Zakhozhai 
> mailto:v.zakhoz...@smartweb.com.ua>> wrote:
> Hi,
> 
> Average size of objects in Riak - 300 Kb. This objects are images. This data 
> updates very very rearly (there almost no updates).
> 
> I have GC turned on and works:
> root@python:~# riak-cs-gc status
> There is no garbage collection in progress
>   The current garbage collection interval is: 900
>   The current garbage collection leeway time is: 86400
>   Last run started at: 20151029T100600Z
>   Next run scheduled for: 20151029T102100Z
> 
> Network misconfigurations were not detected. The result of your script shows 
> correct info.
> 
> But I see that almost all nodes with bitcask suffers from low free memory and 
> they swapped. I think that it can be an issue. But my question is, what 
> workaround is for this problem.
> 
> I've wrote in my first post that I tuned handoff_timeout and 
> handoff_receive_timeout (now this vaules are 30 and 60). But 
> situation is the same.
> 
> 
> On Tue, Oct 27, 2015 at 4:06 PM Jon Meredith  > wrote:
> Hi,
> 
> Handoff problems without obvious disk issues can be due to the database 
> containing large objects.  Do you frequently update objects in CS, and if so 
> have you had garbage collection running?
> 
> The timeout is happening on the receiver side after not receiving any tcp 
> data for handoff_receive_timeout *milli*seconds.  I know you said you 
> increased it, but not how high.  I would bump that up to 30 to give the 
> sender a chance to read larger objects off disk.
> 
> To check if the sender is transmitting, on the source node you could run
>   redbug:start("riak_core_handoff_sender:visit_item", [{arity, 
> true},{print_file,"/tmp/visit_item.log"},{time, 360},{msgs, 100}]).
> 
> That file should fill fairly fast with an entry for every object the sender 
> tries to transmit.
> 
> There's a long shot it could be network misconfiguration. Run this from the 
> source node having problems 
> 
> rpc:multicall(erlang, apply, [fun() -> TargetNode = node(), [_Name,Host] = 
> string:tokens(atom_to_list(TargetNode), "@"), {ok, Port} = 
> riak_core_gen_server:call({riak_core_handoff_listener, TargetNode}, 
> handoff_port), HandoffIP = riak_core_handoff_listener:get_handoff_ip(), 
> TNHandoffIP = case HandoffIP of error -> Host; {ok, "0.0.0.0"} -> Host; {ok, 
> Other} -> Other end, {node(), HandoffIP, TNHandoffIP, 
> inet:gethostbyname(TNHandoffIP), Port} end, []]).
> 
> and it will print out a a list of remote nodes and IP addresses (and 
> hopefully an empty list of failed nodes)
> 
> {[{'dev1@127.0.0.1 ',  < node name
>   {ok,"0.0.0.0"}, < handoff ip address configured in 
> app.config
>   "127.0.0.1",< hostname passed to socket open
>   {ok,{hostent,"127.0.0.1",[],inet,4,[{127,0,0,1}]}}, <--- DNS entry for 
> hostname
>   10019}],< handoff port
>  []} <--- empty list of errors
> 
> Good luck, Jon.
> 
> On Tue, Oct 27, 2015 at 3:55 AM Vladyslav Zakhozhai 
> mailto:v.zakhoz...@smartweb.com.ua>> wrote:
> Hi,
> 
> Jon thank you for the answer. During approval of my mail to this list I've 
> troubleshoot my issue more deep. And yes, your are right. Neither {error, 
> enotconn} nor max_concurrency is my problem.
> 
> I'm going to migrate my cluster entierly to eleveldb only, i.e. I need to 
> refuse using bitcask. I have a talk with basho support and they said that it 
> is tricky to tune bitcask on servers with 32 GB RAM (and I guess that it is 
> not tricky, but it is impossible, because bitcask loads all keys in memory 
> regardless of free available RAM). With LevelDB I have opportunity to tune 
> using RAM on servers.
> 
> So I have 15 nodes with multibackend (bitcask for data and leveldb for 
> metadata). 2 additional servers are without multibackend - only with leveldb. 
> Now I'm not sure do I need still use mutibackend with levedb-only backend.
> 
> And my problem is (as I mentioned earli

Re: Limiting LevelDB memory in Riak 1.4

2015-10-08 Thread Matthew Von-Maszewski
See this:

https://docs.basho.com/riak/1.4.7/ops/advanced/backends/leveldb/

It is the 1.4 docs relating to memory management for leveldb.

The basic answer is that you multiply max_open_files by 4 megabytes by the 
number of vnodes on the server.
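
As a quick worked example (the numbers are hypothetical): a node hosting 16
vnodes with max_open_files = 50 works out to

    50 * 4 MB * 16 vnodes = 3,200 MB, i.e. roughly 3.2 GB of leveldb file cache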

Matthew


> On Oct 8, 2015, at 1:32 PM, Ricardo Mayerhofer  wrote:
> 
> Hi all,
> I have a multi backend strategy in my Riak cluster, where LevelDB bucket 
> stores historical data whereas Bitcask bucket stores live data. Given the 
> purpose of LevelDB bucket I don't want it to take much memory and use most o 
> the memory to Bitcask.
> 
> I know in 2.0 there is "leveldb.maximum_memory.percent" to limit LevelDB 
> memory use. Is there a way to achieve the same thing in Riak 1.4?
> 
> Thanks.
> 
> -- 
> Ricardo Mayerhofer
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: riak 2.1.1 : Erlang crash dump

2015-10-03 Thread Matthew Von-Maszewski
Girish,

This feels like a sibling explosion to me.  I cannot help prove or fix it.  
Writing this paragraph as bait for others to help.
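
One quick way to see whether siblings are growing (these stats come straight
out of riak-admin status, so no extra tooling is needed):

    riak-admin status | egrep 'node_get_fsm_siblings|node_get_fsm_objsize'

If the _95 and _100 sibling and object size numbers keep climbing over time,
that points at an update pattern creating siblings faster than they are
resolved.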

Matthew

Sent from my iPad

> On Oct 3, 2015, at 8:34 PM, Girish Shankarraman  
> wrote:
> 
> Thank you for the response, Jon.
> 
> So I changed it to 50% and it crashed again.
> I have a 5 nodes cluster with 60GB RAM on each node. Ring size is set to 64. 
> (Attached riak conf if any one has some ideas).
> 
> I still see the erlang process consuming the entire capacity of the system 
> (52 GB).
> 
>   PID USER  PR  NIVIRTRESSHR S  %CPU %MEM TIME+ COMMAND
> 24256 riak  20   0 67.134g 0.052t  18740 D   0.0 90.0   2772:44 beam.smp
> 
>  Cluster Status 
> Ring ready: true
> 
> ++--+---+-+---+
> |node|status| avail |ring |pending|
> ++--+---+-+---+
> | (C) riak@20.0.0.11 |valid |  up   | 20.3|  --   |
> | riak@20.0.0.12 |valid |  up   | 20.3|  --   |
> | riak@20.0.0.13 |valid |  up   | 20.3|  --   |
> | riak@20.0.0.14 |valid |  up   | 20.3|  --   |
> | riak@20.0.0.15 |valid |  up   | 18.8|  --   |
> 
> Thanks,
> 
> — Girish Shankarraman
> 
> 
> From: Jon Meredith 
> Date: Thursday, October 1, 2015 at 2:06 PM
> To: girish shankarraman , 
> "riak-users@lists.basho.com" 
> Subject: Re: riak 2.1.1 : Erlang crash dump
> 
> It looks like Riak was unable to allocate 4Gb of memory.  You may have to 
> reduce the amount of memory allocated for leveldb from the default 70%, try 
> setting this in your /etc/riak/riak.conf file
> 
> leveldb.maximum_memory.percent = 50
> 
> The memory footprint for Riak should stabilize after a few hours and on 
> servers with smaller amounts of memory, the 30% left over may not be enough.
> 
> Please let us know how you get on.
> 
>> On Wed, Sep 30, 2015 at 5:31 PM Girish Shankarraman 
>>  wrote:
>> I have 7 node cluster for riak with a ring_size of 128.
>> 
>> System Details:
>> Each node is a VM with 16GB of memory.
>> The backend is using leveldb.
>> sys_system_architecture : <<"x86_64-unknown-linux-gnu">>
>> sys_system_version : <<"Erlang R16B02_basho8 (erts-5.10.3) [source] [64-bit] 
>> [smp:4:4] [async-threads:64] [kernel-poll:true] [frame-pointer]">>
>> riak_control_version : <<"2.1.1-0-g5898c40">>
>> cluster_info_version : <<"2.0.2-0-ge231144">>
>> yokozuna_version : <<"2.1.0-0-gcb41c27”>>
>> 
>> Scenario:
>> We have up to 400-1000 json records being written/sec. Each record might be 
>> a few 100 bytes.
>> I see the following crash message in the erlang logs after a few hours of 
>> processing. Any suggestions on what could be going on here ?
>> 
>> = Tue Sep 29 20:20:56 UTC 2015
>> [os_mon] memory supervisor port (memsup): Erlang has closed^M
>> [os_mon] cpu supervisor port (cpu_sup): Erlang has closed^M
>> ^M
>> Crash dump was written to: /var/log/riak/erl_crash.dump^M
>> eheap_alloc: Cannot allocate 3936326656 bytes of memory (of type "heap").^M
>> 
>> Also tested running this at 50GB per Riak Node(VM) and things work but 
>> memory keeps growing, so throwing hardware at it doesn’t seem very scalable.
>> 
>> Thanks,
>> 
>> — Girish Shankarraman
>> 
>> ___
>> riak-users mailing list
>> riak-users@lists.basho.com
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> 
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: riak handoff of partition error

2015-08-17 Thread Matthew Von-Maszewski
Amao Wang,

Would you send the riak-diag from the machine receiving the handoff, 
10.21.136.93?

Thank you,
Matthew

> On Jul 30, 2015, at 5:35 AM, changmao wang  wrote:
> 
> Hi Riak-users group,
> 
> I have found some errors related to handoff of partition in 
> /etc/riak/log/errors.
> Details are as below:
> 
> 2015-07-30 16:04:33.643 [error] 
> <0.12872.15>@riak_core_handoff_sender:start_fold:262 ownership_transfer 
> transfer of riak_kv_vnode from 'riak@10.21.136.76 ' 
> 45671926166590716193865151022383844364247891968 to 'riak@10.21.136.93 
> ' 45671926166590716193865151022383844364247891968 
> failed because of enotconn
> 2015-07-30 16:04:33.643 [error] 
> <0.197.0>@riak_core_handoff_manager:handle_info:289 An outbound handoff of 
> partition riak_kv_vnode 45671926166590716193865151022383844364247891968 was 
> terminated for reason: {shutdown,{error,enotconn}}
>   
>  
> 
> I have searched it with google and found related articles. However, there's 
> no solution.
> http://lists.basho.com/pipermail/riak-users_lists.basho.com/2014-October/016052.html
>  
> 
> 
> Attached is my riak-diag command log.
> 
> Could you take a look at it?
> -- 
> Amao Wang
> Best & Regards
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Monitoring protobuffs and CPU usage on a Riak KV cluster

2015-08-06 Thread Matthew Von-Maszewski
Two ideas about lots of cpu with nothing happening:

- Is the Riak anti_entropy setting "active"?  Weekly it will rebuild hash trees.

- Does this activity occur after a handoff?  Sometimes leveldb needs time to 
catch up afterward in current releases (fix coming soon).
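
A couple of quick checks that often explain "idle" CPU (run on any node):

    riak-admin aae-status    # when AAE hash trees were last built / exchanged
    riak-admin transfers     # any handoffs currently in flight
    riak-admin status | egrep 'pbc_active|pbc_connects|node_gets|node_puts'

The last line is not an access log, but the pbc_* and node_* counters at least
show whether protocol buffer traffic is arriving while the CPU is busy.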

Matthew

Sent from my iPad

> On Aug 5, 2015, at 7:16 PM, Matthew Brender  wrote:
> 
> Hey all, I'm practicing what I preach and asking a question to others
> here. A community member pinged me with a couple questions:
> 
> "I'm looking into an issue with Riak on our test environment (we
> basically let devs spin up a copy of our prod stack on a giant box in
> ec2).
> 
> We've gotten some complaints that riak seems to be using a lot of CPU,
> even when "nothing" is happening (something is always happening with
> the # of apps and services on the box, even when no one is poking it).
> 
> Is there a way to turn on an "access" log, so I could see what
> requests are being made against riak, to try and correlate with? It
> would have to be able to tell me about pbuffs usage, not just http."
> 
> Could anyone help with these questions. I see a couple separate ones:
> 
> 1. How to monitor what is causing higher CPU usage on a Riak cluster
> 2. How to view an access log of protobuff and http requests, or just
> the *most helpful* counters for them
> 
> I know a ton of statistics are visible [1] through riak-admin, but I'd
> like to hear what others recommend in particular.
> 
> [1] http://docs.basho.com/riak/latest/ops/running/stats-and-monitoring/
> 
> Thanks,
> 
> Matt Brender | Developer Advocacy Lead
> Basho Technologies
> t: @mjbrender
> 
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: LevelDB Question

2015-08-05 Thread Matthew Von-Maszewski
All,

The “Implementation Details” section of the referenced webpage is about 2 years 
behind Basho’s changes to leveldb.  I already have the action item to rewrite 
it.  Apologies for any confusion.  The tiered storage section has an accurate, 
cumulative table.

Matthew


> On Aug 5, 2015, at 10:04 AM, xia...@xiaclo.net wrote:
> 
> From the Riak docs 
> (http://docs.basho.com/riak/latest/ops/advanced/backends/leveldb/#Tiered-Storage),
>  under "selecting a level", it appears ‎the 10x rule doesn't apply to the 
> lower levels, and oddly looks more like 20x for the higher ones.  Keep in 
> mind that data can exist in multiple levels at the same time and there is 
> overhead for metadata and headers.  Also, having AAE enabled greatly 
> increases those storage sizes.
> 
> I would judge the ideal level based on that table, as well as the maximum 
> number of vnodes you will have on a single node since those numbers are for a 
> single vnode.
> 
> I hope this helps.
> 
> PS. level 6 is unlimited, and only bound by the amount of data you store in 
> Riak.
> 
> From: Joe Olson
> Sent: Wednesday, 5 August 2015 23:33
> To: riak-users@lists.basho.com
> Subject: LevelDB Question
> 
> 
> Suppose I have come to the conclusion each of my LevelDB backed Riak nodes 
> needs to hold 2TB of data.
> 
> Also suppose I have the ability to store data on more expensive SSD drives, 
> and less expensive magnetic drives.
> 
> My question: What should leveldb.tiered be set to in /etc/riak/riak.conf?
> 
> I know from the LevelDB docs on the Basho Riak site 
> (http://docs.basho.com/riak/latest/ops/advanced/backends/leveldb/), each 
> level holds 10x the data the level above it holds, starting at level 1 with 
> 10MB.
> 
> If this is correct, the data capacities of each level should be:
> 
> Level 1: 10MB
> Level 2: 100 MB
> Level 3: 1 GB
> Level 4: 10 GB
> Level 5: 100 GB
> Level 6: 1 TB
> 
> If this is this case, I would assume using leveldb.tiered = 5 (100GB) or 
> leveldb.tiered = 6 (1 TB) of SSD capacity needed. The remainder (either 1.9 
> TB or 1 TB) will be stored on the magnetic drives on a different mount point.
> 
> Am I reasoning this out correctly?
> 
> If so, will my SSD drives ever *exceed* 100GB or 1TB of data? Not that I'd 
> just it that close, anyway
> 
> 
> 
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: LevelDB Question

2015-08-05 Thread Matthew Von-Maszewski
Joe,

You want to start here:


http://docs.basho.com/riak/latest/ops/advanced/backends/leveldb/#Tiered-Storage 


In addition to the 3 configuration lines given in that section, you need a 
fourth:

   leveldb.data_root = "./leveldb"

You need to make sure the leveldb.tiered.path.fast and .path.slow exist before 
starting Riak AND create the ./leveldb directory within the two paths.  Riak 
will not create those paths for you.
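
A minimal sketch of that preparation step (the mount points here are only
illustrative; substitute your own):

    mkdir -p /mnt/ssd/riak/leveldb /mnt/hdd/riak/leveldb
    chown -R riak:riak /mnt/ssd/riak /mnt/hdd/riak

    # and then in riak.conf, something like:
    #   leveldb.data_root = ./leveldb
    #   leveldb.tiered = 5
    #   leveldb.tiered.path.fast = /mnt/ssd/riak
    #   leveldb.tiered.path.slow = /mnt/hdd/riak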

Matthew


> On Aug 5, 2015, at 9:27 AM, Joe Olson  wrote:
> 
> 
> Suppose I have come to the conclusion each of my LevelDB backed Riak nodes 
> needs to hold 2TB of data.
> 
> Also suppose I have the ability to store data on more expensive SSD drives, 
> and less expensive magnetic drives.
> 
> My question: What should leveldb.tiered be set to in /etc/riak/riak.conf?
> 
> I know from the LevelDB docs on the Basho Riak site 
> (http://docs.basho.com/riak/latest/ops/advanced/backends/leveldb/), each 
> level holds 10x the data the level above it holds, starting at level 1 with 
> 10MB.
> 
> If this is correct, the data capacities of each level should be:
> 
> Level 1: 10MB
> Level 2: 100 MB
> Level 3: 1 GB
> Level 4: 10 GB
> Level 5: 100 GB
> Level 6: 1 TB
> 
> If this is this case, I would assume using leveldb.tiered = 5 (100GB) or 
> leveldb.tiered = 6 (1 TB) of SSD capacity needed. The remainder (either 1.9 
> TB or 1 TB) will be stored on the magnetic drives on a different mount point.
> 
> Am I reasoning this out correctly?
> 
> If so, will my SSD drives ever *exceed* 100GB or 1TB of data? Not that I'd 
> just it that close, anyway
> 
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Riak 2.0.2 leveldb uses more disk space than Riak 1.4.x?

2015-07-16 Thread Matthew Von-Maszewski
Wes,

I was away from Basho for five months.  I suspect there was confusion as to 
who should respond and how to respond.

I have two guesses as to your 5% growth:

Guess 1 … more metadata at the block level

2.0 Riak contains a new leveldb feature called dynamic block size, where 
leveldb constructs .sst table files differently to optimize memory usage for 
overall performance.  Each file has an overall index and per-block indexes.  
The new feature shifts more index information into the blocks to intentionally 
reduce the size of the file's overall index.  The overall index must reside 
completely in memory while the file is open.  Shifting index data to the 
blocks allows a greater number of files to be opened simultaneously for a 
given amount of computer memory.  That helps performance because opening a 
file is a huge time cost.

The content changes of both the file level index and the block level indexes 
typically create a net increase in file size, though the changes to your 
compression ratio can go either way.  We would have to look at sample .sst 
files from levels 3 and 4 with the sst_scan tool to make a valid assessment 
(https://github.com/basho/leveldb/blob/develop/tools/sst_scan.cc).

Technical details of dynamic block sizing are here:

 https://github.com/basho/leveldb/wiki/mv-dynamic-block-size


Guess 2 … weaker than Guess 1, but possible

I am going to guess that you are getting more, smaller .sst table files than 
before at levels 0 and 1.  More files means more disk space lost due to the 
difference between the space needed and the whole blocks allocated by the file 
system.  
There can be a slight reduction in compression of file metadata too, but that 
is a questionable contributor.  The impact is limited to levels 0 and 1, but 
that still adds up.

A bug was discovered mid December 2014 and a fix placed on a branch for 
subsequent releases.  The fix too was lost in the above confusion and is just 
now making its way into the 2.0.x and 2.1.x releases.

The missing fix is here:

   https://github.com/basho/leveldb/wiki/mv-sequential-tuning 
<https://github.com/basho/leveldb/wiki/mv-sequential-tuning>


Matthew


> On Jul 16, 2015, at 2:06 PM, Wes Jossey  wrote:
> 
> Nope. Never did. 
> 
> 
> 
>> On Jul 16, 2015, at 13:32, Matthew Von-Maszewski  wrote:
>> 
>> Did you ever get a reply to this query?
>> 
>> Matthew
>> 
>>> On Jan 11, 2015, at 5:48 PM, Weston Jossey  wrote:
>>> 
>>> Hi All,
>>> Just wanted to put out an observation and see if it's either just me, or 
>>> something expected.
>>> 
>>> I've begun updating our large Riak 1.4 cluster to Riak 2.0.  Each cluster 
>>> has 43TB spread evenly over 32 nodes.  The riak 2.0 test nodes, after 
>>> running for 14 days, have on average around 5% more disk usage (in terms of 
>>> size, not IOPS) than the riak 1.4 cluster. Given that the cluster is evenly 
>>> balanced, I'd expect all nodes to be roughly the same size (or at least 
>>> within a point or two).  
>>> 
>>> Is this expected?  Does this have something to do with the dynamic settings 
>>> for the leveldb configuration parameters that is built into Riak 2?
>>> 
>>> The issue isn't a big one.  I'm just curious if this is expected / 
>>> anticipated, as it'll probably be worth noting in the Riak documentation as 
>>> part of the upgrade process.
>>> 
>>> Thanks!
>>> -Wes
>>> ___
>>> riak-users mailing list
>>> riak-users@lists.basho.com
>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>> 

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Riak LevelDB Deletion Problem

2015-07-14 Thread Matthew Von-Maszewski
Antonio,

Someone reminded me that you could make temporary space on your servers by 
deactivating active_anti_entropy, then deleting its data.  Of course, this 
assumes you are running “anti_entropy = active” in your riak.conf file.
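
A rough sketch of what that looks like, one node at a time (the path assumes
the default platform_data_dir; double-check yours first):

    # riak.conf:  anti_entropy = passive
    riak stop
    rm -rf /var/lib/riak/anti_entropy/*    # i.e. $(platform_data_dir)/anti_entropy
    riak start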

I will send you some better notes if you think this is worth researching.  Let 
me know your thoughts.

Matthew



> On Jul 14, 2015, at 4:21 PM, Antonio Teixeira  wrote:
> 
> Ok Matthew,
> 
> We will proceed with the deletion , will monitor the disk space and will come 
> back with further reports to the list.
> 
> Thanks for your time,
> Antonio
> 
> 2015-07-14 18:32 GMT+01:00 Matthew Von-Maszewski  <mailto:matth...@basho.com>>:
> Antonio,
> 
> A Riak delete operation happens in these steps:
> 
> - Riak writes a “tombstone” value for the key to the N vnodes that contain it 
> (this is a new record)
> 
> - Riak by default, waits 3 seconds to verify all vnodes agree to the 
> tombstone/delete 
> 
> - Riak issues an actual delete operation against the key to leveldb
> 
> - leveldb creates its own tombstone
> 
> - the leveldb tombstone “floats” through level-0 and level-1 as part of 
> normal compactions
> 
> - upon reaching level-2, leveldb will initiate immediate compaction and 
> propagation of tombstones in .sst table files containing 1000 or more 
> tombstones.  This is when disk space recovery begins.
> 
> 
> Yes, this means that initially the leveldb vnodes will grow in size until 
> enough stuff (new data and/or tombstones) forces the tombstones to level-2 
> via normal compaction operations.  “Enough stuff” to fill levels 0 and 1 is 
> about 4.2Gbytes of compressed Riak objects.
> 
> The “get” operation you mentioned is something that happens internally.  A 
> manual “get” by your code will not influence the operation.
> 
> Matthew
> 
> 
>> On Jul 14, 2015, at 1:01 PM, Antonio Teixeira > <mailto:eagle.anto...@gmail.com>> wrote:
>> 
>> Hi Matthew,
>> 
>> We will be removing close to 1 TB of data from the node , and since we are 
>> short on "disk space" when we saw that the disk space was actually rising we 
>> halted the data removal.
>> 
>> Now according to some docs I have read, if after a deletion ( a few seconds 
>> ) we make a .get()  it force the release of the diskspace , is this true ?
>> 
>> For us it's not possible to move data to another node, is there any way even 
>> manually to release the space ? or at least to force :
>> " The actual release occurs significantly later (days, weeks, or even months 
>> later) when the tombstone record merges with the actual data in a background 
>> compaction."
>> 
>> Thanks for all the help,
>> Antonio
>> 
>> 
>> 2015-07-14 17:43 GMT+01:00 Matthew Von-Maszewski > <mailto:matth...@basho.com>>:
>> Antonio,
>> 
>> Here is a detailed discussion of the Riak / leveldb delete scenario:
>> 
>>  https://github.com/basho/leveldb/wiki/mv-aggressive-delete 
>> <https://github.com/basho/leveldb/wiki/mv-aggressive-delete>
>> 
>> Pay close attention to the section titled “Update April 6, 2014”.  This 
>> explains why as much as 4.2G bytes per vnode might remain within leveldb 
>> after deleting all keys.  There is no mechanism to override the logic that 
>> causes the disk space retention.
>> 
>> One workaround is to use Riak’s handoff mechanism to transfer vnodes from 
>> one physical server to another.  The vnode transfer will remove all deletion 
>> tombstones on the destination.  The last step of the transfer then deletes 
>> all leveldb files on the original server, recovering the space.
>> 
>> Matthew
>> 
>> 
>> 
>> > On Jul 14, 2015, at 12:32 PM, Antonio Teixeira > > <mailto:eagle.anto...@gmail.com>> wrote:
>> >
>> > Hello,
>> >
>> > We have been migrating our Riak Database to another infrastructure through 
>> > a "streaming" process and right now we should have somewhere around 2Gb of 
>> > free space the Hard Disk, however those 2Gb are still being used by Riak. 
>> > After some research I believe the problem is the Objects are only being 
>> > marked for deletion and not actually deleted at runtime. What we need is a 
>> > way to aggressively deleted those keys or some way to force Riak to delete 
>> > those marked keys and subsequently release the Disk Space. The Riak 
>> > version we are using is v2.0.2
>> > ___
>> > riak-users mailing list
>> > riak-users@lists.basho.com <mailto:riak-users@lists.basho.com>
>> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com 
>> > <http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com>
>> 
>> 
> 
> 



Re: Riak LevelDB Deletion Problem

2015-07-14 Thread Matthew Von-Maszewski
Antonio,

A Riak delete operation happens in these steps:

- Riak writes a “tombstone” value for the key to the N vnodes that contain it 
(this is a new record)

- Riak by default, waits 3 seconds to verify all vnodes agree to the 
tombstone/delete 

- Riak issues an actual delete operation against the key to leveldb

- leveldb creates its own tombstone

- the leveldb tombstone “floats” through level-0 and level-1 as part of normal 
compactions

- upon reaching level-2, leveldb will initiate immediate compaction and 
propagation of tombstones in .sst table files containing 1000 or more 
tombstones.  This is when disk space recovery begins.


Yes, this means that initially the leveldb vnodes will grow in size until 
enough stuff (new data and/or tombstones) forces the tombstones to level-2 via 
normal compaction operations.  “Enough stuff” to fill levels 0 and 1 is about 
4.2Gbytes of compressed Riak objects.

The “get” operation you mentioned is something that happens internally.  A 
manual “get” by your code will not influence the operation.
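
As an aside, the three-second wait mentioned above is governed by riak_kv's delete_mode setting.  A sketch of how it looks in the riak_kv section of advanced.config (or app.config on older releases); the values shown are, to my knowledge, the documented ones, and 3000 ms is already the default, so nothing needs to change:

    {riak_kv, [
        %% milliseconds to wait before issuing the backend delete (default 3000);
        %% 'keep' never issues it, 'immediate' issues it right away
        {delete_mode, 3000}
    ]}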

Matthew


> On Jul 14, 2015, at 1:01 PM, Antonio Teixeira  wrote:
> 
> Hi Matthew,
> 
> We will be removing close to 1 TB of data from the node , and since we are 
> short on "disk space" when we saw that the disk space was actually rising we 
> halted the data removal.
> 
> Now according to some docs I have read, if after a deletion ( a few seconds ) 
> we make a .get()  it force the release of the diskspace , is this true ?
> 
> For us it's not possible to move data to another node, is there any way even 
> manually to release the space ? or at least to force :
> " The actual release occurs significantly later (days, weeks, or even months 
> later) when the tombstone record merges with the actual data in a background 
> compaction."
> 
> Thanks for all the help,
> Antonio
> 
> 
> 2015-07-14 17:43 GMT+01:00 Matthew Von-Maszewski  <mailto:matth...@basho.com>>:
> Antonio,
> 
> Here is a detailed discussion of the Riak / leveldb delete scenario:
> 
>  https://github.com/basho/leveldb/wiki/mv-aggressive-delete
> 
> Pay close attention to the section titled “Update April 6, 2014”.  This 
> explains why as much as 4.2G bytes per vnode might remain within leveldb 
> after deleting all keys.  There is no mechanism to override the logic that 
> causes the disk space retention.
> 
> One workaround is to use Riak’s handoff mechanism to transfer vnodes from one 
> physical server to another.  The vnode transfer will remove all deletion 
> tombstones on the destination.  The last step of the transfer then deletes 
> all leveldb files on the original server, recovering the space.
> 
> Matthew
> 
> 
> 
> > On Jul 14, 2015, at 12:32 PM, Antonio Teixeira  > <mailto:eagle.anto...@gmail.com>> wrote:
> >
> > Hello,
> >
> > We have been migrating our Riak Database to another infrastructure through 
> > a "streaming" process and right now we should have somewhere around 2Gb of 
> > free space the Hard Disk, however those 2Gb are still being used by Riak. 
> > After some research I believe the problem is the Objects are only being 
> > marked for deletion and not actually deleted at runtime. What we need is a 
> > way to aggressively deleted those keys or some way to force Riak to delete 
> > those marked keys and subsequently release the Disk Space. The Riak version 
> > we are using is v2.0.2
> > ___
> > riak-users mailing list
> > riak-users@lists.basho.com <mailto:riak-users@lists.basho.com>
> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com 
> > <http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com>
> 
> 



Re: Riak LevelDB Deletion Problem

2015-07-14 Thread Matthew Von-Maszewski
Antonio,

Here is a detailed discussion of the Riak / leveldb delete scenario:

 https://github.com/basho/leveldb/wiki/mv-aggressive-delete

Pay close attention to the section titled “Update April 6, 2014”.  This 
explains why as much as 4.2G bytes per vnode might remain within leveldb after 
deleting all keys.  There is no mechanism to override the logic that causes the 
disk space retention.

One workaround is to use Riak’s handoff mechanism to transfer vnodes from one 
physical server to another.  The vnode transfer will remove all deletion 
tombstones on the destination.  The last step of the transfer then deletes all 
leveldb files on the original server, recovering the space.
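
If you do try the handoff route, the usual way to drive it is to swap a node out with the cluster commands, very roughly (node names here are placeholders):

    riak-admin cluster replace riak@old-host riak@new-host
    riak-admin cluster plan
    riak-admin cluster commit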

Matthew



> On Jul 14, 2015, at 12:32 PM, Antonio Teixeira  
> wrote:
> 
> Hello, 
> 
> We have been migrating our Riak Database to another infrastructure through a 
> "streaming" process and right now we should have somewhere around 2Gb of free 
> space the Hard Disk, however those 2Gb are still being used by Riak. After 
> some research I believe the problem is the Objects are only being marked for 
> deletion and not actually deleted at runtime. What we need is a way to 
> aggressively deleted those keys or some way to force Riak to delete those 
> marked keys and subsequently release the Disk Space. The Riak version we are 
> using is v2.0.2
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com




Re: LevelDB

2015-06-08 Thread Matthew Von-Maszewski
Congrats on getting the configuration running.  I will ask around about best 
practices and get back to you tomorrow. 

Is your /mnt/slow a remote mount?  If so, what details can you share?

Sent from my iPhone

> On Jun 8, 2015, at 2:58 PM, Joe Olson  wrote:
> 
> Matthew - I got it to work. Setting the filesystem rights to 777 got it to 
> start. I thought starting Riak as root would allow writing to the filesystem, 
> but I guess not.
> 
> Do most people run riak as an OS user "riak"? What is best practice here?
> 
> Thanks for all your help.
> 
> 
> 
> From: "Matthew Von-Maszewski" 
> To: "Joe Olson" 
> Cc: "riak-users" 
> Sent: Monday, June 8, 2015 11:40:56 AM
> Subject: Re: LevelDB
> 
> Odd.
> 
> The tree builds fine, including the lock file, once the config was good:
> 
> /mnt
> ├── fast
> │   └── leveldb
> │   ├── 0
> │   │   ├── CURRENT
> │   │   ├── LOCK
> │   │   ├── LOG
> │   │   ├── MANIFEST-02
> │   │   ├── sst_0
> │   │   ├── sst_1
> │   │   ├── sst_2
> │   │   └── sst_3
> │   ├── 1096126227998177188652763624537212264741949407232
> │   │   ├── CURRENT
> │   │   ├── LOCK
> │   │   ├── LOG
> │   │   ├── MANIFEST-02
> │   │   ├── sst_0
> │   │   ├── sst_1
> │   │   ├── sst_2
> │   │   └── sst_3
> 
> ├── slow
> │   └── leveldb
> │   ├── 0
> │   │   ├── sst_4
> │   │   ├── sst_5
> │   │   └── sst_6
> │   ├── 1096126227998177188652763624537212264741949407232
> │   │   ├── sst_4
> │   │   ├── sst_5
> │   │   └── sst_6
> 
> What, if anything is Riak building in your tree?  Again, I had to create the 
> leveldb directory in each path manually.  Did a chmod a+rwx on directories 
> fast, slow, fast/leveldb, and slow/leveldb.  Then started the code and waited.
> 
> I was getting your LOCK crash before I added the data_root line to the config.
> 
> Matthew
> 
> P.S.  It may be hours before I respond again.  Heading to the airport now.
> 
> 
> On Jun 8, 2015, at 12:24 PM, Joe Olson  wrote:
> 
> Also, I am testing as follows: make the changes to riak.conf, start riak as 
> root,wait a few minutes and issue a 'riak ping' as root. Riak usually starts 
> for 10-15 seconds, and then crashes (doesn't respond to the ping anymore)
> 
> 
> 
> From: "Joe Olson" 
> To: "Matthew Von-Maszewski" 
> Sent: Monday, June 8, 2015 11:22:24 AM
> Subject: Re: LevelDB
> 
> I think our email just crossed each other. Let me try again.
> 
> 
> 
> From: "Joe Olson" 
> To: "Matthew Von-Maszewski" 
> Cc: "riak-users" 
> Sent: Monday, June 8, 2015 11:21:26 AM
> Subject: Re: LevelDB
> 
> Matt - thanks for the support.
> 
> Adding
>leveldb.data_root=“./leveldb”
> 
> results in
> "IO error: 
> /mnt/fast/./leveldb/91343852333181432387730302044767688728495783936/LOCK: No 
> such file or directory"
> 
> Changing it to 
>leveldb.data_root=“leveldb”
> 
> results in 
>"IO error: 
> /mnt/fast/leveldb/91343852333181432387730302044767688728495783936/LOCK: No 
> such file or directory"}
> 
> Running "riak start" as root has no effect, and I also double checked the 
> directory permissions. I also manually created the leveldb subdirectory. 
> Still get the same results.
> 
> 
> 
> 
> 
> 
> From: "Matthew Von-Maszewski" 
> To: "Joe Olson" 
> Cc: "riak-users" 
> Sent: Monday, June 8, 2015 10:56:24 AM
> Subject: Re: LevelDB
> 
> Joe,
> 
> Long story short, I am slowly rebuilding my debug setup.  Taking longer than 
> I thought.  I suspect, but have not yet verified, that if you add one more 
> line:
> 
> leveldb.data_root=“./leveldb”
> 
> … your troubles will go away.  If you test that before I do, let me know your 
> results.  I will get the documentation updated if it works.
> 
> Matthew
> 
> 
> On Jun 8, 2015, at 10:37 AM, Joe Olson  wrote:
> 
> 
> 
> Here you go. Thanks for the help. My riak.conf is mostly stock, except for 
> the LevelDB changes.
> 
> 
> 
> 
> 
> From: "Matthew Von-Maszewski" 
> To: "Mr. Technology" 
> Cc: "riak-users" 
> Sent: Saturday, June 6, 2015 9:08:00 AM
> Subject: Re: LevelDB
> 
> There is a script macro that puts paths together during program start.  I 
> will go look at that.
> 
> It would help if you would send your riak.conf file.
> 
> Matthew
> 
> 
> On Jun 5, 2015, at 7:48 PM, Mr. Technology  wrote:
> 
> After installi

Re: LevelDB

2015-06-08 Thread Matthew Von-Maszewski
Odd.

The tree builds fine, including the lock file, once the config was good:

/mnt
├── fast
│   └── leveldb
│   ├── 0
│   │   ├── CURRENT
│   │   ├── LOCK
│   │   ├── LOG
│   │   ├── MANIFEST-02
│   │   ├── sst_0
│   │   ├── sst_1
│   │   ├── sst_2
│   │   └── sst_3
│   ├── 1096126227998177188652763624537212264741949407232
│   │   ├── CURRENT
│   │   ├── LOCK
│   │   ├── LOG
│   │   ├── MANIFEST-02
│   │   ├── sst_0
│   │   ├── sst_1
│   │   ├── sst_2
│   │   └── sst_3

├── slow
│   └── leveldb
│   ├── 0
│   │   ├── sst_4
│   │   ├── sst_5
│   │   └── sst_6
│   ├── 1096126227998177188652763624537212264741949407232
│   │   ├── sst_4
│   │   ├── sst_5
│   │   └── sst_6

What, if anything, is Riak building in your tree?  Again, I had to create the 
leveldb directory in each path manually, did a chmod a+rwx on the directories 
fast, slow, fast/leveldb, and slow/leveldb, and then started the code and waited.

I was getting your LOCK crash before I added the data_root line to the config.

Matthew

P.S.  It may be hours before I respond again.  Heading to the airport now.


> On Jun 8, 2015, at 12:24 PM, Joe Olson  wrote:
> 
> Also, I am testing as follows: make the changes to riak.conf, start riak as 
> root,wait a few minutes and issue a 'riak ping' as root. Riak usually starts 
> for 10-15 seconds, and then crashes (doesn't respond to the ping anymore)
> 
> 
> 
> From: "Joe Olson" 
> To: "Matthew Von-Maszewski" 
> Sent: Monday, June 8, 2015 11:22:24 AM
> Subject: Re: LevelDB
> 
> I think our email just crossed each other. Let me try again.
> 
> 
> 
> From: "Joe Olson" 
> To: "Matthew Von-Maszewski" 
> Cc: "riak-users" 
> Sent: Monday, June 8, 2015 11:21:26 AM
> Subject: Re: LevelDB
> 
> Matt - thanks for the support.
> 
> Adding
>leveldb.data_root=“./leveldb”
> 
> results in
> "IO error: 
> /mnt/fast/./leveldb/91343852333181432387730302044767688728495783936/LOCK: No 
> such file or directory"
> 
> Changing it to 
>leveldb.data_root=“leveldb”
> 
> results in 
>"IO error: 
> /mnt/fast/leveldb/91343852333181432387730302044767688728495783936/LOCK: No 
> such file or directory"}
> 
> Running "riak start" as root has no effect, and I also double checked the 
> directory permissions. I also manually created the leveldb subdirectory. 
> Still get the same results.
> 
> 
> 
> 
> 
> 
> From: "Matthew Von-Maszewski" 
> To: "Joe Olson" 
> Cc: "riak-users" 
> Sent: Monday, June 8, 2015 10:56:24 AM
> Subject: Re: LevelDB
> 
> Joe,
> 
> Long story short, I am slowly rebuilding my debug setup.  Taking longer than 
> I thought.  I suspect, but have not yet verified, that if you add one more 
> line:
> 
> leveldb.data_root=“./leveldb”
> 
> … your troubles will go away.  If you test that before I do, let me know your 
> results.  I will get the documentation updated if it works.
> 
> Matthew
> 
> 
> On Jun 8, 2015, at 10:37 AM, Joe Olson  <mailto:technol...@nododos.com>> wrote:
> 
> 
> 
> Here you go. Thanks for the help. My riak.conf is mostly stock, except for 
> the LevelDB changes.
> 
> 
> 
> 
> 
> From: "Matthew Von-Maszewski" mailto:matth...@basho.com>>
> To: "Mr. Technology" mailto:technol...@nododos.com>>
> Cc: "riak-users"  <mailto:riak-users@lists.basho.com>>
> Sent: Saturday, June 6, 2015 9:08:00 AM
> Subject: Re: LevelDB
> 
> There is a script macro that puts paths together during program start.  I 
> will go look at that.
> 
> It would help if you would send your riak.conf file.
> 
> Matthew
> 
> 
> On Jun 5, 2015, at 7:48 PM, Mr. Technology  <mailto:technol...@nododos.com>> wrote:
> 
> After installing Riak 2.1.1 on a 5 node Centos7 based test cluster, I 
> followed the instructions at 
> http://docs.basho.com/riak/latest/ops/advanced/backends/leveldb/ to attempt 
> to configure a tiered LevelDB backend.
> 
> My /etc/riak/riak.conf:
> 
> storage_backend = leveldb
> leveldb.tiered = 4
> leveldb.tiered.path.fast = /mnt/fast
> leveldb.tiered.path.slow = /mnt/slow
> 
> Upon start, riak crashes with the following message in the log:
> 
> ** 
> {function_clause,[{riak_kv_vnode,terminate,[{bad_return_value,{stop,{db_open,"IO
>  error: 
> /mnt/fast//var/lib/riak/leveldb/159851741583067506678528028578343455274867621888/LOCK:
>  No such file or 
> directory"}}},undefined],[{file,"src/riak_kv_vnode.erl"},{line,1155}]},{riak_core_vnode,terminate,3,[{file,"src/riak_core_vnode.erl"},{line,907}]},{gen_fsm,terminate,7,[{file,"gen_fsm.erl"},{line,597}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,239}]}]}

Re: LevelDB

2015-06-08 Thread Matthew Von-Maszewski
Joe,

I have built up your config and confirmed two bugs:

- you need the fourth config line:

leveldb.tiered = 4
leveldb.tiered.path.fast = /mnt/fast
leveldb.tiered.path.slow = /mnt/slow
leveldb.data_root = ./leveldb

- you must manually create the /mnt/fast/leveldb and /mnt/slow/leveldb 
directories before starting riak

Quick test says everything is happy after that.  I will get documentation and 
start-up scripts updated.
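
For anyone hitting the same thing, the manual step is roughly this (assuming Riak runs as the riak user; adjust ownership to match your install):

    mkdir -p /mnt/fast/leveldb /mnt/slow/leveldb
    chown riak:riak /mnt/fast/leveldb /mnt/slow/leveldb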

Thanks,
Matthew


> On Jun 8, 2015, at 10:37 AM, Joe Olson  wrote:
> 
> 
> 
> Here you go. Thanks for the help. My riak.conf is mostly stock, except for 
> the LevelDB changes.
> 
> 
> 
> 
> 
> From: "Matthew Von-Maszewski" 
> To: "Mr. Technology" 
> Cc: "riak-users" 
> Sent: Saturday, June 6, 2015 9:08:00 AM
> Subject: Re: LevelDB
> 
> There is a script macro that puts paths together during program start.  I 
> will go look at that.
> 
> It would help if you would send your riak.conf file.
> 
> Matthew
> 
> 
> On Jun 5, 2015, at 7:48 PM, Mr. Technology  <mailto:technol...@nododos.com>> wrote:
> 
> After installing Riak 2.1.1 on a 5 node Centos7 based test cluster, I 
> followed the instructions at 
> http://docs.basho.com/riak/latest/ops/advanced/backends/leveldb/ to attempt 
> to configure a tiered LevelDB backend.
> 
> My /etc/riak/riak.conf:
> 
> storage_backend = leveldb
> leveldb.tiered = 4
> leveldb.tiered.path.fast = /mnt/fast
> leveldb.tiered.path.slow = /mnt/slow
> 
> Upon start, riak crashes with the following message in the log:
> 
> ** 
> {function_clause,[{riak_kv_vnode,terminate,[{bad_return_value,{stop,{db_open,"IO
>  error: 
> /mnt/fast//var/lib/riak/leveldb/159851741583067506678528028578343455274867621888/LOCK:
>  No such file or 
> directory"}}},undefined],[{file,"src/riak_kv_vnode.erl"},{line,1155}]},{riak_core_vnode,terminate,3,[{file,"src/riak_core_vnode.erl"},{line,907}]},{gen_fsm,terminate,7,[{file,"gen_fsm.erl"},{line,597}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,239}]}]}
> 
> I assume the error is due to riak trying to write to the bogus path:
> 
> /mnt/fast//var/lib/riak/leveldb/159851741583067506678528028578343455274867621888/LOCK
> 
> (bogus due to the double slash above)
> 
> Is this a bug, erroneous documentation, or a configuration problem?? I 
> followed the example as closely as I could.
> 
> 
> ___
> riak-users mailing list
> riak-users@lists.basho.com <mailto:riak-users@lists.basho.com>
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> 
> 
> 



Re: LevelDB

2015-06-08 Thread Matthew Von-Maszewski
Joe,

Long story short, I am slowly rebuilding my debug setup.  Taking longer than I 
thought.  I suspect, but have not yet verified, that if you add one more line:

leveldb.data_root=“./leveldb”

… your troubles will go away.  If you test that before I do, let me know your 
results.  I will get the documentation updated if it works.

Matthew


> On Jun 8, 2015, at 10:37 AM, Joe Olson  wrote:
> 
> 
> 
> Here you go. Thanks for the help. My riak.conf is mostly stock, except for 
> the LevelDB changes.
> 
> 
> 
> 
> 
> From: "Matthew Von-Maszewski" 
> To: "Mr. Technology" 
> Cc: "riak-users" 
> Sent: Saturday, June 6, 2015 9:08:00 AM
> Subject: Re: LevelDB
> 
> There is a script macro that puts paths together during program start.  I 
> will go look at that.
> 
> It would help if you would send your riak.conf file.
> 
> Matthew
> 
> 
> On Jun 5, 2015, at 7:48 PM, Mr. Technology  <mailto:technol...@nododos.com>> wrote:
> 
> After installing Riak 2.1.1 on a 5 node Centos7 based test cluster, I 
> followed the instructions at 
> http://docs.basho.com/riak/latest/ops/advanced/backends/leveldb/ to attempt 
> to configure a tiered LevelDB backend.
> 
> My /etc/riak/riak.conf:
> 
> storage_backend = leveldb
> leveldb.tiered = 4
> leveldb.tiered.path.fast = /mnt/fast
> leveldb.tiered.path.slow = /mnt/slow
> 
> Upon start, riak crashes with the following message in the log:
> 
> ** 
> {function_clause,[{riak_kv_vnode,terminate,[{bad_return_value,{stop,{db_open,"IO
>  error: 
> /mnt/fast//var/lib/riak/leveldb/159851741583067506678528028578343455274867621888/LOCK:
>  No such file or 
> directory"}}},undefined],[{file,"src/riak_kv_vnode.erl"},{line,1155}]},{riak_core_vnode,terminate,3,[{file,"src/riak_core_vnode.erl"},{line,907}]},{gen_fsm,terminate,7,[{file,"gen_fsm.erl"},{line,597}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,239}]}]}
> 
> I assume the error is due to riak trying to write to the bogus path:
> 
> /mnt/fast//var/lib/riak/leveldb/159851741583067506678528028578343455274867621888/LOCK
> 
> (bogus due to the double slash above)
> 
> Is this a bug, erroneous documentation, or a configuration problem?? I 
> followed the example as closely as I could.
> 
> 
> ___
> riak-users mailing list
> riak-users@lists.basho.com <mailto:riak-users@lists.basho.com>
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> 
> 
> 



Re: LevelDB

2015-06-06 Thread Matthew Von-Maszewski
There is a script macro that puts paths together during program start.  I will 
go look at that.

It would help if you would send your riak.conf file.

Matthew


> On Jun 5, 2015, at 7:48 PM, Mr. Technology  wrote:
> 
> After installing Riak 2.1.1 on a 5 node Centos7 based test cluster, I 
> followed the instructions at 
> http://docs.basho.com/riak/latest/ops/advanced/backends/leveldb/ to attempt 
> to configure a tiered LevelDB backend.
> 
> My /etc/riak/riak.conf:
> 
> storage_backend = leveldb
> leveldb.tiered = 4
> leveldb.tiered.path.fast = /mnt/fast
> leveldb.tiered.path.slow = /mnt/slow
> 
> Upon start, riak crashes with the following message in the log:
> 
> ** 
> {function_clause,[{riak_kv_vnode,terminate,[{bad_return_value,{stop,{db_open,"IO
>  error: 
> /mnt/fast//var/lib/riak/leveldb/159851741583067506678528028578343455274867621888/LOCK:
>  No such file or 
> directory"}}},undefined],[{file,"src/riak_kv_vnode.erl"},{line,1155}]},{riak_core_vnode,terminate,3,[{file,"src/riak_core_vnode.erl"},{line,907}]},{gen_fsm,terminate,7,[{file,"gen_fsm.erl"},{line,597}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,239}]}]}
> 
> I assume the error is due to riak trying to write to the bogus path:
> 
> /mnt/fast//var/lib/riak/leveldb/159851741583067506678528028578343455274867621888/LOCK
> 
> (bogus due to the double slash above)
> 
> Is this a bug, erroneous documentation, or a configuration problem?? I 
> followed the example as closely as I could.
> 
> 
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com



Re: problems with starting nodes

2014-12-25 Thread Matthew Von-Maszewski
This explains the file limit warning:

http://docs.basho.com/riak/latest/ops/tuning/open-files-limit/

You really do not want to ignore that warning.  128000 is a good start if your 
operating system allows a number that high; 64000 is OK.
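
If the limit keeps reverting, the usual fix is to raise it in /etc/security/limits.conf for whichever user actually starts Riak (the values below are only a starting point; the page above has the details):

    riak soft nofile 65536
    riak hard nofile 65536

For a dev cluster started by hand, running "ulimit -n 65536" in the same shell before ./dev1/bin/riak start also works.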

Matthew


On Dec 25, 2014, at 1:58 AM, Sargun Dhillon  wrote:

> Can you post the files in the log directory on a github gist, and run
> the command "./dev1/bin/riak console"? In addition, run Riak as a
> non-root user, with the max files limit bumped up as high as you can
> set it.
> 
> On Wed, Dec 24, 2014 at 10:47 PM, Ildar Alishev  
> wrote:
>> Hello
>> 
>> 
>> I have a problem starting nodes
>> 
>> 
>> it says
>> 
>> root@salo:/home/salohost/work/riak-2.0.2/dev# ./dev1/bin/riak start
>> 
>>  WARNING: ulimit -n is 1024; 65536 is the recommended minimum.
>> 
>> riak failed to start within 15 seconds,
>> see the output of 'riak console' for more information.
>> If you want to wait longer, set the environment variable
>> WAIT_FOR_ERLANG to the number of seconds to wait.
>> 
>> ___
>> riak-users mailing list
>> riak-users@lists.basho.com
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>> 
> 
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com




Re: Riak Nodes Crashing

2014-12-08 Thread Matthew Von-Maszewski
Satish,

This is to be expected.  You have a ring size of 64 and 5 nodes.  5 does not 
evenly divide into 64.  4 nodes contain 13 vnodes.  One node only contains 12 
vnodes:

13 / 64 = 20.3125%
12 / 64 = 18.75 %

All is fine.

Matthew


On Dec 8, 2014, at 12:17 PM, ender  wrote:

> I intend to upgrade to 2.0 at some point but that will be a bigger task.  
> Another question:
> 
> [ec2-user@ip-10-197-93-214 ~]$ sudo riak-admin member_status
> = Membership 
> ==
> Status RingPendingNode
> ---
> valid  20.3%  --  'riak@10.196.72.106'
> valid  20.3%  --  'riak@10.196.72.124'
> valid  20.3%  --  'riak@10.196.72.247'
> valid  20.3%  --  'riak@10.197.93.214'
> valid  18.8%  --  'riak@10.197.94.33'
> ---
> Valid:5 / Leaving:0 / Exiting:0 / Joining:0 / Down:0
> 
> When I first created the cluster, all 5 nodes Ring matched to within 0.1% of 
> each other.  Now node 5 has consistently been at a lower percentage than the 
> other 4.  Is this normal?
> 
> 
> On Mon, Dec 8, 2014 at 9:06 AM, Matthew Von-Maszewski  
> wrote:
> Satish,
> 
> This additional information continues to support my suspicion that the memory 
> management is not fully accounting for your number of open files.  A large 
> query can cause many files that were previously unused to open.  An open 
> table file in leveldb uses memory heavily (for the file's block index and 
> bloom filter).  Also, leveldb will allow the memory limit to be knowingly 
> exceeded in the case of queries that cover large segments of the key space.
> 
> There are fixes for both of those scenarios in Riak 2.0, but not in the 1.x 
> series.
> 
> Matthew 
> 
> 
> On Dec 8, 2014, at 11:22 AM, ender  wrote:
> 
>> Hello Matthew,
>> 
>> I was going through my cluster setup again, checking up on stuff, when I 
>> noticed something.  So just for some background, when I originally started 
>> using Riak it was as a replacement for MongoDB.  To get things up and 
>> running quickly I "cheated" and just wrote some code that took a MongoDB 
>> query and recast it as a Riak search query. Then I enabled the search hook 
>> on all my buckets, and it just worked!  Of course, Riak search wasn't the 
>> fastest thing on 2 legs, so I started to redo my data model to make it more 
>> key-value store friendly. Where I had to, I used secondary indexes.  Once 
>> I'd converted the data model for a bucket I would remove the search hook on 
>> that bucket.  On Friday evening I discovered that on 2 of my buckets I'd 
>> forgotten to remove the search hook.  One of the buckets only has a couple 
>> of thousand records in it, so no big deal.  But the other one! - that bucket 
>> has the most reads, the most writes, and has over 200M records stored in it. 
>>  I removed the search hook on both of those buckets and the cluster has been 
>> stable over the weekend and is still up as of now. I did not disable active 
>> anti-entropy, since I did not want to change too many variables at the same 
>> time.  I will do that today.  Question is, was this a coincidence or do you 
>> think it's possible the search indexing was causing the OOM errors?
>> 
>> Satish
>> 
>> On Sat, Dec 6, 2014 at 6:59 AM, Matthew Von-Maszewski  
>> wrote:
>> Satish,
>> 
>> I do NOT recommend adding a sixth node before the other five are stable 
>> again.  There was another customer that did that recently and things just 
>> got worse due to the vnode handoff actions to the sixth node.
>> 
>> I do recommend one or both of the following:
>> 
>> - disable active anti-entropy in app.config, {anti_entropy, {off, []}}.  
>> Then restart all nodes.  We quickly replaced 1.4.7 due to an bug in the 
>> active anti-entropy.  I do not know the details of the bug.  But no one had 
>> seen a crash from it.  However, you may be seeing a long term problem due to 
>> that same bug.  The anti-entropy feature in 1.4.7 is not really protecting 
>> your data anyway.  It might as well be disabled until you are ready to 
>> upgrade.
>> 
>> - further reduce the max_open_files parameter simply to get memory stable:  
>> use 75 instead of the recent 150.  You must restart all nodes after making 
>> the change in app.config.
>> 
>> 
>> I will need to solicit support from others at Basho if the two w

Re: Riak Nodes Crashing

2014-12-08 Thread Matthew Von-Maszewski
Satish,

This additional information continues to support my suspicion that the memory 
management is not fully accounting for your number of open files.  A large 
query can cause many previously unused files to be opened.  An open table 
file in leveldb uses memory heavily (for the file's block index and bloom 
filter).  Also, leveldb will knowingly exceed the memory limit for queries 
that cover large segments of the key space.

There are fixes for both of those scenarios in Riak 2.0, but not in the 1.x 
series.

Matthew 


On Dec 8, 2014, at 11:22 AM, ender  wrote:

> Hello Matthew,
> 
> I was going through my cluster setup again, checking up on stuff, when I 
> noticed something.  So just for some background, when I originally started 
> using Riak it was as a replacement for MongoDB.  To get things up and running 
> quickly I "cheated" and just wrote some code that took a MongoDB query and 
> recast it as a Riak search query. Then I enabled the search hook on all my 
> buckets, and it just worked!  Of course, Riak search wasn't the fastest thing 
> on 2 legs, so I started to redo my data model to make it more key-value store 
> friendly. Where I had to, I used secondary indexes.  Once I'd converted the 
> data model for a bucket I would remove the search hook on that bucket.  On 
> Friday evening I discovered that on 2 of my buckets I'd forgotten to remove 
> the search hook.  One of the buckets only has a couple of thousand records in 
> it, so no big deal.  But the other one! - that bucket has the most reads, the 
> most writes, and has over 200M records stored in it.  I removed the search 
> hook on both of those buckets and the cluster has been stable over the 
> weekend and is still up as of now. I did not disable active anti-entropy, 
> since I did not want to change too many variables at the same time.  I will 
> do that today.  Question is, was this a coincidence or do you think it's 
> possible the search indexing was causing the OOM errors?
> 
> Satish
> 
> On Sat, Dec 6, 2014 at 6:59 AM, Matthew Von-Maszewski  
> wrote:
> Satish,
> 
> I do NOT recommend adding a sixth node before the other five are stable 
> again.  There was another customer that did that recently and things just got 
> worse due to the vnode handoff actions to the sixth node.
> 
> I do recommend one or both of the following:
> 
> - disable active anti-entropy in app.config, {anti_entropy, {off, []}}.  Then 
> restart all nodes.  We quickly replaced 1.4.7 due to an bug in the active 
> anti-entropy.  I do not know the details of the bug.  But no one had seen a 
> crash from it.  However, you may be seeing a long term problem due to that 
> same bug.  The anti-entropy feature in 1.4.7 is not really protecting your 
> data anyway.  It might as well be disabled until you are ready to upgrade.
> 
> - further reduce the max_open_files parameter simply to get memory stable:  
> use 75 instead of the recent 150.  You must restart all nodes after making 
> the change in app.config.
> 
> 
> I will need to solicit support from others at Basho if the two workarounds 
> above do not stabilize the cluster.  
> 
> Matthew
> 
> On Dec 5, 2014, at 5:54 PM, ender  wrote:
> 
>> Would adding a 6th node mean each node would use less memory as  a stopgap 
>> measure?
>> 
>> On Fri, Dec 5, 2014 at 2:20 PM, ender  wrote:
>> Hey Matthew it just crashed again.  This time I got the syslog and leveldb 
>> logs right away.
>> 
>> 
>> 
>> On Fri, Dec 5, 2014 at 11:43 AM, Matthew Von-Maszewski  
>> wrote:
>> Satish,
>> 
>> Here is a key line from /var/log/messages:
>> 
>> Dec  5 06:52:43 ip-10-196-72-106 kernel: [26881589.804401] beam.smp invoked 
>> oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0
>> 
>> The log entry does NOT match the timestamps of the crash.log and error.log 
>> below.  But that is ok.  The operating system killed off Riak.  There would 
>> have be no notification in the Riak log's of the operating system's actions.
>> 
>> The fact that the out of memory monitor, oom-killer, killed Riak further 
>> supports the change to max_open_files.  I recommend we now wait to see if 
>> the problem occurs again.
>> 
>> 
>> Matthew
>> 
>> 
>> On Dec 5, 2014, at 2:35 PM, ender  wrote:
>> 
>>> Hey Matthew,
>>> 
>>> The crash occurred around 3:00am:
>>> 
>>> -rw-rw-r-- 1 riak riak920 Dec  5 03:01 crash.log
>>> -rw-rw-r-- 1 riak riak617 Dec  5 03:01 error.log
>>> 
>>> I have attached the syslog that covers that time.  I also went ahe

Re: Riak Nodes Crashing

2014-12-06 Thread Matthew Von-Maszewski
Satish,

I do NOT recommend adding a sixth node before the other five are stable again.  
There was another customer that did that recently and things just got worse due 
to the vnode handoff actions to the sixth node.

I do recommend one or both of the following:

- disable active anti-entropy in app.config, {anti_entropy, {off, []}}.  Then 
restart all nodes.  We quickly replaced 1.4.7 due to a bug in its active 
anti-entropy.  I do not know the details of the bug, and no one had seen a 
crash from it, but you may be seeing a long-term problem caused by that same 
bug.  The anti-entropy feature in 1.4.7 is not really protecting your data 
anyway, so it might as well stay disabled until you are ready to upgrade.

- further reduce the max_open_files parameter simply to get memory stable:  use 
75 instead of the recent 150.  You must restart all nodes after making the 
change in app.config.
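
For clarity, the two changes together look roughly like this inside the existing app.config sections (sketch only; leave your other settings as they are):

    {riak_kv, [
        %% disable active anti-entropy until you are ready to upgrade off 1.4.7
        {anti_entropy, {off, []}}
        %% ... your other riak_kv settings stay unchanged
    ]},
    {eleveldb, [
        %% was 315, then 150; drop to 75 to stabilize memory
        {max_open_files, 75}
        %% ... your other eleveldb settings stay unchanged
    ]}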


I will need to solicit support from others at Basho if the two workarounds 
above do not stabilize the cluster.  

Matthew

On Dec 5, 2014, at 5:54 PM, ender  wrote:

> Would adding a 6th node mean each node would use less memory as  a stopgap 
> measure?
> 
> On Fri, Dec 5, 2014 at 2:20 PM, ender  wrote:
> Hey Matthew it just crashed again.  This time I got the syslog and leveldb 
> logs right away.
> 
> 
> 
> On Fri, Dec 5, 2014 at 11:43 AM, Matthew Von-Maszewski  
> wrote:
> Satish,
> 
> Here is a key line from /var/log/messages:
> 
> Dec  5 06:52:43 ip-10-196-72-106 kernel: [26881589.804401] beam.smp invoked 
> oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0
> 
> The log entry does NOT match the timestamps of the crash.log and error.log 
> below.  But that is ok.  The operating system killed off Riak.  There would 
> have be no notification in the Riak log's of the operating system's actions.
> 
> The fact that the out of memory monitor, oom-killer, killed Riak further 
> supports the change to max_open_files.  I recommend we now wait to see if the 
> problem occurs again.
> 
> 
> Matthew
> 
> 
> On Dec 5, 2014, at 2:35 PM, ender  wrote:
> 
>> Hey Matthew,
>> 
>> The crash occurred around 3:00am:
>> 
>> -rw-rw-r-- 1 riak riak920 Dec  5 03:01 crash.log
>> -rw-rw-r-- 1 riak riak617 Dec  5 03:01 error.log
>> 
>> I have attached the syslog that covers that time.  I also went ahead and 
>> changed max_open_files in app.config to to 150 from 315.
>> 
>> Satish
>> 
>> 
>> On Fri, Dec 5, 2014 at 11:29 AM, Matthew Von-Maszewski  
>> wrote:
>> Satish,
>> 
>> The "key" system log varies by Linux platform.  Yes, /var/log/messages may 
>> hold some key clues.  Again, be sure the file covers the time of a crash.
>> 
>> Matthew
>> 
>> 
>> On Dec 5, 2014, at 1:29 PM, ender  wrote:
>> 
>>> Hey Matthew,
>>> 
>>> I see a /var/log/messages file, but no syslog or system.log etc.  Is it the 
>>> messages file you want?
>>> 
>>> Satish
>>> 
>>> 
>>> On Fri, Dec 5, 2014 at 10:06 AM, Matthew Von-Maszewski  
>>> wrote:
>>> Satish,
>>> 
>>> I find nothing compelling in the log or the app.config.  Therefore I have 
>>> two additional suggestions/requests:
>>> 
>>> - lower max_open_files in app.config to to 150 from 315.  There was one 
>>> other customer report regarding the limit not properly stopping out of 
>>> memory (OOM) conditions.
>>> 
>>> - try to locate a /var/log/syslog* file from a node that contains the time 
>>> of the crash.  There may be helpful information there.  Please send that 
>>> along.
>>> 
>>> 
>>> Unrelated to this crash … 1.4.7 has a known bug in its active anti-entropy 
>>> (AAE) logic.  This bug is NOT known to cause a crash.  The bug does cause 
>>> AAE to be unreliable for data restoration.  The proper steps for upgrading 
>>> to the current release (1.4.12) are:
>>> 
>>> -- across the entire cluster
>>> - disable anti_entropy in app.config on all nodes: {anti_entropy, {off, []}}
>>> - perform a rolling restart of all nodes … AAE is now disabled in the 
>>> cluster 
>>> 
>>> -- on each node
>>> - stop the node
>>> - remove (erase all files and directories) /vol/lib/riak/anti_entropy
>>> - update Riak to the new software revision
>>> - start the node again
>>> 
>>> -- across the entire cluster
>>> - enable anti_entropy in app.config on all nodes: {anti_entropy, {on, []}}
>>> - perform a rolling restart of all nodes … AAE is now enabled in the 

Re: Riak Nodes Crashing

2014-12-05 Thread Matthew Von-Maszewski
Satish,

Here is a key line from /var/log/messages:

Dec  5 06:52:43 ip-10-196-72-106 kernel: [26881589.804401] beam.smp invoked 
oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0

The log entry does NOT match the timestamps of the crash.log and error.log 
below, but that is OK.  The operating system killed off Riak, and there would 
have been no notification in the Riak logs of the operating system's actions.

The fact that the out of memory monitor, oom-killer, killed Riak further 
supports the change to max_open_files.  I recommend we now wait to see if the 
problem occurs again.


Matthew


On Dec 5, 2014, at 2:35 PM, ender  wrote:

> Hey Matthew,
> 
> The crash occurred around 3:00am:
> 
> -rw-rw-r-- 1 riak riak920 Dec  5 03:01 crash.log
> -rw-rw-r-- 1 riak riak617 Dec  5 03:01 error.log
> 
> I have attached the syslog that covers that time.  I also went ahead and 
> changed max_open_files in app.config to to 150 from 315.
> 
> Satish
> 
> 
> On Fri, Dec 5, 2014 at 11:29 AM, Matthew Von-Maszewski  
> wrote:
> Satish,
> 
> The "key" system log varies by Linux platform.  Yes, /var/log/messages may 
> hold some key clues.  Again, be sure the file covers the time of a crash.
> 
> Matthew
> 
> 
> On Dec 5, 2014, at 1:29 PM, ender  wrote:
> 
>> Hey Matthew,
>> 
>> I see a /var/log/messages file, but no syslog or system.log etc.  Is it the 
>> messages file you want?
>> 
>> Satish
>> 
>> 
>> On Fri, Dec 5, 2014 at 10:06 AM, Matthew Von-Maszewski  
>> wrote:
>> Satish,
>> 
>> I find nothing compelling in the log or the app.config.  Therefore I have 
>> two additional suggestions/requests:
>> 
>> - lower max_open_files in app.config to to 150 from 315.  There was one 
>> other customer report regarding the limit not properly stopping out of 
>> memory (OOM) conditions.
>> 
>> - try to locate a /var/log/syslog* file from a node that contains the time 
>> of the crash.  There may be helpful information there.  Please send that 
>> along.
>> 
>> 
>> Unrelated to this crash … 1.4.7 has a known bug in its active anti-entropy 
>> (AAE) logic.  This bug is NOT known to cause a crash.  The bug does cause 
>> AAE to be unreliable for data restoration.  The proper steps for upgrading 
>> to the current release (1.4.12) are:
>> 
>> -- across the entire cluster
>> - disable anti_entropy in app.config on all nodes: {anti_entropy, {off, []}}
>> - perform a rolling restart of all nodes … AAE is now disabled in the 
>> cluster 
>> 
>> -- on each node
>> - stop the node
>> - remove (erase all files and directories) /vol/lib/riak/anti_entropy
>> - update Riak to the new software revision
>> - start the node again
>> 
>> -- across the entire cluster
>> - enable anti_entropy in app.config on all nodes: {anti_entropy, {on, []}}
>> - perform a rolling restart of all nodes … AAE is now enabled in the cluster 
>> 
>> The nodes will start rebuilding the AAE hash data.  Suggest you perform the 
>> last rolling restart during a low utilization time of your cluster.
>> 
>> 
>> Matthew
>> 
>> 
>> On Dec 5, 2014, at 11:02 AM, ender  wrote:
>> 
>>> Hi Matthew,
>>> 
>>> Riak version: 1.4.7
>>> 5 Nodes in cluster
>>> RAM: 30GB
>>> 
>>> The leveldb logs are attached.
>>> 
>>> 
>>> 
>>> On Thu, Dec 4, 2014 at 1:34 PM, Matthew Von-Maszewski  
>>> wrote:
>>> Satish,
>>> 
>>> Some questions:
>>> 
>>> - what version of Riak are you running?  logs suggest 1.4.7
>>> - how many nodes in your cluster?
>>> - what is the physical memory (RAM size) of each node?
>>> - would you send the leveldb LOG  files from one of the crashed servers:
>>> tar -czf satish_LOG.tgz /vol/lib/riak/leveldb/*/LOG*
>>> 
>>> 
>>> Matthew
>>> 
>>> On Dec 4, 2014, at 4:02 PM, ender  wrote:
>>> 
>>> > My RIak installation has been running successfully for about a year.  
>>> > This week nodes suddenly started randomly crashing.  The machines have 
>>> > plenty of memory and free disk space, and looking in the ring directory 
>>> > nothing appears to amiss:
>>> >
>>> > [ec2-user@ip-10-196-72-247 ~]$ ls -l /vol/lib/riak/ring
>>> > total 80
>>> > -rw-rw-r-- 1 riak riak 17829 Nov 29 19:42 
>>> > riak_core_ring.default.20141129194225
>>> > -rw-rw-r-- 1 riak riak 17829 Dec  3 19:07 

Re: Riak Nodes Crashing

2014-12-05 Thread Matthew Von-Maszewski
Satish,

I find nothing compelling in the log or the app.config.  Therefore I have two 
additional suggestions/requests:

- lower max_open_files in app.config to 150 from 315.  There was one other 
customer report regarding the limit not properly stopping out of memory (OOM) 
conditions.

- try to locate a /var/log/syslog* file from a node that contains the time of 
the crash.  There may be helpful information there.  Please send that along.
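
If there is no syslog file on your platform, /var/log/messages or dmesg may hold the same information; something generic like the following usually turns up out-of-memory kills (plain Linux commands, nothing Riak-specific):

    grep -i oom /var/log/messages*
    dmesg | grep -i -E "oom|killed process"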


Unrelated to this crash … 1.4.7 has a known bug in its active anti-entropy 
(AAE) logic.  This bug is NOT known to cause a crash.  The bug does cause AAE 
to be unreliable for data restoration.  The proper steps for upgrading to the 
current release (1.4.12) are:

-- across the entire cluster
- disable anti_entropy in app.config on all nodes: {anti_entropy, {off, []}}
- perform a rolling restart of all nodes … AAE is now disabled in the cluster 

-- on each node
- stop the node
- remove (erase all files and directories) /vol/lib/riak/anti_entropy
- update Riak to the new software revision
- start the node again

-- across the entire cluster
- enable anti_entropy in app.config on all nodes: {anti_entropy, {on, []}}
- perform a rolling restart of all nodes … AAE is now enabled in the cluster 

The nodes will start rebuilding the AAE hash data.  Suggest you perform the 
last rolling restart during a low utilization time of your cluster.
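
The per-node portion as shell steps, roughly (the anti_entropy path is the one from your app.config; do not re-enable AAE until every node has been upgraded):

    riak stop
    rm -rf /vol/lib/riak/anti_entropy
    # install the 1.4.12 package here
    riak start
    riak ping    # confirm the node answers before moving to the next one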


Matthew


On Dec 5, 2014, at 11:02 AM, ender  wrote:

> Hi Matthew,
> 
> Riak version: 1.4.7
> 5 Nodes in cluster
> RAM: 30GB
> 
> The leveldb logs are attached.
> 
> 
> 
> On Thu, Dec 4, 2014 at 1:34 PM, Matthew Von-Maszewski  
> wrote:
> Satish,
> 
> Some questions:
> 
> - what version of Riak are you running?  logs suggest 1.4.7
> - how many nodes in your cluster?
> - what is the physical memory (RAM size) of each node?
> - would you send the leveldb LOG  files from one of the crashed servers:
> tar -czf satish_LOG.tgz /vol/lib/riak/leveldb/*/LOG*
> 
> 
> Matthew
> 
> On Dec 4, 2014, at 4:02 PM, ender  wrote:
> 
> > My RIak installation has been running successfully for about a year.  This 
> > week nodes suddenly started randomly crashing.  The machines have plenty of 
> > memory and free disk space, and looking in the ring directory nothing 
> > appears to amiss:
> >
> > [ec2-user@ip-10-196-72-247 ~]$ ls -l /vol/lib/riak/ring
> > total 80
> > -rw-rw-r-- 1 riak riak 17829 Nov 29 19:42 
> > riak_core_ring.default.20141129194225
> > -rw-rw-r-- 1 riak riak 17829 Dec  3 19:07 
> > riak_core_ring.default.20141203190748
> > -rw-rw-r-- 1 riak riak 17829 Dec  4 16:29 
> > riak_core_ring.default.20141204162956
> > -rw-rw-r-- 1 riak riak 17847 Dec  4 20:45 
> > riak_core_ring.default.20141204204548
> >
> > [ec2-user@ip-10-196-72-247 ~]$ du -h /vol/lib/riak/ring
> > 84K   /vol/lib/riak/ring
> >
> > I have attached a tarball with the app.config file plus all the logs from 
> > the node at the time of the crash.  Any help much appreciated!
> >
> > Satish
> >
> > ___
> > riak-users mailing list
> > riak-users@lists.basho.com
> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> 
> 
> 



Re: Riak Nodes Crashing

2014-12-04 Thread Matthew Von-Maszewski
Satish,

Some questions:

- what version of Riak are you running?  logs suggest 1.4.7
- how many nodes in your cluster?
- what is the physical memory (RAM size) of each node?
- would you send the leveldb LOG  files from one of the crashed servers:
tar -czf satish_LOG.tgz /vol/lib/riak/leveldb/*/LOG*


Matthew

On Dec 4, 2014, at 4:02 PM, ender  wrote:

> My RIak installation has been running successfully for about a year.  This 
> week nodes suddenly started randomly crashing.  The machines have plenty of 
> memory and free disk space, and looking in the ring directory nothing appears 
> to amiss:
> 
> [ec2-user@ip-10-196-72-247 ~]$ ls -l /vol/lib/riak/ring
> total 80
> -rw-rw-r-- 1 riak riak 17829 Nov 29 19:42 
> riak_core_ring.default.20141129194225
> -rw-rw-r-- 1 riak riak 17829 Dec  3 19:07 
> riak_core_ring.default.20141203190748
> -rw-rw-r-- 1 riak riak 17829 Dec  4 16:29 
> riak_core_ring.default.20141204162956
> -rw-rw-r-- 1 riak riak 17847 Dec  4 20:45 
> riak_core_ring.default.20141204204548
> 
> [ec2-user@ip-10-196-72-247 ~]$ du -h /vol/lib/riak/ring
> 84K   /vol/lib/riak/ring
> 
> I have attached a tarball with the app.config file plus all the logs from the 
> node at the time of the crash.  Any help much appreciated!
> 
> Satish
> 
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com




Re: Riak unexpectedly crashes

2014-11-27 Thread Matthew Von-Maszewski
Ivaylo,

We will need to await help from others; this is not a crash scenario I am 
familiar with.  Complicating matters, today and tomorrow are holidays in the 
U.S.

The typical requests at this point are a copy of your app.config and any 
information from the Riak crash.log that details the crash point.  That might 
help someone give you a solid answer.

Matthew


On Nov 27, 2014, at 10:46 AM, Ivaylo Panitchkov  
wrote:

> 
> Hi Matthew,
> 
> That happened few times in the past with different versions.
> The one we are running right now is 1.4.7 with bitcask and the problem 
> occurred again few days ago.
> 
> Thanks,
> Ivaylo
> 
> 
> 
> On Thu, Nov 27, 2014 at 10:37 AM, Matthew Von-Maszewski  
> wrote:
> What version of Riak?  What backend are you using (bitcask, memory, leveldb)?
> 
> Matthew
> 
> On Nov 27, 2014, at 10:30 AM, Ivaylo Panitchkov  
> wrote:
> 
>> 
>> Hi Alexander,
>> 
>> Yes, they are set properly in /etc/security/limits.conf 
>> 
>> Ivaylo
>> 
>> 
>> On Thu, Nov 27, 2014 at 9:44 AM, Alexander Sicular  
>> wrote:
>> Are you sure your ulimit persists through reboot?
>> 
>> -Alexander
>> 
>> @siculars
>> http://siculars.posthaven.com
>> 
>> Sent from my iRotaryPhone
>> 
>> > On Nov 27, 2014, at 09:39, Ivaylo Panitchkov  
>> > wrote:
>> >
>> >
>> > Hello,
>> >
>> > I'm running a single Riak server used for dev purposes and noticed 
>> > something unexpected.
>> > It happened few times in the past and I just want to share my observation 
>> > with you if someone experienced the same problem.
>> > While the server was up and running it happened the machine needed to be 
>> > rebooted.
>> > Once the reboot cycle completed I noticed the running Riak process 
>> > suddenly crashed.
>> > If I try to start it again it continues crashing.
>> > I haven't time to investigate what causing the problem but the way to 
>> > solve it was to simply delete the content of the bitcask folder (it's a 
>> > dev server so nothing critical there).
>> >
>> > Thanks in advance,
>> > Ivaylo
>> >
>> >
>> > --
>> > Ivaylo Panitchkov
>> > Software developer
>> > Hibernum Creations Inc.
>> >
>> > Ce courriel est confidentiel et peut aussi être protégé par la loi.Si vous 
>> > avez reçu ce courriel par erreur, veuillez nous en aviser immédiatement en 
>> > y répondant, puis supprimer ce message de votre système. Veuillez ne pas 
>> > le copier, l’utiliser pour quelque raison que ce soit ni divulguer son 
>> > contenu à quiconque.
>> > This email is confidential and may also be legally privileged. If you have 
>> > received this email in error, please notify us immediately by reply email 
>> > and then delete this message from your system. Please do not copy it or 
>> > use it for any purpose or disclose its content.
>> > ___
>> > riak-users mailing list
>> > riak-users@lists.basho.com
>> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>> 
>> 
>> 
>> -- 
>> Ivaylo Panitchkov
>> Software developer
>> Hibernum Creations Inc.
>> 
>> Ce courriel est confidentiel et peut aussi être protégé par la loi.Si vous 
>> avez reçu ce courriel par erreur, veuillez nous en aviser immédiatement en y 
>> répondant, puis supprimer ce message de votre système. Veuillez ne pas le 
>> copier, l’utiliser pour quelque raison que ce soit ni divulguer son contenu 
>> à quiconque.
>> This email is confidential and may also be legally privileged. If you have 
>> received this email in error, please notify us immediately by reply email 
>> and then delete this message from your system. Please do not copy it or use 
>> it for any purpose or disclose its content.
>> ___
>> riak-users mailing list
>> riak-users@lists.basho.com
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> 
> 
> 
> 
> -- 
> Ivaylo Panitchkov
> Software developer
> Hibernum Creations Inc.
> 
> Ce courriel est confidentiel et peut aussi être protégé par la loi.Si vous 
> avez reçu ce courriel par erreur, veuillez nous en aviser immédiatement en y 
> répondant, puis supprimer ce message de votre système. Veuillez ne pas le 
> copier, l’utiliser pour quelque raison que ce soit ni divulguer son contenu à 
> quiconque.
> This email is confidential and may also be legally privileged. If you have 
> received this email in error, please notify us immediately by reply email and 
> then delete this message from your system. Please do not copy it or use it 
> for any purpose or disclose its content.



Re: Riak unexpectedly crashes

2014-11-27 Thread Matthew Von-Maszewski
What version of Riak?  What backend are you using (bitcask, memory, leveldb)?

Matthew

On Nov 27, 2014, at 10:30 AM, Ivaylo Panitchkov  
wrote:

> 
> Hi Alexander,
> 
> Yes, they are set properly in /etc/security/limits.conf 
> 
> Ivaylo
> 
> 
> On Thu, Nov 27, 2014 at 9:44 AM, Alexander Sicular  wrote:
> Are you sure your ulimit persists through reboot?
> 
> -Alexander
> 
> @siculars
> http://siculars.posthaven.com
> 
> Sent from my iRotaryPhone
> 
> > On Nov 27, 2014, at 09:39, Ivaylo Panitchkov  
> > wrote:
> >
> >
> > Hello,
> >
> > I'm running a single Riak server used for dev purposes and noticed 
> > something unexpected.
> > It happened few times in the past and I just want to share my observation 
> > with you if someone experienced the same problem.
> > While the server was up and running it happened the machine needed to be 
> > rebooted.
> > Once the reboot cycle completed I noticed the running Riak process suddenly 
> > crashed.
> > If I try to start it again it continues crashing.
> > I haven't time to investigate what causing the problem but the way to solve 
> > it was to simply delete the content of the bitcask folder (it's a dev 
> > server so nothing critical there).
> >
> > Thanks in advance,
> > Ivaylo
> >
> >
> > --
> > Ivaylo Panitchkov
> > Software developer
> > Hibernum Creations Inc.
> >
> > Ce courriel est confidentiel et peut aussi être protégé par la loi.Si vous 
> > avez reçu ce courriel par erreur, veuillez nous en aviser immédiatement en 
> > y répondant, puis supprimer ce message de votre système. Veuillez ne pas le 
> > copier, l’utiliser pour quelque raison que ce soit ni divulguer son contenu 
> > à quiconque.
> > This email is confidential and may also be legally privileged. If you have 
> > received this email in error, please notify us immediately by reply email 
> > and then delete this message from your system. Please do not copy it or use 
> > it for any purpose or disclose its content.
> > ___
> > riak-users mailing list
> > riak-users@lists.basho.com
> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> 
> 
> 
> -- 
> Ivaylo Panitchkov
> Software developer
> Hibernum Creations Inc.
> 
> Ce courriel est confidentiel et peut aussi être protégé par la loi.Si vous 
> avez reçu ce courriel par erreur, veuillez nous en aviser immédiatement en y 
> répondant, puis supprimer ce message de votre système. Veuillez ne pas le 
> copier, l’utiliser pour quelque raison que ce soit ni divulguer son contenu à 
> quiconque.
> This email is confidential and may also be legally privileged. If you have 
> received this email in error, please notify us immediately by reply email and 
> then delete this message from your system. Please do not copy it or use it 
> for any purpose or disclose its content.
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com



Re: Compaction errors when new node is added

2014-11-23 Thread Matthew Von-Maszewski
Timo,

You need to upgrade to 1.4.10.  There is a known leveldb bug in 1.4.9 that fits 
your symptoms:

- 1.4.10 Release Notes:  https://github.com/basho/riak/blob/1.4/RELEASE-NOTES.md

- leveldb bug:  https://github.com/basho/leveldb/issues/135

Matthew

  
On Nov 23, 2014, at 8:36 AM, Timo Gatsonides  wrote:

> Hi,
> 
> I have recently added a 7th node to an existing cluster of 6 nodes. Standard 
> ring size. Riak 1.4.9, the new node is on Ubuntu 14, I used the Ubuntu 12 
> package.
> 
> It seems like the progress of transferring data is really, really slow. The 
> disks are continuously reading and writing data (50+Mb/s), but the total 
> amount of data is growing slowly.
> 
> In the leveldb logs I see a lot of error messages like these:
> 
> 7fec9efb5700 Compaction error: IO error: 
> /bigdata/riak/bigleveldb/479555224749202520035584085735030365824602865664/sst_0/001432.sst:
>  No such file or directory
> 
> I have already stopped the node once, ran leveldb:repair on all the 
> partitions and then restarted Riak. Now again these messages are appearing.
> 
> Can someone please explain the cause of this and if this is normal?
> 
> Background info: the new node is using ZFS, all the other nodes have around 
> 3TB of data in the /bigdata/riak/bigleveldb directories and there are a lot 
> of relatively large values that are 2-10Mb in size.
> 
> Kind regards,
> Timo
> 
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com



Re: {read_block_error,<<"0">>} with level_db

2014-07-30 Thread Matthew Von-Maszewski

It should be read as your option 1: there have been zero errors during the leveldb::ReadBlock call.

On Jul 30, 2014, at 12:19 PM, Bryan  wrote:

> We are diagnosing a weird intermittent problem where we are getting an 
> {error, disconnected} exception from the riak erlang client. There is not a 
> lot of information generated on our side, other than we trying to do a 
> request on a connection and get this error. We have our own connection 
> pooling mechanism with 10 connections available to each node in the cluster.
> 
> When I do a vnode_status on one of the suspect nodes, I get a dozen of the 
> following type of output. They all have the {read_block_error,<<“0”>>} tuple:
> 
> VNode: 45671926166590716193865151022383844364247891968
> Backend: riak_kv_eleveldb_backend
> Status:
> [{stats,<<"   Compactions\nLevel  Files Size(MB) 
> Time(sec) Read(MB) 
> Write(MB)\n--\n  03   
>  0 00 0\n">>},
>  {read_block_error,<<"0">>},
>  {fixed_indexes,true}]
> 
> So my question is pretty simple and just looking for clarification. How 
> should this be read? 1) that there are 0 read_block_errors? Or 2)  that the 
> read_block_error code is  <<“0”>>. 
> 
> Cheers,
> Bryan
> 
> 
> 
> 
> Bryan Hughes
> Go Factory
> (415) 515-7916
> 
> http://www.go-factory.net
> 
> Connecting the Internet of Things
> 
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: RIAK 1.4.6 - Mass key deletion

2014-07-20 Thread Matthew Von-Maszewski
Simon,

The aggressive delete code is only in the 2.0 release.  There are currently no 
plans to backport that feature to 1.4.

A couple of generic performance improvements for compactions are now in the 
1.4.10 release.  These improvements relate to general compactions.  They do not 
speed the removal of deleted items.

Matthew


On Jul 20, 2014, at 9:24 AM, Simon Effenberg  wrote:

> Hi Matthew,
> 
> so is there a awy to improve the compaction rate in Riak < 2.0 or do I
> have to upgrade to 2.0 to get this?
> 
> Cheers
> Simon
> 
> On Sun, Apr 06, 2014 at 06:30:30PM -0400, Matthew Von-Maszewski wrote:
>>   Edgar,
>>   This is indirectly related to you key deletion discussion.  I made changes
>>   recently to the aggressive delete code.  The second section of the
>>   following (updated) web page discusses the adjustments:
>>   https://github.com/basho/leveldb/wiki/Mv-aggressive-delete
>>   Matthew
>>   On Apr 6, 2014, at 4:29 PM, Edgar Veiga  wrote:
>> 
>> Matthew, thanks again for the response!
>> That said, I'll wait again for the 2.0 (and maybe buy some bigger disks
>> :)
>> Best regards
>> 
>> On 6 April 2014 15:02, Matthew Von-Maszewski  wrote:
>> 
>>   Edgar,
>>   In Riak 1.4, there is no advantage to using empty values versus
>>   deleting.
>>   leveldb is a "write once" data store.  New data for a given key never
>>   physically overwrites old data for the same key.  New data "hides" the
>>   old data by being in a lower level, and therefore picked first.
>>   leveldb's compaction operation will remove older key/value pairs only
>>   when the newer key/value is pair is part of a compaction involving
>>   both new and old.  The new and the old key/value pairs must have
>>   migrated to adjacent levels through normal compaction operations
>>   before leveldb will see them in the same compaction.  The migration
>>   could take days, weeks, or even months depending upon the size of your
>>   entire dataset and the rate of incoming write operations.
>>   leveldb's "delete" object is exactly the same as your empty JSON
>>   object.  The delete object simply has one more flag set that allows it
>>   to also be removed if and only if there is no chance for an identical
>>   key to exist on a higher level.
>>   I apologize that I cannot give you a more useful answer.  2.0 is on
>>   the horizon.
>>   Matthew
>>   On Apr 6, 2014, at 7:04 AM, Edgar Veiga  wrote:
>> 
>> Hi again!
>> Sorry to reopen this discussion, but I have another question
>> regarding the former post.
>> What if, instead of doing a mass deletion (We've already seen that
>> it will be non profitable, regarding disk space) I update all the
>> values with an empty JSON object "{}" ? Do you see any problem with
>> this? I no longer need those millions of values that are living in
>> the cluster... 
>> When the version 2.0 of riak runs stable I'll do the update and only
>> then delete those keys!
>> Best regards
>> 
>> On 18 February 2014 16:32, Edgar Veiga 
>> wrote:
>> 
>>   Ok, thanks a lot Matthew.
>> 
>>   On 18 February 2014 16:18, Matthew Von-Maszewski
>>wrote:
>> 
>> Riak 2.0 is coming.  Hold your mass delete until then.  The
>> "bug" is within Google's original leveldb architecture.  Riak
>> 2.0 sneaks around to get the disk space freed.
>> Matthew
>> On Feb 18, 2014, at 11:10 AM, Edgar Veiga
>>  wrote:
>> 
>>           The only/main purpose is to free disk space..
>>   I was a little bit concerned regarding this operation, but now
>>   with your feedback I'm tending to don't do nothing, I can't
>>   risk the growing of space... 
>>   Regarding the overhead I think that with a tight throttling
>>   system I could control and avoid overloading the cluster.
>>   Mixed feelings :S
>> 
>>   On 18 February 2014 15:45, Matthew Von-Maszewski
>>wrote:
>> 
>> Edgar,
>> The first "concern" I have is that leveldb's delete does not
>> free disk space.  Others have executed

Re: Tag 2.0.0rc1 ??

2014-07-18 Thread Matthew Von-Maszewski

I believe he meant "end of next week".  That seems more in line with the general 
goal.


On Jul 18, 2014, at 4:42 PM, Daurnimator  wrote:

> On 14 July 2014 16:40, Jared Morrow  wrote:
> I'd personally be very disappointed if something wasn't in your hands by the 
> end of this week.  
> 
> No word? :(
> 
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: leveldb Hot Threads in 1.4.9?

2014-07-08 Thread Matthew Von-Maszewski
Responses inline.


On Jul 7, 2014, at 10:54 PM, Tom Lanyon  wrote:

> Hi Matthew,
> 
> On Sunday, 6 July 2014 at 3:04, Matthew Von-Maszewski wrote: 
>> Tom,
>> 
>> Basho prides itself on quickly responding to all user queries. I have failed 
>> that tradition in this case. Please accept my apologies.
> No problem; I appreciate you taking the time to look into our LOG.
> 
>> 
>> The LOG data suggests leveldb is not stalling, especially not for 4 hours. 
>> Therefore the problem is related to disk utilization.
> 
> That matches our experience - leveldb itself is working hard on disk 
> operations whilst Riak fails to respond to... anything, causing an apparent 
> 'stall' from the client application's perspective.
> 
>> You appear to have large values. I see .sst files where the average value is 
>> 100K to 1Mbyte in size. Is this intentional, or might you have a sibling 
>> problem?
> Yes, we have a split between very small (headers only, no body) items and 1MB 
> binary chunks.  If we had our time again we'd probably use multi-backend to 
> store these 1MB chunks in bitcask and keep leveldb for the small body-less 
> items which require 2i.
> 
>> My assessment is that your lower levels are full and therefore cascading 
>> regularly. "cascading" is like the typical champagne glass pyramid you see 
>> at weddings. Once all the glasses are full, new champagne at the top causes 
>> each subsequent layer to overflow into the one below that. You have the same 
>> problem, but with data. 
>> 
>> Your large values have filled each of the lower levels and regularly cause 
>> cascading data between multiple levels. The cascading is causing each 100K 
>> value write to become the equivalent of a 300K or 500K value as levels 
>> overflow. This cascading is chewing up your hard disk performance (by 
>> reducing the amount of time the hard drive has available for read requests).
> By increasing the size of the lower levels (as you show below), does this 
> mean there's more capacity for writes to occur in those levels before 
> compaction is triggered and hence compacting them less frequently?

Exactly.

> 
> I guess this turns your champagne fountain analogy into more of a 'tipping 
> bucket' where the data is no longer 'flowing' through the levels but is 
> instead building up in each level before tipping into the next when it's at 
> capacity?  (pictorial representation: 
> http://4.bp.blogspot.com/_DUDhlpPD8X8/SIcN8D66j9I/ASs/2Va3_n3vamk/s400/23157087_261a5da413.jpg)

Very good photo.  May have to save it for some future presentation.  Though I 
was visualizing champagne glasses versus large Oktoberfest beer mugs.

> 
>> The leveldb code for Riak 2.0 has increased the size of all the levels. The 
>> table of sizes is found at the top of leveldb's db/version_set.cc. You could 
>> patch your current code if desired with 
>> this table from 2.0:
>> 
>> { 
>> {10485760, 262144000, 57671680, 209715200, 0, 42000, true}, 
>> {10485760, 82914560, 57671680, 419430400, 0, 209715200, true}, 
>> {10485760, 314572800, 57671680, 3082813440, 2, 314572800, false}, 
>> {10485760, 419430400, 57671680, 6442450944ULL, 4294967296ULL, 419430400, 
>> false}, 
>> {10485760, 524288000, 57671680, 128849018880ULL, 85899345920ULL, 524288000, 
>> false}, 
>> {10485760, 629145600, 57671680, 2576980377600ULL, 1717986918400ULL, 
>> 629145600, false}, 
>> {10485760, 734003200, 57671680, 51539607552000ULL, 34359738368000ULL, 
>> 734003200, false} 
>> }; 
>> 
>> 
>> You cannot take the entire 2.0 leveldb into your 1.4 code base due to 
>> various option changes.
> I assume leveldb will just 'handle' making the levels larger once nodes are 
> restarted with this updated configuration?  I also assume that it would not 
> be wise to then rollback the change to smaller levels after this has been 
> done?

Yes, it "just works".  Rollback to smaller size would cause leveldb to churn 
for a long time as you assumed.

>> Let me know if this helps. I have previously hypothesized that "grooming" 
>> compactions should be limited to one thread total. However my test datasets 
>> never demonstrated a benefit. Your dataset might be the case that proves the 
>> benefit. I will go find the grooming patch to hot_threads for you if the 
>> above table proves insufficient.
> 
> Do I understand correctly that this would mean compactions would continue, 
> but limited to one thread, so that the rest of the application can still 
> respond to client req

Re: leveldb Hot Threads in 1.4.9?

2014-07-05 Thread Matthew Von-Maszewski
Tom,

Basho prides itself on quickly responding to all user queries.  I have failed 
that tradition in this case.  Please accept my apologies.


The LOG data suggests leveldb is not stalling, especially not for 4 hours.  
Therefore the problem is related to disk utilization.

You appear to have large values.  I see .sst files where the average value is 
100K to 1Mbyte in size.  Is this intentional, or might you have a sibling 
problem?


My assessment is that your lower levels are full and therefore cascading 
regularly.  "cascading" is like the typical champagne glass pyramid you see at 
weddings.  Once all the glasses are full, new champagne at the top causes each 
subsequent layer to overflow into the one below that.  You have the same 
problem, but with data.  

Your large values have filled each of the lower levels and regularly cause 
cascading data between multiple levels.  The cascading is causing each 100K 
value write to become the equivalent of a 300K or 500K value as levels 
overflow.  This cascading is chewing up your hard disk performance (by reducing 
the amount of time the hard drive has available for read requests).

The leveldb code for Riak 2.0 has increased the size of all the levels.  The 
table of sizes is found at the top of leveldb's db/version_set.cc.  You could 
patch your current code if desired with this table from 2.0:

{
    {10485760,  262144000,  57671680,  209715200,              0,              42000,     true},
    {10485760,   82914560,  57671680,  419430400,              0,              209715200, true},
    {10485760,  314572800,  57671680, 3082813440,              2,              314572800, false},
    {10485760,  419430400,  57671680, 6442450944ULL,     4294967296ULL,         419430400, false},
    {10485760,  524288000,  57671680, 128849018880ULL,   85899345920ULL,        524288000, false},
    {10485760,  629145600,  57671680, 2576980377600ULL,  1717986918400ULL,      629145600, false},
    {10485760,  734003200,  57671680, 51539607552000ULL, 34359738368000ULL,     734003200, false}
};
  

You cannot take the entire 2.0 leveldb into your 1.4 code base due to various 
option changes.


Let me know if this helps.  I have previously hypothesized that "grooming" 
compactions should be limited to one thread total.  However my test datasets 
never demonstrated a benefit.  Your dataset might be the case that proves the 
benefit.  I will go find the grooming patch to hot_threads for you if the above 
table proves insufficient.

Matthew




On Jul 2, 2014, at 9:20 PM, Tom Lanyon  wrote:

> Hi Matthew, 
> 
> Just thought I'd see whether you were back from your travels and had had a 
> chance to take a look at the log file provided?
> 
> There's no rush if you haven't had a chance!
> 
> Regards,
> Tom
> 
> 
> On Tuesday, 24 June 2014 at 10:45, Tom Lanyon wrote:
> 
>> No problem, Matthew. 
>> 
>> Appreciate you taking a look when you have time.
>> 
>> Regards,
>> Tom
>> 
>> 
>> On Tuesday, 24 June 2014 at 9:45, Matthew Von-Maszewski wrote:
>> 
>>> Tom,
>>> 
>>> I have been distracted today and on a plane tomorrow. I apologize for the 
>>> delayed response. It may be late tomorrow before I can share further 
>>> thoughts. 
>>> 
>>> Again my apologies.
>>> 
>>> Matthew Von-Maszewski
>>> 
>>> 
>>> On Jun 23, 2014, at 8:58, Tom Lanyon >> (mailto:tom+r...@oneshoeco.com)> wrote:
>>> 
>>>> Thanks; the combined_log for our Riak node 3 is here:
>>>> 
>>>> https://www.dropbox.com/s/krhhwnplpeyhl0c/riak3-combined_log-20140623.log.gz
>>>> 
>>>> Let me know if you can't retrieve/view it.
>>>> 
>>>> With timestamps relative to this log file, at 2014/06/23-05:35 our 
>>>> monitoring detected node3's Riak as "down"; it wasn't serving any client 
>>>> protobuf requests, "riak ping" didn't respond and all of the other nodes 
>>>> marked node 3 as unreachable. We watched the process and it was busy doing 
>>>> leveldb compactions so we left it alone and it eventually recovered at 
>>>> 2014/06/23-09:32 (so ~4 hours unresponsive).
>>>> 
>>>> Yes - this cluster star

Re: leveldb Hot Threads in 1.4.9?

2014-06-22 Thread Matthew Von-Maszewski
Hot threads is included with 1.4.9.   The leveldb source file 
leveldb/util/hot_threads.cc is the key file.

The code helps throughput, but is not magical. "unresponsive for hours" is not 
a known problem in the 1.4.x code base.  Would you mind posting an aggregate 
LOG file from a period when this happens?

sort /var/lib/riak/*/LOG >combined_log

Substitute your actual data path for /var/lib/riak.

Matthew Von-Maszewski


On Jun 22, 2014, at 22:07, Tom Lanyon  wrote:

> Could someone please confirm whether 1.4.9 includes "Hot Threads" in leveldb? 
> 
> The release notes have a link to it, but I couldn't find my way through the 
> rebar & git maze to be absolutely sure it is in 1.4.9 but not 1.4.8.
> 
> We're seeing nodes unresponsive for hours during large compactions and 
> wondered if this leveldb improvement would help.
> 
> Thanks,
> Tom
> 
> 
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: leveldb issues.

2014-06-19 Thread Matthew Von-Maszewski
Yes.

On Jun 19, 2014, at 1:47 PM, Theo Schlossnagle  wrote:

> To be clear, you're considering a13160da8adeca96d80388cc77cb88ad5301aeaa the 
> latest?
> 
> 
> On Thu, Jun 19, 2014 at 12:36 PM, Matthew Von-Maszewski  
> wrote:
> Running repair now may have detected damage done to your data long ago.  
> Repair reads every file and tests the CRC on every block in the file.
> 
> Two known issues might have caused the original corruption:
> 
> https://github.com/basho/leveldb/wiki/mv-verify-compactions 
> 
> or 
> 
> https://github.com/basho/leveldb/wiki/mv-async-close
> 
> 
> Also, there is a bug in the 532bb commit that can cause level-0 files to be 
> wrongly deleted.  I suggest pulling the latest to get rid of that bug.
> 
> Matthew
> 
> 
> On Jun 19, 2014, at 12:24 PM, Theo Schlossnagle  wrote:
> 
>> I'm using basho/leveldb as of commit: 
>> 532bb6351e7835e862c8508520780bfc9d0c2b78 (no snappy)
>> 
>> I have an issue with some small sized database... they claim corruption, but 
>> when running a repair I have a nonsensical amount of sst's moved into "lost"
>> 
>> Worse, the files moved to "lost" have very very old creation times (in 
>> months).  The system has been restarted successfully many times in the 
>> interim, leading to more confusion.
>> 
>> I'm looking for pointers to get to the bottom of this.  It isn't critical 
>> that I recover this data, but it is critical that I don't see this manifest 
>> when it does matter.
>> 
>> Before repair:
>> 
>> 40960   108441.log
>> 1   CURRENT
>> 0   LOCK
>> 3   LOG
>> 3   LOG.old
>> 92  MANIFEST-108439
>> 469 sst_0
>> 1   sst_1
>> 188185  sst_2
>> 4189331 sst_3
>> 6343660 sst_4
>> 1   sst_5
>> 1   sst_6
>> 
>> After repair:
>> 
>> 40960   108452.log
>> 1   CURRENT
>> 0   LOCK
>> 3   LOG
>> 3   LOG.old
>> 28  MANIFEST-108450
>> 8221971 lost
>> 2362sst_0
>> 1   sst_1
>> 188185  sst_2
>> 2352071 sst_3
>> 1   sst_4
>> 1   sst_5
>> 1   sst_6
>> 
>> 
>> 
>> -- 
>> Theo Schlossnagle
>> 
>> http://omniti.com/is/theo-schlossnagle
>> 
>> ___
>> riak-users mailing list
>> riak-users@lists.basho.com
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> 
> 
> 
> 
> -- 
> Theo Schlossnagle
> 
> http://omniti.com/is/theo-schlossnagle
> 

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: leveldb issues.

2014-06-19 Thread Matthew Von-Maszewski
Running repair now may have detected damage done to your data long ago.  Repair 
reads every file and tests the CRC on every block in the file.
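
If you want to drive that repair outside of your embedding application, a minimal
standalone sketch using leveldb's public RepairDB() call looks like this (not a
Basho-supplied tool; build it against the same basho/leveldb checkout you run, and
invoke it on one database directory at a time while nothing else has it open):

    // standalone_repair.cc -- minimal sketch, one database directory per run
    #include <iostream>
    #include <string>
    #include "leveldb/db.h"
    #include "leveldb/options.h"

    int main(int argc, char** argv) {
        if (argc != 2) {
            std::cerr << "usage: standalone_repair <db-directory>" << std::endl;
            return 1;
        }
        leveldb::Options options;                        // defaults are fine for a repair pass
        leveldb::Status s = leveldb::RepairDB(argv[1], options);
        std::cout << (s.ok() ? "repair ok" : s.ToString()) << std::endl;
        return s.ok() ? 0 : 1;
    }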

Two known issues might have caused the original corruption:

https://github.com/basho/leveldb/wiki/mv-verify-compactions 

or 

https://github.com/basho/leveldb/wiki/mv-async-close


Also, there is a bug in the 532bb commit that can cause level-0 files to be 
wrongly deleted.  I suggest pulling the latest to get rid of that bug.

Matthew


On Jun 19, 2014, at 12:24 PM, Theo Schlossnagle  wrote:

> I'm using basho/leveldb as of commit: 
> 532bb6351e7835e862c8508520780bfc9d0c2b78 (no snappy)
> 
> I have an issue with some small sized database... they claim corruption, but 
> when running a repair I have a nonsensical amount of sst's moved into "lost"
> 
> Worse, the files moved to "lost" have very very old creation times (in 
> months).  The system has been restarted successfully many times in the 
> interim, leading to more confusion.
> 
> I'm looking for pointers to get to the bottom of this.  It isn't critical 
> that I recover this data, but it is critical that I don't see this manifest 
> when it does matter.
> 
> Before repair:
> 
> 40960   108441.log
> 1   CURRENT
> 0   LOCK
> 3   LOG
> 3   LOG.old
> 92  MANIFEST-108439
> 469 sst_0
> 1   sst_1
> 188185  sst_2
> 4189331 sst_3
> 6343660 sst_4
> 1   sst_5
> 1   sst_6
> 
> After repair:
> 
> 40960   108452.log
> 1   CURRENT
> 0   LOCK
> 3   LOG
> 3   LOG.old
> 28  MANIFEST-108450
> 8221971 lost
> 2362sst_0
> 1   sst_1
> 188185  sst_2
> 2352071 sst_3
> 1   sst_4
> 1   sst_5
> 1   sst_6
> 
> 
> 
> -- 
> Theo Schlossnagle
> 
> http://omniti.com/is/theo-schlossnagle
> 
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Performance slows down with write heavy use

2014-04-15 Thread Matthew Von-Maszewski
Jie,

A colleague points out that I typed "40" in one place and "50" in another.  
"40" is the appropriate setting.  

Matthew


On Apr 15, 2014, at 8:41 AM, Matthew Von-Maszewski  wrote:

> Jie,
> 
> 4G ram is small for your ring size and node count.  My first recommendation 
> is that you reduce your ring size to 32 {ring_creation_size, 32}.  Then 
> change the max_open_files setting of eleveldb to 40 {max_open_files, 50} and 
> block_size to 32768 {sst_block_size, 32768}.  These three settings should 
> quickly improve performance … but require you start over with your cluster.
> 
> Active Anti-Entropy has a bug in 1.4.7.  You might as well disable it too:  
> {anti_entropy, {off, []}}.  The recommendation is that you upgrade to 1.4.8.
> 
> Matthew
> 
> 
> 
> 
> On Apr 14, 2014, at 10:10 PM, Jie Lu  wrote:
> 
>> 
>> I also have a problem in performance test of Riak Cluster.
>> 
>> Riak version: 1.4.7
>> OS: openSUSE 11.3
>> RAM: 4G
>> ring size is 64
>> backend: leveldb
>> Nodes in cluster: 6 nodes
>> 
>> ~
>> 
>> I write a key/value with value is 1K bytes, and 25 concurrent  threads on 
>> one client nodes. The test result only 20 ops/s performance. 
>> 
>> Is there any performance benchmark to compare with?
>> 
>> 
>> 
>> 
>> 
>> 
>> On Tue, Apr 15, 2014 at 6:26 AM, Luke Bakken  wrote:
>> Hi Matthew -
>> 
>> Some suggestions:
>> 
>> * Upgrade to Riak 1.4.8
>> 
>> * Test with a ring size of 64
>> 
>> * Use staggered merge windows in your cluster 
>> (http://docs.basho.com/riak/latest/ops/advanced/backends/bitcask/)
>> 
>> * Since you're on dedicated hardware RAID, use the noop scheduler for your 
>> Riak data volumes:
>> 
>> cat /sys/block/sd*/queue/scheduler
>> noop anticipatory deadline [cfq]
>> 
>> * Increase +zdbbl in /etc/riak/vm.args to 96000
>> 
>> Thanks
>> --
>> Luke Bakken
>> CSE
>> lbak...@basho.com
>> 
>> 
>> On Mon, Apr 14, 2014 at 2:33 PM, Matthew MacClary 
>>  wrote:
>> I have a persistent issue I am trying to diagnose. In our use of Riak we 
>> have multiple data creators writing into a 7 node cluster. The value size is 
>> a bit large at around 2MB. The behavior I am seeing is that if I delete all 
>> data out of bitcask, then test performance I get fast writes. As I keep 
>> doing the same work of writing to the cluster, then the Riak write times 
>> will start tailing off and getting really bad.
>> 
>> Initial write times seen by my application: 0.5 seconds for 100MB worth of 
>> values (~200MB/s)
>> Subsequent write times: 11 seconds for 100MB worth of values (~9MB/s)
>> 
>> This slow down can happen over roughly 20-40 minutes of writing or about 
>> 200GB worth of key/value pairs written.
>> 
>> I can reset the cluster to get fast performance again by stopping Riak and 
>> deleting the bitcask directories, then starting Riak again. This step is not 
>> feasible for production, but during testing at least the write speed goes up 
>> by 20x.
>> 
>> Watching iostat I see that every few seconds the disk io jumps to ~11%. It 
>> doesn't seem that highly loaded from my cursory look. Watching top I see 
>> that beam.smp runs at around 100 for CPU% or less when heavily loaded. I am 
>> not sure how to tell what it is doing though :-)
>> 
>> Thanks for any suggestions!!
>> 
>> -Matt
>> 
>> 
>> 
>> 
>> System Description
>> 
>> 
>> avg value size = 2MB
>> Riak version = 1.4.1
>> n_val = 2
>> client threads total = 105
>> backend = bitcask
>> ring_creation_size = 128
>> node count = 7
>> node OS = RHEL 6.2
>> server RAM = 128GB
>> RAID = RAID0 across 8 SAS drives
>> FS = ext4
>> FS options = /dev/mapper/vg0-lv0 / ext4 
>> rw,noatime,barrier=0,stripe=512,data=ordered 0 0
>> bitcask size on one server = 133GB
>> AAE = off
>> interface = protobuf
>> client library = riak java client
>> file-max = 65536
>> 
>> ___
>> riak-users mailing list
>> riak-users@lists.basho.com
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>> 
>> 
>> 
>> ___
>> riak-users mailing list
>> riak-users@lists.basho.com
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>> 
>> 
>> 
>> 
>> -- 
>> Best Regards.
>> Lu Jie
>> 
>> 
>> 
>> -- 
>> Best Regards.
>> Lu Jie
>> ___
>> riak-users mailing list
>> riak-users@lists.basho.com
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> 

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Performance slows down with write heavy use

2014-04-15 Thread Matthew Von-Maszewski
Jie,

4G ram is small for your ring size and node count.  My first recommendation is 
that you reduce your ring size to 32 {ring_creation_size, 32}.  Then change the 
max_open_files setting of eleveldb to 40 {max_open_files, 50} and block_size to 
32768 {sst_block_size, 32768}.  These three settings should quickly improve 
performance … but require you start over with your cluster.

Active Anti-Entropy has a bug in 1.4.7.  You might as well disable it too:  
{anti_entropy, {off, []}}.  The recommendation is that you upgrade to 1.4.8.

Matthew




On Apr 14, 2014, at 10:10 PM, Jie Lu  wrote:

> 
> I also have a problem in performance test of Riak Cluster.
> 
> Riak version: 1.4.7
> OS: openSUSE 11.3
> RAM: 4G
> ring size is 64
> backend: leveldb
> Nodes in cluster: 6 nodes
> 
> ~
> 
> I write a key/value with value is 1K bytes, and 25 concurrent  threads on one 
> client nodes. The test result only 20 ops/s performance. 
> 
> Is there any performance benchmark to compare with?
> 
> 
> 
> 
> 
> 
> On Tue, Apr 15, 2014 at 6:26 AM, Luke Bakken  wrote:
> Hi Matthew -
> 
> Some suggestions:
> 
> * Upgrade to Riak 1.4.8
> 
> * Test with a ring size of 64
> 
> * Use staggered merge windows in your cluster 
> (http://docs.basho.com/riak/latest/ops/advanced/backends/bitcask/)
> 
> * Since you're on dedicated hardware RAID, use the noop scheduler for your 
> Riak data volumes:
> 
> cat /sys/block/sd*/queue/scheduler
> noop anticipatory deadline [cfq]
> 
> * Increase +zdbbl in /etc/riak/vm.args to 96000
> 
> Thanks
> --
> Luke Bakken
> CSE
> lbak...@basho.com
> 
> 
> On Mon, Apr 14, 2014 at 2:33 PM, Matthew MacClary 
>  wrote:
> I have a persistent issue I am trying to diagnose. In our use of Riak we have 
> multiple data creators writing into a 7 node cluster. The value size is a bit 
> large at around 2MB. The behavior I am seeing is that if I delete all data 
> out of bitcask, then test performance I get fast writes. As I keep doing the 
> same work of writing to the cluster, then the Riak write times will start 
> tailing off and getting really bad.
> 
> Initial write times seen by my application: 0.5 seconds for 100MB worth of 
> values (~200MB/s)
> Subsequent write times: 11 seconds for 100MB worth of values (~9MB/s)
> 
> This slow down can happen over roughly 20-40 minutes of writing or about 
> 200GB worth of key/value pairs written.
> 
> I can reset the cluster to get fast performance again by stopping Riak and 
> deleting the bitcask directories, then starting Riak again. This step is not 
> feasible for production, but during testing at least the write speed goes up 
> by 20x.
> 
> Watching iostat I see that every few seconds the disk io jumps to ~11%. It 
> doesn't seem that highly loaded from my cursory look. Watching top I see that 
> beam.smp runs at around 100 for CPU% or less when heavily loaded. I am not 
> sure how to tell what it is doing though :-)
> 
> Thanks for any suggestions!!
> 
> -Matt
> 
> 
> 
> 
> System Description
> 
> 
> avg value size = 2MB
> Riak version = 1.4.1
> n_val = 2
> client threads total = 105
> backend = bitcask
> ring_creation_size = 128
> node count = 7
> node OS = RHEL 6.2
> server RAM = 128GB
> RAID = RAID0 across 8 SAS drives
> FS = ext4
> FS options = /dev/mapper/vg0-lv0 / ext4 
> rw,noatime,barrier=0,stripe=512,data=ordered 0 0
> bitcask size on one server = 133GB
> AAE = off
> interface = protobuf
> client library = riak java client
> file-max = 65536
> 
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> 
> 
> 
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> 
> 
> 
> 
> -- 
> Best Regards.
> Lu Jie
> 
> 
> 
> -- 
> Best Regards.
> Lu Jie
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: RIAK 1.4.6 - Mass key deletion

2014-04-10 Thread Matthew Von-Maszewski
Yes, you can send the AAE (active anti-entropy) data to a different disk.  
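
If it helps, the change amounts to one riak_kv entry in app.config, something like
the following (the path is illustrative; point it at the mount for the slower disk,
then restart the node):

    {riak_kv, [
        %% only the AAE hash trees move; the leveldb data_root stays where it is
        {anti_entropy_data_dir, "/mnt/spinning/riak/anti_entropy"}
        %% ... other riak_kv settings unchanged ...
    ]},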

AAE calculates a hash each time you PUT new data to the regular database.  AAE 
then buffers around 1,000 hashes (I forget the exact value) to write as a block 
to the AAE database.  The AAE write is NOT in series with the user database 
writes.  Your throughput should not be impacted.  But this is not something I 
have personally measured/validated.

Matthew


On Apr 10, 2014, at 7:33 AM, Edgar Veiga  wrote:

> Hi Matthew!
> 
> I have a possibility of moving the data of anti-entropy directory to a 
> mechanic disk 7200, that exists on each of the machines. I was thinking of 
> changing the anti_entropy data dir config in app.config file and restart the 
> riak process.
> 
> Is there any problem using a mechanic disk to store the anti-entropy data?
> 
> Best regards!
> 
> 
> On 8 April 2014 23:58, Edgar Veiga  wrote:
> I'll wait a few more days, see if the AAE maybe "stabilises" and only after 
> that make a decision regarding this.
> The cluster expanding was on the roadmap, but not right now :)
> 
> I've attached a few screenshot, you can clearly observe  the evolution of one 
> of the machines after the anti-entropy data removal and consequent restart  
> (5th of April).
> 
> https://cloudup.com/cB0a15lCMeS
> 
> Best regards!
> 
> 
> On 8 April 2014 23:44, Matthew Von-Maszewski  wrote:
> No.  I do not see a problem with your plan.  But ...
> 
> I would prefer to see you add servers to your cluster.  Scalability is one of 
> Riak's fundamental characteristics.  As your database needs grow, we grow 
> with you … just add another server and migrate some of the vnodes there.
> 
> I obviously cannot speak to your budgetary constraints.  All of the engineers 
> at Basho, I am just one, are focused upon providing you performance and 
> features along with your scalability needs.  This seems to be a situation 
> where you might be sacrificing data integrity where another server or two 
> would address the situation.
> 
> And if 2.0 makes things better … sell the extra servers on Ebay.
> 
> Matthew
> 
> 
> On Apr 8, 2014, at 6:31 PM, Edgar Veiga  wrote:
> 
>> Thanks Matthew!
>> 
>> Today this situation has become unsustainable, In two of the machines I have 
>> an anti-entropy dir of 250G... It just keeps growing and growing and I'm 
>> almost reaching max size of the disks.
>> 
>> Maybe I'll just turn off aae in the cluster, remove all the data in the 
>> anti-entropy directory and wait for the v2 of riak. Do you see any problem 
>> with this?
>> 
>> Best regards!
>> 
>> 
>> On 8 April 2014 22:11, Matthew Von-Maszewski  wrote:
>> Edgar,
>> 
>> Today we disclosed a new feature for Riak's leveldb, Tiered Storage.  The 
>> details are here:
>> 
>> https://github.com/basho/leveldb/wiki/mv-tiered-options
>> 
>> This feature might give you another option in managing your storage volume. 
>> 
>> 
>> Matthew
>> 
>>> On Apr 8, 2014, at 11:07 AM, Edgar Veiga  wrote:
>>> 
>>>> It makes sense, I do a lot, and I really mean a LOT of updates per key, 
>>>> maybe thousands a day! The cluster is experiencing a lot more updates per 
>>>> each key, than new keys being inserted.
>>>> 
>>>> The hash trees will rebuild during the next weekend (normally it takes 
>>>> about two days to complete the operation) so I'll come back and give you 
>>>> some feedback (hopefully good) on the next Monday!
>>>> 
>>>> Again, thanks a lot, You've been very helpful.
>>>> Edgar
>>>> 
>>>> 
>>>> On 8 April 2014 15:47, Matthew Von-Maszewski  wrote:
>>>> Edgar,
>>>> 
>>>> The test I have running currently has reach 1 Billion keys.  It is running 
>>>> against a single node with N=1.  It has 42G of AAE data.  Here is my 
>>>> extrapolation to compare your numbers:
>>>> 
>>>> You have ~2.5 Billion keys.  I assume you are running N=3 (the default).  
>>>> AAE therefore is actually tracking ~7.5 Billion keys.  You have six nodes, 
>>>> therefore tracking ~1.25 Billion keys per node.
>>>> 
>>>> Raw math would suggest that my 42G of AAE data for 1 billion keys would 
>>>> extrapolate to 52.5G of AAE data for you.  Yet you have ~120G of AAE data. 
>>>>  Is something wrong?  No.  My data is still loading and has experience 
>>>> zero key/value updates/edits.
>>>> 
>>>> AAE hashes get

Re: RIAK 1.4.6 - Mass key deletion

2014-04-08 Thread Matthew Von-Maszewski
No.  I do not see a problem with your plan.  But ...

I would prefer to see you add servers to your cluster.  Scalability is one of 
Riak's fundamental characteristics.  As your database needs grow, we grow with 
you … just add another server and migrate some of the vnodes there.
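
The mechanics are the usual cluster-join sequence, run from the new machine once
Riak is installed and configured there (node name illustrative):

    riak-admin cluster join riak@existing-node.example.com
    riak-admin cluster plan      # review the proposed ownership transfers
    riak-admin cluster commit    # apply the plan; handoff migrates the vnodes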

I obviously cannot speak to your budgetary constraints.  All of the engineers 
at Basho (I am just one of them) are focused on providing you performance and 
features along with your scalability needs.  This seems to be a situation where 
you might be sacrificing data integrity when another server or two would 
address the situation.

And if 2.0 makes things better … sell the extra servers on Ebay.

Matthew


On Apr 8, 2014, at 6:31 PM, Edgar Veiga  wrote:

> Thanks Matthew!
> 
> Today this situation has become unsustainable, In two of the machines I have 
> an anti-entropy dir of 250G... It just keeps growing and growing and I'm 
> almost reaching max size of the disks.
> 
> Maybe I'll just turn off aae in the cluster, remove all the data in the 
> anti-entropy directory and wait for the v2 of riak. Do you see any problem 
> with this?
> 
> Best regards!
> 
> 
> On 8 April 2014 22:11, Matthew Von-Maszewski  wrote:
> Edgar,
> 
> Today we disclosed a new feature for Riak's leveldb, Tiered Storage.  The 
> details are here:
> 
> https://github.com/basho/leveldb/wiki/mv-tiered-options
> 
> This feature might give you another option in managing your storage volume. 
> 
> 
> Matthew
> 
>> On Apr 8, 2014, at 11:07 AM, Edgar Veiga  wrote:
>> 
>>> It makes sense, I do a lot, and I really mean a LOT of updates per key, 
>>> maybe thousands a day! The cluster is experiencing a lot more updates per 
>>> each key, than new keys being inserted.
>>> 
>>> The hash trees will rebuild during the next weekend (normally it takes 
>>> about two days to complete the operation) so I'll come back and give you 
>>> some feedback (hopefully good) on the next Monday!
>>> 
>>> Again, thanks a lot, You've been very helpful.
>>> Edgar
>>> 
>>> 
>>> On 8 April 2014 15:47, Matthew Von-Maszewski  wrote:
>>> Edgar,
>>> 
>>> The test I have running currently has reach 1 Billion keys.  It is running 
>>> against a single node with N=1.  It has 42G of AAE data.  Here is my 
>>> extrapolation to compare your numbers:
>>> 
>>> You have ~2.5 Billion keys.  I assume you are running N=3 (the default).  
>>> AAE therefore is actually tracking ~7.5 Billion keys.  You have six nodes, 
>>> therefore tracking ~1.25 Billion keys per node.
>>> 
>>> Raw math would suggest that my 42G of AAE data for 1 billion keys would 
>>> extrapolate to 52.5G of AAE data for you.  Yet you have ~120G of AAE data.  
>>> Is something wrong?  No.  My data is still loading and has experience zero 
>>> key/value updates/edits.
>>> 
>>> AAE hashes get rewritten every time a user updates the value of a key.  
>>> AAE's leveldb is just like the user leveldb, all prior values of a key 
>>> accumulate in the .sst table files until compaction removes duplicates.  
>>> Similarly, a user delete of a key causes a delete tombstone in the AAE hash 
>>> tree.  Those delete tombstones have to await compactions too before leveldb 
>>> recovers the disk space.
>>> 
>>> AAE's hash trees rebuild weekly.  I am told that the rebuild operation will 
>>> actually destroy the existing files and start over.  That is when you 
>>> should see AAE space usage dropping dramatically.
>>> 
>>> Matthew
>>> 
>>> 
>>> On Apr 8, 2014, at 9:31 AM, Edgar Veiga  wrote:
>>> 
>>>> Thanks a lot Matthew!
>>>> 
>>>> A little bit of more info, I've gathered a sample of the contents of 
>>>> anti-entropy data of one of my machines:
>>>> - 44 folders with the name equal to the name of the folders in level-db 
>>>> dir (i.e. 393920363186844927172086927568060657641638068224/)
>>>> - each folder has a 5 files (log, current, log, etc) and 5 sst_* folders.
>>>> - The biggest sst folder is sst_3 with 4.3G
>>>> - Inside sst_3 folder there are 1219 files name 00.sst.
>>>> - Each of the 00*.sst files has ~3.7M
>>>> 
>>>> Hope this info gives you some more help! 
>>>> 
>>>> Best regards, and again, thanks a lot
>>>> Edgar
>>>> 
>>>> 
>>>> On 8 April 2014 13:24, Matthew Von-Maszewski  wrote:
>>>> Argh

Re: RIAK 1.4.6 - Mass key deletion

2014-04-08 Thread Matthew Von-Maszewski
Edgar,

Today we disclosed a new feature for Riak's leveldb, Tiered Storage.  The 
details are here:

https://github.com/basho/leveldb/wiki/mv-tiered-options

This feature might give you another option in managing your storage volume. 

Matthew

> On Apr 8, 2014, at 11:07 AM, Edgar Veiga  wrote:
> 
>> It makes sense, I do a lot, and I really mean a LOT of updates per key, 
>> maybe thousands a day! The cluster is experiencing a lot more updates per 
>> each key, than new keys being inserted.
>> 
>> The hash trees will rebuild during the next weekend (normally it takes about 
>> two days to complete the operation) so I'll come back and give you some 
>> feedback (hopefully good) on the next Monday!
>> 
>> Again, thanks a lot, You've been very helpful.
>> Edgar
>> 
>> 
>> On 8 April 2014 15:47, Matthew Von-Maszewski  wrote:
>> Edgar,
>> 
>> The test I have running currently has reach 1 Billion keys.  It is running 
>> against a single node with N=1.  It has 42G of AAE data.  Here is my 
>> extrapolation to compare your numbers:
>> 
>> You have ~2.5 Billion keys.  I assume you are running N=3 (the default).  
>> AAE therefore is actually tracking ~7.5 Billion keys.  You have six nodes, 
>> therefore tracking ~1.25 Billion keys per node.
>> 
>> Raw math would suggest that my 42G of AAE data for 1 billion keys would 
>> extrapolate to 52.5G of AAE data for you.  Yet you have ~120G of AAE data.  
>> Is something wrong?  No.  My data is still loading and has experience zero 
>> key/value updates/edits.
>> 
>> AAE hashes get rewritten every time a user updates the value of a key.  
>> AAE's leveldb is just like the user leveldb, all prior values of a key 
>> accumulate in the .sst table files until compaction removes duplicates.  
>> Similarly, a user delete of a key causes a delete tombstone in the AAE hash 
>> tree.  Those delete tombstones have to await compactions too before leveldb 
>> recovers the disk space.
>> 
>> AAE's hash trees rebuild weekly.  I am told that the rebuild operation will 
>> actually destroy the existing files and start over.  That is when you should 
>> see AAE space usage dropping dramatically.
>> 
>> Matthew
>> 
>> 
>> On Apr 8, 2014, at 9:31 AM, Edgar Veiga  wrote:
>> 
>>> Thanks a lot Matthew!
>>> 
>>> A little bit of more info, I've gathered a sample of the contents of 
>>> anti-entropy data of one of my machines:
>>> - 44 folders with the name equal to the name of the folders in level-db dir 
>>> (i.e. 393920363186844927172086927568060657641638068224/)
>>> - each folder has a 5 files (log, current, log, etc) and 5 sst_* folders.
>>> - The biggest sst folder is sst_3 with 4.3G
>>> - Inside sst_3 folder there are 1219 files name 00.sst.
>>> - Each of the 00*.sst files has ~3.7M
>>> 
>>> Hope this info gives you some more help! 
>>> 
>>> Best regards, and again, thanks a lot
>>> Edgar
>>> 
>>> 
>>> On 8 April 2014 13:24, Matthew Von-Maszewski  wrote:
>>> Argh. Missed where you said you had upgraded. Ok it will proceed with 
>>> getting you comparison numbers. 
>>> 
>>> Sent from my iPhone
>>> 
>>> On Apr 8, 2014, at 6:51 AM, Edgar Veiga  wrote:
>>> 
>>>> Thanks again Matthew, you've been very helpful!
>>>> 
>>>> Maybe you can give me some kind of advise on this issue I'm having since 
>>>> I've upgraded to 1.4.8.
>>>> 
>>>> Since I've upgraded my anti-entropy data has been growing a lot and has 
>>>> only stabilised in very high values... Write now my cluster has 6 machines 
>>>> each one with ~120G of anti-entropy data and 600G of level-db data. This 
>>>> seems to be quite a lot no? My total amount of keys is ~2.5 Billions.
>>>> 
>>>> Best regards,
>>>> Edgar
>>>> 
>>>> On 6 April 2014 23:30, Matthew Von-Maszewski  wrote:
>>>> Edgar,
>>>> 
>>>> This is indirectly related to you key deletion discussion.  I made changes 
>>>> recently to the aggressive delete code.  The second section of the 
>>>> following (updated) web page discusses the adjustments:
>>>> 
>>>> https://github.com/basho/leveldb/wiki/Mv-aggressive-delete
>>>> 
>>>> Matthew
>>>> 
>>>> 
>>>> On Apr 6, 2014, at 4:29 PM, Edgar Veiga  wrote:
>>>> 
>

Re: RIAK 1.4.6 - Mass key deletion

2014-04-08 Thread Matthew Von-Maszewski
Edgar,

The test I have running currently has reached 1 Billion keys.  It is running 
against a single node with N=1.  It has 42G of AAE data.  Here is my 
extrapolation to compare your numbers:

You have ~2.5 Billion keys.  I assume you are running N=3 (the default).  AAE 
therefore is actually tracking ~7.5 Billion keys.  You have six nodes, 
therefore tracking ~1.25 Billion keys per node.

Raw math would suggest that my 42G of AAE data for 1 billion keys would 
extrapolate to 52.5G of AAE data for you.  Yet you have ~120G of AAE data.  Is 
something wrong?  No.  My data is still loading and has experienced zero 
key/value updates/edits.

AAE hashes get rewritten every time a user updates the value of a key.  AAE's 
leveldb is just like the user leveldb, all prior values of a key accumulate in 
the .sst table files until compaction removes duplicates.  Similarly, a user 
delete of a key causes a delete tombstone in the AAE hash tree.  Those delete 
tombstones have to await compactions too before leveldb recovers the disk space.

AAE's hash trees rebuild weekly.  I am told that the rebuild operation will 
actually destroy the existing files and start over.  That is when you should 
see AAE space usage dropping dramatically.

Matthew


On Apr 8, 2014, at 9:31 AM, Edgar Veiga  wrote:

> Thanks a lot Matthew!
> 
> A little bit of more info, I've gathered a sample of the contents of 
> anti-entropy data of one of my machines:
> - 44 folders with the name equal to the name of the folders in level-db dir 
> (i.e. 393920363186844927172086927568060657641638068224/)
> - each folder has a 5 files (log, current, log, etc) and 5 sst_* folders.
> - The biggest sst folder is sst_3 with 4.3G
> - Inside sst_3 folder there are 1219 files name 00.sst.
> - Each of the 00*.sst files has ~3.7M
> 
> Hope this info gives you some more help! 
> 
> Best regards, and again, thanks a lot
> Edgar
> 
> 
> On 8 April 2014 13:24, Matthew Von-Maszewski  wrote:
> Argh. Missed where you said you had upgraded. Ok it will proceed with getting 
> you comparison numbers. 
> 
> Sent from my iPhone
> 
> On Apr 8, 2014, at 6:51 AM, Edgar Veiga  wrote:
> 
>> Thanks again Matthew, you've been very helpful!
>> 
>> Maybe you can give me some kind of advise on this issue I'm having since 
>> I've upgraded to 1.4.8.
>> 
>> Since I've upgraded my anti-entropy data has been growing a lot and has only 
>> stabilised in very high values... Write now my cluster has 6 machines each 
>> one with ~120G of anti-entropy data and 600G of level-db data. This seems to 
>> be quite a lot no? My total amount of keys is ~2.5 Billions.
>> 
>> Best regards,
>> Edgar
>> 
>> On 6 April 2014 23:30, Matthew Von-Maszewski  wrote:
>> Edgar,
>> 
>> This is indirectly related to you key deletion discussion.  I made changes 
>> recently to the aggressive delete code.  The second section of the following 
>> (updated) web page discusses the adjustments:
>> 
>> https://github.com/basho/leveldb/wiki/Mv-aggressive-delete
>> 
>> Matthew
>> 
>> 
>> On Apr 6, 2014, at 4:29 PM, Edgar Veiga  wrote:
>> 
>>> Matthew, thanks again for the response!
>>> 
>>> That said, I'll wait again for the 2.0 (and maybe buy some bigger disks :)
>>> 
>>> Best regards
>>> 
>>> 
>>> On 6 April 2014 15:02, Matthew Von-Maszewski  wrote:
>>> Edgar,
>>> 
>>> In Riak 1.4, there is no advantage to using empty values versus deleting.
>>> 
>>> leveldb is a "write once" data store.  New data for a given key never 
>>> physically overwrites old data for the same key.  New data "hides" the old 
>>> data by being in a lower level, and therefore picked first.
>>> 
>>> leveldb's compaction operation will remove older key/value pairs only when 
>>> the newer key/value is pair is part of a compaction involving both new and 
>>> old.  The new and the old key/value pairs must have migrated to adjacent 
>>> levels through normal compaction operations before leveldb will see them in 
>>> the same compaction.  The migration could take days, weeks, or even months 
>>> depending upon the size of your entire dataset and the rate of incoming 
>>> write operations.
>>> 
>>> leveldb's "delete" object is exactly the same as your empty JSON object.  
>>> The delete object simply has one more flag set that allows it to also be 
>>> removed if and only if there is no chance for an identical key to exist on 
>>> a higher level.
>>> 
>>> I apologize

Re: RIAK 1.4.6 - Mass key deletion

2014-04-08 Thread Matthew Von-Maszewski
Argh. Missed where you said you had upgraded. OK, I will proceed with getting 
you comparison numbers. 

Sent from my iPhone

> On Apr 8, 2014, at 6:51 AM, Edgar Veiga  wrote:
> 
> Thanks again Matthew, you've been very helpful!
> 
> Maybe you can give me some kind of advise on this issue I'm having since I've 
> upgraded to 1.4.8.
> 
> Since I've upgraded my anti-entropy data has been growing a lot and has only 
> stabilised in very high values... Write now my cluster has 6 machines each 
> one with ~120G of anti-entropy data and 600G of level-db data. This seems to 
> be quite a lot no? My total amount of keys is ~2.5 Billions.
> 
> Best regards,
> Edgar
> 
>> On 6 April 2014 23:30, Matthew Von-Maszewski  wrote:
>> Edgar,
>> 
>> This is indirectly related to you key deletion discussion.  I made changes 
>> recently to the aggressive delete code.  The second section of the following 
>> (updated) web page discusses the adjustments:
>> 
>> https://github.com/basho/leveldb/wiki/Mv-aggressive-delete
>> 
>> Matthew
>> 
>> 
>>> On Apr 6, 2014, at 4:29 PM, Edgar Veiga  wrote:
>>> 
>>> Matthew, thanks again for the response!
>>> 
>>> That said, I'll wait again for the 2.0 (and maybe buy some bigger disks :)
>>> 
>>> Best regards
>>> 
>>> 
>>>> On 6 April 2014 15:02, Matthew Von-Maszewski  wrote:
>>>> Edgar,
>>>> 
>>>> In Riak 1.4, there is no advantage to using empty values versus deleting.
>>>> 
>>>> leveldb is a "write once" data store.  New data for a given key never 
>>>> physically overwrites old data for the same key.  New data "hides" the old 
>>>> data by being in a lower level, and therefore picked first.
>>>> 
>>>> leveldb's compaction operation will remove older key/value pairs only when 
>>>> the newer key/value is pair is part of a compaction involving both new and 
>>>> old.  The new and the old key/value pairs must have migrated to adjacent 
>>>> levels through normal compaction operations before leveldb will see them 
>>>> in the same compaction.  The migration could take days, weeks, or even 
>>>> months depending upon the size of your entire dataset and the rate of 
>>>> incoming write operations.
>>>> 
>>>> leveldb's "delete" object is exactly the same as your empty JSON object.  
>>>> The delete object simply has one more flag set that allows it to also be 
>>>> removed if and only if there is no chance for an identical key to exist on 
>>>> a higher level.
>>>> 
>>>> I apologize that I cannot give you a more useful answer.  2.0 is on the 
>>>> horizon.
>>>> 
>>>> Matthew
>>>> 
>>>> 
>>>>> On Apr 6, 2014, at 7:04 AM, Edgar Veiga  wrote:
>>>>> 
>>>>> Hi again!
>>>>> 
>>>>> Sorry to reopen this discussion, but I have another question regarding 
>>>>> the former post.
>>>>> 
>>>>> What if, instead of doing a mass deletion (We've already seen that it 
>>>>> will be non profitable, regarding disk space) I update all the values 
>>>>> with an empty JSON object "{}" ? Do you see any problem with this? I no 
>>>>> longer need those millions of values that are living in the cluster... 
>>>>> 
>>>>> When the version 2.0 of riak runs stable I'll do the update and only then 
>>>>> delete those keys!
>>>>> 
>>>>> Best regards
>>>>> 
>>>>> 
>>>>>> On 18 February 2014 16:32, Edgar Veiga  wrote:
>>>>>> Ok, thanks a lot Matthew.
>>>>>> 
>>>>>> 
>>>>>>> On 18 February 2014 16:18, Matthew Von-Maszewski  
>>>>>>> wrote:
>>>>>>> Riak 2.0 is coming.  Hold your mass delete until then.  The "bug" is 
>>>>>>> within Google's original leveldb architecture.  Riak 2.0 sneaks around 
>>>>>>> to get the disk space freed.
>>>>>>> 
>>>>>>> Matthew
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>>> On Feb 18, 2014, at 11:10 AM, Edgar Veiga  
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>

Re: RIAK 1.4.6 - Mass key deletion

2014-04-08 Thread Matthew Von-Maszewski
AAE is broken and brain dead in releases 1.4.3 through 1.4.7.  That might be 
your problem. 

I have a two billion key data set building now. I will forward node disk usage 
when available. 

Matthew

Sent from my iPhone

> On Apr 8, 2014, at 6:51 AM, Edgar Veiga  wrote:
> 
> Thanks again Matthew, you've been very helpful!
> 
> Maybe you can give me some kind of advise on this issue I'm having since I've 
> upgraded to 1.4.8.
> 
> Since I've upgraded my anti-entropy data has been growing a lot and has only 
> stabilised in very high values... Write now my cluster has 6 machines each 
> one with ~120G of anti-entropy data and 600G of level-db data. This seems to 
> be quite a lot no? My total amount of keys is ~2.5 Billions.
> 
> Best regards,
> Edgar
> 
>> On 6 April 2014 23:30, Matthew Von-Maszewski  wrote:
>> Edgar,
>> 
>> This is indirectly related to you key deletion discussion.  I made changes 
>> recently to the aggressive delete code.  The second section of the following 
>> (updated) web page discusses the adjustments:
>> 
>> https://github.com/basho/leveldb/wiki/Mv-aggressive-delete
>> 
>> Matthew
>> 
>> 
>>> On Apr 6, 2014, at 4:29 PM, Edgar Veiga  wrote:
>>> 
>>> Matthew, thanks again for the response!
>>> 
>>> That said, I'll wait again for the 2.0 (and maybe buy some bigger disks :)
>>> 
>>> Best regards
>>> 
>>> 
>>>> On 6 April 2014 15:02, Matthew Von-Maszewski  wrote:
>>>> Edgar,
>>>> 
>>>> In Riak 1.4, there is no advantage to using empty values versus deleting.
>>>> 
>>>> leveldb is a "write once" data store.  New data for a given key never 
>>>> physically overwrites old data for the same key.  New data "hides" the old 
>>>> data by being in a lower level, and therefore picked first.
>>>> 
>>>> leveldb's compaction operation will remove older key/value pairs only when 
>>>> the newer key/value is pair is part of a compaction involving both new and 
>>>> old.  The new and the old key/value pairs must have migrated to adjacent 
>>>> levels through normal compaction operations before leveldb will see them 
>>>> in the same compaction.  The migration could take days, weeks, or even 
>>>> months depending upon the size of your entire dataset and the rate of 
>>>> incoming write operations.
>>>> 
>>>> leveldb's "delete" object is exactly the same as your empty JSON object.  
>>>> The delete object simply has one more flag set that allows it to also be 
>>>> removed if and only if there is no chance for an identical key to exist on 
>>>> a higher level.
>>>> 
>>>> I apologize that I cannot give you a more useful answer.  2.0 is on the 
>>>> horizon.
>>>> 
>>>> Matthew
>>>> 
>>>> 
>>>>> On Apr 6, 2014, at 7:04 AM, Edgar Veiga  wrote:
>>>>> 
>>>>> Hi again!
>>>>> 
>>>>> Sorry to reopen this discussion, but I have another question regarding 
>>>>> the former post.
>>>>> 
>>>>> What if, instead of doing a mass deletion (We've already seen that it 
>>>>> will be non profitable, regarding disk space) I update all the values 
>>>>> with an empty JSON object "{}" ? Do you see any problem with this? I no 
>>>>> longer need those millions of values that are living in the cluster... 
>>>>> 
>>>>> When the version 2.0 of riak runs stable I'll do the update and only then 
>>>>> delete those keys!
>>>>> 
>>>>> Best regards
>>>>> 
>>>>> 
>>>>>> On 18 February 2014 16:32, Edgar Veiga  wrote:
>>>>>> Ok, thanks a lot Matthew.
>>>>>> 
>>>>>> 
>>>>>>> On 18 February 2014 16:18, Matthew Von-Maszewski  
>>>>>>> wrote:
>>>>>>> Riak 2.0 is coming.  Hold your mass delete until then.  The "bug" is 
>>>>>>> within Google's original leveldb architecture.  Riak 2.0 sneaks around 
>>>>>>> to get the disk space freed.
>>>>>>> 
>>>>>>> Matthew
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>>> On Feb 18, 2014, at 11:10 AM, Edgar Veiga  
>>>

Re: RIAK 1.4.6 - Mass key deletion

2014-04-06 Thread Matthew Von-Maszewski
Edgar,

This is indirectly related to your key deletion discussion.  I made changes 
recently to the aggressive delete code.  The second section of the following 
(updated) web page discusses the adjustments:

https://github.com/basho/leveldb/wiki/Mv-aggressive-delete

Matthew


On Apr 6, 2014, at 4:29 PM, Edgar Veiga  wrote:

> Matthew, thanks again for the response!
> 
> That said, I'll wait again for the 2.0 (and maybe buy some bigger disks :)
> 
> Best regards
> 
> 
> On 6 April 2014 15:02, Matthew Von-Maszewski  wrote:
> Edgar,
> 
> In Riak 1.4, there is no advantage to using empty values versus deleting.
> 
> leveldb is a "write once" data store.  New data for a given key never 
> physically overwrites old data for the same key.  New data "hides" the old 
> data by being in a lower level, and therefore picked first.
> 
> leveldb's compaction operation will remove older key/value pairs only when 
> the newer key/value is pair is part of a compaction involving both new and 
> old.  The new and the old key/value pairs must have migrated to adjacent 
> levels through normal compaction operations before leveldb will see them in 
> the same compaction.  The migration could take days, weeks, or even months 
> depending upon the size of your entire dataset and the rate of incoming write 
> operations.
> 
> leveldb's "delete" object is exactly the same as your empty JSON object.  The 
> delete object simply has one more flag set that allows it to also be removed 
> if and only if there is no chance for an identical key to exist on a higher 
> level.
> 
> I apologize that I cannot give you a more useful answer.  2.0 is on the 
> horizon.
> 
> Matthew
> 
> 
> On Apr 6, 2014, at 7:04 AM, Edgar Veiga  wrote:
> 
>> Hi again!
>> 
>> Sorry to reopen this discussion, but I have another question regarding the 
>> former post.
>> 
>> What if, instead of doing a mass deletion (We've already seen that it will 
>> be non profitable, regarding disk space) I update all the values with an 
>> empty JSON object "{}" ? Do you see any problem with this? I no longer need 
>> those millions of values that are living in the cluster... 
>> 
>> When the version 2.0 of riak runs stable I'll do the update and only then 
>> delete those keys!
>> 
>> Best regards
>> 
>> 
>> On 18 February 2014 16:32, Edgar Veiga  wrote:
>> Ok, thanks a lot Matthew.
>> 
>> 
>> On 18 February 2014 16:18, Matthew Von-Maszewski  wrote:
>> Riak 2.0 is coming.  Hold your mass delete until then.  The "bug" is within 
>> Google's original leveldb architecture.  Riak 2.0 sneaks around to get the 
>> disk space freed.
>> 
>> Matthew
>> 
>> 
>> 
>> On Feb 18, 2014, at 11:10 AM, Edgar Veiga  wrote:
>> 
>>> The only/main purpose is to free disk space..
>>> 
>>> I was a little bit concerned regarding this operation, but now with your 
>>> feedback I'm tending to don't do nothing, I can't risk the growing of 
>>> space... 
>>> Regarding the overhead I think that with a tight throttling system I could 
>>> control and avoid overloading the cluster.
>>> 
>>> Mixed feelings :S
>>> 
>>> 
>>> 
>>> On 18 February 2014 15:45, Matthew Von-Maszewski  wrote:
>>> Edgar,
>>> 
>>> The first "concern" I have is that leveldb's delete does not free disk 
>>> space.  Others have executed mass delete operations only to discover they 
>>> are now using more disk space instead of less.  Here is a discussion of the 
>>> problem:
>>> 
>>> https://github.com/basho/leveldb/wiki/mv-aggressive-delete
>>> 
>>> The link also describes Riak's database operation overhead.  This is a 
>>> second "concern".  You will need to carefully throttle your delete rate or 
>>> the overhead will likely impact your production throughput.
>>> 
>>> We have new code to help quicken the actual purge of deleted data in Riak 
>>> 2.0.  But that release is not quite ready for production usage.
>>> 
>>> 
>>> What do you hope to achieve by the mass delete?
>>> 
>>> Matthew
>>> 
>>> 
>>> 
>>> 
>>> On Feb 18, 2014, at 10:29 AM, Edgar Veiga  wrote:
>>> 
>>>> Sorry, forgot that info!
>>>> 
>>>> It's leveldb.
>>>> 
>>>> Best regards
>>>> 
>>>> 
>>>> On 18 F

Re: RIAK 1.4.6 - Mass key deletion

2014-04-06 Thread Matthew Von-Maszewski
Edgar,

In Riak 1.4, there is no advantage to using empty values versus deleting.

leveldb is a "write once" data store.  New data for a given key never 
physically overwrites old data for the same key.  New data "hides" the old data 
by being in a lower level, and therefore picked first.

leveldb's compaction operation will remove older key/value pairs only when the 
newer key/value pair is part of a compaction involving both new and old.  
The new and the old key/value pairs must have migrated to adjacent levels 
through normal compaction operations before leveldb will see them in the same 
compaction.  The migration could take days, weeks, or even months depending 
upon the size of your entire dataset and the rate of incoming write operations.

leveldb's "delete" object is exactly the same as your empty JSON object.  The 
delete object simply has one more flag set that allows it to also be removed if 
and only if there is no chance for an identical key to exist on a higher level.
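
To make the tombstone behaviour concrete, here is a minimal sketch against the
basho/eleveldb Erlang bindings (the same library Riak embeds); the path, key,
and value are purely illustrative:

    %% Open (or create) a standalone leveldb instance.
    {ok, Ref} = eleveldb:open("/tmp/tombstone_demo", [{create_if_missing, true}]),
    ok = eleveldb:put(Ref, <<"k1">>, <<"{\"old\":true}">>, []),
    %% delete/3 only appends a tombstone record; the earlier value is still
    %% physically present in the .sst files until a compaction at the highest
    %% level drops both the value and the tombstone.
    ok = eleveldb:delete(Ref, <<"k1">>, []),
    not_found = eleveldb:get(Ref, <<"k1">>, []).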

I apologize that I cannot give you a more useful answer.  2.0 is on the horizon.

Matthew


On Apr 6, 2014, at 7:04 AM, Edgar Veiga  wrote:

> Hi again!
> 
> Sorry to reopen this discussion, but I have another question regarding the 
> former post.
> 
> What if, instead of doing a mass deletion (we've already seen that it won't 
> pay off in reclaimed disk space), I update all the values to an empty JSON 
> object "{}"? Do you see any problem with this? I no longer need those 
> millions of values that are living in the cluster... 
> 
> When Riak 2.0 is running stably I'll do the update, and only then delete 
> those keys!
> 
> Best regards
> 
> 
> On 18 February 2014 16:32, Edgar Veiga  wrote:
> Ok, thanks a lot Matthew.
> 
> 
> On 18 February 2014 16:18, Matthew Von-Maszewski  wrote:
> Riak 2.0 is coming.  Hold your mass delete until then.  The "bug" is within 
> Google's original leveldb architecture.  Riak 2.0 sneaks around to get the 
> disk space freed.
> 
> Matthew
> 
> 
> 
> On Feb 18, 2014, at 11:10 AM, Edgar Veiga  wrote:
> 
>> The only/main purpose is to free disk space..
>> 
>> I was a little concerned about this operation, but with your feedback 
>> I'm now inclined not to do anything; I can't risk the disk space 
>> growing... 
>> Regarding the overhead, I think a tight throttling system would let me 
>> control the rate and avoid overloading the cluster.
>> 
>> Mixed feelings :S
>> 
>> 
>> 
>> On 18 February 2014 15:45, Matthew Von-Maszewski  wrote:
>> Edgar,
>> 
>> The first "concern" I have is that leveldb's delete does not free disk 
>> space.  Others have executed mass delete operations only to discover they 
>> are now using more disk space instead of less.  Here is a discussion of the 
>> problem:
>> 
>> https://github.com/basho/leveldb/wiki/mv-aggressive-delete
>> 
>> The link also describes Riak's database operation overhead.  This is a 
>> second "concern".  You will need to carefully throttle your delete rate or 
>> the overhead will likely impact your production throughput.
>> 
>> We have new code to help quicken the actual purge of deleted data in Riak 
>> 2.0.  But that release is not quite ready for production usage.
>> 
>> 
>> What do you hope to achieve by the mass delete?
>> 
>> Matthew
>> 
>> 
>> 
>> 
>> On Feb 18, 2014, at 10:29 AM, Edgar Veiga  wrote:
>> 
>>> Sorry, forgot that info!
>>> 
>>> It's leveldb.
>>> 
>>> Best regards
>>> 
>>> 
>>> On 18 February 2014 15:27, Matthew Von-Maszewski  wrote:
>>> Which Riak backend are you using:  bitcask, leveldb, multi?
>>> 
>>> Matthew
>>> 
>>> 
>>> On Feb 18, 2014, at 10:17 AM, Edgar Veiga  wrote:
>>> 
>>> > Hi all!
>>> >
>>> > I have a fairly trivial question regarding mass deletion on a riak 
>>> > cluster, but firstly let me give you just some context. My cluster is 
>>> > running riak 1.4.6 on 6 machines with a ring size of 256 and 1 TB 
>>> > SSD disks.
>>> >
>>> > I need to execute a massive object deletion on a bucket; I'm talking 
>>> > about ~1 billion keys (average object size ~1 KB). I will not retrieve the 
>>> > keys from riak because I have a file with all of them. I'll just start 
>>> > a script that reads them from the file and triggers an HTTP DELETE for 
>>> > each one.
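
For anyone scripting such a purge, a minimal throttled sketch with the
riak-erlang-client; the host, port, bucket, file format (one key per line),
and the 100 ms pause are all illustrative assumptions, not a recommendation:

    %% Read keys (one per line) from a file and delete them with a pause
    %% between requests so the cluster is not overwhelmed.
    throttled_delete(File, Bucket) ->
        {ok, Pid} = riakc_pb_socket:start_link("127.0.0.1", 8087),
        {ok, Data} = file:read_file(File),
        Keys = binary:split(Data, <<"\n">>, [global, trim]),
        lists:foreach(fun(Key) ->
                              ok = riakc_pb_socket:delete(Pid, Bucket, Key),
                              timer:sleep(100)   %% crude throttle
                      end, Keys),
        riakc_pb_socket:stop(Pid).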

Re: Change data directory

2014-04-04 Thread Matthew Von-Maszewski
Are you using leveldb or bitcask back end?  What is your desired data 
directory?  I will create an example. 
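
In the meantime, a minimal sketch of the relevant /etc/riak/app.config
entries, assuming an absolute directory such as /var/lib/riak and the default
bitcask backend (swap in riak_kv_eleveldb_backend and the eleveldb data_root
if you are on leveldb):

    {riak_core, [
        {platform_data_dir, "/var/lib/riak"},
        {ring_state_dir, "/var/lib/riak/ring"}
    ]},
    {riak_kv, [
        {storage_backend, riak_kv_bitcask_backend}
    ]},
    {bitcask, [
        {data_root, "/var/lib/riak/bitcask"}
    ]},
    {eleveldb, [
        {data_root, "/var/lib/riak/leveldb"}
    ]}

Remember the directory must stay readable and writable by the riak user.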

Sent from my iPhone

> On Apr 4, 2014, at 3:05 PM, Nabil Hassein  wrote:
> 
> Hello all,
> 
> I'm trying to change /etc/riak/app.config to store data in a directory of my 
> choosing rather than the default one. Simply changing platform_data_dir to 
> another directory yields errors, even after a `chown -R riak:riak` of the 
> relevant directory; the service starts but any attempts to use the REST API 
> yield 500 Internal Server Errors. Trying to change other things, such as the 
> ring_state_dir, generally results in riak failing to start at all.
> 
> Does anyone have an example configuration where data is stored in a different 
> directory to the defaults, or advice about where I might be going wrong?
> 
> I'm using a binary install of riak 1.4.8 on CentOS, if that's relevant.
> 
> Thanks,
> Nabil
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: LevelDB parameter planning - max_open_files

2014-04-04 Thread Matthew Von-Maszewski

Copying back to mailing list for others and archive.

On Apr 4, 2014, at 11:37 AM, Oleksiy Krivoshey  wrote:

> Great! 
> 
> I'm trying 2.0 right now and have found that 'total_leveldb_mem' and 
> 'total_leveldb_mem_percent' are really much easier to use and understand. 
> 
> Thanks!
> 
> 
> On 4 April 2014 18:14, Matthew Von-Maszewski  wrote:
> Oleksiy,
> 
> Go to step 6:  "Compare Step 2 and Step 5 …".   There is a link to an Excel 
> spreadsheet at the end of the sentence "The above calculations are automated 
> in this memory model spreadsheet.".   Forget the text and use the spreadsheet 
> (memory model spreadsheet).
> 
> Much of that text is still related to memory management of 1.2 and 1.3.  
> Seems it did not get updated to 1.4.  Hmm, that might be my fault.
> 
> Answers to your comments/questions below:
> 
> 1.  Step 3 on the page is just wrong for 1.4; the estimate is simply:  
> open_file_memory = (max_open_files - 10) * 4194304
> 
> 2.  average_sst_filesize is not relevant with 1.4.  It was used to estimate 
> the size of the bloom filter attached to each .sst file.  There is now a 
> fixed maximum of 150,001 bytes for the bloom filter, and it is the typical 
> size for all files in levels 2 through 6.
> 
> 3.  The page is attempting to estimate the total memory usage of one vnode.  
> The spreadsheet does the same.  Therefore the maximum memory per either model 
> is the "working memory per vnode" in Step 5, or the "working per vnode" line 
> in the spreadsheet, multiplied by the number of vnodes active on the node 
> (server).
> 
> 
> Now let me make a related note / sales pitch.  The upcoming Riak 2.0 
> eliminates all the manual calculations / planning.  You tell Riak what 
> percentage of memory is allocated to leveldb.  leveldb then dynamically 
> adjusts each vnode's allocation as your dataset changes and/or vnodes are 
> moved to and from the node (server).  
> 
> Matthew
> 
> 
> On Apr 4, 2014, at 5:08 AM, Oleksiy Krivoshey  wrote:
> 
>> Can someone please suggest how to understand the formula for 
>> open_file_memory on this page: 
>> http://docs.basho.com/riak/latest/ops/advanced/backends/leveldb/#Parameter-Planning
>> 
>> 1. It definitely lacks some brackets, the correct formula is:
>> 
>> OPEN_FILE_MEMORY =  (max_open_files-10) * (184 + (average_sst_filesize/2048) 
>> * (8+((key_size+value_size)/2048 +1)*0.6))
>> 
>> 
>> 2. How to estimate average_sst_filesize?
>> 
>> 
>> 3. does the result estimate the memory used by a single open file in any 
>> particular vnode? Or by a single vnode with max_open_files open? As 
>> max_open_files is a per vnode parameter then how to estimate the maximum 
>> memory used by leveldb if all vnodes have all max_open_files open? is it 
>> result*ring_size or result*ring_size*max_open_files?
>> 
>> 
>> Thanks!
>> 
>> -- 
>> Oleksiy
>> ___
>> riak-users mailing list
>> riak-users@lists.basho.com
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> 
> 
> 
> 
> -- 
> Oleksiy Krivoshey

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: LevelDB parameter planning - max_open_files

2014-04-04 Thread Matthew Von-Maszewski
Oleksiy,

Go to step 6:  "Compare Step 2 and Step 5 …".   There is a link to an Excel 
spreadsheet at the end of the sentence "The above calculations are automated in 
this memory model spreadsheet.".   Forget the text and use the spreadsheet 
(memory model spreadsheet).

Much of that text is still related to memory management of 1.2 and 1.3.  Seems 
it did not get updated to 1.4.  Hmm, that might be my fault.

Answers to your comments/questions below:

1.  Step 3 on the page is just wrong for 1.4; the estimate is simply:  
open_file_memory = (max_open_files - 10) * 4194304

2.  average_sst_filesize is not relevant with 1.4.  It was used to estimate the 
size of the bloom filter attached to each .sst file.  There is now a fixed 
maximum of 150,001 bytes for the bloom filter, and it is the typical size for 
all files in levels 2 through 6.

3.  The page is attempting to estimate the total memory usage of one vnode.  
The spreadsheet does the same.  Therefore the maximum memory per either model 
is the "working memory per vnode" in Step 5, or the "working per vnode" line in 
the spreadsheet, multiplied by the number of vnodes active on the node (server).


Now let me make a related note / sales pitch.  The upcoming Riak 2.0 eliminates 
all the manual calculations / planning.  You tell Riak what percentage of 
memory is allocated to leveldb.  leveldb then dynamically adjusts each vnode's 
allocation as your dataset changes and/or vnodes are moved to and from the node 
(server).  

Matthew
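
For reference, a minimal sketch of what that looks like in 2.0-style
configuration (app.config / advanced.config); the 70 here is just an
illustrative percentage:

    {eleveldb, [
        %% Let leveldb manage roughly 70% of the machine's RAM across all
        %% vnodes on this node; per-vnode allocation is then automatic.
        {total_leveldb_mem_percent, 70}
    ]}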


On Apr 4, 2014, at 5:08 AM, Oleksiy Krivoshey  wrote:

> Can someone please suggest how to understand the formula for open_file_memory 
> on this page: 
> http://docs.basho.com/riak/latest/ops/advanced/backends/leveldb/#Parameter-Planning
> 
> 1. It definitely lacks some brackets, the correct formula is:
> 
> OPEN_FILE_MEMORY =  (max_open_files-10) * (184 + (average_sst_filesize/2048) 
> * (8+((key_size+value_size)/2048 +1)*0.6))
> 
> 
> 2. How to estimate average_sst_filesize?
> 
> 
> 3. does the result estimate the memory used by a single open file in any 
> particular vnode? Or by a single vnode with max_open_files open? As 
> max_open_files is a per vnode parameter then how to estimate the maximum 
> memory used by leveldb if all vnodes have all max_open_files open? is it 
> result*ring_size or result*ring_size*max_open_files?
> 
> 
> Thanks!
> 
> -- 
> Oleksiy
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: http connections and CLOSE_WAIT

2014-04-03 Thread Matthew Von-Maszewski
Here we go, a better description of SO_LINGER … this rings a bell.  It lets you 
set the time allowed for a clean versus abrupt connection close.  The side 
effect is that your connection structures can get cleaned up more quickly.

Still looking for validation of Squid / Apache using this to manage poorly 
behaving clients.

Matthew
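
As a rough illustration of the knob itself, a minimal Erlang sketch that opens
a raw socket to Riak's HTTP port with SO_LINGER enabled and a zero timeout, so
close/1 tears the connection down immediately instead of draining it; the
host, port, and /ping request are illustrative:

    {ok, Sock} = gen_tcp:connect("127.0.0.1", 8098,
                                 [binary, {active, false},
                                  {linger, {true, 0}}]),
    ok = gen_tcp:send(Sock, <<"GET /ping HTTP/1.1\r\nHost: localhost\r\n\r\n">>),
    {ok, _Resp} = gen_tcp:recv(Sock, 0, 5000),
    ok = gen_tcp:close(Sock).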

On Apr 3, 2014, at 11:10 AM, Sean Cribbs  wrote:

> This is likely something you can tweak with sysctl. The HTTP interface 
> already has SO_REUSEADDR on. This may be of help:
> 
> http://tux.hk/index.php?m=05&y=09&entry=entry090521-111844
> 
> 
> On Thu, Apr 3, 2014 at 10:02 AM, Sean Allen  
> wrote:
> We are using pre11 right now.
> 
> When we open http connections, they hang around for a long time in CLOSE_WAIT, 
> which results in really spiky performance. When the connections close, it's 
> fast, then they build up again.
> 
> Is there something that needs to be configured w/ riak to get it to reuse the 
> sockets and not sit in close wait?
> 
> 
> -- 
> 
> Ce n'est pas une signature
> 
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> 
> 
> 
> 
> -- 
> Sean Cribbs 
> Software Engineer
> Basho Technologies, Inc.
> http://basho.com/
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Storing Photos in riak

2014-03-24 Thread Matthew Von-Maszewski
To date, only leveldb.  Will be expanding the matrix to bitcask this week for 
the first time.  Not sure where memory backend fits with that particular test 
scenario.  Will ask around.

Matthew


On Mar 24, 2014, at 2:38 PM, István  wrote:

> Hi Matthew,
> 
> Just a quick question. Do you test Riak with memory backend with 130k
> objects or just LevelDB?
> 
> Regards,
> Istvan
> 
> On Mon, Mar 24, 2014 at 5:55 AM, Matthew Von-Maszewski
>  wrote:
>> Ingo,
>> 
>> We regularly test Riak and its leveldb backend with 130k objects as part of 
>> our performance and qualification tests.  Works just fine.
>> 
>> The one thing to remember is that with N=3 redundancy, the bandwidth between 
>> the nodes will approach 3 times the incoming user bandwidth.  Make sure your 
>> network infrastructure is ready between your nodes (servers).
>> 
>> The 130K testing is based upon the loads created by one of our enterprise 
>> customers.  You can read details of that use case here:
>> 
>> http://media.basho.com/pdf/Voxer-Case-Study.pdf
>> 
>> Matthew
>> 
>> 
>> On Mar 24, 2014, at 7:22 AM, Ingo Rockel  
>> wrote:
>> 
>>> Hi List,
>>> 
>>> I'm currently evaluating the possibilities to migrate our mysql-based image 
>>> storage to something which actually scales and is easier to maintain.
>>> 
>>> Among different distributed file systems (like ceph, openstack swift) I was 
>>> looking at riak as we already have a riak cluster for storing our users 
>>> messages.
>>> 
>>> As we are only storing quite stripped down versions of our users photos, 
>>> the average size of the photos is only about 30-40kb, there might be some 
>>> above 100KB though. But all are less than 1MB.
>>> 
>>> As our objects are quite small and I don't need anything besides getting 
>>> the binary for a key I want to use riak without riak CS. Is this a good or 
>>> bad idea?
>>> 
>>> The current storage (which needs to be migrated) in MySQL has about 35 
>>> million entries which consume 1.1TB of space.
>>> 
>>> Any ideas or suggestions?
>>> 
>>> Ingo
>>> 
>>> --
>>> Software Architect
>>> 
>>> Blue Lion mobile GmbH
>>> Tel. +49 (0) 221 788 797 14
>>> Fax. +49 (0) 221 788 797 19
>>> Mob. +49 (0) 176 24 87 30 89
>>> 
>>> ingo.roc...@bluelionmobile.com
>>>>>> qeep: Hefferwolf
>>> 
>>> www.bluelionmobile.com
>>> www.qeep.net
>>> 
>>> ___
>>> riak-users mailing list
>>> riak-users@lists.basho.com
>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>> 
>> 
>> ___
>> riak-users mailing list
>> riak-users@lists.basho.com
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> 
> 
> 
> -- 
> the sun shines for all


___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Storing Photos in riak

2014-03-24 Thread Matthew Von-Maszewski
Ingo,

Maximum size involves more components of Riak than my specialty area of 
leveldb.  So I did some asking around the engineering room.  The word back is:

- 1 MB is the typical recommended maximum,
- if you care about worst-case (tail) latencies, stay below 100 KB,
- if you really, really care about latencies, stay below 10 KB.

Matthew


On Mar 24, 2014, at 12:44 PM, Ingo Rockel  
wrote:

> Hi Matthew,
> 
> thanks for the info. What is the maximum size of object that I can store 
> in riak without needing to split it and without suffering from 
> performance issues?
> 
> Best,
> 
>   Ingo
> 
> On 24.03.2014 at 13:55, Matthew Von-Maszewski wrote:
>> Ingo,
>> 
>> We regularly test Riak and its leveldb backend with 130k objects as part of 
>> our performance and qualification tests.  Works just fine.
>> 
>> The one thing to remember is that with N=3 redundancy, the bandwidth between 
>> the nodes will approach 3 times the incoming user bandwidth.  Make sure your 
>> network infrastructure is ready between your nodes (servers).
>> 
>> The 130K testing is based upon the loads created by one of our enterprise 
>> customers.  You can read details of that use case here:
>> 
>> http://media.basho.com/pdf/Voxer-Case-Study.pdf
>> 
>> Matthew
>> 
>> 
>> On Mar 24, 2014, at 7:22 AM, Ingo Rockel  
>> wrote:
>> 
>>> Hi List,
>>> 
>>> I'm currently evaluating the possibilities to migrate our mysql-based image 
>>> storage to something which actually scales and is easier to maintain.
>>> 
>>> Among different distributed file systems (like ceph, openstack swift) I was 
>>> looking at riak as we already have a riak cluster for storing our users 
>>> messages.
>>> 
>>> As we are only storing quite stripped down versions of our users photos, 
>>> the average size of the photos is only about 30-40kb, there might be some 
>>> above 100KB though. But all are less than 1MB.
>>> 
>>> As our objects are quite small and I don't need anything besides getting 
>>> the binary for a key I want to use riak without riak CS. Is this a good or 
>>> bad idea?
>>> 
>>> The current storage (which needs to be migrated) in MySQL has about 35 
>>> million entries which consume 1.1TB of space.
>>> 
>>> Any ideas or suggestions?
>>> 
>>> Ingo
>>> 
>>> --
>>> Software Architect
>>> 
>>> Blue Lion mobile GmbH
>>> Tel. +49 (0) 221 788 797 14
>>> Fax. +49 (0) 221 788 797 19
>>> Mob. +49 (0) 176 24 87 30 89
>>> 
>>> ingo.roc...@bluelionmobile.com
>>>>>> qeep: Hefferwolf
>>> 
>>> www.bluelionmobile.com
>>> www.qeep.net
>>> 
>>> ___
>>> riak-users mailing list
>>> riak-users@lists.basho.com
>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>> 
> 
> 
> -- 
> Software Architect
> 
> Blue Lion mobile GmbH
> Tel. +49 (0) 221 788 797 14
> Fax. +49 (0) 221 788 797 19
> Mob. +49 (0) 176 24 87 30 89
> 
> ingo.roc...@bluelionmobile.com
> >>> qeep: Hefferwolf
> 
> www.bluelionmobile.com
> www.qeep.net


___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Storing Photos in riak

2014-03-24 Thread Matthew Von-Maszewski
Ingo,

We regularly test Riak and its leveldb backend with 130k objects as part of our 
performance and qualification tests.  Works just fine.

The one thing to remember is that with N=3 redundancy, the bandwidth between 
the nodes will approach 3 times the incoming user bandwidth.  Make sure your 
network infrastructure is ready between your nodes (servers).
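
As a purely illustrative back-of-the-envelope: at a 40 KB average photo and
1,000 uploads per second the clients send roughly 40 MB/s, so with N=3 the
nodes should expect on the order of 120 MB/s of replica traffic between them,
plus read, handoff, and anti-entropy traffic on top of that.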

The 130K testing is based upon the loads created by one of our enterprise 
customers.  You can read details of that use case here:

http://media.basho.com/pdf/Voxer-Case-Study.pdf

Matthew


On Mar 24, 2014, at 7:22 AM, Ingo Rockel  wrote:

> Hi List,
> 
> I'm currently evaluating the possibilities to migrate our mysql-based image 
> storage to something which actually scales and is easier to maintain.
> 
> Among different distributed file systems (like ceph, openstack swift) I was 
> looking at riak as we already have a riak cluster for storing our users 
> messages.
> 
> As we are only storing quite stripped down versions of our users photos, the 
> average size of the photos is only about 30-40kb, there might be some above 
> 100KB though. But all are less than 1MB.
> 
> As our objects are quite small and I don't need anything besides getting the 
> binary for a key I want to use riak without riak CS. Is this a good or bad 
> idea?
> 
> The current storage (which needs to be migrated) in MySQL has about 35 
> million entries which consume 1.1TB of space.
> 
> Any ideas or suggestions?
> 
> Ingo
> 
> -- 
> Software Architect
> 
> Blue Lion mobile GmbH
> Tel. +49 (0) 221 788 797 14
> Fax. +49 (0) 221 788 797 19
> Mob. +49 (0) 176 24 87 30 89
> 
> ingo.roc...@bluelionmobile.com
> >>> qeep: Hefferwolf
> 
> www.bluelionmobile.com
> www.qeep.net
> 
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Cleaning up bucket after basho_bench run

2014-03-22 Thread Matthew Von-Maszewski
There is no manual mechanism (or secret Erlang command) to trigger leveldb 
compactions.  The only activity that triggers compactions in 1.4 is incoming 
writes.  Write more to compact more to free up disk space.  Not logical, but 
the truth.

Matthew


On Mar 22, 2014, at 7:48 PM, István  wrote:

> Matthew,
> 
> Thanks for the details about LevelDB. Is there a way to trigger
> compaction from Erlang, or any other way to get rid of tombstones
> faster with 1.4? If there is no such thing I guess waiting is my
> only option.
> 
> Thanks everybody helping with this issue.
> 
> Regards,
> Istvan
> 
> 
> 
> 
> On Sat, Mar 22, 2014 at 5:33 AM, Matthew Von-Maszewski
>  wrote:
>> Leveldb, as written by Google, does not actively clean up delete 
>> "tombstones" or prior data records with the same key. The old data and 
>> tombstones stay on disk until they happen to participate in compaction at 
>> the highest "level".  The clean up can therefore happen days, weeks, or even 
>> months later depending upon the size of your dataset, speed of incoming 
>> writes, and distribution of new keys versus deleted keys.
>> 
>> Basho has added code to leveldb in Riak 2.0 to more aggressively free up 
>> disk space.  Details on this 2.0 feature are here:
>> 
>>   https://github.com/basho/leveldb/wiki/mv-aggressive-delete
>> 
>> Matthew Von-Maszewski
>> 
>> 
>> On Mar 22, 2014, at 1:53, István  wrote:
>> 
>>> All good, all the keys are gone! :)
>>> 
>>> I am just waiting for Riak to free up the space. It seems it is not
>>> instant... Or I am missing something. I need to read up on how LevelDB
>>> actually frees up space.  I have updated the code to stop on {ReqID,
>>> done}. I think you get this only when you have no keys left. I have
>>> verified that there are no keys left in the bucket.
>>> 
>>> 
>>> # curl -XGET -i http://127.0.0.1:8098/buckets/test/keys?keys=stream
>>> HTTP/1.1 200 OK
>>> Vary: Accept-Encoding
>>> Transfer-Encoding: chunked
>>> Server: MochiWeb/1.1 WebMachine/1.10.0 (never breaks eye contact)
>>> Date: Sat, 22 Mar 2014 05:51:48 GMT
>>> Content-Type: application/json
>>> 
>>> {"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}
>>> 
>>> Thanks Evan!
>>> I.
>>> 
>>> On Fri, Mar 21, 2014 at 9:44 PM, Evan Vigil-McClanahan
>>>  wrote:
>>>> Did some double checking on the off chance that I gave you some bad
>>>> advice.  Here's the function that the erlang client uses to accumulate
>>>> the outcome of stream_list_keys et al:
>>>> https://github.com/basho/riak-erlang-client/blob/master/src/riakc_pb_socket.erl#L2146-L2155
>>>> 
>>>> here is how you get the request id:
>>>> https://github.com/basho/riak-erlang-client/blob/master/src/riakc_pb_socket.erl#L490-L494
>>>> 
>>>> On Fri, Mar 21, 2014 at 9:29 PM, Evan Vigil-McClanahan
>>>>  wrote:
>>>>> You don't want to recurse when you get the {ReqID, done} message, you
>>>>> should just stop there.
>>>>> 
>>>>> On Fri, Mar 21, 2014 at 6:20 PM, István  wrote:
>>>>>> With help of Evan (evanmcc) on the IRC channel I was able to kick off
>>>>>> the clean up job using riak-erlang-client.
>>>>>> 
>>>>>> Here is the code:
>>>>>> 
>>>>>> https://gist.github.com/l1x/9698847
>>>>>> 
>>>>>> It sometimes behaves a bit weirdly: the PB client returns {40127151,
>>>>>> done} or something similar, and I can't tell why, but it
>>>>>> definitely deleted some of the

Re: Cleaning up bucket after basho_bench run

2014-03-22 Thread Matthew Von-Maszewski
Leveldb, as written by Google, does not actively clean up delete "tombstones" 
or prior data records with the same key. The old data and tombstones stay on 
disk until they happen to participate in compaction at the highest "level".  
The clean up can therefore happen days, weeks, or even months later depending 
upon the size of your dataset, speed of incoming writes, and distribution of 
new keys versus deleted keys.

Basho has added code to leveldb in Riak 2.0 to more aggressively free up disk 
space.  Details on this 2.0 feature are here:

   https://github.com/basho/leveldb/wiki/mv-aggressive-delete 

Matthew Von-Maszewski


On Mar 22, 2014, at 1:53, István  wrote:

> All good, all the keys are gone! :)
> 
> I am just waiting for Riak to free up the space. It seems it is not
> instant... Or I am missing something. I need to read up on how LevelDB
> actually frees up space.  I have updated the code to stop on {ReqID,
> done}. I think you get this only when you have no keys left. I have
> verified that there are no keys left in the bucket.
> 
> 
> # curl -XGET -i http://127.0.0.1:8098/buckets/test/keys?keys=stream
> HTTP/1.1 200 OK
> Vary: Accept-Encoding
> Transfer-Encoding: chunked
> Server: MochiWeb/1.1 WebMachine/1.10.0 (never breaks eye contact)
> Date: Sat, 22 Mar 2014 05:51:48 GMT
> Content-Type: application/json
> 
> {"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}
> 
> Thanks Evan!
> I.
> 
> On Fri, Mar 21, 2014 at 9:44 PM, Evan Vigil-McClanahan
>  wrote:
>> Did some double checking on the off chance that I gave you some bad
>> advice.  Here's the function that the erlang client uses to accumulate
>> the outcome of stream_list_keys et al:
>> https://github.com/basho/riak-erlang-client/blob/master/src/riakc_pb_socket.erl#L2146-L2155
>> 
>> here is how you get the request id:
>> https://github.com/basho/riak-erlang-client/blob/master/src/riakc_pb_socket.erl#L490-L494
>> 
>> On Fri, Mar 21, 2014 at 9:29 PM, Evan Vigil-McClanahan
>>  wrote:
>>> You don't want to recurse when you get the {ReqID, done} message, you
>>> should just stop there.
>>> 
>>> On Fri, Mar 21, 2014 at 6:20 PM, István  wrote:
>>>> With help of Evan (evanmcc) on the IRC channel I was able to kick off
>>>> the clean up job using riak-erlang-client.
>>>> 
>>>> Here is the code:
>>>> 
>>>> https://gist.github.com/l1x/9698847
>>>> 
>>>> It sometimes behaves a bit weirdly: the PB client returns {40127151,
>>>> done} or something similar, and I can't tell why, but it
>>>> definitely deleted some of the keys so far. I am letting it run for a
>>>> while to see what happens.
>>>> 
>>>> Regards,
>>>> Istvan
>>>> 
>>>> 
>>>> On Wed, Mar 19, 2014 at 1:02 AM, Christian Dahlqvist
>>>>  wrote:
>>>>> Hi Istvan,
>>>>> 
>>>>> Did you run the Basho Bench clean-up job with the following settings?
>>>>> 
>>>>> {driver, basho_bench_driver_riakc_pb}.
>>>>> {key_generator, {int_to_bin, {partitioned_sequential_int, 1000}}}.
>>>>> {operations, [{delete, 1}]}.
>>>>> 
>>>>> Also, how did you verify that the data was not deleted?
>>>>> 
>>>>> Best regards,
>>>>> 
>>>>> Christian
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> On Wed, Mar 19, 2014 at 6:49 AM, István  wrote:
>>>>>> 
>>>>>> Hi,
>>>>>> 
>>>>>> I was trying to delete all of the keys generated with the following:
>>>>>> 
>>>>>> {key_generator, {int_to_bin, {

Re: What do you do when Riak freezes?

2014-03-19 Thread Matthew Von-Maszewski
I thought I knew the cause of this problem.  I do not.  We need to await input 
from others.

My apologies.

Other basic questions will be:  what version of Riak, what is your app.config, 
how many servers/nodes, any reason this one node is "different"?

Matthew


On Mar 19, 2014, at 5:30 PM, Michael Dillon  wrote:

> We are using Amazon EC2 m3.2xlarge nodes, and while the freeze is occurring 
> free reports
> 
>              total       used       free     shared    buffers     cached
> Mem:      30623232    8818792   21804440          0      88092    4411832
> -/+ buffers/cache:    4318868   26304364
> Swap:            0          0          0
> 
> The Erlang processes seem to be unkillable because "shutdown -r now" is also 
> hanging. Right now these nodes are just being used for some testing, but 
> eventually we will go into production and I really need to have a plan for 
> how to detect and then deal with these Erlang freezes. Or better yet, a way 
> to avoid them even if it means detecting some condition in advance and then 
> rebooting the node.
> 
> 
> 
> 
> On Wed, Mar 19, 2014 at 2:07 PM, Matthew Von-Maszewski  
> wrote:
> 
> Any chance you are overflowing into swap?  Or in the case of XEN have you 
> exceeded the guaranteed RAM for the VM memory and moved into the disk backed 
> portion of "ram"?
> 
> What backend do you use within riak?  Do you have memory statistics from 
> before and after the seizure/freeze?
> 
> Matthew
> 
> 
> On Mar 19, 2014, at 4:56 PM, Michael Dillon  
> wrote:
> 
> > I've run into a problem with Riak freezing completely on one node running 
> > on Ubuntu 12.04 LTS on a XEN VM (EC2). If I ssh into the node and run "ps 
> > ax" that shell session also freezes. I also tried another ssh session with 
> > "netstat -lnp" to see if I could find the process ID to kill, but that also 
> > froze.
> >
> > I must admit that I have seen a similar problem with RabbitMQ running on 
> > Ubuntu 10 LTS on a an OpenVPS VM a few years ago.
> >
> > I suppose this is an Erlang issue of some sort, but I would really like 
> > some way to kill the Riak processes without a reboot if possible.
> >
> > --
> > PageFreezer.com
> > #200 - 311 Water Street
> > Vancouver,  BC  V6B 1B8
> > ___
> > riak-users mailing list
> > riak-users@lists.basho.com
> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> 
> 
> 
> 
> -- 
> PageFreezer.com
> #200 - 311 Water Street
> Vancouver,  BC  V6B 1B8
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: What do you do when Riak freezes?

2014-03-19 Thread Matthew Von-Maszewski

Any chance you are overflowing into swap?  Or in the case of XEN have you 
exceeded the guaranteed RAM for the VM memory and moved into the disk backed 
portion of "ram"?

What backend do you use within riak?  Do you have memory statistics from before 
and after the seizure/freeze?

Matthew


On Mar 19, 2014, at 4:56 PM, Michael Dillon  wrote:

> I've run into a problem with Riak freezing completely on one node running on 
> Ubuntu 12.04 LTS on a XEN VM (EC2). If I ssh into the node and run "ps ax" 
> that shell session also freezes. I also tried another ssh session with 
> "netstat -lnp" to see if I could find the process ID to kill, but that also 
> froze.
> 
> I must admit that I have seen a similar problem with RabbitMQ running on 
> Ubuntu 10 LTS on an OpenVPS VM a few years ago.
> 
> I suppose this is an Erlang issue of some sort, but I would really like some 
> way to kill the Riak processes without a reboot if possible. 
> 
> -- 
> PageFreezer.com
> #200 - 311 Water Street
> Vancouver,  BC  V6B 1B8
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

