Re: using salt stack with riak
This state is the meat of it, and should be pretty self-explanatory if you're already writing Salt states.

riak-ulimit-pam:
  file.append:
    - name: /etc/pam.d/common-session
    - text: "session\trequired\tpam_limits.so"

riak-ulimit-pam-noninteractive:
  file.append:
    - name: /etc/pam.d/common-session-noninteractive
    - text: "session\trequired\tpam_limits.so"

riak-ulimit:
  file.append:
    - name: /etc/security/limits.conf
    - text:
      - "riak soft nofile 65536"
      - "riak hard nofile 65536"
    - require:
      - file: riak-ulimit-pam

python-software-properties:
  pkg.installed

basho-pkgrepo:
  pkgrepo.managed:
    - humanname: Basho PPA
    - name: deb http://apt.basho.com precise main
    - file: /etc/apt/sources.list.d/basho.list
    - key_url: http://apt.basho.com/gpg/basho.apt.key
    - require:
      - pkg: python-software-properties

riak:
  pkg.installed:
    - version: 1.4.2-1
    - require:
      - pkgrepo: basho-pkgrepo
  riak.running:
    - require:
      - pkg: riak
      - file: /etc/riak/app.config
      - file: /etc/riak/vm.args
      - file: riak-ulimit

/etc/riak/app.config:
  file.managed:
    - source: salt://riak/app.config
    - mode: 644
    - template: jinja
    - require:
      - pkg: riak
    - defaults:
        internal_ip: {{ salt['cmd.exec_code']('bash', 'hostname -I') }}

/etc/riak/vm.args:
  file.managed:
    - source: salt://riak/vm.args
    - mode: 644
    - template: jinja
    - require:
      - pkg: riak
    - defaults:
        internal_ip: {{ salt['cmd.exec_code']('bash', 'hostname -I') }}

On 24 January 2014 04:22, Matt Davis wrote:

> Nicely done Matt! Sure would love to see your states... I've got a fairly
> good one for riak-cs, would love to see some others.
>
>
> On Wed, Jan 22, 2014 at 2:14 PM, Matt Black wrote:
>
>> Hi Matt,
>>
>> We manage all our Riak infrastructure with a couple of Salt states and a
>> custom module I wrote which you can see here:
>>
>> https://github.com/saltstack/salt-contrib/blob/master/modules/riak.py
>>
>> There's another Riak module in Salt core, but last time I checked it had
>> less functionality.
>> (I talked with them a while back about merging the two
>> modules - perhaps I should bring that up again).
>>
>> I can send our Salt states as well, if you're interested :)
>>
>>
>> On 23 January 2014 07:05, Matt Davis wrote:
>>
>>> Hey all,
>>>
>>> We're implementing salt stack for configuration management, and I've
>>> been trying out how it works with riak, specifically remote command
>>> execution.
>>>
>>> Anyone out there in riak-land been successfully integrating it with salt?
>>>
>>> I've hit a couple of "arroo?" moments and am curious what others have
>>> experienced.
>>>
>>> -matt
>>>
>>>
>>> ___
>>> riak-users mailing list
>>> riak-users@lists.basho.com
>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>>
>>>
>>
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
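A note on the ulimit states above: the limits.conf entries only take effect for new sessions of the riak user, so it is worth spot-checking them from inside a running process. A minimal, Riak-agnostic sketch using Python's stdlib (the 65536 target mirrors the state above; the function names are illustrative, not part of any Salt or Riak API):

```python
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def meets_target(target=65536):
    """True if both soft and hard nofile limits are at least `target`."""
    soft, hard = nofile_limits()

    def ok(v):
        # RLIM_INFINITY means unbounded, which trivially satisfies the target.
        return v == resource.RLIM_INFINITY or v >= target

    return ok(soft) and ok(hard)


if __name__ == "__main__":
    print(nofile_limits())
```

Run as the riak user after a fresh login (or service restart) to confirm pam_limits actually applied the state.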
Re: riak2 erlang mapreduce counters
Thank you both Eric & Russell for the answer; sadly it leads to more questions. Regardless of the type (though I can say in this case the counters were pushed from the Python 2.0.2 client, so I assume it's riak_dt_pncounter) I get this error:

{"phase":0,"error":"badarg","input":"{ok,{r_object,<<\"ogir-fp\">>,<<\"682l2fp6\">>,[{r_content,{dict,4,16,16,8,80,48,{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},{{[],[],[],[],[],[],[],[],[],[],[[<<\"content-type\">>,97,112,112,108,105,99,97,116,105,111,110,47,114,105,97,107,95,99,111,117,110,116,101,114],[<<\"X-Riak-VTag\">>,52,103,66,88,71,122,56,55,111,105,66,112,103,65,75,99,54,72,55,69,110,79]],[[<<\"index\">>]],[],[[<<\"X-Riak-Last-Modified\">>|{1390,508268,984013}]],[],[]}}},<<69,1,71,1,0,0,0,29,70,1,131,108,0,0,0,1,104,2,109,...>>}],...},...}","type":"error","stack":"[{erlang,apply,[<<\"riak_kv_pncounter\">>,new,[]],[]},{riak_kv_crdt,crdt_value,2,[{file,\"src/riak_kv_crdt.erl\"},{line,94}]},{riak_kv_crdt,value,2,[{file,\"src/riak_kv_crdt.erl\"},{line,86}]},{mr_kv_counters,value,3,[{file,\"mr_kv_counters.erl\"},{line,38}]},{riak_kv_mrc_map,map,3,[{file,\"src/riak_kv_mrc_map.erl\"},{line,165}]},{riak_kv_mrc_map,process,3,[{file,\"src/riak_kv_mrc_map.erl\"},{line,141}]},{riak_pipe_vnode_worker,process_input,3,[{file,\"src/riak_pipe_vnode_worker.erl\"},{line,445}]},{riak_pipe_vnode_worker,...}]"}

Just to help, this is my erlang MR code:

value(RiakObject, _KeyData, _Arg) ->
    Key = riak_object:key(RiakObject),
    Count = riak_kv_crdt:value(RiakObject, <<"riak_kv_pncounter">>),
    [ {Key, Count} ].

What am I doing wrong? I can't seem to figure it out... I'm sure it's something simple I'm just not seeing.

Thanks again,
Bryce

On 01/23/2014 01:07 PM, Russell Brown wrote:
On 23 Jan 2014, at 20:51, Eric Redmond wrote:
For version 1.4 counters, riak_kv_pncounter. For 2.0 CRDT counters, riak_dt_pncounter.
As in, if the data was written in 1.4, or in 2.0 using the legacy, backwards-compatible 1.4 API endpoints, then the type is riak_kv_pncounter. If the counter is a 2.0, bucket-typed counter, then riak_dt_pncounter.

Really, we need to re-introduce the riak_kv_counter module for backwards compatibility, and add some friendly `value` functions to riak_kv_crdt. I'm opening an issue for just this now. The other option is to include riak_kv_types.hrl and use the macros ?MAP_TYPE, ?SET_TYPE, ?V1_COUNTER_TYPE, ?COUNTER_TYPE for now, and assume that we'll have some helper functions for MapReduce in before 2.0.

Cheers
Russell

Eric

On Jan 23, 2014, at 3:44 PM, Bryce Verdier wrote:
In 1.4 there was just the simple function riak_kv_counters:value. In 2.0 I found the riak_kv_crdt module, which has a value function in it. But I'm not sure what "type" to use for the second value argument for a counter. Can someone share that with me? Thanks in advance, Bryce

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: Split index with Riak 2.0 git (jan 15th) on a single dev node cluster
Rob,

The one second wait is because yokozuna is the glue code (putting it very, very simply) between a Riak cluster and distributed Solr instances. When you write an object to Riak, yokozuna asynchronously fires off an update to the Solr service. Solr is, by default, configured to soft commit writes every 1 second, and hard commit every 30 seconds [1]. These values are set in the Solr configuration, based on suggested best practices. You can configure solrconfig.xml to reduce that delay, but the payoff isn't generally worth the overhead.

The benefit of an asynchronous index update is that you get the predictable performance of Riak, with an eventually consistent searchable index. The alternative is to lock a Riak put/delete while Solr updates, which means you'll wait up to a second for confirmation of a simple write. We may eventually add such back pressure, but it's barely on our radar, since at the moment, atomic object/index updating is not of practical concern to most Riak users. If you have a specific requirement for such back pressure, please let us know and we may have some ideas for you.

Another benefit to separating the write from indexing is that it allows us to leverage an index-specific AAE service. This means that if your dataset and index ever get out of sync, or your index becomes corrupted, the index AAE can repair the discrepancy. If you've ever run Solr or Elasticsearch against another datastore, you'll be familiar with having to repair indexes on occasion (whether from a server going down, other system errors, bit rot, etc.). With Riak Search (yokozuna) this is handled automatically for you.

Hope that helps,
Eric

[1] http://wiki.apache.org/solr/NearRealtimeSearch

On Jan 23, 2014, at 5:41 PM, Rob Speer wrote:
> I'm still interested in the question as it applies to Yokozuna. Stable
> full-text search in Riak will be very important to my company, so I'd want to
> know what the equivalent behaviors are for Yokozuna queries.
Do you have to > wait an unspecified amount of time between writing a document and querying it > via Yokozuna? Is there a way to know when it is ready? > > > On Thu, Jan 23, 2014 at 10:53 AM, Luke Bakken wrote: > Hi Rob, > > I believe Ryan meant to wait a second to do a Yokozuna search, not a > general Riak K/V operation. > > There is more information about "read your own writes" here: > http://basho.com/tag/configurable-behaviors/ > -- > Luke Bakken > CSE > lbak...@basho.com > > > On Wed, Jan 22, 2014 at 11:36 AM, Rob Speer wrote: > >> 5. Did you wait at least 1 second before running the queries? > > > > I'm not the original poster but I'm now wondering what this question means. > > Under what circumstances do you have to wait 1 second before query results > > are available? > > > > We want to always be able to run tests on our database rapidly, which > > includes loading data and then immediately querying to make sure the correct > > data is there. I know we've had some tests where we were not able to read > > our own writes, but I thought those were fixed by making sure we used vector > > clocks correctly. Is there a situation where you have to wait for an > > unspecified amount of time before you can read your writes? > > > > ___ > > riak-users mailing list > > riak-users@lists.basho.com > > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com > > > > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
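Following up on the read-your-own-writes question in this thread: since the Yokozuna index is eventually consistent (roughly the 1-second soft commit discussed above), test suites generally poll until the index catches up rather than sleeping a fixed amount. A generic sketch, deliberately not tied to any particular Riak client API (query_fn stands in for whatever issues the search):

```python
import time


def wait_for_results(query_fn, expected_count, timeout=5.0, interval=0.2):
    """Poll query_fn until it returns at least expected_count results,
    or until timeout seconds elapse. Returns the last result list."""
    deadline = time.monotonic() + timeout
    results = query_fn()
    while len(results) < expected_count and time.monotonic() < deadline:
        time.sleep(interval)
        results = query_fn()
    return results
```

In a test you would write the object, then call something like wait_for_results(lambda: do_search("name:rob"), 1) before asserting on the hits; on a healthy cluster this typically returns well inside the soft-commit window instead of always paying the full sleep.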
Re: Split index with Riak 2.0 git (jan 15th) on a single dev node cluster
I'm still interested in the question as it applies to Yokozuna. Stable full-text search in Riak will be very important to my company, so I'd want to know what the equivalent behaviors are for Yokozuna queries. Do you have to wait an unspecified amount of time between writing a document and querying it via Yokozuna? Is there a way to know when it is ready? On Thu, Jan 23, 2014 at 10:53 AM, Luke Bakken wrote: > Hi Rob, > > I believe Ryan meant to wait a second to do a Yokozuna search, not a > general Riak K/V operation. > > There is more information about "read your own writes" here: > http://basho.com/tag/configurable-behaviors/ > -- > Luke Bakken > CSE > lbak...@basho.com > > > On Wed, Jan 22, 2014 at 11:36 AM, Rob Speer wrote: > >> 5. Did you wait at least 1 second before running the queries? > > > > I'm not the original poster but I'm now wondering what this question > means. > > Under what circumstances do you have to wait 1 second before query > results > > are available? > > > > We want to always be able to run tests on our database rapidly, which > > includes loading data and then immediately querying to make sure the > correct > > data is there. I know we've had some tests where we were not able to read > > our own writes, but I thought those were fixed by making sure we used > vector > > clocks correctly. Is there a situation where you have to wait for an > > unspecified amount of time before you can read your writes? > > > > ___ > > riak-users mailing list > > riak-users@lists.basho.com > > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com > > > ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: Riak Search and Yokozuna Backup Strategy
Not to fork the thread too far from the topic being discussed, but is there any possibility of opening up the API used for multidatacenter replication? Specifically, the fullsync API? I imagine the code inside riak_repl can also be used for an external node to connect and get a full dump of a node's content using fullsync. Incremental backups could potentially be taken by using the AAE strategy, by the backup sink building a merkle tree of the data it has, and using that to generate the keylist of deltas. Unfortunately, riak_repl is not open source, so this is something Basho would have to build. On Thu, Jan 23, 2014 at 12:57 PM, Dave Martorana wrote: > I like that HyperDex provides direct backup support instead of simply > suggesting a stop-filecopy-start-catchup scenario. Are there any plans at > Basho to make backups a core function of Riak (or as a separate but included > utility) - it would certainly be nice to have something Basho provides help > ensure things are done properly each time, all the time. > > Cheers, > > Dave > > > On Thu, Jan 23, 2014 at 1:42 PM, Joe Caswell wrote: >> >> Apologies, clicked send in the middle of an incomplete thought. It should >> have read: >> >> Backing up the LevelDB data files while the node is stopped would remove >> the necessity of using the LevelDB repair process upon restoring to make the >> vnode self-consistent. >> >> From: Joe Caswell >> Date: Thursday, January 23, 2014 1:25 PM >> To: Sean McKibben , Elias Levy >> >> >> Cc: "riak-users@lists.basho.com" >> Subject: Re: Riak Search and Yokozuna Backup Strategy >> >> Backing up LevelDB data files can be accomplished while the node is >> running if the sst_x directories are backed up in numerical order. The >> undesirable side effects of that could be duplicated data, inconsistent >> manifest, or incomplete writes, which necessitates running the leveldb >> repair process upon restoration for any vnode backed up while the node was >> running. 
Since the data is initially written to the recovery log before >> being appended to level 0, and any compaction operation fully writes the >> data to its new location before removing it from its old location, if any of >> these operations are interrupted, the data can be completely recovered by >> leveldb repair. >> >> The only incomplete write that won't be recovered by the LevelDB repair >> process is the initial write to the recovery log, limiting exposure to the >> key being actively written at the time of the snapshot/backup. As long as 2 >> vnodes in the same preflist are not backed up while simultaneously writing >> the same key to the recovery log (i.e. rolling backups are good), this key >> will be recovered by AAE/read repair after restoration. >> >> Backing up the LevelDB data files while the node is stopped would remove >> the necessity of repairing the >> >> Backing up Riak Search data, on the other hand, is a dicey proposition. >> There are 3 bits to riak search data: the document you store, the output of >> the extractor, and the merge index. >> >> When you put a document in <<"key">> in a <<"bucket">> with search >> enabled, Riak uses the pre-defined extractor to parse the document into >> terms, possibly flattening the structure, and stores the result in >> <<"_rsid_bucket">>/<<"key">>, which is used during update operations to >> remove stale entries before adding new ones, and would most likely be stored >> in a different vnode, possibly on a different node entirely. The document >> id/link is inserted into the merge index entry for each term identified by >> the extractor, any or all of which may reside on different nodes. Since the >> document, its index document, and the term indexes could not be guaranteed >> to be captured in any single backup operation, it is a very real probability >> that these would be out of sync in the event that a restore is required. 
>> >> If restore is only required for a single node, consistency could be >> restored by running a repair operation for each riak_kv vnode and >> riak_search vnode stored on the node, which would repair the data from other >> nodes in the cluster. If more than one node is restored, it is quite likely >> that they both stored replicas of the same data, for some subset of the full >> data set. The only way to ensure consistency is fully restored in the >> latter case is to reindex the data set. This can be accomplished by reading >> and rewriting all of the data, or by reindexing via MapReduce as suggested >> in this earlier mailing list post: >> http://lists.basho.com/pipermail/riak-users_lists.basho.com/2012-October/009861.html >> >> In either restore case, having a backup of the merge_index data files is >> not helpful, so there does not appear to be any point in backing them up. >> >> Joe Caswell >> From: Sean McKibben >> Date: Tuesday, January 21, 2014 1:04 PM >> To: Elias Levy >> Cc: "riak-users@lists.basho.com" >> Subject: Re: Riak Search and Yokozuna Backup Strategy >> >
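The incremental-backup idea floated earlier in this thread — the backup sink builds a hash tree over the data it holds and exchanges it with the source to find the delta — can be sketched at the leaf level as a per-key digest comparison. This is an illustration of the concept only, not riak_repl's or AAE's actual implementation (a real Merkle exchange compares interior nodes first so it never ships every leaf hash):

```python
import hashlib


def key_hashes(data):
    """Map each key to a digest of its value (the leaf level of a hash tree)."""
    return {k: hashlib.sha256(v).hexdigest() for k, v in data.items()}


def delta_keys(source, sink):
    """Keys the sink must fetch: missing locally, or with a differing digest."""
    src, snk = key_hashes(source), key_hashes(sink)
    return sorted(k for k, h in src.items() if snk.get(k) != h)
```

With this keylist in hand, an incremental backup only transfers the values for delta_keys(source, sink) instead of a full dump.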
Re: riak2 erlang mapreduce counters
On 23 Jan 2014, at 20:51, Eric Redmond wrote:

> For version 1.4 counters, riak_kv_pncounter. For 2.0 CRDT counters,
> riak_dt_pncounter.

As in, if the data was written in 1.4, or in 2.0 using the legacy, backwards-compatible 1.4 API endpoints, then the type is riak_kv_pncounter. If the counter is a 2.0, bucket-typed counter, then riak_dt_pncounter.

Really, we need to re-introduce the riak_kv_counter module for backwards compatibility, and add some friendly `value` functions to riak_kv_crdt. I'm opening an issue for just this now. The other option is to include riak_kv_types.hrl and use the macros ?MAP_TYPE, ?SET_TYPE, ?V1_COUNTER_TYPE, ?COUNTER_TYPE for now, and assume that we'll have some helper functions for MapReduce in before 2.0.

Cheers
Russell

>
> Eric
>
> On Jan 23, 2014, at 3:44 PM, Bryce Verdier wrote:
>
>> In 1.4 there was just the simple function riak_kv_counters:value. In 2.0 I
>> found the riak_kv_crdt module, which has a value function in it. But I'm not
>> sure what "type" to use for the second value argument for a counter.
>>
>> Can someone share that with me?
>>
>> Thanks in advance,
>> Bryce
>>
>> ___
>> riak-users mailing list
>> riak-users@lists.basho.com
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
>
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
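For readers following along who haven't met the pncounter types named in this thread: a PN-counter tracks, per actor, a grow-only increment tally and a grow-only decrement tally, and replicas merge by taking the per-actor maximum of each. A toy Python sketch of the idea (this is the general CRDT construction, not Riak's riak_dt code):

```python
class PNCounter:
    """Toy PN-counter: per-actor increment and decrement tallies."""

    def __init__(self):
        self.p = {}  # actor -> total increments
        self.n = {}  # actor -> total decrements

    def incr(self, actor, amount=1):
        self.p[actor] = self.p.get(actor, 0) + amount

    def decr(self, actor, amount=1):
        self.n[actor] = self.n.get(actor, 0) + amount

    def value(self):
        return sum(self.p.values()) - sum(self.n.values())

    def merge(self, other):
        """Merge two replicas: per-actor max of each tally.
        Commutative and idempotent, so replicas converge in any order."""
        merged = PNCounter()
        for mine, theirs, out in ((self.p, other.p, merged.p),
                                  (self.n, other.n, merged.n)):
            for actor in set(mine) | set(theirs):
                out[actor] = max(mine.get(actor, 0), theirs.get(actor, 0))
        return merged
```

The per-actor bookkeeping is why the CRDT variant survives concurrent updates on different nodes, where a naive integer would lose increments on merge.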
Re: Riak Search and Yokozuna Backup Strategy
I like that HyperDex provides direct backup support instead of simply suggesting a stop-filecopy-start-catchup scenario. Are there any plans at Basho to make backups a core function of Riak (or as a separate but included utility) - it would certainly be nice to have something Basho provides help ensure things are done properly each time, all the time. Cheers, Dave On Thu, Jan 23, 2014 at 1:42 PM, Joe Caswell wrote: > Apologies, clicked send in the middle of an incomplete thought. It should > have read: > > Backing up the LevelDB data files while the node is stopped would remove > the necessity of using the LevelDB repair process upon restoring to make > the vnode self-consistent. > > From: Joe Caswell > Date: Thursday, January 23, 2014 1:25 PM > To: Sean McKibben , Elias Levy < > fearsome.lucid...@gmail.com> > > Cc: "riak-users@lists.basho.com" > Subject: Re: Riak Search and Yokozuna Backup Strategy > > Backing up LevelDB data files can be accomplished while the node is > running if the sst_x directories are backed up in numerical order. The > undesirable side effects of that could be duplicated data, inconsistent > manifest, or incomplete writes, which necessitates running the leveldb > repair process upon restoration for any vnode backed up while the node was > running. Since the data is initially written to the recovery log before > being appended to level 0, and any compaction operation fully writes the > data to its new location before removing it from its old location, if any > of these operations are interrupted, the data can be completely recovered > by leveldb repair. > > The only incomplete write that won't be recovered by the LevelDB repair > process is the initial write to the recovery log, limiting exposure to the > key being actively written at the time of the snapshot/backup. As long as > 2 vnodes in the same preflist are not backed up while simultaneously > writing the same key to the recovery log (i.e. 
rolling backups are good), > this key will be recovered by AAE/read repair after restoration. > > Backing up the LevelDB data files while the node is stopped would remove > the necessity of repairing the > > Backing up Riak Search data, on the other hand, is a dicey proposition. > There are 3 bits to riak search data: the document you store, the output > of the extractor, and the merge index. > > When you put a document in <<"key">> in a <<"bucket">> with search > enabled, Riak uses the pre-defined extractor to parse the document into > terms, possibly flattening the structure, and stores the result in > <<"_rsid_bucket">>/<<"key">>, which is used during update operations to > remove stale entries before adding new ones, and would most likely be > stored in a different vnode, possibly on a different node entirely. The > document id/link is inserted into the merge index entry for each term > identified by the extractor, any or all of which may reside on different > nodes. Since the document, its index document, and the term indexes could > not be guaranteed to be captured in any single backup operation, it is a > very real probability that these would be out of sync in the event that a > restore is required. > > If restore is only required for a single node, consistency could be > restored by running a repair operation for each riak_kv vnode and > riak_search vnode stored on the node, which would repair the data from > other nodes in the cluster. If more than one node is restored, it is quite > likely that they both stored replicas of the same data, for some subset of > the full data set. The only way to ensure consistency is fully restored in > the latter case is to reindex the data set. 
This can be accomplished by > reading and rewriting all of the data, or by reindexing via MapReduce as > suggested in this earlier mailing list post: > http://lists.basho.com/pipermail/riak-users_lists.basho.com/2012-October/009861.html > > In either restore case, having a backup of the merge_index data files is > not helpful, so there does not appear to be any point in backing them up. > > Joe Caswell > From: Sean McKibben > Date: Tuesday, January 21, 2014 1:04 PM > To: Elias Levy > Cc: "riak-users@lists.basho.com" > Subject: Re: Riak Search and Yokozuna Backup Strategy > > +1 LevelDB backup information is important to us > > > On Jan 20, 2014, at 4:38 PM, Elias Levy > wrote: > > Anyone from Basho care to comment? > > > On Thu, Jan 16, 2014 at 10:19 AM, Elias Levy > wrote: > >> >> Also, while LevelDB appears to be largely an append only format, the >> documentation currently does not recommend live backups, presumably because >> there are some issues that can crop up if restoring a DB that was not >> cleanly shutdown. >> >> I am guessing those issues are the ones documented as edge cases here: >> https://github.com/basho/leveldb/wiki/repair-notes >> >> That said, it looks like as of 1.4 those are largely cleared up, at least >> from what I gather from that page, and that one must on
Re: riak2 erlang mapreduce counters
For version 1.4 counters, riak_kv_pncounter. For 2.0 CRDT counters, riak_dt_pncounter. Eric On Jan 23, 2014, at 3:44 PM, Bryce Verdier wrote: > In 1.4 there was just the simple function riak_kv_counters:value. In 2.0 I > found the riak_kv_crdt module, which has a value function in it. But I'm not > sure what "type" to use for second value argument for a counter. > > Can someone share that with me? > > Thanks in advance, > Bryce > > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
riak2 erlang mapreduce counters
In 1.4 there was just the simple function riak_kv_counters:value. In 2.0 I found the riak_kv_crdt module, which has a value function in it. But I'm not sure what "type" to use for second value argument for a counter. Can someone share that with me? Thanks in advance, Bryce ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Riak Recap for January 16th - January 23rd
Afternoon, Evening, Morning to All -

We've got a tidy Recap today: podcasts, conferences, new code, and more. Enjoy and thanks for being a part of Riak.

Mark
twitter.com/pharkmillups

---

Riak Recap for January 16th - January 23rd

Sean Cribbs & Bryce Kerley talk Riak with the Ruby Rogues.
- http://rubyrogues.com/139-rr-riak-with-sean-cribbs-and-bryce-kerley/

On last week's RC Hangout, John Daily, Jeremiah Peschka, and David Rusek talked about CorrugatedIron, the most widely used .NET client for Riak. Tomorrow's hangout will be pretty special and should be happening around 11 AM Pacific (per usual).
- http://www.youtube.com/watch?v=9e7fA5WdLZY

Basho is sponsoring the LA Ruby Conference happening February 6-8.
- https://twitter.com/larubyconf/status/424672649227161600

There's a Riak vs. MongoDB meetup happening next Thursday in Serbia.
- http://www.meetup.com/Java-Serbia/events/160517022/

Joel Jacobsen is spinning up a Riak meetup in Oslo. There's no date on the first meetup yet, but it's never too early to get involved.
- http://www.meetup.com/Oslo-Riak-Meetup/

Erlang Factory is happening March 3 - 12 in San Francisco. As usual, Basho will be there in a big way, with talks from Joe Devivo, Nathan Aschbacher, C.S. Meiklejohn, JTuple, and Tom Santero. They are also running a 20% discount right now on all training courses (of which Riak is one).
- http://www.erlang-factory.com/conference/show/conference-6/home/

Gideon de Kok is fast at work getting Riaku updated for Riak 2.0.
- https://twitter.com/gideondk/status/425556278573035520

Hector "Hungry Man" Castro built an Omnibus repo for Basho Bench that makes it dead simple to build self-contained platform-specific packages.
- https://github.com/hectcastro/omnibus-basho-bench

Q & A

- http://stackoverflow.com/questions/21305029/riak-merkle-tree-implementation
- http://stackoverflow.com/questions/21162829/bi-directional-map-in-riak
- http://stackoverflow.com/questions/21135378/riak-stumped-on-a-basho-mapreduce-challenge
- http://stackoverflow.com/questions/21102148/link-a-node-to-a-png-in-riak-causing-the-json-to-not-return
- http://stackoverflow.com/questions/21122281/exception-in-riak-java-store-in-bucket

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: Riak Search and Yokozuna Backup Strategy
Apologies, clicked send in the middle of an incomplete thought. It should have read: Backing up the LevelDB data files while the node is stopped would remove the necessity of using the LevelDB repair process upon restoring to make the vnode self-consistent. From: Joe Caswell Date: Thursday, January 23, 2014 1:25 PM To: Sean McKibben , Elias Levy Cc: "riak-users@lists.basho.com" Subject: Re: Riak Search and Yokozuna Backup Strategy Backing up LevelDB data files can be accomplished while the node is running if the sst_x directories are backed up in numerical order. The undesirable side effects of that could be duplicated data, inconsistent manifest, or incomplete writes, which necessitates running the leveldb repair process upon restoration for any vnode backed up while the node was running. Since the data is initially written to the recovery log before being appended to level 0, and any compaction operation fully writes the data to its new location before removing it from its old location, if any of these operations are interrupted, the data can be completely recovered by leveldb repair. The only incomplete write that won't be recovered by the LevelDB repair process is the initial write to the recovery log, limiting exposure to the key being actively written at the time of the snapshot/backup. As long as 2 vnodes in the same preflist are not backed up while simultaneously writing the same key to the recovery log (i.e. rolling backups are good), this key will be recovered by AAE/read repair after restoration. Backing up the LevelDB data files while the node is stopped would remove the necessity of repairing the Backing up Riak Search data, on the other hand, is a dicey proposition. There are 3 bits to riak search data: the document you store, the output of the extractor, and the merge index. 
When you put a document in <<"key">> in a <<"bucket">> with search enabled, Riak uses the pre-defined extractor to parse the document into terms, possibly flattening the structure, and stores the result in <<"_rsid_bucket">>/<<"key">>, which is used during update operations to remove stale entries before adding new ones, and would most likely be stored in a different vnode, possibly on a different node entirely. The document id/link is inserted into the merge index entry for each term identified by the extractor, any or all of which may reside on different nodes. Since the document, its index document, and the term indexes could not be guaranteed to be captured in any single backup operation, it is a very real probability that these would be out of sync in the event that a restore is required. If restore is only required for a single node, consistency could be restored by running a repair operation for each riak_kv vnode and riak_search vnode stored on the node, which would repair the data from other nodes in the cluster. If more than one node is restored, it is quite likely that they both stored replicas of the same data, for some subset of the full data set. The only way to ensure consistency is fully restored in the latter case is to reindex the data set. This can be accomplished by reading and rewriting all of the data, or by reindexing via MapReduce as suggested in this earlier mailing list post: http://lists.basho.com/pipermail/riak-users_lists.basho.com/2012-October/009861.html In either restore case, having a backup of the merge_index data files is not helpful, so there does not appear to be any point in backing them up. Joe Caswell From: Sean McKibben Date: Tuesday, January 21, 2014 1:04 PM To: Elias Levy Cc: "riak-users@lists.basho.com" Subject: Re: Riak Search and Yokozuna Backup Strategy +1 LevelDB backup information is important to us On Jan 20, 2014, at 4:38 PM, Elias Levy wrote: > Anyone from Basho care to comment?
> > > On Thu, Jan 16, 2014 at 10:19 AM, Elias Levy > wrote: >> >> Also, while LevelDB appears to be largely an append only format, the >> documentation currently does not recommend live backups, presumably because >> there are some issues that can crop up if restoring a DB that was not cleanly >> shutdown. >> >> I am guessing those issues are the ones documented as edge cases here: >> https://github.com/basho/leveldb/wiki/repair-notes >> >> That said, it looks like as of 1.4 those are largely cleared up, at least >> from what I gather from that page, and that one must only ensure that data is >> copied in a certain order and that you run the LevelDB repair algorithm when >> retiring the files. >> >> So is the backup documentation on LevelDB still correct? Will Basho will >> enable hot backups on LevelDB backends any time soon? >> >> >> >> On Thu, Jan 16, 2014 at 10:05 AM, Elias Levy >> wrote: >>> How well does Riak Search play with backups? Can you backup the Riak Search >>> data without bringing the node down? >>> >>> The Riak documentation backup page is completely silent on Riak Search and >>> its merge_index backend. >>> >>> And l
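Joe's constraint that the sst_x directories be "backed up in numerical order" is easy to get wrong with a naive glob, since lexical order puts sst_10 before sst_2. A sketch of encoding the ordering explicitly (paths and function names are illustrative, and this is not an official Basho backup procedure):

```python
import re
import shutil
from pathlib import Path


def sst_dirs_in_order(vnode_dir):
    """Return the sst_0, sst_1, ... directories sorted by numeric suffix,
    not lexically (lexical order would put sst_10 before sst_2)."""
    dirs = [p for p in Path(vnode_dir).iterdir()
            if p.is_dir() and re.fullmatch(r"sst_\d+", p.name)]
    return sorted(dirs, key=lambda p: int(p.name.split("_")[1]))


def backup_vnode(vnode_dir, dest):
    """Copy a vnode's sst_x levels lowest-first into dest."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    for d in sst_dirs_in_order(vnode_dir):  # lowest level first
        shutil.copytree(d, dest / d.name)
```

Per the thread above, a copy taken this way from a running node should still be followed by the leveldb repair process on restore.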
Re: Riak Search and Yokozuna Backup Strategy
Backing up LevelDB data files can be accomplished while the node is running if the sst_x directories are backed up in numerical order. The undesirable side effects of that could be duplicated data, inconsistent manifest, or incomplete writes, which necessitates running the leveldb repair process upon restoration for any vnode backed up while the node was running. Since the data is initially written to the recovery log before being appended to level 0, and any compaction operation fully writes the data to its new location before removing it from its old location, if any of these operations are interrupted, the data can be completely recovered by leveldb repair. The only incomplete write that won't be recovered by the LevelDB repair process is the initial write to the recovery log, limiting exposure to the key being actively written at the time of the snapshot/backup. As long as 2 vnodes in the same preflist are not backed up while simultaneously writing the same key to the recovery log (i.e. rolling backups are good), this key will be recovered by AAE/read repair after restoration. Backing up the LevelDB data files while the node is stopped would remove the necessity of repairing the Backing up Riak Search data, on the other hand, is a dicey proposition. There are 3 bits to riak search data: the document you store, the output of the extractor, and the merge index. When you put a document in <<"key">> in a <<"bucket">> with search enabled, Riak uses the pre-defined extractor to parse the document into terms, possibly flattening the structure, and stores the result in <<"_rsid_bucket">>/<<"key">>, which is used during update operations to remove stale entries before adding new ones, and would most likely be stored in a different vnode, possibly on a different node entirely. The document id/link is inserted into the merge index entry for each term identified by the extractor, any or all of which may reside on different nodes. 
Since the document, its index document, and the term indexes cannot be guaranteed to be captured in any single backup operation, it is very likely that these would be out of sync in the event that a restore is required. If restore is only required for a single node, consistency could be restored by running a repair operation for each riak_kv vnode and riak_search vnode stored on the node, which would repair the data from other nodes in the cluster. If more than one node is restored, it is quite likely that they both stored replicas of the same data, for some subset of the full data set. The only way to ensure consistency is fully restored in the latter case is to reindex the data set. This can be accomplished by reading and rewriting all of the data, or by reindexing via MapReduce as suggested in this earlier mailing list post: http://lists.basho.com/pipermail/riak-users_lists.basho.com/2012-October/009861.html In either restore case, having a backup of the merge_index data files is not helpful, so there does not appear to be any point in backing them up. Joe Caswell From: Sean McKibben Date: Tuesday, January 21, 2014 1:04 PM To: Elias Levy Cc: "riak-users@lists.basho.com" Subject: Re: Riak Search and Yokozuna Backup Strategy +1 LevelDB backup information is important to us On Jan 20, 2014, at 4:38 PM, Elias Levy wrote: > Anyone from Basho care to comment? > > > On Thu, Jan 16, 2014 at 10:19 AM, Elias Levy > wrote: >> >> Also, while LevelDB appears to be largely an append-only format, the >> documentation currently does not recommend live backups, presumably because >> there are some issues that can crop up if restoring a DB that was not cleanly >> shut down.
>> >> I am guessing those issues are the ones documented as edge cases here: >> https://github.com/basho/leveldb/wiki/repair-notes >> >> That said, it looks like as of 1.4 those are largely cleared up, at least >> from what I gather from that page, and that one must only ensure that data is >> copied in a certain order and that you run the LevelDB repair algorithm when >> restoring the files. >> >> So is the backup documentation on LevelDB still correct? Will Basho >> enable hot backups on LevelDB backends any time soon? >> >> >> >> On Thu, Jan 16, 2014 at 10:05 AM, Elias Levy >> wrote: >>> How well does Riak Search play with backups? Can you back up the Riak Search >>> data without bringing the node down? >>> >>> The Riak documentation backup page is completely silent on Riak Search and >>> its merge_index backend. >>> >>> And looking forward, what is the backup strategy for Yokozuna? Will it make >>> use of Solr's Replication Handler, or something lower level? Will the >>> node need to be offline to back it up? >>> >> > > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
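The read-and-rewrite reindexing approach Joe describes amounts to a loop over every key in a search-enabled bucket, writing each object back unchanged so the search precommit hook re-extracts and re-indexes it. A minimal sketch follows; `get_obj` and `put_obj` are hypothetical stand-ins for real Riak client calls, not an actual Riak API:

```python
def reindex(keys, get_obj, put_obj):
    """Rewrite every object so the search hook rebuilds its index
    entries. Returns (rewritten_count, failed_keys) so a second pass
    can retry only the keys that errored (e.g. on request timeouts)."""
    failed = []
    rewritten = 0
    for key in keys:
        try:
            put_obj(key, get_obj(key))
            rewritten += 1
        except Exception:
            # Collect failures instead of aborting the whole pass.
            failed.append(key)
    return rewritten, failed
```

Listing keys on a large production cluster is itself expensive, which is one reason the MapReduce approach from the linked post may be preferable.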
Re: using salt stack with riak
Nicely done Matt! Sure would love to see your states... I've got a fairly good one for riak-cs, would love to see some others. On Wed, Jan 22, 2014 at 2:14 PM, Matt Black wrote: > Hi Matt, > > We manage all our Riak infrastructure with a couple of Salt states and a > custom module I wrote which you can see here: > > https://github.com/saltstack/salt-contrib/blob/master/modules/riak.py > > There's another Riak module in Salt core, but last time I checked it had > less functionality. (I talked with them a while back about merging the two > modules - perhaps I should bring that up again). > > I can send our Salt states as well, if you're interested :) > > > On 23 January 2014 07:05, Matt Davis wrote: > >> Hey all, >> >> We're implementing salt stack for configuration management, and I've been >> trying out how it works with riak, specifically remote command execution. >> >> Anyone out there in riak-land been successfully integrating it with salt? >> >> I've hit a couple of "arroo?" moments and am curious what others have >> experienced. >> >> -matt >> >> >> ___ >> riak-users mailing list >> riak-users@lists.basho.com >> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com >> >> > ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
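For a flavor of what a custom Salt execution module for Riak (like the salt-contrib riak.py linked above) might contain, here is a sketch of one function that shells out to `riak-admin member-status` and parses node states. The column layout assumed here is illustrative, varies between Riak versions, and is not taken from the actual module:

```python
import subprocess

def member_status(runner=None):
    """Return a dict of node name -> membership status by parsing
    `riak-admin member-status` output. `runner` is injectable for
    testing; by default it invokes the real CLI. Assumed data-line
    shape: "<status>  <ring %>  <pending>  '<node>'"."""
    run = runner or (lambda: subprocess.check_output(
        ["riak-admin", "member-status"], text=True))
    states = {}
    for line in run().splitlines():
        parts = line.split()
        # Skip headers/dividers; keep only recognized status lines.
        if len(parts) == 4 and parts[0] in (
                "valid", "leaving", "exiting", "joining", "down"):
            states[parts[3].strip("'")] = parts[0]
    return states
```

A minion-side function like this lets states `require` a healthy cluster before proceeding with joins or transfers.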
Re: Search is not working in RIAK
Also note that the JSON extractor will concatenate nested fields with underscores, so the field name for your search will be "employee_employee_name" (assuming the JSON you pasted is just incomplete). On Thu, Jan 23, 2014 at 6:43 AM, Dave Parfitt wrote: > Hello Ajit - > > I'm not a Riak Search expert, but running the JSON you included through > http://jsonlint.com yields several errors. > > Cheers - > Dave > > > > On Thu, Jan 23, 2014 at 12:19 AM, Ajit Prasad(RMG00) < > ajit.apra...@igate.com> wrote: > >> Hi, >> >> Search doesn’t seem to work on a normal JSON file. Have created index on >> JSON file and installed the search on the bucket also. >> >> $search-cmd search out2_buk "employee_name:\"Ajit\"" >> >> :: Searching for 'employee_name:"Ajit"' / '' in out2_buk... >> >> -- >> >> :: Found 0 results. >> >> >> >> File content is pasted below : >> >> "employee":{ >> >> "employee_name":"Raghav Maitra", >> >> "employee_id":"1001", >> >> "designation":"Technical Analyst", >> >> "emp_contract_no":"1987-XXVI", >> >> "department": >> >> {"dept_name":"ADW", >> >> "sub_dept_name":"A1", >> >> "dept_id":"A1-2001", >> >> "dept_description":"Analysis and Devlopement Wing", >> >> "dept_site_name":"Diego Gracia" >> >> } >> >> "employee_name":"Ajit", >> >> "employee_id":"1002", >> >> "designation":"Technical Analyst", >> >> "emp_contract_no":"2001-XXVI", >> >> "department": >> >> {"dept_name":"ADW", >> >> "sub_dept_name":"A1", >> >> "dept_id":"A1-2001", >> >> "dept_description":"Analysis and Devlopement Wing", >> >> "dept_site_name":"Diego Gracia" >> >> >> >> } >> >> >> >> >> >> >> >> >> >> Thanks and regards, >> >> Ajit >> >> Cell:+91-9980084384 >> >> >> >> >> ~~Disclaimer~~~ >> Information contained and transmitted by this e-mail is confidential and >> proprietary to iGATE and its affiliates and is intended for use only by the >> recipient. 
If you are not the intended recipient, you are hereby notified >> that any dissemination, distribution, copying or use of this e-mail is >> strictly prohibited and you are requested to delete this e-mail immediately >> and notify the originator or mailad...@igate.com. iGATE does not enter >> into any agreement with any party by e-mail. Any views expressed by an >> individual do not necessarily reflect the view of iGATE. iGATE is not >> responsible for the consequences of any actions taken on the basis of >> information provided, through this email. The contents of an attachment to >> this e-mail may contain software viruses, which could damage your own >> computer system. While iGATE has taken every reasonable precaution to >> minimise this risk, we cannot accept liability for any damage which you >> sustain as a result of software viruses. You should carry out your own >> virus checks before opening an attachment. To know more about iGATE please >> visit www.igate.com. >> >> >> >> ___ >> riak-users mailing list >> riak-users@lists.basho.com >> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com >> >> > > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com > > -- Sean Cribbs Software Engineer Basho Technologies, Inc. http://basho.com/ ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
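The underscore flattening Sean describes (nested JSON fields becoming "employee_employee_name") can be sketched as a small recursive function. This is an illustrative approximation of the JSON extractor's behavior, not its exact implementation; list handling is omitted for brevity:

```python
def flatten(doc, prefix="", sep="_"):
    """Flatten a nested JSON object into search field names, joining
    nesting levels with underscores, e.g. {"employee":
    {"employee_name": "Ajit"}} -> {"employee_employee_name": "Ajit"}."""
    fields = {}
    for key, value in doc.items():
        name = prefix + sep + key if prefix else key
        if isinstance(value, dict):
            # Recurse, carrying the accumulated field-name prefix.
            fields.update(flatten(value, name, sep))
        else:
            fields[name] = value
    return fields
```

So a query for `employee_name:"Ajit"` finds nothing even with valid JSON; the indexed field would be `employee_employee_name`.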
Re: Split index with Riak 2.0 git (jan 15th) on a single dev node cluster
Hi Rob, I believe Ryan meant to wait a second to do a Yokozuna search, not a general Riak K/V operation. There is more information about "read your own writes" here: http://basho.com/tag/configurable-behaviors/ -- Luke Bakken CSE lbak...@basho.com On Wed, Jan 22, 2014 at 11:36 AM, Rob Speer wrote: >> 5. Did you wait at least 1 second before running the queries? > > I'm not the original poster but I'm now wondering what this question means. > Under what circumstances do you have to wait 1 second before query results > are available? > > We want to always be able to run tests on our database rapidly, which > includes loading data and then immediately querying to make sure the correct > data is there. I know we've had some tests where we were not able to read > our own writes, but I thought those were fixed by making sure we used vector > clocks correctly. Is there a situation where you have to wait for an > unspecified amount of time before you can read your writes? > > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com > ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
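Because index updates are asynchronous, a test suite can poll until the write becomes searchable instead of sleeping a fixed second. A minimal sketch, with the search call, clock, and sleep injected so it works with any client (the names below are assumptions, not a Riak API):

```python
import time

def wait_for_hits(search, timeout=5.0, interval=0.25,
                  clock=time.monotonic, sleep=time.sleep):
    """Poll `search` (a zero-argument callable returning a list of
    matches) until it yields results or `timeout` seconds elapse.
    Raises TimeoutError if the index never catches up."""
    deadline = clock() + timeout
    while True:
        hits = search()
        if hits:
            return hits
        if clock() >= deadline:
            raise TimeoutError("no hits before deadline")
        sleep(interval)
```

This keeps fast tests fast (returns as soon as hits appear) while still bounding the wait when something is genuinely wrong.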
Re: consistency impact on performance
Luke, Yes, the load looks pretty small in our cluster and that explains what I saw. It is interesting to see the performance difference between NoSQL stores that use server-side sharding. Anyway, thanks for your reply. I'm confident in the result now. Satoshi 2014/1/22 Luke Bakken > Satoshi, > > The difference in performance between w=1 and w=3 depends a lot on > your cluster setup and hardware, as well as cluster load. > > In your tests, your cluster is most likely under no load, and all > vnodes are ready to process incoming messages. In this case w=1 and > w=3 will have little difference in response time since vnodes will be > able to respond immediately to the write message. > > A better test would be to increase cluster load with a tool like > basho_bench and benchmark Riak and Cassandra using that tool. > -- > Luke Bakken > CSE > lbak...@basho.com > > > On Sun, Jan 19, 2014 at 9:55 PM, Satoshi Yamada > wrote: > > Luke, > > > > Thanks for your reply. > > > > Actually, the size of the data we use for our product is as small as > > hundreds of bytes. > > I used 330MB of data because I thought the performance difference > > would be clearer when > > storing big data. I ran similar tests with data sizes of 1MB and > 100B, > > but did not > > see much difference either. > > > > thanks, > > Satoshi > > > > > > > > > > 2014/1/20 Luke Bakken > >> > >> Hi Satoshi, > >> > >> When using Riak, your object sizes should ideally be 1MB or less. A > >> 330MB object will never result in acceptable Riak performance. > >> > >> If you intend to store large objects like this I strongly recommend > >> using Riak CS, which will break up the object for you into chunks that > >> can be managed by Riak. It also provides an S3-compatible API.
> >> > >> Please read more about Riak CS here: > http://docs.basho.com/riakcs/latest/ > >> > >> Thanks > >> -- > >> Luke Bakken > >> CSE > >> lbak...@basho.com > >> > >> > >> On Sun, Jan 19, 2014 at 9:29 PM, Satoshi Yamada > >> wrote: > >> > Hi, i'm Satoshi, new to riak. > >> > > >> > I would like to check how consistency changes impact the > performance > >> > of a > >> > riak cluster. I used w=1 and w=3, and I expected at least twice the > >> > execution time, but there seems to be no significant change. I saw more > >> > difference > >> > in Cassandra, so I wonder if it's normal in riak or there is something > >> > wrong > >> > in my testing. Can anyone give me some advice on it? > >> > > >> > I simply checked as shown below. > >> > > >> > w=1 > >> > $ time curl -v -XPUT > >> > http://mycluster.com:8098/buckets/w1/keys/data.tar.gz?w=1 -H > >> > "X-Riak-Vclock: > >> > a85hYGB.." -H "Content-Type: text/plain" --data-binary > >> > @data.tar.gz > >> > ... > >> > ... > >> > ... > >> > real 0m27.501s > >> > user 0m0.378s > >> > sys 0m0.674s > >> > > >> > w=3 > >> > $ time curl -v -XPUT > >> > http://mycluster.com:8098/buckets/w3/keys/data.tar.gz?w=3 -H > >> > "X-Riak-Vclock: > >> > a85hYGB.." -H "Content-Type: text/plain" --data-binary > >> > @data.tar.gz > >> > ... > >> > ... > >> > ... > >> > real 0m29.278s > >> > user 0m0.398s > >> > sys 0m0.674s > >> > > >> > My cluster consists of 40 machines, all active and healthy, running > >> > riak-1.4.1. > >> > The data I use is 330MB. > >> > > >> > Thanks in advance, > >> > Satoshi > >> > > >> > ___ > >> > riak-users mailing list > >> > riak-users@lists.basho.com > >> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com > >> > > > > > > ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
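As Luke notes, a single large transfer-bound PUT will mostly measure network time, so many small writes give a clearer w=1 vs w=3 comparison. A tiny timing harness follows; this is a sketch with the actual write call injected (e.g. a wrapper around the curl PUT above), not a substitute for basho_bench:

```python
import time

def time_puts(put, payloads, clock=time.perf_counter):
    """Measure wall-clock latency of a sequence of writes. `put` is
    any callable performing one write (e.g. an HTTP PUT with a chosen
    w value). Returns (total_seconds, mean_seconds_per_op); run once
    per w setting and compare the means."""
    start = clock()
    for payload in payloads:
        put(payload)
    total = clock() - start
    return total, total / len(payloads)
```

Even this only measures an unloaded cluster; driving concurrent load while measuring, as basho_bench does, is what exposes the quorum cost.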
Re: Search is not working in RIAK
Hello Ajit - I'm not a Riak Search expert, but running the JSON you included through http://jsonlint.com yields several errors. Cheers - Dave On Thu, Jan 23, 2014 at 12:19 AM, Ajit Prasad(RMG00) wrote: > Hi, > > Search doesn’t seem to work on a normal JSON file. Have created index on > JSON file and installed the search on the bucket also. > > $search-cmd search out2_buk "employee_name:\"Ajit\"" > > :: Searching for 'employee_name:"Ajit"' / '' in out2_buk... > > -- > > :: Found 0 results. > > > > File content is pasted below : > > "employee":{ > > "employee_name":"Raghav Maitra", > > "employee_id":"1001", > > "designation":"Technical Analyst", > > "emp_contract_no":"1987-XXVI", > > "department": > > {"dept_name":"ADW", > > "sub_dept_name":"A1", > > "dept_id":"A1-2001", > > "dept_description":"Analysis and Devlopement Wing", > > "dept_site_name":"Diego Gracia" > > } > > "employee_name":"Ajit", > > "employee_id":"1002", > > "designation":"Technical Analyst", > > "emp_contract_no":"2001-XXVI", > > "department": > > {"dept_name":"ADW", > > "sub_dept_name":"A1", > > "dept_id":"A1-2001", > > "dept_description":"Analysis and Devlopement Wing", > > "dept_site_name":"Diego Gracia" > > > > } > > > > > > > > > > Thanks and regards, > > Ajit > > Cell:+91-9980084384 > > > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com > > ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com