Re: Did a force-remove of two nodes, now system is unresponsive
Hi Marcel,

What is the configured ring size for this cluster? You can slow down the transfers by running `riak-admin transfer-limit 1` on one of your Riak nodes. iowait should decrease as well once the transfer limit is lowered, unless one of your disks is failing or about to fail.

Regards,
Ciprian

On Mon, Aug 18, 2014 at 9:18 PM, marcel.koopman <marcel.koop...@gmail.com> wrote:

We have a 5-node Riak cluster. Two nodes in this cluster had to be removed because they have been unavailable for half a year, so a force-remove was done. After this, the 3 remaining nodes began to transfer all the data, and we ended up with a completely unresponsive system. The iowait is blocking us now. We are hoping this will settle today; the next operation was actually to add two new nodes. And yes, this is production. Is there any chance we lost data?
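For reference, a minimal sketch of the commands involved, assuming riak-admin is on the PATH of each node (the second command, for watching progress, is an addition here and not part of Ciprian's reply):

```
riak-admin transfer-limit 1   # cap each node at one concurrent handoff transfer
riak-admin transfers          # show which partition transfers are still pending
```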
Bitcask Key Listing
I currently maintain my own indexes for some things, and use natural keys where I can, but a question has been nagging me lately: why is key listing slow? Specifically, why is Bitcask key listing slow? One of the biggest issues with Bitcask is that all keys (including the bucket name and some overhead) must fit into RAM. For large numbers of keys, I understand the coordination and data transfer will hurt, but shouldn't things like listing buckets (or listing keys from small buckets) be fast? Is there a reason this is slow, and is there a plan to fix it?

Thanks,
Jason
Re: Fwd: RiakCS 504 Timeout on s3cmd for certain keys
Hey Kota,

We’re currently using the following versions:

```
# Download Riak CS
# Version: 1.4.5
# OS: Ubuntu 12.04 (Precise) AMD64
curl -O http://s3.amazonaws.com/downloads.basho.com/riak-cs/1.4/1.4.5/ubuntu/precise/riak-cs_1.4.5-1_amd64.deb

# Download Riak
# Version: 1.4.8
# OS: Ubuntu 12.04 (Precise) AMD64
curl -O http://s3.amazonaws.com/downloads.basho.com/riak/1.4/1.4.8/ubuntu/precise/riak_1.4.8-1_amd64.deb
```

I checked our Riak CS app.config and fold_objects_for_list_keys is set to false. What impact would flipping that to true have on my cluster? Would I simply update the app.config and restart Riak CS?

As for the consideration about garbage collection: the slow performance has been happening consistently over the span of a week (since we noticed it; we don't often list buckets). I suspect it is not a case of large numbers of deleted objects, as generally all data going into that bucket is write-once (we process PDF pages to .JPG and PUT them in that bucket; the only time overwrites occur is if we manually re-trigger the processing script on a specific document).

Adding ke...@basho.com, as we have another thread going about this same topic; I figured we could merge the discussion to reduce duplicated effort.

Alex Millar, CTO
Office: 1-800-354-8010 ext. 704 | Mobile: 519-729-2539
GoBonfire.com

On August 18, 2014 at 10:03:40 PM, Kota Uenishi <k...@basho.com> wrote:

Alex,

Riak CS 1.4.5 and 1.5.0 had a lot of improvements after the articles whose URLs you posted [1]; it is now no longer using Riak's bucket listing, but Riak's internal API, for more efficient listing. What version of Riak CS are you using? I want to make sure you're using one of those versions and have the line `{fold_objects_for_list_keys, true},` in the riak_cs section of app.config (assuming all the Riak parts are configured correctly).

> Based on this I'm thinking that the cost of this type of query is only going to get worse over time as we add more keys to this bucket (unless secondary indexes can be added). Or am I totally out to lunch here and there's some other underlying problem?

The strange part is s3cmd. Riak CS has an incremental bucket-listing API that requires clients to iterate over every 1000 objects (common prefixes), but s3cmd iterates over the whole specified bucket before printing anything. You can observe how s3cmd and Riak CS interact if you specify the '-d' option like this:

```
s3cmd -d -c yours.s3cfg ls -r s3://yourbucket/yourdir/
```

I would not expect Riak CS's listing API to be so slow as to need 5 seconds (or, say, 10 seconds) per request, because each request returns just 1000 objects. There is another possibility for the slow query: if you had many (say, more than ten thousand) deleted objects in the same bucket, it might affect each 1000-object listing. This will eventually be resolved as Riak CS's garbage collection removes deleted manifests, which are merely marked as deleted (and have to be skipped correctly).

[1] http://www.quora.com/Riak/Is-it-really-expensive-for-Riak-to-list-all-buckets-Why

On Thu, Aug 14, 2014 at 6:05 AM, Alex Millar <a...@gobonfire.com> wrote:

Good afternoon Charlie,

So the issue we're having is only with bucket listing.
```
alxndrmlr@alxndrmlr-mbp $ time s3cmd -c .s3cfg-riakcs-admin ls s3://bonfirehub-resources-can-east-doc-conversion
    DIR   s3://bonfirehub-resources-can-east-doc-conversion/organizations/

real    2m0.747s
user    0m0.076s
sys     0m0.030s
```

whereas...

```
alxndrmlr@alxndrmlr-mbp $ time s3cmd -c .s3cfg-riakcs-admin ls s3://bonfirehub-resources-can-east-doc-conversion/organizations/OrganizationID-1/documents/proposals
    DIR   s3://bonfirehub-resources-can-east-doc-conversion/organizations/OrganizationID-1/documents/proposals/

real    0m10.262s
user    0m0.075s
sys     0m0.028s
```

The bucket contains a lot of very small files (basically, for each PDF we receive, I split it into a .JPG for each page and store those here). Based on my latest counts, it looks like we have around 170,000 .JPG files in that bucket.

Here's a snippet from the HAProxy log for the 504 timeouts:

```
Aug 12 16:01:34 localhost.localdomain haproxy[4718]: 192.0.223.236:48457 [12/Aug/2014:16:01:24.454] riak_cs~ riak_cs_backend/riak3 161/0/0/-1/10162 504 194 - - sH-- 0/0/0/0/0 0/0 {bonfirehub-resources-can-east-doc-conversion.bf-riakcs.com} GET /?delimiter=/ HTTP/1.1
```

I've put together a video showing the top output of each of the 5 Riak nodes while performing `time s3cmd -c .s3cfg-riakcs-admin ls s3://bonfirehub-resources-can-east-doc-conversion`:
https://dl.dropboxusercontent.com/u/5723659/RiakCS%20ls%20monitoring%20results.mov

Now I've had a hunch this is just…
Re: Bitcask Key Listing
Jason,

There are two aspects to a key-listing operation that make it expensive relative to normal gets or puts.

The first is that, due to the way data is distributed in Riak, key listing requires a covering set of vnodes to participate in order to determine the list of keys for a bucket. A minimal covering set works out to 1/N of the vnodes in the cluster, where N is the n_val of the bucket. By default this is 3, so in the default case a key-listing request must send a request to, and receive responses from, a third of the vnodes in the cluster. This incurs network-traversal overhead as the keys from each vnode are returned, and the speed to completion is limited by the slowest vnode in the covering set. This is true regardless of the backend in use.

The second is specific to Bitcask. Bitcask is an unordered backend, and the consequence when doing a key listing is that all of the keys stored by a participating vnode must be scanned. It doesn't matter whether there are 2 keys or 2000 keys for the bucket being queried; they all must be scanned. This is a case where holding all the keys in memory benefits performance, but as the amount of data stored grows, so does the expense of scanning it. The leveldb backend is ordered, and we can take advantage of that to scan only the data for the bucket in question, but for Bitcask that is not an option.

At this time there is nothing in the works to specifically improve key-listing performance. It is certainly something we are aware of, but there are other things with higher priority right now.

Hope that helps answer your question.

Kelly
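To put rough numbers on the coverage cost (an illustrative addition, not from Kelly's reply): with the default n_val of 3 and a ring of 64 partitions, a minimal covering plan must consult about 64/3, i.e. 22 vnodes, and the listing cannot return until the slowest of those 22 finishes its scan.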
Re: Fwd: RiakCS 504 Timeout on s3cmd for certain keys
Alex,

The value you had set for fold_objects_for_list_keys is the one I was most interested to see, and I highly recommend setting it to true for your cluster. The impact of setting it to true should be to make bucket-listing operations generally more efficient; there should be no detrimental effects. There are also some optimizations for bucket-listing queries that use the prefix request parameter, so I would expect queries that list specific subdirectories of a bucket to show improved performance as well. Changing the app.config and restarting the CS node is the correct way to have it take effect.

As for GC performance, I would recommend adding an entry to your app.config to set gc_paginated_indexes to true. This option causes the GC process to use a more efficient method of determining which data is eligible for collection, and it generally results in far fewer timeouts and better success for users.

Kelly
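For concreteness, here is a sketch of how those two entries might sit in the riak_cs section of app.config (an illustrative fragment only; all surrounding settings are elided):

```erlang
{riak_cs, [
    %% List objects via a backend fold rather than a key listing
    %% (the setting Kelly and Kota recommend enabling):
    {fold_objects_for_list_keys, true},
    %% Use paginated 2i queries when collecting deleted manifests:
    {gc_paginated_indexes, true}
    %% ... other riak_cs settings ...
]}
```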
Re: Riak Search Issue
Hi Eric,

You were right about naming the bucket the same as the index... it worked that way:

```python
bucket = client.bucket_type('futbolistas').bucket('famoso')
results = bucket.search('name_s:Lion*')
print results
```

```
{'num_found': 2, 'max_score': 1.0, 'docs': [{u'age_i': u'30', u'name_s': u'Lionel', u'_yz_rk': u'lionel', u'_yz_rb': u'fcb', u'score': u'1.e+00', u'leader_b': u'true', u'_yz_id': u'1*futbolistas*fcb*lionel*59', u'_yz_rt': u'futbolistas'}, {u'age_i': u'30', u'name_s': u'Lionel', u'_yz_rk': u'lionel', u'_yz_rb': u'famoso', u'score': u'1.e+00', u'leader_b': u'true', u'_yz_id': u'1*futbolistas*famoso*lionel*8', u'_yz_rt': u'futbolistas'}]}
```

Later I will check installing the Git version and see whether it works with a different bucket name.

Thanks,
Alex

On Mon, Aug 18, 2014 at 11:12 PM, Alex De la rosa <alex.rosa@gmail.com> wrote:

Hi Eric, I will try this suggestion, and I will also try Luke's suggestion of using the latest Git version instead of the PyPI release, to see if this is something already fixed. Once that's done, I will tell you whether it is really a bug or was already fixed in Git.

On Mon, Aug 18, 2014 at 11:10 PM, Eric Redmond <eredm...@basho.com> wrote:

Alex, you may have discovered a legitimate bug in the Python driver. In the meantime, if you give your bucket and index the same name, you can proceed while we investigate.

On Aug 18, 2014, at 2:00 PM, Alex De la rosa wrote:

Yes, I did that on purpose, because I had done so many tests that I wanted to start fresh... so I kind of translated the documentation, but that is irrelevant to the case.

On Mon, Aug 18, 2014 at 10:59 PM, Eric Redmond wrote:

Your steps seem to have named the index famoso.

On Aug 18, 2014, at 1:56 PM, Alex De la rosa wrote:

OK, I found the first error in the documentation; the parameters are in reverse order:

    bucket = client.bucket('animals', 'cats')

should be:

    bucket = client.bucket('cats', 'animals')

Now it could save, and it found the bucket type (bucket = client.bucket('fcb', 'futbolistas') vs. bucket = client.bucket('futbolistas', 'fcb')). However, even after fixing that, the next step fails just as it was failing before:

```python
bucket = client.bucket('fcb', 'futbolistas')
results = bucket.search('name_s:Lion*')
print results
```

```
Traceback (most recent call last):
  File "x.py", line 13, in <module>
    results = bucket.search('name_s:Lion*')
  File "/usr/local/lib/python2.7/dist-packages/riak/bucket.py", line 420, in search
    return self._client.fulltext_search(self.name, query, **params)
  File "/usr/local/lib/python2.7/dist-packages/riak/client/transport.py", line 184, in wrapper
    return self._with_retries(pool, thunk)
  File "/usr/local/lib/python2.7/dist-packages/riak/client/transport.py", line 126, in _with_retries
    return fn(transport)
  File "/usr/local/lib/python2.7/dist-packages/riak/client/transport.py", line 182, in thunk
    return fn(self, transport, *args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/riak/client/operations.py", line 573, in fulltext_search
    return transport.search(index, query, **params)
  File "/usr/local/lib/python2.7/dist-packages/riak/transports/pbc/transport.py", line 564, in search
    MSG_CODE_SEARCH_QUERY_RESP)
  File "/usr/local/lib/python2.7/dist-packages/riak/transports/pbc/connection.py", line 50, in _request
    return self._recv_msg(expect)
  File "/usr/local/lib/python2.7/dist-packages/riak/transports/pbc/connection.py", line 142, in _recv_msg
    raise RiakError(err.errmsg)
riak.RiakError: 'No index fcb found.'
```

Again it says index fcb was not found... and this time I fully followed the right documentation and didn't use bucket.enable_search().

On Mon, Aug 18, 2014 at 10:49 PM, Alex De la rosa wrote:

Hi Eric, I'm sorry, but I followed the documentation you provided and it still raises issues:

STEP 1: Create index famoso (Python):

    client.create_search_index('famoso')

STEP 2: Create bucket type futbolistas (shell):

    riak-admin bucket-type create futbolistas '{"props":{"search_index":"famoso"}}'
    => futbolistas created
    riak-admin bucket-type activate futbolistas
    => futbolistas has been activated

STEP 3: Create bucket fcb and add data (Python):

    bucket = client.bucket('futbolistas', 'fcb')
    c = bucket.new('lionel', {'name_s': 'Lionel', 'age_i': 30, …
Re: Riak Search Issue
Hi Sean,

Yeah, I opted to follow that pattern in my latest attempt, as I find it clearer than the way shown in the documentation. Still the same issue, although Eric and I saw that it works fine when the index and the bucket have the same name.

Thanks!
Alex

On Mon, Aug 18, 2014 at 11:27 PM, Sean Cribbs <s...@basho.com> wrote:

Don't use bucket() with 2 arguments; use client.bucket_type('futbolistas').bucket('fcb'). This makes your intent clearer. The 2-arity version of bucket() exists for backwards compatibility.
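For readers following this thread, here is a minimal end-to-end sketch of the pattern Sean recommends, assembled from the commands already posted above (the index, type, and bucket names are the thread's own; a sketch, not tested against this cluster, and the bucket is named after the index as per Eric's workaround):

```python
from riak import RiakClient

client = RiakClient()  # defaults to localhost; point at your node as needed

# One-time setup: create the search index.
client.create_search_index('famoso')

# One-time setup, from a shell: bind a bucket type to the index.
#   riak-admin bucket-type create futbolistas '{"props":{"search_index":"famoso"}}'
#   riak-admin bucket-type activate futbolistas

# Resolve the bucket through its type, store a document, then search.
bucket = client.bucket_type('futbolistas').bucket('famoso')
obj = bucket.new('lionel', {'name_s': 'Lionel', 'age_i': 30})
obj.store()

results = bucket.search('name_s:Lion*')
print results['num_found']
```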
RE: Riak python client and Solr
Thanks Eric,

It throws errors for 'group.field'='build.version':

    search_results = riak_client.fulltext_search(self.Result_Index, 'build.type:CI', group='on', 'group.field'='build.version')

which gives: "Cannot appear past keyword arguments." I tried some variations of the same; they did not seem to work. Any suggestions?

Thanks,
Meghna

From: Eric Redmond [mailto:eredm...@basho.com]
Sent: Tuesday, August 19, 2014 1:12 PM
To: Sapre, Meghna A
Cc: riak-users
Subject: Re: Riak python client and Solr

You don't pass in the query as a URL-encoded string, but rather as a set of parameters. So you'd call something like:

    search_results = riak_client.fulltext_search(self.Result_Index, 'build.type:CI', group='on', 'group.field'='build.version')

Eric

On Aug 19, 2014, at 1:08 PM, Sapre, Meghna A <meghna.a.sa...@intel.com> wrote:

Hi,

I am trying to use the group and stats options with Riak search. I get the expected results with HTTP URLs, but not with the Python client's protocol-buffers full-text search. Here's what I'm trying to do:

```python
q = 'build.type:CI&group=on&group.field=build.version'
try:
    search_results = riak_client.fulltext_search(self.Result_Index, q)
except Exception as e:
    print e
    log.exception(e)
```

This throws an error: "no field name specified in query and no default specified via 'df' param". The same query string works without the group options, and the complete string works in the HTTP API. Any suggestions on how to make this work?

Thanks,
Meghna
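Python rejects dots in keyword-argument names, which is exactly the syntax error Meghna hit. A workaround sketch that stays syntactically valid: build the Solr parameters in a dict and unpack it (this assumes fulltext_search forwards extra keyword arguments to Solr, as its signature in the traceback above suggests):

```python
# Pass 'group.field' through dict unpacking so it never has to appear
# as a literal keyword argument; the names below are the thread's own.
params = {'group': 'on', 'group.field': 'build.version'}
search_results = riak_client.fulltext_search(self.Result_Index,
                                             'build.type:CI', **params)
```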
Counters inside Maps
Imagine I have a Riak object footballer with some static fields: name, team, number. I store them like this now:

1: CREATE INDEX FOR RIAK SEARCH

    curl -XPUT http://148.251.140.229:8098/search/index/ix_footballers

2: CREATE BUCKET TYPE

    riak-admin bucket-type create tp_footballers '{"props":{"allow_mult":false,"search_index":"ix_footballers"}}'
    riak-admin bucket-type activate tp_footballers

3: INSERT A PLAYER

    bucket = client.bucket_type('tp_footballers').bucket('footballers')
    key = bucket.new('lionelmessi', data={'name_s': 'Messi', 'team_s': 'Barcelona', 'number_i': 10}, content_type='application/json')
    key.store()

4: SEARCH FOR BARCELONA PLAYERS

    r = client.fulltext_search('ix_footballers', 'team_s:Barcelona')

So far so good :) BUT... what if I want a field goals_i that is a counter, to be incremented each match day with the number of goals he scored? What are the syntax/steps to set up footballers as a MAP and then put a COUNTER inside? I know it is possible, as I read it in a data dump a Basho employee passed me some time ago, but I can't work out how to do it now.

Thanks!
Alex
Re: Counters inside Maps
Alex,

Assuming you've already created your bucket type with map as the datatype, bucket.new() will return a Map instead of a RiakObject. Translating your example above:

    key = bucket.new('lionelmessi')
    key.registers['name'].assign('Messi')
    key.registers['team'].assign('Barcelona')
    key.counters['number'].increment(10)
    key.store()

Note that because Maps are based on mutation operations, rather than on replacing the value with a new one, you can later do this without setting the entire value:

    key.counters['number'].increment(1)
    key.store()

This will also change your searches, however, in that the fields will be suffixed with the embedded type you are using:

    r = client.fulltext_search('ix_footballers', 'team_register:Barcelona')

Hope that helps!

--
Sean Cribbs <s...@basho.com>
Software Engineer
Basho Technologies, Inc.
http://basho.com/
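The bucket-type step Sean assumes could look like the following sketch (the type and index names mirror the question above; note that datatype bucket types keep siblings enabled internally, so the earlier allow_mult:false would not apply here):

```
riak-admin bucket-type create tp_footballers '{"props":{"datatype":"map","search_index":"ix_footballers"}}'
riak-admin bucket-type activate tp_footballers
```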
Re: Counters inside Maps
Cool! Understood :) Thanks!

Alex

On Wednesday, August 20, 2014, Sean Cribbs <s...@basho.com> wrote:

On Tue, Aug 19, 2014 at 3:34 PM, Alex De la rosa <alex.rosa@gmail.com> wrote:

> Hi Sean,
>
> I didn't create the bucket type as a map datatype, since at first I was just testing simple Riak Search... then it occurred to me: what if I want a counter in the data? :) Your example is pretty straightforward and simple to follow. Just 2 questions:
>
> 1. key.counters['number'].increment(1): no need to define a counter datatype somewhere before putting it inside the map, as we normally need for simple buckets? If it works automatically, that's great :)

Yes, it works automatically. All included datatypes are available inside maps.

> 2. If we use number_counter instead of number_i, does Search/Solr understand it is an integer, in case you want to do a range query? Somewhere in the docs I read that it is better to use _s for strings, _b for binary, _i for integers, etc., so Solr knows how to treat the data... I believe there will be no strange behaviour from having _register instead of _s and _counter instead of _i, right?

The default Solr schema that ships with Riak accounts for these datatypes automatically and uses the appropriate index field type:
https://github.com/basho/yokozuna/blob/develop/priv/default_schema.xml#L96-L104

If you write your own schema, you will want to include or change the schema fields appropriately.

--
Sean Cribbs
Software Engineer
Basho Technologies, Inc.
http://basho.com/
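As a footnote to Sean's answer on field types, a range query over an embedded counter would then look something like this (an untested sketch, assuming the default schema's *_counter dynamic field is indexed as a numeric type, per the schema link above; the goals counter is hypothetical):

```python
# Hypothetical: find documents whose 'goals' counter is at least 1.
r = client.fulltext_search('ix_footballers', 'goals_counter:[1 TO *]')
```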
Is it a good practice to make riak a service and automatically start when the machine starts?
Hi,

We have a little uncertainty on our team about whether to have Riak start automatically when a machine is rebooted. It does bring some convenience if Riak starts by default after a machine crashes for some reason and comes back up, but I was wondering: is there any case where automatically starting a problematic node, and having it rejoin the cluster, would cause problems? Do you have any ideas?

Thanks,
Gavin
Re: Is it a good practice to make riak a service and automatically start when the machine starts?
Gavin,

I think if you monitor the crashes and reboots, and take note or raise a flag when they happen often, that can be your cue to investigate the node in more depth. A node going up and down frequently is clearly a sign of something bad that should be investigated. For a rare reboot or crash, having Riak start on boot and come up automatically seems like the more ops-friendly way to handle an event that should be rare.

Because Riak keeps working without all of its nodes up, we've had people forget to start Riak nodes and never notice they were down for weeks. It's good that Riak can take that, but it's not very pleasant when you finally bring the node up and it has a lot of handoff work to do to catch up.

-Jared
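If you do opt for start-on-boot, it is just the platform's usual service tooling (a sketch, assuming the stock Riak packages installed their init script under the name riak; the exact commands vary by distribution):

```
# Debian/Ubuntu:
sudo update-rc.d riak defaults
# RHEL/CentOS:
sudo chkconfig riak on
```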
Re: Is it a good practice to make riak a service and automatically start when the machine starts?
Thanks for the quick reply; it makes sense to me.

Gavin