RIAK use in VMs

2013-04-17 Thread Yan Martins
I know it's not good to use Riak in VMs, but even so I would like some support with one scenario: we have a 4-node Riak cluster, with each node in a different VM. In case one of the VMs goes down and needs to be restarted on a different machine (so it won't have the same data as the node it's replac

Re: Experimental branch - 2i query improvements

2013-04-17 Thread Martin Sumner
D'oh. I meant 400 bytes. On 18 April 2013 00:05, Kresten Krab Thorup wrote: > Very interesting! > > Regarding feature #1: I don't understand how an ets based index adds up to > 400k per posting; as your write up suggests. Did you mean 400b? I thought > ets was reasonably memory efficient. Do

Re: Experimental branch - 2i query improvements

2013-04-17 Thread Kresten Krab Thorup
Very interesting! Regarding feature #1: I don't understand how an ets based index adds up to 400k per posting; as your write up suggests. Did you mean 400b? I thought ets was reasonably memory efficient. Do you use very large keys? Sent from my iPhone On 16/04/2013, at 23.50, "Martin Sumner"

Re: Experimental branch - 2i query improvements

2013-04-17 Thread Ian Rees
I've been experimenting with a port of my application from single-node BerkeleyDB backend to Riak, with some promising results. Features #1 and #3 are excellent news, and will definitely result in increased performance. Thanks for keeping us posted on your work, Ian On Tue, Apr 16, 2013 at 4:48

Re: "expected_binaries" error in search

2013-04-17 Thread Rob Speer
So, actually, it turns out that's not exactly the command I typed. I actually typed: curl -v -XPUT -H "Content-Type: application/json" -d '{"terms": "", segments: "d do doc docs"}' 'http://riak.lumi:8098/riak/test-search/doc9?returnbody=true' That is, I left out a pair of quotation marks, so my
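The slip described in this message, a JSON body whose key is missing its quotation marks, can be caught on the client before the request is sent. The following is a minimal Python sketch, not part of the original thread. The helper name `is_valid_json` is hypothetical, and the "quoted" body is only an assumption about what the author intended to type.

```python
import json


def is_valid_json(body: str) -> bool:
    """Return True if body parses as JSON, False otherwise."""
    try:
        json.loads(body)
    except json.JSONDecodeError:
        return False
    return True


# Body as typed in the message above: the key segments has no quotes.
typed_body = '{"terms": "", segments: "d do doc docs"}'

# Presumed intended body with the key quoted (an assumption for illustration).
quoted_body = '{"terms": "", "segments": "d do doc docs"}'

print(is_valid_json(typed_body))
print(is_valid_json(quoted_body))
```

A check like this only says whether the body is well-formed JSON; it does not explain how Riak Search reacts to a malformed document.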

Re: "expected_binaries" error in search

2013-04-17 Thread Rob Speer
Oh right, and here's the error I get over HTTP when I run the second command: > PUT /riak/test-search/doc9?returnbody=true HTTP/1.1 > User-Agent: curl/7.27.0 > Host: riak.lumi:8098 > Accept: */* > Content-Type: application/json > Content-Length: 39 > < HTTP/1.1 500 Internal Server Error < Vary: Ac

Re: "expected_binaries" error in search

2013-04-17 Thread Rob Speer
I reproduced this at the command line. Here I'm storing two documents, with IDs 'doc8' and 'doc9', into a search-enabled bucket named 'test-search'. # This command works, even though 'lsh' is empty. I believe this is because I've never put a field named 'lsh' in this bucket, 'test-search'. curl -v

Re: riak errors related to riak_pipe_vnode

2013-04-17 Thread Alexander Moore
Hi Bhuwan, That error usually means that you are overloading your cluster in some fashion. To fix this you could try a rolling restart of your nodes. To find the root cause though, can you reply with the following information: What are the ulimit settings on your riak machines? Are you running a
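For the ulimit question, one way to read the open-file limit programmatically is Python's standard `resource` module (Unix only). This is a rough sketch rather than anything from the thread: the function name `open_file_limits` is hypothetical, and the values it reports apply to the Python process itself, so they match the Riak node's limits only if the script runs as the same user in the same kind of session.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) limits on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


soft, hard = open_file_limits()
# A value equal to resource.RLIM_INFINITY means the limit is unlimited.
print(f"open files: soft={soft} hard={hard}")
```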

riak errors related to riak_pipe_vnode

2013-04-17 Thread Bhuwan Chawla
I'm seeing tons of errors in our log files and not sure what they mean (Riak 1.2.1): error.log <0.314.0>@riak_pipe_vnode:new_worker:766 Pipe worker startup failed:fitting was gone before startup erlang.log [error] Pipe worker startup failed:fitting was gone before startup Any advice is greatly

Re: SV: Riak + Disco (MapReduce alternative)

2013-04-17 Thread Antonio Rohman Fernandez
I think it might not be much of a problem to not repeat Maps... From this example on their site: ---from disco.core import Job, result_iterator def map(line, params):  for word in line.split():    yield word, 1 def reduce(iter, params): 
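The quoted example is cut off in this listing before the body of `reduce`. As an illustration of the same map/reduce word-count idea, here is a plain-Python sketch that does not use Disco at all. The names `map_words`, `reduce_counts`, and `word_count` are hypothetical, and the reduce step shown is an assumption, not necessarily what the truncated example does.

```python
from itertools import groupby
from operator import itemgetter


def map_words(line):
    """Map step: emit a (word, 1) pair for every word in the line."""
    for word in line.split():
        yield word, 1


def reduce_counts(pairs):
    """Reduce step: sort the pairs, group them by word, and sum the counts."""
    for word, group in groupby(sorted(pairs), key=itemgetter(0)):
        yield word, sum(count for _, count in group)


def word_count(lines):
    """Run the map step over every line, then reduce all emitted pairs."""
    pairs = [pair for line in lines for pair in map_words(line)]
    return dict(reduce_counts(pairs))


counts = word_count(["riak disco riak", "disco erlang"])
print(counts)
```

Because the reduce step sorts and sums, running the map step twice over the same line would double that line's counts, which is the duplication concern raised elsewhere in this thread.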

Re: Riak + Disco (MapReduce alternative)

2013-04-17 Thread Antonio Rohman Fernandez
Thank you! It's pretty interesting; I'll have to try it and see :) On 17.04.2013 15:54, Sean Cribbs wrote: > This presentation might interest you: http://www.infoq.com/presentations/Big-Data-Hadoop-Java > > On Wed, Apr 17, 2013 at 6:31 AM, Antonio Rohman Fernandez wrote: > >> Is written i

Re: Riak + Disco (MapReduce alternative)

2013-04-17 Thread Dmitri Zagidulin
What's interesting is that Pavlo Baron mentions a heavily modified Disco, that he altered to make it run alongside each Riak node. I wonder if those mods are available? It would be great to talk to him about this. On Wed, Apr 17, 2013 at 9:54 AM, Sean Cribbs wrote: > This presentation might inte

SV: Riak + Disco (MapReduce alternative)

2013-04-17 Thread Jens Rantil
Hi, I've been following the Disco Project for a couple of years. The tricky part with using Disco with Riak would be to make sure each map phase is not executed multiple times over the same data*. Also, since each map phase would (preferably) run on the same host as its data (for data locality)

Re: Riak + Disco (MapReduce alternative)

2013-04-17 Thread Sean Cribbs
This presentation might interest you: http://www.infoq.com/presentations/Big-Data-Hadoop-Java On Wed, Apr 17, 2013 at 6:31 AM, Antonio Rohman Fernandez < roh...@mahalostudio.com> wrote: > Is written in Erlang ( the core app ) but the MR jobs are written in > Python... in the same way that

Re: Riak + Disco (MapReduce alternative)

2013-04-17 Thread Antonio Rohman Fernandez
It is written in Erlang (the core app), but the MR jobs are written in Python, in the same way that Riak is written in Erlang but its MR jobs are written in JavaScript (though they can also be written in Erlang). Thanks, Rohman On 17.04.2013 13:27, Chris Corbyn wrote: >> Has anyone tried to us

Re: Riak + Disco (MapReduce alternative)

2013-04-17 Thread Chris Corbyn
Oh no, spoke too soon, it is 50% Erlang… I just happened to be looking at the 50% Python when browsing the source on GitHub. On 17 Apr 2013, at 21:27, Chris Corbyn wrote: >> Has anyone tried to use Riak with Disco? [ http://discoproject.org ] I was >> looking for Hadoop alter

Re: Riak + Disco (MapReduce alternative)

2013-04-17 Thread Chris Corbyn
> Has anyone tried to use Riak with Disco? [ http://discoproject.org ] I was > looking for Hadoop alternatives ( as the RIAK-HADOOP connector project seems > not going anywhere ) and I think Disco is quite interesting, moreover is > written in Erlang same as Riak. Looks like it would be a good m

Riak + Disco (MapReduce alternative)

2013-04-17 Thread Antonio Rohman Fernandez
Hello everybody, Has anyone tried to use Riak with Disco? [ http://discoproject.org ] I was looking for Hadoop alternatives (as the RIAK-HADOOP connector project does not seem to be going anywhere) and I think Disco is quite interesting; moreover, it is written in Erlang, the same as Riak. Looks like it would

Re: Simple mapreduce with 2i returns different result

2013-04-17 Thread Mattias Sjölinder
Ok, we got it. We have made a fix for handling the result that is working for us. Thanks for your time and responses around this! Regards, Mattias 2013/4/17 OJ Reeves > Hi guys, > > Russell is correct. In CI we gather the responses as they come in, in > whichever order they're received. We make s

Re: Experimental branch - 2i query improvements

2013-04-17 Thread Olav Frengstad
Yes, but also include funs in Riak and could be enabled by passing sending. To enable regex matching the args might contain a 'regex' atom. Taking my previous example, you could match a time range by passing a term [{datetime, [{month, [8]}, {hour, [{8, 16}]}]}] which translates to anything in aug

Re: Experimental branch - 2i query improvements

2013-04-17 Thread Martin Sumner
Olav, Do you mean similar to passing a map function into M/R, but a function which is applied to the term rather than the object? I agree that would be neater, and much more powerful ... I just don't know how to do it. I'm short of time at the moment to progress this, but perhaps in the next cou

Re: Experimental branch - 2i query improvements

2013-04-17 Thread Olav Frengstad
The features sound very promising; from an ease-of-use perspective, features #3 and #4 definitely have great value. Being able to do multi-conditional 2i queries based on composite keys is something I'm personally excited about. Out of curiosity, have you thought about adding additional flexibility t

Re: stream_list_keys

2013-04-17 Thread tom kelly
Hi Christian, Thanks for the reply! That code snippet is very similar to what I had, except I was missing the ack_keys call. That leads to the trivial question of why are there two keys messages, {Id, From, {keys,Keys}} and {Id, {keys,Keys}}? I guess the ack has been recently added for flow contro

Re: Simple mapreduce with 2i returns different result

2013-04-17 Thread OJ Reeves
Hi guys, Russell is correct. In CI we gather the responses as they come in, in whichever order they're received. We make some effort to flatten the result, but we take the approach of being "smart enough to get you the results, but no smarter". We don't do anything to reorder or manage the result
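Since the responses may arrive in any order and may be nested, a caller comparing two runs of the same query probably wants to flatten and compare them without relying on order. Below is a small Python sketch of that idea. The sample result shapes and the names `flatten` and `same_results` are hypothetical; the actual structure returned by CorrugatedIron is not shown in this listing.

```python
def flatten(result):
    """Recursively flatten nested lists into a single flat list of values."""
    flat = []
    for item in result:
        if isinstance(item, list):
            flat.extend(flatten(item))
        else:
            flat.append(item)
    return flat


def same_results(a, b):
    """Compare two results while ignoring both nesting and arrival order."""
    return sorted(flatten(a)) == sorted(flatten(b))


# Hypothetical results from two runs of the same query, in different orders.
first_run = [[3], [1], [2]]
second_run = [[2], [3], [1]]

print(flatten(first_run))
print(same_results(first_run, second_run))
```

Sorting before comparison assumes the values are mutually comparable, which holds for plain numbers or strings but not for arbitrary mixed documents.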

Re: Simple mapreduce with 2i returns different result

2013-04-17 Thread Russell Brown
On 17 Apr 2013, at 08:54, Mattias Sjölinder wrote: > Thanks for your help. Your query returned the same number over and over again > just as expected. > > I think I have found the reason for my problem though. The client lib > CorrugatedIron seems to wrap each document in the MapReduce resul

Re: Simple mapreduce with 2i returns different result

2013-04-17 Thread Mattias Sjölinder
Thanks for your help. Your query returned the same number over and over again just as expected. I think I have found the reason for my problem though. The client lib CorrugatedIron seems to wrap each document in the MapReduce result into an array, turning the result into a nested array looking like: