The only issue with this approach, AFAIK, is that M/R effectively runs with
R=1, i.e. it doesn't ensure that a value is consistent across replicas.
IMHO riak_kv_mapreduce should have a map_get_object_value which does a proper
RiakClient:get. It will be slower, but will honour the bucket's default R
value. Something like this:

    %% The map input itself is a notfound: use the standard notfound handling.
    map_get_object_value({error, notfound}=NF, KD, Action) ->
        notfound_map_action(NF, KD, Action);
    map_get_object_value(RO, KD, Action) ->
        {ok, RiakClient} = riak:local_client(),
        %% Re-read by bucket AND key through the client so the default R applies.
        case RiakClient:get(riak_object:bucket(RO), riak_object:key(RO)) of
            {error, notfound}=NF ->
                notfound_map_action(NF, KD, Action);
            {ok, RiakObject} ->
                [riak_object:get_value(RiakObject)]
        end.

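For comparison, the client-side alternative Jeremy mentions (many concurrent
gets, combined on the client) could look roughly like the sketch below. It is
only an illustration under stated assumptions: fetch is a hypothetical
callable that you would wrap around your client's single-key get and that
returns None for a missing key; multi_get and missing_keys are made-up names,
not part of any Riak client library; and it uses a standard-library thread
pool rather than gevent.

```python
from concurrent.futures import ThreadPoolExecutor


def multi_get(fetch, pairs, max_workers=16):
    """Fetch many (bucket, key) pairs concurrently.

    `fetch` is a hypothetical callable fetch(bucket, key) that returns the
    stored value, or None when the key does not exist.
    Returns a dict mapping each (bucket, key) pair to its value or None.
    """
    pairs = list(pairs)
    if not pairs:
        return {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so values line up with pairs.
        values = list(pool.map(lambda pair: fetch(*pair), pairs))
    return dict(zip(pairs, values))


def missing_keys(fetch, pairs, max_workers=16):
    """Return the (bucket, key) pairs for which `fetch` found nothing."""
    pairs = list(pairs)
    found = multi_get(fetch, pairs, max_workers=max_workers)
    return [pair for pair in pairs if found[pair] is None]


if __name__ == "__main__":
    # Stand-in for Riak: an in-memory dict, purely for demonstration.
    store = {("ids", "a"): b"1", ("ids", "c"): b"3"}

    def fake_fetch(bucket, key):
        return store.get((bucket, key))

    candidates = [("ids", k) for k in "abcd"]
    # Only these pairs would need a db insert in Jeremy's use case.
    print(missing_keys(fake_fetch, candidates))
```

Since each fetch would go through a normal get, every read should honour the
bucket's R value, at the cost of one request per key.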
Kresten
On Aug 9, 2012, at 10:46 AM, Parnell Springmeyer <[email protected]> wrote:
> Jeremy,
>
> I was looking for something similar and first built an extra handler onto an
> internal erlang cowboy API server that used maelstrom (my own worker pool OTP
> application).
>
> It was used to make a simple POST with a string of the {bucket, key} pairs
> and the server would concurrently GET and combine the results and send it
> back. This was very fast (thousands of keys GET in ms).
>
> Since that seemed gross, I then decided (based on some input from someone
> else on the list) to try using a simple Map/Reduce phase that did not use
> javascript but the erlang functions (since those are going to be really fast
> and take advantage of Erlang's concurrency better than the javascript VM's).
>
> In python, you can do this to run that type of M/R phase without knowing any
> Erlang code:
>
> import riak
>
> client = riak.RiakClient()
>
> # Add your KNOWN bucket and key pairs (you can do this in a loop)
> query = client.add(bucket, key)
> query.add(bucket, key)
> query.add(bucket, key)
> # etc… (as many as you like)
>
> # Now tell the map and reduce phases to use the Erlang module
> # "riak_kv_mapreduce" and its functions "map_object_value" and
> # "reduce_set_union". Chain them on query so the inputs added above are used.
> results = query.map(["riak_kv_mapreduce", "map_object_value"]) \
>                .reduce(["riak_kv_mapreduce", "reduce_set_union"]) \
>                .run()
>
> The above returns results faster for me than the brokered multi-get approach
> I used. I guarantee my brokered multi-get is faster than anything you can do
> with python + gevent; if that's the case, the M/R phase is definitely the
> route you want to go.
>
> So IMHO, it is very fast as long as you know the buckets and keys you want to
> get.
>
> On Aug 9, 2012, at 12:11 AM, Jeremy Dunck wrote:
>
>> I'm new to riak and need multi-get (that is, getting the value and/or
>> existence of keys in a single network-trip latency).
>>
>> I was wondering what the latency of the map-reduce approach is?
>> http://lists.basho.com/pipermail/riak-users_lists.basho.com/2011-February/003229.html
>>
>> Alternatively, has anyone tried scaling concurrent gets (perhaps with
>> evented io) to do many concurrent requests and combining results on
>> the client?
>>
>> I am toying with a python+gevent multiget function. If the stance is
>> still that a multiget operation doesn't belong in core, I'm a little
>> surprised that there doesn't seem to at least be a nice client-lib API
>> func to do it. It sure seems useful...
>>
>> In my use-case, the immediate need is to know whether a db insert
>> needs to be done. We're handling too many keys to want to store in
>> memory (so no redis, etc), and we don't want to go to the db more than
>> we need to, so it seems riak would be good here. But we're getting
>> 1000s of potential insert keys and want to whittle down all those to a
>> relative few db inserts.
>>
>> So I was thinking riak key-per-id, and insert to the db iff the riak
>> key doesn't exist, then add the riak key. We'll get some race
>> conditions on the insert, but that's OK in our case.
>>
>> We do need low latency on the riak check, though, hence either
>> multiplexing w/ eventing or map-reduce (if that latency is actually
>> good).
>>
>> Am I doing it wrong?
>>
>> _______________________________________________
>> riak-users mailing list
>> [email protected]
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Mobile: +45 2343 4626 | Skype: krestenkrabthorup | Twitter: @drkrab
Trifork A/S | Margrethepladsen 4 | DK-8000 Aarhus C
Phone: +45 8732 8787 | www.trifork.com