On Nov 19, 2010, at 2:15 PM, Parker Thompson wrote:
> I'm experimenting with Riak by trying to port a simple a/b testing framework
> that's currently SQL backed. Since I'm using Ripple/riak-client my code below
> is in Ruby/JS.
>
> The domain model is fairly simple. I have visitors, which get created for any
> user who hits the site, visitors see alternatives (currently these are
> ActiveRecord objects) and are tracked by creating experiences (the joining of
> an alternative ID and a visitor). Finally, as visitors do things we track
> events, which are distinguished from one another by their classes.
>
The first concept you'll have to give up with Riak is "join tables", since you
can't have indexes on them in the same way as you can with a relational DB. A
more natural model would be to have a "double" of the ActiveRecord object,
which has the same key/id, and then links to all visitors who viewed that
alternative. That is, you'd have another model (or maybe just an RObject,
depending on how you want to deal with it), like so:
class Riak::Alternative
  include Ripple::Document
  many :visitors, :class_name => "Riak::Visitor"
  property :alternative_id, Integer, :presence => true
  key_on :alternative_id
end
With that in place, some portions of your MapReduce query become simpler and
some more difficult. Below I'm using a technique I blogged about called
"forwarding", which puts the data you want returned at the end of the query
into the keyData for subsequent phases. In a relational DB you'd probably use
a nested SELECT or some crazy group by/having combination. The Riak version
feels more like a fanout (and double-back).
########
def visitors_who_shared
  Riak::MapReduce.new(Ripple.client).
    add("riak_alternatives", ar_id.to_s).
    link(:bucket => 'riak_visitors').
    map(link_to_events_forward_visitor).
    map(map_share_event_to_visitor).
    reduce(["riak_kv_mapreduce", "reduce_set_union"]).
    map(map_identity, :keep => true).
    run
end
# Inspect the links, select the ones that point to events, put the visitor's
# key as the keyData.
# You could also put the whole object in the keyData, but this saves bandwidth
# and computation.
def link_to_events_forward_visitor
  <<-FUNCTION
    function(object, keyData, arg){
      return object.values[0].metadata.Links.reduce(function(acc, link){
        if(link[0] == "events")
          acc.push([link[0], link[1], object.key]);
        return acc;
      }, []);
    }
  FUNCTION
end
# If the data is a ShareEvent, map to the visitor who created it
def map_share_event_to_visitor
  <<-FUNCTION
    function(v, keyData){
      var data = JSON.parse(v.values[0].data);
      if(data._type == "Riak::ShareEvent"){
        return [["visitors", keyData]];
      } else {
        return [];
      }
    }
  FUNCTION
end
def map_identity
  <<-FUNCTION
    function(v){ return [v]; }
  FUNCTION
end
#########
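To make the forwarding idea concrete without a running cluster, here is a
minimal in-memory sketch in plain Ruby. It does not talk to Riak at all: the
STORE hash, the sample keys, and the run_map_phase helper are made up for
illustration, and the method simply mirrors the two map phases of the query
(fan out to events carrying the visitor key as keyData, then double back to
the visitor for share events).

```ruby
# In-memory stand-in for Riak: bucket => key => object (hypothetical data).
STORE = {
  "visitors" => {
    "v1" => { "links" => [["events", "e1", "event"], ["events", "e2", "event"]] },
    "v2" => { "links" => [["events", "e3", "event"]] }
  },
  "events" => {
    "e1" => { "_type" => "Riak::ClickEvent" },
    "e2" => { "_type" => "Riak::ShareEvent" },
    "e3" => { "_type" => "Riak::ClickEvent" }
  }
}

# Run one map phase: each input is [bucket, key, key_data]; whatever the
# block returns becomes the input list of the next phase.
def run_map_phase(inputs)
  inputs.flat_map do |bucket, key, key_data|
    yield STORE.fetch(bucket).fetch(key), key, key_data
  end
end

def visitors_who_shared(visitor_keys)
  inputs = visitor_keys.map { |k| ["visitors", k, nil] }

  # Phase 1: fan out to events, forwarding the visitor key as keyData.
  events = run_map_phase(inputs) do |visitor, key, _|
    visitor["links"].select { |bucket, _, _| bucket == "events" }
                    .map { |bucket, event_key, _| [bucket, event_key, key] }
  end

  # Phase 2: double back, emitting the forwarded visitor for share events.
  shared = run_map_phase(events) do |event, _, visitor_key|
    event["_type"] == "Riak::ShareEvent" ? [["visitors", visitor_key]] : []
  end

  shared.uniq # stands in for reduce_set_union
end

p visitors_who_shared(["v1", "v2"]) # => [["visitors", "v1"]]
```

The point to notice is that the second phase never looks at the event's own
key; the only thing it emits is the value that was forwarded from the first
phase.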
The result of your visitors_who_shared method could then be used to vivify
Visitor objects (it's straightforward, but I'm not putting the code here).
Long term, you'll want to create your own JavaScript built-in functions
instead of passing the source along with every query. I've also only solved
one issue with your schema above (denormalizing the "experiences" into
"alternatives"). Please ask again if you have other questions/issues.
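For the vivifying step, a rough sketch in plain Ruby might look like the
following. It assumes each kept result is the hash a JavaScript map phase
emits for a whole object (a "key" plus a "values" list whose first entry
carries the raw JSON in "data"); check that against what your query actually
returns. The Visitor struct and vivify_visitors helper are hypothetical
stand-ins, not part of Ripple, and the sample data is invented.

```ruby
require "json"

# Plain stand-in for a visitor model (hypothetical; with Ripple you would
# build instances of your own Riak::Visitor class instead).
Visitor = Struct.new(:key, :attributes)

# Turn raw MapReduce results into Visitor objects.
# Assumed result shape: { "key" => ..., "values" => [{ "data" => "<json>" }] }
def vivify_visitors(results)
  results.map do |result|
    data = result["values"].first["data"]
    Visitor.new(result["key"], JSON.parse(data))
  end
end

# Invented sample result, only to show the expected input shape.
sample = [
  { "bucket" => "riak_visitors", "key" => "v1",
    "values" => [{ "metadata" => {}, "data" => '{"_type":"Riak::Visitor"}' }] }
]

visitors = vivify_visitors(sample)
puts visitors.first.key # prints "v1"
```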
Sean Cribbs <[email protected]>
Developer Advocate
Basho Technologies, Inc.
http://basho.com/