Hi,

I was interested to see how consistent LazyBarrierStrategy is in the MovieLens 
recommendation scenario. To test, I took "the first" 25 users in the MovieLens 
dataset and calculated their recommendations both with and without 
LazyBarrierStrategy.

noLazy = graph.traversal()
lazy = graph.traversal(GraphTraversalSource.build().with(LazyBarrierStrategy.instance()).engine(StandardTraversalEngine.build()))
times = [];
g.V().has(label,'user').limit(25).forEachRemaining{ user ->
   temp = []
   g = noLazy
   temp.add(clock(1){g.V(user).outE('rated').has('stars',gte(4)).inV().aggregate('userLibrary').inE('rated').has('stars',gte(4)).outV().outE('rated').has('stars',gte(4)).inV().where(not(within('userLibrary'))).groupCount().by('name').iterate()})
   g = lazy
   temp.add(clock(1){g.V(user).outE('rated').has('stars',gte(4)).inV().aggregate('userLibrary').inE('rated').has('stars',gte(4)).outV().outE('rated').has('stars',gte(4)).inV().where(not(within('userLibrary'))).groupCount().by('name').iterate()})
   times.add(temp)
}

Here are the calculated runtimes, where the first column is without 
LazyBarrierStrategy and the second is with it. In every instance, using 
LazyBarrierStrategy is significantly faster than not using it.

gremlin> times
==>[13999.353023, 450.689434]
==>[10542.223952, 425.302257]
==>[11071.591249, 419.38044099999996]
==>[17027.139231999998, 467.55589499999996]
==>[2805.7487699999997, 352.16767899999996]
==>[31102.009551, 537.647733]
==>[4584.469966, 384.308581]
==>[16788.467617, 459.42238899999995]
==>[10753.991812999999, 427.419588]
==>[47468.688998, 652.944036]
==>[14083.812865, 506.90151399999996]
==>[15580.792864, 458.823214]
==>[9645.832747999999, 449.04409]
==>[11381.738593, 464.385542]
==>[13237.242909999999, 490.935812]
==>[2142.835126, 367.29828399999997]
==>[45372.748350999995, 568.501034]
==>[7209.624532999999, 405.968438]
==>[22556.824235, 487.382746]
==>[15064.135717, 478.760474]
==>[7029.503852999999, 410.060832]
==>[9033.066171999999, 462.697269]
==>[10691.243585, 455.077389]
==>[3089.741913, 369.11348]
==>[3723.592992, 385.54621099999997]

Then some math to calculate the total and average time for each column.

gremlin> times.collect{it[0]}.sum()  // TOTAL TIME
==>355986.420628
gremlin> times.collect{it[1]}.sum()
==>11337.334362000001
gremlin> times.collect{it[0]}.sum() / times.size() // AVERAGE TIME
==>14239.45682512
gremlin> times.collect{it[1]}.sum() / times.size()
==>453.49337448000006

How much faster (on average) is using LazyBarrierStrategy in the recommendation 
traversals on the MovieLens dataset? --- ~31x faster.

gremlin> times.collect{it[0]}.sum() / times.collect{it[1]}.sum()
==>31.399481506091966

So, there are 6040 MovieLens users. To calculate all the recommendations 
without LazyBarrierStrategy, it would take ~24 hours.

gremlin> (g.V().has(label,'user').count().next() * 14239.45682512) / 1000 / 60 / 60
==>23.8906442288

With LazyBarrierStrategy, it would take ~45 minutes.

gremlin> (g.V().has(label,'user').count().next() * 453.49337448000006) / 1000 / 60 / 60
==>0.76086110607200010

…assuming purely sequential execution in both cases.
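For readers without a Gremlin console handy, the extrapolation arithmetic above can be reproduced in a few lines (a Python sketch; the averages are the ones measured in the 25-user run above):

```python
# Project the total sequential runtime for all 6040 MovieLens users
# from the 25-user per-user averages measured above (milliseconds).
USERS = 6040
AVG_NO_LAZY_MS = 14239.45682512   # avg per-user time without LazyBarrierStrategy
AVG_LAZY_MS = 453.49337448        # avg per-user time with LazyBarrierStrategy

def hours(avg_ms, users=USERS):
    """Sequential runtime in hours for `users` recommendation traversals."""
    return (users * avg_ms) / 1000 / 60 / 60

print(round(hours(AVG_NO_LAZY_MS), 2))  # ~23.89 hours
print(round(hours(AVG_LAZY_MS), 2))     # ~0.76 hours (~45 minutes)
```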

Pretty cool. It would be scary if the same traversal gave radically different 
runtimes for different users -- then the logic in LazyBarrierStrategy would 
have to be mighty complicated.

Take care,
Marko.

http://markorodriguez.com

On Jun 5, 2015, at 2:11 PM, Marko Rodriguez <[email protected]> wrote:

> Hi,
> 
> The first ping that came through on IM was Bob Briody saying: "So you have to 
> know when it's appropriate to add a barrier()?"
> 
> As of yesterday, yes. As of today, no.
> 
> I did two things:
> 
>       1. CollectingBarrierStep (which NoOpBarrierStep extends) now 
> supports a maxBarrierSize so as to protect against OutOfMemoryError (i.e. a "lazy barrier").
>               
> https://github.com/apache/incubator-tinkerpop/blob/master/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/step/util/CollectingBarrierStep.java
>       2. LazyBarrierStrategy looks for traversals with a particular pattern 
> (e.g. not OLAP, deep traversal, lots of starts, little head filtering, etc.) 
> and inserts NoOpBarrierSteps accordingly.
>               
> https://github.com/apache/incubator-tinkerpop/blob/master/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/strategy/finalization/LazyBarrierStrategy.java
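The rewrite idea in item 2 can be sketched in a few lines (a Python toy, NOT the actual LazyBarrierStrategy code; the step names and the size cap are illustrative, and the real strategy's heuristics about OLAP, depth, and filtering are far more nuanced):

```python
# Toy sketch of the idea: walk a compiled list of step names and append a
# size-capped no-op barrier after every step that can multiply traversers.
# The step names and max_barrier_size below are illustrative only.
EXPANDING_STEPS = {"VertexStep", "EdgeVertexStep"}  # hypothetical "expanding" steps

def insert_lazy_barriers(steps, max_barrier_size=10000):
    rewritten = []
    for step in steps:
        rewritten.append(step)
        # strip a trailing "(...)" argument list to get the bare step name
        if step.split("(")[0] in EXPANDING_STEPS:
            rewritten.append("NoOpBarrierStep(%d)" % max_barrier_size)
    return rewritten
```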
> 
> LazyBarrierStrategy is not a default strategy, so you will have to turn it on 
> to use it. However, look at the same MovieLens recommendation traversal 
> from yesterday when LazyBarrierStrategy is turned on -- no more need for 
> manual barrier() additions!
> 
> gremlin> graph = TinkerGraph.open()
> ==>tinkergraph[vertices:0 edges:0]
> gremlin> g = graph.traversal()
> ==>graphtraversalsource[tinkergraph[vertices:0 edges:0], standard]
> gremlin> g = graph.traversal(GraphTraversalSource.build().with(LazyBarrierStrategy.instance()).engine(StandardTraversalEngine.build())) // adding a non-default strategy
> ==>graphtraversalsource[tinkergraph[vertices:0 edges:0], standard]
> gremlin>
> ////// some script to load the MovieLens data set into TinkerGraph //////
> gremlin>
> gremlin> g.V('u2590').outE('rated').has('stars',gte(4)).inV().
> gremlin>              aggregate('userLibrary').
> gremlin>           inE('rated').has('stars',gte(4)).outV().
> gremlin>           outE('rated').has('stars',gte(4)).inV().
> gremlin>              where(not(within('userLibrary'))).groupCount().by('name').iterate().toString()
> ==>[TinkerGraphStep([u2590],vertex), VertexStep(OUT,[rated],edge), 
> HasStep([stars.gte(4)]), EdgeVertexStep(IN), NoOpBarrierStep, 
> AggregateStep(userLibrary), VertexStep(IN,[rated],edge), 
> HasStep([stars.gte(4)]), EdgeVertexStep(OUT), NoOpBarrierStep, 
> VertexStep(OUT,[rated],edge), HasStep([stars.gte(4)]), EdgeVertexStep(IN), 
> NoOpBarrierStep, WhereStep(local,without([userLibrary])), 
> GroupCountStep(value(name))] // barriers inserted automagically
> gremlin> clock(20){g.V('u2590').outE('rated').has('stars',gte(4)).inV().aggregate('userLibrary').inE('rated').has('stars',gte(4)).outV().outE('rated').has('stars',gte(4)).inV().where(not(within('userLibrary'))).groupCount().by('name').next()}
> ==>520.97150945
> 
> The logic behind LazyBarrierStrategy is still up in the air… however, 
> hopefully this first push gets the discussion going.
> 
> Take care,
> Marko.
> 
> http://markorodriguez.com
> 
> On Jun 4, 2015, at 11:24 AM, Marko Rodriguez <[email protected]> wrote:
> 
>> Hello,
>> 
>> TinkerPop3 introduces the concept of a Traverser, which represents not only 
>> the current element it is "at," but also metadata like where it 
>> has been before (path), a local, side-effect data structure (sack), and how 
>> many traversers it represents (bulk). Bulk is a key concept that we required 
>> when we needed to do OLAP graph algorithms. With traversers emanating from 
>> every vertex in the graph, the number of traversers grows fast. However, if 
>> your traversal doesn't require the traverser's history (path) or its local 
>> data structure (sack), then where it is currently "at" is its only 
>> uniquely identifying characteristic, and thus 100 traversers at v[1] are the 
>> same as 1 traverser at v[1] with "long bulk = 100," where counting requires 
>> much less space (a 64-bit long) than enumeration.
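The bulk idea can be illustrated with a short sketch (Python, not TinkerPop code; the graph and names here are made up): store a frontier as a map from element to bulk count, so that 100 traversers at one vertex are a single entry, and an out() hop just merges counts.

```python
from collections import Counter

# Tiny made-up directed graph: element -> out-neighbors.
graph = {"v1": ["v2", "v3"], "v2": ["v3"], "v3": ["v1"]}

def hop(frontier):
    """One out() step over a bulked frontier.

    `frontier` maps element -> bulk (how many traversers sit there).
    100 traversers at v1 are one entry {'v1': 100}, not 100 objects.
    """
    nxt = Counter()
    for element, bulk in frontier.items():
        for neighbor in graph[element]:
            nxt[neighbor] += bulk  # merge bulks instead of enumerating
    return nxt

frontier = Counter({"v1": 100})  # one record standing in for 100 traversers
frontier = hop(frontier)         # now {'v2': 100, 'v3': 100} -- still 2 records
```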
>> 
>> While this concept was great for OLAP, it is also great for OLTP.
>> 
>> Check this out:
>> 
>> gremlin> graph = TinkerGraph.open()
>> ==>tinkergraph[vertices:0 edges:0]
>> gremlin> graph.io(gryo()).readGraph('data/grateful-dead.kryo')
>> ==>null
>> gremlin> g = graph.traversal()
>> ==>graphtraversalsource[tinkergraph[vertices:808 edges:8049], standard]
>> gremlin> clockWithResult(20){g.V().out().out().out().out().out().out().out().out().out().out().count().next()}
>>      // CTRL-C as this will take the life of the universe to compute
>> gremlin> clockWithResult(20){g.V().repeat(out()).times(10).count().next()}
>> ==>2880.6283869
>> ==>4947356572210703772
>> gremlin> clockWithResult(20){g.V().out().barrier().out().barrier().out().barrier().out().barrier().out().barrier().out().barrier().out().barrier().out().barrier().out().barrier().out().barrier().count().next()}
>> ==>27.921249
>> ==>4947356572210703772
>> 
>> There are three traversals above. All of them yield the same result -- that 
>> is, there are 4,947,356,572,210,703,772 length-10 paths in the Grateful Dead 
>> graph. Or, there are ~4.9 sextillion length-10 paths.
>> 
>> What makes the traversals above different is their respective use of bulking 
>> -- and ultimately their runtime in milliseconds.
>> 
>>      1. This is a sequential traversal where each of the 4.9 sextillion 
>> objects ultimately touched must be enumerated.
>>      2. This is a repeating traversal where the emanations from any one 
>> vertex are iterated before the next vertex is processed. However, for that 
>> one vertex's branch, results are bulked.
>>      3. This is a barrier traversal where all the traversers are bulked at 
>> each outgoing step (i.e. a BSP computation, though in sequential form).
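The difference between styles 1 and 3 can be sketched on a toy graph (Python, not Gremlin; the triangle graph is made up): style 3 propagates bulk counts hop by hop, BSP-style, while style 1 enumerates every individual path, yet both return the same count.

```python
from collections import Counter

# Toy triangle graph: every vertex points to the other two.
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1]}

def count_paths_bulked(graph, k):
    """Style 3: merge traverser bulks at each of k out() hops (BSP-style)."""
    frontier = Counter({v: 1 for v in graph})  # g.V(): one traverser per vertex
    for _ in range(k):                         # k barrier-separated out() hops
        nxt = Counter()
        for v, bulk in frontier.items():
            for w in graph[v]:
                nxt[w] += bulk
        frontier = nxt
    return sum(frontier.values())

def count_paths_enumerated(graph, k):
    """Style 1: naively enumerate every individual path (exponential in k)."""
    def walk(v, remaining):
        if remaining == 0:
            return 1
        return sum(walk(w, remaining - 1) for w in graph[v])
    return sum(walk(v, k) for v in graph)
```

On this triangle both functions agree (3 starts, out-degree 2, so 3 * 2^k length-k paths), but the bulked version touches only a handful of counter entries per hop.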
>> 
>> All of these traversals are single threaded and over the same data. However, 
>> look at the run times --- sometimes computing power is not the key, but the 
>> way in which data is represented.
>> 
>> To brag -- I can determine that there are 4,947,356,572,210,703,772 
>> length-10 paths in the Grateful Dead graph in 27 milliseconds. 
>> 
>> A PRACTICAL APPLICATION:
>> 
>> Bulking is crucial when you have recurrence in your traversal --- in other 
>> words, when your traversal touches the same elements over and over again. 
>> One place this happens a lot is in recommendation algorithms. Let's take the 
>> MovieLens dataset. Let's recommend movies to user "u2590."
>> 
>> g.V('u2590').outE('rated').has('stars',gte(4)).inV().
>>              aggregate('userLibrary').
>>           inE('rated').has('stars',gte(4)).outV().
>>           outE('rated').has('stars',gte(4)).inV().
>>              where(not(within('userLibrary'))).groupCount().by('name')
>> 
>> Here is the runtime of that traversal:
>> 
>> gremlin> 
>> clock(5){g.V('u2590').outE('rated').has('stars',gte(4)).inV().aggregate('userLibrary').inE('rated').has('stars',gte(4)).outV().outE('rated').has('stars',gte(4)).inV().where(not(within('userLibrary'))).groupCount().by('name').next()}
>> ==>13624.124757
>> 
>> However, I bet another user probably rated the same movies as user u2590. 
>> Thus, let's add a barrier() on those people. Finally, I bet those users also 
>> rated many of the same movies. Thus, let's barrier those movies.
>> 
>> g.V('u2590').outE('rated').has('stars',gte(4)).inV().
>>              aggregate('userLibrary').
>>           inE('rated').has('stars',gte(4)).outV().barrier().
>>           outE('rated').has('stars',gte(4)).inV().barrier().
>>              where(not(within('userLibrary'))).groupCount().by('name')
>> 
>> Here is the runtime of this new traversal which yields the exact same result:
>> 
>> gremlin> 
>> clock(5){g.V('u2590').outE('rated').has('stars',gte(4)).inV().aggregate('userLibrary').inE('rated').has('stars',gte(4)).outV().barrier().outE('rated').has('stars',gte(4)).inV().barrier().where(not(within('userLibrary'))).groupCount().by('name').next()}
>> ==>531.6441133999999
>> 
>> From 13.5 seconds down to 0.5 seconds. Now that is the beauty of bulking.
>> 
>> Enjoy!,
>> Marko.
>> 
>> http://markorodriguez.com
>> 
>> References:
>> 
>>      [1] 
>> http://tinkerpop.incubator.apache.org/docs/3.0.0.M8-incubating/#a-note-on-barrier-steps
>> 
>> 
> 
