Hi,
I was interested to see how consistent LazyBarrierStrategy is in the MovieLens
recommendation scenario. To that end, I took "the first" 25 users in the
MovieLens dataset and calculated their recommendations both with and without
LazyBarrierStrategy.
noLazy = graph.traversal()
lazy = graph.traversal(GraphTraversalSource.build().with(LazyBarrierStrategy.instance()).engine(StandardTraversalEngine.build()))
times = []
noLazy.V().has(label,'user').limit(25).forEachRemaining{ user ->
  temp = []
  g = noLazy
  temp.add(clock(1){g.V(user).outE('rated').has('stars',gte(4)).inV().aggregate('userLibrary').inE('rated').has('stars',gte(4)).outV().outE('rated').has('stars',gte(4)).inV().where(not(within('userLibrary'))).groupCount().by('name').iterate()})
  g = lazy
  temp.add(clock(1){g.V(user).outE('rated').has('stars',gte(4)).inV().aggregate('userLibrary').inE('rated').has('stars',gte(4)).outV().outE('rated').has('stars',gte(4)).inV().where(not(within('userLibrary'))).groupCount().by('name').iterate()})
  times.add(temp)
}
Here are the runtimes, where the first column is without LazyBarrierStrategy
and the second is with it. In every instance, using LazyBarrierStrategy is
significantly faster than not using it.
gremlin> times
==>[13999.353023, 450.689434]
==>[10542.223952, 425.302257]
==>[11071.591249, 419.38044099999996]
==>[17027.139231999998, 467.55589499999996]
==>[2805.7487699999997, 352.16767899999996]
==>[31102.009551, 537.647733]
==>[4584.469966, 384.308581]
==>[16788.467617, 459.42238899999995]
==>[10753.991812999999, 427.419588]
==>[47468.688998, 652.944036]
==>[14083.812865, 506.90151399999996]
==>[15580.792864, 458.823214]
==>[9645.832747999999, 449.04409]
==>[11381.738593, 464.385542]
==>[13237.242909999999, 490.935812]
==>[2142.835126, 367.29828399999997]
==>[45372.748350999995, 568.501034]
==>[7209.624532999999, 405.968438]
==>[22556.824235, 487.382746]
==>[15064.135717, 478.760474]
==>[7029.503852999999, 410.060832]
==>[9033.066171999999, 462.697269]
==>[10691.243585, 455.077389]
==>[3089.741913, 369.11348]
==>[3723.592992, 385.54621099999997]
Then some math to calculate the total and average time for each column.
gremlin> times.collect{it[0]}.sum() // TOTAL TIME
==>355986.420628
gremlin> times.collect{it[1]}.sum()
==>11337.334362000001
gremlin> times.collect{it[0]}.sum() / times.size() // AVERAGE TIME
==>14239.45682512
gremlin> times.collect{it[1]}.sum() / times.size()
==>453.49337448000006
How much faster (on average) is using LazyBarrierStrategy in the recommendation
traversals on the MovieLens dataset? --- ~31x faster.
gremlin> times.collect{it[0]}.sum() / times.collect{it[1]}.sum()
==>31.399481506091966
So, there are 6040 MovieLens users. To calculate all the recommendations
without LazyBarrierStrategy, it would take ~24 hours.
gremlin> (g.V().has(label,'user').count().next() * 14239.45682512) / 1000 / 60 / 60
==>23.8906442288
With LazyBarrierStrategy, it would take ~45 minutes.
gremlin> (g.V().has(label,'user').count().next() * 453.49337448000006) / 1000 / 60 / 60
==>0.76086110607200010
…assuming strictly sequential execution in both examples (no parallelism).
Pretty cool. It would be scary if the same traversal gave radically different
runtimes for different users -- then the logic in LazyBarrierStrategy would
have to be mighty complicated.
Take care,
Marko.
http://markorodriguez.com
On Jun 5, 2015, at 2:11 PM, Marko Rodriguez <[email protected]> wrote:
> Hi,
>
> The first ping that came through on IM was Bob Briody saying: "So you have to
> know when its appropriate to add a barrier()?"
>
> As of yesterday, yes. As of today, no.
>
> I did two things:
>
> 1. CollectingBarrierStep (which NoOpBarrierStep extends) now
> supports a maxBarrierSize so as to protect from OutOfMemoryError (i.e. a
> "lazy barrier").
>
> https://github.com/apache/incubator-tinkerpop/blob/master/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/step/util/CollectingBarrierStep.java
> 2. LazyBarrierStrategy looks for traversals with a particular pattern
> (e.g. not OLAP, deep traversal, lots of starts, little head filtering, etc.)
> and inserts NoOpBarrierSteps accordingly.
>
> https://github.com/apache/incubator-tinkerpop/blob/master/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/strategy/finalization/LazyBarrierStrategy.java
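The maxBarrierSize idea can be sketched generically; here is a toy Python
illustration (hypothetical names, not the actual CollectingBarrierStep code):
merge duplicate traversers into bulked entries, but flush whenever the number
of distinct entries hits the cap, so memory stays bounded.

```python
# Toy sketch of a "lazy barrier" (hypothetical code, not CollectingBarrierStep):
# absorb incoming (element, bulk) traversers and merge duplicates, but flush
# whenever max_barrier_size distinct entries accumulate, bounding memory use.
from collections import Counter

def lazy_barrier(traversers, max_barrier_size):
    barrier = Counter()
    for element, bulk in traversers:
        barrier[element] += bulk              # merge duplicates into one bulked entry
        if len(barrier) >= max_barrier_size:  # cap reached: release the batch
            yield from barrier.items()
            barrier = Counter()
    yield from barrier.items()                # drain whatever is left

incoming = [("v1", 1)] * 5 + [("v2", 1)] * 3 + [("v3", 1)]
print(list(lazy_barrier(incoming, max_barrier_size=2)))
# [('v1', 5), ('v2', 1), ('v2', 2), ('v3', 1)]
```

A full barrier would hold everything before releasing; the lazy version trades
a little merging opportunity for a hard bound on held traversers.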
>
> LazyBarrierStrategy is not a default strategy, so you will have to turn it on
> to use it. However, now look at the same MovieLens recommendation traversal
> from yesterday when LazyBarrierStrategy is turned on -- no more need for
> manual barrier() additions!
>
> gremlin> graph = TinkerGraph.open()
> ==>tinkergraph[vertices:0 edges:0]
> gremlin> g = graph.traversal()
> ==>graphtraversalsource[tinkergraph[vertices:0 edges:0], standard]
> gremlin> g = graph.traversal(GraphTraversalSource.build().with(LazyBarrierStrategy.instance()).engine(StandardTraversalEngine.build())) // adding a non-default strategy
> ==>graphtraversalsource[tinkergraph[vertices:0 edges:0], standard]
> gremlin>
> ////// some script to load the MovieLens data set into TinkerGraph //////
> gremlin>
> gremlin> g.V('u2590').outE('rated').has('stars',gte(4)).inV().
> gremlin> aggregate('userLibrary').
> gremlin> inE('rated').has('stars',gte(4)).outV().
> gremlin> outE('rated').has('stars',gte(4)).inV().
> gremlin> where(not(within('userLibrary'))).groupCount().by('name').iterate().toString()
> ==>[TinkerGraphStep([u2590],vertex), VertexStep(OUT,[rated],edge),
> HasStep([stars.gte(4)]), EdgeVertexStep(IN), NoOpBarrierStep,
> AggregateStep(userLibrary), VertexStep(IN,[rated],edge),
> HasStep([stars.gte(4)]), EdgeVertexStep(OUT), NoOpBarrierStep,
> VertexStep(OUT,[rated],edge), HasStep([stars.gte(4)]), EdgeVertexStep(IN),
> NoOpBarrierStep, WhereStep(local,without([userLibrary])),
> GroupCountStep(value(name))] // barriers inserted automagically
> gremlin> clock(20){g.V('u2590').outE('rated').has('stars',gte(4)).inV().aggregate('userLibrary').inE('rated').has('stars',gte(4)).outV().outE('rated').has('stars',gte(4)).inV().where(not(within('userLibrary'))).groupCount().by('name').next()}
> ==>520.97150945
>
> The logic behind LazyBarrierStrategy is still up in the air… however,
> hopefully this first push gets the discussion going.
>
> Take care,
> Marko.
>
> http://markorodriguez.com
>
> On Jun 4, 2015, at 11:24 AM, Marko Rodriguez <[email protected]> wrote:
>
>> Hello,
>>
>> TinkerPop3 introduces the concept of a Traverser, which not only represents
>> the current element it is "at," but also carries metadata like where it has
>> been before (path), a local side-effect data structure (sack), and how many
>> traversers it represents (bulk). Bulk is a key concept that we required when
>> we needed to do OLAP graph algorithms. With traversers emanating from every
>> vertex in the graph, the number of traversers grows fast. However, if your
>> traversal doesn't require the traverser's history (path) or its local data
>> structure (sack), then where it is currently "at" is its only uniquely
>> identifying characteristic and thus, 100 traversers at v[1] are the same as
>> 1 traverser at v[1] with "long bulk = 100," where counting requires much
>> less space (a 64-bit long) than enumeration.
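The "100 traversers at v[1]" point is easy to see outside of Gremlin; here is a
toy Python sketch of the counting idea (not the TinkerPop API):

```python
# Toy illustration of bulking (not the TinkerPop API, just the counting idea).
from collections import Counter

# 100 individual traversers, all "at" vertex v[1], with no path or sack state:
traversers = ["v1"] * 100

# Location is then their only identity, so they collapse into one
# (element, bulk) pair -- a count instead of an enumeration:
bulked = Counter(traversers)
print(bulked["v1"], len(bulked))  # 100 1
```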
>>
>> While this concept was great for OLAP, it is also great for OLTP.
>>
>> Check this out:
>>
>> gremlin> graph = TinkerGraph.open()
>> ==>tinkergraph[vertices:0 edges:0]
>> gremlin> graph.io(gryo()).readGraph('data/grateful-dead.kryo')
>> ==>null
>> gremlin> g = graph.traversal()
>> ==>graphtraversalsource[tinkergraph[vertices:808 edges:8049], standard]
>> gremlin> clockWithResult(20){g.V().out().out().out().out().out().out().out().out().out().out().count().next()}
>> // CTRL-C as this will take the life of the universe to compute
>> gremlin> clockWithResult(20){g.V().repeat(out()).times(10).count().next()}
>> ==>2880.6283869
>> ==>4947356572210703772
>> gremlin> clockWithResult(20){g.V().out().barrier().out().barrier().out().barrier().out().barrier().out().barrier().out().barrier().out().barrier().out().barrier().out().barrier().out().barrier().count().next()}
>> ==>27.921249
>> ==>4947356572210703772
>>
>> There are three traversals above. All of them yield the same result -- that
>> is, there are 4,947,356,572,210,703,772 length-10 paths in the Grateful Dead
>> graph. Or, 4.9 sextillion length-10 paths.
>>
>> What makes the traversals above different is their respective use of bulking
>> -- and ultimately their runtime in milliseconds.
>>
>> 1. This is a sequential traversal where each of the 4.9 sextillion
>> objects ultimately touched must be enumerated.
>> 2. This is a repeating traversal where the emanations from any one
>> vertex are iterated before the next vertex is processed. However, for that
>> one vertex's branch, results are bulked.
>> 3. This is a barrier traversal where all the traversers are bulked at
>> each outgoing step (i.e. a BSP computation, though in sequential form).
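The gap between traversal 1 and traversal 3 can be mimicked in plain Python on
a toy graph (hypothetical three-vertex data, not the Grateful Dead graph):
enumerate every traverser versus propagate (vertex, bulk) pairs at each hop.

```python
# Toy graph (hypothetical data, not the Grateful Dead graph): count length-k
# paths by enumerating every traverser vs. propagating (vertex, bulk) pairs.
from collections import Counter

graph = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"]}

def count_paths_enumerated(g, hops):
    frontier = list(g)                      # one traverser per start vertex
    for _ in range(hops):
        # every traverser is copied to every neighbor -- exponential blowup
        frontier = [w for v in frontier for w in g[v]]
    return len(frontier)

def count_paths_bulked(g, hops):
    frontier = Counter(g.keys())            # (vertex, bulk) pairs, bulk = 1
    for _ in range(hops):
        nxt = Counter()
        for v, bulk in frontier.items():    # one expansion per distinct vertex
            for w in g[v]:
                nxt[w] += bulk              # carry the count, don't copy traversers
        frontier = nxt
    return sum(frontier.values())

print(count_paths_enumerated(graph, 3), count_paths_bulked(graph, 3))  # 24 24
```

The enumerated version does work proportional to the number of paths, while the
bulked version's work per hop is proportional to the number of distinct
vertices -- which is why the barrier'd traversal runs in milliseconds.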
>>
>> All of these traversals are single threaded and over the same data. However,
>> look at the run times --- sometimes computing power is not the key, but the
>> way in which data is represented.
>>
>> To brag -- I can determine that there are 4,947,356,572,210,703,772
>> length-10 paths in the Grateful Dead graph in 27 milliseconds.
>>
>> A PRACTICAL APPLICATION:
>>
>> Bulking is crucial when you have recurrence in your traversal --- in other
>> words, when your traversal touches the same elements over and over again.
>> One place this happens a lot is in recommendation algorithms. Let's take the
>> MovieLens dataset. Let's recommend movies to user "u2590."
>>
>> g.V('u2590').outE('rated').has('stars',gte(4)).inV().
>> aggregate('userLibrary').
>> inE('rated').has('stars',gte(4)).outV().
>> outE('rated').has('stars',gte(4)).inV().
>> where(not(within('userLibrary'))).groupCount().by('name')
>>
>> Here is the runtime of that traversal:
>>
>> gremlin> clock(5){g.V('u2590').outE('rated').has('stars',gte(4)).inV().aggregate('userLibrary').inE('rated').has('stars',gte(4)).outV().outE('rated').has('stars',gte(4)).inV().where(not(within('userLibrary'))).groupCount().by('name').next()}
>> ==>13624.124757
>>
>> However, I bet another user probably rated the same movies as user u2590.
>> Thus, let's add a barrier() on those people. Finally, I bet those users also
>> rated many of the same movies. Thus, let's barrier those movies as well.
>>
>> g.V('u2590').outE('rated').has('stars',gte(4)).inV().
>> aggregate('userLibrary').
>> inE('rated').has('stars',gte(4)).outV().barrier().
>> outE('rated').has('stars',gte(4)).inV().barrier().
>> where(not(within('userLibrary'))).groupCount().by('name')
>>
>> Here is the runtime of this new traversal which yields the exact same result:
>>
>> gremlin> clock(5){g.V('u2590').outE('rated').has('stars',gte(4)).inV().aggregate('userLibrary').inE('rated').has('stars',gte(4)).outV().barrier().outE('rated').has('stars',gte(4)).inV().barrier().where(not(within('userLibrary'))).groupCount().by('name').next()}
>> ==>531.6441133999999
>>
>> From 13.6 seconds down to 0.5 seconds. Now that is the beauty of bulking.
>>
>> Enjoy!,
>> Marko.
>>
>> http://markorodriguez.com
>>
>> References:
>>
>> [1]
>> http://tinkerpop.incubator.apache.org/docs/3.0.0.M8-incubating/#a-note-on-barrier-steps
>>
>>
>