[ https://issues.apache.org/jira/browse/TINKERPOP3-866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Marko A. Rodriguez closed TINKERPOP3-866.
-----------------------------------------
    Resolution: Fixed

This has been implemented. There is no perfect backwards compatibility, so we
still use the {{groupV3d0()}} model. If someone comes up with a solution,
please articulate it.

> GroupStep and Traversal-Based Reductions
> ----------------------------------------
>
>                 Key: TINKERPOP3-866
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP3-866
>             Project: TinkerPop 3
>          Issue Type: Improvement
>          Components: process
>    Affects Versions: 3.0.1-incubating
>            Reporter: Marko A. Rodriguez
>            Assignee: Marko A. Rodriguez
>              Labels: breaking
>             Fix For: 3.1.0-incubating
>
>
> Right now {{GroupStep}} is defined as:
> {code}
> public final class GroupStep<S, K, V, R> extends ReducingBarrierStep<S, Map<K, R>> implements MapReducer, TraversalParent {
>     private Traversal.Admin<S, K> keyTraversal = null;
>     private Traversal.Admin<S, V> valueTraversal = null;
>     private Traversal.Admin<Collection<V>, R> reduceTraversal = null;
>     ...
> {code}
> Look at {{reduceTraversal}}. It takes a {{Collection<V>}} of "values" and
> reduces them to a "reduction" {{R}}. Why are we using {{Collection<V>}}? Why
> is this not:
> {code}
> private Traversal.Admin<V, R> reduceTraversal = null;
> {code}
> Now, when a new {{K}} is created (and reduce is defined), we clone
> {{reduceTraversal}}. Thus, each key has its own {{reduceTraversal}} (identical
> clones) that operates in a stream-like fashion on {{V}} to yield {{R}}. This
> enables us to remove the {{Collection<V>}} (memory hog) and allows us to
> define {{GroupCountStep}} in terms of {{GroupStep}} without (or with only
> limited) computational cost.
> HOWEVER, this changes the API, as people who did this:
> {code}
> g.V().group().by(label()).by(outE().count()).by(sum(local))
> {code}
> would now have to do this:
> {code}
> g.V().group().by(label()).by(outE().count()).by(sum())
> {code}
> It's very minor, given the speed-up we would gain and the ability for us to
> now do "groupCount" efficiently on arbitrary values -- not just bulks (e.g.
> sacks).
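
The difference between the two {{reduceTraversal}} signatures discussed in the issue can be illustrated with a small plain-Java sketch. It does not use TinkerPop at all: {{StreamingReducer}}, {{LongSum}}, {{groupBuffered}}, {{groupStreaming}} and the sample data are hypothetical names invented for this illustration, and the per-key {{Supplier}} only stands in for "clone {{reduceTraversal}} when a new {{K}} is created". It is not how {{GroupStep}} is actually implemented.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Supplier;

class GroupReductionSketch {

    // Hypothetical stand-in for one per-key clone of a V -> R reduceTraversal.
    interface StreamingReducer<V, R> {
        void accept(V value);
        R result();
    }

    // Streaming sum: keeps a single running total instead of a Collection<V>.
    static final class LongSum implements StreamingReducer<Long, Long> {
        private long total = 0L;
        public void accept(Long value) { total += value; }
        public Long result() { return total; }
    }

    // Collection<V> -> R model: buffer all values per key, reduce at the end.
    static <S, K, V, R> Map<K, R> groupBuffered(List<S> input, Function<S, K> key,
            Function<S, V> value, Function<Collection<V>, R> reduce) {
        Map<K, List<V>> buffers = new LinkedHashMap<>();
        for (S s : input) {
            buffers.computeIfAbsent(key.apply(s), k -> new ArrayList<>()).add(value.apply(s));
        }
        Map<K, R> out = new LinkedHashMap<>();
        buffers.forEach((k, vs) -> out.put(k, reduce.apply(vs)));
        return out;
    }

    // V -> R model: a fresh reducer per new key, values folded in as they arrive.
    static <S, K, V, R> Map<K, R> groupStreaming(List<S> input, Function<S, K> key,
            Function<S, V> value, Supplier<StreamingReducer<V, R>> newReducer) {
        Map<K, StreamingReducer<V, R>> reducers = new LinkedHashMap<>();
        for (S s : input) {
            reducers.computeIfAbsent(key.apply(s), k -> newReducer.get()).accept(value.apply(s));
        }
        Map<K, R> out = new LinkedHashMap<>();
        reducers.forEach((k, r) -> out.put(k, r.result()));
        return out;
    }

    // Made-up (label, out-edge count) pairs; not taken from any real graph.
    static List<Map.Entry<String, Long>> sample() {
        List<Map.Entry<String, Long>> vertices = new ArrayList<>();
        vertices.add(new SimpleEntry<>("person", 3L));
        vertices.add(new SimpleEntry<>("person", 2L));
        vertices.add(new SimpleEntry<>("software", 0L));
        vertices.add(new SimpleEntry<>("person", 1L));
        vertices.add(new SimpleEntry<>("software", 0L));
        return vertices;
    }

    static final Function<Map.Entry<String, Long>, String> KEY = Map.Entry::getKey;
    static final Function<Map.Entry<String, Long>, Long> VALUE = Map.Entry::getValue;
    static final Function<Collection<Long>, Long> SUM_ALL =
            values -> values.stream().mapToLong(Long::longValue).sum();

    static Map<String, Long> bufferedDemo() {
        return groupBuffered(sample(), KEY, VALUE, SUM_ALL);
    }

    static Map<String, Long> streamingDemo() {
        return groupStreaming(sample(), KEY, VALUE, LongSum::new);
    }

    public static void main(String[] args) {
        System.out.println(bufferedDemo());  // {person=6, software=0}
        System.out.println(streamingDemo()); // {person=6, software=0}
    }
}
```

Both variants produce the same map. The buffered one holds every value for a key until the final reduce, while the streaming one keeps only one small reducer state per key, which is the memory saving the issue argues for.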