zjxian created TINKERPOP-2376:
---------------------------------

             Summary: Probability distribution controlled by weight when using sample step
                 Key: TINKERPOP-2376
                 URL: https://issues.apache.org/jira/browse/TINKERPOP-2376
             Project: TinkerPop
          Issue Type: New Feature
          Components: process
    Affects Versions: 3.4.6
         Environment: Gremlin-Tinkerpop 3.4.6 on Fedora 32
            Reporter: zjxian
         Attachments: out.csv

Create a simple graph with 1 central node and 3 surrounding nodes.

Add 3 edges with equal weight (1) to form a star graph.

Traverse from the center ( v[0] ) to the other 3 nodes with sample(1), and record the destination node.

Repeat that 10000 times.

Expected probability distribution:

v[1]:v[2]:v[3] = 3333:3333:3333 (1:1:1)

What I got:

v[1]:v[2]:v[3] = 3320:4439:2241

I've checked some of the source files, such as SampleGlobalStep.java
([https://github.com/apache/tinkerpop/blob/master/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/step/filter/SampleGlobalStep.java]).
  According to that code, the probability distribution would be about
1/3:4/9:2/9, which is very close to the results I got.
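
One reading that reproduces these ratios, stated as an assumption rather than a verified description of the linked file: the candidates are scanned in order, and candidate i is accepted as soon as a fresh uniform random number is at most the cumulative weight fraction up to i. With three equal weights the acceptance thresholds are 1/3, 2/3 and 1, and the symbols e_1, e_2, e_3 (hypothetical names for the edges leading to v[1], v[2], v[3]) get:

```latex
\begin{align*}
P(e_1) &= \tfrac{1}{3} \\
P(e_2) &= \left(1 - \tfrac{1}{3}\right) \cdot \tfrac{2}{3} = \tfrac{4}{9} \\
P(e_3) &= \left(1 - \tfrac{1}{3}\right)\left(1 - \tfrac{2}{3}\right) \cdot 1 = \tfrac{2}{9}
\end{align*}
```

Over 10000 runs that would predict roughly 3333:4444:2222, against the observed 3320:4439:2241.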

I think some improvements are needed here to make "random walk" in TinkerPop 
really useful.
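
One possible direction, sketched in plain Java outside TinkerPop: draw a single random number per sample and pick the interval of the cumulative weights it falls into, so that each candidate is chosen with probability weight/total. The class and method names below (WeightedSampleSketch, sequentialPick, proportionalPick) are hypothetical. sequentialPick is only a simplified model that reproduces the skew reported above, not a copy of SampleGlobalStep, and both methods assume strictly positive weights.

```java
import java.util.Arrays;
import java.util.Random;

final class WeightedSampleSketch {

    // Simplified model (an assumption, not TinkerPop code): scan the items in
    // order and accept item i when a FRESH random number is at most the
    // cumulative weight fraction up to i. Later items are favored or
    // penalized depending on their position.
    static int sequentialPick(double[] weights, Random random) {
        double total = 0.0;
        for (double w : weights) total += w;
        while (true) { // rescan if round-off keeps the last fraction below 1
            double running = 0.0;
            for (int i = 0; i < weights.length; i++) {
                running += weights[i];
                if (random.nextDouble() <= running / total) return i;
            }
        }
    }

    // Single-draw alternative: one random number selects the cumulative
    // weight interval it falls into, giving P(i) = weights[i] / total.
    static int proportionalPick(double[] weights, Random random) {
        double total = 0.0;
        for (double w : weights) total += w;
        double target = random.nextDouble() * total;
        double running = 0.0;
        for (int i = 0; i < weights.length; i++) {
            running += weights[i];
            if (target < running) return i;
        }
        return weights.length - 1; // guard against floating-point round-off
    }

    public static void main(String[] args) {
        double[] weights = {1.0, 1.0, 1.0}; // same setup as the star graph
        Random random = new Random();
        int[] sequential = new int[weights.length];
        int[] proportional = new int[weights.length];
        for (int n = 0; n < 10000; n++) {
            sequential[sequentialPick(weights, random)]++;
            proportional[proportionalPick(weights, random)]++;
        }
        System.out.println("sequential:   " + Arrays.toString(sequential));
        System.out.println("proportional: " + Arrays.toString(proportional));
    }
}
```

Running main prints the two count arrays side by side. With equal weights the sequential counts should drift toward the 1/3:4/9:2/9 pattern, while the proportional counts should stay close to 1:1:1.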

The script I used:
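{code:java}
// Build a TinkerGraph that uses LONG ids for vertices, edges and vertex properties
conf = new BaseConfiguration()
conf.setProperty("gremlin.tinkergraph.vertexIdManager", "LONG")
conf.setProperty("gremlin.tinkergraph.edgeIdManager", "LONG")
conf.setProperty("gremlin.tinkergraph.vertexPropertyIdManager", "LONG")
graph = TinkerGraph.open(conf)
g = graph.traversal()
// Add 4 vertices: v[0] is the center, v[1]..v[3] are the leaves
for (i = 0; i <= 3; i++) {
  g.addV().iterate()
}
// Connect the center to every leaf with weight 1
for (i = 1; i <= 3; i++) {
  g.V(0).addE("connect").property("weight", 1).to(g.V(i)).iterate()
}
// Start from a fresh output file with a header row
["bash", "-c", "rm -f out.csv"].execute().waitFor()
file = new File("out.csv")
file.append("id\r\n")
// Sample one weighted out-edge 10000 times and record the destination id
for (i = 0; i < 10000; i++) {
  g.V(0).outE().sample(1).by("weight").otherV().map{file.append(it.get().id() + "\r\n")}.iterate()
}
{code}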
See the results in the attached out.csv.

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
