[ https://issues.apache.org/jira/browse/TINKERPOP-3074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yang Xia updated TINKERPOP-3074:
--------------------------------
    Affects Version/s: 3.7.2
                       3.6.7

> The sample() step is largely unusable with large graphs
> -------------------------------------------------------
>
>                 Key: TINKERPOP-3074
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-3074
>             Project: TinkerPop
>          Issue Type: Improvement
>          Components: process
>    Affects Versions: 3.6.7, 3.7.2
>            Reporter: Kelvin Lawrence
>            Priority: Major
>
> While the `sample` step can be useful with small amounts of data for
> random walks and similar, its current implementation makes it unusable with
> large graphs if you are looking to sample, say, one node from a graph with
> millions or billions of nodes in it.
> {code:java}
> // This generally works, assuming the out() step yields a limited number of nodes
> g.V(1).out().sample(1).out().sample(1) //etc
> // This fails for a large graph, usually with an OOM error
> g.V().sample(1){code}
> The current implementation of sample() is quite naive and assumes it can
> fetch everything into memory before computing a result. I have seen many
> users wanting to start a walk from a random place, and they always try
> queries such as `g.V().sample(1)` or `g.E().sample(1)`.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
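
For illustration only: one common way to sample from a stream without holding all of it in memory is reservoir sampling, which keeps at most k items at any time. The sketch below is plain Java, not TinkerPop code and not a proposed patch; the class name `ReservoirSample` and the method `sample` are hypothetical, and it covers uniform sampling only.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not part of TinkerPop: uniform reservoir sampling
// over an iterator, using O(k) memory regardless of the input size.
public final class ReservoirSample {
    private ReservoirSample() {}

    public static <T> List<T> sample(Iterator<T> items, int k, Random rng) {
        if (k < 0) {
            throw new IllegalArgumentException("k must be non-negative");
        }
        List<T> reservoir = new ArrayList<>(k);
        long seen = 0;
        while (items.hasNext()) {
            T item = items.next();
            seen++;
            if (reservoir.size() < k) {
                // Fill the reservoir with the first k items.
                reservoir.add(item);
            } else {
                // Pick a slot index in [0, seen); the new item replaces an
                // existing one only if the index falls inside the reservoir,
                // which happens with probability of about k / seen.
                long j = (long) (rng.nextDouble() * seen);
                if (j < k) {
                    reservoir.set((int) j, item);
                }
            }
        }
        return reservoir;
    }
}
```

Something like `ReservoirSample.sample(vertexIterator, 1, new Random())`, where `vertexIterator` stands for any iterator over the elements, would consume the input once and retain a single element. It still has to visit every element, so it addresses the memory problem rather than the cost of the full scan.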