[
https://issues.apache.org/jira/browse/TINKERPOP-3074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yang Xia updated TINKERPOP-3074:
--------------------------------
Affects Version/s: 3.7.2, 3.6.7
> The sample() step is largely unusable with large graphs
> -------------------------------------------------------
>
> Key: TINKERPOP-3074
> URL: https://issues.apache.org/jira/browse/TINKERPOP-3074
> Project: TinkerPop
> Issue Type: Improvement
> Components: process
> Affects Versions: 3.6.7, 3.7.2
> Reporter: Kelvin Lawrence
> Priority: Major
>
> While the `sample` step can be useful with small amounts of data for
> random walks and similar, its current implementation makes it unusable with
> large graphs if you are looking to sample, say, one node from a graph with
> millions or billions of nodes in it.
> {code:java}
> // This generally works, assuming the out() step yields a limited number
> // of nodes
> g.V(1).out().sample(1).out().sample(1) //etc
> // This fails for a large graph, usually with an OOM error
> g.V().sample(1)
> {code}
> The current implementation of sample() is quite naive and assumes it can
> fetch everything into memory before computing a result. I have seen many
> users wanting to start a walk from a random place, and they always try
> queries such as `g.V().sample(1)` or `g.E().sample(1)`.
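
For illustration only: one well-known way to pick a single element without
holding the whole input in memory is reservoir sampling with a reservoir of
size one. The sketch below is not TinkerPop's implementation and not a
proposed patch. The class name `ReservoirSample` and the helper `sampleOne`
are hypothetical, and it assumes the elements arrive through a plain
`java.util.Iterator`.

```java
import java.util.Iterator;
import java.util.Optional;
import java.util.Random;
import java.util.stream.IntStream;

// Hypothetical helper, not part of TinkerPop.
class ReservoirSample {

    // Picks one element uniformly at random in a single pass.
    // Only the current candidate and a counter are kept, so memory use
    // does not grow with the number of elements seen.
    static <T> Optional<T> sampleOne(Iterator<T> items, Random rng) {
        T chosen = null;
        long seen = 0;
        while (items.hasNext()) {
            T item = items.next();
            seen++;
            // Replace the candidate with probability 1/seen.
            // For seen == 1 this is always true, so the first element
            // is always taken as the initial candidate.
            if (rng.nextDouble() * seen < 1.0) {
                chosen = item;
            }
        }
        // Empty input (or a null element) yields Optional.empty().
        return Optional.ofNullable(chosen);
    }

    public static void main(String[] args) {
        // Stand-in for a large stream of vertex ids.
        Iterator<Integer> ids = IntStream.range(0, 1_000_000).boxed().iterator();
        System.out.println(sampleOne(ids, new Random()));
    }
}
```

This addresses only the memory side of the problem. The sketch still visits
every element once, so the running time remains linear in the size of the
input.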