[ https://issues.apache.org/jira/browse/TINKERPOP-2081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stephen mallette reassigned TINKERPOP-2081:
-------------------------------------------

    Assignee: stephen mallette

> PersistedOutputRDD materialises rdd lazily with Spark 2.x
> ---------------------------------------------------------
>
>                 Key: TINKERPOP-2081
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-2081
>             Project: TinkerPop
>          Issue Type: Bug
>          Components: hadoop
>    Affects Versions: 3.3.4
>            Reporter: Artem Aliev
>            Assignee: stephen mallette
>            Priority: Major
>
> PersistedOutputRDD does not actually persist the RDD in Spark memory; it
> only marks it for lazy caching in the future. It looks like caching was
> eager in Spark 1.6, but in Spark 2.0 it is lazy.
> Lazy caching looks wrong for this case: the source graph could be changed
> after the snapshot is created, and the snapshot should not be affected by
> those changes.
> The fix itself is simple: PersistedOutputRDD should call any Spark action
> to trigger eager caching, for example count().

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
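
The behaviour described in the issue can be illustrated with a toy model. The sketch below is plain Java and does not use Spark or TinkerPop; ToyRdd, LazySnapshotDemo and snapshot() are hypothetical names invented for this illustration. It only assumes what the report states: persist() marks the data for caching, and the cache is filled by the first action, such as count().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

// Toy model only: NOT Spark's API. All names here are hypothetical.
class LazySnapshotDemo {

    // Stand-in for an RDD: persist() only sets a flag; the data is cached
    // the first time an action (count or collect) forces a computation.
    static final class ToyRdd {
        private final Supplier<List<String>> source; // reads the live source
        private boolean persistRequested = false;
        private List<String> cached = null;

        ToyRdd(Supplier<List<String>> source) { this.source = source; }

        ToyRdd persist() { persistRequested = true; return this; }

        private List<String> compute() {
            if (cached != null) return cached;            // cache hit
            List<String> result = new ArrayList<>(source.get());
            if (persistRequested) cached = result;        // fill cache on first action
            return result;
        }

        long count() { return compute().size(); }
        List<String> collect() { return new ArrayList<>(compute()); }
    }

    // Takes a "snapshot", then mutates the source, then reads the snapshot.
    static List<String> snapshot(boolean forceWithCount) {
        List<String> graph = new ArrayList<>(Arrays.asList("v1", "v2"));
        ToyRdd rdd = new ToyRdd(() -> graph).persist();
        if (forceWithCount) rdd.count();  // the proposed fix: trigger caching now
        graph.add("v3");                  // source changes after the snapshot
        return rdd.collect();
    }

    public static void main(String[] args) {
        System.out.println("lazy:  " + snapshot(false)); // [v1, v2, v3]
        System.out.println("eager: " + snapshot(true));  // [v1, v2]
    }
}
```

Without the count() call the later change to the source leaks into the snapshot; with it, the snapshot is fixed at the moment it is taken. Whether count() or another action is the best trigger in PersistedOutputRDD itself is left open by the report.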