[
https://issues.apache.org/jira/browse/TINKERPOP-1525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15606957#comment-15606957
]
ASF GitHub Bot commented on TINKERPOP-1525:
-------------------------------------------
GitHub user dalaro opened a pull request:
https://github.com/apache/tinkerpop/pull/466
TINKERPOP-1525 Avoid starting VP worker iterations that never end (Spark
2.0 version)
This is exactly like #462, except that it tracks the switch, between Spark
1.6 and 2.0, from functions that manipulate iterables to functions that
manipulate iterators.
Assuming #462 eventually gets into master, and assuming that TINKERPOP-1389
eventually merges with master, the second merge will conflict. It still seems
marginally safer to make this change in parallel on TINKERPOP-1389 and
master/tp32 than just the latter, since the conflict will look more like "oh i
better keep one of these two almost-identical edge-case checks" than "oh the
Spark 1.x branch had some silly edge case check that I can just delete for 2.0".
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/dalaro/incubator-tinkerpop TINKERPOP-1525-for-TINKERPOP-1389
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/tinkerpop/pull/466.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #466
----
commit 2a8f741190beebd7b7e6a9ff7922afb9b6807fa5
Author: Dan LaRocque <[email protected]>
Date: 2016-10-26T00:37:17Z
Avoid starting VP worker iterations that never end
SparkExecutor.executeVertexProgramIteration was written in such a way
that an empty RDD partition would cause it to invoke
VertexProgram.workerIterationStart without ever invoking
VertexProgram.workerIterationEnd. This seems like a contract
violation. I have at least one VP that relies on
workerIterationStart|End to allocate and release resources. Failing
to invoke End like this causes a leak in that VP, as it would for any
VP that uses that resource management pattern.
(cherry picked from commit 36e1159a80f539b8bd4a884e5c1cf304ec52c4f9;
this is the same change, except it tracks a switch between Spark 1.6
and 2.0 away from functions that manipulate iterables to those that
manipulate iterators)
----
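The failure mode described in the commit message can be sketched in plain Java,
without Spark or TinkerPop on the classpath. This is an illustrative model, not
the actual SparkExecutor code: the WorkerLifecycle interface, CountingProgram,
and the unguarded/guarded helpers are all hypothetical names. The assumption is
that End is only reached from inside a lazy per-element iterator, so an empty
partition triggers Start but never End, and that a hasNext() check before Start
is one way to avoid that.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

public class WorkerIterationGuard {

    /** Hypothetical stand-in for the start/end hooks of a vertex program. */
    interface WorkerLifecycle {
        void workerIterationStart();
        void workerIterationEnd();
    }

    /** Hypothetical program that counts open resources so a leak is visible. */
    static final class CountingProgram implements WorkerLifecycle {
        int starts = 0;
        int open = 0;
        @Override public void workerIterationStart() { starts++; open++; }
        @Override public void workerIterationEnd() { open--; }
    }

    /** Models the leak: Start runs eagerly, End only after the last element. */
    static <T> Iterator<T> unguarded(final Iterator<T> partition, final WorkerLifecycle program) {
        program.workerIterationStart();
        return new Iterator<T>() {
            @Override public boolean hasNext() { return partition.hasNext(); }
            @Override public T next() {
                T element = partition.next();
                // Never reached for an empty partition, so End is skipped.
                if (!partition.hasNext()) program.workerIterationEnd();
                return element;
            }
        };
    }

    /** Assumed fix: skip the whole iteration when the partition is empty. */
    static <T> Iterator<T> guarded(final Iterator<T> partition, final WorkerLifecycle program) {
        if (!partition.hasNext()) return Collections.emptyIterator();
        return unguarded(partition, program);
    }

    /** Consumes an iterator fully, as a downstream stage would. */
    static <T> List<T> drain(Iterator<T> it) {
        List<T> out = new ArrayList<>();
        while (it.hasNext()) out.add(it.next());
        return out;
    }

    public static void main(String[] args) {
        CountingProgram leaky = new CountingProgram();
        drain(unguarded(Collections.<Integer>emptyIterator(), leaky));
        System.out.println("unguarded, empty partition: open=" + leaky.open);

        CountingProgram safe = new CountingProgram();
        drain(guarded(Collections.<Integer>emptyIterator(), safe));
        drain(guarded(Arrays.asList(1, 2, 3).iterator(), safe));
        System.out.println("guarded, empty then non-empty: open=" + safe.open);
    }
}
```

The guard returns before Start is ever called, so Start and End stay paired for
every partition, and a program that allocates in Start and releases in End has
nothing left open after an empty partition.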
> Plug VertexProgram iteration leak on empty Spark RDD partitions
> ---------------------------------------------------------------
>
> Key: TINKERPOP-1525
> URL: https://issues.apache.org/jira/browse/TINKERPOP-1525
> Project: TinkerPop
> Issue Type: Bug
> Components: hadoop
> Affects Versions: 3.2.3
> Reporter: Dan LaRocque
>
> If SparkExecutor gets an RDD with empty partitions, it can invoke
> {{VertexProgram.workerIterationStart}} without ever invoking
> {{VertexProgram.workerIterationEnd}}.
> For vertex programs that allocate and release meaningful resources in the
> start/end methods, this can lead to resource leaks.
> I already tested a fix that I made against the 3.2 series. I will submit PRs
> momentarily.