[ https://issues.apache.org/jira/browse/TINKERPOP-1100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Marko A. Rodriguez closed TINKERPOP-1100.
-----------------------------------------
Resolution: Fixed
When doing manual testing of {{SparkGraphComputer}} on my local cluster, I
noticed GC issues in the MapReduce execution engine. This was for two reasons:
* map() and reduce() were not lazy, so a large partition could cause an OutOfMemoryError.
* reduce() uses Spark's groupByKey(). To limit the cost of this nasty
aggregating/shuffling operation, I created a lazy combine().
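The idea behind a lazy combine() can be sketched in plain Java, without Spark. This is a minimal illustration under stated assumptions, not the code that went into {{SparkGraphComputer}}: the class {{LazyCombineSketch}}, the helpers {{combine}}, {{kv}} and {{toList}}, and the {{maxKeys}} bound are all hypothetical names. The sketch pulls one partition's map output through an iterator, pre-aggregates values per key in a bounded buffer, and emits each batch before reading further, so fewer records reach the shuffle and the whole partition never sits in memory at once.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.function.BinaryOperator;

class LazyCombineSketch {

    // Hypothetical helper: lazily pre-aggregates one partition's map output by key.
    // At most maxKeys distinct keys (maxKeys >= 1) are buffered at any time.
    static <K, V> Iterator<Map.Entry<K, V>> combine(
            final Iterator<Map.Entry<K, V>> mapOutput,
            final BinaryOperator<V> combiner,
            final int maxKeys) {
        return new Iterator<Map.Entry<K, V>>() {
            private Iterator<Map.Entry<K, V>> batch = Collections.emptyIterator();

            @Override
            public boolean hasNext() {
                // Refill only when the current batch is exhausted (this is the lazy part).
                if (!batch.hasNext() && mapOutput.hasNext()) {
                    Map<K, V> buffer = new LinkedHashMap<>();
                    while (mapOutput.hasNext() && buffer.size() < maxKeys) {
                        Map.Entry<K, V> e = mapOutput.next();
                        // Merge values that share a key before they reach the shuffle.
                        buffer.merge(e.getKey(), e.getValue(), combiner);
                    }
                    batch = buffer.entrySet().iterator();
                }
                return batch.hasNext();
            }

            @Override
            public Map.Entry<K, V> next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return batch.next();
            }
        };
    }

    // Hypothetical convenience for building key/value pairs in the example.
    static Map.Entry<String, Integer> kv(String key, int value) {
        return new AbstractMap.SimpleEntry<>(key, value);
    }

    // Drains an iterator into a list so the result can be inspected.
    static <T> List<T> toList(Iterator<T> it) {
        List<T> out = new ArrayList<>();
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }

    public static void main(String[] args) {
        // Word-count style map output for a single partition.
        List<Map.Entry<String, Integer>> mapOutput = Arrays.asList(
                kv("a", 1), kv("b", 1), kv("a", 1), kv("a", 1), kv("c", 1));

        // With a generous buffer, five records collapse to three: [a=3, b=1, c=1]
        System.out.println(toList(combine(mapOutput.iterator(), Integer::sum, 100)));
    }
}
```

With a small {{maxKeys}} the same key can be emitted more than once, in separate batches, so a final reduce() is still needed after the shuffle. The combiner only shrinks what has to be grouped; it does not replace the grouping step.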
> Look deeply into adding combine()-support in Spark MapReduce.
> -------------------------------------------------------------
>
> Key: TINKERPOP-1100
> URL: https://issues.apache.org/jira/browse/TINKERPOP-1100
> Project: TinkerPop
> Issue Type: Improvement
> Components: process
> Affects Versions: 3.1.0-incubating
> Reporter: Marko A. Rodriguez
> Assignee: Marko A. Rodriguez
> Fix For: 3.1.1-incubating
>
>
> I have comments in the code that tell me that {{combine()}} support is not
> necessary. However, I don't really think this is true. Map processing seems
> to be taking longer than it should.