[jira] [Closed] (TINKERPOP-1100) Look deeply into adding combine()-support in Spark MapReduce.

Marko A. Rodriguez (JIRA) Tue, 26 Jan 2016 05:17:07 -0800

     [ https://issues.apache.org/jira/browse/TINKERPOP-1100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Marko A. Rodriguez closed TINKERPOP-1100.
-----------------------------------------
    Resolution: Fixed

When doing manual testing of {{SparkGraphComputer}} on my local cluster, I 
noticed GC issues in the MapReduce execution engine. This was for two reasons:

* map() and reduce() were not lazy, so a large partition could cause an 
OutOfMemoryError.
* reduce() uses Spark's groupByKey(), which shuffles every value for a key 
across the cluster. To limit the cost of this expensive aggregating/shuffling 
operation, I created a lazy combine(). 
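
The two points above can be illustrated with a minimal Python sketch. This is 
not the TinkerPop or Spark code; the names lazy_map, lazy_combine, and 
max_buffered are hypothetical and exist only for illustration. It assumes a 
partition is an iterable of records, streams mapped pairs instead of 
materializing them, and pre-aggregates values per key in bounded batches so 
that fewer pairs need to reach the shuffle.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, Iterable, Iterator, List, Tuple

Pair = Tuple[Hashable, int]


def lazy_map(records: Iterable[str],
             map_fn: Callable[[str], Iterable[Pair]]) -> Iterator[Pair]:
    """Yield mapped (key, value) pairs one at a time.

    Nothing is collected into a list, so memory use does not grow
    with the size of the partition.
    """
    for record in records:
        yield from map_fn(record)


def lazy_combine(pairs: Iterable[Pair],
                 combine_fn: Callable[[Hashable, List[int]], int],
                 max_buffered: int = 1000) -> Iterator[Pair]:
    """Pre-aggregate values per key within one partition.

    At most max_buffered pairs are held in memory (hypothetical knob).
    When the buffer fills, every buffered key is combined and emitted,
    so a key may be emitted more than once; the later reduce step is
    expected to merge those partial results.
    """
    buffer: Dict[Hashable, List[int]] = defaultdict(list)
    buffered = 0
    for key, value in pairs:
        buffer[key].append(value)
        buffered += 1
        if buffered >= max_buffered:
            for k, values in buffer.items():
                yield k, combine_fn(k, values)
            buffer.clear()
            buffered = 0
    # Flush whatever is left once the partition is exhausted.
    for k, values in buffer.items():
        yield k, combine_fn(k, values)


def word_pairs(line: str) -> Iterator[Pair]:
    """Example map function: emit (word, 1) for each word in a line."""
    return ((word, 1) for word in line.split())
```

For example, list(lazy_combine(lazy_map(lines, word_pairs), lambda k, vs: 
sum(vs))) yields partial counts per word for one partition. Because the 
combiner may emit the same key several times, it only works for reductions 
that can be applied to partial results, such as sums or counts.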

> Look deeply into adding combine()-support in Spark MapReduce.
> -------------------------------------------------------------
>
>                 Key: TINKERPOP-1100
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1100
>             Project: TinkerPop
>          Issue Type: Improvement
>          Components: process
>    Affects Versions: 3.1.0-incubating
>            Reporter: Marko A. Rodriguez
>            Assignee: Marko A. Rodriguez
>             Fix For: 3.1.1-incubating
>
>
> I have comments in the code that tell me that {{combine()}} support is not 
> necessary. However, I don't really think this is true. Map processing seems 
> to be taking longer than it should.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
