[ https://issues.apache.org/jira/browse/GIRAPH-1148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16033823#comment-16033823 ]
ASF GitHub Bot commented on GIRAPH-1148:
----------------------------------------

Github user dlogothetis commented on a diff in the pull request:

    https://github.com/apache/giraph/pull/39#discussion_r119745185

    --- Diff: giraph-block-app-8/src/main/java/org/apache/giraph/block_app/library/prepare_graph/UndirectedConnectedComponents.java ---
    @@ -352,10 +352,15 @@ Block calculateConnectedComponentSizes(
         Pair<LongWritable, LongWritable> componentToReducePair = Pair.of(
             new LongWritable(), new LongWritable(1));
         LongWritable reusableLong = new LongWritable();
    -    return Pieces.reduceAndBroadcast(
    -        "CalcConnectedComponentSizes",
    +    // This reduce operation is stateless so we can use a single instance
    +    BasicMapReduce<LongWritable, LongWritable, LongWritable> reduceOperation =
             new BasicMapReduce<>(
    -            LongTypeOps.INSTANCE, LongTypeOps.INSTANCE, SumReduce.LONG),
    +            LongTypeOps.INSTANCE, LongTypeOps.INSTANCE, SumReduce.LONG);
    +    return Pieces.reduceAndBroadcastWithArrayOfHandles(
    +        "CalcConnectedComponentSizes",
    +        3137, /* Just using some large prime number */
    --- End diff --

    Sounds good. Looks good then.


> Connected components - make calculate sizes work with large number of components
> --------------------------------------------------------------------------------
>
>                 Key: GIRAPH-1148
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-1148
>             Project: Giraph
>          Issue Type: Improvement
>            Reporter: Maja Kabiljo
>            Assignee: Maja Kabiljo
>
> Currently if we have a graph with large number of connected components,
> calculating connected components sizes fails because reducer becomes too
> large. Use array of handles instead.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
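
For readers following the thread, the idea behind "use array of handles" can be sketched in plain Java, with no Giraph dependency. This is an illustration only and not the code from the pull request: the class name `ShardedComponentSizes` and its methods are hypothetical, and how Giraph actually assigns values to handles is an assumption here. Instead of one map holding a count for every component, the counts are split across a fixed number of smaller maps, selected by component id modulo the shard count (3137 in the diff), so no single reduced value has to hold all components.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration: component sizes summed in many small maps instead of one. */
final class ShardedComponentSizes {
  // One independent sum-map per shard; stands in for one reducer handle each.
  private final List<Map<Long, Long>> shards = new ArrayList<>();

  ShardedComponentSizes(int numShards) {
    for (int i = 0; i < numShards; i++) {
      shards.add(new HashMap<>());
    }
  }

  /** Shard index for a component id; floorMod keeps it non-negative for negative ids. */
  int shardOf(long componentId) {
    return (int) Math.floorMod(componentId, (long) shards.size());
  }

  /** Adds count to the running size of the given component, in its own shard only. */
  void add(long componentId, long count) {
    shards.get(shardOf(componentId)).merge(componentId, count, Long::sum);
  }

  /** Returns the accumulated size of a component, or 0 if it was never seen. */
  long sizeOf(long componentId) {
    return shards.get(shardOf(componentId)).getOrDefault(componentId, 0L);
  }

  /** Number of distinct components held by the fullest shard. */
  int largestShardSize() {
    int max = 0;
    for (Map<Long, Long> shard : shards) {
      max = Math.max(max, shard.size());
    }
    return max;
  }
}
```

With `new ShardedComponentSizes(3137)`, each vertex would call `add(componentId, 1)`. A prime shard count is a common choice because ids that share a common factor (for example, all even ids) still spread across all shards.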