[ 
https://issues.apache.org/jira/browse/GIRAPH-314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13449716#comment-13449716
 ] 

Maja Kabiljo commented on GIRAPH-314:
-------------------------------------

Great change, javadoc is much easier to understand now.

The two options I mentioned should prevent us from generating new messages 
until enough of the current messages have been processed. So if we use 
out-of-core messages, they shouldn't be able to pile up. With those options I 
was able to run RandomMessageBenchmark with a really huge number of messages 
(it was slow, of course, but it worked). I'm surprised to hear it didn't work 
for you.

I'm not sure that we are thinking of the same combiner. Correct me if I'm 
wrong, but the reason amortizing saves you is that you get to process part 
of the messages before receiving new ones. And processing messages decreases 
the memory used just by replacing several occurrences of one second-degree 
neighbour with a single count of occurrences. That's what a combiner should 
also do.
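
To make concrete what I mean by "combiner" here, a minimal sketch in plain 
Java, deliberately not using the Giraph combiner API. The class and method 
names (NeighbourCountCombiner, combine) are hypothetical, and I'm assuming 
vertex ids are ints:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Giraph: collapses repeated
// second-degree neighbour ids into (id -> number of occurrences).
final class NeighbourCountCombiner {
    private NeighbourCountCombiner() { }

    static Map<Integer, Integer> combine(List<Integer> neighbourIds) {
        // LinkedHashMap keeps the ids in first-seen order.
        Map<Integer, Integer> counts = new LinkedHashMap<>();
        for (int id : neighbourIds) {
            // Insert 1 for a new id, otherwise add 1 to the stored count.
            counts.merge(id, 1, Integer::sum);
        }
        return counts;
    }
}
```

Memory then grows with the number of distinct neighbour ids rather than with 
the number of messages received.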

So you are planning to change the infrastructure in order to better support 
sending the same message to several vertices on the same worker? So that in 
practice we send only the message and the list of destination vertices, and 
on the destination worker we have only one copy of the message? That sounds 
like a really good improvement for this and similar applications, where 
messages are big objects. If messages are not combinable, and if we had some 
good partitioning, this could really decrease the amount of traffic and 
memory usage here.
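
Roughly, the grouping I have in mind could be sketched as below. This is not 
Giraph code: MessageFanOut, workerFor and groupByWorker are hypothetical 
names, and the "vertex id mod number of workers" partitioning is only an 
assumption for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch, not Giraph code: groups destination vertices by
// worker so one copy of a message can travel with a list of recipients.
final class MessageFanOut {
    private MessageFanOut() { }

    // Assumption: vertices are hash-partitioned as vertexId mod numWorkers.
    static int workerFor(int vertexId, int numWorkers) {
        return Math.floorMod(vertexId, numWorkers);
    }

    // Returns worker index -> destination vertex ids hosted on that worker.
    static Map<Integer, List<Integer>> groupByWorker(List<Integer> destinations,
                                                     int numWorkers) {
        Map<Integer, List<Integer>> byWorker = new TreeMap<>();
        for (int vertexId : destinations) {
            byWorker.computeIfAbsent(workerFor(vertexId, numWorkers),
                                     w -> new ArrayList<>())
                    .add(vertexId);
        }
        return byWorker;
    }
}
```

The message payload would then be sent once per entry in the returned map 
instead of once per destination vertex, so the saving depends on how many 
destinations the partitioning places on the same worker.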
                
> Implement better message grouping to improve performance in 
> SimpleTriangleClosingVertex
> ---------------------------------------------------------------------------------------
>
>                 Key: GIRAPH-314
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-314
>             Project: Giraph
>          Issue Type: Improvement
>          Components: examples
>    Affects Versions: 0.2.0
>            Reporter: Eli Reisman
>            Assignee: Eli Reisman
>            Priority: Trivial
>             Fix For: 0.2.0
>
>         Attachments: GIRAPH-314-1.patch, GIRAPH-314-2.patch, 
> GIRAPH-314-3.patch, GIRAPH-314-4.patch
>
>
> After running SimpleTriangleClosingVertex at scale I'm thinking the 
> sendMessageToAllEdges() is pretty in the code, but it's not a good idea in 
> practice since each vertex V sends degree(V)^2 messages right in the first 
> superstep in this algorithm. Could do something with a combiner etc. but just 
> grouping messages by hand at the application level by using 
> IntArrayListWritable again does the trick fine.
> Probably should have just done it this way before, but 
> sendMessageToAllEdges() looked so nice. Sigh. Changed unit tests to reflect 
> this new approach, passes mvn verify and cluster, etc.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
