[ https://issues.apache.org/jira/browse/GIRAPH-12?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107063#comment-13107063 ]
Hyunsik Choi commented on GIRAPH-12: ------------------------------------ (a note for sharing) In current implementation, outgoing messages are sent to other peers in only two triggers: 1) When the number of outgoing messages for a specific peer exceeds the a threshold (i.e., maxSize), the outgoing messages for the peer are transmitted to the peer. 2) When one super step is finished, the entire messages are flushed to other peers. In the case 1, however, the current implementation only consider the number of messages instead of the size of messages. The outgoing messages reside in main memory until they are sent to other peers. It is another important factor to consume main memory. It would be good to consider not only the number of messages but also the size of messages. > Investigate communication improvements > -------------------------------------- > > Key: GIRAPH-12 > URL: https://issues.apache.org/jira/browse/GIRAPH-12 > Project: Giraph > Issue Type: Improvement > Components: bsp > Reporter: Avery Ching > Assignee: Hyunsik Choi > Priority: Minor > Attachments: GIRAPH-12_1.patch > > > Currently every worker will start up a thread to communicate with every other > workers. Hadoop RPC is used for communication. For instance if there are > 400 workers, each worker will create 400 threads. This ends up using a lot > of memory, even with the option > -Dmapred.child.java.opts="-Xss64k". > It would be good to investigate using frameworks like Netty or custom roll > our own to improve this situation. By moving away from Hadoop RPC, we would > also make compatibility of different Hadoop versions easier. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira