[ https://issues.apache.org/jira/browse/GIRAPH-12?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13117011#comment-13117011 ]
Avery Ching commented on GIRAPH-12:
-----------------------------------

If the default stack size is 1 MB, then with, for instance, 1024 workers you are talking about 1 GB wasted on thread stack space per node. The aggregate wasted memory would be 1 GB * 1024 = 1 TB, and that's a lot of memory =). The issue is that many clusters (including Yahoo!'s) are running only 32-bit JVMs, so if you are using 1 GB just for stack space, that leaves only so much for the heap (graph + messages). I think this should help quite a bit until GIRAPH-37 is taken on.

Can you run the unit tests against a real Hadoop instance as well? Then I'd say +1, unless someone disagrees.

> Investigate communication improvements
> --------------------------------------
>
>                 Key: GIRAPH-12
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-12
>             Project: Giraph
>          Issue Type: Improvement
>          Components: bsp
>            Reporter: Avery Ching
>            Assignee: Hyunsik Choi
>            Priority: Minor
>         Attachments: GIRAPH-12_1.patch, GIRAPH-12_2.patch
>
>
> Currently every worker starts a thread to communicate with every other
> worker. Hadoop RPC is used for communication. For instance, if there are
> 400 workers, each worker will create 400 threads. This ends up using a lot
> of memory, even with the option -Dmapred.child.java.opts="-Xss64k".
> It would be good to investigate using frameworks like Netty, or rolling
> our own, to improve this situation. By moving away from Hadoop RPC, we would
> also make compatibility with different Hadoop versions easier.
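As a side note, the stack-space arithmetic in the comment above can be sketched in a few lines, along with the JDK hook for shrinking a single thread's stack. This is an illustrative sketch, not Giraph code; the thread name and the 64 KB figure (mirroring the -Xss64k option mentioned in the issue) are assumptions for the example, and per the JDK docs the stackSize argument to the Thread constructor is only a hint that some JVMs ignore.

```java
public class StackBudget {
    public static void main(String[] args) {
        long stackBytes = 1L << 20;          // 1 MB default thread stack
        int workers = 1024;                  // one comm thread per peer worker
        long perNode = stackBytes * workers; // stack space wasted per node
        long aggregate = perNode * workers;  // summed across all 1024 nodes

        System.out.println(perNode >> 20);   // prints 1024 (MB), i.e. 1 GB per node
        System.out.println(aggregate >> 40); // prints 1 (TB) aggregate

        // The JDK also allows requesting a smaller stack per thread via the
        // four-argument Thread constructor (stack size is a hint, not a guarantee):
        Thread t = new Thread(null, () -> {}, "comm-thread", 64 * 1024);
        t.start();
    }
}
```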