[ 
https://issues.apache.org/jira/browse/GIRAPH-12?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13116177#comment-13116177
 ] 

Hyunsik Choi commented on GIRAPH-12:
------------------------------------

FYI, I record how I collected the memory log messages of Runtime.

{noformat}
grep "totalMem" hadoop/logs/userlogs/job_201109281028_0007/ -r > orig_1.log

cat orig_1.log | awk '{print $6" "$7" "$8}' | sed 's/totalMem=//g' | sed 
's/maxMem=//' | 
sed 's/freeMem=//' | awk '{total=total+$1; max=max+$2; free=free+$3} END {print 
total/NR,max/NR,free/NR}'
{noformat}
                
> Investigate communication improvements
> --------------------------------------
>
>                 Key: GIRAPH-12
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-12
>             Project: Giraph
>          Issue Type: Improvement
>          Components: bsp
>            Reporter: Avery Ching
>            Assignee: Hyunsik Choi
>            Priority: Minor
>         Attachments: GIRAPH-12_1.patch, GIRAPH-12_2.patch
>
>
> Currently every worker will start up a thread to communicate with every other 
> workers.  Hadoop RPC is used for communication.  For instance if there are 
> 400 workers, each worker will create 400 threads.  This ends up using a lot 
> of memory, even with the option  
> -Dmapred.child.java.opts="-Xss64k".  
> It would be good to investigate using frameworks like Netty or custom roll 
> our own to improve this situation.  By moving away from Hadoop RPC, we would 
> also make compatibility of different Hadoop versions easier.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to