[ 
https://issues.apache.org/jira/browse/GIRAPH-12?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyunsik Choi updated GIRAPH-12:
-------------------------------

    Attachment: GIRAPH-12_2.patch

I have attached the second patch.

I have benchmarked this patch using GIRAPH-32. The results are shown 
below. In these runs, the patched version is slightly faster than the current 
implementation. As Avery mentioned, the patched version also makes the number 
of threads controllable, so it is an improvement. 

Users can adjust the number of core threads and the maximum number of threads 
through GiraphJob's constants, such as MSG_FLUSHER_CORE_SIZE and 
MSG_FLUSHER_MAX_SIZE. These settings can affect performance, so we may need to 
give users guidance on choosing good values.
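
To illustrate what a core size and a max size control, here is a minimal sketch built on the JDK's ThreadPoolExecutor. It is not taken from the patch: the class FlusherPoolSketch, the methods newFlusherPool and flushAll, the 30-second keep-alive, and the caller-runs fallback are all assumptions made for illustration. Only the idea of a core size and a max size comes from the comment above.

```java
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; not the code in GIRAPH-12_2.patch.
final class FlusherPoolSketch {

    // Builds a pool that keeps up to coreSize threads alive and never runs
    // more than maxSize threads at once. With a SynchronousQueue, a task that
    // finds no idle thread starts a new one until maxSize is reached; after
    // that, CallerRunsPolicy makes the submitting thread run the task itself.
    static ThreadPoolExecutor newFlusherPool(int coreSize, int maxSize) {
        return new ThreadPoolExecutor(
                coreSize, maxSize,
                30L, TimeUnit.SECONDS,           // assumed idle timeout
                new SynchronousQueue<Runnable>(),
                new ThreadPoolExecutor.CallerRunsPolicy());
    }

    // Submits one stand-in "flush" per peer and returns
    // {number of flushes completed, largest pool size observed}.
    static int[] flushAll(int peers, int coreSize, int maxSize) {
        ThreadPoolExecutor pool = newFlusherPool(coreSize, maxSize);
        AtomicInteger flushed = new AtomicInteger();
        for (int peer = 0; peer < peers; peer++) {
            // Stand-in for sending the buffered messages to one peer.
            pool.execute(flushed::incrementAndGet);
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new int[] {flushed.get(), pool.getLargestPoolSize()};
    }

    public static void main(String[] args) {
        int[] result = flushAll(400, 4, 16);
        System.out.println("flushed=" + result[0]
                + " largestPoolSize=" + result[1]);
    }
}
```

With 400 peers, the sketch still completes all 400 flushes, but the pool never holds more than 16 threads, in contrast to one thread per peer. The observed largest pool size varies from run to run.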

However, this experiment may not be sufficient to evaluate the approach, 
because it was conducted on a small cluster.

*Results of the original version*
{noformat}
org.apache.giraph.benchmark.RandomMessageBenchmark -e 2 -s 3 -w 6 -b 4 -n 150 -V 300000 -v

= 1st =
11/09/22 00:55:06 INFO mapred.JobClient:     Total (milliseconds)=63096
11/09/22 00:55:06 INFO mapred.JobClient:     Superstep 3 (milliseconds)=551
11/09/22 00:55:06 INFO mapred.JobClient:     Setup (milliseconds)=1331
11/09/22 00:55:06 INFO mapred.JobClient:     Shutdown (milliseconds)=1008
11/09/22 00:55:06 INFO mapred.JobClient:     Vertex input superstep (milliseconds)=516
11/09/22 00:55:06 INFO mapred.JobClient:     Superstep 0 (milliseconds)=16079
11/09/22 00:55:06 INFO mapred.JobClient:     Superstep 2 (milliseconds)=25657
11/09/22 00:55:06 INFO mapred.JobClient:     Superstep 1 (milliseconds)=17950

= 2nd =
11/09/22 00:58:13 INFO mapred.JobClient:     Total (milliseconds)=62771
11/09/22 00:58:13 INFO mapred.JobClient:     Superstep 3 (milliseconds)=600
11/09/22 00:58:13 INFO mapred.JobClient:     Setup (milliseconds)=1290
11/09/22 00:58:13 INFO mapred.JobClient:     Shutdown (milliseconds)=950
11/09/22 00:58:13 INFO mapred.JobClient:     Vertex input superstep (milliseconds)=614
11/09/22 00:58:13 INFO mapred.JobClient:     Superstep 0 (milliseconds)=15654
11/09/22 00:58:13 INFO mapred.JobClient:     Superstep 2 (milliseconds)=25157
11/09/22 00:58:13 INFO mapred.JobClient:     Superstep 1 (milliseconds)=18499
{noformat}

*Results of the patched version*
{noformat}
= 1st =
11/09/22 00:59:41 INFO mapred.JobClient:     Total (milliseconds)=60068
11/09/22 00:59:41 INFO mapred.JobClient:     Superstep 3 (milliseconds)=542
11/09/22 00:59:41 INFO mapred.JobClient:     Setup (milliseconds)=1219
11/09/22 00:59:41 INFO mapred.JobClient:     Shutdown (milliseconds)=1025
11/09/22 00:59:41 INFO mapred.JobClient:     Vertex input superstep (milliseconds)=616
11/09/22 00:59:41 INFO mapred.JobClient:     Superstep 0 (milliseconds)=15887
11/09/22 00:59:41 INFO mapred.JobClient:     Superstep 2 (milliseconds)=23149
11/09/22 00:59:41 INFO mapred.JobClient:     Superstep 1 (milliseconds)=17626

= 2nd =
11/09/22 01:01:05 INFO mapred.JobClient:     Total (milliseconds)=60359
11/09/22 01:01:05 INFO mapred.JobClient:     Superstep 3 (milliseconds)=510
11/09/22 01:01:05 INFO mapred.JobClient:     Setup (milliseconds)=1399
11/09/22 01:01:05 INFO mapred.JobClient:     Shutdown (milliseconds)=956
11/09/22 01:01:05 INFO mapred.JobClient:     Vertex input superstep (milliseconds)=550
11/09/22 01:01:05 INFO mapred.JobClient:     Superstep 0 (milliseconds)=16054
11/09/22 01:01:05 INFO mapred.JobClient:     Superstep 2 (milliseconds)=23049
11/09/22 01:01:05 INFO mapred.JobClient:     Superstep 1 (milliseconds)=17835
{noformat}
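
To put a number on "slightly better", the sketch below averages the two runs of each version and computes the relative speedup for the total time and for Superstep 2, where the gap is largest. The timings are copied from the logs above. The class and method names (BenchmarkSummary, mean, percentFaster) are hypothetical helpers, and with only two runs per version on a small cluster the figures are indicative at best.

```java
// Hypothetical helper for summarizing the timings reported above.
final class BenchmarkSummary {

    // Arithmetic mean of the given timings, in milliseconds.
    static double mean(long... values) {
        long sum = 0;
        for (long v : values) {
            sum += v;
        }
        return (double) sum / values.length;
    }

    // Relative reduction from `before` to `after`, in percent.
    static double percentFaster(double before, double after) {
        return 100.0 * (before - after) / before;
    }

    public static void main(String[] args) {
        // "Total (milliseconds)" from the two runs of each version.
        double originalTotal = mean(63096, 62771);
        double patchedTotal = mean(60068, 60359);

        // "Superstep 2 (milliseconds)" from the two runs of each version.
        double originalStep2 = mean(25657, 25157);
        double patchedStep2 = mean(23149, 23049);

        System.out.printf("Total:       %.1f ms -> %.1f ms (%.1f%% faster)%n",
                originalTotal, patchedTotal,
                percentFaster(originalTotal, patchedTotal));
        System.out.printf("Superstep 2: %.1f ms -> %.1f ms (%.1f%% faster)%n",
                originalStep2, patchedStep2,
                percentFaster(originalStep2, patchedStep2));
    }
}
```

On these numbers the mean total drops from 62933.5 ms to 60213.5 ms, roughly 4.3%, and the mean Superstep 2 time drops from 25407 ms to 23099 ms, roughly 9.1%.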

> Investigate communication improvements
> --------------------------------------
>
>                 Key: GIRAPH-12
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-12
>             Project: Giraph
>          Issue Type: Improvement
>          Components: bsp
>            Reporter: Avery Ching
>            Assignee: Hyunsik Choi
>            Priority: Minor
>         Attachments: GIRAPH-12_1.patch, GIRAPH-12_2.patch
>
>
> Currently every worker will start up a thread to communicate with every other 
> worker.  Hadoop RPC is used for communication.  For instance, if there are 
> 400 workers, each worker will create 400 threads.  This ends up using a lot 
> of memory, even with the option -Dmapred.child.java.opts="-Xss64k".  
> It would be good to investigate using a framework like Netty, or rolling 
> our own, to improve this situation.  By moving away from Hadoop RPC, we would 
> also make compatibility across different Hadoop versions easier.


        
