[ 
https://issues.apache.org/jira/browse/GIRAPH-128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195275#comment-13195275
 ] 

Jakob Homan commented on GIRAPH-128:
------------------------------------

* Should
{noformat}+    private static final int MAX_BIND_ATTEMPTS = 20;{noformat} be a 
configuration variable? It seems like something would be good to be able to 
tune.
* Choice of increment:
This is a review of GIRAPH-128.2.patch.  When uploading new versions of 
patches, it's best not delete the old ones so reviewers can see the progress. 
JIRA highlights the most current version (although it's always nice to number 
them as well, as was done here).

{noformat}
+        // Simple handling of port collisions on the same machine while
+        // preserving debugability from the port number alone.
+        // Round up the max number of workers to the next power of 10 and use
+        // it as a constant to increase the port number with.
+        int portIncrementConstant =
+            (int) Math.pow(10, Math.ceil(Math.log10(numWorkers)));
{noformat}
This seems like a bit of an odd choice; what motivated it?  Similar jobs (in a 
small cluster, for instance) may have the same number of workers and keep 
colliding.  Any reason not just to use some random number from a uniform 
distribution of say 100 and use that as the constant.  That may result in fewer 
collisions.
* {{MinimumIntCombiner.java}} and {{SimpleSumCombiner.java}} have unnecessary 
whitespace changes.
* RPCCommunicationsTest: Is there any reason not use a mock here rather than 
actually extending the Mapper class for test?  These one-off implementations 
for tests show up in "Find implementations" actions in IDE.

                
> RPC port from BasicRPCCommunications should be only a starting port, and 
> retried
> --------------------------------------------------------------------------------
>
>                 Key: GIRAPH-128
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-128
>             Project: Giraph
>          Issue Type: Improvement
>    Affects Versions: 0.1.0
>            Reporter: Avery Ching
>            Assignee: Avery Ching
>         Attachments: GIRAPH-128.2.patch
>
>
> Currently Giraph uses a basic port + the task partition to get the RPC port.  
> This doesn't work well for when there are multiple Giraph jobs running 
> simultaneously in the same Hadoop cluster (port conflict).  At the same time, 
> it is nice to use this simple algorithm because it makes it very easy to 
> debug problems (you can find the troublesome mapper from the RPC port name).  
> I will be proposing a simple scheme to retry with another port.  I will round 
> the total number of mappers up to the nearest power of 10 (let's that that 
> number Z).  Then I will increment the port number by Z, retrying up to 20 
> tries.  If you have enough ports, this scheme would guarantee that up to 20 
> mappers / node would be supported.  It should be sufficient for most 
> clusters.  At the same time, we still maintain the easy debugging method 
> since you it's still easy to figure out the mapper partition from the port 
> (port % Z = map partition). 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to