[jira] [Updated] (MAPREDUCE-5616) MR Client-AppMaster RPC max retries on socket timeout is too high.

2013-11-14 Thread Chris Nauroth (JIRA)

[ https://issues.apache.org/jira/browse/MAPREDUCE-5616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Nauroth updated MAPREDUCE-5616:
-------------------------------------

      Resolution: Fixed
   Fix Version/s: 2.3.0
                  3.0.0
Target Version/s: 3.0.0, 2.3.0  (was: 3.0.0, 2.2.0)
    Hadoop Flags: Reviewed
          Status: Resolved  (was: Patch Available)

Thanks for the review, Bikas.  I've committed this to trunk and branch-2.

> MR Client-AppMaster RPC max retries on socket timeout is too high.
> ------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5616
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5616
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: client
>    Affects Versions: 3.0.0, 2.2.0
>            Reporter: Chris Nauroth
>            Assignee: Chris Nauroth
>             Fix For: 3.0.0, 2.3.0
>
>         Attachments: MAPREDUCE-5616.1.patch
>
>
> MAPREDUCE-3811 introduced a separate config key for overriding the max 
> retries applied to RPC connections from the MapReduce Client to the MapReduce 
> Application Master.  This was done to make failover from the AM to the 
> MapReduce History Server faster in the event that the AM completes while the 
> client thinks it's still running.  However, the RPC client uses a separate
> setting for the max retries on socket timeouts, and that setting is not
> overridden.  Its default is 45 retries with a 20-second timeout on each
> retry.  This means that in environments where connection attempts time out
> instead of being refused, the client waits 15 minutes (45 x 20 seconds)
> for failover.
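
The 15-minute figure in the description follows directly from the two defaults it quotes. A minimal sketch of that arithmetic is below; the class and method names are hypothetical, and it assumes every attempt runs to the full timeout and that the 20-second timeout stays the same when the retry count is lowered to the default of 3 chosen in the attached patch.

```java
import java.time.Duration;

// Hypothetical helper, not part of Hadoop: estimates how long a client
// waits when every connection attempt ends in a socket timeout.
public class RetryWaitEstimate {

    // Worst case: each retry blocks for the full connect timeout.
    public static Duration worstCaseWait(int maxRetriesOnTimeout, Duration connectTimeout) {
        return connectTimeout.multipliedBy(maxRetriesOnTimeout);
    }

    public static void main(String[] args) {
        // 20-second timeout per retry, as quoted in the issue description.
        Duration timeout = Duration.ofSeconds(20);

        // 45 retries is the default quoted in the issue description.
        Duration before = worstCaseWait(45, timeout);
        // 3 retries is the default chosen in the attached patch.
        Duration after = worstCaseWait(3, timeout);

        System.out.println("45 retries: " + before.toMinutes() + " minutes");
        System.out.println("3 retries: " + after.getSeconds() + " seconds");
    }
}
```

Under these assumptions the worst-case wait before failover drops from 15 minutes to 60 seconds.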



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (MAPREDUCE-5616) MR Client-AppMaster RPC max retries on socket timeout is too high.

2013-11-08 Thread Chris Nauroth (JIRA)

[ https://issues.apache.org/jira/browse/MAPREDUCE-5616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Nauroth updated MAPREDUCE-5616:
-------------------------------------

Target Version/s: 2.2.0, 3.0.0  (was: 3.0.0, 2.2.0)
          Status: Patch Available  (was: Open)






[jira] [Updated] (MAPREDUCE-5616) MR Client-AppMaster RPC max retries on socket timeout is too high.

2013-11-08 Thread Chris Nauroth (JIRA)

[ https://issues.apache.org/jira/browse/MAPREDUCE-5616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Nauroth updated MAPREDUCE-5616:
-------------------------------------

Attachment: MAPREDUCE-5616.1.patch

I'm attaching a patch that supports overriding the max retries on socket
connection timeouts.  I chose a default of 3 retries.



