[jira] [Updated] (MAPREDUCE-5332) Support token-preserving restart of history server

2013-09-27 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-5332:
--

       Resolution: Fixed
    Fix Version/s: 2.3.0
                   3.0.0
     Hadoop Flags: Reviewed
           Status: Resolved  (was: Patch Available)

I committed this to trunk and branch-2.

> Support token-preserving restart of history server
> --
>
> Key: MAPREDUCE-5332
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5332
> Project: Hadoop Map/Reduce
> Issue Type: New Feature
> Components: jobhistoryserver
> Reporter: Jason Lowe
> Assignee: Jason Lowe
> Fix For: 3.0.0, 2.3.0
>
> Attachments: MAPREDUCE-5332-2.patch, MAPREDUCE-5332-3.patch, 
> MAPREDUCE-5332-4.patch, MAPREDUCE-5332-5.patch, MAPREDUCE-5332-5.patch, 
> MAPREDUCE-5332-6.patch, MAPREDUCE-5332-7.patch, MAPREDUCE-5332.patch
>
>
> To better support rolling upgrades through a cluster, the history server 
> needs the ability to restart without losing track of delegation tokens.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-5332) Support token-preserving restart of history server

2013-09-23 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-5332:
--

Attachment: MAPREDUCE-5332-7.patch

Thanks for the thorough review, Daryn!  Updated the patch to address all but 
one of the concerns.  High-level changes include:

* Added an updateToken method to the state store interface; the filesystem 
store uses a rename to try to make the update atomic.
* Token buckets are now created up front.
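
A minimal sketch of the two changes listed above, using java.nio.file on a 
local filesystem as a stand-in for the Hadoop FileSystem API.  The class name, 
bucket count, and file naming are assumptions made for illustration and are 
not taken from the patch.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical store: names and on-disk layout are illustrative only.
class BucketedTokenStore {
    private static final int NUM_BUCKETS = 16; // assumed value
    private final Path root;

    BucketedTokenStore(Path root) throws IOException {
        this.root = root;
        // Create every bucket directory up front so that later writes never
        // have to check for, or create, a missing parent directory.
        for (int i = 0; i < NUM_BUCKETS; i++) {
            Files.createDirectories(bucketPath(i));
        }
    }

    private Path bucketPath(int bucket) {
        return root.resolve(String.format("bucket_%02d", bucket));
    }

    Path tokenPath(int sequenceNumber) {
        int bucket = Math.floorMod(sequenceNumber, NUM_BUCKETS);
        return bucketPath(bucket).resolve("token_" + sequenceNumber);
    }

    // Write the new contents to a sibling temporary file, then rename it
    // over the token file.  Whether the replacing rename is truly atomic
    // depends on the underlying filesystem.
    void updateToken(int sequenceNumber, byte[] data) throws IOException {
        Path target = tokenPath(sequenceNumber);
        Path tmp = target.resolveSibling("tmp_" + target.getFileName());
        Files.write(tmp, data);
        Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
    }

    byte[] readToken(int sequenceNumber) throws IOException {
        return Files.readAllBytes(tokenPath(sequenceNumber));
    }
}
```

Calling updateToken twice for the same sequence number leaves a single file 
holding the latest contents; a reader sees either the old or the new version, 
never a mix, as long as the rename itself is atomic.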

bq. The DTSM has the stateStore so its recovery method could load the state - 
instead of the caller loading the state from the stateStore and passing it in. 
The code may become a bit easier to follow, but just a suggestion.

I kept this as-is.  Having the caller load the state makes more sense if the 
history server were to persist more than just these tokens in the future, as 
you'd want to load the state once and then dole out the bits of state to the 
various entities that need to recover from it.  Alternatively, the state 
stores could be separate and per-service, in which case I agree that recovery 
should be handled by each service.
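
To illustrate the "load once, then dole out" arrangement described above, here 
is a rough sketch.  Every class and field name in it is hypothetical; it only 
shows the shape of the design, not the patch's actual types.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical snapshot of everything the store holds.
class HistoryState {
    final List<String> tokens = new ArrayList<>();
    final List<String> futureItems = new ArrayList<>(); // placeholder for later additions
}

// Hypothetical store that counts how many times it is read.
class CountingStateStore {
    int loads = 0;

    HistoryState loadState() {
        loads++;
        HistoryState state = new HistoryState();
        state.tokens.add("token-1");
        state.futureItems.add("item-1");
        return state;
    }
}

// Each consumer takes only the part of the state it cares about.
class SketchTokenManager {
    final List<String> recovered = new ArrayList<>();

    void recover(HistoryState state) {
        recovered.addAll(state.tokens);
    }
}

class SketchFutureService {
    final List<String> recovered = new ArrayList<>();

    void recover(HistoryState state) {
        recovered.addAll(state.futureItems);
    }
}

class ServerStartup {
    // The caller reads the store exactly once and hands the result around.
    static void recoverAll(CountingStateStore store,
                           SketchTokenManager tokens,
                           SketchFutureService other) {
        HistoryState state = store.loadState();
        tokens.recover(state);
        other.recover(state);
    }
}
```

With per-service stores, each recover method would instead read its own store 
and the shared HistoryState object would disappear.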


> Support token-preserving restart of history server
> --
>
> Key: MAPREDUCE-5332
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5332
> Project: Hadoop Map/Reduce
> Issue Type: New Feature
> Components: jobhistoryserver
> Reporter: Jason Lowe
> Assignee: Jason Lowe
> Attachments: MAPREDUCE-5332-2.patch, MAPREDUCE-5332-3.patch, 
> MAPREDUCE-5332-4.patch, MAPREDUCE-5332-5.patch, MAPREDUCE-5332-5.patch, 
> MAPREDUCE-5332-6.patch, MAPREDUCE-5332-7.patch, MAPREDUCE-5332.patch
>
>
> To better support rolling upgrades through a cluster, the history server 
> needs the ability to restart without losing track of delegation tokens.



[jira] [Updated] (MAPREDUCE-5332) Support token-preserving restart of history server

2013-09-13 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-5332:
--

Attachment: MAPREDUCE-5332-6.patch

Minor tweak to the patch to set the permissions on the file during the create, 
which should reduce the number of RPC calls when using HDFS as the filesystem.
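
The difference can be sketched with java.nio.file on a POSIX local filesystem, 
used here only as an analogy: on HDFS the create and the permission change 
would each be a separate call to the NameNode.  The class and method names are 
hypothetical, and the owner-only permission value is an assumption.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileAttribute;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

class CreateWithPermissions {
    // Assumed permission: readable and writable by the owner only.
    static final Set<PosixFilePermission> OWNER_ONLY =
        PosixFilePermissions.fromString("rw-------");

    // Two operations: create the file, then change its permissions.
    static void createThenRestrict(Path file) throws IOException {
        Files.createFile(file);
        Files.setPosixFilePermissions(file, OWNER_ONLY);
    }

    // One operation: the permissions are supplied as part of the create.
    static void createRestricted(Path file) throws IOException {
        FileAttribute<Set<PosixFilePermission>> attr =
            PosixFilePermissions.asFileAttribute(OWNER_ONLY);
        Files.createFile(file, attr);
    }
}
```

Both methods end with the same permissions on the file; the second also avoids 
the short window in which the file exists with default permissions.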

> Support token-preserving restart of history server
> --
>
> Key: MAPREDUCE-5332
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5332
> Project: Hadoop Map/Reduce
> Issue Type: New Feature
> Components: jobhistoryserver
> Reporter: Jason Lowe
> Assignee: Jason Lowe
> Attachments: MAPREDUCE-5332-2.patch, MAPREDUCE-5332-3.patch, 
> MAPREDUCE-5332-4.patch, MAPREDUCE-5332-5.patch, MAPREDUCE-5332-5.patch, 
> MAPREDUCE-5332-6.patch, MAPREDUCE-5332.patch
>
>
> To better support rolling upgrades through a cluster, the history server 
> needs the ability to restart without losing track of delegation tokens.



[jira] [Updated] (MAPREDUCE-5332) Support token-preserving restart of history server

2013-09-12 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-5332:
--

Attachment: MAPREDUCE-5332-5.patch

Wow, that's a lot of test breakage.

None of the test failures appear to be related to this change.  Many of them 
are failing with OOM errors due to too many threads; I suspect this is caused 
by lingering AMs like those reported in MAPREDUCE-5501 and YARN-1183.  Also, 
I'm able to reproduce many of the failures on trunk without this patch.

Uploading the same patch again to see if we can get a clean(er) run this time.

> Support token-preserving restart of history server
> --
>
> Key: MAPREDUCE-5332
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5332
> Project: Hadoop Map/Reduce
> Issue Type: New Feature
> Components: jobhistoryserver
> Reporter: Jason Lowe
> Assignee: Jason Lowe
> Attachments: MAPREDUCE-5332-2.patch, MAPREDUCE-5332-3.patch, 
> MAPREDUCE-5332-4.patch, MAPREDUCE-5332-5.patch, MAPREDUCE-5332-5.patch, 
> MAPREDUCE-5332.patch
>
>
> To better support rolling upgrades through a cluster, the history server 
> needs the ability to restart without losing track of delegation tokens.



[jira] [Updated] (MAPREDUCE-5332) Support token-preserving restart of history server

2013-09-12 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-5332:
--

Attachment: MAPREDUCE-5332-5.patch

Updating the patch to use temporary files when creating key and token files.  
This prevents the recovery from seeing a partially-written file if we crash in 
the middle of a write.
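
A rough sketch of the temporary-file approach, again using java.nio.file in 
place of the Hadoop FileSystem API.  The "tmp-" prefix, the class name, and the 
choice to simply skip leftover temporary files during recovery are assumptions 
for illustration.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

class TempFileWriter {
    static final String TMP_PREFIX = "tmp-"; // hypothetical naming convention

    // Write under a temporary name first.  A crash during the write leaves
    // only the temporary file; the final name appears after the data is
    // fully written and the file is renamed.
    static void writeFile(Path target, byte[] data) throws IOException {
        Path tmp = target.resolveSibling(TMP_PREFIX + target.getFileName());
        Files.write(tmp, data);
        Files.move(tmp, target);
    }

    // Recovery only considers files that made it to their final name.
    static List<Path> recover(Path dir) throws IOException {
        List<Path> complete = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                String name = entry.getFileName().toString();
                if (!name.startsWith(TMP_PREFIX)) {
                    complete.add(entry);
                }
            }
        }
        return complete;
    }
}
```

A leftover "tmp-" file in the directory then signals an interrupted write, and 
recovery never tries to parse it as a key or token.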

Also extended the unit tests to check for correct behavior on redundant key and 
token stores.

> Support token-preserving restart of history server
> --
>
> Key: MAPREDUCE-5332
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5332
> Project: Hadoop Map/Reduce
> Issue Type: New Feature
> Components: jobhistoryserver
> Reporter: Jason Lowe
> Assignee: Jason Lowe
> Attachments: MAPREDUCE-5332-2.patch, MAPREDUCE-5332-3.patch, 
> MAPREDUCE-5332-4.patch, MAPREDUCE-5332-5.patch, MAPREDUCE-5332.patch
>
>
> To better support rolling upgrades through a cluster, the history server 
> needs the ability to restart without losing track of delegation tokens.



[jira] [Updated] (MAPREDUCE-5332) Support token-preserving restart of history server

2013-09-11 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-5332:
--

Attachment: MAPREDUCE-5332-4.patch

Updated patch to address Daryn's comments.  Summary of changes:

* Prefixes prefixed
* getBucketPath div-to-mod fix
* state stores renamed to state store services
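
For readers unfamiliar with the div-versus-mod distinction in bucket 
selection, here is a small sketch.  It assumes that buckets are chosen from a 
token's sequence number and that sequence numbers are non-negative; the bucket 
count, name format, and method names are hypothetical and may not match the 
patch.

```java
class TokenBuckets {
    static final int NUM_BUCKETS = 1000; // assumed value

    // Division: the result keeps growing with the sequence number, so the
    // number of distinct buckets is unbounded.
    static int bucketByDiv(int sequenceNumber) {
        return sequenceNumber / NUM_BUCKETS;
    }

    // Modulo: the result always falls in [0, NUM_BUCKETS), so a fixed set
    // of bucket directories is enough.
    static int bucketByMod(int sequenceNumber) {
        return sequenceNumber % NUM_BUCKETS;
    }

    static String bucketName(int sequenceNumber) {
        return String.format("tb_%03d", bucketByMod(sequenceNumber));
    }
}
```

With these definitions, sequence number 123456 maps to bucket 456 under modulo 
but to bucket 123 under division, and larger sequence numbers push the 
division result past NUM_BUCKETS.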

> Support token-preserving restart of history server
> --
>
> Key: MAPREDUCE-5332
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5332
> Project: Hadoop Map/Reduce
> Issue Type: New Feature
> Components: jobhistoryserver
> Reporter: Jason Lowe
> Assignee: Jason Lowe
> Attachments: MAPREDUCE-5332-2.patch, MAPREDUCE-5332-3.patch, 
> MAPREDUCE-5332-4.patch, MAPREDUCE-5332.patch
>
>
> To better support rolling upgrades through a cluster, the history server 
> needs the ability to restart without losing track of delegation tokens.



[jira] [Updated] (MAPREDUCE-5332) Support token-preserving restart of history server

2013-08-21 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-5332:
--

Attachment: MAPREDUCE-5332-3.patch

Updated patch based on similar changes in YARN-1082:

* HistoryServerStateStorage is now a service
* Moved filesystem startup to the startStorage method
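
The effect of moving filesystem startup into a start hook can be sketched as 
follows.  This is a simplified stand-in written for illustration, not Hadoop's 
actual service API; apart from the startStorage name mentioned above, every 
name here is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified lifecycle: construction does no I/O, start() opens storage.
abstract class StateStoreService {
    private boolean started = false;

    final void start() {
        if (started) {
            throw new IllegalStateException("already started");
        }
        startStorage();
        started = true;
    }

    final void stop() {
        if (started) {
            closeStorage();
            started = false;
        }
    }

    boolean isStarted() {
        return started;
    }

    protected abstract void startStorage();

    protected abstract void closeStorage();
}

// Records lifecycle events instead of touching a real filesystem.
class RecordingStateStore extends StateStoreService {
    final List<String> events = new ArrayList<>();

    RecordingStateStore() {
        events.add("constructed"); // no filesystem work happens here
    }

    @Override
    protected void startStorage() {
        events.add("filesystem opened");
    }

    @Override
    protected void closeStorage() {
        events.add("filesystem closed");
    }
}
```

Deferring the filesystem work to the start hook means that a store can be 
constructed and configured without any I/O, and that failures to reach the 
filesystem surface when the service starts.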

> Support token-preserving restart of history server
> --
>
> Key: MAPREDUCE-5332
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5332
> Project: Hadoop Map/Reduce
> Issue Type: New Feature
> Components: jobhistoryserver
> Reporter: Jason Lowe
> Assignee: Jason Lowe
> Attachments: MAPREDUCE-5332-2.patch, MAPREDUCE-5332-3.patch, 
> MAPREDUCE-5332.patch
>
>
> To better support rolling upgrades through a cluster, the history server 
> needs the ability to restart without losing track of delegation tokens.



[jira] [Updated] (MAPREDUCE-5332) Support token-preserving restart of history server

2013-08-15 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-5332:
--

Attachment: MAPREDUCE-5332-2.patch

Fixing release audit warning.  The TestUberAM timeout has been occurring on 
trunk for quite a while and is unrelated to this change.

> Support token-preserving restart of history server
> --
>
> Key: MAPREDUCE-5332
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5332
> Project: Hadoop Map/Reduce
> Issue Type: New Feature
> Components: jobhistoryserver
> Reporter: Jason Lowe
> Assignee: Jason Lowe
> Attachments: MAPREDUCE-5332-2.patch, MAPREDUCE-5332.patch
>
>
> To better support rolling upgrades through a cluster, the history server 
> needs the ability to restart without losing track of delegation tokens.



[jira] [Updated] (MAPREDUCE-5332) Support token-preserving restart of history server

2013-08-15 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-5332:
--

Status: Patch Available  (was: Open)

> Support token-preserving restart of history server
> --
>
> Key: MAPREDUCE-5332
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5332
> Project: Hadoop Map/Reduce
> Issue Type: New Feature
> Components: jobhistoryserver
> Reporter: Jason Lowe
> Assignee: Jason Lowe
> Attachments: MAPREDUCE-5332.patch
>
>
> To better support rolling upgrades through a cluster, the history server 
> needs the ability to restart without losing track of delegation tokens.



[jira] [Updated] (MAPREDUCE-5332) Support token-preserving restart of history server

2013-08-15 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-5332:
--

Attachment: MAPREDUCE-5332.patch

Patch that adds token persistence in a manner similar to how it is done for 
the RM.  One major difference is that an error in the token persistence layer 
is not fatal, as it is for the RM.  My thinking is that it would be better for 
the history server to stay up than to fall over on any filesystem hiccup.  It's 
easy to change this if people think the history server should crash when this 
occurs.
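
A minimal sketch of what "not fatal" could look like, assuming the token 
manager keeps tokens in memory and treats the persistent store as best-effort. 
The interface, class, and counter below are hypothetical and are not the 
patch's actual types.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical persistence interface.
interface TokenStore {
    void storeToken(String tokenId, long renewDate) throws IOException;
}

class TolerantTokenManager {
    private final Map<String, Long> tokens = new HashMap<>();
    private final TokenStore store;
    int persistFailures = 0;

    TolerantTokenManager(TokenStore store) {
        this.store = store;
    }

    void addToken(String tokenId, long renewDate) {
        // The in-memory copy is updated first, so the server keeps working
        // with the token even if the store is unavailable.
        tokens.put(tokenId, renewDate);
        try {
            store.storeToken(tokenId, renewDate);
        } catch (IOException e) {
            // Log and continue instead of bringing the server down.
            persistFailures++;
            System.err.println("Unable to store token " + tokenId + ": " + e.getMessage());
        }
    }

    boolean hasToken(String tokenId) {
        return tokens.containsKey(tokenId);
    }
}
```

The tradeoff is that a token whose write failed is usable until the next 
restart but would not be recovered afterwards.  Making the error fatal would 
amount to rethrowing from the catch block.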

> Support token-preserving restart of history server
> --
>
> Key: MAPREDUCE-5332
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5332
> Project: Hadoop Map/Reduce
> Issue Type: New Feature
> Components: jobhistoryserver
> Reporter: Jason Lowe
> Assignee: Jason Lowe
> Attachments: MAPREDUCE-5332.patch
>
>
> To better support rolling upgrades through a cluster, the history server 
> needs the ability to restart without losing track of delegation tokens.
