[jira] [Commented] (YARN-5566) client-side NM graceful decom doesn't trigger when jobs finish

2016-08-31 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15452636#comment-15452636
 ] 

Karthik Kambatla commented on YARN-5566:


+1. Will commit this later today. 

[~djp] - could you take a quick look?

> client-side NM graceful decom doesn't trigger when jobs finish
> --
>
> Key: YARN-5566
> URL: https://issues.apache.org/jira/browse/YARN-5566
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Affects Versions: 2.8.0
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: YARN-5566.001.patch, YARN-5566.002.patch, 
> YARN-5566.003.patch, YARN-5566.004.patch
>
>
> I was testing the client-side NM graceful decommission and noticed that it 
> was always waiting for the timeout, even if all jobs running on that node (or 
> even the cluster) had already finished.
> For example:
> # JobA is running with at least one container on NodeA
> # User runs client-side decom on NodeA at 5:00am with a timeout of 3 hours 
> --> NodeA enters DECOMMISSIONING state
> # JobA finishes at 6:00am and there are no other jobs running on NodeA
> # User's client reaches the timeout at 8:00am, and forcibly decommissions 
> NodeA
> NodeA should have decommissioned at 6:00am.
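A minimal sketch of the intended early-exit behavior (pollNodeState and forceDecommission are illustrative placeholders, not the actual YARN client API):
{code}
// Hypothetical sketch: poll the node instead of sleeping until the deadline,
// so decommission completes as soon as the RM reports DECOMMISSIONED.
long deadline = System.currentTimeMillis() + timeoutMs;
while (System.currentTimeMillis() < deadline) {
  if (pollNodeState(nodeId) == NodeState.DECOMMISSIONED) {
    return;  // all containers finished early; no need to wait for the timeout
  }
  Thread.sleep(pollIntervalMs);
}
forceDecommission(nodeId);  // timeout reached, decommission forcibly
{code}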






[jira] [Commented] (YARN-5585) [Atsv2] Add a new filter fromId in REST endpoints

2016-08-31 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15452583#comment-15452583
 ] 

Varun Saxena commented on YARN-5585:


If the use case is only for apps, then the row keys in the application table are 
stored in sorted (descending) order within the scope of a flow / flow run, 
and we can easily support fromId along with limit to achieve some sort of 
pagination here without any performance penalty.

However, the problem with this kind of approach is that new apps keep getting 
added, so the result may not be the latest. For instance, suppose there are 100 apps 
app100-app1 in ATS and we show 10 apps on each page. Then, if we move to page 3 
we will show apps app80-app71, but it is possible that, say, 5 more apps get 
added in the meantime, i.e. we now have app105 to app1 in ATS.
Ideally page 3 should then show app85-app76.

Entities in the entity table, though, are not sorted, because an entity could be 
anything. If we have a similar use case for containers, we can consider separating 
them out into a different table with special handling. But there should be a 
use case for it.

> [Atsv2] Add a new filter fromId in REST endpoints
> -
>
> Key: YARN-5585
> URL: https://issues.apache.org/jira/browse/YARN-5585
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelinereader
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>
> The TimelineReader REST API provides a lot of filters to retrieve 
> applications. Along with those, it would be good to add a new filter, i.e. fromId, 
> so that entities can be retrieved after the fromId. 
> Example: if applications app-1, app-2, ..., app-10 are stored in the database, 
> *getApps?limit=5* gives app-1 to app-5, but retrieving the next 5 apps is 
> difficult.
> So the proposal is to have fromId in the filter, like 
> *getApps?limit=5&&fromId=app-5*, which gives the list of apps from app-6 to 
> app-10. 
> This is very useful for pagination in the web UI.
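For illustration, the paginated flow proposed above would then be (notation as in the description):
{code}
getApps?limit=5               -> app-1 ... app-5
getApps?limit=5&&fromId=app-5 -> app-6 ... app-10
{code}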






[jira] [Comment Edited] (YARN-5585) [Atsv2] Add a new filter fromId in REST endpoints

2016-08-31 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15452583#comment-15452583
 ] 

Varun Saxena edited comment on YARN-5585 at 8/31/16 3:51 PM:
-

If the use case is only for apps, then the row keys in the application table are 
stored in sorted (descending) order within the scope of a flow / flow run, 
and we can easily support fromId along with limit to achieve some sort of 
pagination here without any performance penalty.

However, the problem with this kind of approach is that new apps keep getting 
added, so the result may not be the latest. For instance, suppose there are 100 apps 
app100-app1 in ATS and we show 10 apps on each page. Then, if we move to page 3 
we will show apps app80-app71, but it is possible that, say, 5 more apps get 
added in the meantime, i.e. we now have app105 to app1 in ATS.
Ideally page 3 should then show app85-app76. But I guess this would have 
already been considered.

Entities in the entity table, though, are not sorted, because an entity could be 
anything. If we have a similar use case for containers, we can consider separating 
them out into a different table with special handling. But there should be a 
use case for it.


was (Author: varun_saxena):
If the use case is only for apps then the row keys in application table are 
stored in sorted manner (in descending order) within the scope of a flow / flow 
run.
And we can easily support fromId alongwith limit to achieve some sort of 
pagination here without any performance penalty.

However, the problem with this kind of an approach is that new apps keep on 
getting added so result may not be latest. For instance, if there are 100 apps 
app100-app1 in ATS and we show 10 apps on each page. Then, if we move to page 3 
we will show apps from app80-app71 but it is possible that say 5 more apps get 
added in the meantime i.e. we not have app105 to app1 in ATS.
Ideally page 3 should then show app85-app76.

Entities in entity table though are not sorted because entity could be anything.
If we have a similar use case for containers, we can consider separating it out 
to a different table and have special handling for it. But there should be a 
use case for it.

> [Atsv2] Add a new filter fromId in REST endpoints
> -
>
> Key: YARN-5585
> URL: https://issues.apache.org/jira/browse/YARN-5585
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelinereader
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>
> The TimelineReader REST API provides a lot of filters to retrieve 
> applications. Along with those, it would be good to add a new filter, i.e. fromId, 
> so that entities can be retrieved after the fromId. 
> Example: if applications app-1, app-2, ..., app-10 are stored in the database, 
> *getApps?limit=5* gives app-1 to app-5, but retrieving the next 5 apps is 
> difficult.
> So the proposal is to have fromId in the filter, like 
> *getApps?limit=5&&fromId=app-5*, which gives the list of apps from app-6 to 
> app-10. 
> This is very useful for pagination in the web UI.






[jira] [Commented] (YARN-5606) Support multi-label merge into one node label

2016-08-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15452592#comment-15452592
 ] 

Hadoop QA commented on YARN-5606:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 24s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
1s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 25s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
44s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 35s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
43s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
58s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 7s 
{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 20s 
{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. 
{color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 1m 29s 
{color} | {color:red} hadoop-yarn in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 1m 29s {color} 
| {color:red} hadoop-yarn in the patch failed. {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 42s 
{color} | {color:red} hadoop-yarn-project/hadoop-yarn: The patch generated 4 
new + 457 unchanged - 0 fixed = 461 total (was 457) {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 21s 
{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. 
{color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
36s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s 
{color} | {color:red} The patch has 4 line(s) that end in whitespace. Use git 
apply --whitespace=fix. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 16s 
{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 59s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 23s {color} 
| {color:red} hadoop-yarn-api in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 18s 
{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 20s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
17s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 30m 1s {color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.conf.TestYarnConfigurationFields |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12826444/YARN-5606.001.patch |
| JIRA Issue | YARN-5606 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 2d4460228c17 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 20ae1fa |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| mvninstall | 
https://builds.apache.org/job/PreCommit-YARN-Build/12970/artifact/patch

[jira] [Created] (YARN-5607) Document TestContainerResourceUsage#waitForContainerCompletion

2016-08-31 Thread Karthik Kambatla (JIRA)
Karthik Kambatla created YARN-5607:
--

 Summary: Document 
TestContainerResourceUsage#waitForContainerCompletion
 Key: YARN-5607
 URL: https://issues.apache.org/jira/browse/YARN-5607
 Project: Hadoop YARN
  Issue Type: Test
  Components: resourcemanager, test
Affects Versions: 2.9.0
Reporter: Karthik Kambatla


The logic in TestContainerResourceUsage#waitForContainerCompletion (introduced 
in YARN-5024) is not immediately obvious. It could use some documentation. 
Also, this seems like a useful helper method. Should this be moved to one of 
the mock classes or to a util class? 






[jira] [Commented] (YARN-5606) Support multi-label merge into one node label

2016-08-31 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15452534#comment-15452534
 ] 

Naganarasimha G R commented on YARN-5606:
-

Hi [~wjlei],
This seems to be a temporary solution for the Constraint Label feature 
(YARN-3409).
But as I am working on a temporary patch and design for the same, I think this 
feature can be avoided. I will update this JIRA once I post the temporary patch 
and design doc in YARN-3409.

> Support multi-label merge into one node label
> -
>
> Key: YARN-5606
> URL: https://issues.apache.org/jira/browse/YARN-5606
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: jialei weng
> Attachments: YARN-5606.001.patch
>
>
> Support merging multiple labels into one node label.
> 1. We want to support merging multiple labels, such as SSD, GPU, and FPGA, into a 
> single node label, joined by &, e.g. SSD,GPU,FPGA -> SSD&GPU&FPGA.
> 2. We add wildcard matching to extend the job request. We define a wildcard 
> like *GPU*, which will match all node labels that have GPU as part of the 
> merged multi-label. For example, *GPU* will match SSD&GPU&FPGA and GPU&FPGA. We 
> define SSD&GPU&FPGA={SSD,GPU,FPGA}, and GPU is one of {SSD,GPU,FPGA}, so the 
> job can run on an SSD&GPU&FPGA node.






[jira] [Updated] (YARN-4752) [Umbrella] FairScheduler: Improve preemption

2016-08-31 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-4752:
---
Attachment: (was: YARN-4752.FairSchedulerPreemptionOverhaul.pdf)

> [Umbrella] FairScheduler: Improve preemption
> 
>
> Key: YARN-4752
> URL: https://issues.apache.org/jira/browse/YARN-4752
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.8.0
>Reporter: Karthik Kambatla
> Attachments: YARN-4752.FairSchedulerPreemptionOverhaul.pdf
>
>
> A number of issues have been reported with respect to preemption in 
> FairScheduler along the lines of:
> # FairScheduler preempts resources from nodes even if the resultant free 
> resources cannot fit the incoming request.
> # Preemption doesn't preempt from sibling queues
> # Preemption doesn't preempt from sibling apps under the same queue that is 
> over its fairshare
> # ...
> Filing this umbrella JIRA to group all the issues together and think of a 
> comprehensive solution.






[jira] [Updated] (YARN-4752) [Umbrella] FairScheduler: Improve preemption

2016-08-31 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-4752:
---
Attachment: YARN-4752.FairSchedulerPreemptionOverhaul.pdf

> [Umbrella] FairScheduler: Improve preemption
> 
>
> Key: YARN-4752
> URL: https://issues.apache.org/jira/browse/YARN-4752
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.8.0
>Reporter: Karthik Kambatla
> Attachments: YARN-4752.FairSchedulerPreemptionOverhaul.pdf, 
> YARN-4752.FairSchedulerPreemptionOverhaul.pdf
>
>
> A number of issues have been reported with respect to preemption in 
> FairScheduler along the lines of:
> # FairScheduler preempts resources from nodes even if the resultant free 
> resources cannot fit the incoming request.
> # Preemption doesn't preempt from sibling queues
> # Preemption doesn't preempt from sibling apps under the same queue that is 
> over its fairshare
> # ...
> Filing this umbrella JIRA to group all the issues together and think of a 
> comprehensive solution.






[jira] [Updated] (YARN-5606) Support multi-label merge into one node label

2016-08-31 Thread jialei weng (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jialei weng updated YARN-5606:
--
Attachment: YARN-5606.001.patch

> Support multi-label merge into one node label
> -
>
> Key: YARN-5606
> URL: https://issues.apache.org/jira/browse/YARN-5606
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: jialei weng
> Attachments: YARN-5606.001.patch
>
>
> Support merging multiple labels into one node label.
> 1. We want to support merging multiple labels, such as SSD, GPU, and FPGA, into a 
> single node label, joined by &, e.g. SSD,GPU,FPGA -> SSD&GPU&FPGA.
> 2. We add wildcard matching to extend the job request. We define a wildcard 
> like *GPU*, which will match all node labels that have GPU as part of the 
> merged multi-label. For example, *GPU* will match SSD&GPU&FPGA and GPU&FPGA. We 
> define SSD&GPU&FPGA={SSD,GPU,FPGA}, and GPU is one of {SSD,GPU,FPGA}, so the 
> job can run on an SSD&GPU&FPGA node.






[jira] [Created] (YARN-5606) Support multi-label merge into one node label

2016-08-31 Thread jialei weng (JIRA)
jialei weng created YARN-5606:
-

 Summary: Support multi-label merge into one node label
 Key: YARN-5606
 URL: https://issues.apache.org/jira/browse/YARN-5606
 Project: Hadoop YARN
  Issue Type: New Feature
Reporter: jialei weng


Support merging multiple labels into one node label.
1. We want to support merging multiple labels, such as SSD, GPU, and FPGA, into a 
single node label, joined by &, e.g. SSD,GPU,FPGA -> SSD&GPU&FPGA.
2. We add wildcard matching to extend the job request. We define a wildcard like 
*GPU*, which will match all node labels that have GPU as part of the merged 
multi-label. For example, *GPU* will match SSD&GPU&FPGA and GPU&FPGA. We define 
SSD&GPU&FPGA={SSD,GPU,FPGA}, and GPU is one of {SSD,GPU,FPGA}, so the job can 
run on an SSD&GPU&FPGA node.
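A minimal sketch of the merge and wildcard-match semantics described above (assumed semantics, not taken from the attached patch):
{code}
// Hypothetical illustration of the proposal.
import java.util.Arrays;
import java.util.List;

public class MergedLabels {
  // SSD,GPU,FPGA -> SSD&GPU&FPGA
  static String merge(List<String> labels) {
    return String.join("&", labels);
  }

  // *GPU* matches any merged label that contains GPU as one of its parts
  static boolean wildcardMatches(String wildcard, String mergedLabel) {
    String needle = wildcard.replace("*", "");
    return Arrays.asList(mergedLabel.split("&")).contains(needle);
  }

  public static void main(String[] args) {
    System.out.println(merge(Arrays.asList("SSD", "GPU", "FPGA"))); // SSD&GPU&FPGA
    System.out.println(wildcardMatches("*GPU*", "SSD&GPU&FPGA"));   // true
    System.out.println(wildcardMatches("*GPU*", "SSD&FPGA"));       // false
  }
}
{code}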






[jira] [Commented] (YARN-5594) Handle old data format while recovering RM

2016-08-31 Thread Tatyana But (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15452052#comment-15452052
 ] 

Tatyana But commented on YARN-5594:
---

That was a quick fix because we had to restore our cluster as soon as possible.

Maybe we should provide some version marker to identify the RMDelegationToken* 
file format. But I am not sure how we would handle files that already exist.

> Handle old data format while recovering RM
> --
>
> Key: YARN-5594
> URL: https://issues.apache.org/jira/browse/YARN-5594
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.0
>Reporter: Tatyana But
> Attachments: YARN-5594.001.patch
>
>
> We got this error after upgrading a cluster from v2.5.1 to v2.7.0.
> {noformat}
> 2016-08-25 17:20:33,293 ERROR
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Failed to
> load/recover state
> com.google.protobuf.InvalidProtocolBufferException: Protocol message contained
> an invalid tag (zero).
> at 
> com.google.protobuf.InvalidProtocolBufferException.invalidTag(InvalidProtocolBufferException.java:89)
> at com.google.protobuf.CodedInputStream.readTag(CodedInputStream.java:108)
> at 
> org.apache.hadoop.yarn.proto.YarnServerResourceManagerRecoveryProtos$RMDelegationTokenIdentifierDataProto.<init>(YarnServerResourceManagerRecoveryProtos.java:4680)
> at 
> org.apache.hadoop.yarn.proto.YarnServerResourceManagerRecoveryProtos$RMDelegationTokenIdentifierDataProto.<init>(YarnServerResourceManagerRecoveryProtos.java:4644)
> at 
> org.apache.hadoop.yarn.proto.YarnServerResourceManagerRecoveryProtos$RMDelegationTokenIdentifierDataProto$1.parsePartialFrom(YarnServerResourceManagerRecoveryProtos.java:4740)
> at 
> org.apache.hadoop.yarn.proto.YarnServerResourceManagerRecoveryProtos$RMDelegationTokenIdentifierDataProto$1.parsePartialFrom(YarnServerResourceManagerRecoveryProtos.java:4735)
> at 
> org.apache.hadoop.yarn.proto.YarnServerResourceManagerRecoveryProtos$RMDelegationTokenIdentifierDataProto$Builder.mergeFrom(YarnServerResourceManagerRecoveryProtos.java:5075)
> at 
> org.apache.hadoop.yarn.proto.YarnServerResourceManagerRecoveryProtos$RMDelegationTokenIdentifierDataProto$Builder.mergeFrom(YarnServerResourceManagerRecoveryProtos.java:4955)
> at 
> com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:337)
> at 
> com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:267)
> at 
> com.google.protobuf.AbstractMessageLite$Builder.mergeFrom(AbstractMessageLite.java:210)
> at 
> com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:904)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.records.RMDelegationTokenIdentifierData.readFields(RMDelegationTokenIdentifierData.java:43)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.loadRMDTSecretManagerState(FileSystemRMStateStore.java:355)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.loadState(FileSystemRMStateStore.java:199)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:587)
> at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:1007)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1048)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1044)
> {noformat}
> The reason for this problem is that these Hadoop versions use different formats 
> for the files 
> /var/mapr/cluster/yarn/rm/system/FSRMStateRoot/RMDTSecretManagerRoot/RMDelegationToken*.
> This fix handles the old data format during RM recovery if an 
> InvalidProtocolBufferException occurs.
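A minimal sketch of the fallback described above (readFieldsOldFormat is a hypothetical name for the legacy parsing path; the actual change is in the attached patch):
{code}
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import com.google.protobuf.InvalidProtocolBufferException;
import org.apache.hadoop.yarn.server.resourcemanager.recovery.records.RMDelegationTokenIdentifierData;

// Try the protobuf format first; fall back to the pre-upgrade format when
// the protobuf parse fails.
RMDelegationTokenIdentifierData readTokenData(byte[] data) throws IOException {
  RMDelegationTokenIdentifierData identifierData =
      new RMDelegationTokenIdentifierData();
  try {
    identifierData.readFields(
        new DataInputStream(new ByteArrayInputStream(data)));
  } catch (InvalidProtocolBufferException e) {
    identifierData.readFieldsOldFormat(  // hypothetical legacy reader
        new DataInputStream(new ByteArrayInputStream(data)));
  }
  return identifierData;
}
{code}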






[jira] [Commented] (YARN-5577) [Atsv2] Document object passing in infofilters with an example

2016-08-31 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15452001#comment-15452001
 ] 

Rohith Sharma K S commented on YARN-5577:
-

[~varun_saxena] If patch is fine, do you mind committing it?

> [Atsv2] Document object passing in infofilters with an example
> --
>
> Key: YARN-5577
> URL: https://issues.apache.org/jira/browse/YARN-5577
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelinereader, timelineserver
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>  Labels: documentation
> Attachments: YARN-5577.patch
>
>
> In HierarchicalTimelineEntity, setParent/addChild allow setting parent/child 
> entities at the INFO level. The key is a string and the value is an object. 
> For example, for a YARN_CONTAINER entity the parent entity is set to the application:
> {code}
> "SYSTEM_INFO_PARENT_ENTITY": {
>"type": "YARN_APPLICATION",
>"id": "application_1471931266232_0024"
>  }
> {code}
> But to use an infofilter on entity type YARN_CONTAINER for a specific 
> applicationId, IIUC there is no way to pass an object as the value in an 
> infofilter. To make retrieval easier, either
> # publish the parent/child entity id and type as strings rather than as an 
> object, like below
> {code}
> "SYSTEM_INFO_PARENT_ENTITY_TYPE": "YARN_APPLICATION"
> "SYSTEM_INFO_PARENT_ENTITY_ID":"application_1471931266232_0024"
> {code}
> OR
> # add the ability to provide an object as a filter, with a format like 
> {{infofilters=SYSTEM_INFO_PARENT_ENTITY eq ((type eq YARN_APPLICATION) AND 
> (id eq application_1471931266232_0024))}}
> I believe the 2nd approach would be applicable to any entity, but I am not 
> sure whether HBase supports such custom filters while scanning a table. 
> The 1st approach would be a much easier change. 
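For illustration, with the 1st option a container query could then filter by parent application using a plain string infofilter (the exact endpoint shape is assumed here):
{code}
GET .../entities/YARN_CONTAINER?infofilters=SYSTEM_INFO_PARENT_ENTITY_ID eq application_1471931266232_0024
{code}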






[jira] [Commented] (YARN-5594) Handle old data format while recovering RM

2016-08-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15451968#comment-15451968
 ] 

Hadoop QA commented on YARN-5594:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
43s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 1s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
25s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 1s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
44s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
20s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 35s 
{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
40s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 0s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 0s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
28s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 3s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
45s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
39s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 37s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 30s 
{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 20s 
{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 34m 19s 
{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
21s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 88m 15s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12826364/YARN-5594.001.patch |
| JIRA Issue | YARN-5594 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 5e3f9d1d3ad4 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 20ae1fa |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/12969/testReport/ |
| modules | C: hadoop-common-project/hadoop-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: . |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/12969/console |
| Powered by | Apache Yet

[jira] [Commented] (YARN-5561) [Atsv2] : Support for ability to retrieve apps/app-attempt/containers and entities via REST

2016-08-31 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15451966#comment-15451966
 ] 

Rohith Sharma K S commented on YARN-5561:
-

bq. One concern I have is, shall we expose the semantics of "containers" and 
"appattempts" in timeline reader API level?
Sorry, I could not understand this fully. I assume you are asking to create 
another web service with a different path, like AHSWebService, which internally 
redirects the calls to getEntities in TimelineReaderWebApp. So there would be 
two @Path values, i.e. */ws/v2/timeline* and */ws/v2/applicationhistory*, exposed 
in ATSv2. Is that right?

bq. Similar to the past AHS APIs, maybe we'd like to abstract a separate layer 
of APIs for YARN to read data from timeline service?
Sorry, I could not follow this one. 

> [Atsv2] : Support for ability to retrieve apps/app-attempt/containers and 
> entities via REST
> ---
>
> Key: YARN-5561
> URL: https://issues.apache.org/jira/browse/YARN-5561
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelinereader
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Attachments: YARN-5561.patch, YARN-5561.v0.patch
>
>
> The ATSv2 model lacks retrieval of {{list-of-all-apps}}, 
> {{list-of-all-app-attempts}} and {{list-of-all-containers-per-attempt}} via 
> REST APIs. It is also required to know about all the entities in an 
> application.
> These URLs are highly required for the web UI.
> The new REST URLs would be 
> # GET {{/ws/v2/timeline/apps}}
> # GET {{/ws/v2/timeline/apps/\{app-id\}/appattempts}}.
> # GET 
> {{/ws/v2/timeline/apps/\{app-id\}/appattempts/\{attempt-id\}/containers}}
> # GET {{/ws/v2/timeline/apps/\{app-id\}/entities}} should display the list of 
> entities that can be queried.
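For illustration, with a concrete application the proposed URLs would be exercised as follows (the attempt id is invented):
{code}
GET /ws/v2/timeline/apps
GET /ws/v2/timeline/apps/application_1471931266232_0024/appattempts
GET /ws/v2/timeline/apps/application_1471931266232_0024/appattempts/appattempt_1471931266232_0024_000001/containers
GET /ws/v2/timeline/apps/application_1471931266232_0024/entities
{code}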






[jira] [Resolved] (YARN-5546) NodeManager crashes due to SIGSEGV

2016-08-31 Thread Daniel Haviv (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Haviv resolved YARN-5546.

Resolution: Not A Problem

> NodeManager crashes due to SIGSEGV
> --
>
> Key: YARN-5546
> URL: https://issues.apache.org/jira/browse/YARN-5546
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.6.0
>Reporter: Daniel Haviv
>
> NodeManager crashes due to SIGSEGV.
> hs_err includes the following java stack:
> j  
> org.fusesource.leveldbjni.internal.NativeDB$DBJNI.Put(JLorg/fusesource/leveldbjni/internal/NativeWriteOptions;Lorg/fusesource/leveldbjni/internal/NativeSlice;Lorg/fu
> sesource/leveldbjni/internal/NativeSlice;)J+0
> j  
> org.fusesource.leveldbjni.internal.NativeDB.put(Lorg/fusesource/leveldbjni/internal/NativeWriteOptions;Lorg/fusesource/leveldbjni/internal/NativeSlice;Lorg/fusesourc
> e/leveldbjni/internal/NativeSlice;)V+11
> j  
> org.fusesource.leveldbjni.internal.NativeDB.put(Lorg/fusesource/leveldbjni/internal/NativeWriteOptions;Lorg/fusesource/leveldbjni/internal/NativeBuffer;Lorg/fusesour
> ce/leveldbjni/internal/NativeBuffer;)V+18
> j  
> org.fusesource.leveldbjni.internal.NativeDB.put(Lorg/fusesource/leveldbjni/internal/NativeWriteOptions;[B[B)V+36
> j  
> org.fusesource.leveldbjni.internal.JniDB.put([B[BLorg/iq80/leveldb/WriteOptions;)Lorg/iq80/leveldb/Snapshot;+28
> j  org.fusesource.leveldbjni.internal.JniDB.put([B[B)V+10
> j  
> org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.storeDeletionTask(ILorg/apache/hadoop/yarn/proto/YarnServerNodemanagerRecoveryProtos$De
> letionServiceDeleteTaskProto;)V+32
> j  
> org.apache.hadoop.yarn.server.nodemanager.DeletionService.recordDeletionTaskInStateStore(Lorg/apache/hadoop/yarn/server/nodemanager/DeletionService$FileDeletionTask;
> )V+245
> j  
> org.apache.hadoop.yarn.server.nodemanager.DeletionService.delete(Ljava/lang/String;Lorg/apache/hadoop/fs/Path;[Lorg/apache/hadoop/fs/Path;)V+44
> j  
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run()V+271
> v  ~StubRoutines::call_stub
> and the culprit seems to be:
> # Problematic frame:
> # C  [libleveldbjni-64-1-5625225739273738004.8+0x2aaac]  
> leveldb::log::Writer::EmitPhysicalRecord(leveldb::log::RecordType, char 
> const*, unsigned long)+0x7c






[jira] [Commented] (YARN-5546) NodeManager crashes due to SIGSEGV

2016-08-31 Thread Daniel Haviv (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15451935#comment-15451935
 ] 

Daniel Haviv commented on YARN-5546:


Apparently this was due to Snoopy (https://github.com/a2o/snoopy)

> NodeManager crashes due to SIGSEGV
> --
>
> Key: YARN-5546
> URL: https://issues.apache.org/jira/browse/YARN-5546
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.6.0
>Reporter: Daniel Haviv
>
> NodeManager crashes due to SIGSEGV.
> hs_err includes the following java stack:
> j  
> org.fusesource.leveldbjni.internal.NativeDB$DBJNI.Put(JLorg/fusesource/leveldbjni/internal/NativeWriteOptions;Lorg/fusesource/leveldbjni/internal/NativeSlice;Lorg/fu
> sesource/leveldbjni/internal/NativeSlice;)J+0
> j  
> org.fusesource.leveldbjni.internal.NativeDB.put(Lorg/fusesource/leveldbjni/internal/NativeWriteOptions;Lorg/fusesource/leveldbjni/internal/NativeSlice;Lorg/fusesourc
> e/leveldbjni/internal/NativeSlice;)V+11
> j  
> org.fusesource.leveldbjni.internal.NativeDB.put(Lorg/fusesource/leveldbjni/internal/NativeWriteOptions;Lorg/fusesource/leveldbjni/internal/NativeBuffer;Lorg/fusesour
> ce/leveldbjni/internal/NativeBuffer;)V+18
> j  
> org.fusesource.leveldbjni.internal.NativeDB.put(Lorg/fusesource/leveldbjni/internal/NativeWriteOptions;[B[B)V+36
> j  
> org.fusesource.leveldbjni.internal.JniDB.put([B[BLorg/iq80/leveldb/WriteOptions;)Lorg/iq80/leveldb/Snapshot;+28
> j  org.fusesource.leveldbjni.internal.JniDB.put([B[B)V+10
> j  
> org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.storeDeletionTask(ILorg/apache/hadoop/yarn/proto/YarnServerNodemanagerRecoveryProtos$De
> letionServiceDeleteTaskProto;)V+32
> j  
> org.apache.hadoop.yarn.server.nodemanager.DeletionService.recordDeletionTaskInStateStore(Lorg/apache/hadoop/yarn/server/nodemanager/DeletionService$FileDeletionTask;
> )V+245
> j  
> org.apache.hadoop.yarn.server.nodemanager.DeletionService.delete(Ljava/lang/String;Lorg/apache/hadoop/fs/Path;[Lorg/apache/hadoop/fs/Path;)V+44
> j  
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run()V+271
> v  ~StubRoutines::call_stub
> and the culprit seems to be:
> # Problematic frame:
> # C  [libleveldbjni-64-1-5625225739273738004.8+0x2aaac]  
> leveldb::log::Writer::EmitPhysicalRecord(leveldb::log::RecordType, char 
> const*, unsigned long)+0x7c






[jira] [Commented] (YARN-5585) [Atsv2] Add a new filter fromId in REST endpoints

2016-08-31 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15451835#comment-15451835
 ] 

Rohith Sharma K S commented on YARN-5585:
-

bq. entities may not stored in their creation time order (although may quite 
possibly be so). Therefore this is slightly different to the fromId in ATS v1, 
where it will "retrieve entities earlier than and including the specified ID".
Right, entities are not stored in creation time order. But once rows are 
retrieved from HBase, they are sorted as per TimelineEntity#compareTo. 
Say HBase stores entities in a row and fetches them in some arbitrary order; 
after sorting, they are returned in sorted order, so I think the behavior 
remains the same. Here, with fromId=e3, e2 and e1 can be returned. 

I think I see one potential issue because *entities may not be stored in their 
creation time order*: the filters *createdtimestart* and *createdtimeend* might 
not work properly. Say these two filters are set to c1 and c5 respectively; 
then HBase will return an empty result scanner, I think, since c1 comes after 
c5 in a row. This behavior needs to be confirmed from the HBase end. Testing 
this is blocked by YARN-5094 :-(
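A minimal sketch of the reader-side ordering described above (illustrative only; fetchedRows stands in for the HBase scan result):
{code}
// Rows come back from HBase in row-key order, which need not be
// creation-time order; the reader sorts before applying fromId and limit.
List<TimelineEntity> entities = new ArrayList<>(fetchedRows);
Collections.sort(entities);  // ordering defined by TimelineEntity#compareTo
// With fromId=e3: skip entities up to and including e3, then apply the
// limit, e.g. returning e2, e1 for a descending ordering.
{code}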

> [Atsv2] Add a new filter fromId in REST endpoints
> -
>
> Key: YARN-5585
> URL: https://issues.apache.org/jira/browse/YARN-5585
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelinereader
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>
> The TimelineReader REST API provides a lot of filters to retrieve 
> applications. Along with those, it would be good to add a new filter, i.e. fromId, 
> so that entities can be retrieved after the fromId. 
> Example: if applications app-1, app-2, ..., app-10 are stored in the database, 
> *getApps?limit=5* gives app-1 to app-5, but retrieving the next 5 apps is 
> difficult.
> So the proposal is to have fromId in the filter, like 
> *getApps?limit=5&&fromId=app-5*, which gives the list of apps from app-6 to 
> app-10. 
> This is very useful for pagination in the web UI.






[jira] [Commented] (YARN-1963) Support priorities across applications within the same queue

2016-08-31 Thread stefanlee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15451780#comment-15451780
 ] 

stefanlee commented on YARN-1963:
-

[~sunilg] Thanks for your JIRA. I have a doubt about "creating multiple queues 
and making users submit applications to higher priority and lower priority 
queues separately" in your doc. Does it mean we create multiple queues, e.g. 
queue A and queue B, then label A the higher priority queue and B the lower 
priority queue, and after that a user can submit a higher priority application 
to A? But looking at your code, it seems a user can submit applications of 
different priorities to the same queue; the queue has a default priority, and 
the cluster has a max priority.

> Support priorities across applications within the same queue 
> -
>
> Key: YARN-1963
> URL: https://issues.apache.org/jira/browse/YARN-1963
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: api, resourcemanager
>Reporter: Arun C Murthy
>Assignee: Sunil G
> Attachments: 0001-YARN-1963-prototype.patch, YARN Application 
> Priorities Design _02.pdf, YARN Application Priorities Design.pdf, YARN 
> Application Priorities Design_01.pdf
>
>
> It will be very useful to support priorities among applications within the 
> same queue, particularly in production scenarios. It allows for finer-grained 
> controls without having to force admins to create a multitude of queues, plus 
> allows existing applications to continue using existing queues which are 
> usually part of institutional memory.






[jira] [Updated] (YARN-5594) Handle old data format while recovering RM

2016-08-31 Thread Tatyana But (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tatyana But updated YARN-5594:
--
Attachment: YARN-5594.001.patch

> Handle old data format while recovering RM
> --
>
> Key: YARN-5594
> URL: https://issues.apache.org/jira/browse/YARN-5594
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.0
>Reporter: Tatyana But
> Attachments: YARN-5594.001.patch
>
>
> We got this error after upgrading a cluster from v2.5.1 to v2.7.0.
> {noformat}
> 2016-08-25 17:20:33,293 ERROR
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Failed to
> load/recover state
> com.google.protobuf.InvalidProtocolBufferException: Protocol message contained
> an invalid tag (zero).
> at 
> com.google.protobuf.InvalidProtocolBufferException.invalidTag(InvalidProtocolBufferException.java:89)
> at com.google.protobuf.CodedInputStream.readTag(CodedInputStream.java:108)
> at 
> org.apache.hadoop.yarn.proto.YarnServerResourceManagerRecoveryProtos$RMDelegationTokenIdentifierDataProto.<init>(YarnServerResourceManagerRecoveryProtos.java:4680)
> at 
> org.apache.hadoop.yarn.proto.YarnServerResourceManagerRecoveryProtos$RMDelegationTokenIdentifierDataProto.<init>(YarnServerResourceManagerRecoveryProtos.java:4644)
> at 
> org.apache.hadoop.yarn.proto.YarnServerResourceManagerRecoveryProtos$RMDelegationTokenIdentifierDataProto$1.parsePartialFrom(YarnServerResourceManagerRecoveryProtos.java:4740)
> at 
> org.apache.hadoop.yarn.proto.YarnServerResourceManagerRecoveryProtos$RMDelegationTokenIdentifierDataProto$1.parsePartialFrom(YarnServerResourceManagerRecoveryProtos.java:4735)
> at 
> org.apache.hadoop.yarn.proto.YarnServerResourceManagerRecoveryProtos$RMDelegationTokenIdentifierDataProto$Builder.mergeFrom(YarnServerResourceManagerRecoveryProtos.java:5075)
> at 
> org.apache.hadoop.yarn.proto.YarnServerResourceManagerRecoveryProtos$RMDelegationTokenIdentifierDataProto$Builder.mergeFrom(YarnServerResourceManagerRecoveryProtos.java:4955)
> at 
> com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:337)
> at 
> com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:267)
> at 
> com.google.protobuf.AbstractMessageLite$Builder.mergeFrom(AbstractMessageLite.java:210)
> at 
> com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:904)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.records.RMDelegationTokenIdentifierData.readFields(RMDelegationTokenIdentifierData.java:43)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.loadRMDTSecretManagerState(FileSystemRMStateStore.java:355)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.loadState(FileSystemRMStateStore.java:199)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:587)
> at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:1007)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1048)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1044)
> {noformat}
> The reason for this problem is that these Hadoop versions use different formats 
> for the files 
> /var/mapr/cluster/yarn/rm/system/FSRMStateRoot/RMDTSecretManagerRoot/RMDelegationToken*.
> This fix handles the old data format during RM recovery if an 
> InvalidProtocolBufferException occurs.






[jira] [Updated] (YARN-5594) Handle old data format while recovering RM

2016-08-31 Thread Tatyana But (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tatyana But updated YARN-5594:
--
Attachment: (was: YARN-5594.001.patch)

> Handle old data format while recovering RM
> --
>
> Key: YARN-5594
> URL: https://issues.apache.org/jira/browse/YARN-5594
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.0
>Reporter: Tatyana But
>
> We got this error after upgrading a cluster from v2.5.1 to v2.7.0.
> {noformat}
> 2016-08-25 17:20:33,293 ERROR
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Failed to
> load/recover state
> com.google.protobuf.InvalidProtocolBufferException: Protocol message contained
> an invalid tag (zero).
> at 
> com.google.protobuf.InvalidProtocolBufferException.invalidTag(InvalidProtocolBufferException.java:89)
> at com.google.protobuf.CodedInputStream.readTag(CodedInputStream.java:108)
> at 
> org.apache.hadoop.yarn.proto.YarnServerResourceManagerRecoveryProtos$RMDelegationTokenIdentifierDataProto.<init>(YarnServerResourceManagerRecoveryProtos.java:4680)
> at 
> org.apache.hadoop.yarn.proto.YarnServerResourceManagerRecoveryProtos$RMDelegationTokenIdentifierDataProto.<init>(YarnServerResourceManagerRecoveryProtos.java:4644)
> at 
> org.apache.hadoop.yarn.proto.YarnServerResourceManagerRecoveryProtos$RMDelegationTokenIdentifierDataProto$1.parsePartialFrom(YarnServerResourceManagerRecoveryProtos.java:4740)
> at 
> org.apache.hadoop.yarn.proto.YarnServerResourceManagerRecoveryProtos$RMDelegationTokenIdentifierDataProto$1.parsePartialFrom(YarnServerResourceManagerRecoveryProtos.java:4735)
> at 
> org.apache.hadoop.yarn.proto.YarnServerResourceManagerRecoveryProtos$RMDelegationTokenIdentifierDataProto$Builder.mergeFrom(YarnServerResourceManagerRecoveryProtos.java:5075)
> at 
> org.apache.hadoop.yarn.proto.YarnServerResourceManagerRecoveryProtos$RMDelegationTokenIdentifierDataProto$Builder.mergeFrom(YarnServerResourceManagerRecoveryProtos.java:4955)
> at 
> com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:337)
> at 
> com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:267)
> at 
> com.google.protobuf.AbstractMessageLite$Builder.mergeFrom(AbstractMessageLite.java:210)
> at 
> com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:904)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.records.RMDelegationTokenIdentifierData.readFields(RMDelegationTokenIdentifierData.java:43)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.loadRMDTSecretManagerState(FileSystemRMStateStore.java:355)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.loadState(FileSystemRMStateStore.java:199)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:587)
> at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:1007)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1048)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1044)
> {noformat}
> The reason for this problem is that these Hadoop versions use different formats 
> for the files 
> /var/mapr/cluster/yarn/rm/system/FSRMStateRoot/RMDTSecretManagerRoot/RMDelegationToken*.
> This fix handles the old data format during RM recovery if an 
> InvalidProtocolBufferException occurs.






[jira] [Commented] (YARN-5594) Handle old data format while recovering RM

2016-08-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15451709#comment-15451709
 ] 

Hadoop QA commented on YARN-5594:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 15s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
44s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 50s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
34s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 14s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
45s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
44s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 42s 
{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
50s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 50s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 50s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 30s 
{color} | {color:red} root: The patch generated 2 new + 49 unchanged - 0 fixed 
= 51 total (was 49) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 13s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
47s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 
18s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 39s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 51s 
{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 20s 
{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 34m 0s 
{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
22s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 93m 2s {color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12826344/YARN-5594.001.patch |
| JIRA Issue | YARN-5594 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux d8da0b26da33 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 20ae1fa |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/12968/artifact/patchprocess/diff-checkstyle-root.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/12968/testReport/ |
| modules | C: hadoop-common-project/hadoop-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-ya

[jira] [Commented] (YARN-5567) Fix script exit code checking in NodeHealthScriptRunner#reportHealthStatus

2016-08-31 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15451647#comment-15451647
 ] 

Naganarasimha G R commented on YARN-5567:
-

Well, I would have liked to port it to 2.9 at least, as in my view it is better 
to flag a script syntax error as an error rather than silently passing it off 
as success, but I am fine with taking the less risky approach too!
bq. I would be OK with the change in trunk. It does make the behaviour clearer.
Well, the catch here is that 3.0.0-alpha1 is a separate branch from trunk, so 
you guys plan to push it to the 3.0.0-alpha1 branch too, right?

> Fix script exit code checking in NodeHealthScriptRunner#reportHealthStatus
> --
>
> Key: YARN-5567
> URL: https://issues.apache.org/jira/browse/YARN-5567
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.8.0, 3.0.0-alpha1
>Reporter: Yufei Gu
>Assignee: Yufei Gu
> Fix For: 3.0.0-alpha1
>
> Attachments: YARN-5567.001.patch
>
>
> In case of FAILED_WITH_EXIT_CODE, health status should be false.
> {code}
>   case FAILED_WITH_EXIT_CODE:
> setHealthStatus(true, "", now);
> break;
> {code}
> should be 
> {code}
>   case FAILED_WITH_EXIT_CODE:
> setHealthStatus(false, "", now);
> break;
> {code}
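
For context, here is a hedged sketch of how the corrected case sits among its neighbours; the other status names and messages are illustrative, not verbatim from NodeHealthScriptRunner:

{code}
// Sketch only: the neighbouring cases and the scriptOutput variable are
// illustrative. The point is that every failure path, including a failing
// exit code, must report health = false.
switch (status) {
  case SUCCESS:
    setHealthStatus(true, "", now);
    break;
  case TIMED_OUT:
    setHealthStatus(false, "Node health script timed out", now);
    break;
  case FAILED_WITH_EXIT_CODE:
    // Previously reported true, silently masking script syntax errors.
    setHealthStatus(false, "", now);
    break;
  case FAILED:
    setHealthStatus(false, scriptOutput, now);
    break;
}
{code}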



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5552) Add Builder methods for common yarn API records

2016-08-31 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15451609#comment-15451609
 ] 

Tao Jie commented on YARN-5552:
---

Posted an initial patch that adds builders for the 4 records mentioned. 
Constructors and newInstance methods are still retained to keep compatibility.
Also used {{builder}} instead of {{newInstance}} in a few places, but not in tests yet.
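
To make the proposal concrete, here is a minimal sketch of the builder style, assuming a simplified record; the class and field names are illustrative, not the actual API in the patch:

{code}
// Illustrative only. A builder lets new fields be added later without
// multiplying constructor / newInstance overloads.
public class ResourceRequestExample {
  private final String resourceName;
  private final int numContainers;
  private final boolean relaxLocality;

  private ResourceRequestExample(Builder b) {
    this.resourceName = b.resourceName;
    this.numContainers = b.numContainers;
    this.relaxLocality = b.relaxLocality;
  }

  public static Builder newBuilder() { return new Builder(); }

  public static final class Builder {
    private String resourceName = "*";  // defaults keep call sites short
    private int numContainers = 1;
    private boolean relaxLocality = true;

    public Builder resourceName(String v) { resourceName = v; return this; }
    public Builder numContainers(int v) { numContainers = v; return this; }
    public Builder relaxLocality(boolean v) { relaxLocality = v; return this; }
    public ResourceRequestExample build() { return new ResourceRequestExample(this); }
  }
}
{code}

A call site then reads {{ResourceRequestExample.newBuilder().resourceName("host1").numContainers(4).build()}}, and adding a field later touches only the Builder.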

> Add Builder methods for common yarn API records
> ---
>
> Key: YARN-5552
> URL: https://issues.apache.org/jira/browse/YARN-5552
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Arun Suresh
>Assignee: Tao Jie
> Attachments: YARN-5552.000.patch
>
>
> Currently YARN API records such as ResourceRequest, AllocateRequest/Response 
> as well as AMRMClient.ContainerRequest have multiple constructors / 
> newInstance methods. This makes it very difficult to add new fields to these 
> records.
> It would probably be better if we had Builder classes for many of these 
> records, which would make evolution of these records a bit easier.
> (suggested by [~kasha])



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5581) [Atsv2] Reuse the API doc for same-functionality methods in TimelineReaderWebServices class

2016-08-31 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15451600#comment-15451600
 ] 

Varun Saxena commented on YARN-5581:


IIRC, the Javadoc addition became necessary because a bunch of javadoc errors 
showed up when we ran javadoc under Java 8: the Java 8 javadoc tool asks for a 
definition of each parameter in a method. It was a conscious effort to have 0 
javadoc errors and almost 0 checkstyle errors when we drop onto trunk.

Is there a way to write javadoc for a method without documenting each param, 
linking it to the other API doc instead, and still not get these errors? If 
there is, I agree this should be done.
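
One pattern that usually satisfies the Java 8 doclint checks is to keep one-line {{@param}} stubs and defer the detail to a {{@see}} link to the fully documented overload. A hypothetical sketch (class and method names invented, not the actual TimelineReaderWebServices code):

{code}
public class ReaderEndpointsExample {
  /**
   * Retrieves apps within a flow. Full documentation of every parameter is
   * on the cluster-scoped variant.
   *
   * @see #getFlowApps(String, String, String)
   *
   * @param userId user id (see linked doc).
   * @param flowName flow name (see linked doc).
   * @return matching app ids.
   */
  public java.util.List<String> getFlowApps(String userId, String flowName) {
    return getFlowApps("defaultCluster", userId, flowName);
  }

  /**
   * Retrieves apps within a flow, scoped to a cluster.
   *
   * @param clusterId cluster to search in.
   * @param userId user who ran the flow.
   * @param flowName name of the flow.
   * @return matching app ids.
   */
  public java.util.List<String> getFlowApps(String clusterId, String userId,
      String flowName) {
    return java.util.Collections.emptyList(); // stub for the sketch
  }
}
{code}

Whether doclint accepts stubs this terse would need to be verified against the actual build.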

> [Atsv2] Reuse the API doc for same-functionality methods in 
> TimelineReaderWebServices class
> 
>
> Key: YARN-5581
> URL: https://issues.apache.org/jira/browse/YARN-5581
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>  Labels: maintenance
>
> The class TimelineReaderWebServices has more API doc than actual code in it. 
> For example, querying apps supports the two paths below.
> In TimelineReaderWebServices, basically 2 methods need to be exposed, and both 
> carry a full set of API doc. This can be optimized by documenting only the 
> additional param and linking to the existing API doc.
> {noformat}
> GET /ws/v2/timeline/clusters/{cluster name}/users/{user name}/flows/{flow 
> name}/apps
> or
> GET /ws/v2/timeline/users/{user name}/flows/{flow name}/apps
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5123) SQL based RM state store

2016-08-31 Thread Lavkesh Lahngir (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15451589#comment-15451589
 ] 

Lavkesh Lahngir commented on YARN-5123:
---

[~giovanni.fumarola] : It's an arbitrary choice, because we can implement HSQL 
as easily as any other database. We just have to make sure that the fencing 
assumptions work. For this, we can refactor the code and introduce an abstract 
class that provides the common use cases; other implementations would then 
override only the syntax that is specific to the DB (e.g. Postgres or HSQL).
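
A hedged sketch of that refactoring, with invented names; the dialect-specific SQL shown (Postgres {{ON CONFLICT}}, which needs 9.5+) is only an example:

{code}
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Common state-store logic lives in the base class; each dialect overrides
// only the SQL text that differs between databases.
abstract class AbstractSQLStateStore {
  /** Dialect-specific upsert, e.g. Postgres ON CONFLICT vs. HSQL MERGE. */
  protected abstract String upsertApplicationSql();

  public void storeApplication(Connection conn, String appId, byte[] state)
      throws SQLException {
    try (PreparedStatement ps = conn.prepareStatement(upsertApplicationSql())) {
      ps.setString(1, appId);
      ps.setBytes(2, state);
      ps.executeUpdate();
    }
  }
}

class PostgresStateStore extends AbstractSQLStateStore {
  @Override
  protected String upsertApplicationSql() {
    return "INSERT INTO apps(app_id, state) VALUES (?, ?) "
        + "ON CONFLICT (app_id) DO UPDATE SET state = EXCLUDED.state";
  }
}
{code}

The fencing statements would be abstracted the same way, since that is where the semantics (an atomic compare-and-set on the active RM's epoch) must hold across dialects.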

> SQL based RM state store
> 
>
> Key: YARN-5123
> URL: https://issues.apache.org/jira/browse/YARN-5123
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Lavkesh Lahngir
>Assignee: Lavkesh Lahngir
> Attachments: 0001-SQL-Based-RM-state-store-trunk.patch, High 
> Availability In YARN Resource Manager using SQL Based StateStore.pdf, 
> sqlstatestore.patch
>
>
> In our setup, the ZooKeeper-based RM state store didn't work. We ended up 
> implementing our own SQL-based state store. Here is a patch, if anybody else 
> wants to use it. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5552) Add Builder methods for common yarn API records

2016-08-31 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie updated YARN-5552:
--
Attachment: YARN-5552.000.patch

> Add Builder methods for common yarn API records
> ---
>
> Key: YARN-5552
> URL: https://issues.apache.org/jira/browse/YARN-5552
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Arun Suresh
>Assignee: Tao Jie
> Attachments: YARN-5552.000.patch
>
>
> Currently YARN API records such as ResourceRequest, AllocateRequest/Response 
> as well as AMRMClient.ContainerRequest have multiple constructors / 
> newInstance methods. This makes it very difficult to add new fields to these 
> records.
> It would probably be better if we had Builder classes for many of these 
> records, which would make evolution of these records a bit easier.
> (suggested by [~kasha])



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5596) TestDockerContainerRuntime fails on OS X

2016-08-31 Thread Varun Vasudev (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15451532#comment-15451532
 ] 

Varun Vasudev commented on YARN-5596:
-

+1. I'll commit this tomorrow if no one objects.

> TestDockerContainerRuntime fails on OS X
> 
>
> Key: YARN-5596
> URL: https://issues.apache.org/jira/browse/YARN-5596
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, yarn
>Reporter: Sidharta Seethana
>Assignee: Sidharta Seethana
>Priority: Minor
> Attachments: YARN-5596.001.patch, YARN-5596.002.patch
>
>
> /sys/fs/cgroup doesn't exist on Mac OS X, and the tests seem to fail because 
> of this. 
> {code}
> Failed tests:
>   TestDockerContainerRuntime.testContainerLaunchWithCustomNetworks:456 
> expected:<...ET_BIND_SERVICE -v /[sys/fs/cgroup:/sys/fs/cgroup:ro -v 
> /]test_container_local...> but was:<...ET_BIND_SERVICE -v 
> /[]test_container_local...>
>   TestDockerContainerRuntime.testContainerLaunchWithNetworkingDefaults:401 
> expected:<...ET_BIND_SERVICE -v /[sys/fs/cgroup:/sys/fs/cgroup:ro -v 
> /]test_container_local...> but was:<...ET_BIND_SERVICE -v 
> /[]test_container_local...>
>   TestDockerContainerRuntime.testDockerContainerLaunch:297 
> expected:<...ET_BIND_SERVICE -v /[sys/fs/cgroup:/sys/fs/cgroup:ro -v 
> /]test_container_local...> but was:<...ET_BIND_SERVICE -v 
> /[]test_container_local...>
> Tests run: 19, Failures: 3, Errors: 0, Skipped: 0
> {code}
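
Whatever the actual patch does, the usual portable-test pattern is to skip or adapt the cgroup assertion on hosts that lack the path. A hedged JUnit 4 sketch (test and class names invented):

{code}
import static org.junit.Assume.assumeTrue;

import java.io.File;
import org.junit.Test;

public class TestCgroupMountPortability {
  @Test
  public void testCgroupMountArgument() {
    // Skips (rather than fails) on OS X, where /sys/fs/cgroup is absent.
    assumeTrue("requires /sys/fs/cgroup (Linux only)",
        new File("/sys/fs/cgroup").isDirectory());
    // ... assert the generated docker run command contains
    // "-v /sys/fs/cgroup:/sys/fs/cgroup:ro" ...
  }
}
{code}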



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5594) Handle old data format while recovering RM

2016-08-31 Thread Tatyana But (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tatyana But updated YARN-5594:
--
Attachment: YARN-5594.001.patch

> Handle old data format while recovering RM
> --
>
> Key: YARN-5594
> URL: https://issues.apache.org/jira/browse/YARN-5594
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.0
>Reporter: Tatyana But
> Attachments: YARN-5594.001.patch
>
>
> We got this error after upgrading the cluster from v2.5.1 to 2.7.0.
> {noformat}
> 2016-08-25 17:20:33,293 ERROR
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Failed to
> load/recover state
> com.google.protobuf.InvalidProtocolBufferException: Protocol message contained
> an invalid tag (zero).
> at 
> com.google.protobuf.InvalidProtocolBufferException.invalidTag(InvalidProtocolBufferException.java:89)
> at com.google.protobuf.CodedInputStream.readTag(CodedInputStream.java:108)
> at 
> org.apache.hadoop.yarn.proto.YarnServerResourceManagerRecoveryProtos$RMDelegationTokenIdentifierDataProto.(YarnServerResourceManagerRecoveryProtos.java:4680)
> at 
> org.apache.hadoop.yarn.proto.YarnServerResourceManagerRecoveryProtos$RMDelegationTokenIdentifierDataProto.(YarnServerResourceManagerRecoveryProtos.java:4644)
> at 
> org.apache.hadoop.yarn.proto.YarnServerResourceManagerRecoveryProtos$RMDelegationTokenIdentifierDataProto$1.parsePartialFrom(YarnServerResourceManagerRecoveryProtos.java:4740)
> at 
> org.apache.hadoop.yarn.proto.YarnServerResourceManagerRecoveryProtos$RMDelegationTokenIdentifierDataProto$1.parsePartialFrom(YarnServerResourceManagerRecoveryProtos.java:4735)
> at 
> org.apache.hadoop.yarn.proto.YarnServerResourceManagerRecoveryProtos$RMDelegationTokenIdentifierDataProto$Builder.mergeFrom(YarnServerResourceManagerRecoveryProtos.java:5075)
> at 
> org.apache.hadoop.yarn.proto.YarnServerResourceManagerRecoveryProtos$RMDelegationTokenIdentifierDataProto$Builder.mergeFrom(YarnServerResourceManagerRecoveryProtos.java:4955)
> at 
> com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:337)
> at 
> com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:267)
> at 
> com.google.protobuf.AbstractMessageLite$Builder.mergeFrom(AbstractMessageLite.java:210)
> at 
> com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:904)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.records.RMDelegationTokenIdentifierData.readFields(RMDelegationTokenIdentifierData.java:43)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.loadRMDTSecretManagerState(FileSystemRMStateStore.java:355)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.loadState(FileSystemRMStateStore.java:199)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:587)
> at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:1007)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1048)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1044
> {noformat}
> The reason for this problem is that these Hadoop versions use different 
> formats for the files 
> /var/mapr/cluster/yarn/rm/system/FSRMStateRoot/RMDTSecretManagerRoot/RMDelegationToken*.
> This fix handles the old data format during RM recovery if 
> InvalidProtocolBufferException occurs.
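
For readers skimming, the recovery-time fallback described here typically looks like the following hedged sketch; the class and method names are illustrative, not the patch itself:

{code}
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import com.google.protobuf.InvalidProtocolBufferException;

// Illustrative only: take the raw bytes once, try the new (2.7+) protobuf
// format first, and fall back to the pre-2.7 layout on parse failure.
public class TokenDataReaderSketch {
  public Object readTokenData(byte[] raw) throws IOException {
    try {
      return org.apache.hadoop.yarn.proto.YarnServerResourceManagerRecoveryProtos
          .RMDelegationTokenIdentifierDataProto.parseFrom(raw);
    } catch (InvalidProtocolBufferException e) {
      return readLegacyFormat(new DataInputStream(new ByteArrayInputStream(raw)));
    }
  }

  private Object readLegacyFormat(DataInputStream in) throws IOException {
    // Hypothetical: parse the pre-2.7 Writable-based layout here.
    throw new UnsupportedOperationException("legacy reader elided in sketch");
  }
}
{code}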



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5594) Handle old data format while recovering RM

2016-08-31 Thread Tatyana But (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tatyana But updated YARN-5594:
--
Attachment: (was: YARN-5594.001.patch)

> Handle old data format while recovering RM
> --
>
> Key: YARN-5594
> URL: https://issues.apache.org/jira/browse/YARN-5594
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.0
>Reporter: Tatyana But
>
> We got this error after upgrading the cluster from v2.5.1 to 2.7.0.
> {noformat}
> 2016-08-25 17:20:33,293 ERROR
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Failed to
> load/recover state
> com.google.protobuf.InvalidProtocolBufferException: Protocol message contained
> an invalid tag (zero).
> at 
> com.google.protobuf.InvalidProtocolBufferException.invalidTag(InvalidProtocolBufferException.java:89)
> at com.google.protobuf.CodedInputStream.readTag(CodedInputStream.java:108)
> at 
> org.apache.hadoop.yarn.proto.YarnServerResourceManagerRecoveryProtos$RMDelegationTokenIdentifierDataProto.(YarnServerResourceManagerRecoveryProtos.java:4680)
> at 
> org.apache.hadoop.yarn.proto.YarnServerResourceManagerRecoveryProtos$RMDelegationTokenIdentifierDataProto.(YarnServerResourceManagerRecoveryProtos.java:4644)
> at 
> org.apache.hadoop.yarn.proto.YarnServerResourceManagerRecoveryProtos$RMDelegationTokenIdentifierDataProto$1.parsePartialFrom(YarnServerResourceManagerRecoveryProtos.java:4740)
> at 
> org.apache.hadoop.yarn.proto.YarnServerResourceManagerRecoveryProtos$RMDelegationTokenIdentifierDataProto$1.parsePartialFrom(YarnServerResourceManagerRecoveryProtos.java:4735)
> at 
> org.apache.hadoop.yarn.proto.YarnServerResourceManagerRecoveryProtos$RMDelegationTokenIdentifierDataProto$Builder.mergeFrom(YarnServerResourceManagerRecoveryProtos.java:5075)
> at 
> org.apache.hadoop.yarn.proto.YarnServerResourceManagerRecoveryProtos$RMDelegationTokenIdentifierDataProto$Builder.mergeFrom(YarnServerResourceManagerRecoveryProtos.java:4955)
> at 
> com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:337)
> at 
> com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:267)
> at 
> com.google.protobuf.AbstractMessageLite$Builder.mergeFrom(AbstractMessageLite.java:210)
> at 
> com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:904)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.records.RMDelegationTokenIdentifierData.readFields(RMDelegationTokenIdentifierData.java:43)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.loadRMDTSecretManagerState(FileSystemRMStateStore.java:355)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.loadState(FileSystemRMStateStore.java:199)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:587)
> at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:1007)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1048)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1044
> {noformat}
> The reason for this problem is that these Hadoop versions use different 
> formats for the files 
> /var/mapr/cluster/yarn/rm/system/FSRMStateRoot/RMDTSecretManagerRoot/RMDelegationToken*.
> This fix handles the old data format during RM recovery if 
> InvalidProtocolBufferException occurs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4876) [Phase 1] Decoupled Init / Destroy of Containers from Start / Stop

2016-08-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15451404#comment-15451404
 ] 

Hadoop QA commented on YARN-4876:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 6s {color} 
| {color:red} YARN-4876 does not apply to trunk. Rebase required? Wrong Branch? 
See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12826340/YARN-4876.004.patch |
| JIRA Issue | YARN-4876 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/12967/console |
| Powered by | Apache Yetus 0.3.0   http://yetus.apache.org |


This message was automatically generated.



> [Phase 1] Decoupled Init / Destroy of Containers from Start / Stop
> --
>
> Key: YARN-4876
> URL: https://issues.apache.org/jira/browse/YARN-4876
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun Suresh
>Assignee: Marco Rabozzi
> Attachments: YARN-4876-design-doc.pdf, YARN-4876.002.patch, 
> YARN-4876.003.patch, YARN-4876.004.patch, YARN-4876.004.patch, 
> YARN-4876.01.patch
>
>
> Introduce *initialize* and *destroy* container API into the 
> *ContainerManagementProtocol* and decouple the actual start of a container 
> from the initialization. This will allow AMs to re-start a container without 
> having to lose the allocation.
> Additionally, if the localization of the container is associated with the 
> initialize (and the cleanup with the destroy), this can also be used by 
> applications to upgrade a Container by *re-initializing* with a new 
> *ContainerLaunchContext*.
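
A hypothetical shape of such an API (not the actual patch), just to make the decoupling concrete:

{code}
// Invented interface; the real change is to ContainerManagementProtocol.
// Splitting init/destroy from start/stop lets an AM restart or upgrade a
// container without giving up the allocation.
public interface ContainerLifecycleSketch {
  void initializeContainer(String containerId);  // e.g. localize resources
  void startContainer(String containerId);       // may be called again after stop
  void stopContainer(String containerId);        // stop process, keep resources
  void destroyContainer(String containerId);     // cleanup and release allocation
}
{code}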



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-3981) support timeline clients not associated with an application

2016-08-31 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15451391#comment-15451391
 ] 

Rohith Sharma K S commented on YARN-3981:
-

bq. We can launch collectors as separate process for this use case?
There are 2 ways. First, run it as a default service in the NM: the pro is 
that the RM can make use of the underlying NM-RM communication protocol and 
needs to track only the collector address. Second, run it as a separate 
process: the pro is that the admin can choose to run the writers off-cluster; 
the con is that the RM needs to track this daemon separately at a high level.

bq. For storing those entities posted from clients, can we put them in the 
entity table, but just leave some unknown fields empty? Will that be a concern 
for the storage API's semantics?
Yes, it is a concern for the storage API's semantics. The current schema 
structure for the entity table is 
*userName!clusterId!flowName!flowRunId!AppId!entityType!entityId*, so the key 
would become *userName!clusterId!flowName!null!null!entityType!entityId*, and 
I am not sure whether HBase supports that as a key for reads/writes.
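
To make the schema concern concrete, here is a hedged sketch of assembling such a row key; the separator handling and encoding are illustrative, not the actual ATSv2 code:

{code}
// Illustrative only: with no application context, flowRunId and appId are
// null, and an HBase row key (plain bytes) needs an agreed placeholder for
// them that every writer and scanner understands.
public final class EntityRowKeySketch {
  static String entityRowKey(String user, String cluster, String flow,
      Long flowRunId, String appId, String entityType, String entityId) {
    return String.join("!", user, cluster,
        flow == null ? "" : flow,
        flowRunId == null ? "" : flowRunId.toString(),
        appId == null ? "" : appId,
        entityType, entityId);
  }
}
{code}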

> support timeline clients not associated with an application
> ---
>
> Key: YARN-3981
> URL: https://issues.apache.org/jira/browse/YARN-3981
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Rohith Sharma K S
>  Labels: YARN-5355
>
> In the current v.2 design, all timeline writes must belong in a 
> flow/application context (cluster + user + flow + flow run + application).
> But there are use cases that require writing data outside the context of an 
> application. One such example is a higher level client (e.g. tez client or 
> hive/oozie/cascading client) writing flow-level data that spans multiple 
> applications. We need to find a way to support them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org


