[jira] [Commented] (YARN-10300) appMasterHost not set in RM ApplicationSummary when AM fails before first heartbeat

2020-06-01 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17121495#comment-17121495
 ] 

Hadoop QA commented on YARN-10300:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 
24s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
1s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
17m 29s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
43s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  1m 
48s{color} | {color:blue} Used deprecated FindBugs config; considering 
switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
46s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 13s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
47s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 91m 31s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
29s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}158m 48s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler |
|   | hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: 
https://builds.apache.org/job/PreCommit-YARN-Build/26095/artifact/out/Dockerfile
 |
| JIRA Issue | YARN-10300 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/13004526/YARN-10300.001.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite 
unit shadedclient findbugs checkstyle |
| uname | Linux aa1ebd84c7c1 4.15.0-101-generic #102-Ubuntu SMP Mon May 11 
10:07:26 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / 9fe4c37c25b |
| Default Java | Private Build-1.8.0_252-8u252-b09-1~18.04-b09 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/26095/a

[jira] [Commented] (YARN-10300) appMasterHost not set in RM ApplicationSummary when AM fails before first heartbeat

2020-06-02 Thread Eric Badger (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17124328#comment-17124328
 ] 

Eric Badger commented on YARN-10300:


The unit tests pass for me locally. [~epayne], [~Jim_Brennan], could you take a 
look at this patch?

> appMasterHost not set in RM ApplicationSummary when AM fails before first 
> heartbeat
> ---
>
> Key: YARN-10300
> URL: https://issues.apache.org/jira/browse/YARN-10300
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Eric Badger
>Assignee: Eric Badger
>Priority: Major
> Attachments: YARN-10300.001.patch
>
>
> {noformat}
> 2020-05-23 14:09:10,086 INFO resourcemanager.RMAppManager$ApplicationSummary: 
> appId=application_1586003420099_12444961,name=job_name,user=username,queue=queuename,state=FAILED,trackingUrl=https
>  
> ://cluster:port/applicationhistory/app/application_1586003420099_12444961,appMasterHost=N/A,startTime=1590241207309,finishTime=1590242950085,finalStatus=FAILED,memorySeconds=13750,vcoreSeconds=67,preemptedMemorySeconds=0,preemptedVcoreSeconds=0,preemptedAMContainers=0,preemptedNonAMContainers=0,preemptedResources=  vCores:0>,applicationType=MAPREDUCE
> {noformat}
> {{appMasterHost=N/A}} should have the AM hostname instead of N/A



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10300) appMasterHost not set in RM ApplicationSummary when AM fails before first heartbeat

2020-06-02 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17124421#comment-17124421
 ] 

Jim Brennan commented on YARN-10300:


[~ebadger] can you explain under what circumstances  
{{attempt.getMasterContainer().getNodeId().getHost()}} will succeed where 
{{attempt.getHost()}} fails?
Also, do we need to add some null checks for these?  getMasterContainer(), 
getNodeId(), getHost()?
Note that the attempt.getHost() defaults to "N/A" before it is set - what do we 
get if NodeID().getHost() isn't valid yet?  Is that even a possibility?


> appMasterHost not set in RM ApplicationSummary when AM fails before first 
> heartbeat
> ---
>
> Key: YARN-10300
> URL: https://issues.apache.org/jira/browse/YARN-10300
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Eric Badger
>Assignee: Eric Badger
>Priority: Major
> Attachments: YARN-10300.001.patch
>
>
> {noformat}
> 2020-05-23 14:09:10,086 INFO resourcemanager.RMAppManager$ApplicationSummary: 
> appId=application_1586003420099_12444961,name=job_name,user=username,queue=queuename,state=FAILED,trackingUrl=https
>  
> ://cluster:port/applicationhistory/app/application_1586003420099_12444961,appMasterHost=N/A,startTime=1590241207309,finishTime=1590242950085,finalStatus=FAILED,memorySeconds=13750,vcoreSeconds=67,preemptedMemorySeconds=0,preemptedVcoreSeconds=0,preemptedAMContainers=0,preemptedNonAMContainers=0,preemptedResources=  vCores:0>,applicationType=MAPREDUCE
> {noformat}
> {{appMasterHost=N/A}} should have the AM hostname instead of N/A



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10300) appMasterHost not set in RM ApplicationSummary when AM fails before first heartbeat

2020-06-02 Thread Eric Badger (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17124450#comment-17124450
 ] 

Eric Badger commented on YARN-10300:


bq. Eric Badger can you explain under what circumstances 
attempt.getMasterContainer().getNodeId().getHost() will succeed where 
attempt.getHost() fails?

{{attempt.getHost()}} grabs {{host}} within RMAppAttemptImpl.java. This is set 
during {{AMRegisteredTransition}}, so it will be "N/A" until the AM registers 
with the NM during the first heartbeat. 

{{attempt.getMasterContainer()}} grabs {{masterContainer}} within 
RMAppAttemptImpl.java. This is set during {{AMContainerAllocatedTransition}}. 

So from the time between when the container is allocated until the time of the 
first AM heartbeat, {{masterContainer}} will be set, but {{host}} will be "N/A"

bq. Also, do we need to add some null checks for these? getMasterContainer(), 
getNodeId(), getHost()?
Probably wouldn't hurt. 

bq. Note that the attempt.getHost() defaults to "N/A" before it is set - what 
do we get if NodeID().getHost() isn't valid yet? Is that even a possibility?
I don't know if it's possible to be invalid at the start or not. The 
{{Container}} is going to be created via {{newInstance}}, which requires a 
{{NodeId}} as a parameter. But that could potentially be sent in as null. But I 
think it will either be the correct nodeId or will be null, which I can 
interpret as "N/A". There are so many places that containers are instantiated 
in the scheduler that it'd be pretty tough to see if all of the cases have the 
nodeID set initially. 

I can add in the null checks and default the string to "N/A" if any of them 
don't exist. 

> appMasterHost not set in RM ApplicationSummary when AM fails before first 
> heartbeat
> ---
>
> Key: YARN-10300
> URL: https://issues.apache.org/jira/browse/YARN-10300
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Eric Badger
>Assignee: Eric Badger
>Priority: Major
> Attachments: YARN-10300.001.patch
>
>
> {noformat}
> 2020-05-23 14:09:10,086 INFO resourcemanager.RMAppManager$ApplicationSummary: 
> appId=application_1586003420099_12444961,name=job_name,user=username,queue=queuename,state=FAILED,trackingUrl=https
>  
> ://cluster:port/applicationhistory/app/application_1586003420099_12444961,appMasterHost=N/A,startTime=1590241207309,finishTime=1590242950085,finalStatus=FAILED,memorySeconds=13750,vcoreSeconds=67,preemptedMemorySeconds=0,preemptedVcoreSeconds=0,preemptedAMContainers=0,preemptedNonAMContainers=0,preemptedResources=  vCores:0>,applicationType=MAPREDUCE
> {noformat}
> {{appMasterHost=N/A}} should have the AM hostname instead of N/A



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10300) appMasterHost not set in RM ApplicationSummary when AM fails before first heartbeat

2020-06-02 Thread Eric Badger (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17124461#comment-17124461
 ] 

Eric Badger commented on YARN-10300:


Added some null checks to patch 002

> appMasterHost not set in RM ApplicationSummary when AM fails before first 
> heartbeat
> ---
>
> Key: YARN-10300
> URL: https://issues.apache.org/jira/browse/YARN-10300
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Eric Badger
>Assignee: Eric Badger
>Priority: Major
> Attachments: YARN-10300.001.patch, YARN-10300.002.patch
>
>
> {noformat}
> 2020-05-23 14:09:10,086 INFO resourcemanager.RMAppManager$ApplicationSummary: 
> appId=application_1586003420099_12444961,name=job_name,user=username,queue=queuename,state=FAILED,trackingUrl=https
>  
> ://cluster:port/applicationhistory/app/application_1586003420099_12444961,appMasterHost=N/A,startTime=1590241207309,finishTime=1590242950085,finalStatus=FAILED,memorySeconds=13750,vcoreSeconds=67,preemptedMemorySeconds=0,preemptedVcoreSeconds=0,preemptedAMContainers=0,preemptedNonAMContainers=0,preemptedResources=  vCores:0>,applicationType=MAPREDUCE
> {noformat}
> {{appMasterHost=N/A}} should have the AM hostname instead of N/A



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10300) appMasterHost not set in RM ApplicationSummary when AM fails before first heartbeat

2020-06-02 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17124511#comment-17124511
 ] 

Hadoop QA commented on YARN-10300:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 
50s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 25m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
7s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
18m 37s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
39s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  2m  
9s{color} | {color:blue} Used deprecated FindBugs config; considering switching 
to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
5s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 14s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
4s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 52m 43s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch passed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
39s{color} | {color:red} The patch generated 17 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}126m 31s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerOvercommit |
|   | hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: 
https://builds.apache.org/job/PreCommit-YARN-Build/26105/artifact/out/Dockerfile
 |
| JIRA Issue | YARN-10300 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/13004661/YARN-10300.002.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite 
unit shadedclient findbugs checkstyle |
| uname | Linux 27ca2cb7f6cd 4.15.0-101-generic #102-Ubuntu SMP Mon May 11 
10:07:26 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / 6288e15118f |
| Default Java | Private Build-1.8.0_252-8u252-b09-1

[jira] [Commented] (YARN-10300) appMasterHost not set in RM ApplicationSummary when AM fails before first heartbeat

2020-06-03 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17124994#comment-17124994
 ] 

Jim Brennan commented on YARN-10300:


Thanks [~ebadger]!  Patch 002 looks good to me.  I verified it builds and the 
test passes.   +1 (non-binding).


> appMasterHost not set in RM ApplicationSummary when AM fails before first 
> heartbeat
> ---
>
> Key: YARN-10300
> URL: https://issues.apache.org/jira/browse/YARN-10300
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Eric Badger
>Assignee: Eric Badger
>Priority: Major
> Attachments: YARN-10300.001.patch, YARN-10300.002.patch
>
>
> {noformat}
> 2020-05-23 14:09:10,086 INFO resourcemanager.RMAppManager$ApplicationSummary: 
> appId=application_1586003420099_12444961,name=job_name,user=username,queue=queuename,state=FAILED,trackingUrl=https
>  
> ://cluster:port/applicationhistory/app/application_1586003420099_12444961,appMasterHost=N/A,startTime=1590241207309,finishTime=1590242950085,finalStatus=FAILED,memorySeconds=13750,vcoreSeconds=67,preemptedMemorySeconds=0,preemptedVcoreSeconds=0,preemptedAMContainers=0,preemptedNonAMContainers=0,preemptedResources=  vCores:0>,applicationType=MAPREDUCE
> {noformat}
> {{appMasterHost=N/A}} should have the AM hostname instead of N/A



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10300) appMasterHost not set in RM ApplicationSummary when AM fails before first heartbeat

2020-06-03 Thread Eric Badger (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17125139#comment-17125139
 ] 

Eric Badger commented on YARN-10300:


[~epayne], would you mind looking at this patch as well to give a binding 
review?

> appMasterHost not set in RM ApplicationSummary when AM fails before first 
> heartbeat
> ---
>
> Key: YARN-10300
> URL: https://issues.apache.org/jira/browse/YARN-10300
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Eric Badger
>Assignee: Eric Badger
>Priority: Major
> Attachments: YARN-10300.001.patch, YARN-10300.002.patch
>
>
> {noformat}
> 2020-05-23 14:09:10,086 INFO resourcemanager.RMAppManager$ApplicationSummary: 
> appId=application_1586003420099_12444961,name=job_name,user=username,queue=queuename,state=FAILED,trackingUrl=https
>  
> ://cluster:port/applicationhistory/app/application_1586003420099_12444961,appMasterHost=N/A,startTime=1590241207309,finishTime=1590242950085,finalStatus=FAILED,memorySeconds=13750,vcoreSeconds=67,preemptedMemorySeconds=0,preemptedVcoreSeconds=0,preemptedAMContainers=0,preemptedNonAMContainers=0,preemptedResources=  vCores:0>,applicationType=MAPREDUCE
> {noformat}
> {{appMasterHost=N/A}} should have the AM hostname instead of N/A



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10300) appMasterHost not set in RM ApplicationSummary when AM fails before first heartbeat

2020-06-04 Thread Eric Payne (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17126158#comment-17126158
 ] 

Eric Payne commented on YARN-10300:
---

Thanks for raising this issue and for providing the fix.

The code changes look good. However, I have a couple of comments on the unit 
test.

The unit test succeeds with and without the changes. The unit test tests 
whether 
{{app.getCurrentAppAttempt().getMasterContainer().getNodeId().getHost()}} 
returns the current host, not whether {{RMAppManager#createAppSummary}} fills 
in the AM host name prior to node heartbeat.

I don't know how hard it would be to call {{RMAppManager#createAppSummary}} 
directly, but if it's possible, I think the unit test should do that and then 
check that the SummaryBuilder has the host and port filled in. Do you think 
that's possible?

> appMasterHost not set in RM ApplicationSummary when AM fails before first 
> heartbeat
> ---
>
> Key: YARN-10300
> URL: https://issues.apache.org/jira/browse/YARN-10300
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Eric Badger
>Assignee: Eric Badger
>Priority: Major
> Attachments: YARN-10300.001.patch, YARN-10300.002.patch
>
>
> {noformat}
> 2020-05-23 14:09:10,086 INFO resourcemanager.RMAppManager$ApplicationSummary: 
> appId=application_1586003420099_12444961,name=job_name,user=username,queue=queuename,state=FAILED,trackingUrl=https
>  
> ://cluster:port/applicationhistory/app/application_1586003420099_12444961,appMasterHost=N/A,startTime=1590241207309,finishTime=1590242950085,finalStatus=FAILED,memorySeconds=13750,vcoreSeconds=67,preemptedMemorySeconds=0,preemptedVcoreSeconds=0,preemptedAMContainers=0,preemptedNonAMContainers=0,preemptedResources=  vCores:0>,applicationType=MAPREDUCE
> {noformat}
> {{appMasterHost=N/A}} should have the AM hostname instead of N/A



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10300) appMasterHost not set in RM ApplicationSummary when AM fails before first heartbeat

2020-06-08 Thread Eric Badger (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17128713#comment-17128713
 ] 

Eric Badger commented on YARN-10300:


[~epayne], thanks for the review! In patch 003 I've modified an additional 
existing test to better test the {{createAppSummary()}} code. The unit test 
fails without the code change and succeeds with it.

> appMasterHost not set in RM ApplicationSummary when AM fails before first 
> heartbeat
> ---
>
> Key: YARN-10300
> URL: https://issues.apache.org/jira/browse/YARN-10300
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Eric Badger
>Assignee: Eric Badger
>Priority: Major
> Attachments: YARN-10300.001.patch, YARN-10300.002.patch, 
> YARN-10300.003.patch
>
>
> {noformat}
> 2020-05-23 14:09:10,086 INFO resourcemanager.RMAppManager$ApplicationSummary: 
> appId=application_1586003420099_12444961,name=job_name,user=username,queue=queuename,state=FAILED,trackingUrl=https
>  
> ://cluster:port/applicationhistory/app/application_1586003420099_12444961,appMasterHost=N/A,startTime=1590241207309,finishTime=1590242950085,finalStatus=FAILED,memorySeconds=13750,vcoreSeconds=67,preemptedMemorySeconds=0,preemptedVcoreSeconds=0,preemptedAMContainers=0,preemptedNonAMContainers=0,preemptedResources=  vCores:0>,applicationType=MAPREDUCE
> {noformat}
> {{appMasterHost=N/A}} should have the AM hostname instead of N/A



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10300) appMasterHost not set in RM ApplicationSummary when AM fails before first heartbeat

2020-06-08 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17128784#comment-17128784
 ] 

Hadoop QA commented on YARN-10300:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
43s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 49s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
38s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  1m 
44s{color} | {color:blue} Used deprecated FindBugs config; considering 
switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
42s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 49s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
43s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 87m 
48s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
31s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}148m 46s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: 
https://builds.apache.org/job/PreCommit-YARN-Build/26133/artifact/out/Dockerfile
 |
| JIRA Issue | YARN-10300 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/13005169/YARN-10300.003.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite 
unit shadedclient findbugs checkstyle |
| uname | Linux 6ac1c218b1c4 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / fbb87754306 |
| Default Java | Private Build-1.8.0_252-8u252-b09-1~18.04-b09 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/26133/testReport/ |
| Max. process+thread count | 886 (vs. ulimit of 5500) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 
hadoop-

[jira] [Commented] (YARN-10300) appMasterHost not set in RM ApplicationSummary when AM fails before first heartbeat

2020-06-09 Thread Eric Payne (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17129700#comment-17129700
 ] 

Eric Payne commented on YARN-10300:
---

+1

Thanks [~ebadger]. I will commit this afternoon or tomorrow morning.

> appMasterHost not set in RM ApplicationSummary when AM fails before first 
> heartbeat
> ---
>
> Key: YARN-10300
> URL: https://issues.apache.org/jira/browse/YARN-10300
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Eric Badger
>Assignee: Eric Badger
>Priority: Major
> Attachments: YARN-10300.001.patch, YARN-10300.002.patch, 
> YARN-10300.003.patch
>
>
> {noformat}
> 2020-05-23 14:09:10,086 INFO resourcemanager.RMAppManager$ApplicationSummary: 
> appId=application_1586003420099_12444961,name=job_name,user=username,queue=queuename,state=FAILED,trackingUrl=https
>  
> ://cluster:port/applicationhistory/app/application_1586003420099_12444961,appMasterHost=N/A,startTime=1590241207309,finishTime=1590242950085,finalStatus=FAILED,memorySeconds=13750,vcoreSeconds=67,preemptedMemorySeconds=0,preemptedVcoreSeconds=0,preemptedAMContainers=0,preemptedNonAMContainers=0,preemptedResources=  vCores:0>,applicationType=MAPREDUCE
> {noformat}
> {{appMasterHost=N/A}} should have the AM hostname instead of N/A



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10300) appMasterHost not set in RM ApplicationSummary when AM fails before first heartbeat

2020-06-09 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17129725#comment-17129725
 ] 

Hudson commented on YARN-10300:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #18342 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/18342/])
YARN-10300: appMasterHost not set in RM ApplicationSummary when AM fails 
(ericp: rev 56247db3022705635580c4d2f8b0abde109f954f)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestApplicationMasterLauncher.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestAppManager.java


> appMasterHost not set in RM ApplicationSummary when AM fails before first 
> heartbeat
> ---
>
> Key: YARN-10300
> URL: https://issues.apache.org/jira/browse/YARN-10300
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Eric Badger
>Assignee: Eric Badger
>Priority: Major
> Attachments: YARN-10300.001.patch, YARN-10300.002.patch, 
> YARN-10300.003.patch
>
>
> {noformat}
> 2020-05-23 14:09:10,086 INFO resourcemanager.RMAppManager$ApplicationSummary: 
> appId=application_1586003420099_12444961,name=job_name,user=username,queue=queuename,state=FAILED,trackingUrl=https
>  
> ://cluster:port/applicationhistory/app/application_1586003420099_12444961,appMasterHost=N/A,startTime=1590241207309,finishTime=1590242950085,finalStatus=FAILED,memorySeconds=13750,vcoreSeconds=67,preemptedMemorySeconds=0,preemptedVcoreSeconds=0,preemptedAMContainers=0,preemptedNonAMContainers=0,preemptedResources=  vCores:0>,applicationType=MAPREDUCE
> {noformat}
> {{appMasterHost=N/A}} should have the AM hostname instead of N/A



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10300) appMasterHost not set in RM ApplicationSummary when AM fails before first heartbeat

2020-06-11 Thread Eric Badger (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17133540#comment-17133540
 ] 

Eric Badger commented on YARN-10300:


Thanks, [~epayne] for the review/commit and [~Jim_Brennan] for the review!

> appMasterHost not set in RM ApplicationSummary when AM fails before first 
> heartbeat
> ---
>
> Key: YARN-10300
> URL: https://issues.apache.org/jira/browse/YARN-10300
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Eric Badger
>Assignee: Eric Badger
>Priority: Major
> Fix For: 3.2.2, 2.10.1, 3.4.0, 3.3.1, 3.1.5
>
> Attachments: YARN-10300.001.patch, YARN-10300.002.patch, 
> YARN-10300.003.patch
>
>
> {noformat}
> 2020-05-23 14:09:10,086 INFO resourcemanager.RMAppManager$ApplicationSummary: 
> appId=application_1586003420099_12444961,name=job_name,user=username,queue=queuename,state=FAILED,trackingUrl=https
>  
> ://cluster:port/applicationhistory/app/application_1586003420099_12444961,appMasterHost=N/A,startTime=1590241207309,finishTime=1590242950085,finalStatus=FAILED,memorySeconds=13750,vcoreSeconds=67,preemptedMemorySeconds=0,preemptedVcoreSeconds=0,preemptedAMContainers=0,preemptedNonAMContainers=0,preemptedResources=  vCores:0>,applicationType=MAPREDUCE
> {noformat}
> {{appMasterHost=N/A}} should have the AM hostname instead of N/A



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org