[jira] [Commented] (YARN-2902) Killing a container that is localizing can orphan resources in the DOWNLOADING state

2015-06-22 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14597265#comment-14597265
 ] 

Varun Saxena commented on YARN-2902:


Looking at the public localization code, I do not think public resources can be 
orphaned, because we do not stop their localization midway during container 
cleanup.
It's difficult to ascertain from the logs, though, why localization was failing 
for public resources in the scenario mentioned above. From what little I could 
glean from the code, I could not find anything concrete that explains the 
failures.

Anyway, the scope of this JIRA, i.e. orphaning of resources, would not apply to 
PUBLIC resources IMHO. And I guess there is no point in further delaying this 
JIRA hoping to find out what went wrong with public resources in the scenario above.

bq. What's not clear to me is whether the trigger was the public localization 
timing out or the stopContainer request
The reference count can become 0 if the container is killed while downloading.

Coming to the patch, there are two approaches to handle this:
# Cleanup of downloading resources can be done by the Localization Service while 
doing container cleanup.
# On heartbeat from the container localizer, if the localizer runner is already 
stopped, we can have the localizer runner do the cleanup of downloading 
resources.

The attached patch adopts approach 1.
Here, we wait for the container localizer to die before running the deletion 
tasks. Also, a downloading resource can be either in the local directory or in 
the local directory suffixed by {{_tmp}}, so we try both.
Moreover, a localization-failed event is sent to all the containers that are 
referring to a resource in the DOWNLOADING state.
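
A minimal sketch of the deletion step under these assumptions (the class and 
helper names here are hypothetical for illustration, not the attached patch):

{code:java}
import java.io.FileNotFoundException;
import java.io.IOException;
import org.apache.hadoop.fs.FileContext;
import org.apache.hadoop.fs.Path;

// Hypothetical sketch: once the container localizer has exited, remove
// whatever is left of a DOWNLOADING resource. The partial download may be
// in the resource's local directory or in that directory suffixed with
// "_tmp", so both candidates are tried.
class DownloadingResourceCleanup {
  void cleanup(FileContext lfs, Path localPath) throws IOException {
    Path tmpPath = new Path(localPath.toUri().toString() + "_tmp");
    for (Path candidate : new Path[] { localPath, tmpPath }) {
      try {
        lfs.delete(candidate, true); // recursive delete of the partial download
      } catch (FileNotFoundException ignored) {
        // nothing was ever written to this candidate location
      }
    }
  }
}
{code}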


> Killing a container that is localizing can orphan resources in the 
> DOWNLOADING state
> 
>
> Key: YARN-2902
> URL: https://issues.apache.org/jira/browse/YARN-2902
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Affects Versions: 2.5.0
>Reporter: Jason Lowe
>Assignee: Varun Saxena
> Attachments: YARN-2902.002.patch, YARN-2902.03.patch, YARN-2902.patch
>
>
> If a container is in the process of localizing when it is stopped/killed, then 
> resources are left in the DOWNLOADING state.  If no other container comes 
> along and requests these resources, they linger around with no reference 
> counts but aren't cleaned up during normal cache cleanup scans, since the scan 
> will never delete resources in the DOWNLOADING state even if their reference 
> count is zero.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3705) forcemanual transitionToStandby in RM-HA automatic-failover mode should change elector state

2015-06-22 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14597255#comment-14597255
 ] 

Masatake Iwasaki commented on YARN-3705:


The failure of TestRMRestart seems to be the same issue as YARN-2871.

> forcemanual transitionToStandby in RM-HA automatic-failover mode should 
> change elector state
> 
>
> Key: YARN-3705
> URL: https://issues.apache.org/jira/browse/YARN-3705
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
> Attachments: YARN-3705.001.patch, YARN-3705.002.patch, 
> YARN-3705.003.patch
>
>
> Executing {{rmadmin -transitionToStandby --forcemanual}} in 
> automatic-failover.enabled mode makes the ResourceManager standby while 
> keeping the state of the ActiveStandbyElector. It should make the elector 
> quit and rejoin so that other candidates can be promoted; otherwise, 
> forcemanual transition should not be allowed in automatic-failover mode, in 
> order to avoid confusion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3705) forcemanual transitionToStandby in RM-HA automatic-failover mode should change elector state

2015-06-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14597229#comment-14597229
 ] 

Hadoop QA commented on YARN-3705:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  16m 35s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 32s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 42s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m 13s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 31s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 17s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   7m  0s | Tests passed in 
hadoop-yarn-client. |
| {color:red}-1{color} | yarn tests |  50m 41s | Tests failed in 
hadoop-yarn-server-resourcemanager. |
| | |  97m 29s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | 
hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart |
|   | hadoop.yarn.server.resourcemanager.TestRMRestart |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12741212/YARN-3705.003.patch |
| Optional Tests | javac unit findbugs checkstyle javadoc |
| git revision | trunk / 99271b7 |
| hadoop-yarn-client test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8323/artifact/patchprocess/testrun_hadoop-yarn-client.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8323/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8323/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8323/console |


This message was automatically generated.

> forcemanual transitionToStandby in RM-HA automatic-failover mode should 
> change elector state
> 
>
> Key: YARN-3705
> URL: https://issues.apache.org/jira/browse/YARN-3705
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
> Attachments: YARN-3705.001.patch, YARN-3705.002.patch, 
> YARN-3705.003.patch
>
>
> Executing {{rmadmin -transitionToStandby --forcemanual}} in 
> automatic-failover.enabled mode makes the ResourceManager standby while 
> keeping the state of the ActiveStandbyElector. It should make the elector 
> quit and rejoin so that other candidates can be promoted; otherwise, 
> forcemanual transition should not be allowed in automatic-failover mode, in 
> order to avoid confusion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3800) Simplify inmemory state for ReservationAllocation

2015-06-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14597189#comment-14597189
 ] 

Hadoop QA commented on YARN-3800:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  20m  5s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 7 new or modified test files. |
| {color:green}+1{color} | javac |  10m 39s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  12m  6s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 28s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m  0s | The applied patch generated  1 
new checkstyle issues (total was 54, now 49). |
| {color:green}+1{color} | whitespace |   0m  5s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 58s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 37s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 35s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | yarn tests |  45m 33s | Tests failed in 
hadoop-yarn-server-resourcemanager. |
| | |  94m 11s | |
\\
\\
|| Reason || Tests ||
| Timed out tests | 
org.apache.hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer
 |
|   | org.apache.hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart 
|
|   | 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation
 |
|   | 
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesApps |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12741211/YARN-3800.003.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 99271b7 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/8321/artifact/patchprocess/diffcheckstylehadoop-yarn-server-resourcemanager.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8321/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8321/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf908.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8321/console |


This message was automatically generated.

> Simplify inmemory state for ReservationAllocation
> -
>
> Key: YARN-3800
> URL: https://issues.apache.org/jira/browse/YARN-3800
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacityscheduler, fairscheduler, resourcemanager
>Reporter: Anubhav Dhoot
>Assignee: Anubhav Dhoot
> Attachments: YARN-3800.001.patch, YARN-3800.002.patch, 
> YARN-3800.002.patch, YARN-3800.003.patch
>
>
> Instead of storing the ReservationRequest, we store the Resource for 
> allocations, as that's the only thing we need. Ultimately we convert 
> everything to resources anyway.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3838) Rest API failing when ip configured in RM address in secure https mode

2015-06-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14597187#comment-14597187
 ] 

Hadoop QA commented on YARN-3838:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  17m 52s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   7m 33s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 41s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 34s | The applied patch generated  1 
new checkstyle issues (total was 39, now 40). |
| {color:red}-1{color} | whitespace |   0m  0s | The patch has 2  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 35s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 32s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   3m 24s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | common tests |  22m  2s | Tests passed in 
hadoop-common. |
| {color:green}+1{color} | yarn tests |   1m 56s | Tests passed in 
hadoop-yarn-common. |
| | |  66m 50s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12740917/0001-YARN-3838.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 99271b7 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/8322/artifact/patchprocess/diffcheckstylehadoop-common.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-YARN-Build/8322/artifact/patchprocess/whitespace.txt
 |
| hadoop-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8322/artifact/patchprocess/testrun_hadoop-common.txt
 |
| hadoop-yarn-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8322/artifact/patchprocess/testrun_hadoop-yarn-common.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8322/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8322/console |


This message was automatically generated.

> Rest API failing when ip configured in RM address in secure https mode
> --
>
> Key: YARN-3838
> URL: https://issues.apache.org/jira/browse/YARN-3838
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>Priority: Critical
> Attachments: 0001-HADOOP-12096.patch, 0001-YARN-3810.patch, 
> 0001-YARN-3838.patch, 0002-YARN-3810.patch
>
>
> Steps to reproduce
> ===
> 1. Configure hadoop.http.authentication.kerberos.principal as below:
> {code:xml}
>   <property>
>     <name>hadoop.http.authentication.kerberos.principal</name>
>     <value>HTTP/_h...@hadoop.com</value>
>   </property>
> {code}
> 2. In the RM web address, also configure the IP.
> 3. Start up the RM.
> 4. Call the REST API for the RM: {{curl -i -k --insecure --negotiate -u : 
> "https://<IP>/ws/v1/cluster/info"}}
> *Actual*
> The REST API fails:
> {code}
> 2015-06-16 19:03:49,845 DEBUG 
> org.apache.hadoop.security.authentication.server.AuthenticationFilter: 
> Authentication exception: GSSException: No valid credentials provided 
> (Mechanism level: Failed to find any Kerberos credentails)
> org.apache.hadoop.security.authentication.client.AuthenticationException: 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos credentails)
>   at 
> org.apache.hadoop.security.authentication.server.KerberosAuthenticationHandler.authenticate(KerberosAuthenticationHandler.java:399)
>   at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationHandler.authenticate(DelegationTokenAuthenticationHandler.java:348)
>   at 
> org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:519)
>   at 
> org.apache.hadoop.yarn.server.security.http.RMAuthenticationFilter.doFilter(RMAuthenticationFilter.java:82)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-3705) forcemanual transitionToStandby in RM-HA automatic-failover mode should change elector state

2015-06-22 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14597176#comment-14597176
 ] 

Masatake Iwasaki commented on YARN-3705:


bq. ResourceManager#handleTransitionToStandBy is expected to be used only when 
automatic failover is enabled.

This was not true: it checks {{isHAEnabled}}, not 
{{isAutomaticFailoverEnabled}}. {{ResourceManager#handleTransitionToStandBy}} 
is a no-op if {{RMContext#isHAEnabled}} is false.

{code}
  public void handleTransitionToStandBy() {
if (rmContext.isHAEnabled()) {
  try {
// Transition to standby and reinit active services
LOG.info("Transitioning RM to Standby mode");
transitionToStandby(true);
adminService.resetLeaderElection();
return;
  } catch (Exception e) {
LOG.fatal("Failed to transition RM to Standby mode.");
ExitUtil.terminate(1, e);
  }
}
  }
{code}

It seems strange that doing nothing in transitionToStandby when {{isHAEnabled}} 
is false affects tests for HA...


> forcemanual transitionToStandby in RM-HA automatic-failover mode should 
> change elector state
> 
>
> Key: YARN-3705
> URL: https://issues.apache.org/jira/browse/YARN-3705
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
> Attachments: YARN-3705.001.patch, YARN-3705.002.patch, 
> YARN-3705.003.patch
>
>
> Executing {{rmadmin -transitionToStandby --forcemanual}} in 
> automatic-failover.enabled mode makes the ResourceManager standby while 
> keeping the state of the ActiveStandbyElector. It should make the elector 
> quit and rejoin so that other candidates can be promoted; otherwise, 
> forcemanual transition should not be allowed in automatic-failover mode, in 
> order to avoid confusion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3841) [Storage implementation] Create HDFS backing storage implementation for ATS writes

2015-06-22 Thread Tsuyoshi Ozawa (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi Ozawa updated YARN-3841:
-
Summary: [Storage implementation] Create HDFS backing storage 
implementation for ATS writes  (was: [Storage abstraction] Create HDFS backing 
storage implementation for ATS writes)

> [Storage implementation] Create HDFS backing storage implementation for ATS 
> writes
> --
>
> Key: YARN-3841
> URL: https://issues.apache.org/jira/browse/YARN-3841
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Tsuyoshi Ozawa
>Assignee: Tsuyoshi Ozawa
>
> HDFS backing storage is useful for the following scenarios:
> 1. For Hadoop clusters which don't run HBase.
> 2. For fallback from HBase when the HBase cluster is temporarily unavailable. 
> Quoting the ATS design document of YARN-2928:
> {quote}
> In the case the HBase
> storage is not available, the plugin should buffer the writes temporarily 
> (e.g. HDFS), and flush
> them once the storage comes back online. Reading and writing to hdfs as the 
> the backup storage
> could potentially use the HDFS writer plugin unless the complexity of 
> generalizing the HDFS
> writer plugin for this purpose exceeds the benefits of reusing it here.
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3705) forcemanual transitionToStandby in RM-HA automatic-failover mode should change elector state

2015-06-22 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated YARN-3705:
---
Attachment: YARN-3705.003.patch

The test failure is relevant. ResourceManager#handleTransitionToStandBy is 
expected to be used only when automatic failover is enabled. I am attaching 
003, addressing the non-automatic-failover case too.

> forcemanual transitionToStandby in RM-HA automatic-failover mode should 
> change elector state
> 
>
> Key: YARN-3705
> URL: https://issues.apache.org/jira/browse/YARN-3705
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
> Attachments: YARN-3705.001.patch, YARN-3705.002.patch, 
> YARN-3705.003.patch
>
>
> Executing {{rmadmin -transitionToStandby --forcemanual}} in 
> automatic-failover.enabled mode makes the ResourceManager standby while 
> keeping the state of the ActiveStandbyElector. It should make the elector 
> quit and rejoin so that other candidates can be promoted; otherwise, 
> forcemanual transition should not be allowed in automatic-failover mode, in 
> order to avoid confusion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3800) Simplify inmemory state for ReservationAllocation

2015-06-22 Thread Anubhav Dhoot (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anubhav Dhoot updated YARN-3800:

Attachment: YARN-3800.003.patch

fixed checkstyle

> Simplify inmemory state for ReservationAllocation
> -
>
> Key: YARN-3800
> URL: https://issues.apache.org/jira/browse/YARN-3800
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacityscheduler, fairscheduler, resourcemanager
>Reporter: Anubhav Dhoot
>Assignee: Anubhav Dhoot
> Attachments: YARN-3800.001.patch, YARN-3800.002.patch, 
> YARN-3800.002.patch, YARN-3800.003.patch
>
>
> Instead of storing the ReservationRequest, we store the Resource for 
> allocations, as that's the only thing we need. Ultimately we convert 
> everything to resources anyway.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3798) ZKRMStateStore shouldn't create new session without occurrence of SESSIONEXPIRED

2015-06-22 Thread Tsuyoshi Ozawa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596997#comment-14596997
 ] 

Tsuyoshi Ozawa commented on YARN-3798:
--

After the ZK server closes the client connection, the ZK client in 
ZKRMStateStore will receive CONNECTIONLOSS and handle it without creating a 
new session.
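
A minimal sketch of that behaviour, assuming a generic retry wrapper rather 
than the actual ZKRMStateStore code: CONNECTIONLOSS is retried on the existing 
session, and only SESSIONEXPIRED would justify a new one.

{code:java}
import java.util.concurrent.Callable;
import org.apache.zookeeper.KeeperException;

// Sketch only (not the actual ZKRMStateStore implementation): retry a ZK
// operation on CONNECTIONLOSS with the same session, and treat SESSIONEXPIRED
// as the only event that warrants creating a new ZooKeeper handle.
class ZkRetrySketch {
  private static final int MAX_RETRIES = 5;
  private static final long RETRY_INTERVAL_MS = 1000L;

  <T> T runWithRetries(Callable<T> op) throws Exception {
    for (int retry = 0; ; retry++) {
      try {
        return op.call();
      } catch (KeeperException.ConnectionLossException e) {
        if (retry >= MAX_RETRIES) {
          throw e;                        // give up, still without a new session
        }
        Thread.sleep(RETRY_INTERVAL_MS);  // let the client reconnect on its own
      } catch (KeeperException.SessionExpiredException e) {
        // Only here would a brand-new session be created (omitted in this sketch).
        throw e;
      }
    }
  }
}
{code}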

> ZKRMStateStore shouldn't create new session without occurrence of 
> SESSIONEXPIRED
> ---
>
> Key: YARN-3798
> URL: https://issues.apache.org/jira/browse/YARN-3798
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.0
> Environment: Suse 11 Sp3
>Reporter: Bibin A Chundatt
>Assignee: Varun Saxena
>Priority: Blocker
> Attachments: RM.log, YARN-3798-2.7.002.patch, 
> YARN-3798-branch-2.7.002.patch, YARN-3798-branch-2.7.patch
>
>
> RM goes down with a NoNode exception during creation of the znode for an app attempt.
> *Please find the exception logs*
> {code}
> 2015-06-09 10:09:44,732 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: 
> ZKRMStateStore Session connected
> 2015-06-09 10:09:44,732 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: 
> ZKRMStateStore Session restored
> 2015-06-09 10:09:44,886 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: 
> Exception while executing a ZK operation.
> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:115)
>   at org.apache.zookeeper.ZooKeeper.multiInternal(ZooKeeper.java:1405)
>   at org.apache.zookeeper.ZooKeeper.multi(ZooKeeper.java:1310)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$4.run(ZKRMStateStore.java:926)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$4.run(ZKRMStateStore.java:923)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithCheck(ZKRMStateStore.java:1101)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithRetries(ZKRMStateStore.java:1122)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doStoreMultiWithRetries(ZKRMStateStore.java:923)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doStoreMultiWithRetries(ZKRMStateStore.java:937)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.createWithRetries(ZKRMStateStore.java:970)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.updateApplicationAttemptStateInternal(ZKRMStateStore.java:671)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$UpdateAppAttemptTransition.transition(RMStateStore.java:275)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$UpdateAppAttemptTransition.transition(RMStateStore.java:260)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.handleStoreEvent(RMStateStore.java:837)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:900)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:895)
>   at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:175)
>   at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:108)
>   at java.lang.Thread.run(Thread.java:745)
> 2015-06-09 10:09:44,887 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: Maxed 
> out ZK retries. Giving up!
> 2015-06-09 10:09:44,887 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Error 
> updating appAttempt: appattempt_1433764310492_7152_01
> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:115)
>   at org.apache.zookeeper.ZooKeeper.multiInternal(ZooKeeper.java:1405)
>   at org.apache.zookeeper.ZooKeeper.multi(ZooKeeper.java:1310)
>   at 
> org.apache.hadoop.yarn.server.resourcemana

[jira] [Commented] (YARN-3792) Test case failures in TestDistributedShell and some issue fixes related to ATSV2

2015-06-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596994#comment-14596994
 ] 

Hadoop QA commented on YARN-3792:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  17m 29s | Findbugs (version ) appears to 
be broken on YARN-2928. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 2 new or modified test files. |
| {color:green}+1{color} | javac |   7m 49s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m  4s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 26s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m 38s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  1s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 43s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 40s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   5m 59s | The patch appears to introduce 7 
new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   8m 10s | Tests passed in 
hadoop-yarn-applications-distributedshell. |
| {color:green}+1{color} | yarn tests |   1m 58s | Tests passed in 
hadoop-yarn-common. |
| {color:green}+1{color} | yarn tests |   6m 11s | Tests passed in 
hadoop-yarn-server-nodemanager. |
| {color:red}-1{color} | yarn tests |  51m 49s | Tests failed in 
hadoop-yarn-server-resourcemanager. |
| {color:green}+1{color} | yarn tests |   1m 17s | Tests passed in 
hadoop-yarn-server-timelineservice. |
| | | 115m 20s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-yarn-server-resourcemanager |
| Failed unit tests | 
hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12741171/YARN-3792-YARN-2928.004.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | YARN-2928 / 8c036a1 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-YARN-Build/8319/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
 |
| hadoop-yarn-applications-distributedshell test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8319/artifact/patchprocess/testrun_hadoop-yarn-applications-distributedshell.txt
 |
| hadoop-yarn-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8319/artifact/patchprocess/testrun_hadoop-yarn-common.txt
 |
| hadoop-yarn-server-nodemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8319/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8319/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| hadoop-yarn-server-timelineservice test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8319/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8319/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8319/console |


This message was automatically generated.

> Test case failures in TestDistributedShell and some issue fixes related to 
> ATSV2
> 
>
> Key: YARN-3792
> URL: https://issues.apache.org/jira/browse/YARN-3792
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
> Attachments: YARN-3792-YARN-2928.001.patch, 
> YARN-3792-YARN-2928.002.patch, YARN-3792-YARN-2928.003.patch, 
> YARN-3792-YARN-2928.004.patch
>
>
> # Encountered [testcase 
> failures|https://builds.apache.org/job/PreCommit-YARN-Build/8233/testReport/] 
> which were happening even without the patch modifications in YARN-3044:
> TestDistributedShell.testDSShellWithoutDomainV2CustomizedFlow
> TestDistributedShell.testDSShellWithoutDomainV2DefaultFlow
> TestDistributedShellWithNodeLabels.testDSShellWithNodeLabelExpression
> # Remove unused {{enableATSV1}} in TestDistributedShell
> # Container metrics need to be published only for the v2 test cases of 
> TestDistributedShell
> # A NullPointerException was thrown in TimelineClientImpl.constructResURI when 
> the aux service was not configured and {{TimelineClient.putObjects}} was 
> getting invoked.
> # Race condition between the application events being published and the test 
> case verification of the RM's ApplicationFinished timeline events
> # Application tags are converted to lowercase in 
> ApplicationSubmissionContextPBImpl, hence RMTimelineCollector was not able to 
> detect the custom flow details of the app.

[jira] [Commented] (YARN-3798) ZKRMStateStore shouldn't create new session without occurrence of SESSIONEXPIRED

2015-06-22 Thread Tsuyoshi Ozawa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596990#comment-14596990
 ] 

Tsuyoshi Ozawa commented on YARN-3798:
--

[~vinodkv] the patch only applies to branch-2.7 because ZKRMStateStore in 2.8 
or later uses Apache Curator. I'm running the tests locally under 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager,
 so I'll report the result manually. Double checking is welcome.

> ZKRMStateStore shouldn't create new session without occurrence of 
> SESSIONEXPIRED
> ---
>
> Key: YARN-3798
> URL: https://issues.apache.org/jira/browse/YARN-3798
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.0
> Environment: Suse 11 Sp3
>Reporter: Bibin A Chundatt
>Assignee: Varun Saxena
>Priority: Blocker
> Attachments: RM.log, YARN-3798-2.7.002.patch, 
> YARN-3798-branch-2.7.002.patch, YARN-3798-branch-2.7.patch
>
>
> RM goes down with a NoNode exception during creation of the znode for an app attempt.
> *Please find the exception logs*
> {code}
> 2015-06-09 10:09:44,732 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: 
> ZKRMStateStore Session connected
> 2015-06-09 10:09:44,732 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: 
> ZKRMStateStore Session restored
> 2015-06-09 10:09:44,886 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: 
> Exception while executing a ZK operation.
> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:115)
>   at org.apache.zookeeper.ZooKeeper.multiInternal(ZooKeeper.java:1405)
>   at org.apache.zookeeper.ZooKeeper.multi(ZooKeeper.java:1310)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$4.run(ZKRMStateStore.java:926)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$4.run(ZKRMStateStore.java:923)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithCheck(ZKRMStateStore.java:1101)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithRetries(ZKRMStateStore.java:1122)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doStoreMultiWithRetries(ZKRMStateStore.java:923)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doStoreMultiWithRetries(ZKRMStateStore.java:937)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.createWithRetries(ZKRMStateStore.java:970)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.updateApplicationAttemptStateInternal(ZKRMStateStore.java:671)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$UpdateAppAttemptTransition.transition(RMStateStore.java:275)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$UpdateAppAttemptTransition.transition(RMStateStore.java:260)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.handleStoreEvent(RMStateStore.java:837)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:900)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:895)
>   at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:175)
>   at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:108)
>   at java.lang.Thread.run(Thread.java:745)
> 2015-06-09 10:09:44,887 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: Maxed 
> out ZK retries. Giving up!
> 2015-06-09 10:09:44,887 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Error 
> updating appAttempt: appattempt_1433764310492_7152_01
> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:115)
>   at org.apache.zookeeper.ZooKeepe

[jira] [Commented] (YARN-3705) forcemanual transitionToStandby in RM-HA automatic-failover mode should change elector state

2015-06-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596989#comment-14596989
 ] 

Hadoop QA commented on YARN-3705:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  16m 55s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 41s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 41s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m 15s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 31s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 32s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 18s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | yarn tests |   5m 41s | Tests failed in 
hadoop-yarn-client. |
| {color:green}+1{color} | yarn tests |  50m 55s | Tests passed in 
hadoop-yarn-server-resourcemanager. |
| | |  96m 54s | |
\\
\\
|| Reason || Tests ||
| Timed out tests | 
org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12741178/YARN-3705.002.patch |
| Optional Tests | javac unit findbugs checkstyle javadoc |
| git revision | trunk / fac4e04 |
| hadoop-yarn-client test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8320/artifact/patchprocess/testrun_hadoop-yarn-client.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8320/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8320/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8320/console |


This message was automatically generated.

> forcemanual transitionToStandby in RM-HA automatic-failover mode should 
> change elector state
> 
>
> Key: YARN-3705
> URL: https://issues.apache.org/jira/browse/YARN-3705
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
> Attachments: YARN-3705.001.patch, YARN-3705.002.patch
>
>
> Executing {{rmadmin -transitionToStandby --forcemanual}} in 
> automatic-failover.enabled mode makes the ResourceManager standby while 
> keeping the state of the ActiveStandbyElector. It should make the elector 
> quit and rejoin so that other candidates can be promoted; otherwise, 
> forcemanual transition should not be allowed in automatic-failover mode, in 
> order to avoid confusion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3798) ZKRMStateStore shouldn't create new session without occurrence of SESSIONEXPIRED

2015-06-22 Thread Tsuyoshi Ozawa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596987#comment-14596987
 ] 

Tsuyoshi Ozawa commented on YARN-3798:
--

[~zxu] In the case of SessionMovedException, I think the ZK client should 
automatically retry connecting to another ZK server with the same session id, 
without creating a new session. If we create a new session for 
SessionMovedException, we'll face the same issue as Bibin and Varun reported. 
With the new patch, SessionMovedException is handled within the same session. 
After we get SessionMovedException, the ZK client in ZKRMStateStore waits for 
the specified period to pass and then retries the operations. At that time, the 
ZK server should detect that the session has moved and close the client 
connection, as the ZooKeeper documentation mentions: 
http://zookeeper.apache.org/doc/r3.4.0/zookeeperProgrammers.html#ch_zkSessions
{quote}
When the delayed packet arrives at the first server, the old server detects 
that the session has moved, and closes the client connection.
{quote}

If this behaviour is not the same as described, we should fix ZooKeeper.
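
To make the intended classification concrete, here is a minimal sketch under 
assumptions (not the attached patch): SESSIONMOVED is grouped with the codes 
that are retried on the same session after a pause, and only SESSIONEXPIRED 
triggers a new session.

{code:java}
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.KeeperException.Code;

// Sketch only: classify ZooKeeper errors. SESSIONMOVED is retried on the
// existing session because the old server is expected to notice the moved
// session and close the stale connection, as the quoted documentation says.
final class ZkErrorClassification {
  static boolean retryOnSameSession(KeeperException e) {
    Code code = e.code();
    return code == Code.CONNECTIONLOSS
        || code == Code.OPERATIONTIMEOUT
        || code == Code.SESSIONMOVED;
  }

  static boolean needsNewSession(KeeperException e) {
    return e.code() == Code.SESSIONEXPIRED;
  }
}
{code}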

> ZKRMStateStore shouldn't create new session without occurrence of 
> SESSIONEXPIRED
> ---
>
> Key: YARN-3798
> URL: https://issues.apache.org/jira/browse/YARN-3798
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.0
> Environment: Suse 11 Sp3
>Reporter: Bibin A Chundatt
>Assignee: Varun Saxena
>Priority: Blocker
> Attachments: RM.log, YARN-3798-2.7.002.patch, 
> YARN-3798-branch-2.7.002.patch, YARN-3798-branch-2.7.patch
>
>
> RM goes down with a NoNode exception during creation of the znode for an app attempt.
> *Please find the exception logs*
> {code}
> 2015-06-09 10:09:44,732 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: 
> ZKRMStateStore Session connected
> 2015-06-09 10:09:44,732 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: 
> ZKRMStateStore Session restored
> 2015-06-09 10:09:44,886 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: 
> Exception while executing a ZK operation.
> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:115)
>   at org.apache.zookeeper.ZooKeeper.multiInternal(ZooKeeper.java:1405)
>   at org.apache.zookeeper.ZooKeeper.multi(ZooKeeper.java:1310)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$4.run(ZKRMStateStore.java:926)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$4.run(ZKRMStateStore.java:923)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithCheck(ZKRMStateStore.java:1101)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithRetries(ZKRMStateStore.java:1122)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doStoreMultiWithRetries(ZKRMStateStore.java:923)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doStoreMultiWithRetries(ZKRMStateStore.java:937)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.createWithRetries(ZKRMStateStore.java:970)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.updateApplicationAttemptStateInternal(ZKRMStateStore.java:671)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$UpdateAppAttemptTransition.transition(RMStateStore.java:275)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$UpdateAppAttemptTransition.transition(RMStateStore.java:260)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.handleStoreEvent(RMStateStore.java:837)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:900)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:895)
>   at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:175)
>   at 
> org.apache.hadoop.yarn.event.As

[jira] [Commented] (YARN-3835) hadoop-yarn-server-resourcemanager test package bundles core-site.xml, yarn-site.xml

2015-06-22 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596963#comment-14596963
 ] 

Hudson commented on YARN-3835:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8051 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8051/])
YARN-3835. hadoop-yarn-server-resourcemanager test package bundles 
core-site.xml, yarn-site.xml (vamsee via rkanter) (rkanter: rev 
99271b762129d78c86f3c9733a24c77962b0b3f7)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/pom.xml


> hadoop-yarn-server-resourcemanager test package bundles core-site.xml, 
> yarn-site.xml
> 
>
> Key: YARN-3835
> URL: https://issues.apache.org/jira/browse/YARN-3835
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.6.0
>Reporter: Vamsee Yarlagadda
>Assignee: Vamsee Yarlagadda
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: YARN-3835.patch
>
>
> It looks like, by default, YARN bundles core-site.xml and yarn-site.xml in the 
> test artifact of hadoop-yarn-server-resourcemanager, which means that any 
> downstream project which uses this as a dependency can have a problem picking 
> up the user-supplied/environment-supplied core-site.xml and yarn-site.xml.
> So we should ideally exclude these .xml files from being bundled into the 
> test-jar (similar to YARN-1748).
> I also proactively looked at other YARN modules where this might be 
> happening. 
> {code}
> vamsee-MBP:hadoop-yarn-project vamsee$ find . -name "*-site.xml"
> ./hadoop-yarn/conf/yarn-site.xml
> ./hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/test/resources/yarn-site.xml
> ./hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-unmanaged-am-launcher/src/test/resources/yarn-site.xml
> ./hadoop-yarn/hadoop-yarn-client/src/test/resources/core-site.xml
> ./hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/core-site.xml
> ./hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/core-site.xml
> ./hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/yarn-site.xml
> ./hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/target/test-classes/core-site.xml
> ./hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/target/test-classes/yarn-site.xml
> ./hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/resources/core-site.xml
> {code}
> And out of these, only two modules (hadoop-yarn-server-resourcemanager, 
> hadoop-yarn-server-tests) are building test-jars. In the future, if we start 
> building test-jars of other modules, we should exclude these XML files from 
> being bundled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3800) Simplify inmemory state for ReservationAllocation

2015-06-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596957#comment-14596957
 ] 

Hadoop QA commented on YARN-3800:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  16m  6s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 7 new or modified test files. |
| {color:green}+1{color} | javac |   7m 37s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 47s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   0m 50s | The applied patch generated  7 
new checkstyle issues (total was 55, now 56). |
| {color:green}+1{color} | whitespace |   0m  4s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 37s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 27s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | yarn tests |  51m  0s | Tests failed in 
hadoop-yarn-server-resourcemanager. |
| | |  89m 29s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | 
hadoop.yarn.server.resourcemanager.scheduler.fair.TestAllocationFileLoaderService
 |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12741165/YARN-3800.002.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / fac4e04 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/8318/artifact/patchprocess/diffcheckstylehadoop-yarn-server-resourcemanager.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8318/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8318/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8318/console |


This message was automatically generated.

> Simplify inmemory state for ReservationAllocation
> -
>
> Key: YARN-3800
> URL: https://issues.apache.org/jira/browse/YARN-3800
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacityscheduler, fairscheduler, resourcemanager
>Reporter: Anubhav Dhoot
>Assignee: Anubhav Dhoot
> Attachments: YARN-3800.001.patch, YARN-3800.002.patch, 
> YARN-3800.002.patch
>
>
> Instead of storing the ReservationRequest, we store the Resource for 
> allocations, as that's the only thing we need. Ultimately we convert 
> everything to resources anyway.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3842) NMProxy should retry on NMNotYetReadyException

2015-06-22 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596946#comment-14596946
 ] 

Hudson commented on YARN-3842:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8050 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8050/])
YARN-3842. NMProxy should retry on NMNotYetReadyException. (Robert Kanter via 
kasha) (kasha: rev 5ebf2817e58e1be8214dc1916a694a912075aa0a)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/ServerProxy.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestNMProxy.java
* hadoop-yarn-project/CHANGES.txt
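
For readers less familiar with the retry machinery, a hedged sketch of the 
general idea (using Hadoop's {{RetryPolicies}} API; the constants and exact 
policy composition here are illustrative assumptions, not the committed 
ServerProxy change):

{code:java}
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;
import org.apache.hadoop.io.retry.RetryPolicies;
import org.apache.hadoop.io.retry.RetryPolicy;
import org.apache.hadoop.yarn.exceptions.NMNotYetReadyException;

// Illustrative only: keep retrying container-management calls while a
// restarted NM has not yet re-registered, instead of surfacing
// NMNotYetReadyException to the caller as a failure.
final class NmRetryPolicySketch {
  static RetryPolicy create(long maxWaitMs, long retryIntervalMs) {
    RetryPolicy waitAndRetry = RetryPolicies.retryUpToMaximumTimeWithFixedSleep(
        maxWaitMs, retryIntervalMs, TimeUnit.MILLISECONDS);
    Map<Class<? extends Exception>, RetryPolicy> exceptionToPolicy =
        new HashMap<Class<? extends Exception>, RetryPolicy>();
    // NMNotYetReadyException means the NM is up but not registered yet: retry.
    exceptionToPolicy.put(NMNotYetReadyException.class, waitAndRetry);
    return RetryPolicies.retryByException(
        RetryPolicies.TRY_ONCE_THEN_FAIL, exceptionToPolicy);
  }
}
{code}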


> NMProxy should retry on NMNotYetReadyException
> --
>
> Key: YARN-3842
> URL: https://issues.apache.org/jira/browse/YARN-3842
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.0
>Reporter: Karthik Kambatla
>Assignee: Robert Kanter
>Priority: Critical
> Fix For: 2.7.1
>
> Attachments: MAPREDUCE-6409.001.patch, MAPREDUCE-6409.002.patch, 
> YARN-3842.001.patch, YARN-3842.002.patch
>
>
> Consider the following scenario:
> 1. RM assigns a container on node N to an app A.
> 2. Node N is restarted.
> 3. A tries to launch the container on node N.
> Step 3 could lead to an NMNotYetReadyException depending on whether NM N has 
> registered with the RM. In MR, this is considered a task attempt failure. A 
> few of these could lead to a task/job failure.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3748) Cleanup Findbugs volatile warnings

2015-06-22 Thread Gabor Liptak (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596936#comment-14596936
 ] 

Gabor Liptak commented on YARN-3748:


Are any other changes needed before this can be considered for commit? Thanks!

> Cleanup Findbugs volatile warnings
> --
>
> Key: YARN-3748
> URL: https://issues.apache.org/jira/browse/YARN-3748
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Gabor Liptak
>Priority: Minor
> Attachments: YARN-3748.1.patch, YARN-3748.2.patch, YARN-3748.3.patch, 
> YARN-3748.5.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3842) NMProxy should retry on NMNotYetReadyException

2015-06-22 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-3842:
---
Summary: NMProxy should retry on NMNotYetReadyException  (was: NM restarts 
could lead to app failures)

> NMProxy should retry on NMNotYetReadyException
> --
>
> Key: YARN-3842
> URL: https://issues.apache.org/jira/browse/YARN-3842
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.0
>Reporter: Karthik Kambatla
>Assignee: Robert Kanter
>Priority: Critical
> Attachments: MAPREDUCE-6409.001.patch, MAPREDUCE-6409.002.patch, 
> YARN-3842.001.patch, YARN-3842.002.patch
>
>
> Consider the following scenario:
> 1. RM assigns a container on node N to an app A.
> 2. Node N is restarted.
> 3. A tries to launch the container on node N.
> Step 3 could lead to an NMNotYetReadyException depending on whether NM N has 
> registered with the RM. In MR, this is considered a task attempt failure. A 
> few of these could lead to a task/job failure.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3792) Test case failures in TestDistributedShell and some issue fixes related to ATSV2

2015-06-22 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596925#comment-14596925
 ] 

Sangjin Lee commented on YARN-3792:
---

The latest patch LGTM. Once Jenkins comes back, I'll go ahead and merge it.

Folks, do let me know soon if you have any other feedback. Thanks!

> Test case failures in TestDistributedShell and some issue fixes related to 
> ATSV2
> 
>
> Key: YARN-3792
> URL: https://issues.apache.org/jira/browse/YARN-3792
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
> Attachments: YARN-3792-YARN-2928.001.patch, 
> YARN-3792-YARN-2928.002.patch, YARN-3792-YARN-2928.003.patch, 
> YARN-3792-YARN-2928.004.patch
>
>
> # Encountered [testcase 
> failures|https://builds.apache.org/job/PreCommit-YARN-Build/8233/testReport/] 
> which were happening even without the patch modifications in YARN-3044:
> TestDistributedShell.testDSShellWithoutDomainV2CustomizedFlow
> TestDistributedShell.testDSShellWithoutDomainV2DefaultFlow
> TestDistributedShellWithNodeLabels.testDSShellWithNodeLabelExpression
> # Remove unused {{enableATSV1}} in TestDistributedShell
> # Container metrics need to be published only for the v2 test cases of 
> TestDistributedShell
> # A NullPointerException was thrown in TimelineClientImpl.constructResURI when 
> the aux service was not configured and {{TimelineClient.putObjects}} was 
> getting invoked.
> # Race condition between the application events being published and the test 
> case verification of the RM's ApplicationFinished timeline events
> # Application tags are converted to lowercase in 
> ApplicationSubmissionContextPBImpl, hence RMTimelineCollector was not able to 
> detect the custom flow details of the app.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2801) Documentation development for Node labels requirment

2015-06-22 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596910#comment-14596910
 ] 

Naganarasimha G R commented on YARN-2801:
-

Hi [~leftnoteasy],
After escaping the links, it seems to be getting applied. A few nits:
* ??User need configure how many resources?? => {{User need configure how much 
resource of each partition}}
* the points in the note after the configuration section need to come as a list
* ??application can use following Java APIs?? => ??Application can use 
following Java APIs??

Apart from these, the rest looks fine!

> Documentation development for Node labels requirment
> 
>
> Key: YARN-2801
> URL: https://issues.apache.org/jira/browse/YARN-2801
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: documentation
>Reporter: Gururaj Shetty
>Assignee: Wangda Tan
> Attachments: YARN-2801.1.patch, YARN-2801.2.patch
>
>
> Documentation needs to be developed for the node label requirements.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3635) Get-queue-mapping should be a common interface of YarnScheduler

2015-06-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596894#comment-14596894
 ] 

Hadoop QA commented on YARN-3635:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  16m 25s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 3 new or modified test files. |
| {color:green}+1{color} | javac |   7m 55s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m  0s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 24s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   0m 46s | The applied patch generated  
18 new checkstyle issues (total was 204, now 215). |
| {color:red}-1{color} | whitespace |   0m  3s | The patch has 15  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 40s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   1m 30s | The patch appears to introduce 3 
new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | yarn tests |  50m 14s | Tests failed in 
hadoop-yarn-server-resourcemanager. |
| | |  89m 34s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-yarn-server-resourcemanager |
| Failed unit tests | 
hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12741145/YARN-3635.5.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 077250d |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/8316/artifact/patchprocess/diffcheckstylehadoop-yarn-server-resourcemanager.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-YARN-Build/8316/artifact/patchprocess/whitespace.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-YARN-Build/8316/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8316/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8316/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8316/console |


This message was automatically generated.

> Get-queue-mapping should be a common interface of YarnScheduler
> ---
>
> Key: YARN-3635
> URL: https://issues.apache.org/jira/browse/YARN-3635
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-3635.1.patch, YARN-3635.2.patch, YARN-3635.3.patch, 
> YARN-3635.4.patch, YARN-3635.5.patch
>
>
> Currently, both of fair/capacity scheduler support queue mapping, which makes 
> scheduler can change queue of an application after submitted to scheduler.
> One issue of doing this in specific scheduler is: If the queue after mapping 
> has different maximum_allocation/default-node-label-expression of the 
> original queue, {{validateAndCreateResourceRequest}} in RMAppManager checks 
> the wrong queue.
> I propose to make the queue mapping as a common interface of scheduler, and 
> RMAppManager set the queue after mapping before doing validations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3705) forcemanual transitionToStandby in RM-HA automatic-failover mode should change elector state

2015-06-22 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated YARN-3705:
---
Attachment: YARN-3705.002.patch

I've attached 002, addressing the whitespace warnings. TestWorkPreservingRMRestart is 
not related to the code path the patch fixes.

> forcemanual transitionToStandby in RM-HA automatic-failover mode should 
> change elector state
> 
>
> Key: YARN-3705
> URL: https://issues.apache.org/jira/browse/YARN-3705
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
> Attachments: YARN-3705.001.patch, YARN-3705.002.patch
>
>
> Executing {{rmadmin -transitionToStandby --forcemanual}} in 
> automatic-failover.enabled mode makes the ResourceManager standby while keeping 
> the state of ActiveStandbyElector. It should make the elector quit and rejoin 
> so that other candidates can be promoted; otherwise, forcemanual 
> transition should not be allowed in automatic-failover mode, in order to avoid 
> confusion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2801) Documentation development for Node labels requirment

2015-06-22 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596886#comment-14596886
 ] 

Naganarasimha G R commented on YARN-2801:
-

Hi [~leftnoteasy], it seems that mvn site is failing after applying the patch.

> Documentation development for Node labels requirment
> 
>
> Key: YARN-2801
> URL: https://issues.apache.org/jira/browse/YARN-2801
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: documentation
>Reporter: Gururaj Shetty
>Assignee: Wangda Tan
> Attachments: YARN-2801.1.patch, YARN-2801.2.patch
>
>
> Documentation needs to be developed for the node label requirements.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2801) Documentation development for Node labels requirment

2015-06-22 Thread Naganarasimha G R (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naganarasimha G R updated YARN-2801:

Assignee: Wangda Tan  (was: Naganarasimha G R)

> Documentation development for Node labels requirment
> 
>
> Key: YARN-2801
> URL: https://issues.apache.org/jira/browse/YARN-2801
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: documentation
>Reporter: Gururaj Shetty
>Assignee: Wangda Tan
> Attachments: YARN-2801.1.patch, YARN-2801.2.patch
>
>
> Documentation needs to be developed for the node label requirements.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (YARN-2801) Documentation development for Node labels requirment

2015-06-22 Thread Naganarasimha G R (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naganarasimha G R reassigned YARN-2801:
---

Assignee: Naganarasimha G R  (was: Wangda Tan)

> Documentation development for Node labels requirment
> 
>
> Key: YARN-2801
> URL: https://issues.apache.org/jira/browse/YARN-2801
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: documentation
>Reporter: Gururaj Shetty
>Assignee: Naganarasimha G R
> Attachments: YARN-2801.1.patch, YARN-2801.2.patch
>
>
> Documentation needs to be developed for the node label requirements.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3792) Test case failures in TestDistributedShell and some issue fixes related to ATSV2

2015-06-22 Thread Naganarasimha G R (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naganarasimha G R updated YARN-3792:

Attachment: YARN-3792-YARN-2928.004.patch

Hi [~sjlee0], 
I corrected the whitespace and the findbugs issue in Client.java and am attaching a 
patch for it. The remaining warnings do not seem to be a problem; addressing them 
would only add unnecessary checks.

> Test case failures in TestDistributedShell and some issue fixes related to 
> ATSV2
> 
>
> Key: YARN-3792
> URL: https://issues.apache.org/jira/browse/YARN-3792
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
> Attachments: YARN-3792-YARN-2928.001.patch, 
> YARN-3792-YARN-2928.002.patch, YARN-3792-YARN-2928.003.patch, 
> YARN-3792-YARN-2928.004.patch
>
>
> # encountered [testcase 
> failures|https://builds.apache.org/job/PreCommit-YARN-Build/8233/testReport/] 
> which was happening even without the patch modifications in YARN-3044
> TestDistributedShell.testDSShellWithoutDomainV2CustomizedFlow
> TestDistributedShell.testDSShellWithoutDomainV2DefaultFlow
> TestDistributedShellWithNodeLabels.testDSShellWithNodeLabelExpression
> # Remove unused {{enableATSV1}} in testDisstributedShell
> # container metrics needs to be published only for v2 test cases of 
> testDisstributedShell
> # Nullpointer was thrown in TimelineClientImpl.constructResURI when Aux 
> service was not configured and {{TimelineClient.putObjects}} was getting 
> invoked.
> # Race condition for the Application events to published and test case 
> verification for RM's ApplicationFinished Timeline Events
> # Application Tags for converted to lowercase in 
> ApplicationSubmissionContextPBimpl, hence RMTimelinecollector was not able to 
> detect to custom flow details of the app



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3842) NM restarts could lead to app failures

2015-06-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596867#comment-14596867
 ] 

Hadoop QA commented on YARN-3842:
-

\\
\\
| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  17m 16s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 38s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 37s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m 28s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 35s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 32s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 47s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   1m 56s | Tests passed in 
hadoop-yarn-common. |
| {color:green}+1{color} | yarn tests |   6m 27s | Tests passed in 
hadoop-yarn-server-nodemanager. |
| | |  49m 43s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12741154/YARN-3842.002.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 077250d |
| hadoop-yarn-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8317/artifact/patchprocess/testrun_hadoop-yarn-common.txt
 |
| hadoop-yarn-server-nodemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8317/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8317/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8317/console |


This message was automatically generated.

> NM restarts could lead to app failures
> --
>
> Key: YARN-3842
> URL: https://issues.apache.org/jira/browse/YARN-3842
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.0
>Reporter: Karthik Kambatla
>Assignee: Robert Kanter
>Priority: Critical
> Attachments: MAPREDUCE-6409.001.patch, MAPREDUCE-6409.002.patch, 
> YARN-3842.001.patch, YARN-3842.002.patch
>
>
> Consider the following scenario:
> 1. RM assigns a container on node N to an app A.
> 2. Node N is restarted
> 3. A tries to launch container on node N.
> 3 could lead to an NMNotYetReadyException depending on whether NM N has 
> registered with the RM. In MR, this is considered a task attempt failure. A 
> few of these could lead to a task/job failure.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-3843) Fair Scheduler should not accept apps with space keys as queue name

2015-06-22 Thread Dongwook Kwon (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongwook Kwon resolved YARN-3843.
-
  Resolution: Duplicate
   Fix Version/s: 2.8.0
Target Version/s: 2.8.0

> Fair Scheduler should not accept apps with space keys as queue name
> ---
>
> Key: YARN-3843
> URL: https://issues.apache.org/jira/browse/YARN-3843
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.4.0, 2.5.0
>Reporter: Dongwook Kwon
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: YARN-3843.01.patch
>
>
> As YARN-461, since empty string queue name is not valid, queue name with 
> space keys such as " " ,"   " should not be accepted either, also not as 
> prefix nor postfix. 
> e.g) "root.test.queuename  ", or "root.test. queuename"
> I have 2 specific cases kill RM with these space keys as part of queue name.
> 1) Without placement policy (hadoop 2.4.0 and above), 
> When a job is submitted with " "(space key) as queue name
> e.g) mapreduce.job.queuename=" "
> 2) With placement policy (hadoop 2.5.0 and above)
>  Once a job is submitted without space key as queue name, and submit another 
> job with space key.
> e.g) 1st time: mapreduce.job.queuename="root.test.user1" 
> 2nd time: mapreduce.job.queuename="root.test.user1 "
> {code}
> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.974 sec <<< 
> FAILURE! - in 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler
> testQueueNameWithSpace(org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler)
>   Time elapsed: 0.724 sec  <<< ERROR!
> org.apache.hadoop.metrics2.MetricsException: Metrics source 
> QueueMetrics,q0=root,q1=adhoc,q2=birvine already exists!
>   at 
> org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.newSourceName(DefaultMetricsSystem.java:135)
>   at 
> org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.sourceName(DefaultMetricsSystem.java:112)
>   at 
> org.apache.hadoop.metrics2.impl.MetricsSystemImpl.register(MetricsSystemImpl.java:218)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSQueueMetrics.forQueue(FSQueueMetrics.java:96)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSQueue.(FSQueue.java:56)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.(FSLeafQueue.java:66)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.createQueue(QueueManager.java:169)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.getQueue(QueueManager.java:120)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.getLeafQueue(QueueManager.java:88)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.assignToQueue(FairScheduler.java:660)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.addApplication(FairScheduler.java:569)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1127)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler.testQueueNameWithSpace(TestFairScheduler.java:627)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3843) Fair Scheduler should not accept apps with space keys as queue name

2015-06-22 Thread Dongwook Kwon (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596858#comment-14596858
 ] 

Dongwook Kwon commented on YARN-3843:
-

Thanks, you're right, it's a duplicate. I didn't find the other JIRA; I will close 
this one.

> Fair Scheduler should not accept apps with space keys as queue name
> ---
>
> Key: YARN-3843
> URL: https://issues.apache.org/jira/browse/YARN-3843
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.4.0, 2.5.0
>Reporter: Dongwook Kwon
>Priority: Minor
> Attachments: YARN-3843.01.patch
>
>
> As YARN-461, since empty string queue name is not valid, queue name with 
> space keys such as " " ,"   " should not be accepted either, also not as 
> prefix nor postfix. 
> e.g) "root.test.queuename  ", or "root.test. queuename"
> I have 2 specific cases kill RM with these space keys as part of queue name.
> 1) Without placement policy (hadoop 2.4.0 and above), 
> When a job is submitted with " "(space key) as queue name
> e.g) mapreduce.job.queuename=" "
> 2) With placement policy (hadoop 2.5.0 and above)
>  Once a job is submitted without space key as queue name, and submit another 
> job with space key.
> e.g) 1st time: mapreduce.job.queuename="root.test.user1" 
> 2nd time: mapreduce.job.queuename="root.test.user1 "
> {code}
> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.974 sec <<< 
> FAILURE! - in 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler
> testQueueNameWithSpace(org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler)
>   Time elapsed: 0.724 sec  <<< ERROR!
> org.apache.hadoop.metrics2.MetricsException: Metrics source 
> QueueMetrics,q0=root,q1=adhoc,q2=birvine already exists!
>   at 
> org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.newSourceName(DefaultMetricsSystem.java:135)
>   at 
> org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.sourceName(DefaultMetricsSystem.java:112)
>   at 
> org.apache.hadoop.metrics2.impl.MetricsSystemImpl.register(MetricsSystemImpl.java:218)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSQueueMetrics.forQueue(FSQueueMetrics.java:96)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSQueue.(FSQueue.java:56)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.(FSLeafQueue.java:66)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.createQueue(QueueManager.java:169)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.getQueue(QueueManager.java:120)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.getLeafQueue(QueueManager.java:88)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.assignToQueue(FairScheduler.java:660)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.addApplication(FairScheduler.java:569)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1127)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler.testQueueNameWithSpace(TestFairScheduler.java:627)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3843) Fair Scheduler should not accept apps with space keys as queue name

2015-06-22 Thread Dongwook Kwon (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongwook Kwon updated YARN-3843:

Attachment: YARN-3843.01.patch

> Fair Scheduler should not accept apps with space keys as queue name
> ---
>
> Key: YARN-3843
> URL: https://issues.apache.org/jira/browse/YARN-3843
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.4.0, 2.5.0
>Reporter: Dongwook Kwon
>Priority: Minor
> Attachments: YARN-3843.01.patch
>
>
> As YARN-461, since empty string queue name is not valid, queue name with 
> space keys such as " " ,"   " should not be accepted either, also not as 
> prefix nor postfix. 
> e.g) "root.test.queuename  ", or "root.test. queuename"
> I have 2 specific cases kill RM with these space keys as part of queue name.
> 1) Without placement policy (hadoop 2.4.0 and above), 
> When a job is submitted with " "(space key) as queue name
> e.g) mapreduce.job.queuename=" "
> 2) With placement policy (hadoop 2.5.0 and above)
>  Once a job is submitted without space key as queue name, and submit another 
> job with space key.
> e.g) 1st time: mapreduce.job.queuename="root.test.user1" 
> 2nd time: mapreduce.job.queuename="root.test.user1 "
> {code}
> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.974 sec <<< 
> FAILURE! - in 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler
> testQueueNameWithSpace(org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler)
>   Time elapsed: 0.724 sec  <<< ERROR!
> org.apache.hadoop.metrics2.MetricsException: Metrics source 
> QueueMetrics,q0=root,q1=adhoc,q2=birvine already exists!
>   at 
> org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.newSourceName(DefaultMetricsSystem.java:135)
>   at 
> org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.sourceName(DefaultMetricsSystem.java:112)
>   at 
> org.apache.hadoop.metrics2.impl.MetricsSystemImpl.register(MetricsSystemImpl.java:218)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSQueueMetrics.forQueue(FSQueueMetrics.java:96)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSQueue.(FSQueue.java:56)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.(FSLeafQueue.java:66)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.createQueue(QueueManager.java:169)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.getQueue(QueueManager.java:120)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.getLeafQueue(QueueManager.java:88)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.assignToQueue(FairScheduler.java:660)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.addApplication(FairScheduler.java:569)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1127)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler.testQueueNameWithSpace(TestFairScheduler.java:627)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3843) Fair Scheduler should not accept apps with space keys as queue name

2015-06-22 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596848#comment-14596848
 ] 

zhihai xu commented on YARN-3843:
-

Hi [~dongwook], thanks for reporting this issue. I think this issue was fixed by 
YARN-3241.

> Fair Scheduler should not accept apps with space keys as queue name
> ---
>
> Key: YARN-3843
> URL: https://issues.apache.org/jira/browse/YARN-3843
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.4.0, 2.5.0
>Reporter: Dongwook Kwon
>Priority: Minor
>
> As YARN-461, since empty string queue name is not valid, queue name with 
> space keys such as " " ,"   " should not be accepted either, also not as 
> prefix nor postfix. 
> e.g) "root.test.queuename  ", or "root.test. queuename"
> I have 2 specific cases kill RM with these space keys as part of queue name.
> 1) Without placement policy (hadoop 2.4.0 and above), 
> When a job is submitted with " "(space key) as queue name
> e.g) mapreduce.job.queuename=" "
> 2) With placement policy (hadoop 2.5.0 and above)
>  Once a job is submitted without space key as queue name, and submit another 
> job with space key.
> e.g) 1st time: mapreduce.job.queuename="root.test.user1" 
> 2nd time: mapreduce.job.queuename="root.test.user1 "
> {code}
> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.974 sec <<< 
> FAILURE! - in 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler
> testQueueNameWithSpace(org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler)
>   Time elapsed: 0.724 sec  <<< ERROR!
> org.apache.hadoop.metrics2.MetricsException: Metrics source 
> QueueMetrics,q0=root,q1=adhoc,q2=birvine already exists!
>   at 
> org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.newSourceName(DefaultMetricsSystem.java:135)
>   at 
> org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.sourceName(DefaultMetricsSystem.java:112)
>   at 
> org.apache.hadoop.metrics2.impl.MetricsSystemImpl.register(MetricsSystemImpl.java:218)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSQueueMetrics.forQueue(FSQueueMetrics.java:96)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSQueue.(FSQueue.java:56)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.(FSLeafQueue.java:66)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.createQueue(QueueManager.java:169)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.getQueue(QueueManager.java:120)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.getLeafQueue(QueueManager.java:88)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.assignToQueue(FairScheduler.java:660)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.addApplication(FairScheduler.java:569)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1127)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler.testQueueNameWithSpace(TestFairScheduler.java:627)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3843) Fair Scheduler should not accept apps with space keys as queue name

2015-06-22 Thread Dongwook Kwon (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596842#comment-14596842
 ] 

Dongwook Kwon commented on YARN-3843:
-

From my investigation, QueueMetrics does not keep space characters at the start or 
end of queue names; it simply trims them:

static final Splitter Q_SPLITTER = 
Splitter.on('.').omitEmptyStrings().trimResults();

https://github.com/apache/hadoop/blob/branch-2.5.2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/QueueMetrics.java#L112
https://github.com/apache/hadoop/blob/branch-2.5.2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/QueueMetrics.java#L85

So, in FairScheduler the queue name "root.adhoc.birvine " (with a trailing space) is 
treated as different from "root.adhoc.birvine" because it has one extra character. In 
QueueMetrics, however, the names are trimmed, so the two different queue names suddenly 
become the same, which causes the error "Metrics source 
QueueMetrics,q0=root,q1=adhoc,q2=birvine already exists!"


> Fair Scheduler should not accept apps with space keys as queue name
> ---
>
> Key: YARN-3843
> URL: https://issues.apache.org/jira/browse/YARN-3843
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.4.0, 2.5.0
>Reporter: Dongwook Kwon
>Priority: Minor
>
> As YARN-461, since empty string queue name is not valid, queue name with 
> space keys such as " " ,"   " should not be accepted either, also not as 
> prefix nor postfix. 
> e.g) "root.test.queuename  ", or "root.test. queuename"
> I have 2 specific cases kill RM with these space keys as part of queue name.
> 1) Without placement policy (hadoop 2.4.0 and above), 
> When a job is submitted with " "(space key) as queue name
> e.g) mapreduce.job.queuename=" "
> 2) With placement policy (hadoop 2.5.0 and above)
>  Once a job is submitted without space key as queue name, and submit another 
> job with space key.
> e.g) 1st time: mapreduce.job.queuename="root.test.user1" 
> 2nd time: mapreduce.job.queuename="root.test.user1 "
> {code}
> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.974 sec <<< 
> FAILURE! - in 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler
> testQueueNameWithSpace(org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler)
>   Time elapsed: 0.724 sec  <<< ERROR!
> org.apache.hadoop.metrics2.MetricsException: Metrics source 
> QueueMetrics,q0=root,q1=adhoc,q2=birvine already exists!
>   at 
> org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.newSourceName(DefaultMetricsSystem.java:135)
>   at 
> org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.sourceName(DefaultMetricsSystem.java:112)
>   at 
> org.apache.hadoop.metrics2.impl.MetricsSystemImpl.register(MetricsSystemImpl.java:218)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSQueueMetrics.forQueue(FSQueueMetrics.java:96)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSQueue.(FSQueue.java:56)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.(FSLeafQueue.java:66)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.createQueue(QueueManager.java:169)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.getQueue(QueueManager.java:120)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.getLeafQueue(QueueManager.java:88)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.assignToQueue(FairScheduler.java:660)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.addApplication(FairScheduler.java:569)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1127)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler.testQueueNameWithSpace(TestFairScheduler.java:627)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3800) Simplify inmemory state for ReservationAllocation

2015-06-22 Thread Anubhav Dhoot (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anubhav Dhoot updated YARN-3800:

Attachment: YARN-3800.002.patch

Addressed feedback

> Simplify inmemory state for ReservationAllocation
> -
>
> Key: YARN-3800
> URL: https://issues.apache.org/jira/browse/YARN-3800
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacityscheduler, fairscheduler, resourcemanager
>Reporter: Anubhav Dhoot
>Assignee: Anubhav Dhoot
> Attachments: YARN-3800.001.patch, YARN-3800.002.patch, 
> YARN-3800.002.patch
>
>
> Instead of storing the ReservationRequest we store the Resource for 
> allocations, as thats the only thing we need. Ultimately we convert 
> everything to resources anyway



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3843) Fair Scheduler should not accept apps with space keys as queue name

2015-06-22 Thread Dongwook Kwon (JIRA)
Dongwook Kwon created YARN-3843:
---

 Summary: Fair Scheduler should not accept apps with space keys as 
queue name
 Key: YARN-3843
 URL: https://issues.apache.org/jira/browse/YARN-3843
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.5.0, 2.4.0
Reporter: Dongwook Kwon
Priority: Minor


As YARN-461, since empty string queue name is not valid, queue name with space 
keys such as " " ,"   " should not be accepted either, also not as prefix nor 
postfix. 
e.g) "root.test.queuename  ", or "root.test. queuename"

I have 2 specific cases kill RM with these space keys as part of queue name.
1) Without placement policy (hadoop 2.4.0 and above), 
When a job is submitted with " "(space key) as queue name
e.g) mapreduce.job.queuename=" "

2) With placement policy (hadoop 2.5.0 and above)
 Once a job is submitted without space key as queue name, and submit another 
job with space key.
e.g) 1st time: mapreduce.job.queuename="root.test.user1" 
2nd time: mapreduce.job.queuename="root.test.user1 "

{code}
Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.974 sec <<< 
FAILURE! - in 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler
testQueueNameWithSpace(org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler)
  Time elapsed: 0.724 sec  <<< ERROR!
org.apache.hadoop.metrics2.MetricsException: Metrics source 
QueueMetrics,q0=root,q1=adhoc,q2=birvine already exists!
at 
org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.newSourceName(DefaultMetricsSystem.java:135)
at 
org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.sourceName(DefaultMetricsSystem.java:112)
at 
org.apache.hadoop.metrics2.impl.MetricsSystemImpl.register(MetricsSystemImpl.java:218)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSQueueMetrics.forQueue(FSQueueMetrics.java:96)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSQueue.(FSQueue.java:56)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.(FSLeafQueue.java:66)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.createQueue(QueueManager.java:169)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.getQueue(QueueManager.java:120)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.getLeafQueue(QueueManager.java:88)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.assignToQueue(FairScheduler.java:660)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.addApplication(FairScheduler.java:569)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1127)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler.testQueueNameWithSpace(TestFairScheduler.java:627)
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2862) RM might not start if the machine was hard shutdown and FileSystemRMStateStore was used

2015-06-22 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596837#comment-14596837
 ] 

Ming Ma commented on YARN-2862:
---

Thanks, [~rohithsharma] and [~leftnoteasy]. Yes, YARN-3410 will be useful. So 
admins still need to look through RM logs to identify those apps. Would it be 
useful to provide a new RM startup option to delete or skip such apps 
automatically?
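
Until such an option exists, one way to spot the truncated entries before restarting 
the RM is to scan the store for zero-length application files. A rough, standalone 
Java sketch (the path is the one from the listing in the description and is only an 
example):

{code}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

public class FindEmptyRMStateFiles {
  public static void main(String[] args) throws IOException {
    // Example path from the listing above; adjust to the configured FS state store root.
    Path appRoot = Paths.get("/var/log/hadoop/rmstore/FSRMStateRoot/RMAppRoot");
    try (Stream<Path> files = Files.walk(appRoot)) {
      files.filter(Files::isRegularFile)
           .filter(p -> !p.getFileName().toString().endsWith(".crc"))
           .filter(p -> {
             try {
               return Files.size(p) == 0;  // zero bytes: likely truncated by the crash
             } catch (IOException e) {
               return false;
             }
           })
           // Candidate app/attempt files to move aside (or delete) before RM restart.
           .forEach(System.out::println);
    }
  }
}
{code}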

> RM might not start if the machine was hard shutdown and 
> FileSystemRMStateStore was used
> ---
>
> Key: YARN-2862
> URL: https://issues.apache.org/jira/browse/YARN-2862
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Ming Ma
>
> This might be a known issue. Given FileSystemRMStateStore isn't used for HA 
> scenario, it might not be that important, unless there is something we need 
> to fix at RM layer to make it more tolerant to RMStore issue.
> When RM was hard shutdown, OS might not get a chance to persist blocks. Some 
> of the stored application data end up with size zero after reboot. And RM 
> didn't like that.
> {noformat}
> ls -al 
> /var/log/hadoop/rmstore/FSRMStateRoot/RMAppRoot/application_1412702189634_324351
> total 156
> drwxr-xr-x.2 x y   4096 Nov 13 16:45 .
> drwxr-xr-x. 1524 x y 151552 Nov 13 16:45 ..
> -rw-r--r--.1 x y  0 Nov 13 16:45 
> appattempt_1412702189634_324351_01
> -rw-r--r--.1 x y  0 Nov 13 16:45 
> .appattempt_1412702189634_324351_01.crc
> -rw-r--r--.1 x y  0 Nov 13 16:45 application_1412702189634_324351
> -rw-r--r--.1 x y  0 Nov 13 16:45 .application_1412702189634_324351.crc
> {noformat}
> When RM starts up
> {noformat}
> 2014-11-13 16:55:25,844 WARN org.apache.hadoop.fs.FSInputChecker: Problem 
> opening checksum file: 
> file:/var/log/hadoop/rmstore/FSRMStateRoot/RMAppRoot/application_1412702189634_324351/application_1412702189634_324351.
>   Ignoring exception:
> java.io.EOFException
> at java.io.DataInputStream.readFully(DataInputStream.java:197)
> at java.io.DataInputStream.readFully(DataInputStream.java:169)
> at 
> org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.(ChecksumFileSystem.java:146)
> at 
> org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:339)
> at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:792)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.readFile(FileSystemRMStateStore.java:501)
> ...
> 2014-11-13 17:40:48,876 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Failed to 
> load/recover state
> java.lang.NullPointerException
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ApplicationState.getAppId(RMStateStore.java:184)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recoverApplication(RMAppManager.java:306)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:425)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1027)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:484)
> at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:834)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3842) NM restarts could lead to app failures

2015-06-22 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596783#comment-14596783
 ] 

Karthik Kambatla commented on YARN-3842:


+1, pending Jenkins. 

Thanks for your review, [~jianhe]. I'll go ahead and commit this if Jenkins is fine 
with it.

> NM restarts could lead to app failures
> --
>
> Key: YARN-3842
> URL: https://issues.apache.org/jira/browse/YARN-3842
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.0
>Reporter: Karthik Kambatla
>Assignee: Robert Kanter
>Priority: Critical
> Attachments: MAPREDUCE-6409.001.patch, MAPREDUCE-6409.002.patch, 
> YARN-3842.001.patch, YARN-3842.002.patch
>
>
> Consider the following scenario:
> 1. RM assigns a container on node N to an app A.
> 2. Node N is restarted
> 3. A tries to launch container on node N.
> 3 could lead to an NMNotYetReadyException depending on whether NM N has 
> registered with the RM. In MR, this is considered a task attempt failure. A 
> few of these could lead to a task/job failure.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3842) NM restarts could lead to app failures

2015-06-22 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596776#comment-14596776
 ] 

Jian He commented on YARN-3842:
---

I think the latest patch is safe for 2.7.1,  +1

> NM restarts could lead to app failures
> --
>
> Key: YARN-3842
> URL: https://issues.apache.org/jira/browse/YARN-3842
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.0
>Reporter: Karthik Kambatla
>Assignee: Robert Kanter
>Priority: Critical
> Attachments: MAPREDUCE-6409.001.patch, MAPREDUCE-6409.002.patch, 
> YARN-3842.001.patch, YARN-3842.002.patch
>
>
> Consider the following scenario:
> 1. RM assigns a container on node N to an app A.
> 2. Node N is restarted
> 3. A tries to launch container on node N.
> 3 could lead to an NMNotYetReadyException depending on whether NM N has 
> registered with the RM. In MR, this is considered a task attempt failure. A 
> few of these could lead to a task/job failure.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3842) NM restarts could lead to app failures

2015-06-22 Thread Robert Kanter (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Kanter updated YARN-3842:

Attachment: YARN-3842.002.patch

The new patch makes the changes Karthik suggested. I also added a few comments 
and renamed {{isExpectingNMNotYetReadyException}} to 
{{shouldThrowNMNotYetReadyException}} for clarity.
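
For illustration only, here is a minimal sketch of the retry behavior under 
discussion, using hypothetical names for the exception and the retried call 
(NMProxy's real behavior is wired through the RPC retry policies, which this does 
not attempt to reproduce):

{code}
import java.util.concurrent.Callable;

public class RetryOnNotYetReady {
  /** Hypothetical stand-in for NMNotYetReadyException. */
  static class NotYetReadyException extends RuntimeException {
  }

  // Retry the call while the NM reports it is not yet ready, up to maxAttempts,
  // sleeping retryIntervalMs between attempts instead of failing the app.
  static <T> T callWithRetry(Callable<T> call, int maxAttempts, long retryIntervalMs)
      throws Exception {
    for (int attempt = 1; ; attempt++) {
      try {
        return call.call();
      } catch (NotYetReadyException e) {
        if (attempt >= maxAttempts) {
          throw e;                      // give up and surface the failure
        }
        Thread.sleep(retryIntervalMs);  // NM is still registering; wait and retry
      }
    }
  }
}
{code}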

> NM restarts could lead to app failures
> --
>
> Key: YARN-3842
> URL: https://issues.apache.org/jira/browse/YARN-3842
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.0
>Reporter: Karthik Kambatla
>Assignee: Robert Kanter
>Priority: Critical
> Attachments: MAPREDUCE-6409.001.patch, MAPREDUCE-6409.002.patch, 
> YARN-3842.001.patch, YARN-3842.002.patch
>
>
> Consider the following scenario:
> 1. RM assigns a container on node N to an app A.
> 2. Node N is restarted
> 3. A tries to launch container on node N.
> 3 could lead to an NMNotYetReadyException depending on whether NM N has 
> registered with the RM. In MR, this is considered a task attempt failure. A 
> few of these could lead to a task/job failure.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3842) NM restarts could lead to app failures

2015-06-22 Thread Robert Kanter (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596765#comment-14596765
 ] 

Robert Kanter commented on YARN-3842:
-

I had sort of just split {{startContainers}} into two sections (one for each 
part of the test), but this is a lot more concise.  I'll do that.

> NM restarts could lead to app failures
> --
>
> Key: YARN-3842
> URL: https://issues.apache.org/jira/browse/YARN-3842
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.0
>Reporter: Karthik Kambatla
>Assignee: Robert Kanter
>Priority: Critical
> Attachments: MAPREDUCE-6409.001.patch, MAPREDUCE-6409.002.patch, 
> YARN-3842.001.patch
>
>
> Consider the following scenario:
> 1. RM assigns a container on node N to an app A.
> 2. Node N is restarted
> 3. A tries to launch container on node N.
> 3 could lead to an NMNotYetReadyException depending on whether NM N has 
> registered with the RM. In MR, this is considered a task attempt failure. A 
> few of these could lead to a task/job failure.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3842) NM restarts could lead to app failures

2015-06-22 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596756#comment-14596756
 ] 

Karthik Kambatla commented on YARN-3842:


Thanks for the quick turnaround on this, Robert. 

One nit-pick on the test: would the following be more concise? 

{code}
if (retryCount < 5) {
  retryCount++;
  if (isExpectingNMNotYetReadyException) {
containerManager.setBlockNewContainerRequests(true);
  } else {
throw new java.net.ConnectException("start container exception");
  }
} else {
  containerManager.setBlockNewContainerRequests(false);
}
return super.startContainers(requests);
{code}

> NM restarts could lead to app failures
> --
>
> Key: YARN-3842
> URL: https://issues.apache.org/jira/browse/YARN-3842
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.0
>Reporter: Karthik Kambatla
>Assignee: Robert Kanter
>Priority: Critical
> Attachments: MAPREDUCE-6409.001.patch, MAPREDUCE-6409.002.patch, 
> YARN-3842.001.patch
>
>
> Consider the following scenario:
> 1. RM assigns a container on node N to an app A.
> 2. Node N is restarted
> 3. A tries to launch container on node N.
> 3 could lead to an NMNotYetReadyException depending on whether NM N has 
> registered with the RM. In MR, this is considered a task attempt failure. A 
> few of these could lead to a task/job failure.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3635) Get-queue-mapping should be a common interface of YarnScheduler

2015-06-22 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-3635:
-
Attachment: YARN-3635.5.patch

Attached ver.5, which fixes a bunch of warnings.

> Get-queue-mapping should be a common interface of YarnScheduler
> ---
>
> Key: YARN-3635
> URL: https://issues.apache.org/jira/browse/YARN-3635
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-3635.1.patch, YARN-3635.2.patch, YARN-3635.3.patch, 
> YARN-3635.4.patch, YARN-3635.5.patch
>
>
> Currently, both of fair/capacity scheduler support queue mapping, which makes 
> scheduler can change queue of an application after submitted to scheduler.
> One issue of doing this in specific scheduler is: If the queue after mapping 
> has different maximum_allocation/default-node-label-expression of the 
> original queue, {{validateAndCreateResourceRequest}} in RMAppManager checks 
> the wrong queue.
> I propose to make the queue mapping as a common interface of scheduler, and 
> RMAppManager set the queue after mapping before doing validations.
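
As a purely illustrative sketch of the proposal in the description above, a common 
queue-mapping hook on the scheduler might look roughly like the interface below; the 
name and signature are made up here, not taken from the patch:

{code}
import java.io.IOException;

/** Hypothetical interface; name and signature are illustrative only. */
public interface QueueMappingAware {
  /**
   * Resolve the queue the application will actually be placed in, applying any
   * configured queue-mapping rules, so that RMAppManager can validate the request
   * (maximum allocation, default node label expression, ...) against the mapped
   * queue instead of the originally requested one.
   */
  String getMappedQueueForSubmission(String requestedQueue, String user)
      throws IOException;
}
{code}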



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3842) NM restarts could lead to app failures

2015-06-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596733#comment-14596733
 ] 

Hadoop QA commented on YARN-3842:
-

\\
\\
| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  17m 20s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 46s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 42s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m 30s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 33s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 32s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 47s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   1m 58s | Tests passed in 
hadoop-yarn-common. |
| {color:green}+1{color} | yarn tests |   6m  5s | Tests passed in 
hadoop-yarn-server-nodemanager. |
| | |  49m 40s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12741131/YARN-3842.001.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 445b132 |
| hadoop-yarn-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8314/artifact/patchprocess/testrun_hadoop-yarn-common.txt
 |
| hadoop-yarn-server-nodemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8314/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8314/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf908.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8314/console |


This message was automatically generated.

> NM restarts could lead to app failures
> --
>
> Key: YARN-3842
> URL: https://issues.apache.org/jira/browse/YARN-3842
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.0
>Reporter: Karthik Kambatla
>Assignee: Robert Kanter
>Priority: Critical
> Attachments: MAPREDUCE-6409.001.patch, MAPREDUCE-6409.002.patch, 
> YARN-3842.001.patch
>
>
> Consider the following scenario:
> 1. RM assigns a container on node N to an app A.
> 2. Node N is restarted
> 3. A tries to launch container on node N.
> 3 could lead to an NMNotYetReadyException depending on whether NM N has 
> registered with the RM. In MR, this is considered a task attempt failure. A 
> few of these could lead to a task/job failure.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3792) Test case failures in TestDistributedShell and some issue fixes related to ATSV2

2015-06-22 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596730#comment-14596730
 ] 

Sangjin Lee commented on YARN-3792:
---

Thanks [~Naganarasimha] for the update!

+1 on the test failure. It appears to be an issue unrelated to the timeline 
service.

It does seem like the whitespace is related to the patch (or in the vicinity of 
the patch). Could you kindly do a quick change to remove those extra spaces?

Also, for findbugs, I ran findbugs against those two projects (distributed 
shell and resource manager). I do see several findbugs warnings, and they are 
not introduced by this patch but do appear to be related to the YARN-2928 work.

distributed shell:
{code}

{code}

resource manager:
{code}
{code}

It would be nice to address them (at least the one on Client.java) here, but if 
you're not inclined, we could do it later... Let me know how you want to 
proceed.

> Test case failures in TestDistributedShell and some issue fixes related to 
> ATSV2
> 
>
> Key: YARN-3792
> URL: https://issues.apache.org/jira/browse/YARN-3792
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
> Attachments: YARN-3792-YARN-2928.001.patch, 
> YARN-3792-YARN-2928.002.patch, YARN-3792-YARN-2928.003.patch
>
>
> # encountered [testcase 
> failures|https://builds.apache.org/job/PreCommit-YARN-Build/8233/testReport/] 
> which was happening even without the patch modifications in YARN-3044
> TestDistributedShell.testDSShellWithoutDomainV2CustomizedFlow
> TestDistributedShell.testDSShellWithoutDomainV2DefaultFlow
> TestDistributedShellWithNodeLabels.testDSShellWithNodeLabelExpression
> # Remove unused {{enableATSV1}} in testDisstributedShell
> # container metrics needs to be published only for v2 test cases of 
> testDisstributedShell
> # Nullpointer was thrown in TimelineClientImpl.constructResURI when Aux 
> service was not configured and {{TimelineClient.putObjects}} was getting 
> invoked.
> # Race condition for the Application events to published and test case 
> verification for RM's ApplicationFinished Timeline Events
> # Application Tags for converted to lowercase in 
> ApplicationSubmissionContextPBimpl, hence RMTimelinecollector was not able to 
> detect to custom flow details of the app



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3360) Add JMX metrics to TimelineDataManager

2015-06-22 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596717#comment-14596717
 ] 

Jason Lowe commented on YARN-3360:
--

The checkstyle comments are complaining about existing method argument lengths 
or the visibility of the Metrics fields.  I was replicating the same style used 
by all other metric fields, so this is consistent with the code base.

> Add JMX metrics to TimelineDataManager
> --
>
> Key: YARN-3360
> URL: https://issues.apache.org/jira/browse/YARN-3360
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: timelineserver
>Affects Versions: 2.6.0
>Reporter: Jason Lowe
>Assignee: Jason Lowe
>  Labels: BB2015-05-TBR
> Attachments: YARN-3360.001.patch, YARN-3360.002.patch, 
> YARN-3360.003.patch
>
>
> The TimelineDataManager currently has no metrics, outside of the standard JVM 
> metrics.  It would be very useful to at least log basic counts of method 
> calls, time spent in those calls, and number of entities/events involved.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2801) Documentation development for Node labels requirment

2015-06-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596706#comment-14596706
 ] 

Hadoop QA commented on YARN-2801:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |   3m  2s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | release audit |   0m 21s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | site |   1m 58s | Site compilation is broken. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| | |   5m 26s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12741138/YARN-2801.2.patch |
| Optional Tests | site |
| git revision | trunk / 11ac848 |
| site | 
https://builds.apache.org/job/PreCommit-YARN-Build/8315/artifact/patchprocess/patchSiteWarnings.txt
 |
| Java | 1.7.0_55 |
| uname | Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8315/console |


This message was automatically generated.

> Documentation development for Node labels requirment
> 
>
> Key: YARN-2801
> URL: https://issues.apache.org/jira/browse/YARN-2801
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: documentation
>Reporter: Gururaj Shetty
>Assignee: Wangda Tan
> Attachments: YARN-2801.1.patch, YARN-2801.2.patch
>
>
> Documentation needs to be developed for the node label requirements.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2801) Documentation development for Node labels requirment

2015-06-22 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-2801:
-
Attachment: YARN-2801.2.patch

Hi [~Naganarasimha],
Thanks for your thoughtful review; comments on your suggestions:
2) There's no preemption-related documentation in Apache Hadoop yet; I suggest 
adding this part after we have a preemption page.
10) They're what the admin should specify. I prefer not to add default values here 
because the defaults keep changing and are tracked by 
{{yarn-default.xml}}.
12) Changed it; it should be the percentage of resources on nodes in the DEFAULT 
partition.
13) That's different: {{}} and "not specified" both mean "inherit from 
parent".
18) The REST API is under development, and I think we still need some time to finalize 
it for 2.8. I suggest adding that part later.
19) Added a CS link from the node labels page; I think it's a relatively independent 
feature, so I suggest not referencing it from the CS page.

I addressed the other items in the attached patch.

Please let me know your ideas.

Thanks,

> Documentation development for Node labels requirment
> 
>
> Key: YARN-2801
> URL: https://issues.apache.org/jira/browse/YARN-2801
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: documentation
>Reporter: Gururaj Shetty
>Assignee: Wangda Tan
> Attachments: YARN-2801.1.patch, YARN-2801.2.patch
>
>
> Documentation needs to be developed for the node label requirements.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3800) Simplify inmemory state for ReservationAllocation

2015-06-22 Thread Subru Krishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596663#comment-14596663
 ] 

Subru Krishnan commented on YARN-3800:
--

Thanks [~adhoot] for the patch. I looked at it and just had a couple of comments:
   1. Can we have _toResource(ReservationRequest request)_ in a Reservation 
utility class rather than in _InMemoryReservationAllocation_?
   2. I feel we can update the constructor of _InMemoryReservationAllocation_ 
to take in _Map<ReservationInterval, Resource>_ instead of 
_Map<ReservationInterval, ReservationRequest>_ so that we do the translation 
only once. This should also simplify the state in GreedyReservationAgent.
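
To make comment 1 concrete, here is a minimal sketch of such a utility, assuming (for illustration only) a class named ReservationSystemUtil; it is not the patch itself.

{code}
// Minimal sketch under the assumptions stated above.
import org.apache.hadoop.yarn.api.records.ReservationRequest;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.util.resource.Resources;

public final class ReservationSystemUtil {

  private ReservationSystemUtil() {
    // utility class, no instances
  }

  // Total resource represented by a request: capability * number of containers.
  public static Resource toResource(ReservationRequest request) {
    return Resources.multiply(request.getCapability(),
        request.getNumContainers());
  }
}
{code}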

> Simplify inmemory state for ReservationAllocation
> -
>
> Key: YARN-3800
> URL: https://issues.apache.org/jira/browse/YARN-3800
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacityscheduler, fairscheduler, resourcemanager
>Reporter: Anubhav Dhoot
>Assignee: Anubhav Dhoot
> Attachments: YARN-3800.001.patch, YARN-3800.002.patch
>
>
> Instead of storing the ReservationRequest we store the Resource for 
> allocations, as thats the only thing we need. Ultimately we convert 
> everything to resources anyway



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3842) NM restarts could lead to app failures

2015-06-22 Thread Robert Kanter (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Kanter updated YARN-3842:

Attachment: YARN-3842.001.patch

That makes sense.  The patch is also a lot simpler; it just adds a retry policy 
for {{NMNotYetReadyException}}, and a test.
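
For context, a hedged sketch of what registering such a retry policy can look like with the common io.retry utilities; the retry count and sleep interval below are illustrative values, not taken from the patch.

{code}
// Hedged sketch (not necessarily the attached patch): retry container-launch calls
// that fail with NMNotYetReadyException until the restarted NM re-registers with
// the RM, instead of treating them as task attempt failures.
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;

import org.apache.hadoop.io.retry.RetryPolicies;
import org.apache.hadoop.io.retry.RetryPolicy;
import org.apache.hadoop.yarn.exceptions.NMNotYetReadyException;

public class NMNotYetReadyRetrySketch {
  static RetryPolicy createRetryPolicy() {
    Map<Class<? extends Exception>, RetryPolicy> exceptionToPolicy =
        new HashMap<Class<? extends Exception>, RetryPolicy>();
    exceptionToPolicy.put(NMNotYetReadyException.class,
        RetryPolicies.retryUpToMaximumCountWithFixedSleep(
            10, 1000, TimeUnit.MILLISECONDS));
    // NMNotYetReadyException (even when wrapped in a RemoteException) is retried;
    // everything else fails fast.
    return RetryPolicies.retryByRemoteException(
        RetryPolicies.TRY_ONCE_THEN_FAIL, exceptionToPolicy);
  }
}
{code}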

> NM restarts could lead to app failures
> --
>
> Key: YARN-3842
> URL: https://issues.apache.org/jira/browse/YARN-3842
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.0
>Reporter: Karthik Kambatla
>Assignee: Robert Kanter
>Priority: Critical
> Attachments: MAPREDUCE-6409.001.patch, MAPREDUCE-6409.002.patch, 
> YARN-3842.001.patch
>
>
> Consider the following scenario:
> 1. RM assigns a container on node N to an app A.
> 2. Node N is restarted.
> 3. A tries to launch the container on node N.
> Step 3 could lead to an NMNotYetReadyException depending on whether NM N has 
> registered with the RM. In MR, this is considered a task attempt failure. A 
> few of these could lead to a task/job failure.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Moved] (YARN-3842) NM restarts could lead to app failures

2015-06-22 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla moved MAPREDUCE-6409 to YARN-3842:
---

 Target Version/s: 2.7.1  (was: 2.7.1)
Affects Version/s: (was: 2.7.0)
   2.7.0
  Key: YARN-3842  (was: MAPREDUCE-6409)
  Project: Hadoop YARN  (was: Hadoop Map/Reduce)

> NM restarts could lead to app failures
> --
>
> Key: YARN-3842
> URL: https://issues.apache.org/jira/browse/YARN-3842
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.0
>Reporter: Karthik Kambatla
>Assignee: Robert Kanter
>Priority: Critical
> Attachments: MAPREDUCE-6409.001.patch, MAPREDUCE-6409.002.patch
>
>
> Consider the following scenario:
> 1. RM assigns a container on node N to an app A.
> 2. Node N is restarted.
> 3. A tries to launch the container on node N.
> Step 3 could lead to an NMNotYetReadyException depending on whether NM N has 
> registered with the RM. In MR, this is considered a task attempt failure. A 
> few of these could lead to a task/job failure.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3360) Add JMX metrics to TimelineDataManager

2015-06-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596630#comment-14596630
 ] 

Hadoop QA commented on YARN-3360:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  15m 25s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 4 new or modified test files. |
| {color:green}+1{color} | javac |   7m 31s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 32s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   0m 29s | The applied patch generated  
19 new checkstyle issues (total was 7, now 26). |
| {color:green}+1{color} | whitespace |   0m  1s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 32s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 32s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   0m 59s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   3m 10s | Tests passed in 
hadoop-yarn-server-applicationhistoryservice. |
| | |  39m 36s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12741115/YARN-3360.003.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 445b132 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/8313/artifact/patchprocess/diffcheckstylehadoop-yarn-server-applicationhistoryservice.txt
 |
| hadoop-yarn-server-applicationhistoryservice test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8313/artifact/patchprocess/testrun_hadoop-yarn-server-applicationhistoryservice.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8313/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8313/console |


This message was automatically generated.

> Add JMX metrics to TimelineDataManager
> --
>
> Key: YARN-3360
> URL: https://issues.apache.org/jira/browse/YARN-3360
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: timelineserver
>Affects Versions: 2.6.0
>Reporter: Jason Lowe
>Assignee: Jason Lowe
>  Labels: BB2015-05-TBR
> Attachments: YARN-3360.001.patch, YARN-3360.002.patch, 
> YARN-3360.003.patch
>
>
> The TimelineDataManager currently has no metrics, outside of the standard JVM 
> metrics.  It would be very useful to at least log basic counts of method 
> calls, time spent in those calls, and number of entities/events involved.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3815) [Aggregation] Application/Flow/User/Queue Level Aggregations

2015-06-22 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596616#comment-14596616
 ] 

Ted Yu commented on YARN-3815:
--

bq. in the spirit of readless increments as used in Tephra

The readless increment feature is implemented in CDAP, where it is called a delta write.
Please take a look at:
cdap-hbase-compat-0.98/src/main/java/co/cask/cdap/data2/increment/hbase98/IncrementHandler.java
cdap-hbase-compat-0.98//src/main/java/co/cask/cdap/data2/increment/hbase98/IncrementSummingScanner.java

The implementation uses an HBase coprocessor, BTW.

> [Aggregation] Application/Flow/User/Queue Level Aggregations
> 
>
> Key: YARN-3815
> URL: https://issues.apache.org/jira/browse/YARN-3815
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Junping Du
>Assignee: Junping Du
>Priority: Critical
> Attachments: Timeline Service Nextgen Flow, User, Queue Level 
> Aggregations (v1).pdf
>
>
> Per previous discussions in some design documents for YARN-2928, the basic 
> scenario is that queries for stats can happen at:
> - Application level, expected return: an application with aggregated stats
> - Flow level, expected return: aggregated stats for a flow_run, flow_version 
> and flow
> - User level, expected return: aggregated stats for applications submitted by a 
> user
> - Queue level, expected return: aggregated stats for applications within the 
> queue
> Application state is the basic building block for all other levels of 
> aggregation. We can provide flow/user/queue level aggregated statistics 
> based on application states (a dedicated table for application states is 
> needed, which is missing from previous design documents such as the HBase/Phoenix 
> schema design). 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3635) Get-queue-mapping should be a common interface of YarnScheduler

2015-06-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596620#comment-14596620
 ] 

Hadoop QA commented on YARN-3635:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  16m 14s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 3 new or modified test files. |
| {color:green}+1{color} | javac |   7m 43s | There were no new javac warning 
messages. |
| {color:red}-1{color} | javadoc |   9m 48s | The applied patch generated  2  
additional warning messages. |
| {color:red}-1{color} | release audit |   0m 18s | The applied patch generated 
4 release audit warnings. |
| {color:red}-1{color} | checkstyle |   0m 47s | The applied patch generated  
18 new checkstyle issues (total was 204, now 215). |
| {color:red}-1{color} | whitespace |   0m  3s | The patch has 15  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 32s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   1m 27s | The patch appears to introduce 3 
new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | yarn tests |  61m  8s | Tests failed in 
hadoop-yarn-server-resourcemanager. |
| | |  99m 39s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-yarn-server-resourcemanager |
| Timed out tests | 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation
 |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12741096/YARN-3635.4.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 445b132 |
| javadoc | 
https://builds.apache.org/job/PreCommit-YARN-Build/8311/artifact/patchprocess/diffJavadocWarnings.txt
 |
| Release Audit | 
https://builds.apache.org/job/PreCommit-YARN-Build/8311/artifact/patchprocess/patchReleaseAuditProblems.txt
 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/8311/artifact/patchprocess/diffcheckstylehadoop-yarn-server-resourcemanager.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-YARN-Build/8311/artifact/patchprocess/whitespace.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-YARN-Build/8311/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8311/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8311/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf908.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8311/console |


This message was automatically generated.

> Get-queue-mapping should be a common interface of YarnScheduler
> ---
>
> Key: YARN-3635
> URL: https://issues.apache.org/jira/browse/YARN-3635
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-3635.1.patch, YARN-3635.2.patch, YARN-3635.3.patch, 
> YARN-3635.4.patch
>
>
> Currently, both the fair and capacity schedulers support queue mapping, which lets 
> the scheduler change the queue of an application after it is submitted to the scheduler.
> One issue with doing this in a specific scheduler is: if the queue after mapping 
> has a different maximum_allocation/default-node-label-expression from the 
> original queue, {{validateAndCreateResourceRequest}} in RMAppManager checks 
> the wrong queue.
> I propose to make queue mapping a common interface of the scheduler, and have 
> RMAppManager set the queue after mapping before doing validations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-110) AM releases too many containers due to the protocol

2015-06-22 Thread Giovanni Matteo Fumarola (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596614#comment-14596614
 ] 

Giovanni Matteo Fumarola commented on YARN-110:
---

[~acmurthy], [~vinodkv] any updates on this? 
If you don't mind, can I work on this?

> AM releases too many containers due to the protocol
> ---
>
> Key: YARN-110
> URL: https://issues.apache.org/jira/browse/YARN-110
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, scheduler
>Reporter: Arun C Murthy
>Assignee: Arun C Murthy
> Attachments: YARN-110.patch
>
>
> - AM sends a request asking for 4 containers on host H1.
> - Asynchronously, host H1 reaches the RM and gets assigned 4 containers. The RM at 
> this point sets the value against H1 to
> zero in its aggregate request-table for all apps.
> - In the meanwhile, the AM comes to need 3 more containers, so a total of 7 
> including the 4 from the previous request.
> - Today, the AM sends the absolute number of 7 against H1 to the RM as part of its 
> request table.
> - The RM seems to override its earlier value of zero against H1 with 7, and thus 
> allocates 7 more containers.
> - The AM already got 4 in this scheduling iteration, but gets 7 more, a total of 
> 11 instead of the required 7.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3798) ZKRMStateStore shouldn't create new session without occurrance of SESSIONEXPIED

2015-06-22 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596606#comment-14596606
 ] 

zhihai xu commented on YARN-3798:
-

I think we should also create a new session for SessionMovedException.
We hit SessionMovedException before; the following is the cause we found:
# The ZK client tried to connect to leader L. The network was very slow, so the 
client disconnected before the leader processed the request.
# The client then re-connected to follower F, reusing the same session ID. That was 
successful.
# The request from step 1 then reached the leader. The leader processed it and invalidated 
the connection created in step 2, but the client didn't know the connection it was using 
had been invalidated.
# The client got SessionMovedException whenever it used the connection invalidated by 
the leader for any ZooKeeper operation.

IMHO, the only way to recover from this error on the RM side is to treat 
SessionMovedException like SessionExpiredException: close the current ZK client and 
create a new one.
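
A minimal sketch of that recovery idea; closeZkClient()/createNewZkClient() below are placeholder helpers, not actual ZKRMStateStore methods.

{code}
// Sketch under the assumptions stated above: treat SESSIONMOVED the same way as
// SESSIONEXPIRED and rebuild the session.
import java.util.List;

import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.Op;
import org.apache.zookeeper.OpResult;
import org.apache.zookeeper.ZooKeeper;

class ZkSessionMovedSketch {
  private ZooKeeper zkClient;

  List<OpResult> runMulti(List<Op> ops) throws Exception {
    try {
      return zkClient.multi(ops);
    } catch (KeeperException.SessionExpiredException
        | KeeperException.SessionMovedException e) {
      // The old handle is unusable either way: drop it and start a fresh session.
      closeZkClient();
      createNewZkClient();
      throw e; // let the caller's retry loop re-run the operation
    }
  }

  private void closeZkClient() throws InterruptedException {
    zkClient.close();
  }

  private void createNewZkClient() {
    // placeholder: re-create the ZooKeeper handle (new session) here
  }
}
{code}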

> ZKRMStateStore shouldn't create new session without occurrance of 
> SESSIONEXPIED
> ---
>
> Key: YARN-3798
> URL: https://issues.apache.org/jira/browse/YARN-3798
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.0
> Environment: Suse 11 Sp3
>Reporter: Bibin A Chundatt
>Assignee: Varun Saxena
>Priority: Blocker
> Attachments: RM.log, YARN-3798-2.7.002.patch, 
> YARN-3798-branch-2.7.002.patch, YARN-3798-branch-2.7.patch
>
>
> RM going down with NoNode exception during create of znode for appattempt
> *Please find the exception logs*
> {code}
> 2015-06-09 10:09:44,732 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: 
> ZKRMStateStore Session connected
> 2015-06-09 10:09:44,732 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: 
> ZKRMStateStore Session restored
> 2015-06-09 10:09:44,886 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: 
> Exception while executing a ZK operation.
> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:115)
>   at org.apache.zookeeper.ZooKeeper.multiInternal(ZooKeeper.java:1405)
>   at org.apache.zookeeper.ZooKeeper.multi(ZooKeeper.java:1310)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$4.run(ZKRMStateStore.java:926)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$4.run(ZKRMStateStore.java:923)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithCheck(ZKRMStateStore.java:1101)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithRetries(ZKRMStateStore.java:1122)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doStoreMultiWithRetries(ZKRMStateStore.java:923)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doStoreMultiWithRetries(ZKRMStateStore.java:937)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.createWithRetries(ZKRMStateStore.java:970)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.updateApplicationAttemptStateInternal(ZKRMStateStore.java:671)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$UpdateAppAttemptTransition.transition(RMStateStore.java:275)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$UpdateAppAttemptTransition.transition(RMStateStore.java:260)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.handleStoreEvent(RMStateStore.java:837)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:900)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:895)
>   at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:175)
>   at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:108)
>   at java

[jira] [Updated] (YARN-3360) Add JMX metrics to TimelineDataManager

2015-06-22 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated YARN-3360:
-
Attachment: YARN-3360.003.patch

Rebased patch on trunk

> Add JMX metrics to TimelineDataManager
> --
>
> Key: YARN-3360
> URL: https://issues.apache.org/jira/browse/YARN-3360
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: timelineserver
>Affects Versions: 2.6.0
>Reporter: Jason Lowe
>Assignee: Jason Lowe
>  Labels: BB2015-05-TBR
> Attachments: YARN-3360.001.patch, YARN-3360.002.patch, 
> YARN-3360.003.patch
>
>
> The TimelineDataManager currently has no metrics, outside of the standard JVM 
> metrics.  It would be very useful to at least log basic counts of method 
> calls, time spent in those calls, and number of entities/events involved.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3798) ZKRMStateStore shouldn't create new session without occurrance of SESSIONEXPIED

2015-06-22 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596522#comment-14596522
 ] 

Varun Saxena commented on YARN-3798:


Thanks [~ozawa]. The explanation you gave and the subsequent discussions with 
[~rakeshr] helped a lot in clarifying ZooKeeper's behavior.

> ZKRMStateStore shouldn't create new session without occurrance of 
> SESSIONEXPIED
> ---
>
> Key: YARN-3798
> URL: https://issues.apache.org/jira/browse/YARN-3798
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.0
> Environment: Suse 11 Sp3
>Reporter: Bibin A Chundatt
>Assignee: Varun Saxena
>Priority: Blocker
> Attachments: RM.log, YARN-3798-2.7.002.patch, 
> YARN-3798-branch-2.7.002.patch, YARN-3798-branch-2.7.patch
>
>
> RM going down with NoNode exception during create of znode for appattempt
> *Please find the exception logs*
> {code}
> 2015-06-09 10:09:44,732 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: 
> ZKRMStateStore Session connected
> 2015-06-09 10:09:44,732 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: 
> ZKRMStateStore Session restored
> 2015-06-09 10:09:44,886 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: 
> Exception while executing a ZK operation.
> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:115)
>   at org.apache.zookeeper.ZooKeeper.multiInternal(ZooKeeper.java:1405)
>   at org.apache.zookeeper.ZooKeeper.multi(ZooKeeper.java:1310)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$4.run(ZKRMStateStore.java:926)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$4.run(ZKRMStateStore.java:923)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithCheck(ZKRMStateStore.java:1101)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithRetries(ZKRMStateStore.java:1122)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doStoreMultiWithRetries(ZKRMStateStore.java:923)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doStoreMultiWithRetries(ZKRMStateStore.java:937)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.createWithRetries(ZKRMStateStore.java:970)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.updateApplicationAttemptStateInternal(ZKRMStateStore.java:671)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$UpdateAppAttemptTransition.transition(RMStateStore.java:275)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$UpdateAppAttemptTransition.transition(RMStateStore.java:260)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.handleStoreEvent(RMStateStore.java:837)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:900)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:895)
>   at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:175)
>   at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:108)
>   at java.lang.Thread.run(Thread.java:745)
> 2015-06-09 10:09:44,887 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: Maxed 
> out ZK retries. Giving up!
> 2015-06-09 10:09:44,887 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Error 
> updating appAttempt: appattempt_1433764310492_7152_01
> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:115)
>   at org.apache.zookeeper.ZooKeeper.multiInternal(ZooKeeper.java:1405)
>   at org.apache.zookeeper.ZooKeeper.multi(ZooKeeper.java:1310)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager

[jira] [Commented] (YARN-1963) Support priorities across applications within the same queue

2015-06-22 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596481#comment-14596481
 ] 

Jian He commented on YARN-1963:
---

I think we need to move this forward.

Overall, I prefer numeric priority to label-based priority because the former is 
simpler and more flexible if a user wants to define a wide range of priorities, and it 
needs no extra configs. Users also do not need to be re-educated about the mapping any 
time the mapping changes.

Also, one problem is that if we refresh the priority mapping while some 
existing long-running jobs are already running at a certain priority, how do we 
map the previous priority range to the new priority range?

In addition, if everyone runs their application at "VERY_HIGH" priority, then 
"HIGH", despite its name, is not really the "HIGH" priority any more; it effectively 
becomes the "LOWEST" priority. My point is that the importance of a priority only 
makes sense when compared with its peers. In that sense, I think adding a utility 
that surfaces how applications are distributed across each priority, so that users 
can reason about where to place an application, may be more useful than adding a 
static naming mapping to let people reason about the relative importance of a 
priority by its name. 
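
As a purely illustrative sketch of such a utility (the shape and names are my own, not part of any patch here), it could be as simple as a histogram of running applications per priority:

{code}
// Illustrative only: count how many applications sit at each numeric priority so
// users can see where a new application would land relative to its peers.
import java.util.Collection;
import java.util.Map;
import java.util.TreeMap;

public class PriorityHistogramSketch {
  public static Map<Integer, Integer> histogram(Collection<Integer> appPriorities) {
    Map<Integer, Integer> counts = new TreeMap<Integer, Integer>();
    for (int p : appPriorities) {
      Integer c = counts.get(p);
      counts.put(p, c == null ? 1 : c + 1);
    }
    return counts;
  }
}
{code}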

> Support priorities across applications within the same queue 
> -
>
> Key: YARN-1963
> URL: https://issues.apache.org/jira/browse/YARN-1963
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: api, resourcemanager
>Reporter: Arun C Murthy
>Assignee: Sunil G
> Attachments: 0001-YARN-1963-prototype.patch, YARN Application 
> Priorities Design.pdf, YARN Application Priorities Design_01.pdf
>
>
> It will be very useful to support priorities among applications within the 
> same queue, particularly in production scenarios. It allows for finer-grained 
> controls without having to force admins to create a multitude of queues, plus 
> allows existing applications to continue using existing queues which are 
> usually part of institutional memory.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3798) ZKRMStateStore shouldn't create new session without occurrance of SESSIONEXPIED

2015-06-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596470#comment-14596470
 ] 

Hadoop QA commented on YARN-3798:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12741098/YARN-3798-2.7.002.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 445b132 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8312/console |


This message was automatically generated.

> ZKRMStateStore shouldn't create new session without occurrance of 
> SESSIONEXPIED
> ---
>
> Key: YARN-3798
> URL: https://issues.apache.org/jira/browse/YARN-3798
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.0
> Environment: Suse 11 Sp3
>Reporter: Bibin A Chundatt
>Assignee: Varun Saxena
>Priority: Blocker
> Attachments: RM.log, YARN-3798-2.7.002.patch, 
> YARN-3798-branch-2.7.002.patch, YARN-3798-branch-2.7.patch
>
>
> RM going down with NoNode exception during create of znode for appattempt
> *Please find the exception logs*
> {code}
> 2015-06-09 10:09:44,732 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: 
> ZKRMStateStore Session connected
> 2015-06-09 10:09:44,732 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: 
> ZKRMStateStore Session restored
> 2015-06-09 10:09:44,886 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: 
> Exception while executing a ZK operation.
> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:115)
>   at org.apache.zookeeper.ZooKeeper.multiInternal(ZooKeeper.java:1405)
>   at org.apache.zookeeper.ZooKeeper.multi(ZooKeeper.java:1310)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$4.run(ZKRMStateStore.java:926)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$4.run(ZKRMStateStore.java:923)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithCheck(ZKRMStateStore.java:1101)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithRetries(ZKRMStateStore.java:1122)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doStoreMultiWithRetries(ZKRMStateStore.java:923)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doStoreMultiWithRetries(ZKRMStateStore.java:937)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.createWithRetries(ZKRMStateStore.java:970)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.updateApplicationAttemptStateInternal(ZKRMStateStore.java:671)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$UpdateAppAttemptTransition.transition(RMStateStore.java:275)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$UpdateAppAttemptTransition.transition(RMStateStore.java:260)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.handleStoreEvent(RMStateStore.java:837)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:900)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:895)
>   at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:175)
>   at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:108)
>   at java.lang.Thread.run(Thread.java:745)
> 2015-06-09 10:09:44,887 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: Maxed 
> out ZK retries. Giving up!
> 2015-06-09 10:09:44,887 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Error 
> up

[jira] [Updated] (YARN-3798) ZKRMStateStore shouldn't create new session without occurrance of SESSIONEXPIED

2015-06-22 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-3798:
--
Attachment: YARN-3798-2.7.002.patch

> ZKRMStateStore shouldn't create new session without occurrance of 
> SESSIONEXPIED
> ---
>
> Key: YARN-3798
> URL: https://issues.apache.org/jira/browse/YARN-3798
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.0
> Environment: Suse 11 Sp3
>Reporter: Bibin A Chundatt
>Assignee: Varun Saxena
>Priority: Blocker
> Attachments: RM.log, YARN-3798-2.7.002.patch, 
> YARN-3798-branch-2.7.002.patch, YARN-3798-branch-2.7.patch
>
>
> RM going down with NoNode exception during create of znode for appattempt
> *Please find the exception logs*
> {code}
> 2015-06-09 10:09:44,732 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: 
> ZKRMStateStore Session connected
> 2015-06-09 10:09:44,732 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: 
> ZKRMStateStore Session restored
> 2015-06-09 10:09:44,886 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: 
> Exception while executing a ZK operation.
> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:115)
>   at org.apache.zookeeper.ZooKeeper.multiInternal(ZooKeeper.java:1405)
>   at org.apache.zookeeper.ZooKeeper.multi(ZooKeeper.java:1310)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$4.run(ZKRMStateStore.java:926)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$4.run(ZKRMStateStore.java:923)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithCheck(ZKRMStateStore.java:1101)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithRetries(ZKRMStateStore.java:1122)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doStoreMultiWithRetries(ZKRMStateStore.java:923)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doStoreMultiWithRetries(ZKRMStateStore.java:937)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.createWithRetries(ZKRMStateStore.java:970)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.updateApplicationAttemptStateInternal(ZKRMStateStore.java:671)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$UpdateAppAttemptTransition.transition(RMStateStore.java:275)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$UpdateAppAttemptTransition.transition(RMStateStore.java:260)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.handleStoreEvent(RMStateStore.java:837)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:900)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:895)
>   at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:175)
>   at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:108)
>   at java.lang.Thread.run(Thread.java:745)
> 2015-06-09 10:09:44,887 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: Maxed 
> out ZK retries. Giving up!
> 2015-06-09 10:09:44,887 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Error 
> updating appAttempt: appattempt_1433764310492_7152_01
> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:115)
>   at org.apache.zookeeper.ZooKeeper.multiInternal(ZooKeeper.java:1405)
>   at org.apache.zookeeper.ZooKeeper.multi(ZooKeeper.java:1310)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$4.run(ZKRMStateStore.java:926)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMSta

[jira] [Commented] (YARN-3820) Collect disks usages on the node

2015-06-22 Thread Inigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596451#comment-14596451
 ] 

Inigo Goiri commented on YARN-3820:
---

You may want to exclude the change in CommonNodeLabelsManager.java as it's not 
related to this patch.

> Collect disks usages on the node
> 
>
> Key: YARN-3820
> URL: https://issues.apache.org/jira/browse/YARN-3820
> Project: Hadoop YARN
>  Issue Type: New Feature
>Affects Versions: 3.0.0
>Reporter: Robert Grandl
>Assignee: Robert Grandl
>  Labels: yarn-common, yarn-util
> Attachments: YARN-3820-1.patch, YARN-3820-2.patch, YARN-3820-3.patch, 
> YARN-3820-4.patch
>
>
> In this JIRA we propose to collect disk usage on a node. This JIRA is part 
> of a larger effort of monitoring resource usage on the nodes. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3820) Collect disks usages on the node

2015-06-22 Thread Robert Grandl (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596463#comment-14596463
 ] 

Robert Grandl commented on YARN-3820:
-

[~elgoiri], I fixed the warning because the Hadoop QA javadoc check was -1. I will revert 
the change if Hadoop QA returns +1.

> Collect disks usages on the node
> 
>
> Key: YARN-3820
> URL: https://issues.apache.org/jira/browse/YARN-3820
> Project: Hadoop YARN
>  Issue Type: New Feature
>Affects Versions: 3.0.0
>Reporter: Robert Grandl
>Assignee: Robert Grandl
>  Labels: yarn-common, yarn-util
> Attachments: YARN-3820-1.patch, YARN-3820-2.patch, YARN-3820-3.patch, 
> YARN-3820-4.patch
>
>
> In this JIRA we propose to collect disk usage on a node. This JIRA is part 
> of a larger effort of monitoring resource usage on the nodes. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3798) ZKRMStateStore shouldn't create new session without occurrance of SESSIONEXPIED

2015-06-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596461#comment-14596461
 ] 

Hadoop QA commented on YARN-3798:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12740206/YARN-3798-branch-2.7.002.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 445b132 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8310/console |


This message was automatically generated.

> ZKRMStateStore shouldn't create new session without occurrance of 
> SESSIONEXPIED
> ---
>
> Key: YARN-3798
> URL: https://issues.apache.org/jira/browse/YARN-3798
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.0
> Environment: Suse 11 Sp3
>Reporter: Bibin A Chundatt
>Assignee: Varun Saxena
>Priority: Blocker
> Attachments: RM.log, YARN-3798-branch-2.7.002.patch, 
> YARN-3798-branch-2.7.patch
>
>
> RM going down with NoNode exception during create of znode for appattempt
> *Please find the exception logs*
> {code}
> 2015-06-09 10:09:44,732 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: 
> ZKRMStateStore Session connected
> 2015-06-09 10:09:44,732 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: 
> ZKRMStateStore Session restored
> 2015-06-09 10:09:44,886 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: 
> Exception while executing a ZK operation.
> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:115)
>   at org.apache.zookeeper.ZooKeeper.multiInternal(ZooKeeper.java:1405)
>   at org.apache.zookeeper.ZooKeeper.multi(ZooKeeper.java:1310)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$4.run(ZKRMStateStore.java:926)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$4.run(ZKRMStateStore.java:923)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithCheck(ZKRMStateStore.java:1101)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithRetries(ZKRMStateStore.java:1122)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doStoreMultiWithRetries(ZKRMStateStore.java:923)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doStoreMultiWithRetries(ZKRMStateStore.java:937)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.createWithRetries(ZKRMStateStore.java:970)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.updateApplicationAttemptStateInternal(ZKRMStateStore.java:671)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$UpdateAppAttemptTransition.transition(RMStateStore.java:275)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$UpdateAppAttemptTransition.transition(RMStateStore.java:260)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.handleStoreEvent(RMStateStore.java:837)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:900)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:895)
>   at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:175)
>   at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:108)
>   at java.lang.Thread.run(Thread.java:745)
> 2015-06-09 10:09:44,887 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: Maxed 
> out ZK retries. Giving up!
> 2015-06-09 10:09:44,887 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Error 
> updating appAttempt:

[jira] [Updated] (YARN-3635) Get-queue-mapping should be a common interface of YarnScheduler

2015-06-22 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-3635:
-
Attachment: YARN-3635.4.patch

Sorry for my late response, [~vinodkv]. I just got some bandwidth to do the 
update.

The attached ver.4 addresses most of your comments: queue placement rules are now a 
separate module in the RM, the scheduler initializes it, and RMAppManager uses it to 
do queue placement.

The defined interfaces are not exactly the same as you suggested; I put in the minimal 
set of interfaces I had in mind. You can take a look at 
{{org.apache.hadoop.yarn.server.resourcemanager.placement}} for details.

The ver.4 patch also turns the original CapacityScheduler.QueueMapping into a 
rule: UserGroupPlacementRule.
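
For readers following along, a hedged sketch of the kind of placement-rule abstraction described above; the class and method names below are illustrative guesses, not necessarily what the ver.4 patch defines.

{code}
// Sketch under the assumptions stated above.
import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;
import org.apache.hadoop.yarn.exceptions.YarnException;

public abstract class PlacementRuleSketch {
  // Returns the queue the application should be placed in, or null if this rule
  // does not apply and the next rule in the chain should be consulted.
  public abstract String getQueueForApp(ApplicationSubmissionContext asc,
      String user) throws YarnException;
}

// Example rule in the spirit of CapacityScheduler.QueueMapping as a rule:
// place each user's applications into a queue named after the user.
class UserQueuePlacementRuleSketch extends PlacementRuleSketch {
  @Override
  public String getQueueForApp(ApplicationSubmissionContext asc, String user)
      throws YarnException {
    return user; // the simplest u:%user:%user style mapping
  }
}
{code}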

Thoughts?

> Get-queue-mapping should be a common interface of YarnScheduler
> ---
>
> Key: YARN-3635
> URL: https://issues.apache.org/jira/browse/YARN-3635
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-3635.1.patch, YARN-3635.2.patch, YARN-3635.3.patch, 
> YARN-3635.4.patch
>
>
> Currently, both the fair and capacity schedulers support queue mapping, which lets 
> the scheduler change the queue of an application after it is submitted to the scheduler.
> One issue with doing this in a specific scheduler is: if the queue after mapping 
> has a different maximum_allocation/default-node-label-expression from the 
> original queue, {{validateAndCreateResourceRequest}} in RMAppManager checks 
> the wrong queue.
> I propose to make queue mapping a common interface of the scheduler, and have 
> RMAppManager set the queue after mapping before doing validations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3790) TestWorkPreservingRMRestart#testSchedulerRecovery fails intermittently in trunk for FS scheduler

2015-06-22 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596441#comment-14596441
 ] 

Jian He commented on YARN-3790:
---

lgtm, thanks [~zxu] and [~rohithsharma]

> TestWorkPreservingRMRestart#testSchedulerRecovery fails intermittently in 
> trunk for FS scheduler
> 
>
> Key: YARN-3790
> URL: https://issues.apache.org/jira/browse/YARN-3790
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler, test
>Reporter: Rohith Sharma K S
>Assignee: zhihai xu
> Attachments: YARN-3790.000.patch
>
>
> Failure trace is as follows
> {noformat}
> Tests run: 28, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 284.078 sec 
> <<< FAILURE! - in 
> org.apache.hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart
> testSchedulerRecovery[1](org.apache.hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart)
>   Time elapsed: 6.502 sec  <<< FAILURE!
> java.lang.AssertionError: expected:<6144> but was:<8192>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart.assertMetrics(TestWorkPreservingRMRestart.java:853)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart.checkFSQueue(TestWorkPreservingRMRestart.java:342)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart.testSchedulerRecovery(TestWorkPreservingRMRestart.java:241)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3768) Index out of range exception with environment variables without values

2015-06-22 Thread Gera Shegalov (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596443#comment-14596443
 ] 

Gera Shegalov commented on YARN-3768:
-

Instead of executing two regexes (first directly via Pattern p = 
Pattern.compile(Shell.getEnvironmentVariableRegex()) and then via split), 
can we simply match via a single regex? We can use a capture group to get the 
value.
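
A hedged, simplified sketch of the single-regex idea (not the committed fix): one pattern with two capture groups extracts NAME and VALUE in a single pass and tolerates empty values such as {{FOO=}}, which String.split() drops as a trailing empty string.

{code}
// Simplified sketch only; the real env string also supports variable expansion.
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EnvParseSketch {
  // NAME = anything except '=' or ','; VALUE = anything except ',' (may be empty)
  private static final Pattern ENV_PAIR = Pattern.compile("([^=,]+)=([^,]*)");

  public static Map<String, String> parse(String envString) {
    Map<String, String> env = new LinkedHashMap<String, String>();
    Matcher m = ENV_PAIR.matcher(envString);
    while (m.find()) {
      env.put(m.group(1).trim(), m.group(2)); // group(2) is "" for "NAME="
    }
    return env;
  }

  public static void main(String[] args) {
    System.out.println(parse("A=1,B=,C=3")); // prints {A=1, B=, C=3}
  }
}
{code}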

> Index out of range exception with environment variables without values
> --
>
> Key: YARN-3768
> URL: https://issues.apache.org/jira/browse/YARN-3768
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.5.0
>Reporter: Joe Ferner
>Assignee: zhihai xu
> Attachments: YARN-3768.000.patch, YARN-3768.001.patch
>
>
> Looking at line 80 of org.apache.hadoop.yarn.util.Apps, an index-out-of-range 
> exception occurs if an environment variable is encountered without a value.
> I believe this occurs because Java will not return trailing empty strings from the 
> split method. Similar to this: 
> http://stackoverflow.com/questions/14602062/java-string-split-removed-empty-values



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3116) [Collector wireup] We need an assured way to determine if a container is an AM container on NM

2015-06-22 Thread Giovanni Matteo Fumarola (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596437#comment-14596437
 ] 

Giovanni Matteo Fumarola commented on YARN-3116:


Thanks [~zjshen] for the quick review of the patch and your comments.

   1. I agree that ContainerTokenIdentifier would be a better place to do it, so 
that we keep the flag internal, but the ContainerTokenIdentifier is created 
before the state transition in RMAppAttempt that sets the AM flag in 
RMContainer. I can try to recreate the ContainerTokenIdentifier at AM launch, 
but that looks unwieldy. Do you have any suggestions on how to do it more cleanly?

   2. Again a good observation; I'll add this in the next iteration of the 
patch, based on your suggestion for (1) above.

> [Collector wireup] We need an assured way to determine if a container is an 
> AM container on NM
> --
>
> Key: YARN-3116
> URL: https://issues.apache.org/jira/browse/YARN-3116
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, timelineserver
>Reporter: Zhijie Shen
>Assignee: Giovanni Matteo Fumarola
> Attachments: YARN-3116.patch
>
>
> In YARN-3030, to start the per-app aggregator only for a started AM 
> container, we need to determine whether the container is an AM container or not 
> from the context in the NM (we can do it on the RM). This information is missing, 
> so we worked around it by considering the container with ID "_01" as 
> the AM container. Unfortunately, this is neither a necessary nor a sufficient 
> condition. We need to have a way to determine if a container is an AM 
> container on the NM. We can add a flag to the container object or create an API to 
> make the judgement. Perhaps the distributed AM information may also be useful 
> for YARN-2877.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2902) Killing a container that is localizing can orphan resources in the DOWNLOADING state

2015-06-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596337#comment-14596337
 ] 

Hadoop QA commented on YARN-2902:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  15m 40s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 34s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 33s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   0m 36s | The applied patch generated  
25 new checkstyle issues (total was 168, now 187). |
| {color:green}+1{color} | whitespace |   0m  4s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 33s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 14s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   6m 24s | Tests passed in 
hadoop-yarn-server-nodemanager. |
| | |  43m 37s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12741076/YARN-2902.03.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 445b132 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/8309/artifact/patchprocess/diffcheckstylehadoop-yarn-server-nodemanager.txt
 |
| hadoop-yarn-server-nodemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8309/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8309/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8309/console |


This message was automatically generated.

> Killing a container that is localizing can orphan resources in the 
> DOWNLOADING state
> 
>
> Key: YARN-2902
> URL: https://issues.apache.org/jira/browse/YARN-2902
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Affects Versions: 2.5.0
>Reporter: Jason Lowe
>Assignee: Varun Saxena
> Attachments: YARN-2902.002.patch, YARN-2902.03.patch, YARN-2902.patch
>
>
> If a container is in the process of localizing when it is stopped/killed then 
> resources are left in the DOWNLOADING state.  If no other container comes 
> along and requests these resources they linger around with no reference 
> counts but aren't cleaned up during normal cache cleanup scans since it will 
> never delete resources in the DOWNLOADING state even if their reference count 
> is zero.
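
To illustrate why such resources linger, a hypothetical sketch of a 
cache-cleanup predicate that only ever considers LOCALIZED resources; the class 
and field names are illustrative, not the actual LocalResourcesTracker code.

{code}
// Hypothetical illustration of the behaviour described above: a resource stuck
// in DOWNLOADING is never eligible for cleanup, even with a reference count of 0.
enum ResourceState { INIT, DOWNLOADING, LOCALIZED, FAILED }

final class CachedResource {
  ResourceState state;
  int refCount;

  CachedResource(ResourceState state, int refCount) {
    this.state = state;
    this.refCount = refCount;
  }

  boolean isRemovable() {
    // Only fully localized, unreferenced resources are deleted by the cleanup
    // scan, so a resource orphaned in DOWNLOADING is skipped forever.
    return refCount == 0 && state == ResourceState.LOCALIZED;
  }
}
{code}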



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3835) hadoop-yarn-server-resourcemanager test package bundles core-site.xml, yarn-site.xml

2015-06-22 Thread Robert Kanter (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596327#comment-14596327
 ] 

Robert Kanter commented on YARN-3835:
-

+1

> hadoop-yarn-server-resourcemanager test package bundles core-site.xml, 
> yarn-site.xml
> 
>
> Key: YARN-3835
> URL: https://issues.apache.org/jira/browse/YARN-3835
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.6.0
>Reporter: Vamsee Yarlagadda
>Assignee: Vamsee Yarlagadda
>Priority: Minor
> Attachments: YARN-3835.patch
>
>
> It looks like by default YARN bundles core-site.xml and yarn-site.xml in the 
> test artifact of hadoop-yarn-server-resourcemanager, which means that any 
> downstream project which uses this as a dependency can have a problem picking 
> up the user-supplied/environment-supplied core-site.xml and yarn-site.xml.
> So we should ideally exclude these .xml files from being bundled into the 
> test-jar. (Similar to YARN-1748.)
> I also proactively looked at other YARN modules where this might be 
> happening. 
> {code}
> vamsee-MBP:hadoop-yarn-project vamsee$ find . -name "*-site.xml"
> ./hadoop-yarn/conf/yarn-site.xml
> ./hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/test/resources/yarn-site.xml
> ./hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-unmanaged-am-launcher/src/test/resources/yarn-site.xml
> ./hadoop-yarn/hadoop-yarn-client/src/test/resources/core-site.xml
> ./hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/core-site.xml
> ./hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/core-site.xml
> ./hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/yarn-site.xml
> ./hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/target/test-classes/core-site.xml
> ./hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/target/test-classes/yarn-site.xml
> ./hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/resources/core-site.xml
> {code}
> And out of these, only two modules (hadoop-yarn-server-resourcemanager, 
> hadoop-yarn-server-tests) are building test-jars. In the future, if we start 
> building test-jars for other modules, we should exclude these XML files from 
> being bundled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3835) hadoop-yarn-server-resourcemanager test package bundles core-site.xml, yarn-site.xml

2015-06-22 Thread Robert Kanter (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Kanter updated YARN-3835:

Target Version/s: 2.8.0

> hadoop-yarn-server-resourcemanager test package bundles core-site.xml, 
> yarn-site.xml
> 
>
> Key: YARN-3835
> URL: https://issues.apache.org/jira/browse/YARN-3835
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.6.0
>Reporter: Vamsee Yarlagadda
>Assignee: Vamsee Yarlagadda
>Priority: Minor
> Attachments: YARN-3835.patch
>
>
> It looks like by default YARN bundles core-site.xml and yarn-site.xml in the 
> test artifact of hadoop-yarn-server-resourcemanager, which means that any 
> downstream project which uses this as a dependency can have a problem picking 
> up the user-supplied/environment-supplied core-site.xml and yarn-site.xml.
> So we should ideally exclude these .xml files from being bundled into the 
> test-jar. (Similar to YARN-1748.)
> I also proactively looked at other YARN modules where this might be 
> happening. 
> {code}
> vamsee-MBP:hadoop-yarn-project vamsee$ find . -name "*-site.xml"
> ./hadoop-yarn/conf/yarn-site.xml
> ./hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/test/resources/yarn-site.xml
> ./hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-unmanaged-am-launcher/src/test/resources/yarn-site.xml
> ./hadoop-yarn/hadoop-yarn-client/src/test/resources/core-site.xml
> ./hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/core-site.xml
> ./hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/core-site.xml
> ./hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/yarn-site.xml
> ./hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/target/test-classes/core-site.xml
> ./hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/target/test-classes/yarn-site.xml
> ./hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/resources/core-site.xml
> {code}
> And out of these, only two modules (hadoop-yarn-server-resourcemanager, 
> hadoop-yarn-server-tests) are building test-jars. In the future, if we start 
> building test-jars for other modules, we should exclude these XML files from 
> being bundled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3176) In Fair Scheduler, child queue should inherit maxApp from its parent

2015-06-22 Thread Siqi Li (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596257#comment-14596257
 ] 

Siqi Li commented on YARN-3176:
---

Hi [~djp], can you take a look at patch v2? The checkstyle issues and test 
errors do not seem to apply to this patch.

> In Fair Scheduler, child queue should inherit maxApp from its parent
> 
>
> Key: YARN-3176
> URL: https://issues.apache.org/jira/browse/YARN-3176
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Siqi Li
>Assignee: Siqi Li
> Attachments: YARN-3176.v1.patch, YARN-3176.v2.patch
>
>
> If the child queue does not have a maxRunningApp limit, it will use 
> queueMaxAppsDefault. This behavior is not quite right, since 
> queueMaxAppsDefault is normally a small number, whereas some parent queues do 
> have maxRunningApp set to more than the default.
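
As a sketch of the inheritance proposed here (illustrative only, not the actual 
Fair Scheduler code): a child queue without its own limit would fall back to 
the nearest ancestor's limit before falling back to queueMaxAppsDefault.

{code}
// Hypothetical sketch of the proposed fallback order for maxRunningApps:
// child's own setting -> nearest ancestor with a setting -> queueMaxAppsDefault.
import java.util.Map;

final class MaxAppsResolver {
  static int resolveMaxApps(String queue, Map<String, Integer> configured,
      int queueMaxAppsDefault) {
    for (String q = queue; q != null; q = parentOf(q)) {
      Integer limit = configured.get(q);
      if (limit != null) {
        return limit;
      }
    }
    return queueMaxAppsDefault;
  }

  private static String parentOf(String queue) {
    int idx = queue.lastIndexOf('.');
    return idx < 0 ? null : queue.substring(0, idx); // "root.eng.adhoc" -> "root.eng"
  }
}
{code}

For example, resolveMaxApps("root.eng.adhoc", ...) would return the limit 
configured on root.eng if root.eng.adhoc has none, and only fall back to 
queueMaxAppsDefault when no ancestor has a limit.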



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2902) Killing a container that is localizing can orphan resources in the DOWNLOADING state

2015-06-22 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-2902:
---
Attachment: YARN-2902.03.patch

> Killing a container that is localizing can orphan resources in the 
> DOWNLOADING state
> 
>
> Key: YARN-2902
> URL: https://issues.apache.org/jira/browse/YARN-2902
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Affects Versions: 2.5.0
>Reporter: Jason Lowe
>Assignee: Varun Saxena
> Attachments: YARN-2902.002.patch, YARN-2902.03.patch, YARN-2902.patch
>
>
> If a container is in the process of localizing when it is stopped/killed then 
> resources are left in the DOWNLOADING state.  If no other container comes 
> along and requests these resources they linger around with no reference 
> counts but aren't cleaned up during normal cache cleanup scans since it will 
> never delete resources in the DOWNLOADING state even if their reference count 
> is zero.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3840) Resource Manager web ui bug on main view after application number 9999

2015-06-22 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596230#comment-14596230
 ] 

Xuan Gong commented on YARN-3840:
-

Hey [~Alexandre LINTE], could you tell us which version of Hadoop you are 
using? 2.7?

> Resource Manager web ui bug on main view after application number 9999
> --
>
> Key: YARN-3840
> URL: https://issues.apache.org/jira/browse/YARN-3840
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.0
> Environment: Centos 6.6
> Java 1.7
>Reporter: LINTE
>
> On the WEBUI, the global main view page : 
> http://resourcemanager:8088/cluster/apps doesn't display applications over 
> 9999.
> With command line it works (# yarn application -list).
> Regards,
> Alexandre



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3815) [Aggregation] Application/Flow/User/Queue Level Aggregations

2015-06-22 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596173#comment-14596173
 ] 

Ted Yu commented on YARN-3815:
--

My comment is related to the usage of HBase.
bq. under framework_specific_metrics column family
Since the column family name appears in every KeyValue, it would be better to 
use a very short column family name, e.g. f_m for framework metrics.
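
For example, a minimal sketch of creating the app states table with such a 
short family name, assuming the HBase 1.x client API; the table name and 
family name here are illustrative, not the agreed schema.

{code}
// Sketch only: create a hypothetical "timelineservice.app_states" table with the
// short column family "f_m" for framework metrics (HBase 1.x client API assumed).
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class CreateAppStatesTable {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Admin admin = conn.getAdmin()) {
      HTableDescriptor table =
          new HTableDescriptor(TableName.valueOf("timelineservice.app_states"));
      // Short family name: it is stored in every KeyValue, so "f_m" instead of
      // "framework_specific_metrics" keeps the on-disk footprint down.
      table.addFamily(new HColumnDescriptor("f_m"));
      admin.createTable(table);
    }
  }
}
{code}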

> [Aggregation] Application/Flow/User/Queue Level Aggregations
> 
>
> Key: YARN-3815
> URL: https://issues.apache.org/jira/browse/YARN-3815
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Junping Du
>Assignee: Junping Du
>Priority: Critical
> Attachments: Timeline Service Nextgen Flow, User, Queue Level 
> Aggregations (v1).pdf
>
>
> Per previous discussions in some design documents for YARN-2928, the basic 
> scenario is that queries for stats can happen at:
> - Application level, expected return: an application with aggregated stats
> - Flow level, expected return: aggregated stats for a flow_run, flow_version 
> and flow 
> - User level, expected return: aggregated stats for applications submitted by 
> the user
> - Queue level, expected return: aggregated stats for applications within the 
> queue
> Application states are the basic building block for all other levels of 
> aggregation. We can provide flow/user/queue level aggregated statistics 
> based on application states (a dedicated table for application states is 
> needed, which is missing from previous design documents like the HBase/Phoenix 
> schema design). 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3815) [Aggregation] Application/Flow/User/Queue Level Aggregations

2015-06-22 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596129#comment-14596129
 ] 

Junping Du commented on YARN-3815:
--

Thanks [~sjlee0] and [~jrottinghuis] for the review and the detailed comments. 
[~jrottinghuis]'s comments are pretty long, so I could only reply to part of 
them; I will finish the remaining parts tomorrow. :)

bq. For framework-specific metrics, I would say this falls on the individual 
frameworks. The framework AM usually already aggregates them in memory 
(consider MR job counters for example). So for them it is straightforward to 
write them out directly onto the YARN app entities. Furthermore, it is 
problematic to add them to the sub-app YARN entities and ask YARN to aggregate 
them to the application. Framework’s sub-app entities may not even align with 
YARN’s sub-app entities. For example, in case of MR, there is a reasonable 
one-to-one mapping between a mapper/reducer task attempt and a container, but 
for other applications that may not be true. Forcing all frameworks to hang 
values at containers may not be practical. I think it’s far easier for 
frameworks to write aggregated values to the YARN app entities.
The AM currently leverages YARN's AppTimelineCollector to forward entities to 
the backend storage, so making the AM talk directly to the backend storage is 
not considered safe. It is also not necessary, because the real difficulty here 
is aggregating framework-specific metrics at the other levels (flow, user and 
queue); that goes beyond the life cycle of the framework, so YARN has to take 
care of it. Instead of asking frameworks to handle specific metrics themselves, 
I would like to propose treating these metrics as "anonymous": the framework 
would pass both the metric name and value to YARN's collector, and YARN's 
collector could aggregate them and store them as dynamic columns (under the 
framework_specific_metrics column family) in the app states table. Aggregation 
of framework metrics at the other (flow, user, etc.) levels could then happen 
based on this.
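
A minimal sketch of what "store as a dynamic column" could look like on the 
write path, assuming the HBase 1.x client API; the table name, row key layout 
and metric name below are illustrative only.

{code}
// Sketch only: write an arbitrary framework metric as a dynamic column under the
// short "f_m" family, keyed by the metric name (names here are illustrative).
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class WriteFrameworkMetric {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Table appStates =
             conn.getTable(TableName.valueOf("timelineservice.app_states"))) {
      byte[] rowKey = Bytes.toBytes("cluster1!user1!flow1!1435000000000!app_0001");
      Put put = new Put(rowKey);
      // The qualifier is the metric name the framework reported ("anonymous" to
      // YARN), so no schema change is needed per framework-specific metric.
      put.addColumn(Bytes.toBytes("f_m"), Bytes.toBytes("MAP_OUTPUT_RECORDS"),
          Bytes.toBytes(123456L));
      appStates.put(put);
    }
  }
}
{code}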

bq. app-to-flow online aggregation. This is more or less live aggregated 
metrics at the flow level. This will still be based on the native HBase schema.
About flow online aggregation, I am not quite sure about the requirement yet. 
Do we really want real time for flow-aggregated data, or would some 
fine-grained time interval (like 15 secs) be good enough? If we want to show 
some nice metrics charts for a flow, that should be fine. Even for real time, 
we don't have to aggregate everything from the raw entity table; we don't have 
to count metrics again for finished apps. Isn't that so?

bq. (3) time-based flow aggregation: This is different than the online 
aggregation in the sense that it is aggregated along the time boundary (e.g. 
“daily”, “weekly”, etc.). This can be based on the Phoenix schema. This can be 
populated in an offline fashion (e.g. running a mapreduce job).
Any special reason not to handle it in the same way as above, i.e. as an HBase 
coprocessor? It just sounds like a coarse-grained time interval, doesn't it?

bq. This is another “offline” aggregation type. Also, I believe we’re talking 
about only time-based aggregation. In other words, we would aggregate values 
for users only with a well-defined time window. There won’t be a “real-time” 
aggregation of values, similar to the flow aggregation.
I would also call for a fine-grained time interval (close to real time) here, 
because the aggregated resource metrics per user could be used for billing 
Hadoop usage in a shared environment (whether a private or public cloud), so 
users need more detail on resource consumption, especially at random peak 
times.

bq. Very much agree with separation into 2 categories "online" versus 
"periodic". I think this will be natural split between the native HBase tables 
for the former and the Phoenix approach for the latter to each emphasize their 
relative strengths.
I would question the necessity of "online" again if this means "real time" 
rather than a fine-grained time interval. Actually, as a building block, every 
container metric (cpu, memory, etc.) is generated at a time interval rather 
than in real time. As a result, we never know the exact snapshot of the whole 
system at a precise time; we can only try to get closer.


> [Aggregation] Application/Flow/User/Queue Level Aggregations
> 
>
> Key: YARN-3815
> URL: https://issues.apache.org/jira/browse/YARN-3815
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Junping Du
>Assignee: Junping Du
>Priority: Critical
> Attachments: Timeline Service Nextgen Flow, User, Queue Level 
> Aggregations (v1).pdf
>
>
> Per previous discussions in some design documents for YARN-2928, the basic 
> scenario is that queries for stats can happen at:
> - Application level, expected return: an application with aggregated stats
> - Flow level, expected return: aggregated stats for a flow_run, flow_version 
> and flow 
> - User level, expected return: aggregated stats for applications submitted by 
> the user
> - Queue level, expected return: aggregated stats for applications within the 
> queue
> Application states are the basic building block for all other levels of 
> aggregation. We can provide flow/user/queue level aggregated statistics 
> based on application states (a dedicated table for application states is 
> needed, which is missing from previous design documents like the HBase/Phoenix 
> schema design). 

[jira] [Commented] (YARN-3834) Scrub debug logging of tokens during resource localization.

2015-06-22 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596089#comment-14596089
 ] 

Hudson commented on YARN-3834:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2182 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2182/])
YARN-3834. Scrub debug logging of tokens during resource localization. 
Contributed by Chris Nauroth (xgong: rev 
6c7a9d502a633b5aca75c9798f19ce4a5729014e)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestResourceLocalizationService.java


> Scrub debug logging of tokens during resource localization.
> ---
>
> Key: YARN-3834
> URL: https://issues.apache.org/jira/browse/YARN-3834
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 2.7.1
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Fix For: 2.8.0
>
> Attachments: YARN-3834.001.patch
>
>
> During resource localization, the NodeManager logs tokens at debug level to 
> aid troubleshooting.  This includes the full token representation.  Best 
> practice is to avoid logging anything secret, even at debug level.  We can 
> improve on this by changing the logging to use a scrubbed representation of 
> the token.
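
As a sketch of the kind of scrubbing described here (not the actual patch): log 
only the token's kind and service at debug level rather than its full string 
form, which includes the identifier. Commons-logging is assumed for the logger.

{code}
// Illustrative only: debug-log a scrubbed view of a token (kind + service)
// instead of its full string form, which would include the identifier bytes.
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.security.token.Token;

public class TokenLogging {
  private static final Log LOG = LogFactory.getLog(TokenLogging.class);

  static void logScrubbed(Token<?> token) {
    if (LOG.isDebugEnabled()) {
      // Token#toString() includes the full identifier; kind/service are safe to log.
      LOG.debug("Got token of kind " + token.getKind()
          + " for service " + token.getService());
    }
  }
}
{code}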



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3834) Scrub debug logging of tokens during resource localization.

2015-06-22 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596059#comment-14596059
 ] 

Hudson commented on YARN-3834:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #234 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/234/])
YARN-3834. Scrub debug logging of tokens during resource localization. 
Contributed by Chris Nauroth (xgong: rev 
6c7a9d502a633b5aca75c9798f19ce4a5729014e)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestResourceLocalizationService.java
* hadoop-yarn-project/CHANGES.txt


> Scrub debug logging of tokens during resource localization.
> ---
>
> Key: YARN-3834
> URL: https://issues.apache.org/jira/browse/YARN-3834
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 2.7.1
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Fix For: 2.8.0
>
> Attachments: YARN-3834.001.patch
>
>
> During resource localization, the NodeManager logs tokens at debug level to 
> aid troubleshooting.  This includes the full token representation.  Best 
> practice is to avoid logging anything secret, even at debug level.  We can 
> improve on this by changing the logging to use a scrubbed representation of 
> the token.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3834) Scrub debug logging of tokens during resource localization.

2015-06-22 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14595984#comment-14595984
 ] 

Hudson commented on YARN-3834:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #225 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/225/])
YARN-3834. Scrub debug logging of tokens during resource localization. 
Contributed by Chris Nauroth (xgong: rev 
6c7a9d502a633b5aca75c9798f19ce4a5729014e)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestResourceLocalizationService.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java


> Scrub debug logging of tokens during resource localization.
> ---
>
> Key: YARN-3834
> URL: https://issues.apache.org/jira/browse/YARN-3834
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 2.7.1
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Fix For: 2.8.0
>
> Attachments: YARN-3834.001.patch
>
>
> During resource localization, the NodeManager logs tokens at debug level to 
> aid troubleshooting.  This includes the full token representation.  Best 
> practice is to avoid logging anything secret, even at debug level.  We can 
> improve on this by changing the logging to use a scrubbed representation of 
> the token.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3834) Scrub debug logging of tokens during resource localization.

2015-06-22 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14595971#comment-14595971
 ] 

Hudson commented on YARN-3834:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2164 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2164/])
YARN-3834. Scrub debug logging of tokens during resource localization. 
Contributed by Chris Nauroth (xgong: rev 
6c7a9d502a633b5aca75c9798f19ce4a5729014e)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestResourceLocalizationService.java


> Scrub debug logging of tokens during resource localization.
> ---
>
> Key: YARN-3834
> URL: https://issues.apache.org/jira/browse/YARN-3834
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 2.7.1
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Fix For: 2.8.0
>
> Attachments: YARN-3834.001.patch
>
>
> During resource localization, the NodeManager logs tokens at debug level to 
> aid troubleshooting.  This includes the full token representation.  Best 
> practice is to avoid logging anything secret, even at debug level.  We can 
> improve on this by changing the logging to use a scrubbed representation of 
> the token.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3840) Resource Manager web ui bug on main view after application number 9999

2015-06-22 Thread LINTE (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14595896#comment-14595896
 ] 

LINTE commented on YARN-3840:
-

Hi,

There is no Java stack trace for this bug.
I think the property yarn.resourcemanager.max-completed-applications is the 
cause (default value is 10000), but it doesn't work properly.

Maybe yarn.resourcemanager.max-completed-applications only affects the 
ResourceManager GUI.

Regards,

> Resource Manager web ui bug on main view after application number 9999
> --
>
> Key: YARN-3840
> URL: https://issues.apache.org/jira/browse/YARN-3840
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.0
> Environment: Centos 6.6
> Java 1.7
>Reporter: LINTE
>
> On the WEBUI, the global main view page : 
> http://resourcemanager:8088/cluster/apps doesn't display applications over 
> 9999.
> With command line it works (# yarn application -list).
> Regards,
> Alexandre



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3841) [Storage abstraction] Create HDFS backing storage implementation for ATS writes

2015-06-22 Thread Tsuyoshi Ozawa (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi Ozawa updated YARN-3841:
-
Description: 
HDFS backing storage is useful for the following scenarios:
1. For Hadoop clusters which don't run HBase.
2. For fallback from HBase when the HBase cluster is temporarily unavailable. 
Quoting the ATS design document of YARN-2928:
{quote}
In the case the HBase
storage is not available, the plugin should buffer the writes temporarily (e.g. 
HDFS), and flush
them once the storage comes back online. Reading and writing to hdfs as the the 
backup storage
could potentially use the HDFS writer plugin unless the complexity of 
generalizing the HDFS
writer plugin for this purpose exceeds the benefits of reusing it here.
{quote}


  was:
HDFS backing storage is useful for following scenarios.
1. For Hadoop clusters which don't run HBase.
2. For fallback from HBase when HBase cluster is temporary unavailable. 
{quote}
In the case the HBase
storage is not available, the plugin should buffer the writes temporarily (e.g. 
HDFS), and flush
them once the storage comes back online. Reading and writing to hdfs as the the 
backup storage
could potentially use the HDFS writer plugin unless the complexity of 
generalizing the HDFS
writer plugin for this purpose exceeds the benefits of reusing it here.
{quote}



> [Storage abstraction] Create HDFS backing storage implementation for ATS 
> writes
> ---
>
> Key: YARN-3841
> URL: https://issues.apache.org/jira/browse/YARN-3841
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Tsuyoshi Ozawa
>Assignee: Tsuyoshi Ozawa
>
> HDFS backing storage is useful for the following scenarios:
> 1. For Hadoop clusters which don't run HBase.
> 2. For fallback from HBase when the HBase cluster is temporarily unavailable. 
> Quoting the ATS design document of YARN-2928:
> {quote}
> In the case the HBase
> storage is not available, the plugin should buffer the writes temporarily 
> (e.g. HDFS), and flush
> them once the storage comes back online. Reading and writing to hdfs as the 
> the backup storage
> could potentially use the HDFS writer plugin unless the complexity of 
> generalizing the HDFS
> writer plugin for this purpose exceeds the benefits of reusing it here.
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3841) [Storage abstraction] Create HDFS backing storage implementation for ATS writes

2015-06-22 Thread Tsuyoshi Ozawa (JIRA)
Tsuyoshi Ozawa created YARN-3841:


 Summary: [Storage abstraction] Create HDFS backing storage 
implementation for ATS writes
 Key: YARN-3841
 URL: https://issues.apache.org/jira/browse/YARN-3841
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Tsuyoshi Ozawa
Assignee: Tsuyoshi Ozawa


HDFS backing storage is useful for following scenarios.
1. For Hadoop clusters which don't run HBase.
2. For fallback from HBase when HBase cluster is temporary unavailable. 
{quote}
In the case the HBase
storage is not available, the plugin should buffer the writes temporarily (e.g. 
HDFS), and flush
them once the storage comes back online. Reading and writing to hdfs as the the 
backup storage
could potentially use the HDFS writer plugin unless the complexity of 
generalizing the HDFS
writer plugin for this purpose exceeds the benefits of reusing it here.
{quote}
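
A minimal sketch of the buffering idea quoted above, assuming a simple 
line-oriented spill file on HDFS; the class name and format are illustrative, 
not the actual writer plugin.

{code}
// Sketch only: append serialized entities to an HDFS spill file while the primary
// (HBase) storage is unavailable; a separate replay step would flush them later.
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsSpillWriter implements AutoCloseable {
  private final FSDataOutputStream out;

  public HdfsSpillWriter(Configuration conf, Path spillFile) throws IOException {
    FileSystem fs = FileSystem.get(conf);
    // Overwrite any previous spill for simplicity; a real writer would roll files.
    this.out = fs.create(spillFile, true);
  }

  public void buffer(String serializedEntity) throws IOException {
    out.write((serializedEntity + "\n").getBytes(StandardCharsets.UTF_8));
    out.hflush();  // make the buffered entity durable enough to survive a writer crash
  }

  @Override
  public void close() throws IOException {
    out.close();
  }
}
{code}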




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3840) Resource Manager web ui bug on main view after application number 9999

2015-06-22 Thread Devaraj K (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14595826#comment-14595826
 ] 

Devaraj K commented on YARN-3840:
-

Thanks [~Alexandre LINTE] for reporting the issue. 

Can you paste the exception if you see anything in the RM UI or in the RM logs?

> Resource Manager web ui bug on main view after application number 9999
> --
>
> Key: YARN-3840
> URL: https://issues.apache.org/jira/browse/YARN-3840
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.0
> Environment: Centos 6.6
> Java 1.7
>Reporter: LINTE
>
> On the WEBUI, the global main view page : 
> http://resourcemanager:8088/cluster/apps doesn't display applications over 
> 9999.
> With command line it works (# yarn application -list).
> Regards,
> Alexandre



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3840) Resource Manager web ui bug on main view after application number 9999

2015-06-22 Thread LINTE (JIRA)
LINTE created YARN-3840:
---

 Summary: Resource Manager web ui bug on main view after 
application number 9999
 Key: YARN-3840
 URL: https://issues.apache.org/jira/browse/YARN-3840
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.7.0
 Environment: Centos 6.6
Java 1.7
Reporter: LINTE


On the WEBUI, the global main view page : 
http://resourcemanager:8088/cluster/apps doesn't display applications over 9999.

With command line it works (# yarn application -list).

Regards,

Alexandre





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3834) Scrub debug logging of tokens during resource localization.

2015-06-22 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14595805#comment-14595805
 ] 

Hudson commented on YARN-3834:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #966 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/966/])
YARN-3834. Scrub debug logging of tokens during resource localization. 
Contributed by Chris Nauroth (xgong: rev 
6c7a9d502a633b5aca75c9798f19ce4a5729014e)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestResourceLocalizationService.java
* hadoop-yarn-project/CHANGES.txt


> Scrub debug logging of tokens during resource localization.
> ---
>
> Key: YARN-3834
> URL: https://issues.apache.org/jira/browse/YARN-3834
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 2.7.1
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Fix For: 2.8.0
>
> Attachments: YARN-3834.001.patch
>
>
> During resource localization, the NodeManager logs tokens at debug level to 
> aid troubleshooting.  This includes the full token representation.  Best 
> practice is to avoid logging anything secret, even at debug level.  We can 
> improve on this by changing the logging to use a scrubbed representation of 
> the token.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3834) Scrub debug logging of tokens during resource localization.

2015-06-22 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14595749#comment-14595749
 ] 

Hudson commented on YARN-3834:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #236 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/236/])
YARN-3834. Scrub debug logging of tokens during resource localization. 
Contributed by Chris Nauroth (xgong: rev 
6c7a9d502a633b5aca75c9798f19ce4a5729014e)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestResourceLocalizationService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java
* hadoop-yarn-project/CHANGES.txt


> Scrub debug logging of tokens during resource localization.
> ---
>
> Key: YARN-3834
> URL: https://issues.apache.org/jira/browse/YARN-3834
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 2.7.1
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Fix For: 2.8.0
>
> Attachments: YARN-3834.001.patch
>
>
> During resource localization, the NodeManager logs tokens at debug level to 
> aid troubleshooting.  This includes the full token representation.  Best 
> practice is to avoid logging anything secret, even at debug level.  We can 
> improve on this by changing the logging to use a scrubbed representation of 
> the token.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3826) Race condition in ResourceTrackerService: potential wrong diagnostics messages

2015-06-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14595717#comment-14595717
 ] 

Hadoop QA commented on YARN-3826:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  16m 32s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   7m 30s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 35s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 24s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 48s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 35s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 32s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 24s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | yarn tests |  50m 43s | Tests failed in 
hadoop-yarn-server-resourcemanager. |
| | |  89m  7s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | 
hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12740355/YARN-3826.01.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 445b132 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8308/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8308/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8308/console |


This message was automatically generated.

> Race condition in ResourceTrackerService: potential wrong diagnostics messages
> --
>
> Key: YARN-3826
> URL: https://issues.apache.org/jira/browse/YARN-3826
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.0
>Reporter: Chengbing Liu
>Assignee: Chengbing Liu
> Attachments: YARN-3826.01.patch
>
>
> Since we are calling {{setDiagnosticsMessage}} in {{nodeHeartbeat}}, which 
> can be called concurrently, the static {{resync}} and {{shutdown}} may have 
> wrong diagnostics messages in some cases.
> On the other hand, these static members hardly save any memory, since normal 
> heartbeat responses are created for each heartbeat anyway.
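
To illustrate the race described above (not the actual ResourceTrackerService 
code): mutating a shared static response from concurrent heartbeats can leak 
one node's diagnostics to another, whereas building a fresh response per call 
cannot.

{code}
// Illustrative sketch of the hazard: two concurrent heartbeats set a diagnostics
// message on a shared static response object, so a caller can observe the other
// caller's message. Creating a new response per heartbeat avoids this.
final class HeartbeatResponse {
  private volatile String diagnostics;

  void setDiagnosticsMessage(String msg) { this.diagnostics = msg; }
  String getDiagnosticsMessage() { return diagnostics; }
}

final class Tracker {
  // Racy pattern: one shared instance mutated by every caller.
  private static final HeartbeatResponse SHUTDOWN = new HeartbeatResponse();

  static HeartbeatResponse racyShutdown(String nodeSpecificReason) {
    SHUTDOWN.setDiagnosticsMessage(nodeSpecificReason); // may be overwritten by another thread
    return SHUTDOWN;
  }

  // Safe pattern: each heartbeat gets its own response object.
  static HeartbeatResponse safeShutdown(String nodeSpecificReason) {
    HeartbeatResponse response = new HeartbeatResponse();
    response.setDiagnosticsMessage(nodeSpecificReason);
    return response;
  }
}
{code}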



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3768) Index out of range exception with environment variables without values

2015-06-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14595552#comment-14595552
 ] 

Hadoop QA commented on YARN-3768:
-

\\
\\
| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  16m  5s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 32s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 35s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 53s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 32s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 32s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   1m 57s | Tests passed in 
hadoop-yarn-common. |
| | |  40m  4s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12740968/YARN-3768.001.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 445b132 |
| hadoop-yarn-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8307/artifact/patchprocess/testrun_hadoop-yarn-common.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8307/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8307/console |


This message was automatically generated.

> Index out of range exception with environment variables without values
> --
>
> Key: YARN-3768
> URL: https://issues.apache.org/jira/browse/YARN-3768
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.5.0
>Reporter: Joe Ferner
>Assignee: zhihai xu
> Attachments: YARN-3768.000.patch, YARN-3768.001.patch
>
>
> Looking at line 80 of org.apache.hadoop.yarn.util.Apps, an index out of range 
> exception occurs if an environment variable is encountered without a value.
> I believe this occurs because Java will not return trailing empty strings from 
> the split method. Similar to this: 
> http://stackoverflow.com/questions/14602062/java-string-split-removed-empty-values



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3768) Index out of range exception with environment variables without values

2015-06-22 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14595495#comment-14595495
 ] 

zhihai xu commented on YARN-3768:
-

Hi [~xgong], thanks for the review. I uploaded a new patch, YARN-3768.001.patch, 
in which I add a test case to verify that bad environment variables are skipped.
About keeping trailing empty strings: it depends on whether an environment 
variable with an empty value is a valid use case.
MAPREDUCE-5965 adds an option to configure an environment variable with an 
empty value if stream.jobconf.truncate.limit is 0.
So an environment variable with an empty value may be a valid use case.
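
For reference, a small self-contained example of the split() behaviour 
discussed here: without a negative limit, Java drops trailing empty strings, so 
"FOO=" and "FOO" both split into a single element and reading parts[1] throws.

{code}
import java.util.Arrays;

public class SplitDemo {
  public static void main(String[] args) {
    // Default split drops trailing empty strings:
    System.out.println(Arrays.toString("FOO=".split("=")));      // [FOO]   -> parts[1] would throw
    System.out.println(Arrays.toString("FOO".split("=")));       // [FOO]   -> parts[1] would throw
    // A negative limit keeps the trailing empty string, so an empty value survives:
    System.out.println(Arrays.toString("FOO=".split("=", -1)));  // [FOO, ]
    // Either way, callers should still check parts.length before reading parts[1].
    String[] parts = "FOO".split("=", 2);
    String value = parts.length > 1 ? parts[1] : "";
    System.out.println("value='" + value + "'");                 // value=''
  }
}
{code}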

> Index out of range exception with environment variables without values
> --
>
> Key: YARN-3768
> URL: https://issues.apache.org/jira/browse/YARN-3768
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.5.0
>Reporter: Joe Ferner
>Assignee: zhihai xu
> Attachments: YARN-3768.000.patch, YARN-3768.001.patch
>
>
> Looking at line 80 of org.apache.hadoop.yarn.util.Apps, an index out of range 
> exception occurs if an environment variable is encountered without a value.
> I believe this occurs because Java will not return trailing empty strings from 
> the split method. Similar to this: 
> http://stackoverflow.com/questions/14602062/java-string-split-removed-empty-values



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3768) Index out of range exception with environment variables without values

2015-06-22 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated YARN-3768:

Attachment: YARN-3768.001.patch

> Index out of range exception with environment variables without values
> --
>
> Key: YARN-3768
> URL: https://issues.apache.org/jira/browse/YARN-3768
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.5.0
>Reporter: Joe Ferner
>Assignee: zhihai xu
> Attachments: YARN-3768.000.patch, YARN-3768.001.patch
>
>
> Looking at line 80 of org.apache.hadoop.yarn.util.Apps, an index out of range 
> exception occurs if an environment variable is encountered without a value.
> I believe this occurs because Java will not return trailing empty strings from 
> the split method. Similar to this: 
> http://stackoverflow.com/questions/14602062/java-string-split-removed-empty-values



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3826) Race condition in ResourceTrackerService: potential wrong diagnostics messages

2015-06-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14595457#comment-14595457
 ] 

Hadoop QA commented on YARN-3826:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  19m 42s | Findbugs (version ) appears to 
be broken on trunk. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   9m 30s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 42s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 28s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 40s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 44s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 28s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | yarn tests |  61m 36s | Tests failed in 
hadoop-yarn-server-resourcemanager. |
| | | 106m 18s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | 
hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart |
| Timed out tests | 
org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart 
|
|   | 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation
 |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12740355/YARN-3826.01.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 6c7a9d5 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8306/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8306/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf908.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8306/console |


This message was automatically generated.

> Race condition in ResourceTrackerService: potential wrong diagnostics messages
> --
>
> Key: YARN-3826
> URL: https://issues.apache.org/jira/browse/YARN-3826
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.0
>Reporter: Chengbing Liu
>Assignee: Chengbing Liu
> Attachments: YARN-3826.01.patch
>
>
> Since we are calling {{setDiagnosticsMessage}} in {{nodeHeartbeat}}, which 
> can be called concurrently, the static {{resync}} and {{shutdown}} may have 
> wrong diagnostics messages in some cases.
> On the other hand, these static members hardly save any memory, since normal 
> heartbeat responses are created for each heartbeat anyway.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)