[jira] [Updated] (YARN-5101) YARN_APPLICATION_UPDATED event is parsed in ApplicationHistoryManagerOnTimelineStore#convertToApplicationReport with reversed order

2016-07-13 Thread Sunil G (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil G updated YARN-5101:
--
Attachment: YARN-5101.0002.patch

Thanks [~rohithsharma]. It makes sense.
Updated the patch with this approach. However, {{SystemClock}} (which internally 
uses {{System.currentTimeMillis}}) was used by {{RMAppImpl}} and was available 
only there, so CS#updateApplicationPriority had to keep the old approach. I will 
raise a separate JIRA to use MonotonicClock in RMAppImpl.
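For context, a minimal sketch of the clock abstraction the follow-up JIRA would move towards (the interface and class names below are hypothetical, not the actual {{RMAppImpl}} code): callers depend on a clock interface so the wall-clock source can later be swapped for a monotonic one without touching call sites.
{code}
// Illustrative only; names are hypothetical.
interface Clock {
  long getTime();
}

// Wall-clock source, equivalent in spirit to what SystemClock does today.
class WallClock implements Clock {
  public long getTime() { return System.currentTimeMillis(); }
}

// Monotonic source: immune to wall-clock adjustments, suitable for ordering events.
class MonotonicMillisClock implements Clock {
  public long getTime() { return System.nanoTime() / 1_000_000L; }
}
{code}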

> YARN_APPLICATION_UPDATED event is parsed in 
> ApplicationHistoryManagerOnTimelineStore#convertToApplicationReport with 
> reversed order
> ---
>
> Key: YARN-5101
> URL: https://issues.apache.org/jira/browse/YARN-5101
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Xuan Gong
>Assignee: Sunil G
> Attachments: YARN-5101.0001.patch, YARN-5101.0002.patch
>
>
> Right now, the application events are parsed in 
> ApplicationHistoryManagerOnTimelineStore#convertToApplicationReport in 
> timestamp-descending order, which means the later events are parsed first 
> and earlier events of the same type override their information. In 
> https://issues.apache.org/jira/browse/YARN-4044 we introduced 
> YARN_APPLICATION_UPDATED events, which may be published by the RM multiple 
> times in one application life cycle. This can cause problems.
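A minimal, self-contained sketch of the kind of fix implied here: sort events by timestamp ascending before applying them, so the newest YARN_APPLICATION_UPDATED wins. This is illustrative only and does not use the actual timeline-store types.
{code}
import java.util.*;

class EventOrderingSketch {
  // Hypothetical stand-in for a timeline event: an id plus a timestamp.
  static final class Event {
    final String id; final long timestamp;
    Event(String id, long timestamp) { this.id = id; this.timestamp = timestamp; }
  }

  public static void main(String[] args) {
    List<Event> events = new ArrayList<>(Arrays.asList(
        new Event("YARN_APPLICATION_UPDATED", 200L),   // newer update
        new Event("YARN_APPLICATION_UPDATED", 100L))); // older update

    // Apply events oldest-first so the latest update is applied last and is
    // the one that sticks in the report being built.
    events.sort(Comparator.comparingLong((Event e) -> e.timestamp));
    String lastApplied = null;
    for (Event e : events) {
      lastApplied = e.id + "@" + e.timestamp;  // placeholder for "apply to report"
    }
    System.out.println("winning event: " + lastApplied); // YARN_APPLICATION_UPDATED@200
  }
}
{code}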






[jira] [Commented] (YARN-5156) YARN_CONTAINER_FINISHED of YARN_CONTAINERs will always have running state

2016-07-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376442#comment-15376442
 ] 

Hadoop QA commented on YARN-5156:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 26s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 
40s {color} | {color:green} YARN-5355 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 32s 
{color} | {color:green} YARN-5355 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
20s {color} | {color:green} YARN-5355 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 36s 
{color} | {color:green} YARN-5355 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
19s {color} | {color:green} YARN-5355 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
57s {color} | {color:green} YARN-5355 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s 
{color} | {color:green} YARN-5355 passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
32s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 29s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
15s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 30s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 3s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 17s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 13m 31s {color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
17s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 35m 4s {color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.nodemanager.containermanager.queuing.TestQueuingContainerManager
 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12817887/YARN-5156-YARN-5355.02.patch
 |
| JIRA Issue | YARN-5156 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux aea954f2bd8f 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | YARN-5355 / 0fd3980 |
| Default Java | 1.8.0_91 |
| findbugs | v3.0.0 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/12321/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
 |
| unit test logs |  
https://builds.apache.org/job/PreCommit-YARN-Build/12321/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/12321/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
| Console output | 
https:

[jira] [Commented] (YARN-5299) Log Docker run command when container fails

2016-07-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376423#comment-15376423
 ] 

Hudson commented on YARN-5299:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #10096 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/10096/])
YARN-5299. Log Docker run command when container fails. Contributed by 
(rohithsharmaks: rev dbe97aa768e2987209811c407969fea47641418c)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/runtime/DockerLinuxContainerRuntime.java


> Log Docker run command when container fails
> ---
>
> Key: YARN-5299
> URL: https://issues.apache.org/jira/browse/YARN-5299
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Varun Vasudev
>Assignee: Varun Vasudev
> Fix For: 2.9.0
>
> Attachments: YARN-5299.001.patch
>
>
> It's useful to have the docker run command logged when containers fail to 
> help debugging.






[jira] [Commented] (YARN-5363) For AM containers, or for containers of running-apps, "yarn logs" incorrectly only (tries to) shows syslog file-type by default

2016-07-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376409#comment-15376409
 ] 

Hadoop QA commented on YARN-5363:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 48s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:blue}0{color} | {color:blue} patch {color} | {color:blue} 0m 2s 
{color} | {color:blue} The patch file was not named according to hadoop's 
naming conventions. Please see https://wiki.apache.org/hadoop/HowToContribute 
for instructions. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 
26s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 21s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
16s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 27s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
16s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
31s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
20s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 17s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 17s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 13s 
{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client: The 
patch generated 7 new + 80 unchanged - 8 fixed = 87 total (was 88) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 22s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
36s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 12s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 8m 23s {color} 
| {color:red} hadoop-yarn-client in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
23s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 23m 0s {color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.client.api.impl.TestYarnClient |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12817883/YARN-5363-2016-07-13.1.txt
 |
| JIRA Issue | YARN-5363 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 85ba631ffd0b 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 2bbc3ea |
| Default Java | 1.8.0_91 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/12320/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/12320/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client.txt
 |
| unit test logs |  
https://builds.apache.org/job/PreCommit-YARN-Build/12320/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-Y

[jira] [Commented] (YARN-5299) Log Docker run command when container fails

2016-07-13 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376401#comment-15376401
 ] 

Rohith Sharma K S commented on YARN-5299:
-

Sorry, I did not see your comment earlier since it came in parallel :-(

> Log Docker run command when container fails
> ---
>
> Key: YARN-5299
> URL: https://issues.apache.org/jira/browse/YARN-5299
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Varun Vasudev
>Assignee: Varun Vasudev
> Fix For: 2.9.0
>
> Attachments: YARN-5299.001.patch
>
>
> It's useful to have the docker run command logged when containers fail to 
> help debugging.






[jira] [Commented] (YARN-5287) LinuxContainerExecutor fails to set proper permission

2016-07-13 Thread Ying Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376395#comment-15376395
 ] 

Ying Zhang commented on YARN-5287:
--

Hi Naganarasimha, I've uploaded a new patch with a new test case included. I've 
run "test-container-executor" as a regular user and as root, and it passes.
As I said earlier, there is an issue with the current test-container-executor 
when running as root. I made a minor change to make it pass (specifically, in 
test-container-executor.c, in main(), when running as root, 
test_recursive_unlink_children() needs to run before set_user(username); not 
sure whether that is correct, it is just a workaround). I don't think I should 
include it in this patch. Let me know what you think. Thank you.

> LinuxContainerExecutor fails to set proper permission
> -
>
> Key: YARN-5287
> URL: https://issues.apache.org/jira/browse/YARN-5287
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.7.2
>Reporter: Ying Zhang
>Assignee: Ying Zhang
>Priority: Minor
> Attachments: YARN-5287-naga.patch, YARN-5287.001.patch, 
> YARN-5287.002.patch
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> LinuxContainerExecutor fails to set the proper permissions on the local 
> directories (i.e., /hadoop/yarn/local/usercache/... by default) if the cluster 
> has been configured with a restrictive umask, e.g. umask 077. The job failed 
> for the following reason:
> Path /hadoop/yarn/local/usercache/ambari-qa/appcache/application_ has 
> permission 700 but needs permission 750
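A minimal sketch of the underlying issue, not the actual container-executor code: a directory created under umask 077 ends up 700, so the required 750 mode has to be set explicitly instead of relying on the process umask. Paths and names here are illustrative.
{code}
import java.nio.file.*;
import java.nio.file.attribute.*;

class UsercacheDirPermissions {
  public static void main(String[] args) throws Exception {
    // Under umask 077 a newly created directory comes out as 700 (rwx------),
    // which breaks the group access the NodeManager expects (750).
    Path appDir = Files.createTempDirectory("appcache-demo");

    // The fix is to set the required mode explicitly rather than trusting the umask.
    Files.setPosixFilePermissions(appDir,
        PosixFilePermissions.fromString("rwxr-x---"));  // 750

    System.out.println(Files.getPosixFilePermissions(appDir));
  }
}
{code}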






[jira] [Created] (YARN-5377) TestQueuingContainerManager.testKillMultipleOpportunisticContainers fails in trunk

2016-07-13 Thread Rohith Sharma K S (JIRA)
Rohith Sharma K S created YARN-5377:
---

 Summary: 
TestQueuingContainerManager.testKillMultipleOpportunisticContainers fails in 
trunk
 Key: YARN-5377
 URL: https://issues.apache.org/jira/browse/YARN-5377
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Rohith Sharma K S


The test case fails in the Jenkins build: 
[link|https://builds.apache.org/job/PreCommit-YARN-Build/12228/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt]
{noformat}
Tests run: 6, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 134.586 sec <<< 
FAILURE! - in 
org.apache.hadoop.yarn.server.nodemanager.containermanager.queuing.TestQueuingContainerManager
testKillMultipleOpportunisticContainers(org.apache.hadoop.yarn.server.nodemanager.containermanager.queuing.TestQueuingContainerManager)
  Time elapsed: 32.134 sec  <<< FAILURE!
java.lang.AssertionError: ContainerState is not correct (timedout) 
expected: but was:
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:743)
at org.junit.Assert.assertEquals(Assert.java:118)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.BaseContainerManagerTest.waitForNMContainerState(BaseContainerManagerTest.java:363)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.queuing.TestQueuingContainerManager.testKillMultipleOpportunisticContainers(TestQueuingContainerManager.java:470)
{noformat}






[jira] [Updated] (YARN-5156) YARN_CONTAINER_FINISHED of YARN_CONTAINERs will always have running state

2016-07-13 Thread Vrushali C (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vrushali C updated YARN-5156:
-
Attachment: YARN-5156-YARN-5355.02.patch

Thanks [~varun_saxena]!
Uploading another patch, which removes storing the ContainerStatus in the 
ContainerFinishedEvent in the NMTimelinePublisher.


> YARN_CONTAINER_FINISHED of YARN_CONTAINERs will always have running state
> -
>
> Key: YARN-5156
> URL: https://issues.apache.org/jira/browse/YARN-5156
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Li Lu
>Assignee: Vrushali C
>  Labels: YARN-5355
> Attachments: YARN-5156-YARN-2928.01.patch, 
> YARN-5156-YARN-5355.01.patch, YARN-5156-YARN-5355.02.patch
>
>
> On container finish, we're reporting "YARN_CONTAINER_STATE: "RUNNING"". Did 
> we design this deliberately, or is it a bug?
> {code}
> {
> metrics: [ ],
> events: [
> {
> id: "YARN_CONTAINER_FINISHED",
> timestamp: 1464213765890,
> info: {
> YARN_CONTAINER_EXIT_STATUS: 0,
> YARN_CONTAINER_STATE: "RUNNING",
> YARN_CONTAINER_DIAGNOSTICS_INFO: ""
> }
> },
> {
> id: "YARN_NM_CONTAINER_LOCALIZATION_FINISHED",
> timestamp: 1464213761133,
> info: { }
> },
> {
> id: "YARN_CONTAINER_CREATED",
> timestamp: 1464213761132,
> info: { }
> },
> {
> id: "YARN_NM_CONTAINER_LOCALIZATION_STARTED",
> timestamp: 1464213761132,
> info: { }
> }
> ],
> id: "container_e15_1464213707405_0001_01_18",
> type: "YARN_CONTAINER",
> createdtime: 1464213761132,
> info: {
> YARN_CONTAINER_ALLOCATED_PRIORITY: "20",
> YARN_CONTAINER_ALLOCATED_VCORE: 1,
> YARN_CONTAINER_ALLOCATED_HOST_HTTP_ADDRESS: "10.22.16.164:0",
> UID: 
> "yarn_cluster!application_1464213707405_0001!YARN_CONTAINER!container_e15_1464213707405_0001_01_18",
> YARN_CONTAINER_ALLOCATED_HOST: "10.22.16.164",
> YARN_CONTAINER_ALLOCATED_MEMORY: 1024,
> SYSTEM_INFO_PARENT_ENTITY: {
> type: "YARN_APPLICATION_ATTEMPT",
> id: "appattempt_1464213707405_0001_01"
> },
> YARN_CONTAINER_ALLOCATED_PORT: 64694
> },
> configs: { },
> isrelatedto: { },
> relatesto: { }
> }
> {code}






[jira] [Commented] (YARN-5299) Log Docker run command when container fails

2016-07-13 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376387#comment-15376387
 ] 

Vinod Kumar Vavilapalli commented on YARN-5299:
---

Can we print similar logging for other commands like signalContainer?

For launch failure, does it make sense to expose this full-command further up 
into the container-diagnostics?

Not related to this patch, but another thing that is a little overwhelming is 
the following in DelegatingLinuxContainerRuntime
{code}
if (LOG.isInfoEnabled()) {
  LOG.info("Using container runtime: " + runtime.getClass()
  .getSimpleName());
}
{code}
Make it debug only?
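A minimal sketch of what the suggested change would look like (illustrative, not a patch against DelegatingLinuxContainerRuntime; it assumes commons-logging on the classpath): gate the message on the debug level so it stops appearing on every container launch at the default log level.
{code}
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

class RuntimeSelectionLogging {
  private static final Log LOG = LogFactory.getLog(RuntimeSelectionLogging.class);

  static void logSelectedRuntime(Object runtime) {
    // Same message as the snippet quoted above, but logged at debug level.
    if (LOG.isDebugEnabled()) {
      LOG.debug("Using container runtime: " + runtime.getClass().getSimpleName());
    }
  }
}
{code}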

> Log Docker run command when container fails
> ---
>
> Key: YARN-5299
> URL: https://issues.apache.org/jira/browse/YARN-5299
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Varun Vasudev
>Assignee: Varun Vasudev
> Attachments: YARN-5299.001.patch
>
>
> It's useful to have the docker run command logged when containers fail to 
> help debugging.






[jira] [Comment Edited] (YARN-4888) Changes in RM AppSchedulingInfo for identifying resource-requests explicitly

2016-07-13 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376380#comment-15376380
 ] 

Arun Suresh edited comment on YARN-4888 at 7/14/16 6:12 AM:


Thanks for the patch [~subru]

Couple of initial comments:
# In {{AppSchedulingInfo::checkForDeactivation()}}, given that you are 
introducing an inner for loop, deactivate = false should be set only if ALL 
requests in {{mappedRequest.values()}} have {{request.getNumContainers() > 
0}}, right?
# In {{AppSchedulingInfo::allocateNodeLocal()}}, the {{rackLocalRequest}} and 
{{offRackRequest}} selected from the {{resourceRequestMap}} should correspond to 
the same allocationRequestId as the {{nodeLocalRequest}}. Only when you have a 
single allocationRequestId can you guarantee that {{.firstEntry().getValue()}} 
will correspond to the same nodeLocalRequest in both cases.

With regard to the getMergeResource() method, I have a feeling we should not be 
merging all requests of a {{Priority}}. Consider this alternate approach, which 
I feel might allow us to more easily verify that we are not breaking any 
invariants in the Scheduler:
My intuition is this: if we agree that, in the absence of an 
*allocateRequestId*, we can simulate the same functionality by using a unique 
Priority to tie together all requests that need to share the same 
allocateRequestId, then why not replace the 'Priority' in the 
{{AppSchedulingInfo::resourceRequestMap}} with a new type (called 
*SchedulerPriority*) which is essentially a composite of *Priority + 
allocateRequestId*. If no allocateRequestId is provided, the requestId part of 
it will default to 0. If *SchedulerPriority* is a subclass of *Priority*, 
then we won't even need to change any of the APIs.

Thoughts?
I can help provide a quick prototype patch to verify whether this works.


was (Author: asuresh):
Thanks for the patch [~subru]

Couple of initial comments:
# In {{AppSchedulingInfo::checkForDeactivation()}}, given that you are 
introducing an inner for loop, the deactivate = false should be set only if ALL 
requests in the {{mappedRequest.values()}} has {{request.getNumContainers() > 
0}} right ?
# In {{AppSchedulingInfo::allocateNodeLocal()}}, That {{rackLocalRequest}} and 
{{offRackRequest}} selected from the {{resoureRequestMap}} should correspond to 
the same allocationRequestId of the {{nodeLocalRequest}}. Only when you have a 
single allocationRequestId, can you guarantee that {{.firstEntry().getValue()}} 
will correspond to the same nodeLocalRequest in both cases.

With regard to the getMergeResource() method. I have a feeling we should not be 
merging all requests of a {{Priority}}. Consider this alternate approach, which 
I feel might allow us to more easily verify if we are breaking any invariants 
in the Scheduler:
My intuition is based on the fact that If we agree that, in the absence of an 
*allocateRequestId*, we can simulate the same functionality by using a unique 
Priority to tie all requests we need to be of the same allocateRequestId  
Then, why not replace the 'Priority' in the 
{{AppSchedulingInfo::resourceRequestMap}} with a new type (called 
*SchedulerPriority*) which is essentially a composite of *Priority + 
allocateRequestId*. If not allocateRequestId is provided, it will default to 0. 
If the *SchedulerPriority* is a subclass of *Priority*, then we wont even need 
to change any of the APIs.

Thoughts ?
I can help provide a quick prototype patch to verify if this works..

> Changes in RM AppSchedulingInfo for identifying resource-requests explicitly
> 
>
> Key: YARN-4888
> URL: https://issues.apache.org/jira/browse/YARN-4888
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Subru Krishnan
>Assignee: Subru Krishnan
> Attachments: YARN-4888-v0.patch
>
>
> YARN-4879 puts forward the notion of identifying allocate requests 
> explicitly. This JIRA is to track the changes in RM app scheduling data 
> structures to accomplish it. Please refer to the design doc in the parent 
> JIRA for details.






[jira] [Commented] (YARN-4888) Changes in RM AppSchedulingInfo for identifying resource-requests explicitly

2016-07-13 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376380#comment-15376380
 ] 

Arun Suresh commented on YARN-4888:
---

Thanks for the patch [~subru]

Couple of initial comments:
# In {{AppSchedulingInfo::checkForDeactivation()}}, given that you are 
introducing an inner for loop, deactivate = false should be set only if ALL 
requests in {{mappedRequest.values()}} have {{request.getNumContainers() > 
0}}, right?
# In {{AppSchedulingInfo::allocateNodeLocal()}}, the {{rackLocalRequest}} and 
{{offRackRequest}} selected from the {{resourceRequestMap}} should correspond to 
the same allocationRequestId as the {{nodeLocalRequest}}. Only when you have a 
single allocationRequestId can you guarantee that {{.firstEntry().getValue()}} 
will correspond to the same nodeLocalRequest in both cases.

With regard to the getMergeResource() method, I have a feeling we should not be 
merging all requests of a {{Priority}}. Consider this alternate approach, which 
I feel might allow us to more easily verify that we are not breaking any 
invariants in the Scheduler:
My intuition is this: if we agree that, in the absence of an 
*allocateRequestId*, we can simulate the same functionality by using a unique 
Priority to tie together all requests that need to share the same 
allocateRequestId, then why not replace the 'Priority' in the 
{{AppSchedulingInfo::resourceRequestMap}} with a new type (called 
*SchedulerPriority*) which is essentially a composite of *Priority + 
allocateRequestId*. If no allocateRequestId is provided, it will default to 0. 
If *SchedulerPriority* is a subclass of *Priority*, then we won't even need 
to change any of the APIs.

Thoughts?
I can help provide a quick prototype patch to verify whether this works.
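A rough, hypothetical sketch of the composite-key idea described above (not the actual YARN {{Priority}} class; the class and field names are illustrative): a key that combines priority and allocateRequestId and sorts by priority first, so requests without an explicit id (defaulting to 0) keep their current ordering.
{code}
// Hypothetical composite key; illustrative only.
final class SchedulerPriority implements Comparable<SchedulerPriority> {
  private final int priority;           // smaller value sorts first in this sketch
  private final long allocateRequestId; // defaults to 0 when the AM does not set one

  SchedulerPriority(int priority, long allocateRequestId) {
    this.priority = priority;
    this.allocateRequestId = allocateRequestId;
  }

  SchedulerPriority(int priority) {
    this(priority, 0L); // no explicit allocateRequestId
  }

  @Override
  public int compareTo(SchedulerPriority other) {
    int byPriority = Integer.compare(priority, other.priority);
    // Requests with the same Priority but different allocateRequestIds remain
    // distinct keys in the (sorted) resource-request map.
    return byPriority != 0 ? byPriority
        : Long.compare(allocateRequestId, other.allocateRequestId);
  }

  @Override
  public boolean equals(Object o) {
    if (!(o instanceof SchedulerPriority)) {
      return false;
    }
    SchedulerPriority p = (SchedulerPriority) o;
    return priority == p.priority && allocateRequestId == p.allocateRequestId;
  }

  @Override
  public int hashCode() {
    return 31 * priority + Long.hashCode(allocateRequestId);
  }
}
{code}
Whether this could literally be a subclass of the existing *Priority* type, as suggested above, is left open here.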

> Changes in RM AppSchedulingInfo for identifying resource-requests explicitly
> 
>
> Key: YARN-4888
> URL: https://issues.apache.org/jira/browse/YARN-4888
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Subru Krishnan
>Assignee: Subru Krishnan
> Attachments: YARN-4888-v0.patch
>
>
> YARN-4879 puts forward the notion of identifying allocate requests 
> explicitly. This JIRA is to track the changes in RM app scheduling data 
> structures to accomplish it. Please refer to the design doc in the parent 
> JIRA for details.






[jira] [Updated] (YARN-5363) For AM containers, or for containers of running-apps, "yarn logs" incorrectly only (tries to) shows syslog file-type by default

2016-07-13 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-5363:
--
Attachment: YARN-5363-2016-07-13.1.txt

Tx for the review [~xgong]!
bq. I think that we could add the logic here. So, we do not need to do it 
separately inside several different functions.
This makes perfect sense - code reuse, yay!

Updating the patch with the comments addressed.


> For AM containers, or for containers of running-apps, "yarn logs" incorrectly 
> only (tries to) shows syslog file-type by default
> ---
>
> Key: YARN-5363
> URL: https://issues.apache.org/jira/browse/YARN-5363
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Vinod Kumar Vavilapalli
> Attachments: YARN-5363-2016-07-12.txt, YARN-5363-2016-07-13.1.txt, 
> YARN-5363-2016-07-13.txt
>
>
> For example, for a running application, the following happens:
> {code}
> # yarn logs -applicationId application_1467838922593_0001
> 16/07/06 22:07:05 INFO impl.TimelineClientImpl: Timeline service address: 
> http://:8188/ws/v1/timeline/
> 16/07/06 22:07:06 INFO client.RMProxy: Connecting to ResourceManager at 
> /:8050
> 16/07/06 22:07:07 INFO impl.TimelineClientImpl: Timeline service address: 
> http://l:8188/ws/v1/timeline/
> 16/07/06 22:07:07 INFO client.RMProxy: Connecting to ResourceManager at 
> /:8050
> Can not find any log file matching the pattern: [syslog] for the container: 
> container_e03_1467838922593_0001_01_01 within the application: 
> application_1467838922593_0001
> Can not find any log file matching the pattern: [syslog] for the container: 
> container_e03_1467838922593_0001_01_02 within the application: 
> application_1467838922593_0001
> Can not find any log file matching the pattern: [syslog] for the container: 
> container_e03_1467838922593_0001_01_03 within the application: 
> application_1467838922593_0001
> Can not find any log file matching the pattern: [syslog] for the container: 
> container_e03_1467838922593_0001_01_04 within the application: 
> application_1467838922593_0001
> Can not find any log file matching the pattern: [syslog] for the container: 
> container_e03_1467838922593_0001_01_05 within the application: 
> application_1467838922593_0001
> Can not find any log file matching the pattern: [syslog] for the container: 
> container_e03_1467838922593_0001_01_06 within the application: 
> application_1467838922593_0001
> Can not find any log file matching the pattern: [syslog] for the container: 
> container_e03_1467838922593_0001_01_07 within the application: 
> application_1467838922593_0001
> Can not find the logs for the application: application_1467838922593_0001 
> with the appOwner: 
> {code}






[jira] [Commented] (YARN-5299) Log Docker run command when container fails

2016-07-13 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376370#comment-15376370
 ] 

Rohith Sharma K S commented on YARN-5299:
-

+1 lgtm

> Log Docker run command when container fails
> ---
>
> Key: YARN-5299
> URL: https://issues.apache.org/jira/browse/YARN-5299
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Varun Vasudev
>Assignee: Varun Vasudev
> Attachments: YARN-5299.001.patch
>
>
> It's useful to have the docker run command logged when containers fail to 
> help debugging.






[jira] [Updated] (YARN-5355) YARN Timeline Service v.2: alpha 2

2016-07-13 Thread Vrushali C (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vrushali C updated YARN-5355:
-
Attachment: YARN-5355-branch-2.01.patch


Uploading the patch for back-porting to branch-2.  (YARN-5355-branch-2.01.patch)

> YARN Timeline Service v.2: alpha 2
> --
>
> Key: YARN-5355
> URL: https://issues.apache.org/jira/browse/YARN-5355
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Vrushali C
>Priority: Critical
> Attachments: Timeline Service v2_ Ideas for Next Steps.pdf, 
> YARN-5355-branch-2.01.patch
>
>
> This is an umbrella JIRA for the alpha 2 milestone for YARN Timeline Service 
> v.2.
> This is developed on feature branches: {{YARN-5355}} for the trunk-based 
> development and {{YARN-5355-branch-2}} to maintain backports to branch-2. Any 
> subtask work on this JIRA will be committed to those 2 branches.






[jira] [Updated] (YARN-5355) YARN Timeline Service v.2: alpha 2

2016-07-13 Thread Vrushali C (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vrushali C updated YARN-5355:
-
Assignee: Sangjin Lee  (was: Vrushali C)

> YARN Timeline Service v.2: alpha 2
> --
>
> Key: YARN-5355
> URL: https://issues.apache.org/jira/browse/YARN-5355
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
>Priority: Critical
> Attachments: Timeline Service v2_ Ideas for Next Steps.pdf, 
> YARN-5355-branch-2.01.patch
>
>
> This is an umbrella JIRA for the alpha 2 milestone for YARN Timeline Service 
> v.2.
> This is developed on feature branches: {{YARN-5355}} for the trunk-based 
> development and {{YARN-5355-branch-2}} to maintain backports to branch-2. Any 
> subtask work on this JIRA will be committed to those 2 branches.






[jira] [Assigned] (YARN-5355) YARN Timeline Service v.2: alpha 2

2016-07-13 Thread Vrushali C (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vrushali C reassigned YARN-5355:


Assignee: Vrushali C  (was: Sangjin Lee)

> YARN Timeline Service v.2: alpha 2
> --
>
> Key: YARN-5355
> URL: https://issues.apache.org/jira/browse/YARN-5355
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Vrushali C
>Priority: Critical
> Attachments: Timeline Service v2_ Ideas for Next Steps.pdf
>
>
> This is an umbrella JIRA for the alpha 2 milestone for YARN Timeline Service 
> v.2.
> This is developed on feature branches: {{YARN-5355}} for the trunk-based 
> development and {{YARN-5355-branch-2}} to maintain backports to branch-2. Any 
> subtask work on this JIRA will be committed to those 2 branches.






[jira] [Updated] (YARN-5287) LinuxContainerExecutor fails to set proper permission

2016-07-13 Thread Ying Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ying Zhang updated YARN-5287:
-
Attachment: YARN-5287.002.patch

> LinuxContainerExecutor fails to set proper permission
> -
>
> Key: YARN-5287
> URL: https://issues.apache.org/jira/browse/YARN-5287
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.7.2
>Reporter: Ying Zhang
>Assignee: Ying Zhang
>Priority: Minor
> Attachments: YARN-5287-naga.patch, YARN-5287.001.patch, 
> YARN-5287.002.patch
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> LinuxContainerExecutor fails to set the proper permissions on the local 
> directories (i.e., /hadoop/yarn/local/usercache/... by default) if the cluster 
> has been configured with a restrictive umask, e.g. umask 077. The job failed 
> for the following reason:
> Path /hadoop/yarn/local/usercache/ambari-qa/appcache/application_ has 
> permission 700 but needs permission 750






[jira] [Commented] (YARN-4212) FairScheduler: Parent queues is not allowed to be 'Fair' policy if its children have the "drf" policy

2016-07-13 Thread Yufei Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376314#comment-15376314
 ] 

Yufei Gu commented on YARN-4212:


Thanks [~kasha]. 

> FairScheduler: Parent queues is not allowed to be 'Fair' policy if its 
> children have the "drf" policy
> -
>
> Key: YARN-4212
> URL: https://issues.apache.org/jira/browse/YARN-4212
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Arun Suresh
>Assignee: Yufei Gu
>  Labels: fairscheduler
> Attachments: YARN-4212.002.patch, YARN-4212.003.patch, 
> YARN-4212.004.patch, YARN-4212.1.patch
>
>
> The Fair Scheduler, while performing a {{recomputeShares()}} during an 
> {{update()}} call, uses the parent queue's policy to distribute shares to its 
> children.
> If the parent queue's policy is 'fair', it only computes weights for memory and 
> sets the vcores fair share of its children to 0.
> In a situation where we have one parent queue with policy 'fair' and 
> multiple leaf queues with policy 'drf', any app submitted to the child queues 
> with a vcore requirement > 1 will always be above fair share, since during the 
> recomputeShares process the child queues were all assigned 0 for fair-share 
> vcores.
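A toy illustration of the mismatch described above (numbers and names are made up; this is not FairScheduler code): when shares are computed over memory only, the children's vcore fair share stays 0, so any vcore usage at all counts as being over fair share.
{code}
class FairVsDrfShareSketch {
  public static void main(String[] args) {
    // Parent queue has 100 GB of memory to distribute to two equally weighted children.
    double clusterMemGb = 100.0;
    double[] weights = {1.0, 1.0};
    double weightSum = 2.0;

    for (int i = 0; i < weights.length; i++) {
      // 'fair' policy at the parent: shares are computed over memory only.
      double memShare = clusterMemGb * weights[i] / weightSum;
      double vcoreShare = 0.0;  // vcores are never distributed by the parent

      // A 'drf' child running even a single vcore is now "above fair share" (1 > 0),
      // which is the symptom described in this issue.
      long usedVcores = 1;
      System.out.printf("child %d: memShare=%.0f, vcoreShare=%.0f, aboveFairShare=%b%n",
          i, memShare, vcoreShare, usedVcores > vcoreShare);
    }
  }
}
{code}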






[jira] [Commented] (YARN-4464) default value of yarn.resourcemanager.state-store.max-completed-applications should lower.

2016-07-13 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376295#comment-15376295
 ] 

Naganarasimha G R commented on YARN-4464:
-

Thanks [~vinodkv]. It looks ideal to have the default value be zero, but I am 
not sure all production clusters will adopt ATS immediately; with that in mind 
I thought of keeping around the last 500 ~ 1000 completed apps in the RM.
If everyone is OK with no completed apps in RM memory as the default, then I am 
fine with it; it's a -0 from my side. And I am OK with no change in Hadoop 2.x.


> default value of yarn.resourcemanager.state-store.max-completed-applications 
> should lower.
> --
>
> Key: YARN-4464
> URL: https://issues.apache.org/jira/browse/YARN-4464
> Project: Hadoop YARN
>  Issue Type: Wish
>  Components: resourcemanager
>Reporter: KWON BYUNGCHANG
>Assignee: Daniel Templeton
>Priority: Blocker
> Attachments: YARN-4464.001.patch, YARN-4464.002.patch, 
> YARN-4464.003.patch, YARN-4464.004.patch
>
>
> My cluster has 120 nodes.
> I configured the RM Restart feature.
> {code}
> yarn.resourcemanager.recovery.enabled=true
> yarn.resourcemanager.store.class=org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore
> yarn.resourcemanager.fs.state-store.uri=/system/yarn/rmstore
> {code}
> Unfortunately I did not configure 
> {{yarn.resourcemanager.state-store.max-completed-applications}}, 
> so that property took its default value of 10,000.
> I restarted the RM after changing another configuration 
> and expected the RM to restart immediately, 
> but the recovery process was very slow; I waited about 20 minutes 
> before realizing that 
> {{yarn.resourcemanager.state-store.max-completed-applications}} was missing.
> Its default value is very large.
> We need to lower the default value or document a notice on the [RM Restart 
> page|http://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/ResourceManagerRestart.html].
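A minimal sketch of explicitly bounding the number of completed applications kept in the state store (the property name comes from this issue; whether a low value such as 500 is right for a given cluster is exactly the open question in the discussion above):
{code}
import org.apache.hadoop.conf.Configuration;

class RmStateStoreTuning {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Keep far fewer completed applications in the RM state store than the
    // 10,000 default, so recovery after an RM restart does not replay them all.
    conf.setInt("yarn.resourcemanager.state-store.max-completed-applications", 500);
    System.out.println(
        conf.getInt("yarn.resourcemanager.state-store.max-completed-applications", -1));
  }
}
{code}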






[jira] [Commented] (YARN-5272) Handle queue names consistently in FairScheduler

2016-07-13 Thread Ray Chiang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376276#comment-15376276
 ] 

Ray Chiang commented on YARN-5272:
--

[~wilfreds], let me know if you'd prefer to abstract out the whitespace 
trimming in a follow-up JIRA or if you plan to do it in this patch.

> Handle queue names consistently in FairScheduler
> 
>
> Key: YARN-5272
> URL: https://issues.apache.org/jira/browse/YARN-5272
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.8.0
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
> Attachments: YARN-5272.1.patch, YARN-5272.3.patch, YARN-5272.4.patch
>
>
> The fix used in YARN-3214 uses the JDK trim() method to remove leading and 
> trailing spaces. QueueMetrics uses a Guava-based trim when it splits the 
> queues.
> The Guava-based trim uses the Unicode definition of whitespace, which is 
> different from the Java trim, as can be seen 
> [here|https://docs.google.com/a/cloudera.com/spreadsheets/d/1kq4ECwPjHX9B8QUCTPclgsDCXYaj7T-FlT4tB5q3ahk/pub]
> A queue name with a non-breaking whitespace character will thus still cause 
> the same "Metrics source XXX already exists!" MetricsException.






[jira] [Commented] (YARN-4676) Automatic and Asynchronous Decommissioning Nodes Status Tracking

2016-07-13 Thread Robert Kanter (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376265#comment-15376265
 ] 

Robert Kanter commented on YARN-4676:
-

Thanks for pointing that out [~mingma].  Using the same file format makes sense 
to me.  Would it make sense to move some of that code (i.e. parsing, etc) to 
Common so that we can use the same implementation in HDFS and YARN?

[~danzhi], [~djp], what do you think?

> Automatic and Asynchronous Decommissioning Nodes Status Tracking
> 
>
> Key: YARN-4676
> URL: https://issues.apache.org/jira/browse/YARN-4676
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 2.8.0
>Reporter: Daniel Zhi
>Assignee: Daniel Zhi
>  Labels: features
> Attachments: GracefulDecommissionYarnNode.pdf, 
> GracefulDecommissionYarnNode.pdf, YARN-4676.004.patch, YARN-4676.005.patch, 
> YARN-4676.006.patch, YARN-4676.007.patch, YARN-4676.008.patch, 
> YARN-4676.009.patch, YARN-4676.010.patch, YARN-4676.011.patch, 
> YARN-4676.012.patch, YARN-4676.013.patch, YARN-4676.014.patch, 
> YARN-4676.015.patch, YARN-4676.016.patch
>
>
> YARN-4676 implements an automatic, asynchronous and flexible mechanism to 
> gracefully decommission YARN nodes. After the user issues the refreshNodes 
> request, the ResourceManager automatically evaluates the status of all 
> affected nodes and kicks off decommission or recommission actions. The RM 
> asynchronously tracks container and application status related to 
> DECOMMISSIONING nodes so that the nodes are decommissioned immediately after 
> they are ready to be decommissioned. A decommissioning timeout at 
> individual-node granularity is supported and can be updated dynamically. The 
> mechanism naturally supports multiple independent graceful decommissioning 
> “sessions”, each involving different sets of nodes with different timeout 
> settings. Such support is ideal and necessary for graceful decommission 
> requests issued by external cluster-management software instead of a human.
> DecommissioningNodeWatcher inside ResourceTrackingService tracks 
> DECOMMISSIONING node status automatically and asynchronously after the 
> client/admin makes the graceful decommission request. It tracks 
> DECOMMISSIONING node status to decide when a node, after all running 
> containers on it have completed, will be transitioned into the DECOMMISSIONED 
> state. NodesListManager detects and handles include and exclude list changes 
> to kick off decommission or recommission as necessary.
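A highly simplified sketch of the tracking decision described above (hypothetical types; not the actual DecommissioningNodeWatcher): a node leaves DECOMMISSIONING once its running containers drain to zero or its per-node timeout expires.
{code}
// Illustrative only; names and fields are hypothetical.
class DecommissionTrackingSketch {
  static final class TrackedNode {
    int runningContainers;
    long decommissionStartMillis;
    long timeoutMillis;  // per-node timeout, may differ between "sessions"
  }

  /** Returns true when the node should transition DECOMMISSIONING -> DECOMMISSIONED. */
  static boolean readyToDecommission(TrackedNode node, long nowMillis) {
    boolean drained = node.runningContainers == 0;
    boolean timedOut = nowMillis - node.decommissionStartMillis >= node.timeoutMillis;
    return drained || timedOut;
  }
}
{code}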






[jira] [Updated] (YARN-5376) capacity scheduler crashed while processing APP_ATTEMPT_REMOVED

2016-07-13 Thread sandflee (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sandflee updated YARN-5376:
---
Issue Type: Bug  (was: Improvement)

> capacity scheduler crashed while processing APP_ATTEMPT_REMOVED
> ---
>
> Key: YARN-5376
> URL: https://issues.apache.org/jira/browse/YARN-5376
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: sandflee
> Attachments: capacity-crash.log
>
>
> We are testing the capacity scheduler with an SLS-like client and see the 
> following error; it seems the schedulerNode has been removed.
> {noformat}
> java.lang.NullPointerException
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.completedContainer(LeafQueue.java:1606)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.completedContainer(CapacityScheduler.java:1416)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.doneApplicationAttempt(CapacityScheduler.java:903)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1265)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:121)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:677)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
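One plausible defensive fix for the NPE shown above, sketched against a hypothetical helper rather than the real LeafQueue code (whether skipping the event is the right semantics is exactly what this issue needs to decide):
{code}
// Illustrative sketch only; not the actual LeafQueue.completedContainer() code.
class CompletedContainerGuard {
  /** 'node' stands for the scheduler node that the crashing line dereferences. */
  static boolean shouldProcess(Object node, String containerId) {
    if (node == null) {
      // The node was already removed (for example, a NODE_REMOVED event raced
      // with APP_ATTEMPT_REMOVED); skip instead of crashing the event dispatcher.
      System.err.println("Skipping completedContainer for " + containerId
          + ": scheduler node no longer exists");
      return false;
    }
    return true;  // normal completed-container bookkeeping would continue
  }
}
{code}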






[jira] [Commented] (YARN-5156) YARN_CONTAINER_FINISHED of YARN_CONTAINERs will always have running state

2016-07-13 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376260#comment-15376260
 ] 

Varun Saxena commented on YARN-5156:


I am fine with removing it. We can anyway infer the container state from the 
event: it can be either RUNNING or COMPLETE, and it is COMPLETE only on the 
container-finished event.
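A tiny sketch of the inference being described (the event id strings are taken from the JSON in this issue; the helper itself is hypothetical):
{code}
class ContainerStateFromEvent {
  /** Infers the container state implied by a timeline event id. */
  static String inferState(String eventId) {
    // Only the finished event implies COMPLETE; everything else means the
    // container is (still) RUNNING, so storing the state on the event is redundant.
    return "YARN_CONTAINER_FINISHED".equals(eventId) ? "COMPLETE" : "RUNNING";
  }

  public static void main(String[] args) {
    System.out.println(inferState("YARN_CONTAINER_CREATED"));   // RUNNING
    System.out.println(inferState("YARN_CONTAINER_FINISHED"));  // COMPLETE
  }
}
{code}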

> YARN_CONTAINER_FINISHED of YARN_CONTAINERs will always have running state
> -
>
> Key: YARN-5156
> URL: https://issues.apache.org/jira/browse/YARN-5156
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Li Lu
>Assignee: Vrushali C
>  Labels: YARN-5355
> Attachments: YARN-5156-YARN-2928.01.patch, 
> YARN-5156-YARN-5355.01.patch
>
>
> On container finish, we're reporting "YARN_CONTAINER_STATE: "RUNNING"". Did 
> we design this deliberately, or is it a bug?
> {code}
> {
> metrics: [ ],
> events: [
> {
> id: "YARN_CONTAINER_FINISHED",
> timestamp: 1464213765890,
> info: {
> YARN_CONTAINER_EXIT_STATUS: 0,
> YARN_CONTAINER_STATE: "RUNNING",
> YARN_CONTAINER_DIAGNOSTICS_INFO: ""
> }
> },
> {
> id: "YARN_NM_CONTAINER_LOCALIZATION_FINISHED",
> timestamp: 1464213761133,
> info: { }
> },
> {
> id: "YARN_CONTAINER_CREATED",
> timestamp: 1464213761132,
> info: { }
> },
> {
> id: "YARN_NM_CONTAINER_LOCALIZATION_STARTED",
> timestamp: 1464213761132,
> info: { }
> }
> ],
> id: "container_e15_1464213707405_0001_01_18",
> type: "YARN_CONTAINER",
> createdtime: 1464213761132,
> info: {
> YARN_CONTAINER_ALLOCATED_PRIORITY: "20",
> YARN_CONTAINER_ALLOCATED_VCORE: 1,
> YARN_CONTAINER_ALLOCATED_HOST_HTTP_ADDRESS: "10.22.16.164:0",
> UID: 
> "yarn_cluster!application_1464213707405_0001!YARN_CONTAINER!container_e15_1464213707405_0001_01_18",
> YARN_CONTAINER_ALLOCATED_HOST: "10.22.16.164",
> YARN_CONTAINER_ALLOCATED_MEMORY: 1024,
> SYSTEM_INFO_PARENT_ENTITY: {
> type: "YARN_APPLICATION_ATTEMPT",
> id: "appattempt_1464213707405_0001_01"
> },
> YARN_CONTAINER_ALLOCATED_PORT: 64694
> },
> configs: { },
> isrelatedto: { },
> relatesto: { }
> }
> {code}






[jira] [Commented] (YARN-5156) YARN_CONTAINER_FINISHED of YARN_CONTAINERs will always have running state

2016-07-13 Thread Vrushali C (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376240#comment-15376240
 ] 

Vrushali C commented on YARN-5156:
--

Thanks [~varun_saxena]! 

bq. We are not. Container state is only published in the Finished event. Maybe 
we can either include it everywhere or not have it anywhere.
I see; then I think we should just remove it (as part of this JIRA fix). What 
do you think?

> YARN_CONTAINER_FINISHED of YARN_CONTAINERs will always have running state
> -
>
> Key: YARN-5156
> URL: https://issues.apache.org/jira/browse/YARN-5156
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Li Lu
>Assignee: Vrushali C
>  Labels: YARN-5355
> Attachments: YARN-5156-YARN-2928.01.patch, 
> YARN-5156-YARN-5355.01.patch
>
>
> On container finish, we're reporting "YARN_CONTAINER_STATE: "RUNNING"". Did 
> we design this deliberately, or is it a bug?
> {code}
> {
> metrics: [ ],
> events: [
> {
> id: "YARN_CONTAINER_FINISHED",
> timestamp: 1464213765890,
> info: {
> YARN_CONTAINER_EXIT_STATUS: 0,
> YARN_CONTAINER_STATE: "RUNNING",
> YARN_CONTAINER_DIAGNOSTICS_INFO: ""
> }
> },
> {
> id: "YARN_NM_CONTAINER_LOCALIZATION_FINISHED",
> timestamp: 1464213761133,
> info: { }
> },
> {
> id: "YARN_CONTAINER_CREATED",
> timestamp: 1464213761132,
> info: { }
> },
> {
> id: "YARN_NM_CONTAINER_LOCALIZATION_STARTED",
> timestamp: 1464213761132,
> info: { }
> }
> ],
> id: "container_e15_1464213707405_0001_01_18",
> type: "YARN_CONTAINER",
> createdtime: 1464213761132,
> info: {
> YARN_CONTAINER_ALLOCATED_PRIORITY: "20",
> YARN_CONTAINER_ALLOCATED_VCORE: 1,
> YARN_CONTAINER_ALLOCATED_HOST_HTTP_ADDRESS: "10.22.16.164:0",
> UID: 
> "yarn_cluster!application_1464213707405_0001!YARN_CONTAINER!container_e15_1464213707405_0001_01_18",
> YARN_CONTAINER_ALLOCATED_HOST: "10.22.16.164",
> YARN_CONTAINER_ALLOCATED_MEMORY: 1024,
> SYSTEM_INFO_PARENT_ENTITY: {
> type: "YARN_APPLICATION_ATTEMPT",
> id: "appattempt_1464213707405_0001_01"
> },
> YARN_CONTAINER_ALLOCATED_PORT: 64694
> },
> configs: { },
> isrelatedto: { },
> relatesto: { }
> }
> {code}






[jira] [Comment Edited] (YARN-5342) Improve non-exclusive node partition resource allocation in Capacity Scheduler

2016-07-13 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15375771#comment-15375771
 ] 

Naganarasimha G R edited comment on YARN-5342 at 7/14/16 3:11 AM:
--

Thanks for the patch [~wangda].
Given that the approach discussed in YARN-4425 (fallback-policy based) is going 
to take some time, as it would require significant modifications, I would agree 
to go with an interim modification to optimize non-exclusive mode scheduling.
The only concern I have is: if the size of the default partition is greater than 
the non-exclusive partition, then we reset the counter on a single allocation in 
the default partition; would that be productive?


was (Author: naganarasimha):
Thanks for the patch [~wangda], 
Given that discussed approach in YARN-4225 (Fallback Policy based ) is going to 
take some time as it would require significant modifications, i would agree to 
go for intermittent modification to optimize the non exclusive mode scheduling.
Only concern i have is if the size of default partition is greater than the non 
exclusive partition then on one allocation in default we are resetting the 
counter, would it be productive ?

> Improve non-exclusive node partition resource allocation in Capacity Scheduler
> --
>
> Key: YARN-5342
> URL: https://issues.apache.org/jira/browse/YARN-5342
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-5342.1.patch
>
>
> In the previous implementation, one non-exclusive container allocation is 
> possible only when missed-opportunity >= #cluster-nodes, and 
> missed-opportunity is reset whenever a container is allocated on any node.
> This slows down the rate of container allocation on a non-exclusive 
> node partition: *when a non-exclusive partition=x has idle resources, we can 
> only allocate one container for this app every 
> X=nodemanagers.heartbeat-interval secs for the whole cluster.*
> In this JIRA, I propose a fix to reset missed-opportunity only if we have >0 
> pending resources for the non-exclusive partition OR we get an allocation from 
> the default partition.
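A minimal sketch of the proposed reset condition (hypothetical helper; not the actual CapacityScheduler code):
{code}
class MissedOpportunityResetSketch {
  /**
   * Proposed rule: reset the missed-opportunity counter only if the app still
   * has pending resources on the non-exclusive partition, or the allocation
   * just made came from the default partition.
   */
  static boolean shouldReset(long pendingOnNonExclusivePartition,
                             boolean allocatedFromDefaultPartition) {
    return pendingOnNonExclusivePartition > 0 || allocatedFromDefaultPartition;
  }
}
{code}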






[jira] [Commented] (YARN-5156) YARN_CONTAINER_FINISHED of YARN_CONTAINERs will always have running state

2016-07-13 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376239#comment-15376239
 ] 

Varun Saxena commented on YARN-5156:


[~vrushalic], I think the warning log is not required because it will be 
printed every time. That is because in ContainerImpl the state will not be 
COMPLETE when the event to NMTimelinePublisher is posted.

bq. I think we should include the container state in the finished event, if we 
are including other container states at other times in other events.
We are not. Container state is only published in the Finished event. Maybe we 
can either include it everywhere or not have it anywhere.

> YARN_CONTAINER_FINISHED of YARN_CONTAINERs will always have running state
> -
>
> Key: YARN-5156
> URL: https://issues.apache.org/jira/browse/YARN-5156
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Li Lu
>Assignee: Vrushali C
>  Labels: YARN-5355
> Attachments: YARN-5156-YARN-2928.01.patch, 
> YARN-5156-YARN-5355.01.patch
>
>
> On container finish, we're reporting "YARN_CONTAINER_STATE: "RUNNING"". Is 
> this designed deliberately, or is it a bug? 
> {code}
> {
> metrics: [ ],
> events: [
> {
> id: "YARN_CONTAINER_FINISHED",
> timestamp: 1464213765890,
> info: {
> YARN_CONTAINER_EXIT_STATUS: 0,
> YARN_CONTAINER_STATE: "RUNNING",
> YARN_CONTAINER_DIAGNOSTICS_INFO: ""
> }
> },
> {
> id: "YARN_NM_CONTAINER_LOCALIZATION_FINISHED",
> timestamp: 1464213761133,
> info: { }
> },
> {
> id: "YARN_CONTAINER_CREATED",
> timestamp: 1464213761132,
> info: { }
> },
> {
> id: "YARN_NM_CONTAINER_LOCALIZATION_STARTED",
> timestamp: 1464213761132,
> info: { }
> }
> ],
> id: "container_e15_1464213707405_0001_01_18",
> type: "YARN_CONTAINER",
> createdtime: 1464213761132,
> info: {
> YARN_CONTAINER_ALLOCATED_PRIORITY: "20",
> YARN_CONTAINER_ALLOCATED_VCORE: 1,
> YARN_CONTAINER_ALLOCATED_HOST_HTTP_ADDRESS: "10.22.16.164:0",
> UID: 
> "yarn_cluster!application_1464213707405_0001!YARN_CONTAINER!container_e15_1464213707405_0001_01_18",
> YARN_CONTAINER_ALLOCATED_HOST: "10.22.16.164",
> YARN_CONTAINER_ALLOCATED_MEMORY: 1024,
> SYSTEM_INFO_PARENT_ENTITY: {
> type: "YARN_APPLICATION_ATTEMPT",
> id: "appattempt_1464213707405_0001_01"
> },
> YARN_CONTAINER_ALLOCATED_PORT: 64694
> },
> configs: { },
> isrelatedto: { },
> relatesto: { }
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5376) capacity scheduler crashed while processing APP_ATTEMPT_REMOVED

2016-07-13 Thread sandflee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376233#comment-15376233
 ] 

sandflee commented on YARN-5376:


2.7.2; we did not change the capacity scheduler code.

> capacity scheduler crashed while processing APP_ATTEMPT_REMOVED
> ---
>
> Key: YARN-5376
> URL: https://issues.apache.org/jira/browse/YARN-5376
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: sandflee
> Attachments: capacity-crash.log
>
>
> We are testing the capacity scheduler with an SLS-like client and see the 
> following error; it seems the schedulerNode has been removed.
> {noformat}
> java.lang.NullPointerException
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.completedContainer(LeafQueue.java:1606)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.completedContainer(CapacityScheduler.java:1416)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.doneApplicationAttempt(CapacityScheduler.java:903)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1265)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:121)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:677)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5309) SSLFactory truststore reloader thread leak in TimelineClientImpl

2016-07-13 Thread Weiwei Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated YARN-5309:
--
Priority: Blocker  (was: Major)

> SSLFactory truststore reloader thread leak in TimelineClientImpl
> 
>
> Key: YARN-5309
> URL: https://issues.apache.org/jira/browse/YARN-5309
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver, yarn
>Affects Versions: 2.7.1
>Reporter: Thomas Friedrich
>Assignee: Weiwei Yang
>Priority: Blocker
> Attachments: YARN-5309.001.patch, YARN-5309.002.patch, 
> YARN-5309.003.patch, YARN-5309.004.patch
>
>
> We found a similar issue as HADOOP-11368 in TimelineClientImpl. The class 
> creates an instance of SSLFactory in newSslConnConfigurator and subsequently 
> creates the ReloadingX509TrustManager instance which in turn starts a trust 
> store reloader thread. 
> However, the SSLFactory is never destroyed and hence the trust store reloader 
> threads are not killed.
> This problem was observed by a customer who had SSL enabled in Hadoop and 
> submitted many queries against the HiveServer2. After a few days, the HS2 
> instance crashed and from the Java dump we could see many (over 13000) 
> threads like this:
> "Truststore reloader thread" #126 daemon prio=5 os_prio=0 
> tid=0x7f680d2e3000 nid=0x98fd waiting on 
> condition [0x7f67e482c000]
>java.lang.Thread.State: TIMED_WAITING (sleeping)
> at java.lang.Thread.sleep(Native Method)
> at org.apache.hadoop.security.ssl.ReloadingX509TrustManager.run
> (ReloadingX509TrustManager.java:225)
> at java.lang.Thread.run(Thread.java:745)
> HiveServer2 uses the JobClient to submit a job:
> Thread [HiveServer2-Background-Pool: Thread-188] (Suspended (breakpoint at 
> line 89 in 
> ReloadingX509TrustManager))   
>   owns: Object  (id=464)  
>   owns: Object  (id=465)  
>   owns: Object  (id=466)  
>   owns: ServiceLoader  (id=210)
>   ReloadingX509TrustManager.<init>(String, String, String, long) line: 89 
>   FileBasedKeyStoresFactory.init(SSLFactory$Mode) line: 209   
>   SSLFactory.init() line: 131 
>   TimelineClientImpl.newSslConnConfigurator(int, Configuration) line: 532 
>   TimelineClientImpl.newConnConfigurator(Configuration) line: 507 
>   TimelineClientImpl.serviceInit(Configuration) line: 269 
>   TimelineClientImpl(AbstractService).init(Configuration) line: 163   
>   YarnClientImpl.serviceInit(Configuration) line: 169 
>   YarnClientImpl(AbstractService).init(Configuration) line: 163   
>   ResourceMgrDelegate.serviceInit(Configuration) line: 102
>   ResourceMgrDelegate(AbstractService).init(Configuration) line: 163  
>   ResourceMgrDelegate.<init>(YarnConfiguration) line: 96  
>   YARNRunner.<init>(Configuration) line: 112  
>   YarnClientProtocolProvider.create(Configuration) line: 34   
>   Cluster.initialize(InetSocketAddress, Configuration) line: 95   
>   Cluster.<init>(InetSocketAddress, Configuration) line: 82   
>   Cluster.<init>(Configuration) line: 75  
>   JobClient.init(JobConf) line: 475   
>   JobClient.<init>(JobConf) line: 454 
>   MapRedTask(ExecDriver).execute(DriverContext) line: 401 
>   MapRedTask.execute(DriverContext) line: 137 
>   MapRedTask(Task).executeTask() line: 160 
>   TaskRunner.runSequential() line: 88 
>   Driver.launchTask(Task, String, boolean, String, int, 
> DriverContext) line: 1653   
>   Driver.execute() line: 1412 
> For every job, a new instance of JobClient/YarnClientImpl/TimelineClientImpl 
> is created. But because the HS2 process stays up for days, the previous trust 
> store reloader threads are still hanging around in the HS2 process and 
> eventually use all the resources available. 
> It seems like a similar fix as HADOOP-11368 is needed in TimelineClientImpl 
> but it doesn't have a destroy method to begin with. 
> One option to avoid this problem is to disable the yarn timeline service 
> (yarn.timeline-service.enabled=false).
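For illustration, a HADOOP-11368-style cleanup would look roughly like the 
following. Treat it as a sketch: it assumes the SSLFactory created in 
newSslConnConfigurator were kept in a field (hypothetically named {{sslFactory}}), 
which, as the description notes, the current TimelineClientImpl does not do.
{code}
// Sketch only -- hypothetical cleanup, not the actual patch. Assumes the
// SSLFactory created in newSslConnConfigurator were kept in a field named
// sslFactory (the current code does not retain it).
@Override
protected void serviceStop() throws Exception {
  if (sslFactory != null) {
    sslFactory.destroy(); // stops the "Truststore reloader thread"
  }
  super.serviceStop();
}
{code}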



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5376) capacity scheduler crashed while processing APP_ATTEMPT_REMOVED

2016-07-13 Thread sandflee (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sandflee updated YARN-5376:
---
Attachment: capacity-crash.log

> capacity scheduler crashed while processing APP_ATTEMPT_REMOVED
> ---
>
> Key: YARN-5376
> URL: https://issues.apache.org/jira/browse/YARN-5376
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: sandflee
> Attachments: capacity-crash.log
>
>
> We are testing the capacity scheduler with an SLS-like client and see the 
> following error; it seems the schedulerNode has been removed.
> {noformat}
> java.lang.NullPointerException
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.completedContainer(LeafQueue.java:1606)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.completedContainer(CapacityScheduler.java:1416)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.doneApplicationAttempt(CapacityScheduler.java:903)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1265)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:121)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:677)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5376) capacity scheduler crashed while processing APP_ATTEMPT_REMOVED

2016-07-13 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376222#comment-15376222
 ] 

Sunil G commented on YARN-5376:
---

Hi [~sandflee], which version of Hadoop are you using?

> capacity scheduler crashed while processing APP_ATTEMPT_REMOVED
> ---
>
> Key: YARN-5376
> URL: https://issues.apache.org/jira/browse/YARN-5376
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: sandflee
>
> We are testing the capacity scheduler with an SLS-like client and see the 
> following error; it seems the schedulerNode has been removed.
> {noformat}
> java.lang.NullPointerException
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.completedContainer(LeafQueue.java:1606)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.completedContainer(CapacityScheduler.java:1416)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.doneApplicationAttempt(CapacityScheduler.java:903)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1265)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:121)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:677)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5333) Some recovered apps are put into default queue when RM HA

2016-07-13 Thread Jun Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376219#comment-15376219
 ] 

Jun Gong commented on YARN-5333:


The reason for the test failures in TestRMWebServicesAppsModification (e.g. 
testAppMove) is that they reinitialize CapacityScheduler with a new 
CapacitySchedulerConfiguration before {{rm.start()}}, and reinitializing it twice 
causes problems. From another point of view, however, I think CapacityScheduler 
also needs this patch. [~vinodkv], [~vvasudev], could you please help confirm? 
Thanks!

> Some recovered apps are put into default queue when RM HA
> -
>
> Key: YARN-5333
> URL: https://issues.apache.org/jira/browse/YARN-5333
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jun Gong
>Assignee: Jun Gong
> Attachments: YARN-5333.01.patch, YARN-5333.02.patch
>
>
> Enable RM HA and use FairScheduler, 
> {{yarn.scheduler.fair.allow-undeclared-pools}} is set to false, 
> {{yarn.scheduler.fair.user-as-default-queue}} is set to false.
> Reproduce steps:
> 1. Start two RMs.
> 2. After RMs are running, change both RM's file 
> {{etc/hadoop/fair-scheduler.xml}}, then add some queues.
> 3. Submit some apps to the new added queues.
> 4. Stop the active RM, then the standby RM will transit to active and recover 
> apps.
> However, the new active RM will put recovered apps into the default queue because 
> it may not have loaded the new {{fair-scheduler.xml}} yet. We need to call 
> {{initScheduler}} before starting the active services, or move {{refreshAll()}} in 
> front of {{rm.transitionToActive()}}. *It seems this is also important for 
> other schedulers*.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-5376) capacity scheduler crashed while processing APP_ATTEMPT_REMOVED

2016-07-13 Thread sandflee (JIRA)
sandflee created YARN-5376:
--

 Summary: capacity scheduler crashed while processing 
APP_ATTEMPT_REMOVED
 Key: YARN-5376
 URL: https://issues.apache.org/jira/browse/YARN-5376
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: sandflee


We are testing the capacity scheduler with an SLS-like client and see the 
following error; it seems the schedulerNode has been removed.
{noformat}
java.lang.NullPointerException
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.completedContainer(LeafQueue.java:1606)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.completedContainer(CapacityScheduler.java:1416)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.doneApplicationAttempt(CapacityScheduler.java:903)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1265)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:121)
at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:677)
at java.lang.Thread.run(Thread.java:745)
{noformat}




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5211) Supporting "priorities" in the ReservationSystem

2016-07-13 Thread Sean Po (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Po updated YARN-5211:
--
Issue Type: Improvement  (was: Sub-task)
Parent: (was: YARN-2572)

> Supporting "priorities" in the ReservationSystem
> 
>
> Key: YARN-5211
> URL: https://issues.apache.org/jira/browse/YARN-5211
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacityscheduler, fairscheduler, resourcemanager
>Reporter: Carlo Curino
>Assignee: Sean Po
>
> The ReservationSystem currently has an implicit FIFO priority. This JIRA 
> tracks the effort to generalize this to arbitrary priorities. This is non-trivial, 
> as the greedy nature of our ReservationAgents might need to be revisited if 
> not enough space is found for late-arriving but higher-priority reservations. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5362) TestRMRestart#testFinishedAppRemovalAfterRMRestart can fail

2016-07-13 Thread sandflee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376161#comment-15376161
 ] 

sandflee commented on YARN-5362:


Thanks [~rohithsharma] for the review and commit. Opened YARN-5375 to track 
implicitly invoking drainEvents in MockRM.

> TestRMRestart#testFinishedAppRemovalAfterRMRestart can fail
> ---
>
> Key: YARN-5362
> URL: https://issues.apache.org/jira/browse/YARN-5362
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jason Lowe
>Assignee: sandflee
> Fix For: 2.9.0
>
> Attachments: YARN-5362.01.patch
>
>
> Saw the following in a precommit build that only changed an unrelated unit 
> test:
> {noformat}
> Tests run: 29, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 101.265 sec 
> <<< FAILURE! - in org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart
> testFinishedAppRemovalAfterRMRestart(org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart)
>   Time elapsed: 0.411 sec  <<< FAILURE!
> java.lang.AssertionError: expected null, but 
> was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotNull(Assert.java:664)
>   at org.junit.Assert.assertNull(Assert.java:646)
>   at org.junit.Assert.assertNull(Assert.java:656)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testFinishedAppRemovalAfterRMRestart(TestRMRestart.java:1653)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-5375) invoke MockRM#drainEvents implicitly in MockRM methods to reduce test failures

2016-07-13 Thread sandflee (JIRA)
sandflee created YARN-5375:
--

 Summary: invoke MockRM#drainEvents implicitly in MockRM methods to 
reduce test failures
 Key: YARN-5375
 URL: https://issues.apache.org/jira/browse/YARN-5375
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: sandflee
Assignee: sandflee


We have seen many test failures where an RMApp/RMAppAttempt reaches some state but 
some events are still unprocessed in the RM event queue or the scheduler event 
queue, causing the test to fail. It seems we could implicitly invoke drainEvents 
(which should also drain scheduler events) in some MockRM methods such as 
waitForState.
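A minimal sketch of the idea; MockRM#drainEvents and MockRM#waitForState exist, 
but treat the wrapper and the exact signatures here as hypothetical, since the 
real change would likely live inside MockRM itself:
{code}
// Sketch only: drain pending dispatcher events before asserting on app state.
// The helper and the simplified signatures are hypothetical.
static void waitForStateDrained(MockRM rm, ApplicationId appId, RMAppState expected)
    throws Exception {
  rm.drainEvents();                  // flush pending RM (and ideally scheduler) events
  rm.waitForState(appId, expected);  // then wait for the app to reach the state
}
{code}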



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5361) Obtaining logs for completed container says 'file belongs to a running container ' at the end

2016-07-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376109#comment-15376109
 ] 

Hadoop QA commented on YARN-5361:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 22s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 50s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
18s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 26s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
38s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 53s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
28s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
23s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 42s 
{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
43s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 14s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 14s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 35s 
{color} | {color:red} hadoop-yarn-project/hadoop-yarn: The patch generated 1 
new + 4 unchanged - 1 fixed = 5 total (was 5) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 50s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
23s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
35s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 37s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 16s 
{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 8m 14s {color} 
| {color:red} hadoop-yarn-client in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
18s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 34m 43s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.client.cli.TestLogsCLI |
|   | hadoop.yarn.client.api.impl.TestYarnClient |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12817838/YARN-5361.2.patch |
| JIRA Issue | YARN-5361 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 2fbc753c99e0 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 728bf7f |
| Default Java | 1.8.0_91 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/12318/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/12318/artifact/patc

[jira] [Commented] (YARN-5156) YARN_CONTAINER_FINISHED of YARN_CONTAINERs will always have running state

2016-07-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376107#comment-15376107
 ] 

Hadoop QA commented on YARN-5156:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 11m 
26s {color} | {color:green} YARN-5355 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s 
{color} | {color:green} YARN-5355 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
18s {color} | {color:green} YARN-5355 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 33s 
{color} | {color:green} YARN-5355 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
16s {color} | {color:green} YARN-5355 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
46s {color} | {color:green} YARN-5355 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s 
{color} | {color:green} YARN-5355 passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
22s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 23s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 23s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 13s 
{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
 The patch generated 8 new + 0 unchanged - 0 fixed = 8 total (was 0) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 25s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
10s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s 
{color} | {color:red} The patch has 5 line(s) with tabs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
46s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 14s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 12m 54s 
{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
18s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 30m 44s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12817847/YARN-5156-YARN-5355.01.patch
 |
| JIRA Issue | YARN-5156 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 63e565a26a7b 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | YARN-5355 / 0fd3980 |
| Default Java | 1.8.0_91 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/12319/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-YARN-Build/12319/artifact/patchprocess/whitespace-tabs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/12319/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/12319/console |
| Powered by | Apache Y

[jira] [Commented] (YARN-4759) Revisit signalContainer() for docker containers

2016-07-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376098#comment-15376098
 ] 

Hadoop QA commented on YARN-4759:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
1s {color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 
32s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
16s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 28s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
40s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 16s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
22s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 23s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 0m 23s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 23s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 12s 
{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
 The patch generated 3 new + 18 unchanged - 0 fixed = 21 total (was 18) {color} 
|
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 24s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
10s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
45s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 13s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 12m 55s 
{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
18s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 27m 35s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12817837/YARN-4759.003.patch |
| JIRA Issue | YARN-4759 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  cc  |
| uname | Linux 105334caf068 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 728bf7f |
| Default Java | 1.8.0_91 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/12317/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/12317/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/12317/console |
| Powered by | Apache Yetus 0.3.0   http://yetus.apache.org |


This message was automatically generated.



> Revisit signalContainer() for docker containers
> ---

[jira] [Commented] (YARN-5342) Improve non-exclusive node partition resource allocation in Capacity Scheduler

2016-07-13 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376095#comment-15376095
 ] 

Wangda Tan commented on YARN-5342:
--

[~Naganarasimha], that's a good point; I actually thought about it while 
working on the patch.
The only reason for doing it this way is simplicity. We could have better logic, 
such as gradually decreasing the counter depending on the ratio of #nodes in the 
default partition to #nodes in the specific partition, but that would be more 
complex and could potentially introduce a regression. Please share 
your thoughts. 
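To make the alternative concrete, here is one possible reading of the "gradually 
decrease" variant; all names and the scaling policy are hypothetical, and the 
attached patch intentionally uses the simpler reset instead:
{code}
// Sketch only: decay the counter on an allocation in the default partition,
// scaled by the partition sizes, instead of resetting it to zero.
// Hypothetical names and policy.
long decayMissedOpportunity(long missedOpportunity,
    int nodesInTargetPartition, int nodesInDefaultPartition) {
  // The larger the default partition is relative to the target partition,
  // the smaller the decrement caused by a single allocation in default.
  long step = Math.max(1, nodesInTargetPartition / Math.max(1, nodesInDefaultPartition));
  return Math.max(0, missedOpportunity - step);
}
{code}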

Thanks,

> Improve non-exclusive node partition resource allocation in Capacity Scheduler
> --
>
> Key: YARN-5342
> URL: https://issues.apache.org/jira/browse/YARN-5342
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-5342.1.patch
>
>
> In the previous implementation, one non-exclusive container allocation is 
> possible when the missed-opportunity >= #cluster-nodes, and 
> missed-opportunity is reset when a container is allocated to any node.
> This slows down the frequency of container allocation on the non-exclusive 
> node partition: *when a non-exclusive partition=x has idle resources, we can 
> only allocate one container for this app every 
> X=nodemanagers.heartbeat-interval secs for the whole cluster.*
> In this JIRA, I propose a fix to reset missed-opportunity only if we have >0 
> pending resource for the non-exclusive partition OR we get an allocation from 
> the default partition.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5159) Wrong Javadoc tag in MiniYarnCluster

2016-07-13 Thread Akira Ajisaka (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376089#comment-15376089
 ] 

Akira Ajisaka commented on YARN-5159:
-

bq. I tried that locally before but if I remove the package name the javadoc 
engine will skip it.
Really? o.a.h.yarn.conf.YarnConfiguration is imported in MiniYarnCluster.java, 
so I think that works. I tried it, and the following commands succeeded:
{noformat}
$ cd hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests
$ mvn javadoc:test-javadoc
{noformat}
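For reference, the tag form under discussion, using the simple class name; this 
relies on YarnConfiguration being imported in MiniYarnCluster.java, as noted above:
{code}
/**
 * ... {@value YarnConfiguration#RM_SCHEDULER_INCLUDE_PORT_IN_NODE_NAME} ...
 */
{code}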

> Wrong Javadoc tag in MiniYarnCluster
> 
>
> Key: YARN-5159
> URL: https://issues.apache.org/jira/browse/YARN-5159
> Project: Hadoop YARN
>  Issue Type: Test
>  Components: documentation
>Affects Versions: 2.6.0
>Reporter: Andras Bokor
>Assignee: Andras Bokor
> Fix For: 2.8.0
>
> Attachments: YARN-5159.01.patch, YARN-5159.02.patch, 
> YARN-5159.03.patch
>
>
> {@YarnConfiguration.RM_SCHEDULER_INCLUDE_PORT_IN_NODE_NAME} is wrong. Should 
> be changed to 
>  {@value YarnConfiguration#RM_SCHEDULER_INCLUDE_PORT_IN_NODE_NAME}
> Edit:
> I noted that due to Java 8 javadoc restrictions, the javadoc:test-javadoc goal 
> fails on the hadoop-yarn-server-tests project.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5156) YARN_CONTAINER_FINISHED of YARN_CONTAINERs will always have running state

2016-07-13 Thread Vrushali C (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vrushali C updated YARN-5156:
-
Attachment: YARN-5156-YARN-5355.01.patch

Uploading a patch rebased to the new branch YARN-5355, with the code modified as 
per Varun's points above.

As I mentioned in an earlier comment, I think we should include the container 
state in the finished event if we are including container states at other times 
in other events. This has two purposes:
- ensuring consistency of information within an event
- allowing easier scanning/filtering of the data when state information is 
present. 

I am still wondering what unit test to write; the patch is simple enough. 


> YARN_CONTAINER_FINISHED of YARN_CONTAINERs will always have running state
> -
>
> Key: YARN-5156
> URL: https://issues.apache.org/jira/browse/YARN-5156
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Li Lu
>Assignee: Vrushali C
>  Labels: YARN-5355
> Attachments: YARN-5156-YARN-2928.01.patch, 
> YARN-5156-YARN-5355.01.patch
>
>
> On container finish, we're reporting "YARN_CONTAINER_STATE: "RUNNING"". Is 
> this designed deliberately, or is it a bug? 
> {code}
> {
> metrics: [ ],
> events: [
> {
> id: "YARN_CONTAINER_FINISHED",
> timestamp: 1464213765890,
> info: {
> YARN_CONTAINER_EXIT_STATUS: 0,
> YARN_CONTAINER_STATE: "RUNNING",
> YARN_CONTAINER_DIAGNOSTICS_INFO: ""
> }
> },
> {
> id: "YARN_NM_CONTAINER_LOCALIZATION_FINISHED",
> timestamp: 1464213761133,
> info: { }
> },
> {
> id: "YARN_CONTAINER_CREATED",
> timestamp: 1464213761132,
> info: { }
> },
> {
> id: "YARN_NM_CONTAINER_LOCALIZATION_STARTED",
> timestamp: 1464213761132,
> info: { }
> }
> ],
> id: "container_e15_1464213707405_0001_01_18",
> type: "YARN_CONTAINER",
> createdtime: 1464213761132,
> info: {
> YARN_CONTAINER_ALLOCATED_PRIORITY: "20",
> YARN_CONTAINER_ALLOCATED_VCORE: 1,
> YARN_CONTAINER_ALLOCATED_HOST_HTTP_ADDRESS: "10.22.16.164:0",
> UID: 
> "yarn_cluster!application_1464213707405_0001!YARN_CONTAINER!container_e15_1464213707405_0001_01_18",
> YARN_CONTAINER_ALLOCATED_HOST: "10.22.16.164",
> YARN_CONTAINER_ALLOCATED_MEMORY: 1024,
> SYSTEM_INFO_PARENT_ENTITY: {
> type: "YARN_APPLICATION_ATTEMPT",
> id: "appattempt_1464213707405_0001_01"
> },
> YARN_CONTAINER_ALLOCATED_PORT: 64694
> },
> configs: { },
> isrelatedto: { },
> relatesto: { }
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-5156) YARN_CONTAINER_FINISHED of YARN_CONTAINERs will always have running state

2016-07-13 Thread Vrushali C (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376057#comment-15376057
 ] 

Vrushali C edited comment on YARN-5156 at 7/14/16 12:25 AM:


Thanks [~varun_saxena] for the review discussion. 

Uploading a patch rebased to the new branch YARN-5355, with the code modified as 
per Varun's points above.

As I mentioned in an earlier comment, I think we should include the container 
state in the finished event if we are including container states at other times 
in other events. This has two purposes:
- ensuring consistency of information within an event
- allowing easier scanning/filtering of the data when state information is 
present. 

I am still wondering what unit test to write; the patch is simple enough. 



was (Author: vrushalic):
Uploading patch rebased to new branch YARN-5355 and modifying the code as per 
Varun's points above.

Like I mentioned in an earlier comment, I think we should include the container 
state in the finished event, if we are including other container states at 
other times in other events. This has two purposes:
- ensuring consistency in information within an event
- allowing for easier scanning/filtering in the data when state information is 
present. 

I am still wondering what unit test to write. The patch is simple enough. 


> YARN_CONTAINER_FINISHED of YARN_CONTAINERs will always have running state
> -
>
> Key: YARN-5156
> URL: https://issues.apache.org/jira/browse/YARN-5156
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Li Lu
>Assignee: Vrushali C
>  Labels: YARN-5355
> Attachments: YARN-5156-YARN-2928.01.patch, 
> YARN-5156-YARN-5355.01.patch
>
>
> On container finish, we're reporting "YARN_CONTAINER_STATE: "RUNNING"". Is 
> this designed deliberately, or is it a bug? 
> {code}
> {
> metrics: [ ],
> events: [
> {
> id: "YARN_CONTAINER_FINISHED",
> timestamp: 1464213765890,
> info: {
> YARN_CONTAINER_EXIT_STATUS: 0,
> YARN_CONTAINER_STATE: "RUNNING",
> YARN_CONTAINER_DIAGNOSTICS_INFO: ""
> }
> },
> {
> id: "YARN_NM_CONTAINER_LOCALIZATION_FINISHED",
> timestamp: 1464213761133,
> info: { }
> },
> {
> id: "YARN_CONTAINER_CREATED",
> timestamp: 1464213761132,
> info: { }
> },
> {
> id: "YARN_NM_CONTAINER_LOCALIZATION_STARTED",
> timestamp: 1464213761132,
> info: { }
> }
> ],
> id: "container_e15_1464213707405_0001_01_18",
> type: "YARN_CONTAINER",
> createdtime: 1464213761132,
> info: {
> YARN_CONTAINER_ALLOCATED_PRIORITY: "20",
> YARN_CONTAINER_ALLOCATED_VCORE: 1,
> YARN_CONTAINER_ALLOCATED_HOST_HTTP_ADDRESS: "10.22.16.164:0",
> UID: 
> "yarn_cluster!application_1464213707405_0001!YARN_CONTAINER!container_e15_1464213707405_0001_01_18",
> YARN_CONTAINER_ALLOCATED_HOST: "10.22.16.164",
> YARN_CONTAINER_ALLOCATED_MEMORY: 1024,
> SYSTEM_INFO_PARENT_ENTITY: {
> type: "YARN_APPLICATION_ATTEMPT",
> id: "appattempt_1464213707405_0001_01"
> },
> YARN_CONTAINER_ALLOCATED_PORT: 64694
> },
> configs: { },
> isrelatedto: { },
> relatesto: { }
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5361) Obtaining logs for completed container says 'file belongs to a running container ' at the end

2016-07-13 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-5361:

Attachment: YARN-5361.2.patch

> Obtaining logs for completed container says 'file belongs to a running 
> container ' at the end
> -
>
> Key: YARN-5361
> URL: https://issues.apache.org/jira/browse/YARN-5361
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Sumana Sathish
>Assignee: Xuan Gong
>Priority: Critical
> Attachments: YARN-5361.1.patch, YARN-5361.2.patch
>
>
> Obtaining logs via the yarn CLI for a completed container of a still-running 
> application prints "This log file belongs to a running container 
> (container_e32_1468319707096_0001_01_04) and so may not be complete", 
> which is not correct.
> {code}
> LogType:stdout
> Log Upload Time:Tue Jul 12 10:38:14 + 2016
> Log Contents:
> End of LogType:stdout. This log file belongs to a running container 
> (container_e32_1468319707096_0001_01_04) and so may not be complete.
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-4759) Revisit signalContainer() for docker containers

2016-07-13 Thread Shane Kumpf (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shane Kumpf updated YARN-4759:
--
Attachment: YARN-4759.003.patch

> Revisit signalContainer() for docker containers
> ---
>
> Key: YARN-4759
> URL: https://issues.apache.org/jira/browse/YARN-4759
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Sidharta Seethana
>Assignee: Shane Kumpf
> Attachments: YARN-4759.001.patch, YARN-4759.002.patch, 
> YARN-4759.003.patch
>
>
> The current signal handling (in the DockerContainerRuntime) needs to be 
> revisited for docker containers. For example, container reacquisition on NM 
> restart might not work, depending on which user the process in the container 
> runs as. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4759) Revisit signalContainer() for docker containers

2016-07-13 Thread Shane Kumpf (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376031#comment-15376031
 ] 

Shane Kumpf commented on YARN-4759:
---

Thanks for the review [~vvasudev]! I will upload a new patch shortly.

> Revisit signalContainer() for docker containers
> ---
>
> Key: YARN-4759
> URL: https://issues.apache.org/jira/browse/YARN-4759
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Sidharta Seethana
>Assignee: Shane Kumpf
> Attachments: YARN-4759.001.patch, YARN-4759.002.patch
>
>
> The current signal handling (in the DockerContainerRuntime) needs to be 
> revisited for docker containers. For example, container reacquisition on NM 
> restart might not work, depending on which user the process in the container 
> runs as. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5361) Obtaining logs for completed container says 'file belongs to a running container ' at the end

2016-07-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376027#comment-15376027
 ] 

Hadoop QA commented on YARN-5361:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
58s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 56s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
44s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 6s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
32s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
36s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 45s 
{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
48s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 45s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 45s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 40s 
{color} | {color:red} hadoop-yarn-project/hadoop-yarn: The patch generated 1 
new + 4 unchanged - 1 fixed = 5 total (was 5) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 57s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
24s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
44s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 39s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 23s 
{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 8m 34s {color} 
| {color:red} hadoop-yarn-client in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
19s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 36m 13s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.client.cli.TestLogsCLI |
|   | hadoop.yarn.client.api.impl.TestYarnClient |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12817826/YARN-5361.1.patch |
| JIRA Issue | YARN-5361 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 8e0895a44569 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / d180505 |
| Default Java | 1.8.0_91 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/12316/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/12316/artifact/patchp

[jira] [Comment Edited] (YARN-4743) ResourceManager crash because TimSort

2016-07-13 Thread sandflee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15375989#comment-15375989
 ] 

sandflee edited comment on YARN-4743 at 7/13/16 11:37 PM:
--

I don't think the snapshot in YARN-5371 can resolve this, since there the node is 
only sorted by unused resource. This looks like a case of a > b and b > c, but 
a < c when a and c are compared. We should snapshot all of the elements being 
sorted and then sort, to avoid this; alternatively, we could add 
-Djava.util.Arrays.useLegacyMergeSort=true to YARN_OPTS to use merge sort instead 
of TimSort for Collections#sort.
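A rough sketch of the snapshot-then-sort idea, deliberately simplified to a single 
memory key; all names are hypothetical and this is not the eventual fix:
{code}
// Sketch only -- hypothetical helper, not the eventual fix. Assumes
// java.util.{Comparator, IdentityHashMap, List, Map} and the fair-scheduler
// FSAppAttempt class; the single memory key is a deliberate simplification,
// the real FairShareComparator looks at more than memory.
static void sortBySnapshottedUsage(List<FSAppAttempt> runnableApps) {
  // Capture each app's usage once, so every comparison during the sort sees
  // the same value and the ordering stays transitive.
  Map<FSAppAttempt, Integer> snapshot = new IdentityHashMap<>();
  for (FSAppAttempt app : runnableApps) {
    snapshot.put(app, app.getResourceUsage().getMemory());
  }
  runnableApps.sort(Comparator.comparingInt(snapshot::get));
}
{code}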


was (Author: sandflee):
I don't think snapshot could resolve this, as in YARN-5371, node is only sorted 
with unused resource. this seems caused by a > b, and b > c, but while sorting 
a and c, a < c. we should snapshot all sorting element and then sort to avoid 
this, or could add -Djava.util.Arrays.useLegacyMergeSort=true to YARN_OPS to 
use mergeSort not TimSort for Collection#sort, I think capacity scheduler have 
similar problem.

> ResourceManager crash because TimSort
> -
>
> Key: YARN-4743
> URL: https://issues.apache.org/jira/browse/YARN-4743
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.6.4
>Reporter: Zephyr Guo
>Assignee: Yufei Gu
> Attachments: YARN-4743-cdh5.4.7.patch
>
>
> {code}
> 2016-02-26 14:08:50,821 FATAL 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in 
> handling event type NODE_UPDATE to the scheduler
> java.lang.IllegalArgumentException: Comparison method violates its general 
> contract!
>  at java.util.TimSort.mergeHi(TimSort.java:868)
>  at java.util.TimSort.mergeAt(TimSort.java:485)
>  at java.util.TimSort.mergeCollapse(TimSort.java:410)
>  at java.util.TimSort.sort(TimSort.java:214)
>  at java.util.TimSort.sort(TimSort.java:173)
>  at java.util.Arrays.sort(Arrays.java:659)
>  at java.util.Collections.sort(Collections.java:217)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.assignContainer(FSLeafQueue.java:316)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue.assignContainer(FSParentQueue.java:240)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.attemptScheduling(FairScheduler.java:1091)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.nodeUpdate(FairScheduler.java:989)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1185)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:112)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:684)
>  at java.lang.Thread.run(Thread.java:745)
> 2016-02-26 14:08:50,822 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Exiting, bbye..
> {code}
> Actually, this issue found in 2.6.0-cdh5.4.7.
> I think the cause is that we modify {{Resource}} while we are sorting 
> {{runnableApps}}.
> {code:title=FSLeafQueue.java}
> Comparator<Schedulable> comparator = policy.getComparator();
> writeLock.lock();
> try {
>   Collections.sort(runnableApps, comparator);
> } finally {
>   writeLock.unlock();
> }
> readLock.lock();
> {code}
> {code:title=FairShareComparator}
> public int compare(Schedulable s1, Schedulable s2) {
> ..
>   s1.getResourceUsage(), minShare1);
>   boolean s2Needy = Resources.lessThan(RESOURCE_CALCULATOR, null,
>   s2.getResourceUsage(), minShare2);
>   minShareRatio1 = (double) s1.getResourceUsage().getMemory()
>   / Resources.max(RESOURCE_CALCULATOR, null, minShare1, 
> ONE).getMemory();
>   minShareRatio2 = (double) s2.getResourceUsage().getMemory()
>   / Resources.max(RESOURCE_CALCULATOR, null, minShare2, 
> ONE).getMemory();
> ..
> {code}
> {{getResourceUsage}} will return current Resource. The current Resource is 
> unstable. 
> {code:title=FSAppAttempt.java}
> @Override
>   public Resource getResourceUsage() {
> // Here the getPreemptedResources() always return zero, except in
> // a preemption round
> return Resources.subtract(getCurrentConsumption(), 
> getPreemptedResources());
>   }
> {code}
> {code:title=SchedulerApplicationAttempt}
>  public Resource getCurrentConsumption() {
> return currentConsumption;
>   }
> // This method may modify current Resource.
> public synchronized void recoverContainer(RMContainer rmContainer) {
> ..
> Resources.addTo(currentConsumption, rmContainer.getContainer()
>   .getResource());
> ..
>   }
> {code}
> I suggest using a stable Resource in the comparator.
> Is there something I am getting wrong?

[jira] [Commented] (YARN-5363) For AM containers, or for containers of running-apps, "yarn logs" incorrectly only (tries to) shows syslog file-type by default

2016-07-13 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15375999#comment-15375999
 ] 

Xuan Gong commented on YARN-5363:
-

[~vinodkv] Thanks for the patch.  Overall looks good.

I have one comment:
we did a check for the input log files:
{code}
List<String> logs = new ArrayList<String>();
if (fetchAllLogFiles(logFiles)) {
  logs.add(".*");
} else if (logFiles != null && logFiles.length > 0) {
  logs = Arrays.asList(logFiles);
}
{code}
before we actually run any commands. I think we could add the logic here so that 
we do not need to do it separately inside several different functions.
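
For illustration, consolidating that normalization in one place might look 
roughly like the sketch below. This is only a sketch: the helper name is made 
up, and the {{fetchAll}} flag stands in for the result of the existing 
{{fetchAllLogFiles(logFiles)}} check quoted above.
{code:java}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch only: resolve the requested log-file patterns once, up front, so the
// individual fetch paths no longer repeat this logic.
final class LogPatternUtil {
  static List<String> resolvePatterns(String[] logFiles, boolean fetchAll) {
    List<String> logs = new ArrayList<String>();
    if (fetchAll) {
      logs.add(".*");                    // match every log file type
    } else if (logFiles != null && logFiles.length > 0) {
      logs = Arrays.asList(logFiles);    // only the explicitly requested types
    }
    return logs;
  }
}
{code}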

> For AM containers, or for containers of running-apps, "yarn logs" incorrectly 
> only (tries to) shows syslog file-type by default
> ---
>
> Key: YARN-5363
> URL: https://issues.apache.org/jira/browse/YARN-5363
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Vinod Kumar Vavilapalli
> Attachments: YARN-5363-2016-07-12.txt, YARN-5363-2016-07-13.txt
>
>
> For example, for a running application, the following happens:
> {code}
> # yarn logs -applicationId application_1467838922593_0001
> 16/07/06 22:07:05 INFO impl.TimelineClientImpl: Timeline service address: 
> http://:8188/ws/v1/timeline/
> 16/07/06 22:07:06 INFO client.RMProxy: Connecting to ResourceManager at 
> /:8050
> 16/07/06 22:07:07 INFO impl.TimelineClientImpl: Timeline service address: 
> http://l:8188/ws/v1/timeline/
> 16/07/06 22:07:07 INFO client.RMProxy: Connecting to ResourceManager at 
> /:8050
> Can not find any log file matching the pattern: [syslog] for the container: 
> container_e03_1467838922593_0001_01_01 within the application: 
> application_1467838922593_0001
> Can not find any log file matching the pattern: [syslog] for the container: 
> container_e03_1467838922593_0001_01_02 within the application: 
> application_1467838922593_0001
> Can not find any log file matching the pattern: [syslog] for the container: 
> container_e03_1467838922593_0001_01_03 within the application: 
> application_1467838922593_0001
> Can not find any log file matching the pattern: [syslog] for the container: 
> container_e03_1467838922593_0001_01_04 within the application: 
> application_1467838922593_0001
> Can not find any log file matching the pattern: [syslog] for the container: 
> container_e03_1467838922593_0001_01_05 within the application: 
> application_1467838922593_0001
> Can not find any log file matching the pattern: [syslog] for the container: 
> container_e03_1467838922593_0001_01_06 within the application: 
> application_1467838922593_0001
> Can not find any log file matching the pattern: [syslog] for the container: 
> container_e03_1467838922593_0001_01_07 within the application: 
> application_1467838922593_0001
> Can not find the logs for the application: application_1467838922593_0001 
> with the appOwner: 
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5373) NPE listing wildcard directory in containerLaunch

2016-07-13 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-5373:
-
Summary: NPE listing wildcard directory in containerLaunch  (was: NPE 
introduced by YARN-4958 (The file localization process should allow...))

> NPE listing wildcard directory in containerLaunch
> -
>
> Key: YARN-5373
> URL: https://issues.apache.org/jira/browse/YARN-5373
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.9.0
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>Priority: Critical
>
> YARN-4958 added support for wildcards in file localization. It introduces an 
> NPE at 
> {code:java}
> for (File wildLink : directory.listFiles()) {
> sb.symlink(new Path(wildLink.toString()), new Path(wildLink.getName()));
> }
> {code}
> When directory.listFiles returns null (which only happens in a secure cluster), 
> the NPE causes the container to fail to launch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5373) NPE introduced by YARN-4958 (The file localization process should allow...)

2016-07-13 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-5373:
-
Priority: Critical  (was: Major)

> NPE introduced by YARN-4958 (The file localization process should allow...)
> ---
>
> Key: YARN-5373
> URL: https://issues.apache.org/jira/browse/YARN-5373
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.9.0
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>Priority: Critical
>
> YARN-4958 added support for wildcards in file localization. It introduces an 
> NPE at 
> {code:java}
> for (File wildLink : directory.listFiles()) {
> sb.symlink(new Path(wildLink.toString()), new Path(wildLink.getName()));
> }
> {code}
> When directory.listFiles returns null (which only happens in a secure cluster), 
> the NPE causes the container to fail to launch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4743) ResourceManager crash because TimSort

2016-07-13 Thread sandflee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15375989#comment-15375989
 ] 

sandflee commented on YARN-4743:


I don't think a snapshot could resolve this: as in YARN-5371, nodes are only sorted 
by unused resource. This seems to be caused by a > b and b > c, but when a and c 
are compared, a < c. We should snapshot all of the elements being sorted and then 
sort to avoid this, or we could add -Djava.util.Arrays.useLegacyMergeSort=true to 
YARN_OPTS to use mergeSort instead of TimSort for Collections#sort. I think the 
capacity scheduler has a similar problem.
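
To illustrate the snapshot idea (this is not the actual FairScheduler code; all 
names here are made up): copy the mutable metric once per element before 
sorting, so the comparator only ever sees values that cannot change mid-sort.
{code:java}
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustration only: sorting over an immutable snapshot of a mutable metric,
// which avoids the "Comparison method violates its general contract" failure
// that TimSort raises when the values change while the sort is running.
final class SnapshotSortExample {
  static final class App {
    final String name;
    volatile long usage;           // mutated concurrently in the real scheduler
    App(String name, long usage) { this.name = name; this.usage = usage; }
  }

  static final class Snapshot {
    final App app;
    final long usageAtSort;        // frozen copy used for all comparisons
    Snapshot(App app) { this.app = app; this.usageAtSort = app.usage; }
  }

  static List<App> sortByUsage(List<App> apps) {
    List<Snapshot> snapshots = new ArrayList<Snapshot>(apps.size());
    for (App a : apps) {
      snapshots.add(new Snapshot(a));                // snapshot each element once
    }
    snapshots.sort(Comparator.comparingLong((Snapshot s) -> s.usageAtSort));
    List<App> sorted = new ArrayList<App>(apps.size());
    for (Snapshot s : snapshots) {
      sorted.add(s.app);
    }
    return sorted;
  }
}
{code}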

> ResourceManager crash because TimSort
> -
>
> Key: YARN-4743
> URL: https://issues.apache.org/jira/browse/YARN-4743
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.6.4
>Reporter: Zephyr Guo
>Assignee: Yufei Gu
> Attachments: YARN-4743-cdh5.4.7.patch
>
>
> {code}
> 2016-02-26 14:08:50,821 FATAL 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in 
> handling event type NODE_UPDATE to the scheduler
> java.lang.IllegalArgumentException: Comparison method violates its general 
> contract!
>  at java.util.TimSort.mergeHi(TimSort.java:868)
>  at java.util.TimSort.mergeAt(TimSort.java:485)
>  at java.util.TimSort.mergeCollapse(TimSort.java:410)
>  at java.util.TimSort.sort(TimSort.java:214)
>  at java.util.TimSort.sort(TimSort.java:173)
>  at java.util.Arrays.sort(Arrays.java:659)
>  at java.util.Collections.sort(Collections.java:217)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.assignContainer(FSLeafQueue.java:316)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue.assignContainer(FSParentQueue.java:240)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.attemptScheduling(FairScheduler.java:1091)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.nodeUpdate(FairScheduler.java:989)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1185)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:112)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:684)
>  at java.lang.Thread.run(Thread.java:745)
> 2016-02-26 14:08:50,822 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Exiting, bbye..
> {code}
> Actually, this issue was found in 2.6.0-cdh5.4.7.
> I think the cause is that we modify {{Resource}} while we are sorting 
> {{runnableApps}}.
> {code:title=FSLeafQueue.java}
> Comparator<Schedulable> comparator = policy.getComparator();
> writeLock.lock();
> try {
>   Collections.sort(runnableApps, comparator);
> } finally {
>   writeLock.unlock();
> }
> readLock.lock();
> {code}
> {code:title=FairShareComparator}
> public int compare(Schedulable s1, Schedulable s2) {
> ..
>   s1.getResourceUsage(), minShare1);
>   boolean s2Needy = Resources.lessThan(RESOURCE_CALCULATOR, null,
>   s2.getResourceUsage(), minShare2);
>   minShareRatio1 = (double) s1.getResourceUsage().getMemory()
>   / Resources.max(RESOURCE_CALCULATOR, null, minShare1, 
> ONE).getMemory();
>   minShareRatio2 = (double) s2.getResourceUsage().getMemory()
>   / Resources.max(RESOURCE_CALCULATOR, null, minShare2, 
> ONE).getMemory();
> ..
> {code}
> {{getResourceUsage}} will return the current Resource. The current Resource is 
> unstable. 
> {code:title=FSAppAttempt.java}
> @Override
>   public Resource getResourceUsage() {
> // Here the getPreemptedResources() always return zero, except in
> // a preemption round
> return Resources.subtract(getCurrentConsumption(), 
> getPreemptedResources());
>   }
> {code}
> {code:title=SchedulerApplicationAttempt}
>  public Resource getCurrentConsumption() {
> return currentConsumption;
>   }
> // This method may modify current Resource.
> public synchronized void recoverContainer(RMContainer rmContainer) {
> ..
> Resources.addTo(currentConsumption, rmContainer.getContainer()
>   .getResource());
> ..
>   }
> {code}
> I suggest using a stable Resource in the comparator.
> Is there something I am getting wrong?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5373) NPE introduced by YARN-4958 (The file localization process should allow...)

2016-07-13 Thread Haibo Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15375988#comment-15375988
 ] 

Haibo Chen commented on YARN-5373:
--

As per offline discussion with Daniel, the cause is that in a secure cluster, 
the node manager that executes container launch code runs as a user that has no 
permission to read/execute the local wildcard directory that is downloaded as a 
resource by the remote user. Thus, directory.listFiles() returns null.
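
A minimal sketch of the kind of guard that avoids the NPE (illustrative only, 
not the actual patch; the class and method names are made up):
{code:java}
import java.io.File;
import java.io.IOException;

// Sketch only: File#listFiles() returns null rather than throwing when the
// directory cannot be read (e.g. a permission problem in a secure cluster), so
// check for null and fail with a clear message instead of an NPE.
final class WildcardListing {
  static File[] listOrFail(File directory) throws IOException {
    File[] entries = directory.listFiles();
    if (entries == null) {
      throw new IOException("Could not list " + directory
          + " (not a directory, or permission denied)");
    }
    return entries;
  }
}
{code}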

> NPE introduced by YARN-4958 (The file localization process should allow...)
> ---
>
> Key: YARN-5373
> URL: https://issues.apache.org/jira/browse/YARN-5373
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.9.0
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>
> YARN-4958 added support for wildcards in file localization. It introduces an 
> NPE at 
> {code:java}
> for (File wildLink : directory.listFiles()) {
> sb.symlink(new Path(wildLink.toString()), new Path(wildLink.getName()));
> }
> {code}
> When directory.listFiles returns null (which only happens in a secure cluster), 
> the NPE causes the container to fail to launch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-3649) Allow configurable prefix for hbase table names (like prod, exp, test etc)

2016-07-13 Thread Vrushali C (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15375987#comment-15375987
 ] 

Vrushali C commented on YARN-3649:
--

Thanks Joep, yes I need to rebase. Good point about the documentation, will 
include updates to doc as well.


> Allow configurable prefix for hbase table names (like prod, exp, test etc)
> --
>
> Key: YARN-3649
> URL: https://issues.apache.org/jira/browse/YARN-3649
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Vrushali C
>Assignee: Vrushali C
>  Labels: YARN-5355
> Attachments: YARN-3649-YARN-2928.01.patch
>
>
> As per [~jrottinghuis]'s suggestion in YARN-3411, it would be a good idea to 
> have a configurable prefix for HBase table names.
> This way we can easily run a staging, a test, a production, or any other setup 
> in the same HBase instance without having to override every single table in 
> the config.
> One could simply overwrite the default prefix and you're off and running.
> For the prefix, potential candidates are "tst", "prod", "exp", etc. One can then 
> still override a single table name if needed, but managing one whole setup will 
> be easier.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5164) CapacityOvertimePolicy does not take advantaged of plan RLE

2016-07-13 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15375971#comment-15375971
 ] 

Chris Douglas commented on YARN-5164:
-

Only minor nits, otherwise +1:
{{CapacityOverTimePolicy}}
- Avoid importing java.util.\*
- Where the intermediate points are added, the code would be more readable if 
the key were assigned to a named variable (instead of multiple calls to 
{{e.getKey()}}). Same with the point-wise integral computation
- checkstyle (spacing): {{+  if(e.getValue()!=null) {}}
- A comment briefly sketching the algorithm would help future maintainers

{{NoOverCommitPolicy}}
- The exception message should be reformatted (some redundant string concats) 
and omit references to the time it no longer reports
- Should the {{PlanningException}} be added as a cause, rather than 
concatenated with the ReservationID? (See the sketch below.)
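
For the last point, a toy illustration of chaining the cause rather than 
concatenating its message (IOException is used here only because its 
(String, Throwable) constructor is guaranteed; the same pattern applies to the 
exception used in the patch):
{code:java}
import java.io.IOException;

// Sketch only: keep the message focused on the reservation and let the
// underlying failure travel along as the cause, preserving its stack trace.
final class CauseChainingExample {
  static void rejectReservation(String reservationId, Exception planningFailure)
      throws IOException {
    throw new IOException("Reservation " + reservationId
        + " rejected by the overcommit check", planningFailure);
  }
}
{code}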

> CapacityOvertimePolicy does not take advantaged of plan RLE
> ---
>
> Key: YARN-5164
> URL: https://issues.apache.org/jira/browse/YARN-5164
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacityscheduler, fairscheduler, resourcemanager
>Reporter: Carlo Curino
>Assignee: Carlo Curino
> Attachments: YARN-5164-example.pdf, YARN-5164-inclusive.4.patch, 
> YARN-5164-inclusive.5.patch, YARN-5164.1.patch, YARN-5164.2.patch, 
> YARN-5164.5.patch, YARN-5164.6.patch
>
>
> As a consequence, small time granularities (e.g., 1 sec) and a long time horizon 
> for a reservation (e.g., months) run rather slowly (10 sec). 
> The proposed resolution is to switch to interval math in checking, similar to how 
> YARN-4359 does for agents.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5361) Obtaining logs for completed container says 'file belongs to a running container ' at the end

2016-07-13 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15375956#comment-15375956
 ] 

Xuan Gong commented on YARN-5361:
-

It's not straightforward to add a unit test. I have tested locally.

> Obtaining logs for completed container says 'file belongs to a running 
> container ' at the end
> -
>
> Key: YARN-5361
> URL: https://issues.apache.org/jira/browse/YARN-5361
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Sumana Sathish
>Assignee: Xuan Gong
>Priority: Critical
> Attachments: YARN-5361.1.patch
>
>
> Obtaining logs via the yarn CLI for a completed container of a still-running 
> application says "This log file belongs to a running container 
> (container_e32_1468319707096_0001_01_04) and so may not be complete", 
> which is not correct.
> {code}
> LogType:stdout
> Log Upload Time:Tue Jul 12 10:38:14 + 2016
> Log Contents:
> End of LogType:stdout. This log file belongs to a running container 
> (container_e32_1468319707096_0001_01_04) and so may not be complete.
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5361) Obtaining logs for completed container says 'file belongs to a running container ' at the end

2016-07-13 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-5361:

Attachment: YARN-5361.1.patch

> Obtaining logs for completed container says 'file belongs to a running 
> container ' at the end
> -
>
> Key: YARN-5361
> URL: https://issues.apache.org/jira/browse/YARN-5361
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Sumana Sathish
>Assignee: Xuan Gong
>Priority: Critical
> Attachments: YARN-5361.1.patch
>
>
> Obtaining logs via the yarn CLI for a completed container of a still-running 
> application says "This log file belongs to a running container 
> (container_e32_1468319707096_0001_01_04) and so may not be complete", 
> which is not correct.
> {code}
> LogType:stdout
> Log Upload Time:Tue Jul 12 10:38:14 + 2016
> Log Contents:
> End of LogType:stdout. This log file belongs to a running container 
> (container_e32_1468319707096_0001_01_04) and so may not be complete.
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-3662) Federation Membership State APIs

2016-07-13 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15375947#comment-15375947
 ] 

Wangda Tan commented on YARN-3662:
--

Hi [~subru],

I took a very quick look at this patch and also at YARN-3664/YARN-5367; I have 
put all my questions and comments here:

Questions:
- I am not quite sure what FederationPolicy is and how to use the class. Is it a 
state or a configuration? And why compress the parameters into a byte array 
instead of using more meaningful fields?
- It would be better to add the RPC service interface definitions of the 
FederationPolicy storage API for easier review; right now I cannot understand 
how these protocol definitions will be used.

(High-level) Comments:
- FederationMembershipState looks like a "state manager" since it supports 
operations to modify existing members. At first glance, it is a 
sub-cluster resource tracker similar to the existing RM resource tracker.
- Similarly, FederationApplicationState looks like a 
"federation-application-manager" instead of a "state".
- FederationMembershipState has the same parameter, FederationSubClusterInfo, for 
register/heartbeat -- is it possible that we require different parameters for 
registration and heartbeat? (Just like the NM registration request and NM update 
request.)
- FederationSubClusterInfo: fields like amRMAddress are actually service 
endpoints; the names of these fields are a little confusing to me.

Styles:
- Redundant "public" in all interface definitions (consider switching to 
IntelliJ instead of Eclipse? :-p)

Thanks,

> Federation Membership State APIs
> 
>
> Key: YARN-3662
> URL: https://issues.apache.org/jira/browse/YARN-3662
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Subru Krishnan
>Assignee: Subru Krishnan
> Attachments: YARN-3662-YARN-2915-v1.1.patch, 
> YARN-3662-YARN-2915-v1.patch, YARN-3662-YARN-2915-v2.patch
>
>
> The Federation Application State encapsulates the information about the 
> active RM of each sub-cluster that is participating in Federation. The 
> information includes addresses for ClientRM, ApplicationMaster and Admin 
> services along with the sub_cluster _capability_ which is currently defined 
> by *ClusterMetricsInfo*. Please refer to the design doc in parent JIRA for 
> further details.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5298) Mount usercache and NM filecache directories into Docker container

2016-07-13 Thread Sidharta Seethana (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15375910#comment-15375910
 ] 

Sidharta Seethana commented on YARN-5298:
-

Thanks, [~vvasudev] and [~templedf] !

[~vvasudev], about the container-specific directories: the Docker container 
runtime itself makes no assumptions about the location of the container-specific 
or non-container-specific directories. It does not know of or assume a 
parent/sub-directory structure and explicitly mounts all required directories. 
I hope that answers your question. 

> Mount usercache and NM filecache directories into Docker container
> --
>
> Key: YARN-5298
> URL: https://issues.apache.org/jira/browse/YARN-5298
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Varun Vasudev
>Assignee: Sidharta Seethana
> Attachments: YARN-5298.001.patch, YARN-5298.002.patch
>
>
> Currently, we don't mount the usercache and the NM filecache directories into 
> the Docker container. This can lead to issues with containers that rely on 
> public and application scope resources.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5181) ClusterNodeTracker: add method to get list of nodes matching a specific resourceName

2016-07-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15375904#comment-15375904
 ] 

Hadoop QA commented on YARN-5181:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
52s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
20s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 38s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
16s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
57s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 20s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
30s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 30s {color} 
| {color:red} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager
 generated 1 new + 2 unchanged - 1 fixed = 3 total (was 3) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 18s 
{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 3 new + 1 unchanged - 0 fixed = 4 total (was 1) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 36s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
59s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 18s 
{color} | {color:red} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager
 generated 2 new + 989 unchanged - 0 fixed = 991 total (was 989) {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 33m 9s 
{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
17s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 47m 41s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12807049/yarn-5181-1.patch |
| JIRA Issue | YARN-5181 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 46276eaa1b34 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / af8f480 |
| Default Java | 1.8.0_91 |
| findbugs | v3.0.0 |
| javac | 
https://builds.apache.org/job/PreCommit-YARN-Build/12315/artifact/patchprocess/diff-compile-javac-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/12315/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
| javadoc | 
https://builds.apache.org/job/PreCommit-YARN-Build/12315/artifact/patchprocess/diff-javadoc-javadoc-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |

[jira] [Commented] (YARN-5339) passing file to -out for YARN log CLI doesnt give warning or error code

2016-07-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15375896#comment-15375896
 ] 

Hudson commented on YARN-5339:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #10093 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/10093/])
YARN-5339. Fixed "yarn logs" to fail when a file is passed to -out (vinodkv: 
rev d18050522c5c6bd9e32eb9a1be4ffe2288624c40)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/LogsCLI.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/cli/TestLogsCLI.java


> passing file to -out for YARN log CLI doesnt give warning or error code
> ---
>
> Key: YARN-5339
> URL: https://issues.apache.org/jira/browse/YARN-5339
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Sumana Sathish
>Assignee: Xuan Gong
> Fix For: 2.9.0
>
> Attachments: YARN-5339.1.patch, YARN-5339.2.patch
>
>
> passing file to -out for YARN log CLI doesnt give warning or error code
> {code}
> yarn  logs -applicationId application_1467117709224_0003 -out 
> /grid/0/hadoopqe/artifacts/file.txt
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4464) default value of yarn.resourcemanager.state-store.max-completed-applications should lower.

2016-07-13 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15375867#comment-15375867
 ] 

Vinod Kumar Vavilapalli commented on YARN-4464:
---

We need ATS in production - aka ATS V2. With that in the picture, I agree that 
we don't need to keep any completed applications in RM memory at all.

> default value of yarn.resourcemanager.state-store.max-completed-applications 
> should lower.
> --
>
> Key: YARN-4464
> URL: https://issues.apache.org/jira/browse/YARN-4464
> Project: Hadoop YARN
>  Issue Type: Wish
>  Components: resourcemanager
>Reporter: KWON BYUNGCHANG
>Assignee: Daniel Templeton
>Priority: Blocker
> Attachments: YARN-4464.001.patch, YARN-4464.002.patch, 
> YARN-4464.003.patch, YARN-4464.004.patch
>
>
> My cluster has 120 nodes.
> I configured the RM Restart feature.
> {code}
> yarn.resourcemanager.recovery.enabled=true
> yarn.resourcemanager.store.class=org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore
> yarn.resourcemanager.fs.state-store.uri=/system/yarn/rmstore
> {code}
> Unfortunately, I did not configure 
> {{yarn.resourcemanager.state-store.max-completed-applications}}, 
> so that property took the default value of 10,000.
> I restarted the RM after changing another configuration and 
> expected the RM to restart immediately, but 
> the recovery process was very slow; I waited about 20 minutes before 
> realizing I was missing 
> {{yarn.resourcemanager.state-store.max-completed-applications}}.
> Its default value is very large.
> We need to change it to a lower value or add a note on the [RM Restart 
> page|http://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/ResourceManagerRestart.html].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5181) ClusterNodeTracker: add method to get list of nodes matching a specific resourceName

2016-07-13 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15375866#comment-15375866
 ] 

Arun Suresh commented on YARN-5181:
---

Thanks for the patch [~kasha]

Some minor comments:
# Remove the unused import
# Maybe rename getNodes(String) to getNodesWithName(String) so that we 
don't need to cast null to (NodeFilter) in getAllNodes()? (See the sketch below.)
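
A toy sketch of why the rename helps (this is not the ClusterNodeTracker code; 
Predicate stands in for NodeFilter here): with two overloads, passing null is 
ambiguous and needs a cast, while a distinct method name does not.
{code:java}
import java.util.Collections;
import java.util.List;
import java.util.function.Predicate;

final class OverloadExample {
  List<String> getNodes(String resourceName) { return Collections.emptyList(); }
  List<String> getNodes(Predicate<String> filter) { return Collections.emptyList(); }

  List<String> allNodesWithCast() {
    // getNodes(null) would be ambiguous between the two overloads,
    // so a cast is required today.
    return getNodes((Predicate<String>) null);
  }

  // Renamed variant: no overload clash, so null needs no cast.
  List<String> getNodesWithName(String resourceName) { return Collections.emptyList(); }

  List<String> allNodesNoCast() {
    return getNodesWithName(null);
  }
}
{code}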




> ClusterNodeTracker: add method to get list of nodes matching a specific 
> resourceName
> 
>
> Key: YARN-5181
> URL: https://issues.apache.org/jira/browse/YARN-5181
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler
>Affects Versions: 2.8.0
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
> Attachments: yarn-5181-1.patch
>
>
> ClusterNodeTracker should have a method to return the list of nodes matching 
> a particular resourceName. This is so we could identify what all nodes a 
> particular ResourceRequest is interested in, which in turn is useful in 
> YARN-5139 (global scheduler) and YARN-4752 (FairScheduler preemption 
> overhaul). 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4464) default value of yarn.resourcemanager.state-store.max-completed-applications should lower.

2016-07-13 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15375862#comment-15375862
 ] 

Daniel Templeton commented on YARN-4464:


With ATS, I don't see a lot of need to keep 10k completed apps lying about. Not 
only is it a startup burden, but it also is a ZK burden.  We regularly tell 
customers to set it lower because of ZK cache load.  Improving the recovery 
logic is something we should also do, but the best doesn't need to be the enemy 
of the good.  [~vinodkv], [~Naganarasimha], [~kasha], can we come to a 
conclusion?
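
For example, a lower setting would look like the following (shown in the same 
property=value form as the description below; 1000 is only an illustrative 
value, not an agreed-upon default):
{code}
yarn.resourcemanager.max-completed-applications=1000
yarn.resourcemanager.state-store.max-completed-applications=1000
{code}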

> default value of yarn.resourcemanager.state-store.max-completed-applications 
> should lower.
> --
>
> Key: YARN-4464
> URL: https://issues.apache.org/jira/browse/YARN-4464
> Project: Hadoop YARN
>  Issue Type: Wish
>  Components: resourcemanager
>Reporter: KWON BYUNGCHANG
>Assignee: Daniel Templeton
>Priority: Blocker
> Attachments: YARN-4464.001.patch, YARN-4464.002.patch, 
> YARN-4464.003.patch, YARN-4464.004.patch
>
>
> My cluster has 120 nodes.
> I configured the RM Restart feature.
> {code}
> yarn.resourcemanager.recovery.enabled=true
> yarn.resourcemanager.store.class=org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore
> yarn.resourcemanager.fs.state-store.uri=/system/yarn/rmstore
> {code}
> Unfortunately, I did not configure 
> {{yarn.resourcemanager.state-store.max-completed-applications}}, 
> so that property took the default value of 10,000.
> I restarted the RM after changing another configuration and 
> expected the RM to restart immediately, but 
> the recovery process was very slow; I waited about 20 minutes before 
> realizing I was missing 
> {{yarn.resourcemanager.state-store.max-completed-applications}}.
> Its default value is very large.
> We need to change it to a lower value or add a note on the [RM Restart 
> page|http://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/ResourceManagerRestart.html].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5339) passing file to -out for YARN log CLI doesnt give warning or error code

2016-07-13 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15375833#comment-15375833
 ] 

Vinod Kumar Vavilapalli commented on YARN-5339:
---

Looks good, +1. Checking this in.

> passing file to -out for YARN log CLI doesnt give warning or error code
> ---
>
> Key: YARN-5339
> URL: https://issues.apache.org/jira/browse/YARN-5339
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Sumana Sathish
>Assignee: Xuan Gong
> Attachments: YARN-5339.1.patch, YARN-5339.2.patch
>
>
> passing file to -out for YARN log CLI doesnt give warning or error code
> {code}
> yarn  logs -applicationId application_1467117709224_0003 -out 
> /grid/0/hadoopqe/artifacts/file.txt
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Resolved] (YARN-5371) FairScheduer ContinuousScheduling thread throws Exception

2016-07-13 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla resolved YARN-5371.

Resolution: Duplicate

> FairScheduer ContinuousScheduling thread throws Exception
> -
>
> Key: YARN-5371
> URL: https://issues.apache.org/jira/browse/YARN-5371
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.2
>Reporter: sandflee
>Assignee: sandflee
>Priority: Critical
>
> {noformat}
> java.lang.IllegalArgumentException: Comparison method violates its general 
> contract!
> at java.util.TimSort.mergeLo(TimSort.java:777)
> at java.util.TimSort.mergeAt(TimSort.java:514)
> at java.util.TimSort.mergeCollapse(TimSort.java:441)
> at java.util.TimSort.sort(TimSort.java:245)
> at java.util.Arrays.sort(Arrays.java:1512)
> at java.util.ArrayList.sort(ArrayList.java:1454)
> at java.util.Collections.sort(Collections.java:175)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:1002)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:285)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5371) FairScheduer ContinuousScheduling thread throws Exception

2016-07-13 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-5371:
---
Priority: Critical  (was: Major)

> FairScheduer ContinuousScheduling thread throws Exception
> -
>
> Key: YARN-5371
> URL: https://issues.apache.org/jira/browse/YARN-5371
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.2
>Reporter: sandflee
>Assignee: sandflee
>Priority: Critical
>
> {noformat}
> java.lang.IllegalArgumentException: Comparison method violates its general 
> contract!
> at java.util.TimSort.mergeLo(TimSort.java:777)
> at java.util.TimSort.mergeAt(TimSort.java:514)
> at java.util.TimSort.mergeCollapse(TimSort.java:441)
> at java.util.TimSort.sort(TimSort.java:245)
> at java.util.Arrays.sort(Arrays.java:1512)
> at java.util.ArrayList.sort(ArrayList.java:1454)
> at java.util.Collections.sort(Collections.java:175)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:1002)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:285)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4767) Network issues can cause persistent RM UI outage

2016-07-13 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15375812#comment-15375812
 ] 

Daniel Templeton commented on YARN-4767:


Ping [~xgong], [~vinodkv].  Would love feedback on the approach in this patch.  
Thanks!

> Network issues can cause persistent RM UI outage
> 
>
> Key: YARN-4767
> URL: https://issues.apache.org/jira/browse/YARN-4767
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Affects Versions: 2.7.2
>Reporter: Daniel Templeton
>Assignee: Daniel Templeton
>Priority: Critical
> Attachments: YARN-4767.001.patch, YARN-4767.002.patch, 
> YARN-4767.003.patch, YARN-4767.004.patch, YARN-4767.005.patch, 
> YARN-4767.006.patch, YARN-4767.007.patch
>
>
> If a network issue causes an AM web app to resolve the RM proxy's address to 
> something other than what's listed in the allowed proxies list, the 
> AmIpFilter will 302 redirect the RM proxy's request back to the RM proxy.  
> The RM proxy will then consume all available handler threads connecting to 
> itself over and over, resulting in an outage of the web UI.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4212) FairScheduler: Parent queues is not allowed to be 'Fair' policy if its children have the "drf" policy

2016-07-13 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15375809#comment-15375809
 ] 

Karthik Kambatla commented on YARN-4212:


I am perfectly open to working on YARN-5264 first. Happy to review. 

> FairScheduler: Parent queues is not allowed to be 'Fair' policy if its 
> children have the "drf" policy
> -
>
> Key: YARN-4212
> URL: https://issues.apache.org/jira/browse/YARN-4212
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Arun Suresh
>Assignee: Yufei Gu
>  Labels: fairscheduler
> Attachments: YARN-4212.002.patch, YARN-4212.003.patch, 
> YARN-4212.004.patch, YARN-4212.1.patch
>
>
> The Fair Scheduler, while performing a {{recomputeShares()}} during an 
> {{update()}} call, uses the parent queue's policy to distribute shares to its 
> children.
> If the parent queue's policy is 'fair', it only computes weight for memory and 
> sets the vcores fair share of its children to 0.
> Assuming a situation where we have 1 parent queue with policy 'fair' and 
> multiple leaf queues with policy 'drf', any app submitted to the child queues 
> with a vcore requirement > 1 will always be above fairshare, since during the 
> recomputeShares process, the child queues were all assigned 0 for fairshare 
> vcores.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5373) NPE introduced by YARN-4958 (The file localization process should allow...)

2016-07-13 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-5373:
-
Description: 
YARN-4958 added support for wildcards in file localization. It introduces an NPE 
at 
{code:java}
for (File wildLink : directory.listFiles()) {
sb.symlink(new Path(wildLink.toString()), new Path(wildLink.getName()));
}
{code}
When directory.listFiles returns null (which only happens in a secure cluster), 
the NPE causes the container to fail to launch.

  was:
YARN-4958 added support for wildcards in file localization. It introduces a NPE 
at 
{code:java}
for (File wildLink : directory.listFiles()) {
sb.symlink(new Path(wildLink.toString()), new Path(wildLink.getName()));
}
{code}
When directory.listFiles returns null, NPE will cause the container fail to 
launch.


> NPE introduced by YARN-4958 (The file localization process should allow...)
> ---
>
> Key: YARN-5373
> URL: https://issues.apache.org/jira/browse/YARN-5373
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.9.0
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>
> YARN-4958 added support for wildcards in file localization. It introduces an 
> NPE at 
> {code:java}
> for (File wildLink : directory.listFiles()) {
> sb.symlink(new Path(wildLink.toString()), new Path(wildLink.getName()));
> }
> {code}
> When directory.listFiles returns null (which only happens in a secure cluster), 
> the NPE causes the container to fail to launch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5304) Ship single node HBase config option with single startup command

2016-07-13 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15375807#comment-15375807
 ] 

Karthik Kambatla commented on YARN-5304:


I spoke to [~esteban] about this. In his opinion, the minicluster approach 
(master, RS etc. in a single process) is discouraged. I am assuming the goal is 
to do a pseudo-distributed setup of HBase - Master and RegionServer in 
different processes. 
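
A minimal sketch of such a setup with a stock HBase tarball (the paths and the 
use of HBase-managed ZooKeeper are assumptions; hbase.cluster.distributed=true 
would be set in hbase-site.xml):
{code}
# Start each daemon in its own process (pseudo-distributed mode):
$HBASE_HOME/bin/hbase-daemon.sh start zookeeper
$HBASE_HOME/bin/hbase-daemon.sh start master
$HBASE_HOME/bin/hbase-daemon.sh start regionserver
{code}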

> Ship single node HBase config option with single startup command
> 
>
> Key: YARN-5304
> URL: https://issues.apache.org/jira/browse/YARN-5304
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Joep Rottinghuis
>Assignee: Joep Rottinghuis
>  Labels: YARN-5355
>
> For small to medium Hadoop deployments we should make it dead-simple to use 
> the timeline service v2. We should have a single command to launch and stop 
> the timelineservice back-end for the default HBase implementation.
> A default config with all the recommended settings should be packaged so that a 
> single command launches all the needed daemons (on the RM node).
> A timeline admin command, perhaps an init command, might be needed, or 
> perhaps the timeline service can even auto-detect this and create tables, 
> deploy the needed coprocessors, etc.
> The overall purpose is to ensure nobody needs to be an HBase expert to get 
> this going. For those cluster operators with HBase experience, they can 
> choose their own more sophisticated deployment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5343) TestContinuousScheduling#testSortedNodes fail intermittently

2016-07-13 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15375800#comment-15375800
 ] 

Karthik Kambatla commented on YARN-5343:


I remember [~yufeigu] was looking into this. [~yufeigu] - does [~sandflee]'s 
analysis help? 

> TestContinuousScheduling#testSortedNodes fail intermittently
> 
>
> Key: YARN-5343
> URL: https://issues.apache.org/jira/browse/YARN-5343
> Project: Hadoop YARN
>  Issue Type: Test
>Reporter: sandflee
>Priority: Minor
>
> {noformat}
> java.lang.AssertionError: expected:<2> but was:<1>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestContinuousScheduling.testSortedNodes(TestContinuousScheduling.java:167)
> {noformat}
> https://builds.apache.org/job/PreCommit-YARN-Build/12250/testReport/org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair/TestContinuousScheduling/testSortedNodes/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5342) Improve non-exclusive node partition resource allocation in Capacity Scheduler

2016-07-13 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15375771#comment-15375771
 ] 

Naganarasimha G R commented on YARN-5342:
-

Thanks for the patch [~wangda].
Given that the approach discussed in YARN-4225 (fallback-policy based) is going to 
take some time, as it would require significant modifications, I would agree to 
go with an interim modification to optimize non-exclusive mode scheduling.
The only concern I have: if the size of the default partition is greater than that 
of the non-exclusive partition, then we reset the counter on a single allocation 
in the default partition; would that be productive?
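
For reference, a sketch of the proposed reset rule as I read it (the field and 
method names are made up; this is not the actual CapacityScheduler code):
{code:java}
// Sketch only: reset the missed-opportunity counter only when the allocation
// is actually relevant to this app's non-exclusive request.
final class MissedOpportunityPolicy {
  private long missedOpportunities;

  void onAllocation(long pendingOnRequestedPartition, boolean onDefaultPartition) {
    if (pendingOnRequestedPartition > 0 || onDefaultPartition) {
      missedOpportunities = 0;   // proposed: reset only in these two cases
    }
    // otherwise keep the counter, so non-exclusive allocation is not throttled
  }

  void onMissedOpportunity() {
    missedOpportunities++;
  }
}
{code}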

> Improve non-exclusive node partition resource allocation in Capacity Scheduler
> --
>
> Key: YARN-5342
> URL: https://issues.apache.org/jira/browse/YARN-5342
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-5342.1.patch
>
>
> In the previous implementation, one non-exclusive container allocation is 
> possible when the missed-opportunity >= #cluster-nodes, and 
> missed-opportunity is reset when a container is allocated on any node.
> This slows down the frequency of container allocation on a non-exclusive 
> node partition: *When a non-exclusive partition=x has idle resources, we can 
> only allocate one container for this app in every 
> X=nodemanagers.heartbeat-interval secs for the whole cluster.*
> In this JIRA, I propose a fix to reset missed-opportunity only if we have >0 
> pending resource for the non-exclusive partition OR we get an allocation from 
> the default partition.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Resolved] (YARN-5374) Preemption causing communication loop

2016-07-13 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan resolved YARN-5374.
--
Resolution: Invalid

Closing as invalid.

> Preemption causing communication loop
> -
>
> Key: YARN-5374
> URL: https://issues.apache.org/jira/browse/YARN-5374
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler, nodemanager, resourcemanager, yarn
>Affects Versions: 2.7.1
> Environment: Yarn version: Hadoop 2.7.1-amzn-0
> AWS EMR Cluster running:
> 1 x r3.8xlarge (Master)
> 52 x r3.8xlarge (Core)
> Spark version : 1.6.0
> Scala version: 2.10.5
> Java version: 1.8.0_51
> Input size: ~10 tb
> Input coming from S3
> Queue Configuration:
> Dynamic allocation: enabled
> Preemption: enabled
> Q1: 70% capacity with max of 100%
> Q2: 30% capacity with max of 100%
> Job Configuration:
> Driver memory = 10g
> Executor cores = 6
> Executor memory = 10g
> Deploy mode = cluster
> Master = yarn
> maxResultSize = 4g
> Shuffle manager = hash
>Reporter: Lucas Winkelmann
>Priority: Blocker
>
> Here is the scenario:
> I launch job 1 into Q1 and allow it to grow to 100% cluster utilization.
> I wait between 15-30 mins ( for this job to complete with 100% of the cluster 
> available takes about 1hr so job 1 is between 25-50% complete). Note that if 
> I wait less time then the issue sometimes does not occur, it appears to be 
> only after the job 1 is at least 25% complete.
> I launch job 2 into Q2 and preemption occurs on the Q1 shrinking the job to 
> allow 70% of cluster utilization.
> At this point job 1 basically halts progress while job 2 continues to execute 
> as normal and finishes. Job 1 either:
> - Fails its attempt and restarts. By the time this attempt fails the other 
> job is already complete meaning the second attempt has full cluster 
> availability and finishes.
> - The job remains at its current progress and simply does not finish ( I have 
> waited ~6 hrs until finally killing the application ).
>  
> Looking into the error log there is this constant error message:
> WARN NettyRpcEndpointRef: Error sending message [message = 
> RemoveExecutor(454,Container container_1468422920649_0001_01_000594 on host: 
> ip-NUMBERS.ec2.internal was preempted.)] in X attempts
>  
> My observations have led me to believe that the application master does not 
> know about this container being killed and continuously asks the container to 
> remove the executor until it eventually fails the attempt or keeps trying 
> to remove the executor.
>  
> I have done much digging online for anyone else experiencing this issue but 
> have come up with nothing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5374) Preemption causing communication loop

2016-07-13 Thread Lucas Winkelmann (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15375761#comment-15375761
 ] 

Lucas Winkelmann commented on YARN-5374:


I will go ahead and file a Spark JIRA ticket now.

> Preemption causing communication loop
> -
>
> Key: YARN-5374
> URL: https://issues.apache.org/jira/browse/YARN-5374
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler, nodemanager, resourcemanager, yarn
>Affects Versions: 2.7.1
> Environment: Yarn version: Hadoop 2.7.1-amzn-0
> AWS EMR Cluster running:
> 1 x r3.8xlarge (Master)
> 52 x r3.8xlarge (Core)
> Spark version : 1.6.0
> Scala version: 2.10.5
> Java version: 1.8.0_51
> Input size: ~10 tb
> Input coming from S3
> Queue Configuration:
> Dynamic allocation: enabled
> Preemption: enabled
> Q1: 70% capacity with max of 100%
> Q2: 30% capacity with max of 100%
> Job Configuration:
> Driver memory = 10g
> Executor cores = 6
> Executor memory = 10g
> Deploy mode = cluster
> Master = yarn
> maxResultSize = 4g
> Shuffle manager = hash
>Reporter: Lucas Winkelmann
>Priority: Blocker
>
> Here is the scenario:
> I launch job 1 into Q1 and allow it to grow to 100% cluster utilization.
> I wait between 15-30 mins ( for this job to complete with 100% of the cluster 
> available takes about 1hr so job 1 is between 25-50% complete). Note that if 
> I wait less time then the issue sometimes does not occur, it appears to be 
> only after the job 1 is at least 25% complete.
> I launch job 2 into Q2 and preemption occurs on the Q1 shrinking the job to 
> allow 70% of cluster utilization.
> At this point job 1 basically halts progress while job 2 continues to execute 
> as normal and finishes. Job 1 either:
> - Fails its attempt and restarts. By the time this attempt fails the other 
> job is already complete meaning the second attempt has full cluster 
> availability and finishes.
> - The job remains at its current progress and simply does not finish ( I have 
> waited ~6 hrs until finally killing the application ).
>  
> Looking into the error log there is this constant error message:
> WARN NettyRpcEndpointRef: Error sending message [message = 
> RemoveExecutor(454,Container container_1468422920649_0001_01_000594 on host: 
> ip-NUMBERS.ec2.internal was preempted.)] in X attempts
>  
> My observations have led me to believe that the application master does not 
> know about this container being killed and continuously asks the container to 
> remove the executor until it eventually fails the attempt or keeps trying 
> to remove the executor.
>  
> I have done much digging online for anyone else experiencing this issue but 
> have come up with nothing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5374) Preemption causing communication loop

2016-07-13 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15375759#comment-15375759
 ] 

Wangda Tan commented on YARN-5374:
--

[~LucasW], it seems to me that the issue is caused by the Spark application not 
handling the container preemption message well. If so, I suggest you drop a mail 
to the Spark mailing list or file a Spark JIRA instead.

> Preemption causing communication loop
> -
>
> Key: YARN-5374
> URL: https://issues.apache.org/jira/browse/YARN-5374
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler, nodemanager, resourcemanager, yarn
>Affects Versions: 2.7.1
> Environment: Yarn version: Hadoop 2.7.1-amzn-0
> AWS EMR Cluster running:
> 1 x r3.8xlarge (Master)
> 52 x r3.8xlarge (Core)
> Spark version : 1.6.0
> Scala version: 2.10.5
> Java version: 1.8.0_51
> Input size: ~10 tb
> Input coming from S3
> Queue Configuration:
> Dynamic allocation: enabled
> Preemption: enabled
> Q1: 70% capacity with max of 100%
> Q2: 30% capacity with max of 100%
> Job Configuration:
> Driver memory = 10g
> Executor cores = 6
> Executor memory = 10g
> Deploy mode = cluster
> Master = yarn
> maxResultSize = 4g
> Shuffle manager = hash
>Reporter: Lucas Winkelmann
>Priority: Blocker
>
> Here is the scenario:
> I launch job 1 into Q1 and allow it to grow to 100% cluster utilization.
> I wait between 15-30 mins (completing this job with 100% of the cluster 
> available takes about 1 hr, so job 1 is between 25-50% complete). Note that if 
> I wait less time, the issue sometimes does not occur; it appears to happen 
> only after job 1 is at least 25% complete.
> I launch job 2 into Q2 and preemption occurs on Q1, shrinking job 1 back to 
> 70% of cluster utilization.
> At this point job 1 basically halts progress while job 2 continues to execute 
> as normal and finishes. Job 1 either:
> - Fails its attempt and restarts. By the time this attempt fails, the other 
> job is already complete, meaning the second attempt has full cluster 
> availability and finishes.
> - Remains at its current progress and simply does not finish (I have waited 
> ~6 hrs before finally killing the application).
>  
> Looking into the error log, there is this constant error message:
> WARN NettyRpcEndpointRef: Error sending message [message = 
> RemoveExecutor(454,Container container_1468422920649_0001_01_000594 on host: 
> ip-NUMBERS.ec2.internal was preempted.)] in X attempts
>  
> My observations have led me to believe that the application master does not 
> know about this container being killed and continuously asks the container to 
> remove the executor, until it eventually fails the attempt or simply keeps 
> trying to remove the executor.
>  
> I have done much digging online for anyone else experiencing this issue but 
> have come up with nothing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5364) timelineservice modules have indirect dependencies on mapreduce artifacts

2016-07-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15375711#comment-15375711
 ] 

Hudson commented on YARN-5364:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #10092 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/10092/])
YARN-5364. timelineservice modules have indirect dependencies on 
(naganarasimha_gr: rev af8f480c2482b40e9f5a2d29fb5bc7069979fa2e)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/pom.xml
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/pom.xml


> timelineservice modules have indirect dependencies on mapreduce artifacts
> -
>
> Key: YARN-5364
> URL: https://issues.apache.org/jira/browse/YARN-5364
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Affects Versions: 3.0.0-alpha1
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
>Priority: Minor
> Fix For: 3.0.0-alpha1
>
> Attachments: YARN-5364.01.patch
>
>
> The new timelineservice and timelineservice-hbase-tests modules have indirect 
> dependencies to mapreduce artifacts through HBase and phoenix. Although it's 
> not causing builds to fail, it's not good hygiene.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-5374) Preemption causing communication loop

2016-07-13 Thread Lucas Winkelmann (JIRA)
Lucas Winkelmann created YARN-5374:
--

 Summary: Preemption causing communication loop
 Key: YARN-5374
 URL: https://issues.apache.org/jira/browse/YARN-5374
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler, nodemanager, resourcemanager, yarn
Affects Versions: 2.7.1
 Environment: Yarn version: Hadoop 2.7.1-amzn-0

AWS EMR Cluster running:
1 x r3.8xlarge (Master)
52 x r3.8xlarge (Core)

Spark version : 1.6.0
Scala version: 2.10.5
Java version: 1.8.0_51

Input size: ~10 tb
Input coming from S3

Queue Configuration:
Dynamic allocation: enabled
Preemption: enabled
Q1: 70% capacity with max of 100%
Q2: 30% capacity with max of 100%

Job Configuration:
Driver memory = 10g
Executor cores = 6
Executor memory = 10g
Deploy mode = cluster
Master = yarn
maxResultSize = 4g
Shuffle manager = hash
Reporter: Lucas Winkelmann
Priority: Blocker


Here is the scenario:
I launch job 1 into Q1 and allow it to grow to 100% cluster utilization.
I wait between 15-30 mins (completing this job with 100% of the cluster 
available takes about 1 hr, so job 1 is between 25-50% complete). Note that if I 
wait less time, the issue sometimes does not occur; it appears to happen only 
after job 1 is at least 25% complete.
I launch job 2 into Q2 and preemption occurs on Q1, shrinking job 1 back to 70% 
of cluster utilization.
At this point job 1 basically halts progress while job 2 continues to execute 
as normal and finishes. Job 1 either:
- Fails its attempt and restarts. By the time this attempt fails, the other job 
is already complete, meaning the second attempt has full cluster availability 
and finishes.
- Remains at its current progress and simply does not finish (I have waited 
~6 hrs before finally killing the application).
 
Looking into the error log, there is this constant error message:
WARN NettyRpcEndpointRef: Error sending message [message = 
RemoveExecutor(454,Container container_1468422920649_0001_01_000594 on host: 
ip-NUMBERS.ec2.internal was preempted.)] in X attempts
 
My observations have led me to believe that the application master does not 
know about this container being killed and continuously asks the container to 
remove the executor, until it eventually fails the attempt or simply keeps 
trying to remove the executor.
 
I have done much digging online for anyone else experiencing this issue but 
have come up with nothing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5373) NPE introduced by YARN-4958 (The file localization process should allow...)

2016-07-13 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15375683#comment-15375683
 ] 

Daniel Templeton commented on YARN-5373:


It looks like the issue only appears when running with a secure cluster.

> NPE introduced by YARN-4958 (The file localization process should allow...)
> ---
>
> Key: YARN-5373
> URL: https://issues.apache.org/jira/browse/YARN-5373
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.9.0
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>
> YARN-4958 added support for wildcards in file localization. It introduces a 
> NPE 
> at 
> {code:java}
> for (File wildLink : directory.listFiles()) {
> sb.symlink(new Path(wildLink.toString()), new Path(wildLink.getName()));
> }
> {code}
> When directory.listFiles returns null, the NPE will cause the container to 
> fail to launch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5364) timelineservice modules have indirect dependencies on mapreduce artifacts

2016-07-13 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15375648#comment-15375648
 ] 

Naganarasimha G R commented on YARN-5364:
-

Strangely, the dependency tree was also not showing it as a required jar earlier.

> timelineservice modules have indirect dependencies on mapreduce artifacts
> -
>
> Key: YARN-5364
> URL: https://issues.apache.org/jira/browse/YARN-5364
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Affects Versions: 3.0.0-alpha1
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
>Priority: Minor
> Attachments: YARN-5364.01.patch
>
>
> The new timelineservice and timelineservice-hbase-tests modules have indirect 
> dependencies to mapreduce artifacts through HBase and phoenix. Although it's 
> not causing builds to fail, it's not good hygiene.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5364) timelineservice modules have indirect dependencies on mapreduce artifacts

2016-07-13 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15375647#comment-15375647
 ] 

Naganarasimha G R commented on YARN-5364:
-

Not sure why it was failing earlier (with/without the patch). Once I changed the 
repo location, I was able to start running the test cases. Will go ahead and 
commit the patch.

> timelineservice modules have indirect dependencies on mapreduce artifacts
> -
>
> Key: YARN-5364
> URL: https://issues.apache.org/jira/browse/YARN-5364
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Affects Versions: 3.0.0-alpha1
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
>Priority: Minor
> Attachments: YARN-5364.01.patch
>
>
> The new timelineservice and timelineservice-hbase-tests modules have indirect 
> dependencies to mapreduce artifacts through HBase and phoenix. Although it's 
> not causing builds to fail, it's not good hygiene.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5364) timelineservice modules have indirect dependencies on mapreduce artifacts

2016-07-13 Thread Naganarasimha G R (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naganarasimha G R updated YARN-5364:

Attachment: (was: screenshot-1.png)

> timelineservice modules have indirect dependencies on mapreduce artifacts
> -
>
> Key: YARN-5364
> URL: https://issues.apache.org/jira/browse/YARN-5364
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Affects Versions: 3.0.0-alpha1
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
>Priority: Minor
> Attachments: YARN-5364.01.patch
>
>
> The new timelineservice and timelineservice-hbase-tests modules have indirect 
> dependencies to mapreduce artifacts through HBase and phoenix. Although it's 
> not causing builds to fail, it's not good hygiene.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5373) NPE introduced by YARN-4958 (The file localization process should allow...)

2016-07-13 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-5373:
-
Description: 
YARN-4958 added support for wildcards in file localization. It introduces a NPE 
at 
{code:java}
for (File wildLink : directory.listFiles()) {
sb.symlink(new Path(wildLink.toString()), new Path(wildLink.getName()));
}
{code}
When directory.listFiles returns null, the NPE will cause the container to fail 
to launch.

  was:
YARN-4958 added support for wildcards in file localization. It introduces a NPE 
at 
{code:java}
  for (File wildLink : directory.listFiles()) {
  sb.symlink(new Path(wildLink.toString()), new 
Path(wildLink.getName()));
  }
{code}
When directory.listFiles returns null, the NPE will cause the container to fail 
to launch.


> NPE introduced by YARN-4958 (The file localization process should allow...)
> ---
>
> Key: YARN-5373
> URL: https://issues.apache.org/jira/browse/YARN-5373
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.9.0
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>
> YARN-4958 added support for wildcards in file localization. It introduces a 
> NPE 
> at 
> {code:java}
> for (File wildLink : directory.listFiles()) {
> sb.symlink(new Path(wildLink.toString()), new Path(wildLink.getName()));
> }
> {code}
> When directory.listFiles returns null, the NPE will cause the container to 
> fail to launch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5373) NPE introduced by YARN-4958 (The file localization process should allow...)

2016-07-13 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-5373:
-
Description: 
YARN-4958 added support for wildcards in file localization. It introduces a NPE 
at 
{code:java}
  for (File wildLink : directory.listFiles()) {
  sb.symlink(new Path(wildLink.toString()),
  new Path(wildLink.getName()));
}
{code}
When directory.listFiles returns null, the NPE will cause the container to fail 
to launch.

  was:
YARN-4958 added support for wildcards in file localization. It introduces a NPE 
at 
{{code}}
  for (File wildLink : directory.listFiles()) {
  sb.symlink(new Path(wildLink.toString()),
  new Path(wildLink.getName()));
}
{{code}}
When directory.listFiles returns null, the NPE will cause the container to fail 
to launch.


> NPE introduced by YARN-4958 (The file localization process should allow...)
> ---
>
> Key: YARN-5373
> URL: https://issues.apache.org/jira/browse/YARN-5373
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.9.0
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>
> YARN-4958 added support for wildcards in file localization. It introduces a 
> NPE 
> at 
> {code:java}
>   for (File wildLink : directory.listFiles()) {
>   sb.symlink(new Path(wildLink.toString()),
>   new Path(wildLink.getName()));
> }
> {code}
> When directory.listFiles returns null, the NPE will cause the container to 
> fail to launch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5373) NPE introduced by YARN-4958 (The file localization process should allow...)

2016-07-13 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-5373:
-
Description: 
YARN-4958 added support for wildcards in file localization. It introduces a NPE 
at 
{code:java}
  for (File wildLink : directory.listFiles()) {
  sb.symlink(new Path(wildLink.toString()),
  new Path(wildLink.getName()));
  }
{code}
When directory.listFiles returns null, the NPE will cause the container to fail 
to launch.

  was:
YARN-4958 added support for wildcards in file localization. It introduces a NPE 
at 
{code:java}
  for (File wildLink : directory.listFiles()) {
  sb.symlink(new Path(wildLink.toString()),
  new Path(wildLink.getName()));
}
{code}
When directory.listFiles returns null, the NPE will cause the container to fail 
to launch.


> NPE introduced by YARN-4958 (The file localization process should allow...)
> ---
>
> Key: YARN-5373
> URL: https://issues.apache.org/jira/browse/YARN-5373
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.9.0
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>
> YARN-4958 added support for wildcards in file localization. It introduces a 
> NPE 
> at 
> {code:java}
>   for (File wildLink : directory.listFiles()) {
>   sb.symlink(new Path(wildLink.toString()),
>   new Path(wildLink.getName()));
>   }
> {code}
> When directory.listFiles returns null, the NPE will cause the container to 
> fail to launch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5373) NPE introduced by YARN-4958 (The file localization process should allow...)

2016-07-13 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-5373:
-
Description: 
YARN-4958 added support for wildcards in file localization. It introduces a NPE 
at 
{code:java}
  for (File wildLink : directory.listFiles()) {
  sb.symlink(new Path(wildLink.toString()), new 
Path(wildLink.getName()));
  }
{code}
When directory.listFiles returns null, the NPE will cause the container to fail 
to launch.

  was:
YARN-4958 added support for wildcards in file localization. It introduces a NPE 
at 
{code:java}
  for (File wildLink : directory.listFiles()) {
  sb.symlink(new Path(wildLink.toString()),
  new Path(wildLink.getName()));
  }
{code}
When directory.listFiles returns null, the NPE will cause the container to fail 
to launch.


> NPE introduced by YARN-4958 (The file localization process should allow...)
> ---
>
> Key: YARN-5373
> URL: https://issues.apache.org/jira/browse/YARN-5373
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.9.0
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>
> YARN-4958 added support for wildcards in file localization. It introduces a 
> NPE 
> at 
> {code:java}
>   for (File wildLink : directory.listFiles()) {
>   sb.symlink(new Path(wildLink.toString()), new 
> Path(wildLink.getName()));
>   }
> {code}
> When directory.listFiles returns null, the NPE will cause the container to 
> fail to launch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-5373) NPE introduced by YARN-4958 (The file localization process should allow...)

2016-07-13 Thread Haibo Chen (JIRA)
Haibo Chen created YARN-5373:


 Summary: NPE introduced by YARN-4958 (The file localization 
process should allow...)
 Key: YARN-5373
 URL: https://issues.apache.org/jira/browse/YARN-5373
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.9.0
Reporter: Haibo Chen
Assignee: Haibo Chen


YARN-4958 added support for wildcards in file localization. It introduces a NPE 
at 
{{code}}
  for (File wildLink : directory.listFiles()) {
  sb.symlink(new Path(wildLink.toString()),
  new Path(wildLink.getName()));
}
{{code}}
When directory.listFiles returns null, the NPE will cause the container to fail 
to launch.
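
A defensive variant of that loop, sketched below, would turn the failure into an 
explicit error instead of an NPE. This is only an illustration: the method name 
is made up, and the ShellScriptBuilder parameter type is assumed from the 
{{sb.symlink}} call in the snippet above, not taken from the committed fix.

{code:java}
import java.io.File;
import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.ShellScriptBuilder;

// Sketch only: same loop as in the description, but guarded against
// File.listFiles() returning null (for example on an I/O error, or when the
// path is not a directory), so the launch fails with a clear message.
final class WildcardSymlinkSketch {
  static void symlinkWildcardContents(File directory, ShellScriptBuilder sb)
      throws IOException {
    File[] wildLinks = directory.listFiles();
    if (wildLinks == null) {
      throw new IOException("Could not list the contents of " + directory
          + " while expanding a wildcard resource");
    }
    for (File wildLink : wildLinks) {
      sb.symlink(new Path(wildLink.toString()), new Path(wildLink.getName()));
    }
  }
}
{code}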



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5303) Clean up ContainerExecutor JavaDoc

2016-07-13 Thread Varun Vasudev (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15375544#comment-15375544
 ] 

Varun Vasudev commented on YARN-5303:
-

Thanks for the patch [~templedf]! +1. I'll commit this tomorrow if no one 
objects.

> Clean up ContainerExecutor JavaDoc
> --
>
> Key: YARN-5303
> URL: https://issues.apache.org/jira/browse/YARN-5303
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 2.9.0
>Reporter: Daniel Templeton
>Assignee: Daniel Templeton
>Priority: Minor
> Attachments: YARN-5303.001.patch
>
>
> The {{ContainerExecutor}} class needs a lot of JavaDoc cleanup and could use 
> some other TLC as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5007) MiniYarnCluster contains deprecated constructor which is called by the other constructors

2016-07-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15375528#comment-15375528
 ] 

Hadoop QA commented on YARN-5007:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 32s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 5 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 16s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
51s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 54s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
25s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 20s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
52s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
18s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 41s 
{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 15s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
58s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 52s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 52s 
{color} | {color:green} root generated 0 new + 706 unchanged - 4 fixed = 706 
total (was 710) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 26s 
{color} | {color:red} root: The patch generated 1 new + 61 unchanged - 2 fixed 
= 62 total (was 63) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 20s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
51s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
40s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 41s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 4m 27s {color} 
| {color:red} hadoop-yarn-server-tests in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 8m 22s {color} 
| {color:red} hadoop-yarn-client in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 113m 53s 
{color} | {color:red} hadoop-mapreduce-client-jobclient in the patch failed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
31s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 162m 26s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.TestContainerManagerSecurity |
|   | hadoop.yarn.server.TestMiniYarnClusterNodeUtilization |
|   | hadoop.yarn.client.api.impl.TestYarnClient |
|   | hadoop.yarn.client.cli.TestLogsCLI |
|   | hadoop.mapred.TestMRCJCFileOutputCommitter |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12817705/YARN-5007.02.patch |
| JIRA Issue | YARN-5007 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux e5fa8444e734 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 5614217 |
| Default Java | 1.8.0_91 |
| findbugs | v3.0.0

[jira] [Commented] (YARN-5339) passing file to -out for YARN log CLI doesnt give warning or error code

2016-07-13 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15375522#comment-15375522
 ] 

Xuan Gong commented on YARN-5339:
-

The test case failures and the checkstyle issue are not related to this patch.

> passing file to -out for YARN log CLI doesnt give warning or error code
> ---
>
> Key: YARN-5339
> URL: https://issues.apache.org/jira/browse/YARN-5339
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Sumana Sathish
>Assignee: Xuan Gong
> Attachments: YARN-5339.1.patch, YARN-5339.2.patch
>
>
> passing file to -out for YARN log CLI doesnt give warning or error code
> {code}
> yarn  logs -applicationId application_1467117709224_0003 -out 
> /grid/0/hadoopqe/artifacts/file.txt
> {code}
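
The kind of check being asked for could look like the sketch below. It is 
illustrative only, not the actual LogsCLI change; the class and method names are 
made up.

{code:java}
import java.io.File;

// Sketch: validate the -out argument up front and return a non-zero exit code
// when it points at an existing regular file instead of a directory.
public final class OutDirCheckSketch {

  static int checkOutDir(String outPath) {
    File out = new File(outPath);
    if (out.exists() && !out.isDirectory()) {
      System.err.println("Invalid value for -out option: " + outPath
          + " already exists and is not a directory.");
      return -1;
    }
    return 0;
  }

  public static void main(String[] args) {
    System.exit(args.length == 1 ? checkOutDir(args[0]) : -1);
  }
}
{code}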



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5339) passing file to -out for YARN log CLI doesnt give warning or error code

2016-07-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15375503#comment-15375503
 ] 

Hadoop QA commented on YARN-5339:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 23s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 
8s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 22s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
16s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 27s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
18s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
41s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
21s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 21s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 21s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 16s 
{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client: The 
patch generated 1 new + 87 unchanged - 1 fixed = 88 total (was 88) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 31s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
17s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
45s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 14s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 9m 2s {color} | 
{color:red} hadoop-yarn-client in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
18s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 23m 45s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.client.cli.TestLogsCLI |
|   | hadoop.yarn.client.api.impl.TestAMRMProxy |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12817574/YARN-5339.2.patch |
| JIRA Issue | YARN-5339 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux ab084b82c3e8 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / eb47163 |
| Default Java | 1.8.0_91 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/12314/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/12314/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client.txt
 |
| unit test logs |  
https://builds.apache.org/job/PreCommit-YARN-Build/12314/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/12314/testReport/ |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/1

[jira] [Commented] (YARN-5363) For AM containers, or for containers of running-apps, "yarn logs" incorrectly only (tries to) shows syslog file-type by default

2016-07-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15375496#comment-15375496
 ] 

Hadoop QA commented on YARN-5363:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 35s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:blue}0{color} | {color:blue} patch {color} | {color:blue} 0m 0s 
{color} | {color:blue} The patch file was not named according to hadoop's 
naming conventions. Please see https://wiki.apache.org/hadoop/HowToContribute 
for instructions. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
20s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 22s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
15s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 27s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
15s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
30s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 16s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
21s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 17s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 17s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 13s 
{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client: The 
patch generated 7 new + 80 unchanged - 8 fixed = 87 total (was 88) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 23s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
35s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 12s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 8m 31s {color} 
| {color:red} hadoop-yarn-client in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
15s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 21m 37s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.client.api.impl.TestYarnClient |
|   | hadoop.yarn.client.cli.TestLogsCLI |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12817767/YARN-5363-2016-07-13.txt
 |
| JIRA Issue | YARN-5363 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 2c97dfcde450 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / eb47163 |
| Default Java | 1.8.0_91 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/12313/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/12313/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client.txt
 |
| unit test logs |  
https://builds.apache.org/job/PreCommit-YARN-Build/12313/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client.txt
 |
|  Test Results | 

[jira] [Commented] (YARN-5298) Mount usercache and NM filecache directories into Docker container

2016-07-13 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15375488#comment-15375488
 ] 

Daniel Templeton commented on YARN-5298:


Looks good to me as well.

> Mount usercache and NM filecache directories into Docker container
> --
>
> Key: YARN-5298
> URL: https://issues.apache.org/jira/browse/YARN-5298
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Varun Vasudev
>Assignee: Sidharta Seethana
> Attachments: YARN-5298.001.patch, YARN-5298.002.patch
>
>
> Currently, we don't mount the usercache and the NM filecache directories into 
> the Docker container. This can lead to issues with containers that rely on 
> public and application scope resources.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5200) Improve yarn logs to get Container List

2016-07-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15375487#comment-15375487
 ] 

Hudson commented on YARN-5200:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #10091 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/10091/])
YARN-5200. Enhanced "yarn logs" to be able to get a list of containers 
(vinodkv: rev eb471632349deac4b62f8dec853c8ceb64c9617a)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/LogsCLI.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/LogCLIHelpers.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/cli/TestLogsCLI.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/AggregatedLogFormat.java


> Improve yarn logs to get Container List
> ---
>
> Key: YARN-5200
> URL: https://issues.apache.org/jira/browse/YARN-5200
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Fix For: 2.9.0
>
> Attachments: YARN-5200.1.patch, YARN-5200.10.patch, 
> YARN-5200.11.patch, YARN-5200.12.patch, YARN-5200.2.patch, YARN-5200.3.patch, 
> YARN-5200.4.patch, YARN-5200.5.patch, YARN-5200.6.patch, YARN-5200.7.patch, 
> YARN-5200.8.patch, YARN-5200.9.patch, YARN-5200.9.rebase.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5298) Mount usercache and NM filecache directories into Docker container

2016-07-13 Thread Varun Vasudev (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15375481#comment-15375481
 ] 

Varun Vasudev commented on YARN-5298:
-

Forgot to mention - the question should not hold up the patch. +1 for the 
patch. I'll commit it tomorrow if no one objects.

> Mount usercache and NM filecache directories into Docker container
> --
>
> Key: YARN-5298
> URL: https://issues.apache.org/jira/browse/YARN-5298
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Varun Vasudev
>Assignee: Sidharta Seethana
> Attachments: YARN-5298.001.patch, YARN-5298.002.patch
>
>
> Currently, we don't mount the usercache and the NM filecache directories into 
> the Docker container. This can lead to issues with containers that rely on 
> public and application scope resources.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5363) For AM containers, or for containers of running-apps, "yarn logs" incorrectly only (tries to) shows syslog file-type by default

2016-07-13 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-5363:
--
Attachment: YARN-5363-2016-07-13.txt

Updated patch against the latest trunk.

> For AM containers, or for containers of running-apps, "yarn logs" incorrectly 
> only (tries to) shows syslog file-type by default
> ---
>
> Key: YARN-5363
> URL: https://issues.apache.org/jira/browse/YARN-5363
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Vinod Kumar Vavilapalli
> Attachments: YARN-5363-2016-07-12.txt, YARN-5363-2016-07-13.txt
>
>
> For example, for a running application, the following happens:
> {code}
> # yarn logs -applicationId application_1467838922593_0001
> 16/07/06 22:07:05 INFO impl.TimelineClientImpl: Timeline service address: 
> http://:8188/ws/v1/timeline/
> 16/07/06 22:07:06 INFO client.RMProxy: Connecting to ResourceManager at 
> /:8050
> 16/07/06 22:07:07 INFO impl.TimelineClientImpl: Timeline service address: 
> http://l:8188/ws/v1/timeline/
> 16/07/06 22:07:07 INFO client.RMProxy: Connecting to ResourceManager at 
> /:8050
> Can not find any log file matching the pattern: [syslog] for the container: 
> container_e03_1467838922593_0001_01_01 within the application: 
> application_1467838922593_0001
> Can not find any log file matching the pattern: [syslog] for the container: 
> container_e03_1467838922593_0001_01_02 within the application: 
> application_1467838922593_0001
> Can not find any log file matching the pattern: [syslog] for the container: 
> container_e03_1467838922593_0001_01_03 within the application: 
> application_1467838922593_0001
> Can not find any log file matching the pattern: [syslog] for the container: 
> container_e03_1467838922593_0001_01_04 within the application: 
> application_1467838922593_0001
> Can not find any log file matching the pattern: [syslog] for the container: 
> container_e03_1467838922593_0001_01_05 within the application: 
> application_1467838922593_0001
> Can not find any log file matching the pattern: [syslog] for the container: 
> container_e03_1467838922593_0001_01_06 within the application: 
> application_1467838922593_0001
> Can not find any log file matching the pattern: [syslog] for the container: 
> container_e03_1467838922593_0001_01_07 within the application: 
> application_1467838922593_0001
> Can not find the logs for the application: application_1467838922593_0001 
> with the appOwner: 
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4759) Revisit signalContainer() for docker containers

2016-07-13 Thread Varun Vasudev (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15375435#comment-15375435
 ] 

Varun Vasudev commented on YARN-4759:
-

Thanks for the patch [~shaneku...@gmail.com]. Patch looks mostly good. One 
minor change -
{code}
+  // always change back
+  if (change_effective_user(user, group) != 0) {
+return -1;
+  }
{code}
Can you please log an error message?

> Revisit signalContainer() for docker containers
> ---
>
> Key: YARN-4759
> URL: https://issues.apache.org/jira/browse/YARN-4759
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Sidharta Seethana
>Assignee: Shane Kumpf
> Attachments: YARN-4759.001.patch, YARN-4759.002.patch
>
>
> The current signal handling (in the DockerContainerRuntime) needs to be 
> revisited for docker containers. For example, container reacquisition on NM 
> restart might not work, depending on which user the process in the container 
> runs as. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5200) Improve yarn logs to get Container List

2016-07-13 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15375411#comment-15375411
 ] 

Vinod Kumar Vavilapalli commented on YARN-5200:
---

I'll dig up the test-case tickets.

The latest patch looks good to me. +1, checking this in.

> Improve yarn logs to get Container List
> ---
>
> Key: YARN-5200
> URL: https://issues.apache.org/jira/browse/YARN-5200
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-5200.1.patch, YARN-5200.10.patch, 
> YARN-5200.11.patch, YARN-5200.12.patch, YARN-5200.2.patch, YARN-5200.3.patch, 
> YARN-5200.4.patch, YARN-5200.5.patch, YARN-5200.6.patch, YARN-5200.7.patch, 
> YARN-5200.8.patch, YARN-5200.9.patch, YARN-5200.9.rebase.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5370) Setting yarn.nodemanager.delete.debug-delay-sec to high number crashes NM because of OOM

2016-07-13 Thread Manikandan R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15375339#comment-15375339
 ] 

Manikandan R commented on YARN-5370:


To solve this issue, we tried setting yarn.nodemanager.delete.debug-delay-sec to 
a very low value (zero seconds), assuming that it might clear off the existing 
scheduled deletion tasks. It didn't: the new value is not applied to tasks that 
have already been scheduled. We then came to know that the canRecover() method 
is called at service start; it pulls the info from the NM recovery directory (on 
the local filesystem) and rebuilds all of it in memory, which in turn causes 
problems starting the services and consumes a large amount of memory. We then 
moved the contents of the NM recovery directory to some other place, and from 
that point onwards the NM was able to start smoothly and works as expected. I 
think showing a warning about such a high value (for example, 100+ days) 
somewhere such as the logs, indicating that it can cause a potential crash, 
could save a significant amount of troubleshooting time.
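
For example, a sanity check along the lines of the sketch below, run at NM 
start, would surface the misconfiguration early. The threshold, class name, and 
message are illustrative only; this is not an existing NodeManager feature.

{code:java}
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

// Sketch: warn when the deletion debug delay is so large that scheduled (and
// recovered) deletion tasks can pile up in the NodeManager heap.
public final class DeletionDelaySanityCheck {
  private static final Log LOG =
      LogFactory.getLog(DeletionDelaySanityCheck.class);

  // Illustrative threshold: one week, in seconds.
  private static final int WARN_THRESHOLD_SEC = 7 * 24 * 60 * 60;

  public static void warnIfDelayTooHigh(Configuration conf) {
    int delaySec = conf.getInt(YarnConfiguration.DEBUG_NM_DELETE_DELAY_SEC, 0);
    if (delaySec > WARN_THRESHOLD_SEC) {
      LOG.warn(YarnConfiguration.DEBUG_NM_DELETE_DELAY_SEC + " is set to "
          + delaySec + " seconds; deletion tasks are kept (and recovered into"
          + " memory on NM restart) for that long, which can exhaust the"
          + " NodeManager heap.");
    }
  }
}
{code}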

> Setting yarn.nodemanager.delete.debug-delay-sec to high number crashes NM 
> because of OOM
> 
>
> Key: YARN-5370
> URL: https://issues.apache.org/jira/browse/YARN-5370
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Manikandan R
>
> I set yarn.nodemanager.delete.debug-delay-sec to 100+ days in my dev cluster 
> for some reason. This was done 3-4 weeks ago. After setting this up, the NM 
> crashes at times because of OOM. So, as a temporary fix, I kept increasing the 
> heap from 512 MB to 6 GB gradually over the past few weeks as the crashes 
> occurred. Sometimes it won't start smoothly, and only after multiple tries 
> does it start functioning. While analyzing the heap dump of the corresponding 
> JVM, I came to know that DeletionService is occupying almost 99% of the total 
> allocated memory (-Xmx), something like this:
> org.apache.hadoop.yarn.server.nodemanager.DeletionService$DelServiceSchedThreadPoolExecutor
>  @ 0x6c1d09068| 80 | 3,544,094,696 | 99.13%
> Basically, there is a huge number of the above-mentioned tasks scheduled for 
> deletion. Usually I see NM memory requirements of 2-4 GB for large clusters; 
> in my case the cluster is very small and OOM still occurs.
> Is this expected behaviour? Or is there any limit we can expose on 
> yarn.nodemanager.delete.debug-delay-sec to avoid these kinds of issues?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5364) timelineservice modules have indirect dependencies on mapreduce artifacts

2016-07-13 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15375332#comment-15375332
 ] 

Varun Saxena commented on YARN-5364:


Passes for me.
I tried changing the repository path (so that all jars are downloaded) and even 
then it works.

Probably, [~naganarasimha...@apache.org], the repository from which the jar was 
to be downloaded may have been down at that time.

> timelineservice modules have indirect dependencies on mapreduce artifacts
> -
>
> Key: YARN-5364
> URL: https://issues.apache.org/jira/browse/YARN-5364
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Affects Versions: 3.0.0-alpha1
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
>Priority: Minor
> Attachments: YARN-5364.01.patch, screenshot-1.png
>
>
> The new timelineservice and timelineservice-hbase-tests modules have indirect 
> dependencies to mapreduce artifacts through HBase and phoenix. Although it's 
> not causing builds to fail, it's not good hygiene.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-5364) timelineservice modules have indirect dependencies on mapreduce artifacts

2016-07-13 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15375332#comment-15375332
 ] 

Varun Saxena edited comment on YARN-5364 at 7/13/16 4:47 PM:
-

Passes for me.
I tried changing the repository path (so that all jars are downloaded again) 
and even then it works.

Probably, [~naganarasimha...@apache.org], the repository from which the jar was 
to be downloaded may have been down at that time.


was (Author: varun_saxena):
Passes for me.
I tried changing the repository path (so that all jars are downloaded) and even 
then it works.

Probably, [~naganarasimha...@apache.org], the repository from which the jar was 
to be downloaded may have been down at that time.

> timelineservice modules have indirect dependencies on mapreduce artifacts
> -
>
> Key: YARN-5364
> URL: https://issues.apache.org/jira/browse/YARN-5364
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Affects Versions: 3.0.0-alpha1
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
>Priority: Minor
> Attachments: YARN-5364.01.patch, screenshot-1.png
>
>
> The new timelineservice and timelineservice-hbase-tests modules have indirect 
> dependencies to mapreduce artifacts through HBase and phoenix. Although it's 
> not causing builds to fail, it's not good hygiene.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org


