[jira] [Commented] (YARN-4024) YARN RM should avoid unnecessary resolving IP when NMs doing heartbeat

2015-08-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701028#comment-14701028
 ] 

Hadoop QA commented on YARN-4024:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  18m 23s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   8m  0s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 55s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 38s | The applied patch generated  1 
new checkstyle issues (total was 211, now 211). |
| {color:red}-1{color} | whitespace |   0m  1s | The patch has 2  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 26s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   3m  9s | The patch appears to introduce 1 
new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | yarn tests |   0m 21s | Tests failed in 
hadoop-yarn-api. |
| {color:red}-1{color} | yarn tests |  53m 34s | Tests failed in 
hadoop-yarn-server-resourcemanager. |
| | |  97m 42s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-yarn-server-resourcemanager |
| Failed unit tests | hadoop.yarn.conf.TestYarnConfigurationFields |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerAllocation |
|   | hadoop.yarn.server.resourcemanager.rmapp.TestNodesListManager |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12750953/YARN-4024-draft-v2.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 71566e2 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/8872/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-YARN-Build/8872/artifact/patchprocess/whitespace.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-YARN-Build/8872/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
 |
| hadoop-yarn-api test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8872/artifact/patchprocess/testrun_hadoop-yarn-api.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8872/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8872/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8872/console |


This message was automatically generated.

 YARN RM should avoid unnecessary resolving IP when NMs doing heartbeat
 --

 Key: YARN-4024
 URL: https://issues.apache.org/jira/browse/YARN-4024
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Wangda Tan
Assignee: Hong Zhiguo
 Attachments: YARN-4024-draft-v2.patch, YARN-4024-draft.patch


 Currently, the YARN RM NodesListManager resolves the IP address every time a 
 node heartbeats. When the DNS server becomes slow, NM heartbeats are blocked 
 and cannot make progress.
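
A minimal sketch of the caching idea this issue points toward, assuming a 
hypothetical CachedResolver with a fixed TTL (none of these names are the 
patch's actual API):

{code}
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.concurrent.ConcurrentHashMap;

/** Illustration only: resolve a hostname once and reuse the result for a
 *  bounded interval, so slow DNS cannot stall every NM heartbeat. */
class CachedResolver {
  private static final long EXPIRY_MS = 60 * 60 * 1000L; // assumed 1h TTL

  private static class Entry {
    final String ip;
    final long resolvedAt;
    Entry(String ip, long resolvedAt) {
      this.ip = ip;
      this.resolvedAt = resolvedAt;
    }
  }

  private final ConcurrentHashMap<String, Entry> cache =
      new ConcurrentHashMap<String, Entry>();

  String resolve(String hostName) throws UnknownHostException {
    long now = System.currentTimeMillis();
    Entry e = cache.get(hostName);
    if (e != null && now - e.resolvedAt < EXPIRY_MS) {
      return e.ip; // cache hit: no DNS round trip on the heartbeat path
    }
    String ip = InetAddress.getByName(hostName).getHostAddress();
    cache.put(hostName, new Entry(ip, now));
    return ip;
  }

  /** Drop a stale mapping, e.g. when a node re-registers. */
  void removeFromCache(String hostName) {
    cache.remove(hostName);
  }
}
{code}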



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2923) Support configuration based NodeLabelsProvider Service in Distributed Node Label Configuration Setup

2015-08-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700781#comment-14700781
 ] 

Hadoop QA commented on YARN-2923:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  18m  8s | Findbugs (version ) appears to 
be broken on trunk. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 3 new or modified test files. |
| {color:green}+1{color} | javac |   7m 42s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 39s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 39s | The applied patch generated  1 
new checkstyle issues (total was 211, now 211). |
| {color:green}+1{color} | whitespace |   0m  9s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 22s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 32s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   4m 22s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   0m 23s | Tests passed in 
hadoop-yarn-api. |
| {color:red}-1{color} | yarn tests |   1m 56s | Tests failed in 
hadoop-yarn-common. |
| {color:green}+1{color} | yarn tests |   6m 24s | Tests passed in 
hadoop-yarn-server-nodemanager. |
| | |  53m 23s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.yarn.util.TestRackResolver |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12750943/YARN-2923.20150818-1.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 71566e2 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/8871/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt
 |
| hadoop-yarn-api test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8871/artifact/patchprocess/testrun_hadoop-yarn-api.txt
 |
| hadoop-yarn-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8871/artifact/patchprocess/testrun_hadoop-yarn-common.txt
 |
| hadoop-yarn-server-nodemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8871/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8871/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8871/console |


This message was automatically generated.

 Support configuration based NodeLabelsProvider Service in Distributed Node 
 Label Configuration Setup 
 -

 Key: YARN-2923
 URL: https://issues.apache.org/jira/browse/YARN-2923
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager
Reporter: Naganarasimha G R
Assignee: Naganarasimha G R
 Fix For: 2.8.0

 Attachments: YARN-2923.20141204-1.patch, YARN-2923.20141210-1.patch, 
 YARN-2923.20150328-1.patch, YARN-2923.20150404-1.patch, 
 YARN-2923.20150517-1.patch, YARN-2923.20150817-1.patch, 
 YARN-2923.20150818-1.patch


 As part of the distributed node-label configuration, we need to support 
 configuring node labels in yarn-site.xml. When the node-label configuration 
 in yarn-site.xml is modified, the NM should be able to pick up the modified 
 node labels from this NodeLabelsProvider service without an NM restart.
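
A rough sketch of how such a provider could poll configuration, assuming a 
hypothetical property name and timer interval (illustration only, not the 
patch's design):

{code}
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.Timer;
import java.util.TimerTask;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

/** Illustration only: re-read a comma-separated label list from
 *  yarn-site.xml on a timer, so label changes need no NM restart. */
class ConfigPollingNodeLabelsProvider {
  // Hypothetical property name, for illustration only.
  static final String LABELS_KEY = "yarn.nodemanager.node-labels.from-config";

  private volatile Set<String> labels = new HashSet<String>();
  private final Timer timer = new Timer("node-labels-refresh", true);

  void start(long intervalMs) {
    timer.scheduleAtFixedRate(new TimerTask() {
      @Override
      public void run() {
        // A fresh YarnConfiguration re-reads yarn-site.xml, picking up edits.
        Configuration conf = new YarnConfiguration();
        String raw = conf.get(LABELS_KEY, "");
        labels = new HashSet<String>(Arrays.asList(raw.split(",")));
      }
    }, 0, intervalMs);
  }

  Set<String> getNodeLabels() {
    return labels;
  }
}
{code}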



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4024) YARN RM should avoid unnecessary resolving IP when NMs doing heartbeat

2015-08-18 Thread Hong Zhiguo (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hong Zhiguo updated YARN-4024:
--
Attachment: YARN-4024-draft-v3.patch

YARN-4024-draft-v3.patch: fixes the checkstyle warning and the test-case failure

 YARN RM should avoid unnecessary resolving IP when NMs doing heartbeat
 --

 Key: YARN-4024
 URL: https://issues.apache.org/jira/browse/YARN-4024
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Wangda Tan
Assignee: Hong Zhiguo
 Attachments: YARN-4024-draft-v2.patch, YARN-4024-draft-v3.patch, 
 YARN-4024-draft.patch


 Currently, the YARN RM NodesListManager resolves the IP address every time a 
 node heartbeats. When the DNS server becomes slow, NM heartbeats are blocked 
 and cannot make progress.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2262) Few fields displaying wrong values in Timeline server after RM restart

2015-08-18 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700845#comment-14700845
 ] 

Jeff Zhang commented on YARN-2262:
--

Please ignore my last comment; I finally found out how to use ATS for storing 
history data.

{code}
private ApplicationHistoryManager createApplicationHistoryManager(
    Configuration conf) {
  // Backward compatibility: if APPLICATION_HISTORY_STORE is neither null nor
  // empty, it means that the user has enabled it explicitly.
  if (conf.get(YarnConfiguration.APPLICATION_HISTORY_STORE) == null ||
      conf.get(YarnConfiguration.APPLICATION_HISTORY_STORE).length() == 0 ||
      conf.get(YarnConfiguration.APPLICATION_HISTORY_STORE).equals(
          NullApplicationHistoryStore.class.getName())) {
    return new ApplicationHistoryManagerOnTimelineStore(
        timelineDataManager, aclsManager);
  } else {
    LOG.warn("The filesystem based application history store is deprecated.");
    return new ApplicationHistoryManagerImpl();
  }
}
{code}
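
For what it's worth, a small illustration of the backward-compatibility branch 
above (sketch only; the class name HistoryStoreConfigExample is made up):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class HistoryStoreConfigExample {
  public static void main(String[] args) {
    Configuration conf = new YarnConfiguration();
    // Leaving APPLICATION_HISTORY_STORE unset (or set to
    // NullApplicationHistoryStore) selects the timeline-backed
    // ApplicationHistoryManagerOnTimelineStore in the code above;
    // any other store class falls into the deprecated
    // ApplicationHistoryManagerImpl branch.
    conf.unset(YarnConfiguration.APPLICATION_HISTORY_STORE);
    System.out.println(conf.get(YarnConfiguration.APPLICATION_HISTORY_STORE));
  }
}
{code}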

 Few fields displaying wrong values in Timeline server after RM restart
 --

 Key: YARN-2262
 URL: https://issues.apache.org/jira/browse/YARN-2262
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: 2.4.0
Reporter: Nishan Shetty
Assignee: Naganarasimha G R
 Attachments: Capture.PNG, Capture1.PNG, 
 yarn-testos-historyserver-HOST-10-18-40-95.log, 
 yarn-testos-resourcemanager-HOST-10-18-40-84.log, 
 yarn-testos-resourcemanager-HOST-10-18-40-95.log


 Few fields displaying wrong values in Timeline server after RM restart
 State:null
 FinalStatus:  UNDEFINED
 Started:  8-Jul-2014 14:58:08
 Elapsed:  2562047397789hrs, 44mins, 47sec 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2262) Few fields displaying wrong values in Timeline server after RM restart

2015-08-18 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700830#comment-14700830
 ] 

Jeff Zhang commented on YARN-2262:
--

bq. No longer maintain FS based generic history store.
I can reproduce this issue easily by restarting the RM while an app is running. 
Checking YARN-2033, I do see that app history data can now be stored in the 
Timeline service. But it looks like there's no ATS implementation of 
ApplicationHistoryStore; FileSystemApplicationHistoryStore is still the only 
feasible one for RM recovery. So does it make sense to stop maintaining it? Or 
am I missing something? [~zjshen]  [~djp]


 Few fields displaying wrong values in Timeline server after RM restart
 --

 Key: YARN-2262
 URL: https://issues.apache.org/jira/browse/YARN-2262
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: 2.4.0
Reporter: Nishan Shetty
Assignee: Naganarasimha G R
 Attachments: Capture.PNG, Capture1.PNG, 
 yarn-testos-historyserver-HOST-10-18-40-95.log, 
 yarn-testos-resourcemanager-HOST-10-18-40-84.log, 
 yarn-testos-resourcemanager-HOST-10-18-40-95.log


 Few fields displaying wrong values in Timeline server after RM restart
 State:null
 FinalStatus:  UNDEFINED
 Started:  8-Jul-2014 14:58:08
 Elapsed:  2562047397789hrs, 44mins, 47sec 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2262) Few fields displaying wrong values in Timeline server after RM restart

2015-08-18 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700867#comment-14700867
 ] 

Jeff Zhang commented on YARN-2262:
--

And the documentation needs to be updated for the deprecation of 
FileSystemApplicationHistoryStore.

http://hadoop.apache.org/docs/r2.6.0/hadoop-yarn/hadoop-yarn-site/TimelineServer.html



 Few fields displaying wrong values in Timeline server after RM restart
 --

 Key: YARN-2262
 URL: https://issues.apache.org/jira/browse/YARN-2262
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: 2.4.0
Reporter: Nishan Shetty
Assignee: Naganarasimha G R
 Attachments: Capture.PNG, Capture1.PNG, 
 yarn-testos-historyserver-HOST-10-18-40-95.log, 
 yarn-testos-resourcemanager-HOST-10-18-40-84.log, 
 yarn-testos-resourcemanager-HOST-10-18-40-95.log


 Few fields displaying wrong values in Timeline server after RM restart
 State:null
 FinalStatus:  UNDEFINED
 Started:  8-Jul-2014 14:58:08
 Elapsed:  2562047397789hrs, 44mins, 47sec 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3979) Am in ResourceLocalizationService hang 10 min cause RM kill AM

2015-08-18 Thread zhangyubiao (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700886#comment-14700886
 ] 

zhangyubiao commented on YARN-3979:
---

Thanks for Rohith Sharma K S's patch. We have stopped the log copying where the 
program hung, and we will test the patch in our test environment; if it is OK, 
we will apply it to our production environments. Thank you for your help.

 Am in ResourceLocalizationService hang 10 min cause RM kill  AM
 ---

 Key: YARN-3979
 URL: https://issues.apache.org/jira/browse/YARN-3979
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.2.0
 Environment: CentOS 6.5  Hadoop-2.2.0
Reporter: zhangyubiao
 Attachments: ERROR103.log


 2015-07-27 02:46:17,348 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Created localizer for container_1437735375558_104282_01_01
 2015-07-27 02:56:18,510 INFO SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for appattempt_1437735375558_104282_01 (auth:SIMPLE)
 2015-07-27 02:56:18,510 INFO SecurityLogger.org.apache.hadoop.security.authorize.ServiceAuthorizationManager: Authorization successful for appattempt_1437735375558_104282_01 (auth:TOKEN) for protocol=interface org.apache.hadoop.yarn.api.ContainerManagementProtocolPB



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2262) Few fields displaying wrong values in Timeline server after RM restart

2015-08-18 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700861#comment-14700861
 ] 

Jeff Zhang commented on YARN-2262:
--

But my app still cannot be recovered. Does this mean YARN cannot recover a 
running app?
{code}
2015-08-18 15:18:35,270 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: Recovering attempt: appattempt_1439882258172_0001_01 with final state: null
2015-08-18 15:18:35,270 INFO org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: Create AMRMToken for ApplicationAttempt: appattempt_1439882258172_0001_01
2015-08-18 15:18:35,273 INFO org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: Creating password for appattempt_1439882258172_0001_01
2015-08-18 15:18:35,277 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: Application added - appId: application_1439882258172_0001 user: jzhang leaf-queue of parent: root #applications: 1
2015-08-18 15:18:35,278 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Accepted application application_1439882258172_0001 from user: jzhang, in queue: default
2015-08-18 15:18:35,278 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1439882258172_0001_01 State change from NEW to LAUNCHED
2015-08-18 15:18:35,278 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1439882258172_0001 State change from NEW to ACCEPTED
{code}

{code}
2015-08-18 15:18:36,305 ERROR org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: Application attempt appattempt_1439882258172_0001_01 doesn't exist in ApplicationMasterService cache.
2015-08-18 15:18:36,306 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 8030, call org.apache.hadoop.yarn.api.ApplicationMasterProtocolPB.allocate from 192.168.3.3:56241 Call#56 Retry#0
org.apache.hadoop.yarn.exceptions.ApplicationAttemptNotFoundException: Application attempt appattempt_1439882258172_0001_01 doesn't exist in ApplicationMasterService cache.
    at org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:436)
    at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.allocate(ApplicationMasterProtocolPBServiceImpl.java:60)
    at org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:99)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)
2015-08-18 15:18:37,298 INFO org.apache.hadoop.yarn.util.RackResolver: Resolved 192.168.3.3 to /default-rack
{code}

 Few fields displaying wrong values in Timeline server after RM restart
 --

 Key: YARN-2262
 URL: https://issues.apache.org/jira/browse/YARN-2262
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: 2.4.0
Reporter: Nishan Shetty
Assignee: Naganarasimha G R
 Attachments: Capture.PNG, Capture1.PNG, 
 yarn-testos-historyserver-HOST-10-18-40-95.log, 
 yarn-testos-resourcemanager-HOST-10-18-40-84.log, 
 yarn-testos-resourcemanager-HOST-10-18-40-95.log


 Few fields displaying wrong values in Timeline server after RM restart
 State:null
 FinalStatus:  UNDEFINED
 Started:  8-Jul-2014 14:58:08
 Elapsed:  2562047397789hrs, 44mins, 47sec 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2005) Blacklisting support for scheduling AMs

2015-08-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701163#comment-14701163
 ] 

Hadoop QA commented on YARN-2005:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  19m 21s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 6 new or modified test files. |
| {color:green}+1{color} | javac |   7m 48s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 39s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 24s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 47s | The applied patch generated  1 
new checkstyle issues (total was 211, now 211). |
| {color:red}-1{color} | whitespace |   0m 12s | The patch has 1  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 27s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   3m 51s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | tools/hadoop tests |   0m 52s | Tests passed in 
hadoop-sls. |
| {color:red}-1{color} | yarn tests |   0m 22s | Tests failed in 
hadoop-yarn-api. |
| {color:red}-1{color} | yarn tests |  57m 41s | Tests failed in 
hadoop-yarn-server-resourcemanager. |
| | | 104m 16s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.yarn.conf.TestYarnConfigurationFields |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerAllocation |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12750966/YARN-2005.005.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 71566e2 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/8874/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-YARN-Build/8874/artifact/patchprocess/whitespace.txt
 |
| hadoop-sls test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8874/artifact/patchprocess/testrun_hadoop-sls.txt
 |
| hadoop-yarn-api test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8874/artifact/patchprocess/testrun_hadoop-yarn-api.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8874/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8874/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8874/console |


This message was automatically generated.

 Blacklisting support for scheduling AMs
 ---

 Key: YARN-2005
 URL: https://issues.apache.org/jira/browse/YARN-2005
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 0.23.10, 2.4.0
Reporter: Jason Lowe
Assignee: Anubhav Dhoot
 Attachments: YARN-2005.001.patch, YARN-2005.002.patch, 
 YARN-2005.003.patch, YARN-2005.004.patch, YARN-2005.005.patch


 It would be nice if the RM supported blacklisting a node for an AM launch 
 after the same node fails a configurable number of AM attempts.  This would 
 be similar to the blacklisting support for scheduling task attempts in the 
 MapReduce AM but for scheduling AM attempts on the RM side.
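
For illustration only, the counting logic such a feature implies might look 
roughly like this (the class and threshold below are hypothetical, not the 
patch's design):

{code}
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Illustration only: blacklist a node for AM placement once it has
 *  failed a configurable number of AM attempts. */
class AmBlacklistTracker {
  private final int maxAmFailuresPerNode; // e.g. from a new config knob
  private final Map<String, Integer> amFailures =
      new HashMap<String, Integer>();
  private final Set<String> blacklisted = new HashSet<String>();

  AmBlacklistTracker(int maxAmFailuresPerNode) {
    this.maxAmFailuresPerNode = maxAmFailuresPerNode;
  }

  void onAmAttemptFailed(String nodeId) {
    Integer prev = amFailures.get(nodeId);
    int failures = (prev == null ? 0 : prev) + 1;
    amFailures.put(nodeId, failures);
    if (failures >= maxAmFailuresPerNode) {
      blacklisted.add(nodeId); // skip this node for future AM launches
    }
  }

  boolean isBlacklistedForAm(String nodeId) {
    return blacklisted.contains(nodeId);
  }
}
{code}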



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3045) [Event producers] Implement NM writing container lifecycle events to ATS

2015-08-18 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701109#comment-14701109
 ] 

Junping Du commented on YARN-3045:
--

I have committed the latest (011) patch to the YARN-2928 branch. Thanks 
[~Naganarasimha] for contributing the patch and [~sjlee0] for the review!
bq. So shall i handle YARN-3367 jira and then revisit the missing NM container 
and application events?
Sure. I have made it unassigned, so feel free to pick it up.

 [Event producers] Implement NM writing container lifecycle events to ATS
 

 Key: YARN-3045
 URL: https://issues.apache.org/jira/browse/YARN-3045
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Sangjin Lee
Assignee: Naganarasimha G R
 Attachments: YARN-3045-YARN-2928.002.patch, 
 YARN-3045-YARN-2928.003.patch, YARN-3045-YARN-2928.004.patch, 
 YARN-3045-YARN-2928.005.patch, YARN-3045-YARN-2928.006.patch, 
 YARN-3045-YARN-2928.007.patch, YARN-3045-YARN-2928.008.patch, 
 YARN-3045-YARN-2928.009.patch, YARN-3045-YARN-2928.010.patch, 
 YARN-3045-YARN-2928.011.patch, YARN-3045.20150420-1.patch


 Per design in YARN-2928, implement NM writing container lifecycle events and 
 container system metrics to ATS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3367) Replace starting a separate thread for post entity with event loop in TimelineClient

2015-08-18 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-3367:
-
Assignee: (was: Junping Du)

 Replace starting a separate thread for post entity with event loop in 
 TimelineClient
 

 Key: YARN-3367
 URL: https://issues.apache.org/jira/browse/YARN-3367
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: YARN-2928
Reporter: Junping Du

 Since YARN-3039, we have added a loop in TimelineClient to wait for the 
 collectorServiceAddress to be ready before posting any entity. In consumers 
 of TimelineClient (like the AM), we start a new thread for each call to avoid 
 a potential deadlock in the main thread. This approach has at least 3 major 
 defects:
 1. The consumer needs additional code to wrap each putEntities() call to 
 TimelineClient in a thread.
 2. It costs many thread resources, which is unnecessary.
 3. Events can end up out of order because the posting threads leave the 
 waiting loop in arbitrary order.
 We should have something like an event loop on the TimelineClient side: 
 putEntities() only puts the entities into a queue, and a separate thread 
 delivers the queued entities to the collector via REST calls.
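
A minimal sketch of that producer/consumer shape (the dispatcher and the REST 
delivery below are placeholders, not the eventual implementation):

{code}
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Illustration only: putEntities() just enqueues; one dispatcher thread
 *  waits for the collector address and delivers entities in FIFO order. */
class TimelineEntityDispatcher {
  private final BlockingQueue<Object> queue =
      new LinkedBlockingQueue<Object>();

  private final Thread dispatcher = new Thread(new Runnable() {
    @Override
    public void run() {
      try {
        while (!Thread.currentThread().isInterrupted()) {
          Object entity = queue.take(); // blocks here, not in the caller
          deliverViaRest(entity);       // placeholder for the REST call
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }
  }, "timeline-entity-dispatcher");

  void start() {
    dispatcher.setDaemon(true);
    dispatcher.start();
  }

  /** Non-blocking for the caller; FIFO ordering is preserved. */
  void putEntities(Object entity) {
    queue.offer(entity);
  }

  private void deliverViaRest(Object entity) {
    // Wait for collectorServiceAddress if needed, then POST the entity.
  }
}
{code}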



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4014) Support user cli interface in for Application Priority

2015-08-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701144#comment-14701144
 ] 

Hadoop QA commented on YARN-4014:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  20m 19s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 3 new or modified test files. |
| {color:green}+1{color} | javac |   7m 49s | There were no new javac warning 
messages. |
| {color:red}-1{color} | javadoc |   9m 47s | The applied patch generated  3  
additional warning messages. |
| {color:green}+1{color} | release audit |   0m 24s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   2m 30s | The applied patch generated  3 
new checkstyle issues (total was 31, now 34). |
| {color:green}+1{color} | whitespace |   0m 12s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 25s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   6m 14s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | mapreduce tests | 104m 56s | Tests failed in 
hadoop-mapreduce-client-jobclient. |
| {color:green}+1{color} | yarn tests |   0m 27s | Tests passed in 
hadoop-yarn-api. |
| {color:red}-1{color} | yarn tests |   5m  9s | Tests failed in 
hadoop-yarn-client. |
| {color:red}-1{color} | yarn tests |   0m 23s | Tests failed in 
hadoop-yarn-common. |
| {color:red}-1{color} | yarn tests |   0m 22s | Tests failed in 
hadoop-yarn-server-resourcemanager. |
| | | 161m  9s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.mapreduce.lib.output.TestJobOutputCommitter |
|   | hadoop.yarn.client.api.impl.TestNMClient |
|   | hadoop.yarn.client.api.impl.TestYarnClient |
| Timed out tests | org.apache.hadoop.mapreduce.TestLargeSort |
|   | org.apache.hadoop.yarn.client.api.impl.TestAHSClient |
|   | org.apache.hadoop.yarn.client.api.impl.TestAMRMClient |
| Failed build | hadoop-yarn-common |
|   | hadoop-yarn-server-resourcemanager |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12750963/0004-YARN-4014.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 71566e2 |
| javadoc | 
https://builds.apache.org/job/PreCommit-YARN-Build/8873/artifact/patchprocess/diffJavadocWarnings.txt
 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/8873/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt
 |
| hadoop-mapreduce-client-jobclient test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8873/artifact/patchprocess/testrun_hadoop-mapreduce-client-jobclient.txt
 |
| hadoop-yarn-api test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8873/artifact/patchprocess/testrun_hadoop-yarn-api.txt
 |
| hadoop-yarn-client test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8873/artifact/patchprocess/testrun_hadoop-yarn-client.txt
 |
| hadoop-yarn-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8873/artifact/patchprocess/testrun_hadoop-yarn-common.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8873/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8873/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf908.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8873/console |


This message was automatically generated.

 Support user cli interface in for Application Priority
 --

 Key: YARN-4014
 URL: https://issues.apache.org/jira/browse/YARN-4014
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: client, resourcemanager
Reporter: Rohith Sharma K S
Assignee: Rohith Sharma K S
 Attachments: 0001-YARN-4014-V1.patch, 0001-YARN-4014.patch, 
 0002-YARN-4014.patch, 0003-YARN-4014.patch, 0004-YARN-4014.patch


 Track the changes for user-RM client protocol i.e ApplicationClientProtocol 
 changes and discussions in this jira.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (YARN-3367) Replace starting a separate thread for post entity with event loop in TimelineClient

2015-08-18 Thread Naganarasimha G R (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naganarasimha G R reassigned YARN-3367:
---

Assignee: Naganarasimha G R

 Replace starting a separate thread for post entity with event loop in 
 TimelineClient
 

 Key: YARN-3367
 URL: https://issues.apache.org/jira/browse/YARN-3367
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: YARN-2928
Reporter: Junping Du
Assignee: Naganarasimha G R

 Since YARN-3039, we have added a loop in TimelineClient to wait for the 
 collectorServiceAddress to be ready before posting any entity. In consumers 
 of TimelineClient (like the AM), we start a new thread for each call to avoid 
 a potential deadlock in the main thread. This approach has at least 3 major 
 defects:
 1. The consumer needs additional code to wrap each putEntities() call to 
 TimelineClient in a thread.
 2. It costs many thread resources, which is unnecessary.
 3. Events can end up out of order because the posting threads leave the 
 waiting loop in arbitrary order.
 We should have something like an event loop on the TimelineClient side: 
 putEntities() only puts the entities into a queue, and a separate thread 
 delivers the queued entities to the collector via REST calls.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4025) Deal with byte representations of Longs in writer code

2015-08-18 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701656#comment-14701656
 ] 

Junping Du commented on YARN-4025:
--

bq.  The EntityTable.java file is already fixed in the v.3 patch.
I meant the example: id3?id4?id5 should be id3=id4=id5? Or am I missing 
something here? :)

 Deal with byte representations of Longs in writer code
 --

 Key: YARN-4025
 URL: https://issues.apache.org/jira/browse/YARN-4025
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Vrushali C
Assignee: Sangjin Lee
 Attachments: YARN-4025-YARN-2928.001.patch, 
 YARN-4025-YARN-2928.002.patch, YARN-4025-YARN-2928.003.patch


 Timestamps are stored as Longs in HBase by the HBaseTimelineWriterImpl code. 
 There are some places in the code where values are converted from Long to 
 byte[] to String for easier argument passing between function calls, and 
 then converted back to byte[] while storing in HBase. 
 It would be better to pass around byte[] or the Longs themselves, as 
 applicable. 
 This may result in some API changes (the store function), as well as adding 
 a few more function calls, such as a getColumnQualifier that accepts a 
 pre-encoded byte array in addition to the existing API that accepts a 
 String, and having ColumnHelper return a byte[] column name instead of a 
 String one. 
 Filing this jira to track these changes.
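
For illustration, the difference boils down to something like this, using 
HBase's Bytes utility (the surrounding class is made up):

{code}
import org.apache.hadoop.hbase.util.Bytes;

public class LongEncodingExample {
  public static void main(String[] args) {
    long timestamp = System.currentTimeMillis();

    // Round trip through String: variable width, re-parsed before storing.
    byte[] viaString = Bytes.toBytes(String.valueOf(timestamp));

    // Direct encoding: one fixed-width 8-byte representation.
    byte[] direct = Bytes.toBytes(timestamp);

    System.out.println(viaString.length + " vs " + direct.length + " bytes");
  }
}
{code}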



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3857) Memory leak in ResourceManager with SIMPLE mode

2015-08-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701755#comment-14701755
 ] 

Hudson commented on YARN-3857:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8317 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8317/])
YARN-3857: Memory leak in ResourceManager with SIMPLE mode. Contributed by 
mujunchao. (zxu: rev 3a76a010b85176f2bcb85ed6f74c25dcb8acfe4d)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/TestRMAppAttemptTransitions.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/ClientToAMTokenSecretManagerInRM.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java


 Memory leak in ResourceManager with SIMPLE mode
 ---

 Key: YARN-3857
 URL: https://issues.apache.org/jira/browse/YARN-3857
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.7.0
Reporter: mujunchao
Assignee: mujunchao
Priority: Critical
  Labels: patch
 Attachments: YARN-3857-1.patch, YARN-3857-2.patch, YARN-3857-3.patch, 
 YARN-3857-4.patch, hadoop-yarn-server-resourcemanager.patch


  We register the ClientTokenMasterKey so that a client does not hold an 
 invalid ClientToken after the RM restarts. In SIMPLE mode we register 
 Pair<ApplicationAttemptId, null>, but we never remove it from the HashMap, 
 since unregistering runs only in secure mode, so the memory leaks.
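
A condensed sketch of the leak and the fix direction (class and method names 
simplified for illustration; this is not the committed code):

{code}
import java.util.HashMap;
import java.util.Map;

/** Illustration only: registration happens in both modes, so removal must
 *  also happen in both modes or the map grows without bound. */
class ClientToAMTokenRegistry {
  private final Map<String, byte[]> masterKeys =
      new HashMap<String, byte[]>();

  void registerApplication(String attemptId, byte[] keyOrNull) {
    masterKeys.put(attemptId, keyOrNull); // SIMPLE mode stores a null key
  }

  /** Buggy shape: only invoked when security is on, so SIMPLE mode leaks. */
  void unregisterSecureOnly(String attemptId, boolean securityEnabled) {
    if (securityEnabled) {
      masterKeys.remove(attemptId);
    }
  }

  /** Fixed shape: always remove the entry when the attempt finishes. */
  void unregisterApplication(String attemptId) {
    masterKeys.remove(attemptId);
  }
}
{code}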



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4025) Deal with byte representations of Longs in writer code

2015-08-18 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701672#comment-14701672
 ] 

Sangjin Lee commented on YARN-4025:
---

Oh OK. Got it. I thought you meant the line you referred to.

 Deal with byte representations of Longs in writer code
 --

 Key: YARN-4025
 URL: https://issues.apache.org/jira/browse/YARN-4025
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Vrushali C
Assignee: Sangjin Lee
 Attachments: YARN-4025-YARN-2928.001.patch, 
 YARN-4025-YARN-2928.002.patch, YARN-4025-YARN-2928.003.patch


 Timestamps are stored as Longs in HBase by the HBaseTimelineWriterImpl code. 
 There are some places in the code where values are converted from Long to 
 byte[] to String for easier argument passing between function calls, and 
 then converted back to byte[] while storing in HBase. 
 It would be better to pass around byte[] or the Longs themselves, as 
 applicable. 
 This may result in some API changes (the store function), as well as adding 
 a few more function calls, such as a getColumnQualifier that accepts a 
 pre-encoded byte array in addition to the existing API that accepts a 
 String, and having ColumnHelper return a byte[] column name instead of a 
 String one. 
 Filing this jira to track these changes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4024) YARN RM should avoid unnecessary resolving IP when NMs doing heartbeat

2015-08-18 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701598#comment-14701598
 ] 

Wangda Tan commented on YARN-4024:
--

Hi [~zhiguohong],
Thanks for the update. Some minor comments:
1) I think we can limit the cache-removal changes to the NodesListManager: in 
handle(..) we can do the flush(..). That is the same as doing it in 
RMNodeImpl, and we don't need to expose an extra method, correct?

2) I suggest renaming CachedResolver.flush to something like removeCache; 
flush sounds more like a file-system concept to me.

3) Add tests to check that NodesListManager handles the events correctly, if 
you agree with 2).

 YARN RM should avoid unnecessary resolving IP when NMs doing heartbeat
 --

 Key: YARN-4024
 URL: https://issues.apache.org/jira/browse/YARN-4024
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Wangda Tan
Assignee: Hong Zhiguo
 Attachments: YARN-4024-draft-v2.patch, YARN-4024-draft-v3.patch, 
 YARN-4024-draft.patch


 Currently, the YARN RM NodesListManager resolves the IP address every time a 
 node heartbeats. When the DNS server becomes slow, NM heartbeats are blocked 
 and cannot make progress.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4025) Deal with byte representations of Longs in writer code

2015-08-18 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701645#comment-14701645
 ] 

Sangjin Lee commented on YARN-4025:
---

Thanks for your review [~djp]!

{quote}
Do we handle columnPrefixBytes to be null here as Javadoc comments? I saw we 
handle this case explicitly in readResults() but I didn't see it here. Let me 
know if I miss something.
{quote}

That is a good catch. Let me look into that. If we retain the same behavior for 
a null qualifier (and probably we should), then the return type of this method 
would need to go back to {{Map<Object, Object>}}. I'll also think about the 
method names. Cc'ing [~vrushalic] for her opinion as well.

{quote}
Checking with javadoc in Separator and TimelineWriterUtils - a negative value 
indicates no limit on number of segments., so can we define a constant value 
like NO_LIMIT to replace -1 here?
{quote}

Will do.
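
Presumably something along these lines (the constant's name and home are 
illustrative and up for review):

{code}
/** Illustration only. */
final class TimelineStorageConstants {
  /** A negative value indicates no limit on the number of split segments. */
  static final int NO_LIMIT_SPLITS = -1;
}
{code}

Call sites would then pass the named constant instead of a bare -1.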

{quote}
I think we should do the same thing to some javadoc examples in 
EntityTable.java.
{quote}

The {{EntityTable.java}} file is already fixed in the v.3 patch.

{quote}
Forget to mention that, YARN-3049 should rename TestHBaseTimelineWriterImpl to 
something include Reader. Would you like to do it here? Thanks!
{quote}

Good idea. The thought definitely occurred to me.

I'll update the patch pretty soon.

 Deal with byte representations of Longs in writer code
 --

 Key: YARN-4025
 URL: https://issues.apache.org/jira/browse/YARN-4025
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Vrushali C
Assignee: Sangjin Lee
 Attachments: YARN-4025-YARN-2928.001.patch, 
 YARN-4025-YARN-2928.002.patch, YARN-4025-YARN-2928.003.patch


 Timestamps are stored as Longs in HBase by the HBaseTimelineWriterImpl code. 
 There are some places in the code where values are converted from Long to 
 byte[] to String for easier argument passing between function calls, and 
 then converted back to byte[] while storing in HBase. 
 It would be better to pass around byte[] or the Longs themselves, as 
 applicable. 
 This may result in some API changes (the store function), as well as adding 
 a few more function calls, such as a getColumnQualifier that accepts a 
 pre-encoded byte array in addition to the existing API that accepts a 
 String, and having ColumnHelper return a byte[] column name instead of a 
 String one. 
 Filing this jira to track these changes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3814) REST API implementation for getting raw entities in TimelineReader

2015-08-18 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-3814:
---
Attachment: YARN-3814-YARN-2928.05.patch

 REST API implementation for getting raw entities in TimelineReader
 --

 Key: YARN-3814
 URL: https://issues.apache.org/jira/browse/YARN-3814
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: YARN-2928
Reporter: Varun Saxena
Assignee: Varun Saxena
 Attachments: YARN-3814-YARN-2928.01.patch, 
 YARN-3814-YARN-2928.02.patch, YARN-3814-YARN-2928.03.patch, 
 YARN-3814-YARN-2928.04.patch, YARN-3814-YARN-2928.05.patch, 
 YARN-3814.reference.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1473) Exception from container-launch(Apache Hadoop 2.2.0)

2015-08-18 Thread Maximiliano Mendez (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701766#comment-14701766
 ] 

Maximiliano Mendez commented on YARN-1473:
--

I don't know if it's [~gagansab]'s case, but I found this while digging through 
some container logs under the yarn local-dir configuration:

java.io.FileNotFoundException: ${yarn.nodemanager.log-dirs}/application_1439909765014_0004/container_e08_1439909765014_0004_02_01 (Is a directory)
    at java.io.FileOutputStream.open(Native Method)
    at java.io.FileOutputStream.<init>(FileOutputStream.java:221)
    at java.io.FileOutputStream.<init>(FileOutputStream.java:142)
    at org.apache.log4j.FileAppender.setFile(FileAppender.java:294)
    at org.apache.log4j.FileAppender.activateOptions(FileAppender.java:165)
    at org.apache.hadoop.yarn.ContainerLogAppender.activateOptions(ContainerLogAppender.java:55)
    at org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:307)
    at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:172)
    at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:104)
    at org.apache.log4j.PropertyConfigurator.parseAppender(PropertyConfigurator.java:842)
    at org.apache.log4j.PropertyConfigurator.parseCategory(PropertyConfigurator.java:768)
    at org.apache.log4j.PropertyConfigurator.configureRootCategory(PropertyConfigurator.java:648)
    at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:514)
    at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:580)
    at org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:526)
    at org.apache.log4j.LogManager.<clinit>(LogManager.java:127)
    at org.apache.log4j.Logger.getLogger(Logger.java:104)
    at org.apache.commons.logging.impl.Log4JLogger.getLogger(Log4JLogger.java:262)
    at org.apache.commons.logging.impl.Log4JLogger.<init>(Log4JLogger.java:108)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at org.apache.commons.logging.impl.LogFactoryImpl.createLogFromClass(LogFactoryImpl.java:1025)
    at org.apache.commons.logging.impl.LogFactoryImpl.discoverLogImplementation(LogFactoryImpl.java:844)
    at org.apache.commons.logging.impl.LogFactoryImpl.newInstance(LogFactoryImpl.java:541)
    at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:292)
    at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:269)
    at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:657)
    at org.apache.hadoop.service.AbstractService.<clinit>(AbstractService.java:43)

Could this be causing the error?

 Exception from container-launch(Apache Hadoop 2.2.0)
 

 Key: YARN-1473
 URL: https://issues.apache.org/jira/browse/YARN-1473
 Project: Hadoop YARN
  Issue Type: Bug
 Environment: CentOS5.8 and Apache Hadoop 2.2.0
Reporter: Joy Xu
 Attachments: yarn-site.xml


 Hello all,
 I have met an exception from container-launch when running the built-in 
 wordcount program, and the error message is as follows:
 {code}
 13/12/05 00:17:31 INFO mapreduce.Job: Job job_1386171829089_0003 failed with 
 state FAILED due to: Application application_1386171829089_0003 failed 2 
 times due to AM Container for appattempt_1386171829089_0003_02 exited 
 with  exitCode: 1 due to: Exception from container-launch: 
 org.apache.hadoop.util.Shell$ExitCodeException: 
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
    at org.apache.hadoop.util.Shell.run(Shell.java:379)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
   at 
 

[jira] [Commented] (YARN-4014) Support user cli interface in for Application Priority

2015-08-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701811#comment-14701811
 ] 

Hadoop QA commented on YARN-4014:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  20m 45s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 3 new or modified test files. |
| {color:green}+1{color} | javac |   7m 57s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 57s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   2m 37s | The applied patch generated  5 
new checkstyle issues (total was 31, now 36). |
| {color:green}+1{color} | whitespace |   0m 12s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 23s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 32s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   6m 15s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | mapreduce tests | 101m 22s | Tests failed in 
hadoop-mapreduce-client-jobclient. |
| {color:green}+1{color} | yarn tests |   0m 39s | Tests passed in 
hadoop-yarn-api. |
| {color:green}+1{color} | yarn tests |   7m 29s | Tests passed in 
hadoop-yarn-client. |
| {color:red}-1{color} | yarn tests |   2m 12s | Tests failed in 
hadoop-yarn-common. |
| {color:red}-1{color} | yarn tests |  56m  1s | Tests failed in 
hadoop-yarn-server-resourcemanager. |
| | | 218m 20s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.yarn.util.TestRackResolver |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerAllocation |
| Timed out tests | org.apache.hadoop.mapred.TestMRIntermediateDataEncryption |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12751035/0004-YARN-4014.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / fc509f6 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/8876/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt
 |
| hadoop-mapreduce-client-jobclient test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8876/artifact/patchprocess/testrun_hadoop-mapreduce-client-jobclient.txt
 |
| hadoop-yarn-api test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8876/artifact/patchprocess/testrun_hadoop-yarn-api.txt
 |
| hadoop-yarn-client test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8876/artifact/patchprocess/testrun_hadoop-yarn-client.txt
 |
| hadoop-yarn-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8876/artifact/patchprocess/testrun_hadoop-yarn-common.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8876/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8876/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf908.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8876/console |


This message was automatically generated.

 Support user cli interface in for Application Priority
 --

 Key: YARN-4014
 URL: https://issues.apache.org/jira/browse/YARN-4014
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: client, resourcemanager
Reporter: Rohith Sharma K S
Assignee: Rohith Sharma K S
 Attachments: 0001-YARN-4014-V1.patch, 0001-YARN-4014.patch, 
 0002-YARN-4014.patch, 0003-YARN-4014.patch, 0004-YARN-4014.patch, 
 0004-YARN-4014.patch


 Track the changes for user-RM client protocol i.e ApplicationClientProtocol 
 changes and discussions in this jira.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1644) RM-NM protocol changes and NodeStatusUpdater implementation to support container resizing

2015-08-18 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701654#comment-14701654
 ] 

Jian He commented on YARN-1644:
---

bq.  I am also wondering if we should do the same for 
ContainerManagerImpl#startContainers
That should be the same issue. We may do this too.

 RM-NM protocol changes and NodeStatusUpdater implementation to support 
 container resizing
 -

 Key: YARN-1644
 URL: https://issues.apache.org/jira/browse/YARN-1644
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager
Reporter: Wangda Tan
Assignee: MENG DING
 Attachments: YARN-1644-YARN-1197.4.patch, 
 YARN-1644-YARN-1197.5.patch, YARN-1644.1.patch, YARN-1644.2.patch, 
 YARN-1644.3.patch, yarn-1644.1.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3893) Both RM in active state when Admin#transitionToActive failure from refeshAll()

2015-08-18 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-3893:
---
Affects Version/s: 2.7.1
 Target Version/s: 2.7.2

 Both RM in active state when Admin#transitionToActive failure from refeshAll()
 --

 Key: YARN-3893
 URL: https://issues.apache.org/jira/browse/YARN-3893
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.7.1
Reporter: Bibin A Chundatt
Assignee: Bibin A Chundatt
Priority: Critical
 Attachments: 0001-YARN-3893.patch, 0002-YARN-3893.patch, 
 0003-YARN-3893.patch, 0004-YARN-3893.patch, yarn-site.xml


 Cases that can cause this:
 # Capacity scheduler XML is wrongly configured during the switch
 # Refresh ACL failure due to configuration
 # Refresh user-group failure due to configuration
 Continuously, both RMs will try to become active.
 {code}
 dsperf@host-10-128:/opt/bibin/dsperf/OPENSOURCE_3_0/install/hadoop/resourcemanager/bin
  ./yarn rmadmin  -getServiceState rm1
 15/07/07 19:08:10 WARN util.NativeCodeLoader: Unable to load native-hadoop 
 library for your platform... using builtin-java classes where applicable
 active
 dsperf@host-128:/opt/bibin/dsperf/OPENSOURCE_3_0/install/hadoop/resourcemanager/bin
  ./yarn rmadmin  -getServiceState rm2
 15/07/07 19:08:12 WARN util.NativeCodeLoader: Unable to load native-hadoop 
 library for your platform... using builtin-java classes where applicable
 active
 {code}
 # Both web UIs are active
 # Status shown as active for both RMs



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-3857) Memory leak in ResourceManager with SIMPLE mode

2015-08-18 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu resolved YARN-3857.
-
Resolution: Fixed

 Memory leak in ResourceManager with SIMPLE mode
 ---

 Key: YARN-3857
 URL: https://issues.apache.org/jira/browse/YARN-3857
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.7.0
Reporter: mujunchao
Assignee: mujunchao
Priority: Critical
  Labels: patch
 Fix For: 2.7.2

 Attachments: YARN-3857-1.patch, YARN-3857-2.patch, YARN-3857-3.patch, 
 YARN-3857-4.patch, hadoop-yarn-server-resourcemanager.patch


  We register the ClientTokenMasterKey so that a client does not hold an 
 invalid ClientToken after the RM restarts. In SIMPLE mode we register 
 Pair<ApplicationAttemptId, null>, but we never remove it from the HashMap, 
 since unregistering runs only in secure mode, so the memory leaks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3857) Memory leak in ResourceManager with SIMPLE mode

2015-08-18 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated YARN-3857:

Fix Version/s: 2.7.2

 Memory leak in ResourceManager with SIMPLE mode
 ---

 Key: YARN-3857
 URL: https://issues.apache.org/jira/browse/YARN-3857
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.7.0
Reporter: mujunchao
Assignee: mujunchao
Priority: Critical
  Labels: patch
 Fix For: 2.7.2

 Attachments: YARN-3857-1.patch, YARN-3857-2.patch, YARN-3857-3.patch, 
 YARN-3857-4.patch, hadoop-yarn-server-resourcemanager.patch


  We register the ClientTokenMasterKey so that a client does not hold an 
 invalid ClientToken after the RM restarts. In SIMPLE mode we register 
 Pair<ApplicationAttemptId, null>, but we never remove it from the HashMap, 
 since unregistering runs only in secure mode, so the memory leaks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3857) Memory leak in ResourceManager with SIMPLE mode

2015-08-18 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701824#comment-14701824
 ] 

zhihai xu commented on YARN-3857:
-

Thanks to [~mujunchao] for the contribution and to Devaraj for additional 
review! I committed this to trunk, branch-2 and branch-2.7.

 Memory leak in ResourceManager with SIMPLE mode
 ---

 Key: YARN-3857
 URL: https://issues.apache.org/jira/browse/YARN-3857
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.7.0
Reporter: mujunchao
Assignee: mujunchao
Priority: Critical
  Labels: patch
 Fix For: 2.7.2

 Attachments: YARN-3857-1.patch, YARN-3857-2.patch, YARN-3857-3.patch, 
 YARN-3857-4.patch, hadoop-yarn-server-resourcemanager.patch


  We register the ClientTokenMasterKey so that a client does not hold an invalid 
 ClientToken after the RM restarts. In SIMPLE mode we register the pair 
 (ApplicationAttemptId, null), but we never remove it from the HashMap, because 
 unregistration only runs in secure mode, so the map leaks memory.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4057) If ContainersMonitor is not enabled, only print related log info one time

2015-08-18 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated YARN-4057:

Fix Version/s: 2.8.0

 If ContainersMonitor is not enabled, only print related log info one time
 -

 Key: YARN-4057
 URL: https://issues.apache.org/jira/browse/YARN-4057
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Reporter: Jun Gong
Assignee: Jun Gong
Priority: Minor
 Fix For: 2.8.0

 Attachments: YARN-4057.01.patch


 ContainersMonitorImpl checks whether it is enabled when handling every 
 event, and it prints the following message again and again if it is not enabled:
 {quote}
 2015-08-17 13:20:13,792 INFO 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl:
  Neither virutal-memory nor physical-memory is needed. Not running the 
 monitor-thread
 {quote}
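
 A common shape for this kind of fix is a log-once guard; a minimal sketch 
 (the class and field names are illustrative, not the committed patch):
 {code}
 import org.apache.commons.logging.Log;
 import org.apache.commons.logging.LogFactory;

 class MonitoringGate {
   private static final Log LOG = LogFactory.getLog(MonitoringGate.class);
   private final boolean pmemCheckEnabled;
   private final boolean vmemCheckEnabled;
   private volatile boolean logged = false;

   MonitoringGate(boolean pmemCheckEnabled, boolean vmemCheckEnabled) {
     this.pmemCheckEnabled = pmemCheckEnabled;
     this.vmemCheckEnabled = vmemCheckEnabled;
   }

   boolean isEnabled() {
     if (!pmemCheckEnabled && !vmemCheckEnabled) {
       if (!logged) {
         // the message is quoted verbatim from the log above
         LOG.info("Neither virutal-memory nor physical-memory is needed."
             + " Not running the monitor-thread");
         logged = true;
       }
       return false;
     }
     return true;
   }
 }
 {code}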



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3901) Populate flow run data in the flow_run table

2015-08-18 Thread Vrushali C (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701775#comment-14701775
 ] 

Vrushali C commented on YARN-3901:
--

I see, yes, will name it accordingly. 

 Populate flow run data in the flow_run table
 

 Key: YARN-3901
 URL: https://issues.apache.org/jira/browse/YARN-3901
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Vrushali C
Assignee: Vrushali C
 Attachments: YARN-3901-YARN-2928.WIP.patch


 As per the schema proposed in YARN-3815 in 
 https://issues.apache.org/jira/secure/attachment/12743391/hbase-schema-proposal-for-aggregation.pdf
 filing this jira to track creation and population of data in the flow run table. 
 Some points that are being considered:
 - Stores per-flow-run information aggregated across applications, per flow 
 version.
 - The RM's collector writes to it on app creation and app completion.
 - The per-app collector writes to it for metric updates at a slower frequency 
 than the metric updates to the application table.
 - Primary key: cluster ! user ! flow ! flow run id (see the row-key sketch 
 below).
 - Only the latest version of flow-level aggregated metrics will be kept, even 
 if the entity and application levels keep a timeseries.
 - The running_apps column will be incremented on app creation, and 
 decremented on app completion.
 - For min_start_time the RM writer will simply write a value with the tag for 
 the applicationId. A coprocessor will return the min value of all written 
 values.
 - Upon flush and compactions, the min value among all the cells of this 
 column will be written to the cell without any tag (empty tag) and all the 
 other cells will be discarded.
 - Ditto for max_end_time, but then the max will be kept.
 - Tags are represented as #type:value. The type can be not set (0), or can 
 indicate running (1) or complete (2). In those cases (for metrics) only 
 complete app metrics are collapsed on compaction.
 - The m! values are aggregated (summed) upon read. Only when applications are 
 completed (indicated by tag type 2) can the values be collapsed.
 - The application ids that have completed and been aggregated into the flow 
 numbers are retained in a separate column for historical tracking: we don't 
 want to re-aggregate those upon replay.
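
 For illustration, a sketch of building the row key described above (the 
 separator handling and the inverted run id are assumptions for this sketch, 
 not the actual schema code):
 {code}
 import org.apache.hadoop.hbase.util.Bytes;

 final class FlowRunRowKey {
   private static final String SEP = "!";

   // cluster!user!flow! followed by an inverted run id, so that the most
   // recent flow run sorts first; the inversion is an assumed design choice
   static byte[] of(String cluster, String user, String flow, long flowRunId) {
     byte[] prefix = Bytes.toBytes(cluster + SEP + user + SEP + flow + SEP);
     return Bytes.add(prefix, Bytes.toBytes(Long.MAX_VALUE - flowRunId));
   }
 }
 {code}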
 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4057) If ContainersMonitor is not enabled, only print related log info one time

2015-08-18 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701830#comment-14701830
 ] 

zhihai xu commented on YARN-4057:
-

Thanks [~hex108] for the contribution! I committed this to trunk and branch-2.

 If ContainersMonitor is not enabled, only print related log info one time
 -

 Key: YARN-4057
 URL: https://issues.apache.org/jira/browse/YARN-4057
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Reporter: Jun Gong
Assignee: Jun Gong
Priority: Minor
 Fix For: 2.8.0

 Attachments: YARN-4057.01.patch


 ContainersMonitorImpl checks whether it is enabled when handling every 
 event, and it prints the following message again and again if it is not enabled:
 {quote}
 2015-08-17 13:20:13,792 INFO 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl:
  Neither virutal-memory nor physical-memory is needed. Not running the 
 monitor-thread
 {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1644) RM-NM protocol changes and NodeStatusUpdater implementation to support container resizing

2015-08-18 Thread MENG DING (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701590#comment-14701590
 ] 

MENG DING commented on YARN-1644:
-

Thanks a lot [~leftnoteasy] and [~jianhe] for your comments and suggestions.

After more thought, I prefer [~jianhe]'s suggestion to synchronize 
{{ContainerManagerImpl#increaseContainersResource}} with NM-RM registration.  
If we do that, we should be able to resolve the RM recovery race condition 
issue. More specifically:
* If increaseContainersResource happens first, then the container resource will 
be increased in the NM before NM-RM registration.
* If NM-RM registration happens first, then the NM will get a new RM identifier 
after registration, and any subsequent increase request with a token issued by 
the old RM will be rejected.

For the implementation, I think I can simply synchronize on the {{NMContext}} 
object in both {{ContainerManagerImpl}} and {{NodeStatusUpdaterImpl}}, as 
sketched below.

Let me know if you have further thoughts or comments. I am also wondering 
whether we should do the same for {{ContainerManagerImpl#startContainers}}?
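
A rough sketch of what that locking could look like (the method shapes are 
assumed from the discussion above, not copied from any patch; the helpers are 
hypothetical):
{code}
// In ContainerManagerImpl: serialize resource increases on the shared context.
public IncreaseContainersResourceResponse increaseContainersResource(
    IncreaseContainersResourceRequest request) throws YarnException {
  synchronized (this.context) {
    // tokens issued by the old RM are rejected here once re-registration
    // has published the new RM identifier into the context
    return increaseInternal(request);   // hypothetical helper
  }
}

// In NodeStatusUpdaterImpl: serialize re-registration on the same context.
protected void registerWithRM() throws YarnException, IOException {
  synchronized (this.context) {
    doRegistration();                   // hypothetical helper
  }
}
{code}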

 RM-NM protocol changes and NodeStatusUpdater implementation to support 
 container resizing
 -

 Key: YARN-1644
 URL: https://issues.apache.org/jira/browse/YARN-1644
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager
Reporter: Wangda Tan
Assignee: MENG DING
 Attachments: YARN-1644-YARN-1197.4.patch, 
 YARN-1644-YARN-1197.5.patch, YARN-1644.1.patch, YARN-1644.2.patch, 
 YARN-1644.3.patch, yarn-1644.1.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3814) REST API implementation for getting raw entities in TimelineReader

2015-08-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701866#comment-14701866
 ] 

Hadoop QA commented on YARN-3814:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  19m 27s | Findbugs (version ) appears to 
be broken on YARN-2928. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   9m 31s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  12m 16s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 29s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 16s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m 16s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 47s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 50s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 55s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   2m 50s | Tests passed in 
hadoop-yarn-server-timelineservice. |
| | |  49m 45s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12751064/YARN-3814-YARN-2928.05.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | YARN-2928 / 9a82008 |
| hadoop-yarn-server-timelineservice test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8877/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8877/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8877/console |


This message was automatically generated.

 REST API implementation for getting raw entities in TimelineReader
 --

 Key: YARN-3814
 URL: https://issues.apache.org/jira/browse/YARN-3814
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: YARN-2928
Reporter: Varun Saxena
Assignee: Varun Saxena
 Attachments: YARN-3814-YARN-2928.01.patch, 
 YARN-3814-YARN-2928.02.patch, YARN-3814-YARN-2928.03.patch, 
 YARN-3814-YARN-2928.04.patch, YARN-3814-YARN-2928.05.patch, 
 YARN-3814.reference.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4025) Deal with byte representations of Longs in writer code

2015-08-18 Thread Sangjin Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sangjin Lee updated YARN-4025:
--
Attachment: YARN-4025-YARN-2928.004.patch

v.4 patch posted.

- implemented proper handling of a null column prefix
- added more javadoc to clarify several places
- renamed the test from {{TestHBaseTimelineWriterImpl}} to 
{{TestHBaseTimelineStorage}}
- clarified and made explicit the no-limit split
- fixed javadoc comments for the value separator
- added some logging statements

This should address most of the review comments. I stopped short of renaming 
the {{readResults()}} method. We can treat that method as the default 
{{readResults()}} method, and the other one as the one for having raw 
(non-string) components. I added more javadoc to clarify that point. Let me 
know.

 Deal with byte representations of Longs in writer code
 --

 Key: YARN-4025
 URL: https://issues.apache.org/jira/browse/YARN-4025
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Vrushali C
Assignee: Sangjin Lee
 Attachments: YARN-4025-YARN-2928.001.patch, 
 YARN-4025-YARN-2928.002.patch, YARN-4025-YARN-2928.003.patch, 
 YARN-4025-YARN-2928.004.patch


 Timestamps are being stored as Longs in HBase by the HBaseTimelineWriterImpl 
 code. There are some places in the code where values are converted from Long 
 to byte[] to String for easier argument passing between function calls, and 
 then these values end up being converted back to byte[] while storing in 
 HBase. 
 It would be better to pass around byte[] or the Longs themselves, as 
 applicable. 
 This may result in some API changes (the store function), as well as adding a 
 few more function calls, like a getColumnQualifier that accepts a pre-encoded 
 byte array in addition to the existing API that accepts a String, and having 
 ColumnHelper return a byte[] column name instead of a String one. 
 Filing this jira to track these changes.
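
 A tiny sketch of the round-trip in question versus the direct encoding 
 (illustrative only, not the writer code itself):
 {code}
 import org.apache.hadoop.hbase.util.Bytes;

 class LongEncodingDemo {
   public static void main(String[] args) {
     long ts = System.currentTimeMillis();
     // today: Long -> String -> byte[], converted back again when storing
     byte[] viaString = Bytes.toBytes(String.valueOf(ts)); // variable length
     // preferred: Long -> byte[] directly, a fixed 8-byte big-endian encoding
     byte[] direct = Bytes.toBytes(ts);                    // always 8 bytes
     System.out.println(viaString.length + " vs " + direct.length);
   }
 }
 {code}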



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4059) Preemption should delay assignments back to the preempted queue

2015-08-18 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated YARN-4059:
---
Attachment: YARN-4059.patch

 Preemption should delay assignments back to the preempted queue
 ---

 Key: YARN-4059
 URL: https://issues.apache.org/jira/browse/YARN-4059
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Chang Li
Assignee: Chang Li
 Attachments: YARN-4059.patch


 When preempting containers from a queue it can take a while for the other 
 queues to fully consume the resources that were freed up, due to delays 
 waiting for better locality, etc. Those delays can cause the resources to be 
 assigned back to the preempted queue, and then the preemption cycle continues.
 We should consider adding a delay, either based on node heartbeat counts or 
 time, to avoid granting containers to a queue that was recently preempted. 
 The delay should be sufficient to cover the cycles of the preemption monitor, 
 so we won't try to assign containers in-between preemption events for a queue.
 The worst-case scenario for assigning freed resources to other queues is when 
 all the other queues want no locality. No locality means only one container is 
 assigned per heartbeat, so we need to wait for the time it takes the entire 
 cluster to heartbeat in, multiplied by the number of containers that could run 
 on a single node.
 So the penalty time for a queue should be the max of either the preemption 
 monitor cycle time or the amount of time it takes to allocate the cluster 
 with one container per heartbeat. Guessing this will be somewhere around 2 
 minutes.
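
 A back-of-the-envelope version of that estimate (every number below is an 
 assumption for illustration, not from this JIRA):
 {code}
 class PreemptionPenaltyEstimate {
   public static void main(String[] args) {
     // each node assigns at most one container per heartbeat and all nodes
     // fill in parallel, so filling the cluster takes roughly
     // containersPerNode heartbeat intervals
     long heartbeatIntervalMs = 1000;   // assumed NM heartbeat interval
     int containersPerNode = 120;       // assumed: large node, small containers
     long fillTimeMs = containersPerNode * heartbeatIntervalMs;    // 120 s
     long monitorCycleMs = 15 * 1000;   // assumed preemption monitor period
     long penaltyMs = Math.max(fillTimeMs, monitorCycleMs);
     System.out.println("penalty ~ " + (penaltyMs / 1000) + "s"); // ~2 minutes
   }
 }
 {code}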



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-679) add an entry point that can start any Yarn service

2015-08-18 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-679:
---
Labels:   (was: BB2015-05-TBR)

 add an entry point that can start any Yarn service
 --

 Key: YARN-679
 URL: https://issues.apache.org/jira/browse/YARN-679
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: api
Affects Versions: 2.4.0
Reporter: Steve Loughran
Assignee: Steve Loughran
 Attachments: YARN-679-001.patch, YARN-679-002.patch, 
 YARN-679-002.patch, YARN-679-003.patch, YARN-679-004.patch, 
 org.apache.hadoop.servic...mon 3.0.0-SNAPSHOT API).pdf

  Time Spent: 72h
  Remaining Estimate: 0h

 There's no need to write separate .main classes for every Yarn service, given 
 that the startup mechanism should be identical: create, init, start, wait for 
 stopped, with an interrupt handler to trigger a clean shutdown on a control-c 
 interrupt.
 Provide one that takes any classname and a list of config files/options.
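
 A minimal sketch of such an entry point (an assumed shape following the 
 description above, not the attached patches):
 {code}
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.fs.Path;
 import org.apache.hadoop.service.Service;

 // args[0] is the service classname; the remaining args are config files.
 public class GenericServiceMain {
   public static void main(String[] args) throws Exception {
     Configuration conf = new Configuration();
     for (int i = 1; i < args.length; i++) {
       conf.addResource(new Path(args[i]));       // config files/options
     }
     final Service service = (Service) Class.forName(args[0]).newInstance();
     Runtime.getRuntime().addShutdownHook(new Thread() {
       @Override
       public void run() {
         service.stop();                          // clean stop on control-c
       }
     });
     service.init(conf);
     service.start();
     service.waitForServiceToStop(0);             // block until stopped
   }
 }
 {code}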



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3223) Resource update during NM graceful decommission

2015-08-18 Thread Brook Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brook Zhou updated YARN-3223:
-
Attachment: YARN-3223-v0.1.patch

Contains tests and formatting changes.

 Resource update during NM graceful decommission
 ---

 Key: YARN-3223
 URL: https://issues.apache.org/jira/browse/YARN-3223
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager, resourcemanager
Affects Versions: 2.7.1
Reporter: Junping Du
Assignee: Brook Zhou
 Attachments: YARN-3223-v0.1.patch, YARN-3223-v0.patch


 During NM graceful decommission, we should handle resource updates properly. 
 This includes: making RMNode keep track of the old resource for possible 
 rollback, keeping the available resource at 0, and updating the used resource 
 when containers finish.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4057) If ContainersMonitor is not enabled, only print related log info one time

2015-08-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701971#comment-14701971
 ] 

Hudson commented on YARN-4057:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8318 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8318/])
YARN-4057. If ContainersMonitor is not enabled, only print related log info one 
time. Contributed by Jun Gong. (zxu: rev 
14215c8ef83d58b8443c52a3cb93e6d44fc87065)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainersMonitorImpl.java


 If ContainersMonitor is not enabled, only print related log info one time
 -

 Key: YARN-4057
 URL: https://issues.apache.org/jira/browse/YARN-4057
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Reporter: Jun Gong
Assignee: Jun Gong
Priority: Minor
 Fix For: 2.8.0

 Attachments: YARN-4057.01.patch


 ContainersMonitorImpl checks whether it is enabled when handling every 
 event, and it prints the following message again and again if it is not enabled:
 {quote}
 2015-08-17 13:20:13,792 INFO 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl:
  Neither virutal-memory nor physical-memory is needed. Not running the 
 monitor-thread
 {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2005) Blacklisting support for scheduling AMs

2015-08-18 Thread Anubhav Dhoot (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anubhav Dhoot updated YARN-2005:

Attachment: YARN-2005.006.patch

Fixed the YarnConfiguration unit test. The other failure is not happening 
locally for me.

 Blacklisting support for scheduling AMs
 ---

 Key: YARN-2005
 URL: https://issues.apache.org/jira/browse/YARN-2005
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 0.23.10, 2.4.0
Reporter: Jason Lowe
Assignee: Anubhav Dhoot
 Attachments: YARN-2005.001.patch, YARN-2005.002.patch, 
 YARN-2005.003.patch, YARN-2005.004.patch, YARN-2005.005.patch, 
 YARN-2005.006.patch


 It would be nice if the RM supported blacklisting a node for an AM launch 
 after the same node fails a configurable number of AM attempts.  This would 
 be similar to the blacklisting support for scheduling task attempts in the 
 MapReduce AM but for scheduling AM attempts on the RM side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4059) Preemption should delay assignments back to the preempted queue

2015-08-18 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated YARN-4059:
---
Issue Type: Improvement  (was: Bug)

 Preemption should delay assignments back to the preempted queue
 ---

 Key: YARN-4059
 URL: https://issues.apache.org/jira/browse/YARN-4059
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Chang Li
Assignee: Chang Li

 When preempting containers from a queue it can take a while for the other 
 queues to fully consume the resources that were freed up, due to delays 
 waiting for better locality, etc. Those delays can cause the resources to be 
 assigned back to the preempted queue, and then the preemption cycle continues.
 We should consider adding a delay, either based on node heartbeat counts or 
 time, to avoid granting containers to a queue that was recently preempted. 
 The delay should be sufficient to cover the cycles of the preemption monitor, 
 so we won't try to assign containers in-between preemption events for a queue.
 The worst-case scenario for assigning freed resources to other queues is when 
 all the other queues want no locality. No locality means only one container is 
 assigned per heartbeat, so we need to wait for the time it takes the entire 
 cluster to heartbeat in, multiplied by the number of containers that could run 
 on a single node.
 So the penalty time for a queue should be the max of either the preemption 
 monitor cycle time or the amount of time it takes to allocate the cluster 
 with one container per heartbeat. Guessing this will be somewhere around 2 
 minutes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3814) REST API implementation for getting raw entities in TimelineReader

2015-08-18 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701893#comment-14701893
 ] 

Li Lu commented on YARN-3814:
-

Hi [~varun_saxena], thanks for the patch! I think the patch is mostly good, 
with only a few nits:

- In TimelineReaderManager
  - Configuration and YarnConfiguration appear to be unused. 
  - callerUGI is not used and not documented. What's our plan for that? How do 
we set the caller UGI for now?

- In TimelineReaderWebServices, 
  - Can we have two constants for the default delimiters? Right now we're 
spreading them through the source code like:
  {code}
  parseKeyStrValuesStr(relatesTo, ",", ":"),
  parseKeyStrValuesStr(isRelatedTo, ",", ":"),
  parseKeyStrValueObj(infofilters, ",", ":"),
  parseKeyStrValueStr(conffilters, ",", ":"),
  parseValuesStr(metricfilters, ","), parseValuesStr(eventfilters, ","),
  parseFieldsStr(fields, ","), callerUGI);
  {code}
  A similar problem also occurs on line 280 after the patch. 
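
One possible shape for the suggested constants (the names are illustrative, 
not prescribed):
{code}
// Illustrative names for the two default delimiters.
private static final String COMMA_DELIMITER = ",";
private static final String COLON_DELIMITER = ":";

parseKeyStrValuesStr(relatesTo, COMMA_DELIMITER, COLON_DELIMITER);
parseValuesStr(metricfilters, COMMA_DELIMITER);
{code}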

 REST API implementation for getting raw entities in TimelineReader
 --

 Key: YARN-3814
 URL: https://issues.apache.org/jira/browse/YARN-3814
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: YARN-2928
Reporter: Varun Saxena
Assignee: Varun Saxena
 Attachments: YARN-3814-YARN-2928.01.patch, 
 YARN-3814-YARN-2928.02.patch, YARN-3814-YARN-2928.03.patch, 
 YARN-3814-YARN-2928.04.patch, YARN-3814-YARN-2928.05.patch, 
 YARN-3814.reference.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2005) Blacklisting support for scheduling AMs

2015-08-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702337#comment-14702337
 ] 

Hadoop QA commented on YARN-2005:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |   3m 22s | trunk compilation may be 
broken. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 7 new or modified test files. |
| {color:red}-1{color} | javac |   2m 25s | The patch appears to cause the 
build to fail. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12751130/YARN-2005.006.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 7ecbfd4 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8881/console |


This message was automatically generated.

 Blacklisting support for scheduling AMs
 ---

 Key: YARN-2005
 URL: https://issues.apache.org/jira/browse/YARN-2005
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 0.23.10, 2.4.0
Reporter: Jason Lowe
Assignee: Anubhav Dhoot
 Attachments: YARN-2005.001.patch, YARN-2005.002.patch, 
 YARN-2005.003.patch, YARN-2005.004.patch, YARN-2005.005.patch, 
 YARN-2005.006.patch


 It would be nice if the RM supported blacklisting a node for an AM launch 
 after the same node fails a configurable number of AM attempts.  This would 
 be similar to the blacklisting support for scheduling task attempts in the 
 MapReduce AM but for scheduling AM attempts on the RM side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3814) REST API implementation for getting raw entities in TimelineReader

2015-08-18 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702370#comment-14702370
 ] 

Varun Saxena commented on YARN-3814:


Yes, we won't use it as of now.

 REST API implementation for getting raw entities in TimelineReader
 --

 Key: YARN-3814
 URL: https://issues.apache.org/jira/browse/YARN-3814
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: YARN-2928
Reporter: Varun Saxena
Assignee: Varun Saxena
 Attachments: YARN-3814-YARN-2928.01.patch, 
 YARN-3814-YARN-2928.02.patch, YARN-3814-YARN-2928.03.patch, 
 YARN-3814-YARN-2928.04.patch, YARN-3814-YARN-2928.05.patch, 
 YARN-3814-YARN-2928.06.patch, YARN-3814.reference.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2884) Proxying all AM-RM communications

2015-08-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702397#comment-14702397
 ] 

Hadoop QA commented on YARN-2884:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  21m 27s | Pre-patch trunk has 7 extant 
Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 6 new or modified test files. |
| {color:green}+1{color} | javac |   7m 48s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 39s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 24s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   2m 31s | The applied patch generated  1 
new checkstyle issues (total was 237, now 237). |
| {color:green}+1{color} | whitespace |   0m  2s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 29s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   6m 54s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   0m 23s | Tests passed in 
hadoop-yarn-api. |
| {color:red}-1{color} | yarn tests |   1m 57s | Tests failed in 
hadoop-yarn-common. |
| {color:green}+1{color} | yarn tests |   0m 25s | Tests passed in 
hadoop-yarn-server-common. |
| {color:green}+1{color} | yarn tests |   6m 14s | Tests passed in 
hadoop-yarn-server-nodemanager. |
| {color:red}-1{color} | yarn tests |  53m 11s | Tests failed in 
hadoop-yarn-server-resourcemanager. |
| | | 114m 10s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.yarn.util.TestRackResolver |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerAllocation |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12751175/YARN-2884-V9.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 30e342a |
| Pre-patch Findbugs warnings | 
https://builds.apache.org/job/PreCommit-YARN-Build/8880/artifact/patchprocess/trunkFindbugsWarningshadoop-yarn-server-common.html
 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/8880/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt
 |
| hadoop-yarn-api test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8880/artifact/patchprocess/testrun_hadoop-yarn-api.txt
 |
| hadoop-yarn-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8880/artifact/patchprocess/testrun_hadoop-yarn-common.txt
 |
| hadoop-yarn-server-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8880/artifact/patchprocess/testrun_hadoop-yarn-server-common.txt
 |
| hadoop-yarn-server-nodemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8880/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8880/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8880/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8880/console |


This message was automatically generated.

 Proxying all AM-RM communications
 -

 Key: YARN-2884
 URL: https://issues.apache.org/jira/browse/YARN-2884
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager, resourcemanager
Reporter: Carlo Curino
Assignee: Kishore Chaliparambil
 Attachments: YARN-2884-V1.patch, YARN-2884-V2.patch, 
 YARN-2884-V3.patch, YARN-2884-V4.patch, YARN-2884-V5.patch, 
 YARN-2884-V6.patch, YARN-2884-V7.patch, YARN-2884-V8.patch, YARN-2884-V9.patch


 We introduce the notion of an RMProxy, running on each node (or once per 
 rack). Upon start, the AM is forced (via tokens and configuration) to direct 
 all its requests to a new service running on the NM that provides a proxy to 
 the central RM. 
 This gives us a place to:
 1) perform distributed scheduling decisions
 2) throttle misbehaving AMs
 3) mask access to a federation of RMs



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3814) REST API implementation for getting raw entities in TimelineReader

2015-08-18 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702306#comment-14702306
 ] 

Varun Saxena commented on YARN-3814:


As I have now removed it, we can add it later when we do ACLs. Is that fine?


 REST API implementation for getting raw entities in TimelineReader
 --

 Key: YARN-3814
 URL: https://issues.apache.org/jira/browse/YARN-3814
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: YARN-2928
Reporter: Varun Saxena
Assignee: Varun Saxena
 Attachments: YARN-3814-YARN-2928.01.patch, 
 YARN-3814-YARN-2928.02.patch, YARN-3814-YARN-2928.03.patch, 
 YARN-3814-YARN-2928.04.patch, YARN-3814-YARN-2928.05.patch, 
 YARN-3814-YARN-2928.06.patch, YARN-3814.reference.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3814) REST API implementation for getting raw entities in TimelineReader

2015-08-18 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702346#comment-14702346
 ] 

Li Lu commented on YARN-3814:
-

OK... If we're sure we will not use that method elsewhere, LGTM. 

 REST API implementation for getting raw entities in TimelineReader
 --

 Key: YARN-3814
 URL: https://issues.apache.org/jira/browse/YARN-3814
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: YARN-2928
Reporter: Varun Saxena
Assignee: Varun Saxena
 Attachments: YARN-3814-YARN-2928.01.patch, 
 YARN-3814-YARN-2928.02.patch, YARN-3814-YARN-2928.03.patch, 
 YARN-3814-YARN-2928.04.patch, YARN-3814-YARN-2928.05.patch, 
 YARN-3814-YARN-2928.06.patch, YARN-3814.reference.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4014) Support user cli interface in for Application Priority

2015-08-18 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702421#comment-14702421
 ] 

Rohith Sharma K S commented on YARN-4014:
-

The test failures are unrelated to this patch.

 Support user cli interface in for Application Priority
 --

 Key: YARN-4014
 URL: https://issues.apache.org/jira/browse/YARN-4014
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: client, resourcemanager
Reporter: Rohith Sharma K S
Assignee: Rohith Sharma K S
 Attachments: 0001-YARN-4014-V1.patch, 0001-YARN-4014.patch, 
 0002-YARN-4014.patch, 0003-YARN-4014.patch, 0004-YARN-4014.patch, 
 0004-YARN-4014.patch


 Track the changes for user-RM client protocol i.e ApplicationClientProtocol 
 changes and discussions in this jira.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4060) Revisit default retry config for connection with RM

2015-08-18 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702308#comment-14702308
 ] 

Jian He commented on YARN-4060:
---

bq. Is it considered backwards compatible to change defaults?
 It should be fine, IMO

 Revisit default retry config for connection with RM 
 

 Key: YARN-4060
 URL: https://issues.apache.org/jira/browse/YARN-4060
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Jian He
Assignee: Jian He

 The 15-minute timeout for the AM/NM connection with the RM in the non-HA 
 scenario turns out to be too short in production environments. The suggestion 
 is to increase that to 30 minutes. Also, the retry interval is set to 30 
 seconds, which appears too long; we may reduce that to 10 seconds?
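
 For reference, a sketch of setting the two values under discussion explicitly 
 (these are the standard RM connection keys; the values follow the proposal 
 above):
 {code}
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.yarn.conf.YarnConfiguration;

 class RmConnectionDefaults {
   static Configuration proposed() {
     Configuration conf = new YarnConfiguration();
     // proposed: raise the total wait from 15 to 30 minutes
     conf.setLong("yarn.resourcemanager.connect.max-wait.ms", 30 * 60 * 1000);
     // proposed: drop the retry interval from 30 to 10 seconds
     conf.setLong("yarn.resourcemanager.connect.retry-interval.ms", 10 * 1000);
     return conf;
   }
 }
 {code}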



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3814) REST API implementation for getting raw entities in TimelineReader

2015-08-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702426#comment-14702426
 ] 

Hadoop QA commented on YARN-3814:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  16m 34s | Findbugs (version ) appears to 
be broken on YARN-2928. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   8m 10s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 55s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 16s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m 14s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 31s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 42s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   0m 52s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   1m 35s | Tests passed in 
hadoop-yarn-server-timelineservice. |
| | |  40m 20s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12751178/YARN-3814-YARN-2928.06.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | YARN-2928 / 9a82008 |
| hadoop-yarn-server-timelineservice test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8882/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8882/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf908.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8882/console |


This message was automatically generated.

 REST API implementation for getting raw entities in TimelineReader
 --

 Key: YARN-3814
 URL: https://issues.apache.org/jira/browse/YARN-3814
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: YARN-2928
Reporter: Varun Saxena
Assignee: Varun Saxena
 Attachments: YARN-3814-YARN-2928.01.patch, 
 YARN-3814-YARN-2928.02.patch, YARN-3814-YARN-2928.03.patch, 
 YARN-3814-YARN-2928.04.patch, YARN-3814-YARN-2928.05.patch, 
 YARN-3814-YARN-2928.06.patch, YARN-3814.reference.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4060) Revisit default retry config for connection with RM

2015-08-18 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702186#comment-14702186
 ] 

Karthik Kambatla commented on YARN-4060:


I am in favor of the change. Is it considered backwards compatible to change 
defaults? 

 Revisit default retry config for connection with RM 
 

 Key: YARN-4060
 URL: https://issues.apache.org/jira/browse/YARN-4060
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Jian He
Assignee: Jian He

 The 15-minute timeout for the AM/NM connection with the RM in the non-HA 
 scenario turns out to be too short in production environments. The suggestion 
 is to increase that to 30 minutes. Also, the retry interval is set to 30 
 seconds, which appears too long; we may reduce that to 10 seconds?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4025) Deal with byte representations of Longs in writer code

2015-08-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702212#comment-14702212
 ] 

Hadoop QA commented on YARN-4025:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  16m 58s | Findbugs (version ) appears to 
be broken on YARN-2928. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 2 new or modified test files. |
| {color:green}+1{color} | javac |   8m 35s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 44s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 24s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 18s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  5s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 30s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 42s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   0m 52s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   1m 29s | Tests passed in 
hadoop-yarn-server-timelineservice. |
| | |  41m 45s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12751118/YARN-4025-YARN-2928.004.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | YARN-2928 / 9a82008 |
| hadoop-yarn-server-timelineservice test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8879/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8879/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8879/console |


This message was automatically generated.

 Deal with byte representations of Longs in writer code
 --

 Key: YARN-4025
 URL: https://issues.apache.org/jira/browse/YARN-4025
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Vrushali C
Assignee: Sangjin Lee
 Attachments: YARN-4025-YARN-2928.001.patch, 
 YARN-4025-YARN-2928.002.patch, YARN-4025-YARN-2928.003.patch, 
 YARN-4025-YARN-2928.004.patch


 Timestamps are being stored as Longs in HBase by the HBaseTimelineWriterImpl 
 code. There are some places in the code where values are converted from Long 
 to byte[] to String for easier argument passing between function calls, and 
 then these values end up being converted back to byte[] while storing in 
 HBase. 
 It would be better to pass around byte[] or the Longs themselves, as 
 applicable. 
 This may result in some API changes (the store function), as well as adding a 
 few more function calls, like a getColumnQualifier that accepts a pre-encoded 
 byte array in addition to the existing API that accepts a String, and having 
 ColumnHelper return a byte[] column name instead of a String one. 
 Filing this jira to track these changes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3814) REST API implementation for getting raw entities in TimelineReader

2015-08-18 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702253#comment-14702253
 ] 

Varun Saxena commented on YARN-3814:


bq. Can we have two constants for the default delimiters?
Ok.

bq. callerUGI is not used and not documented. What's our plan for that? How do 
we set the caller UGI for now?
callerUGI will be used for applying ACLs. It is currently set in 
TimelineReaderWebServices. It can be removed for now.

bq. Configuration and YarnConfiguration appear to be unused.
Will remove the imports.

 REST API implementation for getting raw entities in TimelineReader
 --

 Key: YARN-3814
 URL: https://issues.apache.org/jira/browse/YARN-3814
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: YARN-2928
Reporter: Varun Saxena
Assignee: Varun Saxena
 Attachments: YARN-3814-YARN-2928.01.patch, 
 YARN-3814-YARN-2928.02.patch, YARN-3814-YARN-2928.03.patch, 
 YARN-3814-YARN-2928.04.patch, YARN-3814-YARN-2928.05.patch, 
 YARN-3814.reference.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3814) REST API implementation for getting raw entities in TimelineReader

2015-08-18 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702266#comment-14702266
 ] 

Li Lu commented on YARN-3814:
-

bq. callerUGI will be used for applying ACLs. It is currently set in 
TimelineReaderWebServices. It can be removed for now.
Yes, I know it's used for applying ACLs. We do have a plan to support security 
in the (possibly near) future, and by then the UGI info will become useful. 
That's actually why I'm not suggesting removing it but instead documenting our 
current intentions for it. So I'd incline not to remove it for now, but to 
make our current assumptions/requirements on it clear.

 REST API implementation for getting raw entities in TimelineReader
 --

 Key: YARN-3814
 URL: https://issues.apache.org/jira/browse/YARN-3814
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: YARN-2928
Reporter: Varun Saxena
Assignee: Varun Saxena
 Attachments: YARN-3814-YARN-2928.01.patch, 
 YARN-3814-YARN-2928.02.patch, YARN-3814-YARN-2928.03.patch, 
 YARN-3814-YARN-2928.04.patch, YARN-3814-YARN-2928.05.patch, 
 YARN-3814-YARN-2928.06.patch, YARN-3814.reference.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4057) If ContainersMonitor is not enabled, only print related log info one time

2015-08-18 Thread Jun Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702247#comment-14702247
 ] 

Jun Gong commented on YARN-4057:


Thanks [~zxu] for the review and commit!

 If ContainersMonitor is not enabled, only print related log info one time
 -

 Key: YARN-4057
 URL: https://issues.apache.org/jira/browse/YARN-4057
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Reporter: Jun Gong
Assignee: Jun Gong
Priority: Minor
 Fix For: 2.8.0

 Attachments: YARN-4057.01.patch


 ContainersMonitorImpl checks whether it is enabled when handling every 
 event, and it prints the following message again and again if it is not enabled:
 {quote}
 2015-08-17 13:20:13,792 INFO 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl:
  Neither virutal-memory nor physical-memory is needed. Not running the 
 monitor-thread
 {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3814) REST API implementation for getting raw entities in TimelineReader

2015-08-18 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-3814:
---
Attachment: YARN-3814-YARN-2928.06.patch

Added constants, removed unused imports and unused callerUGI in 
TimelineReaderManager

 REST API implementation for getting raw entities in TimelineReader
 --

 Key: YARN-3814
 URL: https://issues.apache.org/jira/browse/YARN-3814
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: YARN-2928
Reporter: Varun Saxena
Assignee: Varun Saxena
 Attachments: YARN-3814-YARN-2928.01.patch, 
 YARN-3814-YARN-2928.02.patch, YARN-3814-YARN-2928.03.patch, 
 YARN-3814-YARN-2928.04.patch, YARN-3814-YARN-2928.05.patch, 
 YARN-3814-YARN-2928.06.patch, YARN-3814.reference.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4025) Deal with byte representations of Longs in writer code

2015-08-18 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702274#comment-14702274
 ] 

Junping Du commented on YARN-4025:
--

bq. We can treat that method as the default readResults() method, and the 
other one as the one for having raw (non-string) components. I added more 
javadoc to clarify that point. 
Sounds good. Thanks for addressing this and other review comments.

+1 on latest (004) patch. Will commit it shortly if no further comments from 
others.

 Deal with byte representations of Longs in writer code
 --

 Key: YARN-4025
 URL: https://issues.apache.org/jira/browse/YARN-4025
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Vrushali C
Assignee: Sangjin Lee
 Attachments: YARN-4025-YARN-2928.001.patch, 
 YARN-4025-YARN-2928.002.patch, YARN-4025-YARN-2928.003.patch, 
 YARN-4025-YARN-2928.004.patch


 Timestamps are being stored as Longs in HBase by the HBaseTimelineWriterImpl 
 code. There are some places in the code where values are converted from Long 
 to byte[] to String for easier argument passing between function calls, and 
 then these values end up being converted back to byte[] while storing in 
 HBase. 
 It would be better to pass around byte[] or the Longs themselves, as 
 applicable. 
 This may result in some API changes (the store function), as well as adding a 
 few more function calls, like a getColumnQualifier that accepts a pre-encoded 
 byte array in addition to the existing API that accepts a String, and having 
 ColumnHelper return a byte[] column name instead of a String one. 
 Filing this jira to track these changes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-4061) [Fault tolerance] Fault tolerant writer for timeline v2

2015-08-18 Thread Li Lu (JIRA)
Li Lu created YARN-4061:
---

 Summary: [Fault tolerance] Fault tolerant writer for timeline v2
 Key: YARN-4061
 URL: https://issues.apache.org/jira/browse/YARN-4061
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Li Lu
Assignee: Li Lu


We need to build a timeline writer that is resilient to backend storage 
downtime and timeline collector failures. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-4060) Revisit default retry config for connection with RM

2015-08-18 Thread Jian He (JIRA)
Jian He created YARN-4060:
-

 Summary: Revisit default retry config for connection with RM 
 Key: YARN-4060
 URL: https://issues.apache.org/jira/browse/YARN-4060
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Jian He
Assignee: Jian He


The 15-minute timeout for the AM/NM connection with the RM in the non-HA 
scenario turns out to be too short in production environments. The suggestion 
is to increase that to 30 minutes. Also, the retry interval is set to 30 
seconds, which appears too long; we may reduce that to 10 seconds?





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2884) Proxying all AM-RM communications

2015-08-18 Thread Kishore Chaliparambil (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kishore Chaliparambil updated YARN-2884:

Attachment: YARN-2884-V9.patch

Thanks [~jianhe] for reviewing the patch. I have uploaded a new patch that 
addresses all your comments.

 Proxying all AM-RM communications
 -

 Key: YARN-2884
 URL: https://issues.apache.org/jira/browse/YARN-2884
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager, resourcemanager
Reporter: Carlo Curino
Assignee: Kishore Chaliparambil
 Attachments: YARN-2884-V1.patch, YARN-2884-V2.patch, 
 YARN-2884-V3.patch, YARN-2884-V4.patch, YARN-2884-V5.patch, 
 YARN-2884-V6.patch, YARN-2884-V7.patch, YARN-2884-V8.patch, YARN-2884-V9.patch


 We introduce the notion of an RMProxy, running on each node (or once per 
 rack). Upon start, the AM is forced (via tokens and configuration) to direct 
 all its requests to a new service running on the NM that provides a proxy to 
 the central RM. 
 This gives us a place to:
 1) perform distributed scheduling decisions
 2) throttle misbehaving AMs
 3) mask access to a federation of RMs



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-3422) relatedentities always return empty list when primary filter is set

2015-08-18 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li resolved YARN-3422.

Resolution: Won't Fix

 relatedentities always return empty list when primary filter is set
 ---

 Key: YARN-3422
 URL: https://issues.apache.org/jira/browse/YARN-3422
 Project: Hadoop YARN
  Issue Type: Bug
  Components: timelineserver
Reporter: Chang Li
Assignee: Chang Li
 Attachments: YARN-3422.1.patch


 When you curl for ATS entities with a primary filter, the relatedentities 
 field always returns an empty list.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4059) Preemption should delay assignments back to the preempted queue

2015-08-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702258#comment-14702258
 ] 

Hadoop QA commented on YARN-4059:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  18m  2s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 2 new or modified test files. |
| {color:green}+1{color} | javac |   8m 31s | There were no new javac warning 
messages. |
| {color:red}-1{color} | javadoc |  10m 33s | The applied patch generated  1  
additional warning messages. |
| {color:green}+1{color} | release audit |   0m 24s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m  2s | The applied patch generated  4 
new checkstyle issues (total was 184, now 188). |
| {color:red}-1{color} | whitespace |   0m  1s | The patch has 13  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 29s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 35s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 37s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | yarn tests |  53m 29s | Tests failed in 
hadoop-yarn-server-resourcemanager. |
| | |  95m 47s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerAllocation |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12751097/YARN-4059.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 71aedfa |
| javadoc | 
https://builds.apache.org/job/PreCommit-YARN-Build/8878/artifact/patchprocess/diffJavadocWarnings.txt
 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/8878/artifact/patchprocess/diffcheckstylehadoop-yarn-server-resourcemanager.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-YARN-Build/8878/artifact/patchprocess/whitespace.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8878/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8878/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8878/console |


This message was automatically generated.

 Preemption should delay assignments back to the preempted queue
 ---

 Key: YARN-4059
 URL: https://issues.apache.org/jira/browse/YARN-4059
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Chang Li
Assignee: Chang Li
 Attachments: YARN-4059.patch


 When preempting containers from a queue it can take a while for the other 
 queues to fully consume the resources that were freed up, due to delays 
 waiting for better locality, etc. Those delays can cause the resources to be 
 assigned back to the preempted queue, and then the preemption cycle continues.
 We should consider adding a delay, either based on node heartbeat counts or 
 time, to avoid granting containers to a queue that was recently preempted. 
 The delay should be sufficient to cover the cycles of the preemption monitor, 
 so we won't try to assign containers in-between preemption events for a queue.
 The worst-case scenario for assigning freed resources to other queues is when 
 all the other queues want no locality. No locality means only one container is 
 assigned per heartbeat, so we need to wait for the time it takes the entire 
 cluster to heartbeat in, multiplied by the number of containers that could run 
 on a single node.
 So the penalty time for a queue should be the max of either the preemption 
 monitor cycle time or the amount of time it takes to allocate the cluster 
 with one container per heartbeat. Guessing this will be somewhere around 2 
 minutes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-221) NM should provide a way for AM to tell it not to aggregate logs.

2015-08-18 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702523#comment-14702523
 ] 

Xuan Gong commented on YARN-221:


+1. The last patch looks good to me. Let us wait a few days; if there are no 
other comments, I will commit this over the weekend.

[~mingma] In the meantime, could you please open a related MR ticket and link 
it here?

 NM should provide a way for AM to tell it not to aggregate logs.
 

 Key: YARN-221
 URL: https://issues.apache.org/jira/browse/YARN-221
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: log-aggregation, nodemanager
Reporter: Robert Joseph Evans
Assignee: Ming Ma
 Attachments: YARN-221-6.patch, YARN-221-7.patch, YARN-221-8.patch, 
 YARN-221-9.patch, YARN-221-trunk-v1.patch, YARN-221-trunk-v2.patch, 
 YARN-221-trunk-v3.patch, YARN-221-trunk-v4.patch, YARN-221-trunk-v5.patch


 The NodeManager should provide a way for an AM to tell it that either the 
 logs should not be aggregated, that they should be aggregated with a high 
 priority, or that they should be aggregated but with a lower priority.  The 
 AM should be able to do this in the ContainerLaunchContext to provide a 
 default value, but should also be able to update the value when the 
 container is released.
 This would allow the NM to skip aggregating logs in some cases and avoid 
 connecting to the NN at all.
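
For illustration only, a hypothetical shape such an API could take (the type 
and method names below are invented for this sketch and are not the API in 
the attached patches):

{code}
// Hypothetical sketch; names are illustrative, not the patch's API.
enum ContainerLogAggregationPolicy {
  DO_NOT_AGGREGATE,   // skip aggregation entirely, avoiding any NN connection
  AGGREGATE_HIGH,     // aggregate promptly once the container finishes
  AGGREGATE_LOW       // aggregate lazily, at lower priority
}

interface ContainerLogContext {
  // Default supplied at launch time, alongside the ContainerLaunchContext...
  void setLogAggregationPolicy(ContainerLogAggregationPolicy policy);

  // ...and readable (and updatable) when the container is released.
  ContainerLogAggregationPolicy getLogAggregationPolicy();
}
{code}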



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4024) YARN RM should avoid unnecessary resolving IP when NMs doing heartbeat

2015-08-18 Thread Hong Zhiguo (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hong Zhiguo updated YARN-4024:
--
Attachment: YARN-4024-v4.patch

Thanks for your comments, [~leftnoteasy]. I didn't notice there are already 
such events. I have updated the patch accordingly.

 YARN RM should avoid unnecessary resolving IP when NMs doing heartbeat
 --

 Key: YARN-4024
 URL: https://issues.apache.org/jira/browse/YARN-4024
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Wangda Tan
Assignee: Hong Zhiguo
 Attachments: YARN-4024-draft-v2.patch, YARN-4024-draft-v3.patch, 
 YARN-4024-draft.patch, YARN-4024-v4.patch


 Currently, the YARN RM NodesListManager resolves the IP address every time a 
 node heartbeats. When the DNS server becomes slow, NM heartbeats are blocked 
 and cannot make progress.
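
A minimal sketch of the caching idea, assuming the per-heartbeat DNS lookup 
is the expensive step (class and method names are illustrative, not taken 
from the attached patches):

{code}
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class CachedHostResolver {
  private final ConcurrentMap<String, String> hostToIp = new ConcurrentHashMap<>();

  /** Resolve each host once; subsequent heartbeats hit the cache, not DNS. */
  String resolve(String hostName) {
    String ip = hostToIp.computeIfAbsent(hostName, h -> {
      try {
        return InetAddress.getByName(h).getHostAddress();
      } catch (UnknownHostException e) {
        return null; // nothing cached, so the next heartbeat retries
      }
    });
    return ip != null ? ip : hostName;
  }

  /** Invalidate when a node is removed or the include/exclude lists refresh. */
  void invalidate(String hostName) {
    hostToIp.remove(hostName);
  }
}
{code}

Per the comment above, the v4 patch reportedly drives invalidation from the 
existing node events rather than a timer.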



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3986) getTransferredContainers in AbstractYarnScheduler should be present in YarnScheduler interface instead

2015-08-18 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702485#comment-14702485
 ] 

Varun Saxena commented on YARN-3986:


There is a JIRA raised for the TestContainerAllocation failure. It's unrelated.

 getTransferredContainers in AbstractYarnScheduler should be present in 
 YarnScheduler interface instead
 --

 Key: YARN-3986
 URL: https://issues.apache.org/jira/browse/YARN-3986
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.7.0
Reporter: Varun Saxena
Assignee: Varun Saxena
 Attachments: YARN-3986.01.patch, YARN-3986.02.patch, 
 YARN-3986.03.patch


 Currently getTransferredContainers is present in {{AbstractYarnScheduler}}.
 *But in ApplicationMasterService, while registering an AM, we call this 
 method by typecasting the scheduler to AbstractYarnScheduler, which is 
 incorrect.*
 This method should be moved to YarnScheduler, because a custom scheduler 
 will implement YarnScheduler, not AbstractYarnScheduler.
 As ApplicationMasterService calls getTransferredContainers through that 
 typecast, it imposes an indirect dependency on AbstractYarnScheduler for any 
 pluggable custom scheduler.
 We can move the method declaration to YarnScheduler and leave the definition 
 in AbstractYarnScheduler as it is.
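
Roughly, the proposed move would look like this (simplified, sketch-only 
signatures):

{code}
import java.util.List;
import org.apache.hadoop.yarn.api.records.ApplicationAttemptId;
import org.apache.hadoop.yarn.api.records.Container;

// Sketch: declare the method on the interface so callers need no downcast.
public interface YarnScheduler {
  // ... existing scheduler methods ...
  List<Container> getTransferredContainers(ApplicationAttemptId currentAttempt);
}

// ApplicationMasterService can then call it through the interface:
//   List<Container> transferred =
//       scheduler.getTransferredContainers(appAttemptId);
// instead of:
//   ((AbstractYarnScheduler) scheduler).getTransferredContainers(appAttemptId);
{code}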



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4028) AppBlock page key update and diagnostics value null on recovery

2015-08-18 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702534#comment-14702534
 ] 

Xuan Gong commented on YARN-4028:
-

+1 LGTM. Checking this in

 AppBlock page key update and diagnostics value null on recovery
 ---

 Key: YARN-4028
 URL: https://issues.apache.org/jira/browse/YARN-4028
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Bibin A Chundatt
Assignee: Bibin A Chundatt
Priority: Minor
 Attachments: 0001-YARN-4028.patch, 0002-YARN-4028.patch, Image.jpg


 All keys end with *:*; add the same to *Log Aggregation Status* for 
 consistency.
 Also, the Diagnostics value is shown as null on recovery.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4028) AppBlock page key update and diagnostics value null on recovery

2015-08-18 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702544#comment-14702544
 ] 

Xuan Gong commented on YARN-4028:
-

Committed into trunk/branch-2. Thanks, Bibin A Chundatt.

 AppBlock page key update and diagnostics value null on recovery
 ---

 Key: YARN-4028
 URL: https://issues.apache.org/jira/browse/YARN-4028
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Bibin A Chundatt
Assignee: Bibin A Chundatt
Priority: Minor
 Attachments: 0001-YARN-4028.patch, 0002-YARN-4028.patch, Image.jpg


 All keys end with *:*; add the same to *Log Aggregation Status* for 
 consistency.
 Also, the Diagnostics value is shown as null on recovery.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4014) Support user cli interface for Application Priority

2015-08-18 Thread Rohith Sharma K S (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith Sharma K S updated YARN-4014:

Attachment: 0004-YARN-4014.patch

Updating the patch with fixes for the javadoc issues. Kicking off Jenkins.

 Support user cli interface for Application Priority
 --

 Key: YARN-4014
 URL: https://issues.apache.org/jira/browse/YARN-4014
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: client, resourcemanager
Reporter: Rohith Sharma K S
Assignee: Rohith Sharma K S
 Attachments: 0001-YARN-4014-V1.patch, 0001-YARN-4014.patch, 
 0002-YARN-4014.patch, 0003-YARN-4014.patch, 0004-YARN-4014.patch, 
 0004-YARN-4014.patch


 Track the user-RM client protocol (i.e., ApplicationClientProtocol) changes 
 and discussions in this JIRA.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3250) Support admin cli interface for Application Priority

2015-08-18 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701292#comment-14701292
 ] 

Rohith Sharma K S commented on YARN-3250:
-

[~sunilg] [~jianhe] would you have a look at the patch, please? I will rebase 
the patch based on the review comments.

 Support admin cli interface for Application Priority
 ---

 Key: YARN-3250
 URL: https://issues.apache.org/jira/browse/YARN-3250
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Sunil G
Assignee: Rohith Sharma K S
 Attachments: 0001-YARN-3250-V1.patch, 0002-YARN-3250.patch


 The current Application Priority Manager supports configuration only via 
 file. To support runtime configuration through the admin CLI and REST, a 
 common management interface has to be added which can be shared with the 
 NodeLabelsManager.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3652) A SchedulerMetrics may be needed for evaluating the scheduler's performance

2015-08-18 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701291#comment-14701291
 ] 

Xianyin Xin commented on YARN-3652:
---

A brief introduction to the preview patch: SchedulerMetrics focuses on 
metrics related to the scheduler's performance. The following metrics are 
considered:

number of waiting events in the scheduler dispatch queue;
number of events of all kinds in the scheduler dispatch queue;

event handling rate;
node-update handling rate;

event enqueue rate;
node-update enqueue rate;

statistics on the number of waiting events;
statistics on the number of waiting node-update events;

container allocation rate;

scheduling method execution rate, i.e., number of scheduling tries per second;

app allocation call duration;
nodeUpdate call duration;
scheduling call duration;

These metrics give rich information about the scheduler's performance, which 
can be used to diagnose scheduler anomalies.
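
As a rough illustration, a few of these could be expressed with the usual 
Hadoop metrics2 machinery (field and method names are invented for this 
sketch; the preview patch may organize things differently):

{code}
import org.apache.hadoop.metrics2.annotation.Metric;
import org.apache.hadoop.metrics2.annotation.Metrics;
import org.apache.hadoop.metrics2.lib.MutableGaugeInt;
import org.apache.hadoop.metrics2.lib.MutableRate;

@Metrics(context = "yarn")
public class SchedulerMetricsSketch {
  // The mutable metric fields are injected when this source is registered
  // with DefaultMetricsSystem.
  @Metric("Events waiting in the scheduler dispatch queue")
  MutableGaugeInt pendingSchedulerEvents;

  @Metric("Node update handling time")
  MutableRate nodeUpdateCall;

  @Metric("Scheduling attempt time")
  MutableRate schedulingCall;

  public void setPendingEvents(int n) { pendingSchedulerEvents.set(n); }
  public void addNodeUpdateDuration(long millis) { nodeUpdateCall.add(millis); }
  public void addSchedulingDuration(long millis) { schedulingCall.add(millis); }
}
{code}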

 A SchedulerMetrics may be needed for evaluating the scheduler's performance
 -

 Key: YARN-3652
 URL: https://issues.apache.org/jira/browse/YARN-3652
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager, scheduler
Reporter: Xianyin Xin
 Attachments: YARN-3652-preview.patch


 As discussed in YARN-3630, a {{SchedulerMetrics}} may be needed for 
 evaluating the scheduler's performance. The performance indexes include the 
 number of events waiting to be handled by the scheduler, the throughput, the 
 scheduling delay, and/or other indicators.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4025) Deal with byte representations of Longs in writer code

2015-08-18 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701325#comment-14701325
 ] 

Junping Du commented on YARN-4025:
--

bq. One major change I did is that now TestHBaseTimelineWriterImpl verifies the 
timeline entities read by HBaseTimelineReaderImpl as well. This provides a nice 
benefit of verifying correctness of HBaseTimelineReaderImpl. It uncovered a bug 
in the process. 
Nice work! I forgot to mention that YARN-3049 should rename 
TestHBaseTimelineWriterImpl to something that includes Reader. Would you like 
to do it here? Thanks!

 Deal with byte representations of Longs in writer code
 --

 Key: YARN-4025
 URL: https://issues.apache.org/jira/browse/YARN-4025
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Vrushali C
Assignee: Sangjin Lee
 Attachments: YARN-4025-YARN-2928.001.patch, 
 YARN-4025-YARN-2928.002.patch, YARN-4025-YARN-2928.003.patch


 Timestamps are being stored as Longs in HBase by the HBaseTimelineWriterImpl 
 code. There are some places in the code where values are converted from Long 
 to byte[] to String for easier argument passing between function calls, and 
 then converted back to byte[] when storing in HBase.
 It would be better to pass around byte[] or the Longs themselves as 
 applicable.
 This may result in some API changes (the store function) as well as adding a 
 few more functions, such as a getColumnQualifier that accepts a pre-encoded 
 byte array in addition to the existing API which accepts a String, and 
 having ColumnHelper return a byte[] column name instead of a String one.
 Filing this JIRA to track these changes.
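
For illustration, the described overload might look like the following 
standalone sketch (simplified; not the patch itself):

{code}
import java.nio.charset.StandardCharsets;

final class ColumnHelperSketch {
  static byte[] concat(byte[] a, byte[] b) {
    byte[] out = new byte[a.length + b.length];
    System.arraycopy(a, 0, out, 0, a.length);
    System.arraycopy(b, 0, out, a.length, b.length);
    return out;
  }

  /** Existing style: a String qualifier, encoded on the way in. */
  static byte[] getColumnQualifier(byte[] prefix, String qualifier) {
    return concat(prefix, qualifier.getBytes(StandardCharsets.UTF_8));
  }

  /** Proposed addition: a pre-encoded qualifier, with no re-encoding. */
  static byte[] getColumnQualifier(byte[] prefix, byte[] qualifier) {
    return concat(prefix, qualifier);
  }
}
{code}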



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3652) A SchedulerMetrics may be needed for evaluating the scheduler's performance

2015-08-18 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701267#comment-14701267
 ] 

Xianyin Xin commented on YARN-3652:
---

In the patch I used functions from HADOOP-12338.

 A SchedulerMetrics may be needed for evaluating the scheduler's performance
 -

 Key: YARN-3652
 URL: https://issues.apache.org/jira/browse/YARN-3652
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager, scheduler
Reporter: Xianyin Xin
 Attachments: YARN-3652-preview.patch


 As discussed in YARN-3630, a {{SchedulerMetrics}} may be needed for 
 evaluating the scheduler's performance. The performance indexes include the 
 number of events waiting to be handled by the scheduler, the throughput, the 
 scheduling delay, and/or other indicators.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3652) A SchedulerMetrics may be needed for evaluating the scheduler's performance

2015-08-18 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701296#comment-14701296
 ] 

Xianyin Xin commented on YARN-3652:
---

Hi [~sunilg], [~vvasudev], would you please have a look? Any comments are 
welcome.

 A SchedulerMetrics may be needed for evaluating the scheduler's performance
 -

 Key: YARN-3652
 URL: https://issues.apache.org/jira/browse/YARN-3652
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager, scheduler
Reporter: Xianyin Xin
 Attachments: YARN-3652-preview.patch


 As discussed in YARN-3630, a {{SchedulerMetrics}} may be needed for 
 evaluating the scheduler's performance. The performance indexes include the 
 number of events waiting to be handled by the scheduler, the throughput, the 
 scheduling delay, and/or other indicators.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3367) Replace starting a separate thread for post entity with event loop in TimelineClient

2015-08-18 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701302#comment-14701302
 ] 

Naganarasimha G R commented on YARN-3367:
-

Thanks, [~djp]. Assigning this JIRA to myself.

 Replace starting a separate thread for post entity with event loop in 
 TimelineClient
 

 Key: YARN-3367
 URL: https://issues.apache.org/jira/browse/YARN-3367
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: YARN-2928
Reporter: Junping Du
Assignee: Naganarasimha G R

 Since YARN-3039, we added a loop in TimelineClient to wait for 
 collectorServiceAddress to be ready before posting any entity. In consumers 
 of TimelineClient (like the AM), we start a new thread for each call to 
 avoid a potential deadlock in the main thread. This approach has at least 3 
 major defects:
 1. The consumer needs additional code to wrap each call to putEntities() in 
 TimelineClient in its own thread.
 2. It costs many thread resources, which is unnecessary.
 3. The sequence of events can be out of order because each posting thread 
 exits the waiting loop randomly.
 We should have something like an event loop on the TimelineClient side: 
 putEntities() would only put the entities into a queue, and a separate 
 thread would deliver the queued entities to the collector via REST calls 
 (see the sketch below).
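
A minimal sketch of that pattern, using a single dispatcher thread and a 
placeholder payload type (illustrative only, not the eventual TimelineClient 
change):

{code}
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class TimelineEntityDispatcher {
  private final BlockingQueue<Object> queue = new LinkedBlockingQueue<>();
  private final Thread dispatcher = new Thread(this::drain, "timeline-dispatcher");

  void start() {
    dispatcher.setDaemon(true);
    dispatcher.start();
  }

  /** Callers just enqueue; no per-call thread is spawned. */
  void putEntities(Object entities) {
    queue.offer(entities);
  }

  private void drain() {
    try {
      while (!Thread.currentThread().isInterrupted()) {
        // Single consumer thread, so entities are delivered in FIFO order.
        postToCollector(queue.take());
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  private void postToCollector(Object entities) {
    // REST call to the collector, issued once its address is known.
  }
}
{code}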



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4025) Deal with byte representations of Longs in writer code

2015-08-18 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701309#comment-14701309
 ] 

Junping Du commented on YARN-4025:
--

Thanks [~sjlee0] for updating the patch. The 003 patch looks good to me in 
general. Some minor comments:
In ColumnHelper.java,
{code}
   /**
...
+   * @param columnPrefixBytes
+   *  optional prefix to limit columns. If null all columns are
+   *  returned.
...
+   */
+  public Map<byte[][], Object> readResultsHavingCompoundColumnQualifiers(
{code}
Do we handle columnPrefixBytes being null here, as the Javadoc comment 
states? I saw we handle this case explicitly in readResults(), but I didn't 
see it here. Let me know if I missed something. In addition, it looks like 
the previous readResults() only handles the case where the CQs are all 
Strings. I think we should rename that method to something like 
readResultsWithAllStringColumnQualifiers() to avoid possible confusion. Last 
but not least, for the case where result is null, do we need to handle it by 
logging some warning messages, as in the other cases in this patch?

{code}
+  byte[][] columnQualifierParts = Separator.VALUES.split(
+  columnNameParts[1], -1);
{code}
Checking the javadoc in Separator and TimelineWriterUtils, a negative value 
indicates no limit on the number of segments, so can we define a constant 
like NO_LIMIT to replace -1 here? Actually, from checking the implementation 
in TimelineWriterUtils, 0 also indicates the same thing (no limit). It sounds 
like we don't have any tests in TestTimelineWriterUtils.java; we may want to 
improve this in the future.
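
For example, a named sentinel could read like this (the name and placement 
are only suggestions):

{code}
// Suggested constant; any negative value means "no limit on segments".
public static final int NO_LIMIT_ON_SEGMENTS = -1;

byte[][] columnQualifierParts =
    Separator.VALUES.split(columnNameParts[1], NO_LIMIT_ON_SEGMENTS);
{code}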

In ApplicationTable,
{code}
- * || e!eventId?timestamp?infoKey: |  |  |
+ * || e!eventId=timestamp=infoKey: | 
{code}
I think we should do the same thing for the javadoc examples in 
EntityTable.java.

Everything else looks fine to me.

 Deal with byte representations of Longs in writer code
 --

 Key: YARN-4025
 URL: https://issues.apache.org/jira/browse/YARN-4025
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Vrushali C
Assignee: Sangjin Lee
 Attachments: YARN-4025-YARN-2928.001.patch, 
 YARN-4025-YARN-2928.002.patch, YARN-4025-YARN-2928.003.patch


 Timestamps are being stored as Longs in HBase by the HBaseTimelineWriterImpl 
 code. There are some places in the code where values are converted from Long 
 to byte[] to String for easier argument passing between function calls, and 
 then converted back to byte[] when storing in HBase.
 It would be better to pass around byte[] or the Longs themselves as 
 applicable.
 This may result in some API changes (the store function) as well as adding a 
 few more functions, such as a getColumnQualifier that accepts a pre-encoded 
 byte array in addition to the existing API which accepts a String, and 
 having ColumnHelper return a byte[] column name instead of a String one.
 Filing this JIRA to track these changes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3942) Timeline store to read events from HDFS

2015-08-18 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701242#comment-14701242
 ] 

Jason Lowe commented on YARN-3942:
--

[~rajesh] the initial exception looks like an issue with the HDFS client layer, 
and most HDFS clients would have similar problems trying to use HDFS.  Normally 
HDFS operations are not retried because there are many retries already in the 
HDFS client and server layers.  So I don't think that exception is an issue to 
fix in the ATS but rather the HDFS configuration and/or code.

Also the patch does not treat that exception being logged as fatal.  It just 
logs the fact that it couldn't complete a scan for that iteration.  It will try 
again in the next scan interval.  The real problem is indicated by this line:
{noformat}
2015-08-18 01:03:35,600 [SIGTERM handler] ERROR 
org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer:
 RECEIVED SIGNAL 15: SIGTERM
{noformat}
Something outside of the ATS is killing the process with SIGTERM.

 Timeline store to read events from HDFS
 ---

 Key: YARN-3942
 URL: https://issues.apache.org/jira/browse/YARN-3942
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: timelineserver
Reporter: Jason Lowe
Assignee: Jason Lowe
 Attachments: YARN-3942.001.patch


 This adds a new timeline store plugin that is intended as a stop-gap measure 
 to mitigate some of the issues we've seen with ATS v1 while waiting for ATS 
 v2.  The intent of this plugin is to provide a workable solution for running 
 the Tez UI against the timeline server on large-scale clusters running many 
 thousands of jobs per day.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3652) A SchedulerMetrics may be needed for evaluating the scheduler's performance

2015-08-18 Thread Xianyin Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xianyin Xin updated YARN-3652:
--
Attachment: YARN-3652-preview.patch

Submitted a preview patch.

 A SchedulerMetrics may be needed for evaluating the scheduler's performance
 -

 Key: YARN-3652
 URL: https://issues.apache.org/jira/browse/YARN-3652
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager, scheduler
Reporter: Xianyin Xin
 Attachments: YARN-3652-preview.patch


 As discussed in YARN-3630, a {{SchedulerMetrics}} may be needed for 
 evaluating the scheduler's performance. The performance indexes include the 
 number of events waiting to be handled by the scheduler, the throughput, the 
 scheduling delay, and/or other indicators.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4024) YARN RM should avoid unnecessary resolving IP when NMs doing heartbeat

2015-08-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701285#comment-14701285
 ] 

Hadoop QA commented on YARN-4024:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  19m  1s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 46s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 40s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 54s | The applied patch generated  1 
new checkstyle issues (total was 211, now 211). |
| {color:green}+1{color} | whitespace |   0m  1s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 23s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   4m 37s | The patch appears to introduce 1 
new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   0m 24s | Tests passed in 
hadoop-yarn-api. |
| {color:red}-1{color} | yarn tests |   1m 55s | Tests failed in 
hadoop-yarn-common. |
| {color:red}-1{color} | yarn tests |  57m 27s | Tests failed in 
hadoop-yarn-server-resourcemanager. |
| | | 105m 51s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-yarn-server-resourcemanager |
| Failed unit tests | hadoop.yarn.util.TestRackResolver |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerAllocation |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12751009/YARN-4024-draft-v3.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 71566e2 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/8875/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-YARN-Build/8875/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
 |
| hadoop-yarn-api test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8875/artifact/patchprocess/testrun_hadoop-yarn-api.txt
 |
| hadoop-yarn-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8875/artifact/patchprocess/testrun_hadoop-yarn-common.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8875/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8875/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8875/console |


This message was automatically generated.

 YARN RM should avoid unnecessary resolving IP when NMs doing heartbeat
 --

 Key: YARN-4024
 URL: https://issues.apache.org/jira/browse/YARN-4024
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Wangda Tan
Assignee: Hong Zhiguo
 Attachments: YARN-4024-draft-v2.patch, YARN-4024-draft-v3.patch, 
 YARN-4024-draft.patch


 Currently, the YARN RM NodesListManager resolves the IP address every time a 
 node heartbeats. When the DNS server becomes slow, NM heartbeats are blocked 
 and cannot make progress.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3986) getTransferredContainers in AbstractYarnScheduler should be present in YarnScheduler interface instead

2015-08-18 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701334#comment-14701334
 ] 

Rohith Sharma K S commented on YARN-3986:
-

+1 for the latest patch.

 getTransferredContainers in AbstractYarnScheduler should be present in 
 YarnScheduler interface instead
 --

 Key: YARN-3986
 URL: https://issues.apache.org/jira/browse/YARN-3986
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.7.0
Reporter: Varun Saxena
Assignee: Varun Saxena
 Attachments: YARN-3986.01.patch, YARN-3986.02.patch, 
 YARN-3986.03.patch


 Currently getTransferredContainers is present in {{AbstractYarnScheduler}}.
 *But in ApplicationMasterService, while registering an AM, we call this 
 method by typecasting the scheduler to AbstractYarnScheduler, which is 
 incorrect.*
 This method should be moved to YarnScheduler, because a custom scheduler 
 will implement YarnScheduler, not AbstractYarnScheduler.
 As ApplicationMasterService calls getTransferredContainers through that 
 typecast, it imposes an indirect dependency on AbstractYarnScheduler for any 
 pluggable custom scheduler.
 We can move the method declaration to YarnScheduler and leave the definition 
 in AbstractYarnScheduler as it is.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-4059) Preemption should delay assignments back to the preempted queue

2015-08-18 Thread Chang Li (JIRA)
Chang Li created YARN-4059:
--

 Summary: Preemption should delay assignments back to the preempted 
queue
 Key: YARN-4059
 URL: https://issues.apache.org/jira/browse/YARN-4059
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Chang Li
Assignee: Chang Li




When preempting containers from a queue it can take a while for the other 
queues to fully consume the resources that were freed up, due to delays waiting 
for better locality, etc. Those delays can cause the resources to be assigned 
back to the preempted queue, and then the preemption cycle continues.

We should consider adding a delay, either based on node heartbeat counts or 
time, to avoid granting containers to a queue that was recently preempted. The 
delay should be sufficient to cover the cycles of the preemption monitor, so we 
won't try to assign containers in-between preemption events for a queue.

The worst-case scenario for assigning freed resources to other queues is when 
all the other queues want no locality. No locality means only one container is 
assigned per heartbeat, so we need to wait for the entire cluster to heartbeat 
in, multiplied by the number of containers that could run on a single node.

So the penalty time for a queue should be the max of the preemption monitor 
cycle time and the amount of time it takes to refill the cluster at one 
container per heartbeat. My guess is this will be somewhere around 2 minutes.
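
As a back-of-the-envelope sketch of that computation (all names and numbers 
below are illustrative assumptions, not measured values):

{code}
// Penalty = max(preemption monitor cycle, time to refill the cluster at one
// container per node heartbeat). Sketch only; names/values are assumptions.
final class PreemptionPenaltySketch {
  static long penaltyMillis(long monitorIntervalMs,
                            long nmHeartbeatIntervalMs,
                            int containersPerNode) {
    // Worst case: no locality, so each node heartbeat yields one container;
    // refilling a node's capacity takes containersPerNode heartbeat rounds.
    long drainTimeMs = nmHeartbeatIntervalMs * containersPerNode;
    return Math.max(monitorIntervalMs, drainTimeMs);
  }

  public static void main(String[] args) {
    // e.g. a 3s monitor cycle, 1s NM heartbeat, ~120 containers per node
    System.out.println(penaltyMillis(3_000, 1_000, 120) + " ms"); // 120000 ms, ~2 min
  }
}
{code}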




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3893) Both RM in active state when Admin#transitionToActive failure from refeshAll()

2015-08-18 Thread Bibin A Chundatt (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701416#comment-14701416
 ] 

Bibin A Chundatt commented on YARN-3893:


Hi [~rohithsharma],
Thank you for your review comments.
I will update the patch and upload it soon.

 Both RM in active state when Admin#transitionToActive failure from refeshAll()
 --

 Key: YARN-3893
 URL: https://issues.apache.org/jira/browse/YARN-3893
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Bibin A Chundatt
Assignee: Bibin A Chundatt
Priority: Critical
 Attachments: 0001-YARN-3893.patch, 0002-YARN-3893.patch, 
 0003-YARN-3893.patch, 0004-YARN-3893.patch, yarn-site.xml


 Cases that can cause this:
 # The capacity scheduler XML is wrongly configured during switchover
 # Refreshing ACLs fails due to configuration
 # Refreshing user groups fails due to configuration
 Both RMs will then continuously try to become active:
 {code}
 dsperf@host-10-128:/opt/bibin/dsperf/OPENSOURCE_3_0/install/hadoop/resourcemanager/bin
  ./yarn rmadmin  -getServiceState rm1
 15/07/07 19:08:10 WARN util.NativeCodeLoader: Unable to load native-hadoop 
 library for your platform... using builtin-java classes where applicable
 active
 dsperf@host-128:/opt/bibin/dsperf/OPENSOURCE_3_0/install/hadoop/resourcemanager/bin
  ./yarn rmadmin  -getServiceState rm2
 15/07/07 19:08:12 WARN util.NativeCodeLoader: Unable to load native-hadoop 
 library for your platform... using builtin-java classes where applicable
 active
 {code}
 # Both web UIs are active
 # Status is shown as active for both RMs
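
One defensive shape for the fix would be to fall back to standby when the 
post-transition refresh fails (sketch with placeholder methods; not the 
attached patches):

{code}
// Sketch: never stay active if refreshAll() fails during the transition.
class AdminServiceSketch {
  void transitionToActive() throws Exception {
    becomeActive();            // the actual HA transition
    try {
      refreshAll();            // scheduler config, ACLs, user-group mappings
    } catch (Exception e) {
      becomeStandby();         // undo the transition instead of staying active
      throw e;
    }
  }

  void becomeActive() {}
  void refreshAll() throws Exception {}
  void becomeStandby() {}
}
{code}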



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1473) Exception from container-launch(Apache Hadoop 2.2.0)

2015-08-18 Thread Maximiliano Mendez (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701424#comment-14701424
 ] 

Maximiliano Mendez commented on YARN-1473:
--

Same error here after an upgrade from 2.6.0 to 2.7.1

 Exception from container-launch(Apache Hadoop 2.2.0)
 

 Key: YARN-1473
 URL: https://issues.apache.org/jira/browse/YARN-1473
 Project: Hadoop YARN
  Issue Type: Bug
 Environment: CentOS5.8 and Apache Hadoop 2.2.0
Reporter: Joy Xu
 Attachments: yarn-site.xml


 Hello all,
 I have hit an exception from container-launch when running the built-in 
 wordcount program, and the error message is as follows:
 {code}
 13/12/05 00:17:31 INFO mapreduce.Job: Job job_1386171829089_0003 failed with 
 state FAILED due to: Application application_1386171829089_0003 failed 2 
 times due to AM Container for appattempt_1386171829089_0003_02 exited 
 with  exitCode: 1 due to: Exception from container-launch: 
 org.apache.hadoop.util.Shell$ExitCodeException: 
   at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
   at org.apache.hadoop.util.Shell.run(Shell.java:379)
   at 
 org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
   at 
 org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
   at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
   at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
   at java.lang.Thread.run(Thread.java:662)
 .Failing this attempt.. Failing the application.
 13/12/05 00:17:31 INFO mapreduce.Job: Counters: 0
 {code}
 Hope someone can help. Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)