[jira] [Commented] (YARN-3575) Job using 2.5 jars fails on a 2.6 cluster whose RM has been restarted

2015-05-18 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14548118#comment-14548118
 ] 

Jason Lowe commented on YARN-3575:
--

The only way to support compatibility would be to remove the epoch number field 
from the container ID, but I doubt that's going to happen at this point.  I 
filed this mostly to document the fact that an incompatibility exists.  Most 
likely we'll have to recommend that users do _not_ perform a restart of the RM 
where it tries to recover (and therefore starts using an epoch number in 
container IDs) as long as applications are running on the grid using YARN 
client jars version 2.5 or earlier.  RM restart with recovery would only be 
supported as long as all applications are using YARN jars >= 2.6.
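
To make the incompatibility concrete, here is a minimal, illustrative parser (not the Hadoop implementation) under the assumption that a restarted 2.6 RM encodes a non-zero epoch as an extra e<epoch> segment, e.g. container_e17_1410901177871_0001_01_000005; a 2.5-era parser that expects exactly four numeric fields after the container_ prefix fails on such IDs:
{code}
// Illustrative sketch only; the field layout is an assumption based on the 2.6
// ID format, not the actual ConverterUtils/ContainerId code.
public class ContainerIdFormatDemo {
  static void parse(String id) {
    String[] parts = id.split("_");
    int idx = 1;
    long epoch = 0;
    if (parts[idx].startsWith("e")) {       // epoch segment introduced in 2.6
      epoch = Long.parseLong(parts[idx].substring(1));
      idx++;
    }
    long clusterTimestamp = Long.parseLong(parts[idx++]);
    int appId = Integer.parseInt(parts[idx++]);
    int attemptId = Integer.parseInt(parts[idx++]);
    long containerId = Long.parseLong(parts[idx]);
    System.out.printf("epoch=%d ts=%d app=%d attempt=%d container=%d%n",
        epoch, clusterTimestamp, appId, attemptId, containerId);
  }

  public static void main(String[] args) {
    parse("container_1410901177871_0001_01_000005");     // pre-restart style ID
    parse("container_e17_1410901177871_0001_01_000005"); // a 2.5 parser rejects this
  }
}
{code}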

 Job using 2.5 jars fails on a 2.6 cluster whose RM has been restarted
 -

 Key: YARN-3575
 URL: https://issues.apache.org/jira/browse/YARN-3575
 Project: Hadoop YARN
  Issue Type: Bug
  Components: client
Affects Versions: 2.6.0
Reporter: Jason Lowe

 Trying to launch a job that uses the 2.5 jars fails on a 2.6 cluster whose RM 
 has been restarted (i.e.: epoch != 0) because the epoch number starts 
 appearing in the container IDs and the 2.5 jars no longer know how to parse 
 the container IDs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-41) The RM should handle the graceful shutdown of the NM.

2015-05-18 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14548130#comment-14548130
 ] 

Jason Lowe commented on YARN-41:


Looking at the table above, I'm wondering about the case where we are doing a 
graceful shutdown, recovery is enabled, but we are not running under 
supervision.  When we are shutting down without supervision the NM will 
normally kill active containers, so in that sense I think we should also 
unregister.  I'm not sure there's a point to avoiding the unregister if no 
containers will survive the NM shutdown.  The NM only avoids killing containers 
on shutdown if the NM supports recovery and it has been told it is being 
supervised (i.e.: it is likely the NM will be restarted shortly after the 
shutdown completes).  In the other cases it kills containers on shutdown to 
avoid a situation where containers are running uncontrolled on a node due to 
the NM being unavailable for a prolonged duration.
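
For readers following along, a minimal sketch of the decision being discussed, assuming the two yarn.nodemanager.recovery.* properties named later in this thread; the class and method names are illustrative, not the actual NodeManager code:
{code}
import java.util.Map;

// Illustrative sketch only, not the real NodeManager shutdown path.
public class GracefulShutdownSketch {
  static boolean keepContainersAcrossShutdown(Map<String, String> conf) {
    boolean recoveryEnabled = Boolean.parseBoolean(
        conf.getOrDefault("yarn.nodemanager.recovery.enabled", "false"));
    boolean supervised = Boolean.parseBoolean(
        conf.getOrDefault("yarn.nodemanager.recovery.supervised", "false"));
    // Containers are left running only when the NM is expected back shortly.
    return recoveryEnabled && supervised;
  }

  static boolean shouldUnregisterFromRM(Map<String, String> conf) {
    // The point above: if containers are killed anyway, nothing on this node
    // survives the shutdown, so the NM might as well unregister.
    return !keepContainersAcrossShutdown(conf);
  }
}
{code}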

 The RM should handle the graceful shutdown of the NM.
 -

 Key: YARN-41
 URL: https://issues.apache.org/jira/browse/YARN-41
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: nodemanager, resourcemanager
Reporter: Ravi Teja Ch N V
Assignee: Devaraj K
  Labels: BB2015-05-TBR
 Attachments: MAPREDUCE-3494.1.patch, MAPREDUCE-3494.2.patch, 
 MAPREDUCE-3494.patch, YARN-41-1.patch, YARN-41-2.patch, YARN-41-3.patch, 
 YARN-41-4.patch, YARN-41.patch


 Instead of waiting for the NM expiry, RM should remove and handle the NM, 
 which is shutdown gracefully.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2336) Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree

2015-05-18 Thread Tsuyoshi Ozawa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14548102#comment-14548102
 ] 

Tsuyoshi Ozawa commented on YARN-2336:
--

[~ajisakaa] thank you for updating. We're almost there.

TestRMWebServicesFairScheduler#testClusterSchedulerWithSubQueues: Can we add the 
following test to verify the non-existence of the field 'childQueues'? Also, could 
you add the same kind of test to TestRMWebServicesCapacitySched for consistency 
of the APIs between CapacityScheduler and FairScheduler?
{code}
try {
  subQueueInfo.getJSONObject(1).getJSONObject("childQueues");
  Assert.fail("subQueue should omit field 'childQueues' when childQueue " +
      "is empty.");
} catch (JSONException je) {
  je.getMessage().contains("JSONObject[\"childQueues\"] not found.");
}
{code}

ResourceManagerRest.md: we should describe that childQueues is omitted if the queue 
doesn't have child queues:
{code}
| childQueues | array of queues(JSON)/queue objects(XML) | A collection of 
sub-queue information |
{code}

We should fix CapacityScheduler's 'queues' field the same way as FairScheduler's:
{code}
| queues | array of queues(JSON)/zero or more queue objects(XML) | A collection 
of queue resources |
{code}

Minor nits: the following comment can be fixed as "return null to omit the 
childQueues field when its size is zero." Also we should add a reason for doing 
this, like "This is for consistency of the REST API's return value between 
FairScheduler and CapacityScheduler". What do you think?
{code}
+// return null for FairSchedulerLeafQueueInfo to avoid childQueues being
+// displayed in the response of REST API.
{code}
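
Picking up the suggestion about TestRMWebServicesCapacitySched, a hedged sketch of what the analogous check could look like (object and field names are assumed; CapacityScheduler exposes 'queues' rather than 'childQueues'):
{code}
// Hypothetical mirror of the FairScheduler test above; leafQueueInfo is assumed
// to be the JSON object for a leaf queue in the CapacityScheduler response.
try {
  leafQueueInfo.getJSONObject("queues");
  Assert.fail("leaf queue should omit field 'queues' when it has no child queues.");
} catch (JSONException je) {
  assertTrue(je.getMessage().contains("JSONObject[\"queues\"] not found"));
}
{code}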


 Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree
 --

 Key: YARN-2336
 URL: https://issues.apache.org/jira/browse/YARN-2336
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.4.1, 2.6.0
Reporter: Kenji Kikushima
Assignee: Akira AJISAKA
  Labels: BB2015-05-RFC
 Attachments: YARN-2336-2.patch, YARN-2336-3.patch, YARN-2336-4.patch, 
 YARN-2336.005.patch, YARN-2336.007.patch, YARN-2336.008.patch, YARN-2336.patch


 When we have sub queues in Fair Scheduler, REST api returns a missing '[' 
 bracket JSON for childQueues.
 This issue was found by [~ajisakaa] at YARN-1050.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-41) The RM should handle the graceful shutdown of the NM.

2015-05-18 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14548141#comment-14548141
 ] 

Junping Du commented on YARN-41:


Thanks [~devaraj.k] for providing the truth table above. The contents look mostly 
good to me.
One interesting case is when yarn.nodemanager.recovery.enabled=true but 
yarn.nodemanager.recovery.supervised=false; the shutdown behavior after 
YARN-2331 is: running containers will get killed, but recovery work will 
continue after the NM gets restarted. Theoretically, I think we should unregister 
the NM from the RM because we don't expect apps/containers to get recovered on this NM.

bq. As per my understanding I assumed here that NM is under supervision enabled 
only when the NM recovery is enabled. 
Agreed. In practice, if the NM is under supervision while NM recovery is disabled, 
the behavior should be exactly the same as when both configs are set to false.

[~jlowe], any comments here?

 The RM should handle the graceful shutdown of the NM.
 -

 Key: YARN-41
 URL: https://issues.apache.org/jira/browse/YARN-41
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: nodemanager, resourcemanager
Reporter: Ravi Teja Ch N V
Assignee: Devaraj K
  Labels: BB2015-05-TBR
 Attachments: MAPREDUCE-3494.1.patch, MAPREDUCE-3494.2.patch, 
 MAPREDUCE-3494.patch, YARN-41-1.patch, YARN-41-2.patch, YARN-41-3.patch, 
 YARN-41-4.patch, YARN-41.patch


 Instead of waiting for the NM expiry, RM should remove and handle the NM, 
 which is shutdown gracefully.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-41) The RM should handle the graceful shutdown of the NM.

2015-05-18 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14548145#comment-14548145
 ] 

Junping Du commented on YARN-41:


bq.  Jason Lowe, any comments here?
Sorry, I didn't see Jason's comments while I was writing mine. But it looks 
like we are talking about the same thing in parallel. :)

 The RM should handle the graceful shutdown of the NM.
 -

 Key: YARN-41
 URL: https://issues.apache.org/jira/browse/YARN-41
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: nodemanager, resourcemanager
Reporter: Ravi Teja Ch N V
Assignee: Devaraj K
  Labels: BB2015-05-TBR
 Attachments: MAPREDUCE-3494.1.patch, MAPREDUCE-3494.2.patch, 
 MAPREDUCE-3494.patch, YARN-41-1.patch, YARN-41-2.patch, YARN-41-3.patch, 
 YARN-41-4.patch, YARN-41.patch


 Instead of waiting for the NM expiry, RM should remove and handle the NM, 
 which is shutdown gracefully.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3668) Long run service shouldn't be killed even if Yarn crashed

2015-05-18 Thread sandflee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14548123#comment-14548123
 ] 

sandflee commented on YARN-3668:


Thanks [~stevel]. We're using our own AM, not Slider, and some online services 
are running on it; we really don't want applications to be killed because of an 
AM failure.

 Long run service shouldn't be killed even if Yarn crashed
 -

 Key: YARN-3668
 URL: https://issues.apache.org/jira/browse/YARN-3668
 Project: Hadoop YARN
  Issue Type: Wish
Reporter: sandflee

 A long-running service shouldn't be killed even if all YARN components 
 crash; with RM work-preserving restart and NM restart, YARN could take over the 
 applications again.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2821) Distributed shell app master becomes unresponsive sometimes

2015-05-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14547966#comment-14547966
 ] 

Hadoop QA commented on YARN-2821:
-

\\
\\
| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 46s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 33s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 37s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 24s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 32s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   0m 36s | The patch does not introduce 
any new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | yarn tests |   6m 58s | Tests passed in 
hadoop-yarn-applications-distributedshell. |
| | |  42m 25s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12733521/YARN-2821.004.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 363c355 |
| hadoop-yarn-applications-distributedshell test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7967/artifact/patchprocess/testrun_hadoop-yarn-applications-distributedshell.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7967/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7967/console |


This message was automatically generated.

 Distributed shell app master becomes unresponsive sometimes
 ---

 Key: YARN-2821
 URL: https://issues.apache.org/jira/browse/YARN-2821
 Project: Hadoop YARN
  Issue Type: Bug
  Components: applications/distributed-shell
Affects Versions: 2.5.1
Reporter: Varun Vasudev
Assignee: Varun Vasudev
 Attachments: YARN-2821.002.patch, YARN-2821.003.patch, 
 YARN-2821.004.patch, apache-yarn-2821.0.patch, apache-yarn-2821.1.patch


 We've noticed that once in a while the distributed shell app master becomes 
 unresponsive and is eventually killed by the RM. snippet of the logs -
 {noformat}
 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: 
 appattempt_1415123350094_0017_01 received 0 previous attempts' running 
 containers on AM registration.
 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
 container ask: Capability[memory:10, vCores:1]Priority[0]
 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
 container ask: Capability[memory:10, vCores:1]Priority[0]
 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
 container ask: Capability[memory:10, vCores:1]Priority[0]
 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
 container ask: Capability[memory:10, vCores:1]Priority[0]
 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
 container ask: Capability[memory:10, vCores:1]Priority[0]
 14/11/04 18:21:38 INFO impl.AMRMClientImpl: Received new token for : 
 onprem-tez2:45454
 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Got response from 
 RM for container ask, allocatedCnt=1
 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Launching shell 
 command on a new container., 
 containerId=container_1415123350094_0017_01_02, 
 containerNode=onprem-tez2:45454, containerNodeURI=onprem-tez2:50060, 
 containerResourceMemory1024, containerResourceVirtualCores1
 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Setting up 
 container launch container for 
 containerid=container_1415123350094_0017_01_02
 14/11/04 18:21:39 INFO impl.NMClientAsyncImpl: Processing Event EventType: 
 START_CONTAINER for Container container_1415123350094_0017_01_02
 14/11/04 18:21:39 INFO impl.ContainerManagementProtocolProxy: Opening proxy : 
 onprem-tez2:45454
 14/11/04 18:21:39 INFO impl.NMClientAsyncImpl: Processing Event EventType: 
 QUERY_CONTAINER for Container 

[jira] [Commented] (YARN-1922) Process group remains alive after container process is killed externally

2015-05-18 Thread gu-chi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14547997#comment-14547997
 ] 

gu-chi commented on YARN-1922:
--

Hi, I see the comment here was to check in YARN-1922.5.patch, but why was 
YARN-1922.6.patch merged? What was the concern?
I find this solution may have a defect.
Suppose a container finishes and then does its cleanup: the PID file still 
exists and will trigger a signalContainer once, which kills the process with 
the pid from the PID file. But since the container has already finished, that 
PID may have been taken over by another process, and this may cause a serious issue.
As far as I know my NM was killed unexpectedly, and what I described could be the 
cause, even if it occurs only rarely.
Below is the error scenario: task cleanup was not finished but the NM was killed, 
then restarted:

2015-05-14 21:49:03,063 | INFO  | DeletionService #1 | Deleting absolute path : 
/export/data1/yarn/nm/localdir/usercache/omm/appcache/application_1430456703237_8047/container_1430456703237_8047_01_12582917
 | 
org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.deleteAsUser(LinuxContainerExecutor.java:400)
2015-05-14 21:49:03,063 | INFO  | AsyncDispatcher event handler | Container 
container_1430456703237_8047_01_12582917 transitioned from EXITED_WITH_SUCCESS 
to DONE | 
org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl.handle(ContainerImpl.java:918)
2015-05-14 21:49:03,064 | INFO  | AsyncDispatcher event handler | Removing 
container_1430456703237_8047_01_12582917 from application 
application_1430456703237_8047 | 
org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl$ContainerDoneTransition.transition(ApplicationImpl.java:340)
2015-05-14 21:49:03,064 | INFO  | AsyncDispatcher event handler | Considering 
container container_1430456703237_8047_01_12582917 for log-aggregation | 
org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.startContainerLogAggregation(AppLogAggregatorImpl.java:342)
2015-05-14 21:49:03,064 | INFO  | AsyncDispatcher event handler | Got event 
CONTAINER_STOP for appId application_1430456703237_8047 | 
org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.handle(AuxServices.java:196)
2015-05-14 21:49:03,152 | INFO  | Node Status Updater | Removed completed 
containers from NM context: [container_1430456703237_8047_01_12582917] | 
org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.removeCompletedContainersFromContext(NodeStatusUpdaterImpl.java:417)
2015-05-14 21:49:03,293 | INFO  | Task killer for 26924 | Using 
linux-container-executor.users as omm | 
org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.signalContainer(LinuxContainerExecutor.java:349)
2015-05-14 21:49:20,667 | INFO  | main | STARTUP_MSG: 
/
STARTUP_MSG: Starting NodeManager
STARTUP_MSG:   host = SR6S11/192.168.10.21
STARTUP_MSG:   args = []
STARTUP_MSG:   version = V100R001C00
STARTUP_MSG:   classpath = 

 Process group remains alive after container process is killed externally
 

 Key: YARN-1922
 URL: https://issues.apache.org/jira/browse/YARN-1922
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.4.0
 Environment: CentOS 6.4
Reporter: Billie Rinaldi
Assignee: Billie Rinaldi
 Fix For: 2.6.0

 Attachments: YARN-1922.1.patch, YARN-1922.2.patch, YARN-1922.3.patch, 
 YARN-1922.4.patch, YARN-1922.5.patch, YARN-1922.6.patch


 If the main container process is killed externally, ContainerLaunch does not 
 kill the rest of the process group.  Before sending the event that results in 
 the ContainerLaunch.containerCleanup method being called, ContainerLaunch 
 sets the completed flag to true.  Then when cleaning up, it doesn't try to 
 read the pid file if the completed flag is true.  If it read the pid file, it 
 would proceed to send the container a kill signal.  In the case of the 
 DefaultContainerExecutor, this would kill the process group.
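
A minimal sketch of the control flow described above (illustrative only, not the actual ContainerLaunch code):
{code}
import java.util.concurrent.atomic.AtomicBoolean;

// Illustrative sketch of the guard described above; not the real ContainerLaunch.
public class ContainerCleanupSketch {
  private final AtomicBoolean completed = new AtomicBoolean(false);

  void onMainProcessExit() {
    completed.set(true);         // flag is set before the cleanup event fires
    containerCleanup();
  }

  void containerCleanup() {
    if (completed.get()) {
      // The reported bug: cleanup skips the pid file here, so the rest of the
      // process group is never signalled when the main process died externally.
      return;
    }
    String pid = readPidFile();
    if (pid != null) {
      signalProcessGroup(pid);   // DefaultContainerExecutor would kill the group
    }
  }

  String readPidFile() { return null; }        // stubs for illustration
  void signalProcessGroup(String pid) { }
}
{code}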



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3670) Add JobConf XML link in Yarn RM UI

2015-05-18 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14548047#comment-14548047
 ] 

Jason Lowe commented on YARN-3670:
--

Sure, we could solve this by adding APIs for applications to convey 
job-specific settings or metrics. However, this is far from a minor 
improvement and could be quite involved, including questions like where 
these settings will be stored (probably the timelineserver), limits on key/value 
sizes, limits on the total amount of data that can be stored, etc.
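
As an aside, one possible (unvetted) direction would be to reuse the existing timeline service for such key/value publication. The sketch below assumes the ATS v1 client API and made-up entity type, id and keys; it is not the API being discussed here:
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.api.records.timeline.TimelineEntity;
import org.apache.hadoop.yarn.client.api.TimelineClient;

// Hedged sketch: an AM publishing a few settings as "otherInfo" on a timeline
// entity. The entity type, id and keys are illustrative assumptions.
public class PublishJobSettingsSketch {
  public static void main(String[] args) throws Exception {
    TimelineClient client = TimelineClient.createTimelineClient();
    client.init(new Configuration());
    client.start();
    try {
      TimelineEntity entity = new TimelineEntity();
      entity.setEntityType("JOB_SETTINGS");                  // hypothetical type
      entity.setEntityId("application_1431123456789_0042");  // example app id
      entity.setStartTime(System.currentTimeMillis());
      entity.addOtherInfo("mapreduce.map.memory.mb", "2048");
      entity.addOtherInfo("mapreduce.reduce.memory.mb", "4096");
      client.putEntities(entity);
    } finally {
      client.stop();
    }
  }
}
{code}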

 Add JobConf XML link in Yarn RM UI
 --

 Key: YARN-3670
 URL: https://issues.apache.org/jira/browse/YARN-3670
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: webapp
Affects Versions: 2.6.0
 Environment: HDP 2.2
Reporter: Hari Sekhon
Priority: Minor

 Request to add JobConf xml link for each application in the RM UI so I don't 
 have to keep recovering it from HDFS to debug if job settings are taking 
 effect.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2336) Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree

2015-05-18 Thread Tsuyoshi Ozawa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14547990#comment-14547990
 ] 

Tsuyoshi Ozawa commented on YARN-2336:
--

OK, I'm checking it.

 Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree
 --

 Key: YARN-2336
 URL: https://issues.apache.org/jira/browse/YARN-2336
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.4.1, 2.6.0
Reporter: Kenji Kikushima
Assignee: Akira AJISAKA
  Labels: BB2015-05-RFC
 Attachments: YARN-2336-2.patch, YARN-2336-3.patch, YARN-2336-4.patch, 
 YARN-2336.005.patch, YARN-2336.007.patch, YARN-2336.008.patch, YARN-2336.patch


 When we have sub queues in Fair Scheduler, REST api returns a missing '[' 
 bracket JSON for childQueues.
 This issue was found by [~ajisakaa] at YARN-1050.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3411) [Storage implementation] explore the native HBase write schema for storage

2015-05-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14547959#comment-14547959
 ] 

Hadoop QA commented on YARN-3411:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  15m 23s | Pre-patch YARN-2928 compilation 
is healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   8m  0s | There were no new javac warning 
messages. |
| {color:red}-1{color} | javadoc |  10m  1s | The applied patch generated  2  
additional warning messages. |
| {color:green}+1{color} | release audit |   0m 24s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 15s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 41s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 42s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   0m 38s | The patch does not introduce 
any new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | yarn tests |   1m 16s | Tests passed in 
hadoop-yarn-server-timelineservice. |
| | |  38m 26s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12733531/YARN-3411-YARN-2928.004.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | YARN-2928 / 463e070 |
| javadoc | 
https://builds.apache.org/job/PreCommit-YARN-Build/7966/artifact/patchprocess/diffJavadocWarnings.txt
 |
| hadoop-yarn-server-timelineservice test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7966/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7966/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7966/console |


This message was automatically generated.

 [Storage implementation] explore the native HBase write schema for storage
 --

 Key: YARN-3411
 URL: https://issues.apache.org/jira/browse/YARN-3411
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Sangjin Lee
Assignee: Vrushali C
Priority: Critical
 Attachments: ATSv2BackendHBaseSchemaproposal.pdf, 
 YARN-3411-YARN-2928.001.patch, YARN-3411-YARN-2928.002.patch, 
 YARN-3411-YARN-2928.003.patch, YARN-3411-YARN-2928.004.patch, 
 YARN-3411.poc.2.txt, YARN-3411.poc.3.txt, YARN-3411.poc.4.txt, 
 YARN-3411.poc.5.txt, YARN-3411.poc.6.txt, YARN-3411.poc.7.txt, 
 YARN-3411.poc.txt


 There is work that's in progress to implement the storage based on a Phoenix 
 schema (YARN-3134).
 In parallel, we would like to explore an implementation based on a native 
 HBase schema for the write path. Such a schema does not exclude using 
 Phoenix, especially for reads and offline queries.
 Once we have basic implementations of both options, we could evaluate them in 
 terms of performance, scalability, usability, etc. and make a call.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2336) Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree

2015-05-18 Thread Tsuyoshi Ozawa (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi Ozawa updated YARN-2336:
-
Affects Version/s: 2.6.0

 Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree
 --

 Key: YARN-2336
 URL: https://issues.apache.org/jira/browse/YARN-2336
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.4.1, 2.6.0
Reporter: Kenji Kikushima
Assignee: Akira AJISAKA
  Labels: BB2015-05-RFC
 Attachments: YARN-2336-2.patch, YARN-2336-3.patch, YARN-2336-4.patch, 
 YARN-2336.005.patch, YARN-2336.007.patch, YARN-2336.008.patch, YARN-2336.patch


 When we have sub queues in Fair Scheduler, REST api returns a missing '[' 
 bracket JSON for childQueues.
 This issue was found by [~ajisakaa] at YARN-1050.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3670) Add JobConf XML link in Yarn RM UI

2015-05-18 Thread Hari Sekhon (JIRA)
Hari Sekhon created YARN-3670:
-

 Summary: Add JobConf XML link in Yarn RM UI
 Key: YARN-3670
 URL: https://issues.apache.org/jira/browse/YARN-3670
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: webapp
Affects Versions: 2.6.0
 Environment: HDP 2.2
Reporter: Hari Sekhon
Priority: Minor


Request to add JobConf xml link for each application in the RM UI so I don't 
have to keep recovering it from HDFS to debug if job settings are taking effect.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3670) Add JobConf XML link in Yarn RM UI

2015-05-18 Thread Hari Sekhon (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14548039#comment-14548039
 ] 

Hari Sekhon commented on YARN-3670:
---

I understand (you can't imagine how often I've had to refrain from asking for 
Maps and Reduces to be available in the UI like trusty old MRv1). At the moment 
it's too opaque, which is a first-generation side effect of YARN being a 
general, abstracting resource manager.

However, perhaps it would be possible to create a generic mechanism that allows 
jobs to publish key-value pairs, which could be populated with either settings 
or counters and displayed via a new named link specified per set of 
key-value pairs; for example, a job could publish counters or configuration. In 
this way the MapReduce client, the Tez client, and any other YARN app 
would have the ability to publish an arbitrary key-value table of 
information to YARN for display in the UI.

This would help immensely with debugging YARN jobs.

 Add JobConf XML link in Yarn RM UI
 --

 Key: YARN-3670
 URL: https://issues.apache.org/jira/browse/YARN-3670
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: webapp
Affects Versions: 2.6.0
 Environment: HDP 2.2
Reporter: Hari Sekhon
Priority: Minor

 Request to add JobConf xml link for each application in the RM UI so I don't 
 have to keep recovering it from HDFS to debug if job settings are taking 
 effect.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3670) Add JobConf XML link in Yarn RM UI

2015-05-18 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14548020#comment-14548020
 ] 

Jason Lowe commented on YARN-3670:
--

This is not easily implementable in the general case, since not all YARN 
applications have a job conf that is an XML file and accessible via a link.  
While this is true for MapReduce jobs, YARN is not MapReduce-specific.  This is 
similar to asking for the number of maps and reduces to be available on the RM 
UI.


 Add JobConf XML link in Yarn RM UI
 --

 Key: YARN-3670
 URL: https://issues.apache.org/jira/browse/YARN-3670
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: webapp
Affects Versions: 2.6.0
 Environment: HDP 2.2
Reporter: Hari Sekhon
Priority: Minor

 Request to add JobConf xml link for each application in the RM UI so I don't 
 have to keep recovering it from HDFS to debug if job settings are taking 
 effect.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3411) [Storage implementation] explore the native HBase write schema for storage

2015-05-18 Thread Vrushali C (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vrushali C updated YARN-3411:
-
Attachment: YARN-3411-YARN-2928.004.patch


Uploading YARN-3411-YARN-2928.004.patch. The earlier patch had a line missing; 
I think it got deleted by mistake when I was looking through the patch file.

 [Storage implementation] explore the native HBase write schema for storage
 --

 Key: YARN-3411
 URL: https://issues.apache.org/jira/browse/YARN-3411
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Sangjin Lee
Assignee: Vrushali C
Priority: Critical
 Attachments: ATSv2BackendHBaseSchemaproposal.pdf, 
 YARN-3411-YARN-2928.001.patch, YARN-3411-YARN-2928.002.patch, 
 YARN-3411-YARN-2928.003.patch, YARN-3411-YARN-2928.004.patch, 
 YARN-3411.poc.2.txt, YARN-3411.poc.3.txt, YARN-3411.poc.4.txt, 
 YARN-3411.poc.5.txt, YARN-3411.poc.6.txt, YARN-3411.poc.7.txt, 
 YARN-3411.poc.txt


 There is work that's in progress to implement the storage based on a Phoenix 
 schema (YARN-3134).
 In parallel, we would like to explore an implementation based on a native 
 HBase schema for the write path. Such a schema does not exclude using 
 Phoenix, especially for reads and offline queries.
 Once we have basic implementations of both options, we could evaluate them in 
 terms of performance, scalability, usability, etc. and make a call.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3411) [Storage implementation] explore the native HBase write schema for storage

2015-05-18 Thread Vrushali C (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vrushali C updated YARN-3411:
-
Attachment: YARN-3411-YARN-2928.003.patch


Attaching new patch YARN-3411-YARN-2928.003.patch with code updated as per 
review suggestions.

 [Storage implementation] explore the native HBase write schema for storage
 --

 Key: YARN-3411
 URL: https://issues.apache.org/jira/browse/YARN-3411
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Sangjin Lee
Assignee: Vrushali C
Priority: Critical
 Attachments: ATSv2BackendHBaseSchemaproposal.pdf, 
 YARN-3411-YARN-2928.001.patch, YARN-3411-YARN-2928.002.patch, 
 YARN-3411-YARN-2928.003.patch, YARN-3411.poc.2.txt, YARN-3411.poc.3.txt, 
 YARN-3411.poc.4.txt, YARN-3411.poc.5.txt, YARN-3411.poc.6.txt, 
 YARN-3411.poc.7.txt, YARN-3411.poc.txt


 There is work that's in progress to implement the storage based on a Phoenix 
 schema (YARN-3134).
 In parallel, we would like to explore an implementation based on a native 
 HBase schema for the write path. Such a schema does not exclude using 
 Phoenix, especially for reads and offline queries.
 Once we have basic implementations of both options, we could evaluate them in 
 terms of performance, scalability, usability, etc. and make a call.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3411) [Storage implementation] explore the native HBase write schema for storage

2015-05-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14547730#comment-14547730
 ] 

Hadoop QA commented on YARN-3411:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  15m  5s | Pre-patch YARN-2928 compilation 
is healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:red}-1{color} | javac |   3m 18s | The patch appears to cause the 
build to fail. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12733480/YARN-3411-YARN-2928.003.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | YARN-2928 / 463e070 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7965/console |


This message was automatically generated.

 [Storage implementation] explore the native HBase write schema for storage
 --

 Key: YARN-3411
 URL: https://issues.apache.org/jira/browse/YARN-3411
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Sangjin Lee
Assignee: Vrushali C
Priority: Critical
 Attachments: ATSv2BackendHBaseSchemaproposal.pdf, 
 YARN-3411-YARN-2928.001.patch, YARN-3411-YARN-2928.002.patch, 
 YARN-3411-YARN-2928.003.patch, YARN-3411.poc.2.txt, YARN-3411.poc.3.txt, 
 YARN-3411.poc.4.txt, YARN-3411.poc.5.txt, YARN-3411.poc.6.txt, 
 YARN-3411.poc.7.txt, YARN-3411.poc.txt


 There is work that's in progress to implement the storage based on a Phoenix 
 schema (YARN-3134).
 In parallel, we would like to explore an implementation based on a native 
 HBase schema for the write path. Such a schema does not exclude using 
 Phoenix, especially for reads and offline queries.
 Once we have basic implementations of both options, we could evaluate them in 
 terms of performance, scalability, usability, etc. and make a call.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3645) ResourceManager can't start success if attribute value of aclSubmitApps is null in fair-scheduler.xml

2015-05-18 Thread Mohammad Shahid Khan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14547601#comment-14547601
 ] 

Mohammad Shahid Khan commented on YARN-3645:


Loading with an invalid node configuration is not feasible.
But instead of throwing a NullPointerException, we can throw an 
*AllocationConfigurationException* with a proper message so that the reason for 
the failure can be identified easily.
{code}
if ("aclAdministerApps".equals(field.getTagName())) {
  Text aclText = (Text)field.getFirstChild();
  if (aclText == null) {
    throw new AllocationConfigurationException(
        "Invalid admin ACL configuration in allocation file");
  }
  String text = ((Text)field.getFirstChild()).getData();
  acls.put(QueueACL.ADMINISTER_QUEUE, new AccessControlList(text));
}
{code}

 ResourceManager can't start success if  attribute value of aclSubmitApps is 
 null in fair-scheduler.xml
 

 Key: YARN-3645
 URL: https://issues.apache.org/jira/browse/YARN-3645
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.5.2
Reporter: zhoulinlin

 The aclSubmitApps is configured in fair-scheduler.xml like below:
 <queue name="mr">
   <aclSubmitApps></aclSubmitApps>
 </queue>
 The resourcemanager log:
 2015-05-14 12:59:48,623 INFO org.apache.hadoop.service.AbstractService: 
 Service ResourceManager failed in state INITED; cause: 
 org.apache.hadoop.service.ServiceStateException: java.io.IOException: Failed 
 to initialize FairScheduler
 org.apache.hadoop.service.ServiceStateException: java.io.IOException: Failed 
 to initialize FairScheduler
   at 
 org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
   at 
 org.apache.hadoop.service.AbstractService.init(AbstractService.java:172)
   at 
 org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:493)
   at 
 org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:920)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:240)
   at 
 org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1159)
 Caused by: java.io.IOException: Failed to initialize FairScheduler
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.initScheduler(FairScheduler.java:1301)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.serviceInit(FairScheduler.java:1318)
   at 
 org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
   ... 7 more
 Caused by: java.lang.NullPointerException
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.loadQueue(AllocationFileLoaderService.java:458)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.reloadAllocations(AllocationFileLoaderService.java:337)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.initScheduler(FairScheduler.java:1299)
   ... 9 more
 2015-05-14 12:59:48,623 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Transitioning 
 to standby state
 2015-05-14 12:59:48,623 INFO 
 com.zte.zdh.platformplugin.factory.YarnPlatformPluginProxyFactory: plugin 
 transitionToStandbyIn
 2015-05-14 12:59:48,623 WARN org.apache.hadoop.service.AbstractService: When 
 stopping the service ResourceManager : java.lang.NullPointerException
 java.lang.NullPointerException
   at 
 com.zte.zdh.platformplugin.factory.YarnPlatformPluginProxyFactory.transitionToStandbyIn(YarnPlatformPluginProxyFactory.java:71)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToStandby(ResourceManager.java:997)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStop(ResourceManager.java:1058)
   at 
 org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
   at 
 org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52)
   at 
 org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80)
   at 
 org.apache.hadoop.service.AbstractService.init(AbstractService.java:171)
   at 
 

[jira] [Commented] (YARN-126) yarn rmadmin help message contains reference to hadoop cli and JT

2015-05-18 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14547669#comment-14547669
 ] 

Akira AJISAKA commented on YARN-126:


Thanks [~rémy] for updating the patch. Some comments.

1. Would you fix TestGenericOptionsParser?
2. Would you fix the indent size? The indent size is 2 whitespaces.
3. Would you update CommandsManual.md as well?
4. I'm thinking we can remove the deprecated option from command-line help 
message.
{code}
+out.println("-jt <local|resourcemanager:port>    specify a ResourceManager (Deprecated)");
{code}

5. The following code can be removed.
{code}
+   conf.set("yarn.resourcemanager.address", "localhost:8032",
+       "from -rm command line option");
{code}

 yarn rmadmin help message contains reference to hadoop cli and JT
 -

 Key: YARN-126
 URL: https://issues.apache.org/jira/browse/YARN-126
 Project: Hadoop YARN
  Issue Type: Bug
  Components: client
Affects Versions: 2.0.3-alpha
Reporter: Thomas Graves
Assignee: Rémy SAISSY
  Labels: usability
 Attachments: YARN-126.002.patch, YARN-126.patch


 has option to specify a job tracker and the last line for general command 
 line syntax had bin/hadoop command [genericOptions] [commandOptions]
 ran yarn rmadmin to get usage:
 RMAdmin
 Usage: java RMAdmin
[-refreshQueues]
[-refreshNodes]
[-refreshUserToGroupsMappings]
[-refreshSuperUserGroupsConfiguration]
[-refreshAdminAcls]
[-refreshServiceAcl]
[-help [cmd]]
 Generic options supported are
 -conf <configuration file>     specify an application configuration file
 -D <property=value>            use value for given property
 -fs <local|namenode:port>      specify a namenode
 -jt <local|jobtracker:port>    specify a job tracker
 -files <comma separated list of files>    specify comma separated files to be 
 copied to the map reduce cluster
 -libjars <comma separated list of jars>    specify comma separated jar files 
 to include in the classpath.
 -archives <comma separated list of archives>    specify comma separated 
 archives to be unarchived on the compute machines.
 The general command line syntax is
 bin/hadoop command [genericOptions] [commandOptions]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3655) FairScheduler: potential livelock due to maxAMShare limitation and container reservation

2015-05-18 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated YARN-3655:

Attachment: YARN-3655.001.patch

 FairScheduler: potential livelock due to maxAMShare limitation and container 
 reservation 
 -

 Key: YARN-3655
 URL: https://issues.apache.org/jira/browse/YARN-3655
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.7.0
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: YARN-3655.000.patch, YARN-3655.001.patch


 FairScheduler: potential livelock due to maxAMShare limitation and container 
 reservation.
 If a node is reserved by an application, all the other applications don't 
 have any chance to assign a new container on this node, unless the 
 application which reserves the node assigns a new container on this node or 
 releases the reserved container on this node.
 The problem is that if an application tries to call assignReservedContainer and 
 fails to get a new container due to the maxAMShare limitation, it will block all 
 other applications from using the nodes it reserves. If all other running 
 applications can't release their AM containers because they are blocked by these 
 reserved containers, a livelock situation can happen.
 The following is the code at FSAppAttempt#assignContainer which can cause 
 this potential livelock.
 {code}
 // Check the AM resource usage for the leaf queue
 if (!isAmRunning() && !getUnmanagedAM()) {
   List<ResourceRequest> ask = appSchedulingInfo.getAllResourceRequests();
   if (ask.isEmpty() || !getQueue().canRunAppAM(
       ask.get(0).getCapability())) {
     if (LOG.isDebugEnabled()) {
       LOG.debug("Skipping allocation because maxAMShare limit would " +
           "be exceeded");
     }
     return Resources.none();
   }
 }
 {code}
 To fix this issue, we can unreserve the node if we can't allocate the AM 
 container on the node due to Max AM share limitation and the node is reserved 
 by the application.
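
A hedged sketch of the proposed change (method names are assumed; the actual fix is in the attached patches): when the maxAMShare check blocks the AM container and this application holds the reservation on the node, drop the reservation so other applications can use the node.
{code}
// Sketch only; method names are assumed, see the attached patch for the real fix.
if (ask.isEmpty() || !getQueue().canRunAppAM(ask.get(0).getCapability())) {
  if (LOG.isDebugEnabled()) {
    LOG.debug("Skipping allocation because maxAMShare limit would be exceeded");
  }
  // Proposed addition: release this application's own reservation on the node
  // instead of leaving the node blocked for every other application.
  RMContainer reserved = node.getReservedContainer();
  if (reserved != null
      && reserved.getApplicationAttemptId().equals(getApplicationAttemptId())) {
    unreserve(reserved.getReservedPriority(), node);
  }
  return Resources.none();
}
{code}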



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3655) FairScheduler: potential livelock due to maxAMShare limitation and container reservation

2015-05-18 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14547687#comment-14547687
 ] 

zhihai xu commented on YARN-3655:
-

I uploaded a new patch YARN-3655.001.patch, which added a test case to verify 
this fix. Without the fix, the test will fail.

 FairScheduler: potential livelock due to maxAMShare limitation and container 
 reservation 
 -

 Key: YARN-3655
 URL: https://issues.apache.org/jira/browse/YARN-3655
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.7.0
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: YARN-3655.000.patch, YARN-3655.001.patch


 FairScheduler: potential livelock due to maxAMShare limitation and container 
 reservation.
 If a node is reserved by an application, all the other applications don't 
 have any chance to assign a new container on this node, unless the 
 application which reserves the node assigns a new container on this node or 
 releases the reserved container on this node.
 The problem is that if an application tries to call assignReservedContainer and 
 fails to get a new container due to the maxAMShare limitation, it will block all 
 other applications from using the nodes it reserves. If all other running 
 applications can't release their AM containers because they are blocked by these 
 reserved containers, a livelock situation can happen.
 The following is the code at FSAppAttempt#assignContainer which can cause 
 this potential livelock.
 {code}
 // Check the AM resource usage for the leaf queue
 if (!isAmRunning() && !getUnmanagedAM()) {
   List<ResourceRequest> ask = appSchedulingInfo.getAllResourceRequests();
   if (ask.isEmpty() || !getQueue().canRunAppAM(
       ask.get(0).getCapability())) {
     if (LOG.isDebugEnabled()) {
       LOG.debug("Skipping allocation because maxAMShare limit would " +
           "be exceeded");
     }
     return Resources.none();
   }
 }
 {code}
 To fix this issue, we can unreserve the node if we can't allocate the AM 
 container on the node due to Max AM share limitation and the node is reserved 
 by the application.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3583) Support of NodeLabel object instead of plain String in YarnClient side.

2015-05-18 Thread Sunil G (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil G updated YARN-3583:
--
Attachment: 0003-YARN-3583.patch

Thank you [~leftnoteasy] for the comments.
Updated a patch addressing the comments.

bq. I think it's better to add it separately to make sure it will be tested.
In TestClientRMServices we have a test case, and I also added a new test case in 
TestYarnClient for GetNodeLabels which will check the NodeIdToLabelsInfo. 
In TestPBImplsRecords, we only have test cases for Request and Response PB 
objects, and the inner values are validated there. Here we need to test an 
inner object of a ResponsePBImpl, and that is covered by the above tests. Please 
let me know if this is fine.

 Support of NodeLabel object instead of plain String in YarnClient side.
 ---

 Key: YARN-3583
 URL: https://issues.apache.org/jira/browse/YARN-3583
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: client
Affects Versions: 2.6.0
Reporter: Sunil G
Assignee: Sunil G
 Attachments: 0001-YARN-3583.patch, 0002-YARN-3583.patch, 
 0003-YARN-3583.patch


 Similar to YARN-3521, use NodeLabel objects in YarnClient side apis.
 getLabelsToNodes/getNodeToLabels api's can use NodeLabel object instead of 
 using plain label name.
 This will help to bring other label details such as Exclusivity to client 
 side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2336) Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree

2015-05-18 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated YARN-2336:

Attachment: YARN-2336.009.patch

Thanks [~ozawa] for comments! Updated the patch.

 Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree
 --

 Key: YARN-2336
 URL: https://issues.apache.org/jira/browse/YARN-2336
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.4.1, 2.6.0
Reporter: Kenji Kikushima
Assignee: Akira AJISAKA
  Labels: BB2015-05-RFC
 Attachments: YARN-2336-2.patch, YARN-2336-3.patch, YARN-2336-4.patch, 
 YARN-2336.005.patch, YARN-2336.007.patch, YARN-2336.008.patch, 
 YARN-2336.009.patch, YARN-2336.patch


 When we have sub queues in Fair Scheduler, REST api returns a missing '[' 
 bracket JSON for childQueues.
 This issue was found by [~ajisakaa] at YARN-1050.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3468) NM should not blindly rename usercache/filecache/nmPrivate on restart

2015-05-18 Thread Siqi Li (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14548239#comment-14548239
 ] 

Siqi Li commented on YARN-3468:
---

Agreed. Setting yarn.nodemanager.delete.debug-delay-sec to 10 minutes is a bad 
idea in a production cluster.
I will mark this JIRA as Won't Fix.

 NM should not blindly rename usercache/filecache/nmPrivate on restart
 -

 Key: YARN-3468
 URL: https://issues.apache.org/jira/browse/YARN-3468
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Siqi Li
Assignee: Siqi Li
 Attachments: YARN-3468.v1.patch, YARN-3468.v2.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-3468) NM should not blindly rename usercache/filecache/nmPrivate on restart

2015-05-18 Thread Siqi Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siqi Li resolved YARN-3468.
---
Resolution: Won't Fix

 NM should not blindly rename usercache/filecache/nmPrivate on restart
 -

 Key: YARN-3468
 URL: https://issues.apache.org/jira/browse/YARN-3468
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Siqi Li
Assignee: Siqi Li
 Attachments: YARN-3468.v1.patch, YARN-3468.v2.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2876) In Fair Scheduler, JMX and Scheduler UI display wrong maxResource info for subqueues

2015-05-18 Thread Siqi Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siqi Li updated YARN-2876:
--
Attachment: YARN-2876.v2.patch

 In Fair Scheduler, JMX and Scheduler UI display wrong maxResource info for 
 subqueues
 

 Key: YARN-2876
 URL: https://issues.apache.org/jira/browse/YARN-2876
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Siqi Li
Assignee: Siqi Li
 Attachments: YARN-2876.v1.patch, YARN-2876.v2.patch, screenshot-1.png


 If a subqueue doesn't have a maxResource set in fair-scheduler.xml, JMX and 
 Scheduler UI will display the entire cluster capacity as its maxResource 
 instead of its parent queue's maxResource.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-1945) Adding description for each pool in Fair Scheduler Page from fair-scheduler.xml

2015-05-18 Thread Siqi Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siqi Li updated YARN-1945:
--
Attachment: YARN-1945.v6.patch

 Adding description for each pool in Fair Scheduler Page from 
 fair-scheduler.xml
 ---

 Key: YARN-1945
 URL: https://issues.apache.org/jira/browse/YARN-1945
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: fairscheduler
Affects Versions: 2.3.0
Reporter: Siqi Li
Assignee: Siqi Li
 Attachments: YARN-1945.v2.patch, YARN-1945.v3.patch, 
 YARN-1945.v4.patch, YARN-1945.v5.patch, YARN-1945.v6.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2876) In Fair Scheduler, JMX and Scheduler UI display wrong maxResource info for subqueues

2015-05-18 Thread Siqi Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siqi Li updated YARN-2876:
--
Attachment: YARN-2876.v2.patch

 In Fair Scheduler, JMX and Scheduler UI display wrong maxResource info for 
 subqueues
 

 Key: YARN-2876
 URL: https://issues.apache.org/jira/browse/YARN-2876
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Siqi Li
Assignee: Siqi Li
 Attachments: YARN-2876.v1.patch, YARN-2876.v2.patch, screenshot-1.png


 If a subqueue doesn't have a maxResource set in fair-scheduler.xml, JMX and 
 Scheduler UI will display the entire cluster capacity as its maxResource 
 instead of its parent queue's maxResource.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3411) [Storage implementation] explore the native HBase write schema for storage

2015-05-18 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14548335#comment-14548335
 ] 

Junping Du commented on YARN-3411:
--

bq. No, we will never drop the last value. MIN_VERSIONS and TTL are set such 
that last value is always retained. I am setting the MAX_VERSIONS for now to 
200, but we can revisit this when we determine how exactly the timeseries data 
is going to be handled. And of course it can be made configurable.
I meant the earliest (oldest) value, not the latest. I agree that we can revisit 
the value later for the other cases I mentioned above, but I just want to double 
check that we don't have other options, i.e. storing time series data as different 
rows or columns rather than as different timestamps/versions here.

bq.  Wondering why we would aggregate data in one timeseries for one metric 
over time?
That's because the interval of interest (presented to the end user) is not always 
the same as the interval at which timeline metrics data is gathered. Let's say we 
receive container metrics data from the NodeManager every second, but the 
aggregated data the user is interested in is per minute; then we need to aggregate 
60 seconds of data for a single metric. Make sense?
 
Thanks for updating the patch. I just quickly checked the latest patch; a few 
comments so far:
1. It sounds like we don't leverage HBase's single-row transaction feature, as we 
are updating different column families (events, configurations, metrics, etc.) 
separately. Do we need to make sure the data in each row gets updated consistently?

2. We shouldn't swallow exceptions when updating data in HBase; just log.error() 
may not be enough.

3. We need to check for null when writing a TimelineEntity to HBase, as a 
TimelineEntity could include null events/configurations/metrics, which could make 
a later foreach throw an NPE.

Comments with more details could come later.
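
To illustrate points 2 and 3, here is a small hedged sketch of a null-guarded column-family write that lets the exception propagate, assuming the HBase 1.x client API; the table handle, family name and row-key layout are assumptions, not the schema from the attached proposal:
{code}
import java.io.IOException;
import java.util.Map;

import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

// Sketch only; family/row-key layout is illustrative, not the proposed schema.
public class EntityConfigWriterSketch {
  private static final byte[] CONFIG_FAMILY = Bytes.toBytes("c");

  static void writeConfigs(Table table, byte[] rowKey,
      Map<String, String> configs) throws IOException {
    if (configs == null || configs.isEmpty()) {
      return;                        // point 3: entity may carry no configurations
    }
    Put put = new Put(rowKey);
    for (Map.Entry<String, String> e : configs.entrySet()) {
      put.addColumn(CONFIG_FAMILY, Bytes.toBytes(e.getKey()),
          Bytes.toBytes(e.getValue()));
    }
    table.put(put);                  // point 2: let the IOException propagate
  }
}
{code}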

 [Storage implementation] explore the native HBase write schema for storage
 --

 Key: YARN-3411
 URL: https://issues.apache.org/jira/browse/YARN-3411
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Sangjin Lee
Assignee: Vrushali C
Priority: Critical
 Attachments: ATSv2BackendHBaseSchemaproposal.pdf, 
 YARN-3411-YARN-2928.001.patch, YARN-3411-YARN-2928.002.patch, 
 YARN-3411-YARN-2928.003.patch, YARN-3411-YARN-2928.004.patch, 
 YARN-3411.poc.2.txt, YARN-3411.poc.3.txt, YARN-3411.poc.4.txt, 
 YARN-3411.poc.5.txt, YARN-3411.poc.6.txt, YARN-3411.poc.7.txt, 
 YARN-3411.poc.txt


 There is work that's in progress to implement the storage based on a Phoenix 
 schema (YARN-3134).
 In parallel, we would like to explore an implementation based on a native 
 HBase schema for the write path. Such a schema does not exclude using 
 Phoenix, especially for reads and offline queries.
 Once we have basic implementations of both options, we could evaluate them in 
 terms of performance, scalability, usability, etc. and make a call.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3605) _ as method name may not be supported much longer

2015-05-18 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated YARN-3605:
---
Hadoop Flags: Incompatible change

 _ as method name may not be supported much longer
 -

 Key: YARN-3605
 URL: https://issues.apache.org/jira/browse/YARN-3605
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Robert Joseph Evans

 I was trying to run the precommit test on my mac under JDK8, and I got the 
 following error related to javadocs.
  
  (use of '_' as an identifier might not be supported in releases after Java 
 SE 8)
 It looks like we need to at least change the method name to not be '_' any 
 more, or possibly replace the HTML generation with something more standard. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3579) CommonNodeLabelsManager should support NodeLabel instead of string label name when getting node-to-label/label-to-label mappings

2015-05-18 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-3579:
-
Fix Version/s: 2.8.0

 CommonNodeLabelsManager should support NodeLabel instead of string label name 
 when getting node-to-label/label-to-label mappings
 

 Key: YARN-3579
 URL: https://issues.apache.org/jira/browse/YARN-3579
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Sunil G
Assignee: Sunil G
Priority: Minor
 Fix For: 2.8.0

 Attachments: 0001-YARN-3579.patch, 0002-YARN-3579.patch, 
 0003-YARN-3579.patch, 0004-YARN-3579.patch


 CommonNodeLabelsManager#getLabelsToNodes returns the label name as a string. 
 It does not pass information such as exclusivity back to the REST interface 
 APIs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3669) Attempt-failures validity interval should have a global admin configurable lower limit

2015-05-18 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-3669:

Labels: newbie  (was: )

 Attempt-failures validity interval should have a global admin configurable 
 lower limit
 ---

 Key: YARN-3669
 URL: https://issues.apache.org/jira/browse/YARN-3669
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
  Labels: newbie

 Found this while reviewing YARN-3480.
 bq. When 'attemptFailuresValidityInterval'(introduced in YARN-611) is set to 
 a small value, retried attempts might be very large. So we need to delete 
 some attempts stored in RMStateStore and RMStateStore.
 I think we need to have a lower limit on the failure-validity interval to 
 avoid situations like this.
 Having this will avoid pardoning too many failures in too short a duration.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-1735) For FairScheduler AvailableMB in QueueMetrics is the same as AllocateMB

2015-05-18 Thread Siqi Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siqi Li updated YARN-1735:
--
Attachment: YARN-1735.v2.patch

 For FairScheduler AvailableMB in QueueMetrics is the same as AllocateMB
 ---

 Key: YARN-1735
 URL: https://issues.apache.org/jira/browse/YARN-1735
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Reporter: Siqi Li
 Attachments: YARN-1735.v1.patch, YARN-1735.v2.patch


 In monitoring graphs, the AvailableMB of each queue regularly spikes between 
 the AllocatedMB and the entire cluster capacity.
 This cannot be correct, since AvailableMB should never be more than the 
 queue's max allocation. The spikes are quite confusing, since availableMB is 
 set to the fair share of each queue and the fair share of each queue is 
 bounded by its allowed max resource.
 Other than the spiking, availableMB is always equal to allocatedMB. I think 
 this is not very useful; availableMB for each queue should be its allowed max 
 resource minus allocatedMB.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2876) In Fair Scheduler, JMX and Scheduler UI display wrong maxResource info for subqueues

2015-05-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14548350#comment-14548350
 ] 

Hadoop QA commented on YARN-2876:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 33s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:red}-1{color} | javac |   3m 26s | The patch appears to cause the 
build to fail. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12733579/YARN-2876.v2.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 060c84e |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7972/console |


This message was automatically generated.

 In Fair Scheduler, JMX and Scheduler UI display wrong maxResource info for 
 subqueues
 

 Key: YARN-2876
 URL: https://issues.apache.org/jira/browse/YARN-2876
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Siqi Li
Assignee: Siqi Li
 Attachments: YARN-2876.v1.patch, YARN-2876.v2.patch, screenshot-1.png


 If a subqueue doesn't have a maxResource set in fair-scheduler.xml, JMX and 
 Scheduler UI will display the entire cluster capacity as its maxResource 
 instead of its parent queue's maxResource.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-41) The RM should handle the graceful shutdown of the NM.

2015-05-18 Thread Devaraj K (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-41?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj K updated YARN-41:
--
Labels:   (was: BB2015-05-TBR)

 The RM should handle the graceful shutdown of the NM.
 -

 Key: YARN-41
 URL: https://issues.apache.org/jira/browse/YARN-41
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: nodemanager, resourcemanager
Reporter: Ravi Teja Ch N V
Assignee: Devaraj K
 Attachments: MAPREDUCE-3494.1.patch, MAPREDUCE-3494.2.patch, 
 MAPREDUCE-3494.patch, YARN-41-1.patch, YARN-41-2.patch, YARN-41-3.patch, 
 YARN-41-4.patch, YARN-41.patch


 Instead of waiting for the NM expiry, the RM should remove and handle the NM 
 which is shut down gracefully.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2336) Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree

2015-05-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14548410#comment-14548410
 ] 

Hadoop QA commented on YARN-2336:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  18m 14s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 2 new or modified test files. |
| {color:green}+1{color} | javac |   7m 41s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 54s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | site |   2m 59s | Site still builds. |
| {color:green}+1{color} | checkstyle |   0m 34s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  2s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 34s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   1m 16s | The patch appears to introduce 1 
new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | yarn tests |  52m  5s | Tests failed in 
hadoop-yarn-server-resourcemanager. |
| | |  95m 18s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-yarn-server-resourcemanager |
|  |  Inconsistent synchronization of 
org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS;
 locked 66% of time  Unsynchronized access at FileSystemRMStateStore.java:66% 
of time  Unsynchronized access at FileSystemRMStateStore.java:[line 156] |
| Failed unit tests | 
hadoop.yarn.server.resourcemanager.scheduler.fair.TestAllocationFileLoaderService
 |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12733572/YARN-2336.009.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle site |
| git revision | trunk / bcc1786 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-YARN-Build/7969/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7969/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7969/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7969/console |


This message was automatically generated.

 Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree
 --

 Key: YARN-2336
 URL: https://issues.apache.org/jira/browse/YARN-2336
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.4.1, 2.6.0
Reporter: Kenji Kikushima
Assignee: Akira AJISAKA
  Labels: BB2015-05-RFC
 Attachments: YARN-2336-2.patch, YARN-2336-3.patch, YARN-2336-4.patch, 
 YARN-2336.005.patch, YARN-2336.007.patch, YARN-2336.008.patch, 
 YARN-2336.009.patch, YARN-2336.patch


 When we have sub-queues in the Fair Scheduler, the REST API returns JSON with 
 a missing '[' bracket for childQueues.
 This issue was found by [~ajisakaa] at YARN-1050.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1945) Adding description for each pool in Fair Scheduler Page from fair-scheduler.xml

2015-05-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14548356#comment-14548356
 ] 

Hadoop QA commented on YARN-1945:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 41s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:red}-1{color} | javac |   3m 19s | The patch appears to cause the 
build to fail. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12733578/YARN-1945.v6.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 060c84e |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7973/console |


This message was automatically generated.

 Adding description for each pool in Fair Scheduler Page from 
 fair-scheduler.xml
 ---

 Key: YARN-1945
 URL: https://issues.apache.org/jira/browse/YARN-1945
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: fairscheduler
Affects Versions: 2.3.0
Reporter: Siqi Li
Assignee: Siqi Li
 Attachments: YARN-1945.v2.patch, YARN-1945.v3.patch, 
 YARN-1945.v4.patch, YARN-1945.v5.patch, YARN-1945.v6.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3069) Document missing properties in yarn-default.xml

2015-05-18 Thread Ray Chiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ray Chiang updated YARN-3069:
-
Attachment: YARN-3069.009.patch

- Move yarn.client.app-submission.poll-interval to DeprecatedProperties.md
- Add new property yarn.application.classpath.prepend.distcache to 
yarn-default.xml
- Update property descriptions and values based on Akira's feedback

 Document missing properties in yarn-default.xml
 ---

 Key: YARN-3069
 URL: https://issues.apache.org/jira/browse/YARN-3069
 Project: Hadoop YARN
  Issue Type: Bug
  Components: documentation
Reporter: Ray Chiang
Assignee: Ray Chiang
  Labels: BB2015-05-TBR, supportability
 Attachments: YARN-3069.001.patch, YARN-3069.002.patch, 
 YARN-3069.003.patch, YARN-3069.004.patch, YARN-3069.005.patch, 
 YARN-3069.006.patch, YARN-3069.007.patch, YARN-3069.008.patch, 
 YARN-3069.009.patch


 The following properties are currently not defined in yarn-default.xml.  
 These properties should either be
   A) documented in yarn-default.xml OR
   B)  listed as an exception (with comments, e.g. for internal use) in the 
 TestYarnConfigurationFields unit test
 Any comments for any of the properties below are welcome.
   org.apache.hadoop.yarn.server.sharedcachemanager.RemoteAppChecker
   org.apache.hadoop.yarn.server.sharedcachemanager.store.InMemorySCMStore
   security.applicationhistory.protocol.acl
   yarn.app.container.log.backups
   yarn.app.container.log.dir
   yarn.app.container.log.filesize
   yarn.client.app-submission.poll-interval
   yarn.client.application-client-protocol.poll-timeout-ms
   yarn.is.minicluster
   yarn.log.server.url
   yarn.minicluster.control-resource-monitoring
   yarn.minicluster.fixed.ports
   yarn.minicluster.use-rpc
   yarn.node-labels.fs-store.retry-policy-spec
   yarn.node-labels.fs-store.root-dir
   yarn.node-labels.manager-class
   yarn.nodemanager.container-executor.os.sched.priority.adjustment
   yarn.nodemanager.container-monitor.process-tree.class
   yarn.nodemanager.disk-health-checker.enable
   yarn.nodemanager.docker-container-executor.image-name
   yarn.nodemanager.linux-container-executor.cgroups.delete-timeout-ms
   yarn.nodemanager.linux-container-executor.group
   yarn.nodemanager.log.deletion-threads-count
   yarn.nodemanager.user-home-dir
   yarn.nodemanager.webapp.https.address
   yarn.nodemanager.webapp.spnego-keytab-file
   yarn.nodemanager.webapp.spnego-principal
   yarn.nodemanager.windows-secure-container-executor.group
   yarn.resourcemanager.configuration.file-system-based-store
   yarn.resourcemanager.delegation-token-renewer.thread-count
   yarn.resourcemanager.delegation.key.update-interval
   yarn.resourcemanager.delegation.token.max-lifetime
   yarn.resourcemanager.delegation.token.renew-interval
   yarn.resourcemanager.history-writer.multi-threaded-dispatcher.pool-size
   yarn.resourcemanager.metrics.runtime.buckets
   yarn.resourcemanager.nm-tokens.master-key-rolling-interval-secs
   yarn.resourcemanager.reservation-system.class
   yarn.resourcemanager.reservation-system.enable
   yarn.resourcemanager.reservation-system.plan.follower
   yarn.resourcemanager.reservation-system.planfollower.time-step
   yarn.resourcemanager.rm.container-allocation.expiry-interval-ms
   yarn.resourcemanager.webapp.spnego-keytab-file
   yarn.resourcemanager.webapp.spnego-principal
   yarn.scheduler.include-port-in-node-name
   yarn.timeline-service.delegation.key.update-interval
   yarn.timeline-service.delegation.token.max-lifetime
   yarn.timeline-service.delegation.token.renew-interval
   yarn.timeline-service.generic-application-history.enabled
   
 yarn.timeline-service.generic-application-history.fs-history-store.compression-type
   yarn.timeline-service.generic-application-history.fs-history-store.uri
   yarn.timeline-service.generic-application-history.store-class
   yarn.timeline-service.http-cross-origin.enabled
   yarn.tracking.url.generator



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2876) In Fair Scheduler, JMX and Scheduler UI display wrong maxResource info for subqueues

2015-05-18 Thread Siqi Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siqi Li updated YARN-2876:
--
Attachment: (was: YARN-2876.v2.patch)

 In Fair Scheduler, JMX and Scheduler UI display wrong maxResource info for 
 subqueues
 

 Key: YARN-2876
 URL: https://issues.apache.org/jira/browse/YARN-2876
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Siqi Li
Assignee: Siqi Li
 Attachments: YARN-2876.v1.patch, YARN-2876.v2.patch, screenshot-1.png


 If a subqueue doesn't have a maxResource set in fair-scheduler.xml, JMX and 
 Scheduler UI will display the entire cluster capacity as its maxResource 
 instead of its parent queue's maxResource.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3605) _ as method name may not be supported much longer

2015-05-18 Thread Robert Joseph Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14548375#comment-14548375
 ] 

Robert Joseph Evans commented on YARN-3605:
---

This is not a newbie issue.  The code that has the _ method in it is generated 
code, and the code that generates it is far from simple.  This is also 
technically a backwards incompatible change, because other YARN applications 
could be using it.

 _ as method name may not be supported much longer
 -

 Key: YARN-3605
 URL: https://issues.apache.org/jira/browse/YARN-3605
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Robert Joseph Evans

 I was trying to run the precommit test on my mac under JDK8, and I got the 
 following error related to javadocs.
  
  (use of '_' as an identifier might not be supported in releases after Java 
 SE 8)
 It looks like we need to at least change the method name to not be '_' any 
 more, or possibly replace the HTML generation with something more standard. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3605) _ as method name may not be supported much longer

2015-05-18 Thread Robert Joseph Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Joseph Evans updated YARN-3605:
--
Labels:   (was: newbie)

 _ as method name may not be supported much longer
 -

 Key: YARN-3605
 URL: https://issues.apache.org/jira/browse/YARN-3605
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Robert Joseph Evans

 I was trying to run the precommit test on my mac under JDK8, and I got the 
 following error related to javadocs.
  
  (use of '_' as an identifier might not be supported in releases after Java 
 SE 8)
 It looks like we need to at least change the method name to not be '_' any 
 more, or possibly replace the HTML generation with something more standard. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3411) [Storage implementation] explore the native HBase write schema for storage

2015-05-18 Thread Vrushali C (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14548379#comment-14548379
 ] 

Vrushali C commented on YARN-3411:
--

Hi [~djp]

Thanks for the initial quick feedback! Some responses below:

bq. Do we need to make sure the data in each row gets updated consistently?
I was thinking it is not necessary, since the entity information arrives in a 
streaming fashion, one update at a time anyway. If, say, one column is written 
and another is not, the caller can simply retry; the HBase put will overwrite 
the existing value. 
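
To illustrate the retry argument, a minimal sketch (not the patch code): 
re-issuing the same Put after a partial failure simply overwrites whatever 
columns were already written, so no multi-column transaction is required.

{code:java}
import java.io.IOException;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;

public class RetryPutSketch {
  /** Re-issue the whole Put on failure; puts overwrite, so retrying after a partial write is safe. */
  static void putWithRetry(Table table, Put put, int maxAttempts) throws IOException {
    IOException last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        table.put(put);  // columns already written by an earlier attempt are simply overwritten
        return;
      } catch (IOException e) {
        last = e;        // remember (and log) the error instead of swallowing it, then retry
      }
    }
    throw last;
  }
}
{code}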

bq. We shouldn't swallow exceptions when writing data to HBase; just calling 
log.error() may not be enough.
Okay, let me look through and modify that.

bq. We need to check for null when writing a TimelineEntity to HBase, as a 
TimelineEntity could include null events/configurations/metrics, which could 
make a later foreach throw an NPE.
I have added some null checks; I will go over the code again and make sure 
there are null checks for entity class members like configurations, metrics, 
etc.
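
As a generic sketch of the kind of guard being discussed (illustrative only; 
the actual TimelineEntity accessors may differ), one option is a small helper 
that turns a possibly-null member into an empty collection before the foreach:

{code:java}
import java.util.Collection;
import java.util.Collections;
import java.util.Map;

public class NullSafeIterationSketch {
  /** Return the collection itself, or an empty one when it is null, so callers can always foreach. */
  static <T> Collection<T> orEmpty(Collection<T> c) {
    return c == null ? Collections.<T>emptyList() : c;
  }

  static <K, V> Map<K, V> orEmpty(Map<K, V> m) {
    return m == null ? Collections.<K, V>emptyMap() : m;
  }

  public static void main(String[] args) {
    Collection<String> metrics = null;           // stands in for entity.getMetrics()
    for (String metric : orEmpty(metrics)) {     // no NPE even though the member is null
      System.out.println(metric);
    }
  }
}
{code}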



 [Storage implementation] explore the native HBase write schema for storage
 --

 Key: YARN-3411
 URL: https://issues.apache.org/jira/browse/YARN-3411
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Sangjin Lee
Assignee: Vrushali C
Priority: Critical
 Attachments: ATSv2BackendHBaseSchemaproposal.pdf, 
 YARN-3411-YARN-2928.001.patch, YARN-3411-YARN-2928.002.patch, 
 YARN-3411-YARN-2928.003.patch, YARN-3411-YARN-2928.004.patch, 
 YARN-3411.poc.2.txt, YARN-3411.poc.3.txt, YARN-3411.poc.4.txt, 
 YARN-3411.poc.5.txt, YARN-3411.poc.6.txt, YARN-3411.poc.7.txt, 
 YARN-3411.poc.txt


 There is work that's in progress to implement the storage based on a Phoenix 
 schema (YARN-3134).
 In parallel, we would like to explore an implementation based on a native 
 HBase schema for the write path. Such a schema does not exclude using 
 Phoenix, especially for reads and offline queries.
 Once we have basic implementations of both options, we could evaluate them in 
 terms of performance, scalability, usability, etc. and make a call.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1735) For FairScheduler AvailableMB in QueueMetrics is the same as AllocateMB

2015-05-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14548378#comment-14548378
 ] 

Hadoop QA commented on YARN-1735:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 35s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:red}-1{color} | javac |   3m 18s | The patch appears to cause the 
build to fail. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12733581/YARN-1735.v2.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 060c84e |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7974/console |


This message was automatically generated.

 For FairScheduler AvailableMB in QueueMetrics is the same as AllocateMB
 ---

 Key: YARN-1735
 URL: https://issues.apache.org/jira/browse/YARN-1735
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Reporter: Siqi Li
 Attachments: YARN-1735.v1.patch, YARN-1735.v2.patch


 In monitoring graphs, the AvailableMB of each queue regularly spikes between 
 the AllocatedMB and the entire cluster capacity.
 This cannot be correct, since AvailableMB should never be more than the 
 queue's max allocation. The spikes are quite confusing, since availableMB is 
 set to the fair share of each queue and the fair share of each queue is 
 bounded by its allowed max resource.
 Other than the spiking, availableMB is always equal to allocatedMB. I think 
 this is not very useful; availableMB for each queue should be its allowed max 
 resource minus allocatedMB.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3411) [Storage implementation] explore the native HBase write schema for storage

2015-05-18 Thread Vrushali C (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14548383#comment-14548383
 ] 

Vrushali C commented on YARN-3411:
--


The patch has an overall -1 due to a couple of javadoc warnings. Will fix those 
in the next patch. 

 [Storage implementation] explore the native HBase write schema for storage
 --

 Key: YARN-3411
 URL: https://issues.apache.org/jira/browse/YARN-3411
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Sangjin Lee
Assignee: Vrushali C
Priority: Critical
 Attachments: ATSv2BackendHBaseSchemaproposal.pdf, 
 YARN-3411-YARN-2928.001.patch, YARN-3411-YARN-2928.002.patch, 
 YARN-3411-YARN-2928.003.patch, YARN-3411-YARN-2928.004.patch, 
 YARN-3411.poc.2.txt, YARN-3411.poc.3.txt, YARN-3411.poc.4.txt, 
 YARN-3411.poc.5.txt, YARN-3411.poc.6.txt, YARN-3411.poc.7.txt, 
 YARN-3411.poc.txt


 There is work that's in progress to implement the storage based on a Phoenix 
 schema (YARN-3134).
 In parallel, we would like to explore an implementation based on a native 
 HBase schema for the write path. Such a schema does not exclude using 
 Phoenix, especially for reads and offline queries.
 Once we have basic implementations of both options, we could evaluate them in 
 terms of performance, scalability, usability, etc. and make a call.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1896) For FairScheduler expose MinimumQueueResource of each queue in QueueMetrics

2015-05-18 Thread Siqi Li (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14548265#comment-14548265
 ] 

Siqi Li commented on YARN-1896:
---

As I said in the description, it would be good to have MinimumQueueResource and 
MaximumQueueResource exposed through QueueMetrics. By doing this, we can see 
not only the current usage of a queue but also its resource limits.
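
For illustration, a rough sketch of how such limits could be surfaced through 
the metrics2 framework (the field and method names here are hypothetical, not 
the actual QueueMetrics change); the annotated gauges are instantiated when the 
source is registered with the metrics system:

{code:java}
import org.apache.hadoop.metrics2.annotation.Metric;
import org.apache.hadoop.metrics2.annotation.Metrics;
import org.apache.hadoop.metrics2.lib.MutableGaugeLong;

@Metrics(context = "yarn")
public class QueueLimitMetricsSketch {
  @Metric("Configured minimum (guaranteed) memory of the queue in MB")
  MutableGaugeLong minQueueMB;

  @Metric("Configured maximum memory of the queue in MB")
  MutableGaugeLong maxQueueMB;

  /** Hypothetical hook: called whenever the scheduler (re)loads this queue's limits. */
  public void setQueueLimits(long minMB, long maxMB) {
    minQueueMB.set(minMB);
    maxQueueMB.set(maxMB);
  }
}
{code}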

 For FairScheduler expose MinimumQueueResource of each queue in QueueMetrics
 ---

 Key: YARN-1896
 URL: https://issues.apache.org/jira/browse/YARN-1896
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Reporter: Siqi Li
Assignee: Siqi Li
 Attachments: YARN-1896.v1.patch, YARN-1896.v2.patch


 For FairScheduler, it's very useful to expose the MinimumQueueResource and 
 MaximumQueueResource of each queue in QueueMetrics. That way, people can use 
 monitoring graphs to see both their current usage and their limits.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2876) In Fair Scheduler, JMX and Scheduler UI display wrong maxResource info for subqueues

2015-05-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14548271#comment-14548271
 ] 

Hadoop QA commented on YARN-2876:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12733577/YARN-2876.v2.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / bcc1786 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7971/console |


This message was automatically generated.

 In Fair Scheduler, JMX and Scheduler UI display wrong maxResource info for 
 subqueues
 

 Key: YARN-2876
 URL: https://issues.apache.org/jira/browse/YARN-2876
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Siqi Li
Assignee: Siqi Li
 Attachments: YARN-2876.v1.patch, YARN-2876.v2.patch, screenshot-1.png


 If a subqueue doesn't have a maxResource set in fair-scheduler.xml, JMX and 
 Scheduler UI will display the entire cluster capacity as its maxResource 
 instead of its parent queue's maxResource.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3302) TestDockerContainerExecutor should run automatically if it can detect docker in the usual place

2015-05-18 Thread Ravindra Kumar Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Kumar Naik updated YARN-3302:
--
Attachment: YARN-3302-trunk.002.patch

Updated patch to address Ravi's comments

 TestDockerContainerExecutor should run automatically if it can detect docker 
 in the usual place
 ---

 Key: YARN-3302
 URL: https://issues.apache.org/jira/browse/YARN-3302
 Project: Hadoop YARN
  Issue Type: Sub-task
Affects Versions: 2.6.0
Reporter: Ravi Prakash
Assignee: Ravindra Kumar Naik
 Attachments: YARN-3302-trunk.001.patch, YARN-3302-trunk.002.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-41) The RM should handle the graceful shutdown of the NM.

2015-05-18 Thread Devaraj K (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14548261#comment-14548261
 ] 

Devaraj K commented on YARN-41:
---

Thanks a lot [~jlowe] and [~djp] for your comments.

I will update the patch so that the NM also unregisters with the RM when NM 
recovery is enabled and supervision is disabled.
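
In other words, the shutdown decision being agreed on could be sketched roughly 
as below (a sketch of the intent only, not the actual patch; it assumes the 
usual NM recovery and supervision configuration keys):

{code:java}
import org.apache.hadoop.conf.Configuration;

public class ShutdownDecisionSketch {
  /** Should the NM unregister from the RM during a graceful shutdown? */
  static boolean shouldUnregister(Configuration conf) {
    boolean recoveryEnabled =
        conf.getBoolean("yarn.nodemanager.recovery.enabled", false);
    boolean underSupervision =
        conf.getBoolean("yarn.nodemanager.recovery.supervised", false);
    // Containers survive the shutdown only when recovery is enabled AND the NM
    // is supervised; in every other case the NM kills its containers anyway,
    // so it should also unregister from the RM.
    return !(recoveryEnabled && underSupervision);
  }
}
{code}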

 The RM should handle the graceful shutdown of the NM.
 -

 Key: YARN-41
 URL: https://issues.apache.org/jira/browse/YARN-41
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: nodemanager, resourcemanager
Reporter: Ravi Teja Ch N V
Assignee: Devaraj K
  Labels: BB2015-05-TBR
 Attachments: MAPREDUCE-3494.1.patch, MAPREDUCE-3494.2.patch, 
 MAPREDUCE-3494.patch, YARN-41-1.patch, YARN-41-2.patch, YARN-41-3.patch, 
 YARN-41-4.patch, YARN-41.patch


 Instead of waiting for the NM expiry, the RM should remove and handle the NM 
 which is shut down gracefully.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3302) TestDockerContainerExecutor should run automatically if it can detect docker in the usual place

2015-05-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14548274#comment-14548274
 ] 

Hadoop QA commented on YARN-3302:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |   5m 18s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 30s | There were no new javac warning 
messages. |
| {color:green}+1{color} | release audit |   0m 20s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 36s | There were no new checkstyle 
issues. |
| {color:red}-1{color} | whitespace |   0m  0s | The patch has 2  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 32s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 32s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m  1s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   6m 49s | Tests passed in 
hadoop-yarn-server-nodemanager. |
| | |  23m 42s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12733573/YARN-3302-trunk.002.patch
 |
| Optional Tests | javac unit findbugs checkstyle |
| git revision | trunk / bcc1786 |
| whitespace | 
https://builds.apache.org/job/PreCommit-YARN-Build/7970/artifact/patchprocess/whitespace.txt
 |
| hadoop-yarn-server-nodemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7970/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7970/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7970/console |


This message was automatically generated.

 TestDockerContainerExecutor should run automatically if it can detect docker 
 in the usual place
 ---

 Key: YARN-3302
 URL: https://issues.apache.org/jira/browse/YARN-3302
 Project: Hadoop YARN
  Issue Type: Sub-task
Affects Versions: 2.6.0
Reporter: Ravi Prakash
Assignee: Ravindra Kumar Naik
 Attachments: YARN-3302-trunk.001.patch, YARN-3302-trunk.002.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3489) RMServerUtils.validateResourceRequests should only obtain queue info once

2015-05-18 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14548460#comment-14548460
 ] 

Wangda Tan commented on YARN-3489:
--

[~varun_saxena],
Thanks for updating. I tried to run the RM tests locally after applying the 
patch, and there are a couple of test failures:

Results :

Failed tests:
  
TestRMApplicationHistoryWriter.testRMWritingMassiveHistoryForCapacitySche:383-testRMWritingMassiveHistory:441
 null

Tests in error:
  TestAppManager.testRMAppSubmitDuplicateApplicationId:531 » NullPointer
  TestAppManager.testRMAppSubmitMaxAppAttempts:506 » NullPointer
  TestAppManager.testRMAppSubmit:463 » NullPointer
  TestClientRMService.testAppSubmit:859 » NullPointer
  TestClientRMService.testGetApplications:959 » NullPointer
  TestClientRMService.testConcurrentAppSubmit:1115 »  test timed out after 4000 
...

Could you look at them?

 RMServerUtils.validateResourceRequests should only obtain queue info once
 -

 Key: YARN-3489
 URL: https://issues.apache.org/jira/browse/YARN-3489
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Jason Lowe
Assignee: Varun Saxena
  Labels: BB2015-05-RFC
 Attachments: YARN-3489-branch-2.7.02.patch, 
 YARN-3489-branch-2.7.03.patch, YARN-3489-branch-2.7.patch, 
 YARN-3489.01.patch, YARN-3489.02.patch, YARN-3489.03.patch


 Since the label support was added we now get the queue info for each request 
 being validated in SchedulerUtils.validateResourceRequest.  If 
 validateResourceRequests needs to validate a lot of requests at a time (e.g.: 
 large cluster with lots of varied locality in the requests) then it will get 
 the queue info for each request.  Since we build the queue info this 
 generates a lot of unnecessary garbage, as the queue isn't changing between 
 requests.  We should grab the queue info once and pass it down rather than 
 building it again for each request.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2876) In Fair Scheduler, JMX and Scheduler UI display wrong maxResource info for subqueues

2015-05-18 Thread Siqi Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siqi Li updated YARN-2876:
--
Attachment: YARN-2876.v2.patch

 In Fair Scheduler, JMX and Scheduler UI display wrong maxResource info for 
 subqueues
 

 Key: YARN-2876
 URL: https://issues.apache.org/jira/browse/YARN-2876
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Siqi Li
Assignee: Siqi Li
 Attachments: YARN-2876.v1.patch, YARN-2876.v2.patch, screenshot-1.png


 If a subqueue doesn't have a maxResource set in fair-scheduler.xml, JMX and 
 Scheduler UI will display the entire cluster capacity as its maxResource 
 instead of its parent queue's maxResource.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2876) In Fair Scheduler, JMX and Scheduler UI display wrong maxResource info for subqueues

2015-05-18 Thread Siqi Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siqi Li updated YARN-2876:
--
Attachment: (was: YARN-2876.v2.patch)

 In Fair Scheduler, JMX and Scheduler UI display wrong maxResource info for 
 subqueues
 

 Key: YARN-2876
 URL: https://issues.apache.org/jira/browse/YARN-2876
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Siqi Li
Assignee: Siqi Li
 Attachments: YARN-2876.v1.patch, YARN-2876.v2.patch, screenshot-1.png


 If a subqueue doesn't have a maxResource set in fair-scheduler.xml, JMX and 
 Scheduler UI will display the entire cluster capacity as its maxResource 
 instead of its parent queue's maxResource.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3565) NodeHeartbeatRequest/RegisterNodeManagerRequest should use NodeLabel object instead of String

2015-05-18 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14548485#comment-14548485
 ] 

Wangda Tan commented on YARN-3565:
--

[~Naganarasimha], thanks for updating; it mostly looks good. Two minor comments:

Changes in NodeStatusUpdaterImpl:
- convertToNodeLabelSet could be removed
- {{+   (nodeLabels));}}: is this line change really necessary?

 NodeHeartbeatRequest/RegisterNodeManagerRequest should use NodeLabel object 
 instead of String
 -

 Key: YARN-3565
 URL: https://issues.apache.org/jira/browse/YARN-3565
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: api, client, resourcemanager
Reporter: Wangda Tan
Assignee: Naganarasimha G R
Priority: Blocker
 Attachments: YARN-3565-20150502-1.patch, YARN-3565.20150515-1.patch, 
 YARN-3565.20150516-1.patch


 Now NM HB/Register uses Set<String>; it will be hard to add new fields if we 
 want to support specifying NodeLabel properties such as 
 exclusivity/constraints, etc. We need to make sure rolling upgrade works.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1902) Allocation of too many containers when a second request is done with the same resource capability

2015-05-18 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14548492#comment-14548492
 ] 

Bikas Saha commented on YARN-1902:
--

An alternate approach that we tried in Apache Tez is to wrap a TaskScheduler 
around the AMRMClient that takes requests from the application and does the 
matching internally. Since it knows the matching, it can automatically remove 
the matched requests as well. (This still does not remove the race condition, 
but it is cleaner as an API from the user's point of view.) The TaskScheduler 
was written to be independent of Tez code so that we could contribute it to 
YARN as a library; however, we did not find time to do so. That code has since 
evolved quite a bit, but the original, well-tested code could still be 
extracted from the Tez 0.1 branch and contributed to YARN if someone is 
interested in doing that work.

 Allocation of too many containers when a second request is done with the same 
 resource capability
 -

 Key: YARN-1902
 URL: https://issues.apache.org/jira/browse/YARN-1902
 Project: Hadoop YARN
  Issue Type: Bug
  Components: client
Affects Versions: 2.2.0, 2.3.0, 2.4.0
Reporter: Sietse T. Au
Assignee: Sietse T. Au
  Labels: client
 Attachments: YARN-1902.patch, YARN-1902.v2.patch, YARN-1902.v3.patch


 Regarding AMRMClientImpl
 Scenario 1:
 Given a ContainerRequest x with Resource y, when addContainerRequest is 
 called z times with x, allocate is called and at least one of the z allocated 
 containers is started, then if another addContainerRequest call is done and 
 subsequently an allocate call to the RM, (z+1) containers will be allocated, 
 where 1 container is expected.
 Scenario 2:
 No containers are started between the allocate calls. 
 Analyzing debug logs of the AMRMClientImpl, I have found that indeed (z+1) 
 containers are requested in both scenarios, but that only in the second 
 scenario is the correct behavior observed.
 Looking at the implementation, I have found that this (z+1) request is caused 
 by the structure of the remoteRequestsTable. The consequence of Map<Resource, 
 ResourceRequestInfo> is that ResourceRequestInfo does not hold any 
 information about whether a request has been sent to the RM yet or not.
 There are workarounds for this, such as releasing the excess containers 
 received.
 The solution implemented is to initialize a new ResourceRequest in 
 ResourceRequestInfo when a request has been successfully sent to the RM.
 The patch includes a test in which scenario one is tested.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3069) Document missing properties in yarn-default.xml

2015-05-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14548496#comment-14548496
 ] 

Hadoop QA commented on YARN-3069:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  17m 36s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:red}-1{color} | javac |   7m 28s | The applied patch generated  1  
additional warning messages. |
| {color:green}+1{color} | javadoc |   9m 32s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | site |   2m 59s | Site still builds. |
| {color:green}+1{color} | checkstyle |   2m  1s | There were no new checkstyle 
issues. |
| {color:red}-1{color} | whitespace |   0m  2s | The patch has 1  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 31s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 32s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   3m  2s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | common tests |  23m 13s | Tests passed in 
hadoop-common. |
| {color:green}+1{color} | yarn tests |   1m 56s | Tests passed in 
hadoop-yarn-common. |
| | |  70m 19s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12733585/YARN-3069.009.patch |
| Optional Tests | site javadoc javac unit findbugs checkstyle |
| git revision | trunk / 060c84e |
| javac | 
https://builds.apache.org/job/PreCommit-YARN-Build/7975/artifact/patchprocess/diffJavacWarnings.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-YARN-Build/7975/artifact/patchprocess/whitespace.txt
 |
| hadoop-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7975/artifact/patchprocess/testrun_hadoop-common.txt
 |
| hadoop-yarn-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7975/artifact/patchprocess/testrun_hadoop-yarn-common.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7975/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7975/console |


This message was automatically generated.

 Document missing properties in yarn-default.xml
 ---

 Key: YARN-3069
 URL: https://issues.apache.org/jira/browse/YARN-3069
 Project: Hadoop YARN
  Issue Type: Bug
  Components: documentation
Reporter: Ray Chiang
Assignee: Ray Chiang
  Labels: BB2015-05-TBR, supportability
 Attachments: YARN-3069.001.patch, YARN-3069.002.patch, 
 YARN-3069.003.patch, YARN-3069.004.patch, YARN-3069.005.patch, 
 YARN-3069.006.patch, YARN-3069.007.patch, YARN-3069.008.patch, 
 YARN-3069.009.patch


 The following properties are currently not defined in yarn-default.xml.  
 These properties should either be
   A) documented in yarn-default.xml OR
   B)  listed as an exception (with comments, e.g. for internal use) in the 
 TestYarnConfigurationFields unit test
 Any comments for any of the properties below are welcome.
   org.apache.hadoop.yarn.server.sharedcachemanager.RemoteAppChecker
   org.apache.hadoop.yarn.server.sharedcachemanager.store.InMemorySCMStore
   security.applicationhistory.protocol.acl
   yarn.app.container.log.backups
   yarn.app.container.log.dir
   yarn.app.container.log.filesize
   yarn.client.app-submission.poll-interval
   yarn.client.application-client-protocol.poll-timeout-ms
   yarn.is.minicluster
   yarn.log.server.url
   yarn.minicluster.control-resource-monitoring
   yarn.minicluster.fixed.ports
   yarn.minicluster.use-rpc
   yarn.node-labels.fs-store.retry-policy-spec
   yarn.node-labels.fs-store.root-dir
   yarn.node-labels.manager-class
   yarn.nodemanager.container-executor.os.sched.priority.adjustment
   yarn.nodemanager.container-monitor.process-tree.class
   yarn.nodemanager.disk-health-checker.enable
   yarn.nodemanager.docker-container-executor.image-name
   yarn.nodemanager.linux-container-executor.cgroups.delete-timeout-ms
   yarn.nodemanager.linux-container-executor.group
   yarn.nodemanager.log.deletion-threads-count
   yarn.nodemanager.user-home-dir
   yarn.nodemanager.webapp.https.address
   

[jira] [Commented] (YARN-1902) Allocation of too many containers when a second request is done with the same resource capability

2015-05-18 Thread MENG DING (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14548480#comment-14548480
 ] 

MENG DING commented on YARN-1902:
-

Thanks [~bikassaha] and [~vinodkv] for the education and background info. 
Really helpful. I can now appreciate that there is not a straightforward 
solution to this problem.

Originally I was coming from a pure user-experience point of view, where I was 
thinking that if I ever want to use removeContainerRequest, it should only be 
because I need to cancel previous add requests. Yes, I may still get the 
containers from the previous requests, but that is understandable. However, I 
would never have thought that I also need to call removeContainerRequest to 
remove the requests of matched containers in order to keep the internal 
bookkeeping of AMRMClient correct. Why should a user have to worry about these 
things?
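
To make that concrete, a minimal sketch of the pattern a user is currently 
expected to follow with AMRMClient (illustrative sizes and priorities; a sketch 
of the current contract, not a proposal):

{code:java}
import java.util.List;
import org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse;
import org.apache.hadoop.yarn.api.records.Container;
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.AMRMClient;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;

public class AmRmClientPatternSketch {
  static void requestOneContainer(AMRMClient<ContainerRequest> amRmClient) throws Exception {
    ContainerRequest request = new ContainerRequest(
        Resource.newInstance(1024, 1), null, null, Priority.newInstance(0));
    amRmClient.addContainerRequest(request);

    AllocateResponse response = amRmClient.allocate(0.0f);
    List<Container> allocated = response.getAllocatedContainers();
    for (Container container : allocated) {
      // The caller is expected to remove the matched request itself; if it does
      // not, the next addContainerRequest + allocate cycle can hand back more
      // containers than expected.
      amRmClient.removeContainerRequest(request);
    }
  }
}
{code}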

After reading the comments, I am starting to think that even if we were able to 
figure out which ResourceRequest to deduct from and automatically deduct it on 
the client side, it still wouldn't solve race condition 1 (i.e., allocated 
containers sitting in the RM).

So rather than changing the client, can we not do something on the RM side? For 
example, in AppSchedulingInfo:
1. Maintain a table for total requests *only*. The updateResourceRequests() 
call will update this table to reflect the total requests from the client 
(matching the client-side remoteRequestsTable).
2. Maintain a table for requests that have been satisfied. Every time a 
successful allocation is made for this application, this table is updated.
3. The difference between table 1 and table 2 is the outstanding resource 
requests. This table is updated on every updateResourceRequests() call and 
every successful allocation. Of course, proper synchronization needs to be 
taken care of.
4. Scheduling will be based on table 3 (i.e., the outstanding request table); a 
rough sketch of this bookkeeping follows below.

Do you think this is something worth considering?

Thanks a lot in advance.
Meng
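
A rough sketch of that bookkeeping, with hypothetical names (not the actual 
AppSchedulingInfo code; for brevity the outstanding table is derived on the fly 
rather than maintained separately):

{code:java}
import java.util.HashMap;
import java.util.Map;

public class OutstandingRequestSketch {
  // Keyed by something like (priority, resource-name, capability); a String stands in here.
  private final Map<String, Integer> totalRequested = new HashMap<>();  // table 1: client totals
  private final Map<String, Integer> satisfied = new HashMap<>();       // table 2: allocations made

  /** Table 1 is simply replaced by whatever the client reports as its current totals. */
  public synchronized void updateTotal(String key, int numContainers) {
    totalRequested.put(key, numContainers);
  }

  /** Table 2 grows on every successful allocation for this application. */
  public synchronized void recordAllocation(String key) {
    satisfied.merge(key, 1, Integer::sum);
  }

  /** Table 3 (what the scheduler should still try to place) is derived from the other two. */
  public synchronized int outstanding(String key) {
    return Math.max(0, totalRequested.getOrDefault(key, 0) - satisfied.getOrDefault(key, 0));
  }
}
{code}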

 Allocation of too many containers when a second request is done with the same 
 resource capability
 -

 Key: YARN-1902
 URL: https://issues.apache.org/jira/browse/YARN-1902
 Project: Hadoop YARN
  Issue Type: Bug
  Components: client
Affects Versions: 2.2.0, 2.3.0, 2.4.0
Reporter: Sietse T. Au
Assignee: Sietse T. Au
  Labels: client
 Attachments: YARN-1902.patch, YARN-1902.v2.patch, YARN-1902.v3.patch


 Regarding AMRMClientImpl
 Scenario 1:
 Given a ContainerRequest x with Resource y, when addContainerRequest is 
 called z times with x, allocate is called and at least one of the z allocated 
 containers is started, then if another addContainerRequest call is done and 
 subsequently an allocate call to the RM, (z+1) containers will be allocated, 
 where 1 container is expected.
 Scenario 2:
 No containers are started between the allocate calls. 
 Analyzing debug logs of the AMRMClientImpl, I have found that indeed (z+1) 
 containers are requested in both scenarios, but that only in the second 
 scenario is the correct behavior observed.
 Looking at the implementation, I have found that this (z+1) request is caused 
 by the structure of the remoteRequestsTable. The consequence of Map<Resource, 
 ResourceRequestInfo> is that ResourceRequestInfo does not hold any 
 information about whether a request has been sent to the RM yet or not.
 There are workarounds for this, such as releasing the excess containers 
 received.
 The solution implemented is to initialize a new ResourceRequest in 
 ResourceRequestInfo when a request has been successfully sent to the RM.
 The patch includes a test in which scenario one is tested.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3583) Support of NodeLabel object instead of plain String in YarnClient side.

2015-05-18 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14548510#comment-14548510
 ] 

Wangda Tan commented on YARN-3583:
--

[~sunilg],
Thanks for updating; it mostly looks good. Two nits:
yarn_server_resourcemanager_service_proto:
- Rename NodeIdToLabelsProto to NodeIdToLabelsNameProto, otherwise people will 
confuse this with NodeIdToLabelsInfoProto
- A couple of asserts in tests like {{Assert.assertTrue(y.isExclusive() == 
true)}} could be changed to {{Assert.assertTrue/False(y.isExclusive())}}

 Support of NodeLabel object instead of plain String in YarnClient side.
 ---

 Key: YARN-3583
 URL: https://issues.apache.org/jira/browse/YARN-3583
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: client
Affects Versions: 2.6.0
Reporter: Sunil G
Assignee: Sunil G
 Attachments: 0001-YARN-3583.patch, 0002-YARN-3583.patch, 
 0003-YARN-3583.patch


 Similar to YARN-3521, use NodeLabel objects in the YarnClient-side APIs.
 The getLabelsToNodes/getNodeToLabels APIs can use NodeLabel objects instead of 
 plain label names.
 This will help bring other label details, such as exclusivity, to the client 
 side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3645) ResourceManager can't start successfully if the attribute value of aclSubmitApps is null in fair-scheduler.xml

2015-05-18 Thread Gabor Liptak (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14548516#comment-14548516
 ] 

Gabor Liptak commented on YARN-3645:


Maybe we should extract the functionality into a helper method and use it for 
all the lookup instances? Thanks
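
Something like the following helper could cover all of those lookups (a sketch 
only; the method name and behavior are hypothetical, not the actual 
AllocationFileLoaderService change):

{code:java}
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class AllocationXmlSketch {
  /**
   * Return the trimmed text of the first child element with the given tag, or the
   * default when the element is missing or empty (e.g. <aclSubmitApps></aclSubmitApps>).
   */
  static String getTrimmedTextOrDefault(Element parent, String tag, String defaultValue) {
    NodeList nodes = parent.getElementsByTagName(tag);
    if (nodes.getLength() == 0) {
      return defaultValue;
    }
    Node first = nodes.item(0);
    String text = first.getTextContent();
    return (text == null || text.trim().isEmpty()) ? defaultValue : text.trim();
  }
}
{code}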

 ResourceManager can't start successfully if the attribute value of 
 aclSubmitApps is null in fair-scheduler.xml
 

 Key: YARN-3645
 URL: https://issues.apache.org/jira/browse/YARN-3645
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.5.2
Reporter: zhoulinlin

 The aclSubmitApps is configured in fair-scheduler.xml like below:
 <queue name="mr">
   <aclSubmitApps></aclSubmitApps>
 </queue>
 The resourcemanager log:
 2015-05-14 12:59:48,623 INFO org.apache.hadoop.service.AbstractService: 
 Service ResourceManager failed in state INITED; cause: 
 org.apache.hadoop.service.ServiceStateException: java.io.IOException: Failed 
 to initialize FairScheduler
 org.apache.hadoop.service.ServiceStateException: java.io.IOException: Failed 
 to initialize FairScheduler
   at 
 org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
   at 
 org.apache.hadoop.service.AbstractService.init(AbstractService.java:172)
   at 
 org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:493)
   at 
 org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:920)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:240)
   at 
 org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1159)
 Caused by: java.io.IOException: Failed to initialize FairScheduler
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.initScheduler(FairScheduler.java:1301)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.serviceInit(FairScheduler.java:1318)
   at 
 org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
   ... 7 more
 Caused by: java.lang.NullPointerException
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.loadQueue(AllocationFileLoaderService.java:458)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.reloadAllocations(AllocationFileLoaderService.java:337)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.initScheduler(FairScheduler.java:1299)
   ... 9 more
 2015-05-14 12:59:48,623 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Transitioning 
 to standby state
 2015-05-14 12:59:48,623 INFO 
 com.zte.zdh.platformplugin.factory.YarnPlatformPluginProxyFactory: plugin 
 transitionToStandbyIn
 2015-05-14 12:59:48,623 WARN org.apache.hadoop.service.AbstractService: When 
 stopping the service ResourceManager : java.lang.NullPointerException
 java.lang.NullPointerException
   at 
 com.zte.zdh.platformplugin.factory.YarnPlatformPluginProxyFactory.transitionToStandbyIn(YarnPlatformPluginProxyFactory.java:71)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToStandby(ResourceManager.java:997)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStop(ResourceManager.java:1058)
   at 
 org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
   at 
 org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52)
   at 
 org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80)
   at 
 org.apache.hadoop.service.AbstractService.init(AbstractService.java:171)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1159)
 2015-05-14 12:59:48,623 FATAL 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting 
 ResourceManager
 org.apache.hadoop.service.ServiceStateException: java.io.IOException: Failed 
 to initialize FairScheduler
   at 
 org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
   at 
 org.apache.hadoop.service.AbstractService.init(AbstractService.java:172)
   at 
 org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
   at 
 

[jira] [Updated] (YARN-1945) Adding description for each pool in Fair Scheduler Page from fair-scheduler.xml

2015-05-18 Thread Siqi Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siqi Li updated YARN-1945:
--
Attachment: YARN-1945.v6.patch

 Adding description for each pool in Fair Scheduler Page from 
 fair-scheduler.xml
 ---

 Key: YARN-1945
 URL: https://issues.apache.org/jira/browse/YARN-1945
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: fairscheduler
Affects Versions: 2.3.0
Reporter: Siqi Li
Assignee: Siqi Li
 Attachments: YARN-1945.v2.patch, YARN-1945.v3.patch, 
 YARN-1945.v4.patch, YARN-1945.v5.patch, YARN-1945.v6.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2884) Proxying all AM-RM communications

2015-05-18 Thread Subru Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subru Krishnan updated YARN-2884:
-
Assignee: Kishore Chaliparambil  (was: Subru Krishnan)

 Proxying all AM-RM communications
 -

 Key: YARN-2884
 URL: https://issues.apache.org/jira/browse/YARN-2884
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager, resourcemanager
Reporter: Carlo Curino
Assignee: Kishore Chaliparambil

 We introduce the notion of an RMProxy, running on each node (or once per 
 rack). Upon start, the AM is forced (via tokens and configuration) to direct 
 all its requests to a new service running on the NM that provides a proxy to 
 the central RM. 
 This gives us a place to:
 1) perform distributed scheduling decisions
 2) throttle misbehaving AMs
 3) mask the access to a federation of RMs
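
To make the pass-through idea concrete, here is a minimal sketch of an AM-RM proxy 
that simply forwards the three ApplicationMasterProtocol calls to the central RM. 
The class name is hypothetical, and a real service would also have to handle tokens, 
per-application state, and the interception policies listed above.

{code}
// Minimal pass-through sketch, not the actual AMRMProxy implementation.
// Assumes an already-created client proxy to the central RM (rmClient).
import java.io.IOException;

import org.apache.hadoop.yarn.api.ApplicationMasterProtocol;
import org.apache.hadoop.yarn.api.protocolrecords.AllocateRequest;
import org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse;
import org.apache.hadoop.yarn.api.protocolrecords.FinishApplicationMasterRequest;
import org.apache.hadoop.yarn.api.protocolrecords.FinishApplicationMasterResponse;
import org.apache.hadoop.yarn.api.protocolrecords.RegisterApplicationMasterRequest;
import org.apache.hadoop.yarn.api.protocolrecords.RegisterApplicationMasterResponse;
import org.apache.hadoop.yarn.exceptions.YarnException;

public class PassThroughAMRMProxy implements ApplicationMasterProtocol {
  private final ApplicationMasterProtocol rmClient; // proxy to the central RM

  public PassThroughAMRMProxy(ApplicationMasterProtocol rmClient) {
    this.rmClient = rmClient;
  }

  @Override
  public RegisterApplicationMasterResponse registerApplicationMaster(
      RegisterApplicationMasterRequest request) throws YarnException, IOException {
    // Interception point: distributed scheduling, throttling, federation routing.
    return rmClient.registerApplicationMaster(request);
  }

  @Override
  public AllocateResponse allocate(AllocateRequest request)
      throws YarnException, IOException {
    return rmClient.allocate(request);
  }

  @Override
  public FinishApplicationMasterResponse finishApplicationMaster(
      FinishApplicationMasterRequest request) throws YarnException, IOException {
    return rmClient.finishApplicationMaster(request);
  }
}
{code}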



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3671) Integrate Federation services with ResourceManager

2015-05-18 Thread Subru Krishnan (JIRA)
Subru Krishnan created YARN-3671:


 Summary: Integrate Federation services with ResourceManager
 Key: YARN-3671
 URL: https://issues.apache.org/jira/browse/YARN-3671
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Subru Krishnan
Assignee: Subru Krishnan


This JIRA proposes adding the ability to turn on Federation services like the 
StateStore, cluster membership heartbeat, etc., in the RM.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3673) Create a FailoverProxy for Federation services

2015-05-18 Thread Subru Krishnan (JIRA)
Subru Krishnan created YARN-3673:


 Summary: Create a FailoverProxy for Federation services
 Key: YARN-3673
 URL: https://issues.apache.org/jira/browse/YARN-3673
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager, resourcemanager
Reporter: Subru Krishnan
Assignee: Subru Krishnan


This JIRA proposes creating a facade for the Federation State and Policy Store to 
simplify access and provide a common place for cache management, etc., that can be 
used by both Router & AMRMProxy.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-1945) Adding description for each pool in Fair Scheduler Page from fair-scheduler.xml

2015-05-18 Thread Siqi Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siqi Li updated YARN-1945:
--
Attachment: (was: YARN-1945.v6.patch)

 Adding description for each pool in Fair Scheduler Page from 
 fair-scheduler.xml
 ---

 Key: YARN-1945
 URL: https://issues.apache.org/jira/browse/YARN-1945
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: fairscheduler
Affects Versions: 2.3.0
Reporter: Siqi Li
Assignee: Siqi Li
 Attachments: YARN-1945.v2.patch, YARN-1945.v3.patch, 
 YARN-1945.v4.patch, YARN-1945.v5.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3302) TestDockerContainerExecutor should run automatically if it can detect docker in the usual place

2015-05-18 Thread Ravindra Kumar Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Kumar Naik updated YARN-3302:
--
Attachment: YARN-3302-trunk.003.patch

Fixed whitespace in patch.

 TestDockerContainerExecutor should run automatically if it can detect docker 
 in the usual place
 ---

 Key: YARN-3302
 URL: https://issues.apache.org/jira/browse/YARN-3302
 Project: Hadoop YARN
  Issue Type: Sub-task
Affects Versions: 2.6.0
Reporter: Ravi Prakash
Assignee: Ravindra Kumar Naik
 Attachments: YARN-3302-trunk.001.patch, YARN-3302-trunk.002.patch, 
 YARN-3302-trunk.003.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (YARN-2884) Proxying all AM-RM communications

2015-05-18 Thread Subru Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subru Krishnan reassigned YARN-2884:


Assignee: Subru Krishnan

 Proxying all AM-RM communications
 -

 Key: YARN-2884
 URL: https://issues.apache.org/jira/browse/YARN-2884
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager, resourcemanager
Reporter: Carlo Curino
Assignee: Subru Krishnan

 We introduce the notion of an RMProxy, running on each node (or once per 
 rack). Upon start, the AM is forced (via tokens and configuration) to direct 
 all its requests to a new service running on the NM that provides a proxy to 
 the central RM. 
 This gives us a place to:
 1) perform distributed scheduling decisions
 2) throttle misbehaving AMs
 3) mask the access to a federation of RMs



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3672) Create Facade for Federation State and Policy Store

2015-05-18 Thread Subru Krishnan (JIRA)
Subru Krishnan created YARN-3672:


 Summary: Create Facade for Federation State and Policy Store
 Key: YARN-3672
 URL: https://issues.apache.org/jira/browse/YARN-3672
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Subru Krishnan
Assignee: Subru Krishnan


This JIRA proposes creating a facade for the Federation State and Policy Store to 
simplify access and provide a common place for cache management, etc., that can be 
used by both Router & AMRMProxy.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3673) Create a FailoverProxy for Federation services

2015-05-18 Thread Subru Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subru Krishnan updated YARN-3673:
-
Description: This JIRA proposes creating a failover proxy for Federation 
based on the cluster membership information in the StateStore that can be used 
by both Router & AMRMProxy  (was: This JIRA proposes creating a facade for the 
Federation State and Policy Store to simplify access and provide a common place 
for cache management, etc., that can be used by both Router & AMRMProxy)

 Create a FailoverProxy for Federation services
 --

 Key: YARN-3673
 URL: https://issues.apache.org/jira/browse/YARN-3673
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager, resourcemanager
Reporter: Subru Krishnan
Assignee: Subru Krishnan

 This JIRA proposes creating a failover proxy for Federation based on the 
 cluster membership information in the StateStore that can be used by both 
 Router & AMRMProxy



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3583) Support of NodeLabel object instead of plain String in YarnClient side.

2015-05-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14548623#comment-14548623
 ] 

Hadoop QA commented on YARN-3583:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 55s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 2 new or modified test files. |
| {color:green}+1{color} | javac |   7m 49s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 54s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   3m 29s | There were no new checkstyle 
issues. |
| {color:red}-1{color} | whitespace |   0m  5s | The patch has 1  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 37s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   6m 29s | The patch appears to introduce 1 
new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | mapreduce tests | 106m 46s | Tests failed in 
hadoop-mapreduce-client-jobclient. |
| {color:green}+1{color} | yarn tests |   0m 27s | Tests passed in 
hadoop-yarn-api. |
| {color:green}+1{color} | yarn tests |   7m  1s | Tests passed in 
hadoop-yarn-client. |
| {color:green}+1{color} | yarn tests |   2m  1s | Tests passed in 
hadoop-yarn-common. |
| {color:green}+1{color} | yarn tests |   0m 30s | Tests passed in 
hadoop-yarn-server-common. |
| {color:green}+1{color} | yarn tests |  50m 23s | Tests passed in 
hadoop-yarn-server-resourcemanager. |
| | | 212m 33s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-yarn-server-resourcemanager |
|  |  Inconsistent synchronization of 
org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS;
 locked 66% of time  Unsynchronized access at FileSystemRMStateStore.java:66% 
of time  Unsynchronized access at FileSystemRMStateStore.java:[line 156] |
| Failed unit tests | hadoop.mapred.TestJobSysDirWithDFS |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12733569/0003-YARN-3583.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 182d86d |
| whitespace | 
https://builds.apache.org/job/PreCommit-YARN-Build/7968/artifact/patchprocess/whitespace.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-YARN-Build/7968/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
 |
| hadoop-mapreduce-client-jobclient test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7968/artifact/patchprocess/testrun_hadoop-mapreduce-client-jobclient.txt
 |
| hadoop-yarn-api test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7968/artifact/patchprocess/testrun_hadoop-yarn-api.txt
 |
| hadoop-yarn-client test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7968/artifact/patchprocess/testrun_hadoop-yarn-client.txt
 |
| hadoop-yarn-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7968/artifact/patchprocess/testrun_hadoop-yarn-common.txt
 |
| hadoop-yarn-server-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7968/artifact/patchprocess/testrun_hadoop-yarn-server-common.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7968/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7968/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7968/console |


This message was automatically generated.

 Support of NodeLabel object instead of plain String in YarnClient side.
 ---

 Key: YARN-3583
 URL: https://issues.apache.org/jira/browse/YARN-3583
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: client
Affects Versions: 2.6.0
Reporter: Sunil G
Assignee: Sunil G
 Attachments: 0001-YARN-3583.patch, 0002-YARN-3583.patch, 
 0003-YARN-3583.patch


 Similar to YARN-3521, use NodeLabel objects in YarnClient side apis.
 getLabelsToNodes/getNodeToLabels api's can use NodeLabel object 

[jira] [Commented] (YARN-2821) Distributed shell app master becomes unresponsive sometimes

2015-05-18 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549099#comment-14549099
 ] 

Jian He commented on YARN-2821:
---

Thanks Varun! Looks good overall.
Test case: for the code below, we need to test that if the AM receives any unknown 
completed container, numCompletedContainers still equals numTotalContainers.
{code}
// ignore containers we know nothing about - probably from a previous
// attempt
if (!launchedContainers.contains(containerStatus.getContainerId())) {
  LOG.info("Ignoring completed status of "
      + containerStatus.getContainerId()
      + "; unknown container (probably launched by previous attempt)");
  continue;
}
{code}
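
A minimal, self-contained sketch of such a test; launchedContainers, 
numCompletedContainers and numTotalContainers mirror the fields used in the snippet 
above, while the stand-in types and container ids are purely illustrative.

{code}
// Illustrative only: exercises the same filtering logic as the snippet above
// with stand-in types instead of the real distributed shell ApplicationMaster.
import static org.junit.Assert.assertEquals;

import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;

import org.junit.Test;

public class TestUnknownCompletedContainers {

  @Test
  public void unknownCompletedContainerIsIgnored() {
    int numTotalContainers = 2;
    AtomicInteger numCompletedContainers = new AtomicInteger();
    Set<String> launchedContainers =
        new HashSet<String>(Arrays.asList("container_1", "container_2"));

    // Completed statuses include one container from a "previous attempt"
    // that this attempt never launched.
    List<String> completedStatuses =
        Arrays.asList("container_1", "container_2", "container_from_prev_attempt");

    for (String containerId : completedStatuses) {
      if (!launchedContainers.contains(containerId)) {
        continue; // same filtering as the snippet above
      }
      numCompletedContainers.incrementAndGet();
    }

    // The unknown container must not inflate the completed count.
    assertEquals(numTotalContainers, numCompletedContainers.get());
  }
}
{code}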

 Distributed shell app master becomes unresponsive sometimes
 ---

 Key: YARN-2821
 URL: https://issues.apache.org/jira/browse/YARN-2821
 Project: Hadoop YARN
  Issue Type: Bug
  Components: applications/distributed-shell
Affects Versions: 2.5.1
Reporter: Varun Vasudev
Assignee: Varun Vasudev
 Attachments: YARN-2821.002.patch, YARN-2821.003.patch, 
 YARN-2821.004.patch, apache-yarn-2821.0.patch, apache-yarn-2821.1.patch


 We've noticed that once in a while the distributed shell app master becomes 
 unresponsive and is eventually killed by the RM. A snippet of the logs:
 {noformat}
 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: 
 appattempt_1415123350094_0017_01 received 0 previous attempts' running 
 containers on AM registration.
 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
 container ask: Capability[memory:10, vCores:1]Priority[0]
 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
 container ask: Capability[memory:10, vCores:1]Priority[0]
 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
 container ask: Capability[memory:10, vCores:1]Priority[0]
 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
 container ask: Capability[memory:10, vCores:1]Priority[0]
 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
 container ask: Capability[memory:10, vCores:1]Priority[0]
 14/11/04 18:21:38 INFO impl.AMRMClientImpl: Received new token for : 
 onprem-tez2:45454
 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Got response from 
 RM for container ask, allocatedCnt=1
 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Launching shell 
 command on a new container., 
 containerId=container_1415123350094_0017_01_02, 
 containerNode=onprem-tez2:45454, containerNodeURI=onprem-tez2:50060, 
 containerResourceMemory1024, containerResourceVirtualCores1
 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Setting up 
 container launch container for 
 containerid=container_1415123350094_0017_01_02
 14/11/04 18:21:39 INFO impl.NMClientAsyncImpl: Processing Event EventType: 
 START_CONTAINER for Container container_1415123350094_0017_01_02
 14/11/04 18:21:39 INFO impl.ContainerManagementProtocolProxy: Opening proxy : 
 onprem-tez2:45454
 14/11/04 18:21:39 INFO impl.NMClientAsyncImpl: Processing Event EventType: 
 QUERY_CONTAINER for Container container_1415123350094_0017_01_02
 14/11/04 18:21:39 INFO impl.ContainerManagementProtocolProxy: Opening proxy : 
 onprem-tez2:45454
 14/11/04 18:21:39 INFO impl.AMRMClientImpl: Received new token for : 
 onprem-tez3:45454
 14/11/04 18:21:39 INFO impl.AMRMClientImpl: Received new token for : 
 onprem-tez4:45454
 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Got response from 
 RM for container ask, allocatedCnt=3
 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Launching shell 
 command on a new container., 
 containerId=container_1415123350094_0017_01_03, 
 containerNode=onprem-tez2:45454, containerNodeURI=onprem-tez2:50060, 
 containerResourceMemory1024, containerResourceVirtualCores1
 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Launching shell 
 command on a new container., 
 containerId=container_1415123350094_0017_01_04, 
 containerNode=onprem-tez3:45454, containerNodeURI=onprem-tez3:50060, 
 containerResourceMemory1024, containerResourceVirtualCores1
 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Launching shell 
 command on a new container., 
 containerId=container_1415123350094_0017_01_05, 
 containerNode=onprem-tez4:45454, containerNodeURI=onprem-tez4:50060, 
 containerResourceMemory1024, containerResourceVirtualCores1
 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Setting up 
 container launch container for 
 containerid=container_1415123350094_0017_01_03
 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Setting up 
 container launch container for 
 

[jira] [Commented] (YARN-3541) Add version info on timeline service / generic history web UI and REST API

2015-05-18 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549116#comment-14549116
 ] 

Xuan Gong commented on YARN-3541:
-

+1 LGTM. Will commit

 Add version info on timeline service / generic history web UI and REST API
 --

 Key: YARN-3541
 URL: https://issues.apache.org/jira/browse/YARN-3541
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Zhijie Shen
Assignee: Zhijie Shen
 Attachments: YARN-3541.1.patch, YARN-3541.2.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2876) In Fair Scheduler, JMX and Scheduler UI display wrong maxResource info for subqueues

2015-05-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549238#comment-14549238
 ] 

Hadoop QA commented on YARN-2876:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  15m  6s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 49s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 50s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 23s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 37s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   1m 19s | The patch appears to introduce 1 
new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | yarn tests |  50m  2s | Tests failed in 
hadoop-yarn-server-resourcemanager. |
| | |  87m  7s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-yarn-server-resourcemanager |
|  |  Inconsistent synchronization of 
org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS;
 locked 66% of time  Unsynchronized access at FileSystemRMStateStore.java:66% 
of time  Unsynchronized access at FileSystemRMStateStore.java:[line 156] |
| Failed unit tests | 
hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler |
| Timed out tests | 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerAllocation
 |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12733594/YARN-2876.v2.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / cdfae44 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-YARN-Build/7976/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7976/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7976/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7976/console |


This message was automatically generated.

 In Fair Scheduler, JMX and Scheduler UI display wrong maxResource info for 
 subqueues
 

 Key: YARN-2876
 URL: https://issues.apache.org/jira/browse/YARN-2876
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Siqi Li
Assignee: Siqi Li
 Attachments: YARN-2876.v1.patch, YARN-2876.v2.patch, screenshot-1.png


 If a subqueue doesn't have a maxResource set in fair-scheduler.xml, JMX and 
 Scheduler UI will display the entire cluster capacity as its maxResource 
 instead of its parent queue's maxResource.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3647) RMWebServices api's should use updated api from CommonNodeLabelsManager to get NodeLabel object

2015-05-18 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549378#comment-14549378
 ] 

Wangda Tan commented on YARN-3647:
--

[~sunilg], thanks for working on this:

1) getLabelsInfoOnNode in RMNodeLabelsManager is not actually needed; you can 
use CommonNodeLabelsManager.getLabelsByNode instead. Making it public and adding 
a read lock should be enough, right?

2) Tests of RMWebServices: I suggest adding tests to make sure all getters that 
return NodeLabel carry the proper exclusivity from NodeLabelsManager, to avoid 
future regressions.
You can set different label properties, e.g. x.exclusive=true and 
y.exclusive=false, and check them in the test.
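
A minimal sketch of that kind of check, under the assumption that NodeLabel exposes 
newInstance(name, exclusivity) and isExclusive(); the real test would route the 
labels through RMNodeLabelsManager and RMWebServices rather than asserting on the 
records directly.

{code}
// Sketch only: asserts that the exclusivity set on NodeLabel records is preserved.
import static org.junit.Assert.assertFalse;
import static org.junit.Assert.assertTrue;

import org.apache.hadoop.yarn.api.records.NodeLabel;
import org.junit.Test;

public class TestNodeLabelExclusivity {

  @Test
  public void exclusivityIsPreserved() {
    NodeLabel x = NodeLabel.newInstance("x", true);   // x.exclusive=true
    NodeLabel y = NodeLabel.newInstance("y", false);  // y.exclusive=false

    assertTrue(x.isExclusive());
    assertFalse(y.isExclusive());
  }
}
{code}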

 RMWebServices api's should use updated api from CommonNodeLabelsManager to 
 get NodeLabel object
 ---

 Key: YARN-3647
 URL: https://issues.apache.org/jira/browse/YARN-3647
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Sunil G
Assignee: Sunil G
 Attachments: 0001-YARN-3647.patch


 After YARN-3579, RMWebServices apis can use the updated version of apis in 
 CommonNodeLabelsManager which gives full NodeLabel object instead of creating 
 NodeLabel object from plain label name.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3541) Add version info on timeline service / generic history web UI and REST API

2015-05-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549152#comment-14549152
 ] 

Hudson commented on YARN-3541:
--

FAILURE: Integrated in Hadoop-trunk-Commit #7856 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/7856/])
YARN-3541. Add version info on timeline service / generic history web UI and 
REST API. Contributed by Zhijie Shen (xgong: rev 
76afd28862c1f27011273659a82cd45903a77170)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/TestAHSWebServices.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/timeline/webapp/TestTimelineWebServices.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AboutBlock.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/timeline/TimelineUtils.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AHSWebApp.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/webapp/TimelineWebServices.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/TestAHSWebApp.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AHSController.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AboutPage.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/TimelineServer.md
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/NavBlock.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/timeline/TimelineAbout.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AHSWebServices.java


 Add version info on timeline service / generic history web UI and REST API
 --

 Key: YARN-3541
 URL: https://issues.apache.org/jira/browse/YARN-3541
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Zhijie Shen
Assignee: Zhijie Shen
 Fix For: 2.8.0

 Attachments: YARN-3541.1.patch, YARN-3541.2.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3069) Document missing properties in yarn-default.xml

2015-05-18 Thread Ray Chiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ray Chiang updated YARN-3069:
-
Attachment: YARN-3069.010.patch

- Fix whitespace and deprecation warnings

 Document missing properties in yarn-default.xml
 ---

 Key: YARN-3069
 URL: https://issues.apache.org/jira/browse/YARN-3069
 Project: Hadoop YARN
  Issue Type: Bug
  Components: documentation
Reporter: Ray Chiang
Assignee: Ray Chiang
  Labels: BB2015-05-TBR, supportability
 Attachments: YARN-3069.001.patch, YARN-3069.002.patch, 
 YARN-3069.003.patch, YARN-3069.004.patch, YARN-3069.005.patch, 
 YARN-3069.006.patch, YARN-3069.007.patch, YARN-3069.008.patch, 
 YARN-3069.009.patch, YARN-3069.010.patch


 The following properties are currently not defined in yarn-default.xml.  
 These properties should either be
   A) documented in yarn-default.xml OR
   B)  listed as an exception (with comments, e.g. for internal use) in the 
 TestYarnConfigurationFields unit test
 Any comments for any of the properties below are welcome.
   org.apache.hadoop.yarn.server.sharedcachemanager.RemoteAppChecker
   org.apache.hadoop.yarn.server.sharedcachemanager.store.InMemorySCMStore
   security.applicationhistory.protocol.acl
   yarn.app.container.log.backups
   yarn.app.container.log.dir
   yarn.app.container.log.filesize
   yarn.client.app-submission.poll-interval
   yarn.client.application-client-protocol.poll-timeout-ms
   yarn.is.minicluster
   yarn.log.server.url
   yarn.minicluster.control-resource-monitoring
   yarn.minicluster.fixed.ports
   yarn.minicluster.use-rpc
   yarn.node-labels.fs-store.retry-policy-spec
   yarn.node-labels.fs-store.root-dir
   yarn.node-labels.manager-class
   yarn.nodemanager.container-executor.os.sched.priority.adjustment
   yarn.nodemanager.container-monitor.process-tree.class
   yarn.nodemanager.disk-health-checker.enable
   yarn.nodemanager.docker-container-executor.image-name
   yarn.nodemanager.linux-container-executor.cgroups.delete-timeout-ms
   yarn.nodemanager.linux-container-executor.group
   yarn.nodemanager.log.deletion-threads-count
   yarn.nodemanager.user-home-dir
   yarn.nodemanager.webapp.https.address
   yarn.nodemanager.webapp.spnego-keytab-file
   yarn.nodemanager.webapp.spnego-principal
   yarn.nodemanager.windows-secure-container-executor.group
   yarn.resourcemanager.configuration.file-system-based-store
   yarn.resourcemanager.delegation-token-renewer.thread-count
   yarn.resourcemanager.delegation.key.update-interval
   yarn.resourcemanager.delegation.token.max-lifetime
   yarn.resourcemanager.delegation.token.renew-interval
   yarn.resourcemanager.history-writer.multi-threaded-dispatcher.pool-size
   yarn.resourcemanager.metrics.runtime.buckets
   yarn.resourcemanager.nm-tokens.master-key-rolling-interval-secs
   yarn.resourcemanager.reservation-system.class
   yarn.resourcemanager.reservation-system.enable
   yarn.resourcemanager.reservation-system.plan.follower
   yarn.resourcemanager.reservation-system.planfollower.time-step
   yarn.resourcemanager.rm.container-allocation.expiry-interval-ms
   yarn.resourcemanager.webapp.spnego-keytab-file
   yarn.resourcemanager.webapp.spnego-principal
   yarn.scheduler.include-port-in-node-name
   yarn.timeline-service.delegation.key.update-interval
   yarn.timeline-service.delegation.token.max-lifetime
   yarn.timeline-service.delegation.token.renew-interval
   yarn.timeline-service.generic-application-history.enabled
   
 yarn.timeline-service.generic-application-history.fs-history-store.compression-type
   yarn.timeline-service.generic-application-history.fs-history-store.uri
   yarn.timeline-service.generic-application-history.store-class
   yarn.timeline-service.http-cross-origin.enabled
   yarn.tracking.url.generator



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3069) Document missing properties in yarn-default.xml

2015-05-18 Thread Ray Chiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ray Chiang updated YARN-3069:
-
Attachment: (was: YARN-3069.010.patch)

 Document missing properties in yarn-default.xml
 ---

 Key: YARN-3069
 URL: https://issues.apache.org/jira/browse/YARN-3069
 Project: Hadoop YARN
  Issue Type: Bug
  Components: documentation
Reporter: Ray Chiang
Assignee: Ray Chiang
  Labels: BB2015-05-TBR, supportability
 Attachments: YARN-3069.001.patch, YARN-3069.002.patch, 
 YARN-3069.003.patch, YARN-3069.004.patch, YARN-3069.005.patch, 
 YARN-3069.006.patch, YARN-3069.007.patch, YARN-3069.008.patch, 
 YARN-3069.009.patch


 The following properties are currently not defined in yarn-default.xml.  
 These properties should either be
   A) documented in yarn-default.xml OR
   B)  listed as an exception (with comments, e.g. for internal use) in the 
 TestYarnConfigurationFields unit test
 Any comments for any of the properties below are welcome.
   org.apache.hadoop.yarn.server.sharedcachemanager.RemoteAppChecker
   org.apache.hadoop.yarn.server.sharedcachemanager.store.InMemorySCMStore
   security.applicationhistory.protocol.acl
   yarn.app.container.log.backups
   yarn.app.container.log.dir
   yarn.app.container.log.filesize
   yarn.client.app-submission.poll-interval
   yarn.client.application-client-protocol.poll-timeout-ms
   yarn.is.minicluster
   yarn.log.server.url
   yarn.minicluster.control-resource-monitoring
   yarn.minicluster.fixed.ports
   yarn.minicluster.use-rpc
   yarn.node-labels.fs-store.retry-policy-spec
   yarn.node-labels.fs-store.root-dir
   yarn.node-labels.manager-class
   yarn.nodemanager.container-executor.os.sched.priority.adjustment
   yarn.nodemanager.container-monitor.process-tree.class
   yarn.nodemanager.disk-health-checker.enable
   yarn.nodemanager.docker-container-executor.image-name
   yarn.nodemanager.linux-container-executor.cgroups.delete-timeout-ms
   yarn.nodemanager.linux-container-executor.group
   yarn.nodemanager.log.deletion-threads-count
   yarn.nodemanager.user-home-dir
   yarn.nodemanager.webapp.https.address
   yarn.nodemanager.webapp.spnego-keytab-file
   yarn.nodemanager.webapp.spnego-principal
   yarn.nodemanager.windows-secure-container-executor.group
   yarn.resourcemanager.configuration.file-system-based-store
   yarn.resourcemanager.delegation-token-renewer.thread-count
   yarn.resourcemanager.delegation.key.update-interval
   yarn.resourcemanager.delegation.token.max-lifetime
   yarn.resourcemanager.delegation.token.renew-interval
   yarn.resourcemanager.history-writer.multi-threaded-dispatcher.pool-size
   yarn.resourcemanager.metrics.runtime.buckets
   yarn.resourcemanager.nm-tokens.master-key-rolling-interval-secs
   yarn.resourcemanager.reservation-system.class
   yarn.resourcemanager.reservation-system.enable
   yarn.resourcemanager.reservation-system.plan.follower
   yarn.resourcemanager.reservation-system.planfollower.time-step
   yarn.resourcemanager.rm.container-allocation.expiry-interval-ms
   yarn.resourcemanager.webapp.spnego-keytab-file
   yarn.resourcemanager.webapp.spnego-principal
   yarn.scheduler.include-port-in-node-name
   yarn.timeline-service.delegation.key.update-interval
   yarn.timeline-service.delegation.token.max-lifetime
   yarn.timeline-service.delegation.token.renew-interval
   yarn.timeline-service.generic-application-history.enabled
   
 yarn.timeline-service.generic-application-history.fs-history-store.compression-type
   yarn.timeline-service.generic-application-history.fs-history-store.uri
   yarn.timeline-service.generic-application-history.store-class
   yarn.timeline-service.http-cross-origin.enabled
   yarn.tracking.url.generator



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3668) Long run service shouldn't be killed even if Yarn crashed

2015-05-18 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549218#comment-14549218
 ] 

Steve Loughran commented on YARN-3668:
--

[~sandflee] : I know you are using something else, I was just describing what 
we do to deal with failures. 

If it is purely AM failure you care about, then setting the restart bit at 
launch time is enough for YARN to bring things back. If the AM fails too many 
times in the failure window then the app will fail, for which there is one fix: 
don't fail as often.

I'd actually like a failure code to tell YARN to restart us without counting it 
as a failure; this would help us do live updates more safely.
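
For reference, the "restart bit" side of this is set at submission time; a minimal 
sketch is below. ApplicationSubmissionContext exposes setMaxAppAttempts, 
setKeepContainersAcrossApplicationAttempts and setAttemptFailuresValidityInterval 
(the failure window) in 2.6+, while the rest of the client scaffolding here is 
purely illustrative.

{code}
// Minimal sketch: ask YARN to restart a failed AM, keep running containers across
// attempts, and only count failures inside a sliding validity window.
// AM container spec, resources, queue, and the actual submission are omitted.
import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.client.api.YarnClientApplication;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class LongRunningSubmitSketch {
  public static void main(String[] args) throws Exception {
    YarnClient yarnClient = YarnClient.createYarnClient();
    yarnClient.init(new YarnConfiguration());
    yarnClient.start();

    YarnClientApplication app = yarnClient.createApplication();
    ApplicationSubmissionContext ctx = app.getApplicationSubmissionContext();

    ctx.setMaxAppAttempts(10);                               // allow AM restarts
    ctx.setKeepContainersAcrossApplicationAttempts(true);    // work-preserving AM restart
    ctx.setAttemptFailuresValidityInterval(10 * 60 * 1000L); // 10-minute failure window

    // ... set AM launch context and resource, then yarnClient.submitApplication(ctx)
  }
}
{code}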

 Long run service shouldn't be killed even if Yarn crashed
 -

 Key: YARN-3668
 URL: https://issues.apache.org/jira/browse/YARN-3668
 Project: Hadoop YARN
  Issue Type: Wish
Reporter: sandflee

 A long-running service shouldn't be killed even if all YARN components 
 crash; with RM work-preserving restart and NM restart, YARN could take over 
 the applications again.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2876) In Fair Scheduler, JMX and Scheduler UI display wrong maxResource info for subqueues

2015-05-18 Thread Siqi Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siqi Li updated YARN-2876:
--
Attachment: YARN-2876.v3.patch

 In Fair Scheduler, JMX and Scheduler UI display wrong maxResource info for 
 subqueues
 

 Key: YARN-2876
 URL: https://issues.apache.org/jira/browse/YARN-2876
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Siqi Li
Assignee: Siqi Li
 Attachments: YARN-2876.v1.patch, YARN-2876.v2.patch, 
 YARN-2876.v3.patch, screenshot-1.png


 If a subqueue doesn't have a maxResource set in fair-scheduler.xml, JMX and 
 Scheduler UI will display the entire cluster capacity as its maxResource 
 instead of its parent queue's maxResource.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3675) FairScheduler: RM quits when node removal races with continuous scheduling on the same node

2015-05-18 Thread Anubhav Dhoot (JIRA)
Anubhav Dhoot created YARN-3675:
---

 Summary: FairScheduler: RM quits when node removal races with 
continuous scheduling on the same node
 Key: YARN-3675
 URL: https://issues.apache.org/jira/browse/YARN-3675
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot


With continuous scheduling, scheduling can be attempted on a node that has just 
been removed, causing errors like the one below.

{noformat}
12:28:53.782 AM FATAL 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager

Error in handling event type APP_ATTEMPT_REMOVED to the scheduler
java.lang.NullPointerException
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.unreserve(FSAppAttempt.java:469)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.completedContainer(FairScheduler.java:815)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.removeApplicationAttempt(FairScheduler.java:763)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1217)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:111)
at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:684)
at java.lang.Thread.run(Thread.java:745)
12:28:53.783 AM  INFO 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager Exiting, bbye..
{noformat}
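
A self-contained illustration of the race and the kind of guard that avoids it; the 
types below are stand-ins, not the actual FairScheduler classes or the eventual patch.

{code}
// Illustrative only: a scheduling pass iterates over a snapshot of node ids while
// another thread removes a node. Re-checking the live map before acting avoids
// operating on a node that has vanished.
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class ContinuousSchedulingRaceSketch {
  public static void main(String[] args) {
    ConcurrentMap<String, Object> nodes = new ConcurrentHashMap<String, Object>();
    nodes.put("node-1", new Object());
    nodes.put("node-2", new Object());

    List<String> snapshot = new ArrayList<String>(nodes.keySet());

    // Simulate a node removal racing with the scheduling pass.
    nodes.remove("node-2");

    for (String nodeId : snapshot) {
      Object node = nodes.get(nodeId);
      if (node == null) {
        continue; // node vanished between snapshot and scheduling attempt
      }
      // ... attempt scheduling on 'node' ...
    }
  }
}
{code}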



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3302) TestDockerContainerExecutor should run automatically if it can detect docker in the usual place

2015-05-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549127#comment-14549127
 ] 

Hadoop QA commented on YARN-3302:
-

\\
\\
| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |   5m 14s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 30s | There were no new javac warning 
messages. |
| {color:green}+1{color} | release audit |   0m 20s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 36s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 33s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 31s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m  1s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   6m  5s | Tests passed in 
hadoop-yarn-server-nodemanager. |
| | |  22m 53s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12733601/YARN-3302-trunk.003.patch
 |
| Optional Tests | javac unit findbugs checkstyle |
| git revision | trunk / cdfae44 |
| hadoop-yarn-server-nodemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7978/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7978/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7978/console |


This message was automatically generated.

 TestDockerContainerExecutor should run automatically if it can detect docker 
 in the usual place
 ---

 Key: YARN-3302
 URL: https://issues.apache.org/jira/browse/YARN-3302
 Project: Hadoop YARN
  Issue Type: Sub-task
Affects Versions: 2.6.0
Reporter: Ravi Prakash
Assignee: Ravindra Kumar Naik
 Attachments: YARN-3302-trunk.001.patch, YARN-3302-trunk.002.patch, 
 YARN-3302-trunk.003.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3541) Add version info on timeline service / generic history web UI and REST API

2015-05-18 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549138#comment-14549138
 ] 

Xuan Gong commented on YARN-3541:
-

Committed to trunk/branch-2. Thanks, Zhijie.

 Add version info on timeline service / generic history web UI and REST API
 --

 Key: YARN-3541
 URL: https://issues.apache.org/jira/browse/YARN-3541
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Zhijie Shen
Assignee: Zhijie Shen
 Fix For: 2.8.0

 Attachments: YARN-3541.1.patch, YARN-3541.2.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-1735) For FairScheduler AvailableMB in QueueMetrics is the same as AllocateMB

2015-05-18 Thread Siqi Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siqi Li updated YARN-1735:
--
Attachment: YARN-1735.v3.patch

 For FairScheduler AvailableMB in QueueMetrics is the same as AllocateMB
 ---

 Key: YARN-1735
 URL: https://issues.apache.org/jira/browse/YARN-1735
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Reporter: Siqi Li
 Attachments: YARN-1735.v1.patch, YARN-1735.v2.patch, 
 YARN-1735.v3.patch


 In monitoring graphs, the AvailableMB of each queue regularly spikes between 
 the AllocatedMB and the entire cluster capacity.
 This cannot be correct, since AvailableMB should never be more than the queue's 
 max allocation. The spikes are quite confusing, since the availableMB is set 
 as the fair share of each queue and the fair share of each queue is bounded by 
 its allowed max resource.
 Other than the spiking, the availableMB is always equal to allocatedMB. I 
 think this is not very useful; availableMB for each queue should be its 
 allowed max resource minus allocatedMB.
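 For example (illustrative numbers only): a queue whose allowed max resource is 
 100 GB with 60 GB currently allocated should report roughly 40 GB available; 
 reporting the full cluster capacity, or a value that always equals allocatedMB, 
 hides how much headroom the queue actually has.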



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3069) Document missing properties in yarn-default.xml

2015-05-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549314#comment-14549314
 ] 

Hadoop QA commented on YARN-3069:
-

\\
\\
| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  17m 56s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 48s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 55s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | site |   2m 57s | Site still builds. |
| {color:green}+1{color} | checkstyle |   1m 34s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  2s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 33s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 32s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   3m  1s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | common tests |  23m  8s | Tests passed in 
hadoop-common. |
| {color:green}+1{color} | yarn tests |   1m 58s | Tests passed in 
hadoop-yarn-common. |
| | |  70m 50s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12733618/YARN-3069.010.patch |
| Optional Tests | site javadoc javac unit findbugs checkstyle |
| git revision | trunk / 0790275 |
| hadoop-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7979/artifact/patchprocess/testrun_hadoop-common.txt
 |
| hadoop-yarn-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7979/artifact/patchprocess/testrun_hadoop-yarn-common.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7979/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7979/console |


This message was automatically generated.

 Document missing properties in yarn-default.xml
 ---

 Key: YARN-3069
 URL: https://issues.apache.org/jira/browse/YARN-3069
 Project: Hadoop YARN
  Issue Type: Bug
  Components: documentation
Reporter: Ray Chiang
Assignee: Ray Chiang
  Labels: BB2015-05-TBR, supportability
 Attachments: YARN-3069.001.patch, YARN-3069.002.patch, 
 YARN-3069.003.patch, YARN-3069.004.patch, YARN-3069.005.patch, 
 YARN-3069.006.patch, YARN-3069.007.patch, YARN-3069.008.patch, 
 YARN-3069.009.patch, YARN-3069.010.patch


 The following properties are currently not defined in yarn-default.xml.  
 These properties should either be
   A) documented in yarn-default.xml OR
   B)  listed as an exception (with comments, e.g. for internal use) in the 
 TestYarnConfigurationFields unit test
 Any comments for any of the properties below are welcome.
   org.apache.hadoop.yarn.server.sharedcachemanager.RemoteAppChecker
   org.apache.hadoop.yarn.server.sharedcachemanager.store.InMemorySCMStore
   security.applicationhistory.protocol.acl
   yarn.app.container.log.backups
   yarn.app.container.log.dir
   yarn.app.container.log.filesize
   yarn.client.app-submission.poll-interval
   yarn.client.application-client-protocol.poll-timeout-ms
   yarn.is.minicluster
   yarn.log.server.url
   yarn.minicluster.control-resource-monitoring
   yarn.minicluster.fixed.ports
   yarn.minicluster.use-rpc
   yarn.node-labels.fs-store.retry-policy-spec
   yarn.node-labels.fs-store.root-dir
   yarn.node-labels.manager-class
   yarn.nodemanager.container-executor.os.sched.priority.adjustment
   yarn.nodemanager.container-monitor.process-tree.class
   yarn.nodemanager.disk-health-checker.enable
   yarn.nodemanager.docker-container-executor.image-name
   yarn.nodemanager.linux-container-executor.cgroups.delete-timeout-ms
   yarn.nodemanager.linux-container-executor.group
   yarn.nodemanager.log.deletion-threads-count
   yarn.nodemanager.user-home-dir
   yarn.nodemanager.webapp.https.address
   yarn.nodemanager.webapp.spnego-keytab-file
   yarn.nodemanager.webapp.spnego-principal
   yarn.nodemanager.windows-secure-container-executor.group
   yarn.resourcemanager.configuration.file-system-based-store
   

[jira] [Updated] (YARN-3069) Document missing properties in yarn-default.xml

2015-05-18 Thread Ray Chiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ray Chiang updated YARN-3069:
-
Attachment: YARN-3069.010.patch

- Leave out MR bits from previous patch.

 Document missing properties in yarn-default.xml
 ---

 Key: YARN-3069
 URL: https://issues.apache.org/jira/browse/YARN-3069
 Project: Hadoop YARN
  Issue Type: Bug
  Components: documentation
Reporter: Ray Chiang
Assignee: Ray Chiang
  Labels: BB2015-05-TBR, supportability
 Attachments: YARN-3069.001.patch, YARN-3069.002.patch, 
 YARN-3069.003.patch, YARN-3069.004.patch, YARN-3069.005.patch, 
 YARN-3069.006.patch, YARN-3069.007.patch, YARN-3069.008.patch, 
 YARN-3069.009.patch, YARN-3069.010.patch


 The following properties are currently not defined in yarn-default.xml.  
 These properties should either be
   A) documented in yarn-default.xml OR
   B)  listed as an exception (with comments, e.g. for internal use) in the 
 TestYarnConfigurationFields unit test
 Any comments for any of the properties below are welcome.
   org.apache.hadoop.yarn.server.sharedcachemanager.RemoteAppChecker
   org.apache.hadoop.yarn.server.sharedcachemanager.store.InMemorySCMStore
   security.applicationhistory.protocol.acl
   yarn.app.container.log.backups
   yarn.app.container.log.dir
   yarn.app.container.log.filesize
   yarn.client.app-submission.poll-interval
   yarn.client.application-client-protocol.poll-timeout-ms
   yarn.is.minicluster
   yarn.log.server.url
   yarn.minicluster.control-resource-monitoring
   yarn.minicluster.fixed.ports
   yarn.minicluster.use-rpc
   yarn.node-labels.fs-store.retry-policy-spec
   yarn.node-labels.fs-store.root-dir
   yarn.node-labels.manager-class
   yarn.nodemanager.container-executor.os.sched.priority.adjustment
   yarn.nodemanager.container-monitor.process-tree.class
   yarn.nodemanager.disk-health-checker.enable
   yarn.nodemanager.docker-container-executor.image-name
   yarn.nodemanager.linux-container-executor.cgroups.delete-timeout-ms
   yarn.nodemanager.linux-container-executor.group
   yarn.nodemanager.log.deletion-threads-count
   yarn.nodemanager.user-home-dir
   yarn.nodemanager.webapp.https.address
   yarn.nodemanager.webapp.spnego-keytab-file
   yarn.nodemanager.webapp.spnego-principal
   yarn.nodemanager.windows-secure-container-executor.group
   yarn.resourcemanager.configuration.file-system-based-store
   yarn.resourcemanager.delegation-token-renewer.thread-count
   yarn.resourcemanager.delegation.key.update-interval
   yarn.resourcemanager.delegation.token.max-lifetime
   yarn.resourcemanager.delegation.token.renew-interval
   yarn.resourcemanager.history-writer.multi-threaded-dispatcher.pool-size
   yarn.resourcemanager.metrics.runtime.buckets
   yarn.resourcemanager.nm-tokens.master-key-rolling-interval-secs
   yarn.resourcemanager.reservation-system.class
   yarn.resourcemanager.reservation-system.enable
   yarn.resourcemanager.reservation-system.plan.follower
   yarn.resourcemanager.reservation-system.planfollower.time-step
   yarn.resourcemanager.rm.container-allocation.expiry-interval-ms
   yarn.resourcemanager.webapp.spnego-keytab-file
   yarn.resourcemanager.webapp.spnego-principal
   yarn.scheduler.include-port-in-node-name
   yarn.timeline-service.delegation.key.update-interval
   yarn.timeline-service.delegation.token.max-lifetime
   yarn.timeline-service.delegation.token.renew-interval
   yarn.timeline-service.generic-application-history.enabled
   
 yarn.timeline-service.generic-application-history.fs-history-store.compression-type
   yarn.timeline-service.generic-application-history.fs-history-store.uri
   yarn.timeline-service.generic-application-history.store-class
   yarn.timeline-service.http-cross-origin.enabled
   yarn.tracking.url.generator



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1735) For FairScheduler AvailableMB in QueueMetrics is the same as AllocateMB

2015-05-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549417#comment-14549417
 ] 

Hadoop QA commented on YARN-1735:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 42s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 32s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 38s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   0m 45s | The applied patch generated  3 
new checkstyle issues (total was 129, now 132). |
| {color:red}-1{color} | whitespace |   0m  1s | The patch has 1  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 35s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   1m 19s | The patch appears to introduce 1 
new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | yarn tests |  50m  1s | Tests failed in 
hadoop-yarn-server-resourcemanager. |
| | |  86m 34s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-yarn-server-resourcemanager |
|  |  Inconsistent synchronization of 
org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS;
 locked 66% of time  Unsynchronized access at FileSystemRMStateStore.java:66% 
of time  Unsynchronized access at FileSystemRMStateStore.java:[line 156] |
| Failed unit tests | hadoop.yarn.server.resourcemanager.TestRMRestart |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12733625/YARN-1735.v3.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 0790275 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/7980/artifact/patchprocess/diffcheckstylehadoop-yarn-server-resourcemanager.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-YARN-Build/7980/artifact/patchprocess/whitespace.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-YARN-Build/7980/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7980/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7980/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7980/console |


This message was automatically generated.

 For FairScheduler AvailableMB in QueueMetrics is the same as AllocateMB
 ---

 Key: YARN-1735
 URL: https://issues.apache.org/jira/browse/YARN-1735
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Reporter: Siqi Li
 Attachments: YARN-1735.v1.patch, YARN-1735.v2.patch, 
 YARN-1735.v3.patch


 In monitoring graphs, the AvailableMB of each queue regularly spikes between 
 the AllocatedMB and the entire cluster capacity.
 This cannot be correct, since AvailableMB should never be more than the queue's 
 max allocation. The spikes are quite confusing, since the availableMB is set 
 as the fair share of each queue and the fair share of each queue is bounded by 
 its allowed max resource.
 Other than the spiking, the availableMB is always equal to allocatedMB. I 
 think this is not very useful; availableMB for each queue should be its 
 allowed max resource minus allocatedMB.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1945) Adding description for each pool in Fair Scheduler Page from fair-scheduler.xml

2015-05-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549249#comment-14549249
 ] 

Hadoop QA commented on YARN-1945:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  15m 10s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 44s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 53s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 13s | The applied patch generated  2 
new checkstyle issues (total was 214, now 216). |
| {color:red}-1{color} | whitespace |   0m  0s | The patch has 1  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 36s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   2m 40s | The patch appears to introduce 1 
new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   0m 24s | Tests passed in 
hadoop-yarn-api. |
| {color:red}-1{color} | yarn tests |  49m 49s | Tests failed in 
hadoop-yarn-server-resourcemanager. |
| | |  89m 40s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-yarn-server-resourcemanager |
|  |  Inconsistent synchronization of 
org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS;
 locked 66% of time  Unsynchronized access at FileSystemRMStateStore.java:66% 
of time  Unsynchronized access at FileSystemRMStateStore.java:[line 156] |
| Failed unit tests | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.fair.TestAllocationFileLoaderService
 |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12733599/YARN-1945.v6.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / cdfae44 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/7977/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-YARN-Build/7977/artifact/patchprocess/whitespace.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-YARN-Build/7977/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
 |
| hadoop-yarn-api test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7977/artifact/patchprocess/testrun_hadoop-yarn-api.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7977/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7977/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7977/console |


This message was automatically generated.

 Adding description for each pool in Fair Scheduler Page from 
 fair-scheduler.xml
 ---

 Key: YARN-1945
 URL: https://issues.apache.org/jira/browse/YARN-1945
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: fairscheduler
Affects Versions: 2.3.0
Reporter: Siqi Li
Assignee: Siqi Li
 Attachments: YARN-1945.v2.patch, YARN-1945.v3.patch, 
 YARN-1945.v4.patch, YARN-1945.v5.patch, YARN-1945.v6.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2876) In Fair Scheduler, JMX and Scheduler UI display wrong maxResource info for subqueues

2015-05-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549459#comment-14549459
 ] 

Hadoop QA commented on YARN-2876:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  15m  1s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 37s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 53s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   0m 45s | The applied patch generated  2 
new checkstyle issues (total was 17, now 19). |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 33s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 32s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   1m 19s | The patch appears to introduce 1 
new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | yarn tests |  50m  2s | Tests failed in 
hadoop-yarn-server-resourcemanager. |
| | |  87m 11s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-yarn-server-resourcemanager |
|  |  Inconsistent synchronization of 
org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS;
 locked 66% of time  Unsynchronized access at FileSystemRMStateStore.java:66% 
of time  Unsynchronized access at FileSystemRMStateStore.java:[line 156] |
| Failed unit tests | 
hadoop.yarn.server.resourcemanager.scheduler.fair.TestAllocationFileLoaderService
 |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12733632/YARN-2876.v3.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 0790275 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/7981/artifact/patchprocess/diffcheckstylehadoop-yarn-server-resourcemanager.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-YARN-Build/7981/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7981/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7981/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7981/console |


This message was automatically generated.

 In Fair Scheduler, JMX and Scheduler UI display wrong maxResource info for 
 subqueues
 

 Key: YARN-2876
 URL: https://issues.apache.org/jira/browse/YARN-2876
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Siqi Li
Assignee: Siqi Li
 Attachments: YARN-2876.v1.patch, YARN-2876.v2.patch, 
 YARN-2876.v3.patch, screenshot-1.png


 If a subqueue doesn't have a maxResource set in fair-scheduler.xml, JMX and 
 Scheduler UI will display the entire cluster capacity as its maxResource 
 instead of its parent queue's maxResource.
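
A minimal sketch of the fallback the report expects (illustrative names, not the FairScheduler code):

{code}
// Illustrative only: a subqueue with no explicit maxResource should inherit its
// parent's effective max rather than defaulting to the whole cluster capacity.
class EffectiveMaxSketch {
  static long effectiveMaxMB(Long configuredMaxMB, long parentEffectiveMaxMB) {
    if (configuredMaxMB == null) {
      return parentEffectiveMaxMB;           // fall back to the parent queue
    }
    return Math.min(configuredMaxMB, parentEffectiveMaxMB);
  }
}
{code}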



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3632) Ordering policy should be allowed to reorder an application when demand changes

2015-05-18 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549460#comment-14549460
 ] 

Jian He commented on YARN-3632:
---

-  the current reorder implementation in containerReleased and 
containerAllocated is triggered by every single container completed or 
allocated. This results in a time complexity of 
{code} (#containersCompleted + #containersReleased) * #appsOnNode * 
log(#appsInQueue) {code} on every node heartbeat. We can improve this by 
reordering the app once, after processing all of its containers, which removes 
the leading {code} (#containersCompleted + #containersReleased) {code} factor.

- this null check is not needed if the value can never be null:
{code}
 if (updateDemandForQueue != null) {
{code}
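
To make the first suggestion concrete, here is a minimal sketch of batching the re-order per application rather than per container; the class and field names (SchedulableApp, demand) are illustrative, not the actual ordering-policy API:

{code}
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

// Hypothetical sketch: apply all container events from a heartbeat to the app
// first, then remove/re-insert it once, so the per-heartbeat cost for an app is
// O(#events + log n) instead of O(#events * log n).
public class BatchedReorderSketch {
  static class SchedulableApp {
    final String id;
    long demand;
    SchedulableApp(String id, long demand) { this.id = id; this.demand = demand; }
  }

  private final TreeSet<SchedulableApp> order =
      new TreeSet<SchedulableApp>(new Comparator<SchedulableApp>() {
        @Override
        public int compare(SchedulableApp a, SchedulableApp b) {
          int byDemand = Long.compare(b.demand, a.demand);   // larger demand first
          return byDemand != 0 ? byDemand : a.id.compareTo(b.id);
        }
      });

  void add(SchedulableApp app) { order.add(app); }

  void onHeartbeat(SchedulableApp app, List<Long> demandDeltas) {
    order.remove(app);                 // one removal per app
    for (long delta : demandDeltas) {  // apply every completed/allocated container
      app.demand += delta;
    }
    order.add(app);                    // one re-insert, i.e. a single log(n) reorder
  }

  public static void main(String[] args) {
    BatchedReorderSketch sketch = new BatchedReorderSketch();
    SchedulableApp app = new SchedulableApp("app-1", 4096);
    sketch.add(app);
    sketch.onHeartbeat(app, Arrays.asList(1024L, -512L, 2048L));
    System.out.println("demand after heartbeat = " + app.demand);  // 6656
  }
}
{code}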

 Ordering policy should be allowed to reorder an application when demand 
 changes
 ---

 Key: YARN-3632
 URL: https://issues.apache.org/jira/browse/YARN-3632
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-3632.0.patch, YARN-3632.1.patch, YARN-3632.3.patch, 
 YARN-3632.4.patch, YARN-3632.5.patch


 At present, ordering policies have the option to have an application 
 re-ordered (for allocation and preemption) when it is allocated to or a 
 container is recovered from the application.  Some ordering policies may also 
 need to reorder when demand changes if that is part of the ordering 
 comparison; this needs to be made available (and used by the 
 fairorderingpolicy when sizebasedweight is true).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-1735) For FairScheduler AvailableMB in QueueMetrics is the same as AllocateMB

2015-05-18 Thread Siqi Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siqi Li updated YARN-1735:
--
Attachment: YARN-1735.v4.patch

 For FairScheduler AvailableMB in QueueMetrics is the same as AllocateMB
 ---

 Key: YARN-1735
 URL: https://issues.apache.org/jira/browse/YARN-1735
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Reporter: Siqi Li
 Attachments: YARN-1735.v1.patch, YARN-1735.v2.patch, 
 YARN-1735.v3.patch, YARN-1735.v4.patch


 In monitoring graphs, the AvailableMB of each queue regularly spikes between 
 the AllocatedMB and the entire cluster capacity.
 This cannot be correct, since AvailableMB should never exceed the queue's max 
 allocation. The spikes are confusing because availableMB is set to the fair 
 share of each queue, and the fair share of each queue is bounded by its 
 allowed max resource.
 Other than the spiking, availableMB is always equal to allocatedMB, which is 
 not very useful; availableMB for each queue should be its allowed max 
 resource minus allocatedMB.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-1945) Adding description for each pool in Fair Scheduler Page from fair-scheduler.xml

2015-05-18 Thread Siqi Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siqi Li updated YARN-1945:
--
Attachment: YARN-1945.v7.patch

 Adding description for each pool in Fair Scheduler Page from 
 fair-scheduler.xml
 ---

 Key: YARN-1945
 URL: https://issues.apache.org/jira/browse/YARN-1945
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: fairscheduler
Affects Versions: 2.3.0
Reporter: Siqi Li
Assignee: Siqi Li
 Attachments: YARN-1945.v2.patch, YARN-1945.v3.patch, 
 YARN-1945.v4.patch, YARN-1945.v5.patch, YARN-1945.v6.patch, YARN-1945.v7.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3411) [Storage implementation] explore the native HBase write schema for storage

2015-05-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549725#comment-14549725
 ] 

Hadoop QA commented on YARN-3411:
-

\\
\\
| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  15m 15s | Pre-patch YARN-2928 compilation 
is healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 58s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 53s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 16s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 39s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 40s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   0m 38s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   1m 15s | Tests passed in 
hadoop-yarn-server-timelineservice. |
| | |  38m  2s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12733693/YARN-3411-YARN-2928.006.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | YARN-2928 / 463e070 |
| hadoop-yarn-server-timelineservice test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7990/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7990/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7990/console |


This message was automatically generated.

 [Storage implementation] explore the native HBase write schema for storage
 --

 Key: YARN-3411
 URL: https://issues.apache.org/jira/browse/YARN-3411
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Sangjin Lee
Assignee: Vrushali C
Priority: Critical
 Attachments: ATSv2BackendHBaseSchemaproposal.pdf, 
 YARN-3411-YARN-2928.001.patch, YARN-3411-YARN-2928.002.patch, 
 YARN-3411-YARN-2928.003.patch, YARN-3411-YARN-2928.004.patch, 
 YARN-3411-YARN-2928.005.patch, YARN-3411-YARN-2928.006.patch, 
 YARN-3411.poc.2.txt, YARN-3411.poc.3.txt, YARN-3411.poc.4.txt, 
 YARN-3411.poc.5.txt, YARN-3411.poc.6.txt, YARN-3411.poc.7.txt, 
 YARN-3411.poc.txt


 There is work that's in progress to implement the storage based on a Phoenix 
 schema (YARN-3134).
 In parallel, we would like to explore an implementation based on a native 
 HBase schema for the write path. Such a schema does not exclude using 
 Phoenix, especially for reads and offline queries.
 Once we have basic implementations of both options, we could evaluate them in 
 terms of performance, scalability, usability, etc. and make a call.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3411) [Storage implementation] explore the native HBase write schema for storage

2015-05-18 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549692#comment-14549692
 ] 

Sangjin Lee commented on YARN-3411:
---

It seems like the line that adds the hbase configuration was removed from 
HBaseTimelineWriterImpl. Is that intentional? How would it be able to use and 
load hbase configuration then? Come to think of it, I think we may need to add 
both hbase-site.xml and hbase-default.xml?

{code}
conf.addResource("hbase-default.xml");
conf.addResource("hbase-site.xml");
{code}

 [Storage implementation] explore the native HBase write schema for storage
 --

 Key: YARN-3411
 URL: https://issues.apache.org/jira/browse/YARN-3411
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Sangjin Lee
Assignee: Vrushali C
Priority: Critical
 Attachments: ATSv2BackendHBaseSchemaproposal.pdf, 
 YARN-3411-YARN-2928.001.patch, YARN-3411-YARN-2928.002.patch, 
 YARN-3411-YARN-2928.003.patch, YARN-3411-YARN-2928.004.patch, 
 YARN-3411-YARN-2928.005.patch, YARN-3411-YARN-2928.006.patch, 
 YARN-3411.poc.2.txt, YARN-3411.poc.3.txt, YARN-3411.poc.4.txt, 
 YARN-3411.poc.5.txt, YARN-3411.poc.6.txt, YARN-3411.poc.7.txt, 
 YARN-3411.poc.txt


 There is work that's in progress to implement the storage based on a Phoenix 
 schema (YARN-3134).
 In parallel, we would like to explore an implementation based on a native 
 HBase schema for the write path. Such a schema does not exclude using 
 Phoenix, especially for reads and offline queries.
 Once we have basic implementations of both options, we could evaluate them in 
 terms of performance, scalability, usability, etc. and make a call.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3411) [Storage implementation] explore the native HBase write schema for storage

2015-05-18 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549776#comment-14549776
 ] 

Sangjin Lee commented on YARN-3411:
---

Or, better:

{code}
Configuration hbaseConf = HBaseConfiguration.create(conf);
{code}
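
For reference, a minimal sketch of how the writer could pick up the HBase settings this way (assumes the HBase 1.x client API; this is not the actual HBaseTimelineWriterImpl code). HBaseConfiguration.create(conf) loads hbase-default.xml and hbase-site.xml from the classpath and layers them over the configuration passed in:

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

// Hypothetical sketch, not the actual writer implementation.
public class HBaseConfSketch {
  public static Connection connect(Configuration yarnConf) throws Exception {
    // Merges hbase-default.xml and hbase-site.xml on top of yarnConf.
    Configuration hbaseConf = HBaseConfiguration.create(yarnConf);
    return ConnectionFactory.createConnection(hbaseConf);
  }
}
{code}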


 [Storage implementation] explore the native HBase write schema for storage
 --

 Key: YARN-3411
 URL: https://issues.apache.org/jira/browse/YARN-3411
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Sangjin Lee
Assignee: Vrushali C
Priority: Critical
 Attachments: ATSv2BackendHBaseSchemaproposal.pdf, 
 YARN-3411-YARN-2928.001.patch, YARN-3411-YARN-2928.002.patch, 
 YARN-3411-YARN-2928.003.patch, YARN-3411-YARN-2928.004.patch, 
 YARN-3411-YARN-2928.005.patch, YARN-3411-YARN-2928.006.patch, 
 YARN-3411.poc.2.txt, YARN-3411.poc.3.txt, YARN-3411.poc.4.txt, 
 YARN-3411.poc.5.txt, YARN-3411.poc.6.txt, YARN-3411.poc.7.txt, 
 YARN-3411.poc.txt


 There is work that's in progress to implement the storage based on a Phoenix 
 schema (YARN-3134).
 In parallel, we would like to explore an implementation based on a native 
 HBase schema for the write path. Such a schema does not exclude using 
 Phoenix, especially for reads and offline queries.
 Once we have basic implementations of both options, we could evaluate them in 
 terms of performance, scalability, usability, etc. and make a call.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3411) [Storage implementation] explore the native HBase write schema for storage

2015-05-18 Thread Vrushali C (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vrushali C updated YARN-3411:
-
Attachment: YARN-3411-YARN-2928.005.patch

 [Storage implementation] explore the native HBase write schema for storage
 --

 Key: YARN-3411
 URL: https://issues.apache.org/jira/browse/YARN-3411
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Sangjin Lee
Assignee: Vrushali C
Priority: Critical
 Attachments: ATSv2BackendHBaseSchemaproposal.pdf, 
 YARN-3411-YARN-2928.001.patch, YARN-3411-YARN-2928.002.patch, 
 YARN-3411-YARN-2928.003.patch, YARN-3411-YARN-2928.004.patch, 
 YARN-3411-YARN-2928.005.patch, YARN-3411.poc.2.txt, YARN-3411.poc.3.txt, 
 YARN-3411.poc.4.txt, YARN-3411.poc.5.txt, YARN-3411.poc.6.txt, 
 YARN-3411.poc.7.txt, YARN-3411.poc.txt


 There is work that's in progress to implement the storage based on a Phoenix 
 schema (YARN-3134).
 In parallel, we would like to explore an implementation based on a native 
 HBase schema for the write path. Such a schema does not exclude using 
 Phoenix, especially for reads and offline queries.
 Once we have basic implementations of both options, we could evaluate them in 
 terms of performance, scalability, usability, etc. and make a call.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3411) [Storage implementation] explore the native HBase write schema for storage

2015-05-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549601#comment-14549601
 ] 

Hadoop QA commented on YARN-3411:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  15m  2s | Pre-patch YARN-2928 compilation 
is healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 43s | There were no new javac warning 
messages. |
| {color:red}-1{color} | javadoc |   9m 46s | The applied patch generated  3  
additional warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 19s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 41s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 40s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   0m 38s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   1m 14s | Tests passed in 
hadoop-yarn-server-timelineservice. |
| | |  37m 30s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12733677/YARN-3411-YARN-2928.005.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | YARN-2928 / 463e070 |
| javadoc | 
https://builds.apache.org/job/PreCommit-YARN-Build/7986/artifact/patchprocess/diffJavadocWarnings.txt
 |
| hadoop-yarn-server-timelineservice test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7986/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7986/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7986/console |


This message was automatically generated.

 [Storage implementation] explore the native HBase write schema for storage
 --

 Key: YARN-3411
 URL: https://issues.apache.org/jira/browse/YARN-3411
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Sangjin Lee
Assignee: Vrushali C
Priority: Critical
 Attachments: ATSv2BackendHBaseSchemaproposal.pdf, 
 YARN-3411-YARN-2928.001.patch, YARN-3411-YARN-2928.002.patch, 
 YARN-3411-YARN-2928.003.patch, YARN-3411-YARN-2928.004.patch, 
 YARN-3411-YARN-2928.005.patch, YARN-3411.poc.2.txt, YARN-3411.poc.3.txt, 
 YARN-3411.poc.4.txt, YARN-3411.poc.5.txt, YARN-3411.poc.6.txt, 
 YARN-3411.poc.7.txt, YARN-3411.poc.txt


 There is work that's in progress to implement the storage based on a Phoenix 
 schema (YARN-3134).
 In parallel, we would like to explore an implementation based on a native 
 HBase schema for the write path. Such a schema does not exclude using 
 Phoenix, especially for reads and offline queries.
 Once we have basic implementations of both options, we could evaluate them in 
 terms of performance, scalability, usability, etc. and make a call.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3676) Disregard 'assignMultiple' directive while scheduling apps with NODE_LOCAL resource requests

2015-05-18 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-3676:
-

 Summary: Disregard 'assignMultiple' directive while scheduling 
apps with NODE_LOCAL resource requests
 Key: YARN-3676
 URL: https://issues.apache.org/jira/browse/YARN-3676
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Reporter: Arun Suresh
Assignee: Arun Suresh


AssignMultiple is generally set to false to prevent overloading a node (e.g., 
new NMs that have just joined).

A possible scheduling optimization would be to disregard this directive for 
apps whose allowed locality is NODE_LOCAL.
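
A minimal sketch of the proposed check (the names below are illustrative, not the actual FairScheduler fields or the NodeType enum):

{code}
// Hypothetical sketch only: keep honoring assignMultiple=false on a heartbeat
// unless the app is restricted to this node (NODE_LOCAL), in which case skipping
// it could strand the request.
class AssignMultipleSketch {
  enum Locality { NODE_LOCAL, RACK_LOCAL, OFF_SWITCH }

  static boolean shouldContinueAssigning(boolean assignMultiple,
                                         Locality allowedLocality,
                                         int containersAssignedThisHeartbeat) {
    if (containersAssignedThisHeartbeat == 0) {
      return true;                              // always allow the first assignment
    }
    return assignMultiple || allowedLocality == Locality.NODE_LOCAL;
  }
}
{code}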



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1735) For FairScheduler AvailableMB in QueueMetrics is the same as AllocateMB

2015-05-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549614#comment-14549614
 ] 

Hadoop QA commented on YARN-1735:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 44s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 35s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 33s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   0m 49s | The applied patch generated  1 
new checkstyle issues (total was 130, now 131). |
| {color:red}-1{color} | whitespace |   0m  0s | The patch has 1  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 35s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   1m 20s | The patch appears to introduce 1 
new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |  50m  8s | Tests passed in 
hadoop-yarn-server-resourcemanager. |
| | |  86m 43s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-yarn-server-resourcemanager |
|  |  Inconsistent synchronization of 
org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS;
 locked 66% of time  Unsynchronized access at FileSystemRMStateStore.java:66% 
of time  Unsynchronized access at FileSystemRMStateStore.java:[line 156] |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12733665/YARN-1735.v4.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 0790275 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/7984/artifact/patchprocess/diffcheckstylehadoop-yarn-server-resourcemanager.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-YARN-Build/7984/artifact/patchprocess/whitespace.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-YARN-Build/7984/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7984/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7984/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7984/console |


This message was automatically generated.

 For FairScheduler AvailableMB in QueueMetrics is the same as AllocateMB
 ---

 Key: YARN-1735
 URL: https://issues.apache.org/jira/browse/YARN-1735
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Reporter: Siqi Li
 Attachments: YARN-1735.v1.patch, YARN-1735.v2.patch, 
 YARN-1735.v3.patch, YARN-1735.v4.patch


 In monitoring graphs, the AvailableMB of each queue regularly spikes between 
 the AllocatedMB and the entire cluster capacity.
 This cannot be correct, since AvailableMB should never exceed the queue's max 
 allocation. The spikes are confusing because availableMB is set to the fair 
 share of each queue, and the fair share of each queue is bounded by its 
 allowed max resource.
 Other than the spiking, availableMB is always equal to allocatedMB, which is 
 not very useful; availableMB for each queue should be its allowed max 
 resource minus allocatedMB.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3411) [Storage implementation] explore the native HBase write schema for storage

2015-05-18 Thread Vrushali C (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549631#comment-14549631
 ] 

Vrushali C commented on YARN-3411:
--


Updating the code now to fix the javadoc warning and address [~sjlee0]'s review 
suggestion. 

 [Storage implementation] explore the native HBase write schema for storage
 --

 Key: YARN-3411
 URL: https://issues.apache.org/jira/browse/YARN-3411
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Sangjin Lee
Assignee: Vrushali C
Priority: Critical
 Attachments: ATSv2BackendHBaseSchemaproposal.pdf, 
 YARN-3411-YARN-2928.001.patch, YARN-3411-YARN-2928.002.patch, 
 YARN-3411-YARN-2928.003.patch, YARN-3411-YARN-2928.004.patch, 
 YARN-3411-YARN-2928.005.patch, YARN-3411.poc.2.txt, YARN-3411.poc.3.txt, 
 YARN-3411.poc.4.txt, YARN-3411.poc.5.txt, YARN-3411.poc.6.txt, 
 YARN-3411.poc.7.txt, YARN-3411.poc.txt


 There is work that's in progress to implement the storage based on a Phoenix 
 schema (YARN-3134).
 In parallel, we would like to explore an implementation based on a native 
 HBase schema for the write path. Such a schema does not exclude using 
 Phoenix, especially for reads and offline queries.
 Once we have basic implementations of both options, we could evaluate them in 
 terms of performance, scalability, usability, etc. and make a call.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3518) default rm/am expire interval should not less than default resourcemanager connect wait time

2015-05-18 Thread sandflee (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sandflee updated YARN-3518:
---
Attachment: YARN-3518.002.patch

replace RESOURCEMANAGER_CONNECT_MAX_WAIT_MS with 
RESOURCETRACKER_RESOURCEMANAGER_CONNECT_MAX_WAIT_MS, 
APPLICATIONMASTER_RESOURCEMANAGER_CONNECT_MAX_WAIT_MS and 
APPLICATIONCLIENT_RESOURCEMANAGER_CONNECT_MAX_WAIT_MS
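
For context, a minimal sketch of the invariant this change is driving at (an assumed helper, not YARN code): a client's RM connect wait should not exceed the expiry interval that governs it, otherwise a second AM (or a replacement NM) can be started while the first is still retrying.

{code}
// Hypothetical sanity check, not part of YARN: flag configurations where the
// connect retry window outlives the corresponding expiry interval.
class ExpiryVsConnectWaitSketch {
  static void check(String name, long expiryIntervalMs, long connectMaxWaitMs) {
    if (connectMaxWaitMs > expiryIntervalMs) {
      throw new IllegalStateException(name + ": connect max wait " + connectMaxWaitMs
          + "ms exceeds expiry interval " + expiryIntervalMs
          + "ms; overlapping instances are possible");
    }
  }

  public static void main(String[] args) {
    // With the defaults described below (AM expiry 600s, connect wait 900s)
    // this check fails, which is exactly the mismatch being reported.
    check("AM", 600 * 1000L, 15 * 60 * 1000L);
  }
}
{code}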

 default rm/am expire interval should not less than default resourcemanager 
 connect wait time
 

 Key: YARN-3518
 URL: https://issues.apache.org/jira/browse/YARN-3518
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager, resourcemanager
Reporter: sandflee
Assignee: sandflee
  Labels: BB2015-05-TBR, configuration, newbie
 Attachments: YARN-3518.001.patch, YARN-3518.002.patch


 Take the AM for example: if the AM can't connect to the RM, then after the AM 
 expiry interval (600s) the RM relaunches the AM, and there will be two AMs at 
 the same time until the resourcemanager connect max wait time (900s) has passed.
 DEFAULT_RESOURCEMANAGER_CONNECT_MAX_WAIT_MS = 15 * 60 * 1000;
 DEFAULT_RM_AM_EXPIRY_INTERVAL_MS = 600000;
 DEFAULT_RM_NM_EXPIRY_INTERVAL_MS = 600000;



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1039) Add parameter for YARN resource requests to indicate long lived

2015-05-18 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549505#comment-14549505
 ] 

Chris Douglas commented on YARN-1039:
-

The semantics of a boolean flag are opaque. The policies enforced by different 
RM configurations (and versions) will not be, and cannot be made to be, 
consistent. Application and container priority are already encoded (or in 
progress, YARN-1963), so it's not just preemption priority or cost. Affinity 
and anti-affinity are also covered by different features. Discussion has been 
wide-ranging because it is unclear what long-lived would guarantee across 
existing features (beyond removing the progress bar from the UI, which I hope 
we can stop mentioning).

An implementation that only recognizes infinite and undefined leases could be 
mapped into duration. Lease duration could also be used to communicate when 
security tokens cannot be renewed, short-lived guarantees for YARN-2877 
containers, boundaries of YARN-1051 reservations, and planned decommissioning. 
In contrast, the long-lived flag cannot be used for these cases. We could 
expose probabilistic guarantees (which are what we give in reality), but that's 
a later issue.
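
As a rough illustration of the duration-based alternative (a hypothetical type, not an existing or proposed YARN API), a boolean long-lived flag is just a degenerate case of a lease duration:

{code}
// Hypothetical sketch only: an "infinite" lease and an unspecified lease are two
// sentinel durations, so the boolean flag maps cleanly into this representation,
// while finite durations can also express token lifetimes, reservations, etc.
final class LeaseDuration {
  static final long UNSPECIFIED = -1L;
  static final long INFINITE = Long.MAX_VALUE;

  private final long millis;
  private LeaseDuration(long millis) { this.millis = millis; }

  static LeaseDuration ofMillis(long millis) { return new LeaseDuration(millis); }

  static LeaseDuration fromLongLivedFlag(boolean longLived) {
    return new LeaseDuration(longLived ? INFINITE : UNSPECIFIED);
  }

  long millis() { return millis; }
}
{code}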

Considering the blockers more concretely:
bq. (a) reservations (b) white-listed requests or (c) node-label requests 
getting stuck on a node used by other services' containers that don't exit.

Aren't these handled by adding a timeout to allocations, which would also catch 
cases where this flag is _not_ set? The timeout value could be set across the 
scheduler to start, but could even be user-visible in later versions...

All said, I don't have time to work on this, agree the API can be evolved from 
the flag, and am -0 on it.

 Add parameter for YARN resource requests to indicate long lived
 -

 Key: YARN-1039
 URL: https://issues.apache.org/jira/browse/YARN-1039
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 3.0.0, 2.1.1-beta
Reporter: Steve Loughran
Assignee: Craig Welch
 Attachments: YARN-1039.1.patch, YARN-1039.2.patch, YARN-1039.3.patch


 A container request could support a new parameter long-lived. This could be 
 used by a scheduler that would know not to host the service on a transient 
 (cloud: spot priced) node.
 Schedulers could also decide whether or not to allocate multiple long-lived 
 containers on the same node



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

