[jira] [Commented] (YARN-2062) Too many InvalidStateTransitionExceptions from NodeState.NEW on RM failover

2014-05-16 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998269#comment-13998269 ] Karthik Kambatla commented on YARN-2062: I propose having a dummy invalid

[jira] [Commented] (YARN-1969) Fair Scheduler: Add policy for Earliest Deadline First

2014-05-16 Thread Maysam Yabandeh (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13996444#comment-13996444 ] Maysam Yabandeh commented on YARN-1969: --- [~kkambatl], you are right. The title of the

[jira] [Commented] (YARN-2011) Fix typo and warning in TestLeafQueue

2014-05-16 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998226#comment-13998226 ] Hudson commented on YARN-2011: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1753 (See

[jira] [Commented] (YARN-2016) Yarn getApplicationRequest start time range is not honored

2014-05-16 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998231#comment-13998231 ] Hudson commented on YARN-2016: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1753 (See

[jira] [Updated] (YARN-2053) Slider AM fails to restart: NPE in RegisterApplicationMasterResponseProto$Builder.addAllNmTokensFromPreviousAttempts

2014-05-16 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-2053: - Attachment: YARN-2053.patch Attached a new patch with UT according to [~jianhe]'s suggestion. Slider AM

[jira] [Updated] (YARN-766) TestNodeManagerShutdown in branch-2 should use Shell to form the output path and a format issue in trunk

2014-05-16 Thread Junping Du (JIRA)
[ https://issues.apache.org/jira/browse/YARN-766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-766: Summary: TestNodeManagerShutdown in branch-2 should use Shell to form the output path and a format issue in

[jira] [Commented] (YARN-2034) Description for yarn.nodemanager.localizer.cache.target-size-mb is incorrect

2014-05-16 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13993020#comment-13993020 ] Jason Lowe commented on YARN-2034: -- While updating it we may also want to clarify that it

[jira] [Commented] (YARN-893) Capacity scheduler allocates vcores to containers but does not report it in headroom

2014-05-16 Thread Tsuyoshi OZAWA (JIRA)
[ https://issues.apache.org/jira/browse/YARN-893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998657#comment-13998657 ] Tsuyoshi OZAWA commented on YARN-893: - Thanks for updating a patch, [~kj-ki]. It looks

[jira] [Commented] (YARN-2061) Revisit logging levels in ZKRMStateStore

2014-05-16 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998524#comment-13998524 ] Karthik Kambatla commented on YARN-2061: We assume that the Log level is at least

[jira] [Commented] (YARN-1957) ProportionalCapacitPreemptionPolicy handling of corner cases...

2014-05-16 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13999005#comment-13999005 ] Hudson commented on YARN-1957: -- SUCCESS: Integrated in Hadoop-trunk-Commit #5605 (See

[jira] [Commented] (YARN-2027) YARN ignores host-specific resource requests

2014-05-16 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998912#comment-13998912 ] Bikas Saha commented on YARN-2027: -- Yes. If strict node locality is needed then the rack

[jira] [Commented] (YARN-2017) Merge some of the common lib code in schedulers

2014-05-16 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998825#comment-13998825 ] Wangda Tan commented on YARN-2017: -- LGTM, +1 (non-binding). Please kick off Jenkins

[jira] [Updated] (YARN-2053) Slider AM fails to restart: NPE in RegisterApplicationMasterResponseProto$Builder.addAllNmTokensFromPreviousAttempts

2014-05-16 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-2053: - Attachment: YARN-2053.patch Slider AM fails to restart: NPE in

[jira] [Commented] (YARN-1612) Change Fair Scheduler to not disable delay scheduling by default

2014-05-16 Thread Chen He (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998887#comment-13998887 ] Chen He commented on YARN-1612: --- ping Change Fair Scheduler to not disable delay scheduling

[jira] [Commented] (YARN-2061) Revisit logging levels in ZKRMStateStore

2014-05-16 Thread Tsuyoshi OZAWA (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998623#comment-13998623 ] Tsuyoshi OZAWA commented on YARN-2061: -- The logging in

[jira] [Commented] (YARN-2036) Document yarn.resourcemanager.hostname in ClusterSetup

2014-05-16 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13994001#comment-13994001 ] Hadoop QA commented on YARN-2036: - {color:green}+1 overall{color}. Here are the results of

[jira] [Commented] (YARN-1365) ApplicationMasterService to allow Register and Unregister of an app that was running before restart

2014-05-16 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13999139#comment-13999139 ] Hadoop QA commented on YARN-1365: - {color:red}-1 overall{color}. Here are the results of

[jira] [Updated] (YARN-2053) Slider AM fails to restart: NPE in RegisterApplicationMasterResponseProto$Builder.addAllNmTokensFromPreviousAttempts

2014-05-16 Thread Jian He (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-2053: -- Attachment: YARN-2053.patch Slider AM fails to restart: NPE in

[jira] [Commented] (YARN-2065) AM cannot create new containers after restart-NM token from previous attempt used

2014-05-16 Thread Jian He (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13999094#comment-13999094 ] Jian He commented on YARN-2065: --- Looked at the exception posted in SLIDER-34, the problem is

[jira] [Commented] (YARN-1514) Utility to benchmark ZKRMStateStore#loadState for ResourceManager-HA

2014-05-16 Thread Tsuyoshi OZAWA (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13999258#comment-13999258 ] Tsuyoshi OZAWA commented on YARN-1514: -- I'll make these parameters configurable: 1.

[jira] [Commented] (YARN-2055) Preemption: Jobs are failing due to AMs are getting launched and killed multiple times

2014-05-16 Thread Mayank Bansal (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998941#comment-13998941 ] Mayank Bansal commented on YARN-2055: - YARN-2022 is for avoiding killing AM however

[jira] [Commented] (YARN-1569) For handle(SchedulerEvent) in FifoScheduler and CapacityScheduler, SchedulerEvent should get checked (instanceof) for appropriate type before casting

2014-05-16 Thread zhihai xu (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998992#comment-13998992 ] zhihai xu commented on YARN-1569: - Hi, I want to work on this issue(YARN-1569), Can someone

[jira] [Commented] (YARN-1987) Wrapper for leveldb DBIterator to aid in handling database exceptions

2014-05-16 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998195#comment-13998195 ] Hudson commented on YARN-1987: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1779 (See

[jira] [Commented] (YARN-941) RM Should have a way to update the tokens it has for a running application

2014-05-16 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/YARN-941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998704#comment-13998704 ] Steve Loughran commented on YARN-941: - We've been doing AM restart, and seen some token

[jira] [Commented] (YARN-1366) ApplicationMasterService should Resync with the AM upon allocate call after restart

2014-05-16 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13999451#comment-13999451 ] Anubhav Dhoot commented on YARN-1366: - Seems like we are going with no resync api for

[jira] [Commented] (YARN-1474) Make schedulers services

2014-05-16 Thread Tsuyoshi OZAWA (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13999689#comment-13999689 ] Tsuyoshi OZAWA commented on YARN-1474: -- [~kkambatl], can you check a latest patch and

[jira] [Commented] (YARN-2055) Preemption: Jobs are failing due to AMs are getting launched and killed multiple times

2014-05-16 Thread Sunil G (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13999724#comment-13999724 ] Sunil G commented on YARN-2055: --- Thank you Mayank for the clarification. I have a small doubt

[jira] [Commented] (YARN-2054) Poor defaults for YARN ZK configs for retries and retry-inteval

2014-05-16 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13999771#comment-13999771 ] Karthik Kambatla commented on YARN-2054: bq. If we want these configs to match up

[jira] [Commented] (YARN-1338) Recover localized resource cache state upon nodemanager restart

2014-05-16 Thread Junping Du (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13999558#comment-13999558 ] Junping Du commented on YARN-1338: -- Hi [~jlowe], thanks for contributing a patch here.

[jira] [Created] (YARN-2068) FairScheduler uses the same ResourceCalculator for all policies

2014-05-16 Thread Karthik Kambatla (JIRA)
Karthik Kambatla created YARN-2068: -- Summary: FairScheduler uses the same ResourceCalculator for all policies Key: YARN-2068 URL: https://issues.apache.org/jira/browse/YARN-2068 Project: Hadoop YARN

[jira] [Commented] (YARN-2054) Poor defaults for YARN ZK configs for retries and retry-inteval

2014-05-16 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13999173#comment-13999173 ] Hadoop QA commented on YARN-2054: - {color:red}-1 overall{color}. Here are the results of

[jira] [Commented] (YARN-1365) ApplicationMasterService to allow Register and Unregister of an app that was running before restart

2014-05-16 Thread Tsuyoshi OZAWA (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998625#comment-13998625 ] Tsuyoshi OZAWA commented on YARN-1365: -- Oops, this comment is for YARN-1367. I'll

[jira] [Commented] (YARN-2017) Merge some of the common lib code in schedulers

2014-05-16 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13999233#comment-13999233 ] Hadoop QA commented on YARN-2017: - {color:red}-1 overall{color}. Here are the results of

[jira] [Commented] (YARN-1751) Improve MiniYarnCluster for log aggregation testing

2014-05-16 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13999003#comment-13999003 ] Hudson commented on YARN-1751: -- SUCCESS: Integrated in Hadoop-trunk-Commit #5605 (See

[jira] [Commented] (YARN-2027) YARN ignores host-specific resource requests

2014-05-16 Thread Chris Riccomini (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1343#comment-1343 ] Chris Riccomini commented on YARN-2027: --- K, feel free to close. I'm fairly sure that

[jira] [Updated] (YARN-2049) Delegation token stuff for the timeline sever

2014-05-16 Thread Zhijie Shen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen updated YARN-2049: -- Attachment: YARN-2049.2.patch Fix a bug in the previous patch: When creating the delegation token, we

[jira] [Commented] (YARN-2053) Slider AM fails to restart: NPE in RegisterApplicationMasterResponseProto$Builder.addAllNmTokensFromPreviousAttempts

2014-05-16 Thread Jian He (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998948#comment-13998948 ] Jian He commented on YARN-2053: --- LGTM, +1, submit the same patch to kick jenkins Slider AM

[jira] [Updated] (YARN-1514) Utility to benchmark ZKRMStateStore#loadState for ResourceManager-HA

2014-05-16 Thread Tsuyoshi OZAWA (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsuyoshi OZAWA updated YARN-1514: - Attachment: YARN-1514.wip.patch Attached a WIP patch. Utility to benchmark

[jira] [Created] (YARN-2065) AM cannot create new containers after restart-NM token from previous attempt used

2014-05-16 Thread Steve Loughran (JIRA)
Steve Loughran created YARN-2065: Summary: AM cannot create new containers after restart-NM token from previous attempt used Key: YARN-2065 URL: https://issues.apache.org/jira/browse/YARN-2065

[jira] [Assigned] (YARN-1424) RMAppAttemptImpl should precompute a zeroed ApplicationResourceUsageReport to return when attempt not active

2014-05-16 Thread Ray Chiang (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ray Chiang reassigned YARN-1424: Assignee: Ray Chiang RMAppAttemptImpl should precompute a zeroed ApplicationResourceUsageReport to

[jira] [Updated] (YARN-1569) For handle(SchedulerEvent) in FifoScheduler and CapacityScheduler, SchedulerEvent should get checked (instanceof) for appropriate type before casting

2014-05-16 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-1569: Assignee: (was: Anubhav Dhoot) For handle(SchedulerEvent) in FifoScheduler and

[jira] [Created] (YARN-2067) FairScheduler update/continuous-scheduling threads should start only when after the scheduler is started

2014-05-16 Thread Karthik Kambatla (JIRA)
Karthik Kambatla created YARN-2067: -- Summary: FairScheduler update/continuous-scheduling threads should start only when after the scheduler is started Key: YARN-2067 URL:

[jira] [Updated] (YARN-2056) Disable preemption at Queue level

2014-05-16 Thread Vinod Kumar Vavilapalli (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-2056: -- Fix Version/s: (was: 2.1.0-beta) Disable preemption at Queue level

[jira] [Updated] (YARN-1550) NPE in FairSchedulerAppsBlock#render

2014-05-16 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-1550: Attachment: YARN-1550.001.patch Updated caolong's patch NPE in FairSchedulerAppsBlock#render

[jira] [Updated] (YARN-1338) Recover localized resource cache state upon nodemanager restart

2014-05-16 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1338: - Attachment: YARN-1338v4.patch Updating patch to trunk. Recover localized resource cache state upon

[jira] [Assigned] (YARN-2066) Wrong field is referenced in GetApplicationsRequestPBImpl#mergeLocalToBuilder()

2014-05-16 Thread Hong Zhiguo (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong Zhiguo reassigned YARN-2066: - Assignee: Hong Zhiguo Wrong field is referenced in

[jira] [Commented] (YARN-1362) Distinguish between nodemanager shutdown for decommission vs shutdown for restart

2014-05-16 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13999018#comment-13999018 ] Hudson commented on YARN-1362: -- SUCCESS: Integrated in Hadoop-trunk-Commit #5605 (See

[jira] [Updated] (YARN-1936) Secured timeline client

2014-05-16 Thread Zhijie Shen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen updated YARN-1936: -- Attachment: YARN-1936.2.patch Upload a new patch: We shouldn't request the timeline DT when the

[jira] [Commented] (YARN-2061) Revisit logging levels in ZKRMStateStore

2014-05-16 Thread Tsuyoshi OZAWA (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998624#comment-13998624 ] Tsuyoshi OZAWA commented on YARN-2061: -- s/RACE/TRACE/ Revisit logging levels in

[jira] [Updated] (YARN-2061) Revisit logging levels in ZKRMStateStore

2014-05-16 Thread Ray Chiang (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ray Chiang updated YARN-2061: - Attachment: YARN2061-01.patch Patch to move several LOG.info messages to LOG.debug. Cleans up messages a

[jira] [Updated] (YARN-2017) Merge some of the common lib code in schedulers

2014-05-16 Thread Jian He (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-2017: -- Attachment: YARN-2017.4.patch Same patch to kick jenkins Merge some of the common lib code in schedulers

[jira] [Assigned] (YARN-1799) Enhance LocalDirAllocator in NM to consider DiskMaxUtilization cutoff

2014-05-16 Thread Sunil G (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil G reassigned YARN-1799: - Assignee: Sunil G Enhance LocalDirAllocator in NM to consider DiskMaxUtilization cutoff

[jira] [Commented] (YARN-1981) Nodemanager version is not updated when a node reconnects

2014-05-16 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13999021#comment-13999021 ] Hudson commented on YARN-1981: -- SUCCESS: Integrated in Hadoop-trunk-Commit #5605 (See

[jira] [Created] (YARN-2070) DistributedShell publish unfriendly user information to the timeline server

2014-05-16 Thread Zhijie Shen (JIRA)
Zhijie Shen created YARN-2070: - Summary: DistributedShell publish unfriendly user information to the timeline server Key: YARN-2070 URL: https://issues.apache.org/jira/browse/YARN-2070 Project: Hadoop

[jira] [Updated] (YARN-2070) DistributedShell publishes unfriendly user information to the timeline server

2014-05-16 Thread Zhijie Shen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen updated YARN-2070: -- Summary: DistributedShell publishes unfriendly user information to the timeline server (was:

[jira] [Updated] (YARN-1996) Provide alternative policies for UNHEALTHY nodes.

2014-05-16 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1996: --- Description: Currently, UNHEALTHY nodes can significantly prolong execution of large

[jira] [Updated] (YARN-1354) Recover applications upon nodemanager restart

2014-05-16 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1354: - Attachment: YARN-1354-v3.patch Updated patch now that YARN-1987 and YARN-1362 have been committed.

[jira] [Updated] (YARN-1969) Fair Scheduler: Add policy for Earliest Endtime First

2014-05-16 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1969: --- Summary: Fair Scheduler: Add policy for Earliest Endtime First (was: Fair Scheduler: Add

[jira] [Updated] (YARN-1424) RMAppAttemptImpl should precompute a zeroed ApplicationResourceUsageReport to return when attempt not active

2014-05-16 Thread Ray Chiang (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ray Chiang updated YARN-1424: - Attachment: YARN1424-01.patch First version of a potential patch. - Moves

[jira] [Updated] (YARN-1969) Fair Scheduler: Add policy for Earliest Endtime First

2014-05-16 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1969: --- Description: What we are observing is that some big jobs with many allocated containers are

[jira] [Commented] (YARN-2061) Revisit logging levels in ZKRMStateStore

2014-05-16 Thread Ray Chiang (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998911#comment-13998911 ] Ray Chiang commented on YARN-2061: -- One other observation. For the various LOG.info()

[jira] [Created] (YARN-2066) Wrong field is referenced in GetApplicationsRequestPBImpl#mergeLocalToBuilder()

2014-05-16 Thread Ted Yu (JIRA)
Ted Yu created YARN-2066: Summary: Wrong field is referenced in GetApplicationsRequestPBImpl#mergeLocalToBuilder() Key: YARN-2066 URL: https://issues.apache.org/jira/browse/YARN-2066 Project: Hadoop YARN

[jira] [Updated] (YARN-2066) Wrong field is referenced in GetApplicationsRequestPBImpl#mergeLocalToBuilder()

2014-05-16 Thread Hong Zhiguo (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong Zhiguo updated YARN-2066: -- Attachment: YARN-2066.patch Wrong field is referenced in

[jira] [Updated] (YARN-2017) Merge some of the common lib code in schedulers

2014-05-16 Thread Jian He (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-2017: -- Attachment: YARN-2017.5.patch Merge some of the common lib code in schedulers

[jira] [Commented] (YARN-1962) Timeline server is enabled by default

2014-05-16 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13994006#comment-13994006 ] Jason Lowe commented on YARN-1962: -- +1 lgtm. Will commit this early next week to give

[jira] [Commented] (YARN-2061) Revisit logging levels in ZKRMStateStore

2014-05-16 Thread Jian He (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13999111#comment-13999111 ] Jian He commented on YARN-2061: --- Hi Ray, thanks for cleaning it up. I think a reasonable

[jira] [Commented] (YARN-1861) Both RM stuck in standby mode when automatic failover is enabled

2014-05-16 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13999020#comment-13999020 ] Hudson commented on YARN-1861: -- SUCCESS: Integrated in Hadoop-trunk-Commit #5605 (See

[jira] [Commented] (YARN-1514) Utility to benchmark ZKRMStateStore#loadState for ResourceManager-HA

2014-05-16 Thread Tsuyoshi OZAWA (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998660#comment-13998660 ] Tsuyoshi OZAWA commented on YARN-1514: -- Rough design: 1. Launch ZKRMStateStore and

[jira] [Updated] (YARN-1861) Both RM stuck in standby mode when automatic failover is enabled

2014-05-16 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1861: --- Attachment: yarn-1861-6.patch Updated new patch (yarn-1861-6.patch) to fix the nits. Also,

[jira] [Commented] (YARN-766) TestNodeManagerShutdown in branch-2 should use Shell to form the output path and a format issue in trunk

2014-05-16 Thread Junping Du (JIRA)
[ https://issues.apache.org/jira/browse/YARN-766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13994129#comment-13994129 ] Junping Du commented on YARN-766: - Hi [~sseth], the patch against trunk make sense to me. So

[jira] [Updated] (YARN-1937) Add entity-level access control of the timeline data for owners only

2014-05-16 Thread Zhijie Shen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen updated YARN-1937: -- Attachment: YARN-1937.3.patch I've tested the patch on a single node cluster, which seems to work fine

[jira] [Updated] (YARN-2070) DistributedShell publishes unfriendly user information to the timeline server

2014-05-16 Thread Zhijie Shen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen updated YARN-2070: -- Labels: newbie (was: ) DistributedShell publishes unfriendly user information to the timeline server

[jira] [Commented] (YARN-1365) ApplicationMasterService to allow Register and Unregister of an app that was running before restart

2014-05-16 Thread Tsuyoshi OZAWA (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13999719#comment-13999719 ] Tsuyoshi OZAWA commented on YARN-1365: -- Sure! I'll check it.

[jira] [Updated] (YARN-1935) Security for timeline server

2014-05-16 Thread Zhijie Shen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen updated YARN-1935: -- Attachment: Timeline_Kerberos_DT_ACLs.patch I created an uber patch which integrate the pieces I've

[jira] [Updated] (YARN-2034) Description for yarn.nodemanager.localizer.cache.target-size-mb is incorrect

2014-05-16 Thread Chen He (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He updated YARN-2034: -- Attachment: YARN-2034.patch Description for yarn.nodemanager.localizer.cache.target-size-mb is incorrect

[jira] [Commented] (YARN-1365) ApplicationMasterService to allow Register and Unregister of an app that was running before restart

2014-05-16 Thread Tsuyoshi OZAWA (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998592#comment-13998592 ] Tsuyoshi OZAWA commented on YARN-1365: -- I've read your code. The prototype is

[jira] [Commented] (YARN-1986) In Fifo Scheduler, node heartbeat in between creating app and attempt causes NPE

2014-05-16 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13999022#comment-13999022 ] Hudson commented on YARN-1986: -- SUCCESS: Integrated in Hadoop-trunk-Commit #5605 (See

[jira] [Commented] (YARN-1550) NPE in FairSchedulerAppsBlock#render

2014-05-16 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14000434#comment-14000434 ] Hadoop QA commented on YARN-1550: - {color:red}-1 overall{color}. Here are the results of

[jira] [Commented] (YARN-1367) After restart NM should resync with the RM without killing containers

2014-05-16 Thread Tsuyoshi OZAWA (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998631#comment-13998631 ] Tsuyoshi OZAWA commented on YARN-1367: -- Some comments against a patch: 1. Can you fix

[jira] [Commented] (YARN-1368) Common work to re-populate containers’ state into scheduler

2014-05-16 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998823#comment-13998823 ] Wangda Tan commented on YARN-1368: -- Sorry I went to wrong JIRA, please ignore above

[jira] [Commented] (YARN-2030) Use StateMachine to simplify handleStoreEvent() in RMStateStore

2014-05-16 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13993246#comment-13993246 ] Wangda Tan commented on YARN-2030: -- +1 for this idea, I think we should handle this neatly

[jira] [Updated] (YARN-1918) Typo in description and error message for 'yarn.resourcemanager.cluster-id'

2014-05-16 Thread Anandha L Ranganathan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anandha L Ranganathan updated YARN-1918: Attachment: YARN-1918.1.patch Typo in description and error message for

[jira] [Updated] (YARN-2012) Fair Scheduler : Default rule in queue placement policy can take a queue as an optional attribute

2014-05-16 Thread Ashwin Shankar (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashwin Shankar updated YARN-2012: - Attachment: YARN-2012-v2.txt Patch refreshed. Fair Scheduler : Default rule in queue placement

[jira] [Commented] (YARN-1354) Recover applications upon nodemanager restart

2014-05-16 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14000435#comment-14000435 ] Hadoop QA commented on YARN-1354: - {color:green}+1 overall{color}. Here are the results of

[jira] [Commented] (YARN-2049) Delegation token stuff for the timeline sever

2014-05-16 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14000471#comment-14000471 ] Hadoop QA commented on YARN-2049: - {color:red}-1 overall{color}. Here are the results of

[jira] [Commented] (YARN-2053) Slider AM fails to restart: NPE in RegisterApplicationMasterResponseProto$Builder.addAllNmTokensFromPreviousAttempts

2014-05-16 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13999533#comment-13999533 ] Hadoop QA commented on YARN-2053: - {color:green}+1 overall{color}. Here are the results of

[jira] [Updated] (YARN-1339) Recover DeletionService state upon nodemanager restart

2014-05-16 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1339: - Attachment: YARN-1339v4.patch Updated patch now that YARN-1987 has been committed. Recover

[jira] [Updated] (YARN-2069) Add cross-user preemption within CapacityScheduler's leaf-queue

2014-05-16 Thread Vinod Kumar Vavilapalli (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-2069: -- Fix Version/s: (was: 2.1.0-beta) Add cross-user preemption within

[jira] [Commented] (YARN-1366) ApplicationMasterService should Resync with the AM upon allocate call after restart

2014-05-16 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14000284#comment-14000284 ] Bikas Saha commented on YARN-1366: -- bq. Seems like we are going with no resync api for now

[jira] [Commented] (YARN-1936) Secured timeline client

2014-05-16 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14000448#comment-14000448 ] Hadoop QA commented on YARN-1936: - {color:red}-1 overall{color}. Here are the results of

[jira] [Commented] (YARN-1365) ApplicationMasterService to allow Register and Unregister of an app that was running before restart

2014-05-16 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998979#comment-13998979 ] Anubhav Dhoot commented on YARN-1365: - Hi [~ozawa] just saw your comment after i had it

[jira] [Commented] (YARN-1976) Tracking url missing http protocol for FAILED application

2014-05-16 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13999004#comment-13999004 ] Hudson commented on YARN-1976: -- SUCCESS: Integrated in Hadoop-trunk-Commit #5605 (See

[jira] [Assigned] (YARN-1569) For handle(SchedulerEvent) in FifoScheduler and CapacityScheduler, SchedulerEvent should get checked (instanceof) for appropriate type before casting

2014-05-16 Thread zhihai xu (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhihai xu reassigned YARN-1569: --- Assignee: zhihai xu For handle(SchedulerEvent) in FifoScheduler and CapacityScheduler,

[jira] [Commented] (YARN-2066) Wrong field is referenced in GetApplicationsRequestPBImpl#mergeLocalToBuilder()

2014-05-16 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14000393#comment-14000393 ] Hadoop QA commented on YARN-2066: - {color:green}+1 overall{color}. Here are the results of

[jira] [Assigned] (YARN-1569) For handle(SchedulerEvent) in FifoScheduler and CapacityScheduler, SchedulerEvent should get checked (instanceof) for appropriate type before casting

2014-05-16 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot reassigned YARN-1569: --- Assignee: Anubhav Dhoot For handle(SchedulerEvent) in FifoScheduler and CapacityScheduler,

[jira] [Commented] (YARN-1368) Common work to re-populate containers’ state into scheduler

2014-05-16 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998822#comment-13998822 ] Wangda Tan commented on YARN-1368: -- LGTM, +1 (non-binding). Please kick off Jenkins

[jira] [Commented] (YARN-1338) Recover localized resource cache state upon nodemanager restart

2014-05-16 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14000398#comment-14000398 ] Hadoop QA commented on YARN-1338: - {color:green}+1 overall{color}. Here are the results of

[jira] [Commented] (YARN-1474) Make schedulers services

2014-05-16 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14000528#comment-14000528 ] Karthik Kambatla commented on YARN-1474: Looks like it did run, but couldn't apply

[jira] [Assigned] (YARN-2065) AM cannot create new containers after restart-NM token from previous attempt used

2014-05-16 Thread Jian He (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He reassigned YARN-2065: - Assignee: Jian He AM cannot create new containers after restart-NM token from previous attempt used

[jira] [Commented] (YARN-1339) Recover DeletionService state upon nodemanager restart

2014-05-16 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14000477#comment-14000477 ] Hadoop QA commented on YARN-1339: - {color:green}+1 overall{color}. Here are the results of
