[jira] [Updated] (YARN-6677) Preempt opportunistic containers when root container cgroup goes over memory limit

2018-06-06 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-6677: - Attachment: YARN-6677.05.patch > Preempt opportunistic containers when root container cgroup goes over mem

[jira] [Commented] (YARN-6677) Preempt opportunistic containers when root container cgroup goes over memory limit

2018-06-06 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16504204#comment-16504204 ] Haibo Chen commented on YARN-6677: -- Yes, attached a new patch to address the findbugs war

[jira] [Updated] (YARN-6677) Preempt opportunistic containers when root container cgroup goes over memory limit

2018-06-06 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-6677: - Attachment: YARN-6677.04.patch > Preempt opportunistic containers when root container cgroup goes over mem

[jira] [Updated] (YARN-6794) Fair Scheduler to explicitly promote OPPORTUNISITIC containers locally at the node where they're running

2018-06-06 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-6794: - Attachment: YARN-6794-YARN-1011.prelim.patch > Fair Scheduler to explicitly promote OPPORTUNISITIC contain

[jira] [Commented] (YARN-1014) Configure OOM Killer to kill OPPORTUNISTIC containers first

2018-06-06 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16503822#comment-16503822 ] Haibo Chen commented on YARN-1014: -- Based on the previous description of this jira, yes.

[jira] [Created] (YARN-8393) timeline flow runs API createdtimestart/createdtimeend parameter does not work

2018-06-04 Thread Haibo Chen (JIRA)
Haibo Chen created YARN-8393: Summary: timeline flow runs API createdtimestart/createdtimeend parameter does not work Key: YARN-8393 URL: https://issues.apache.org/jira/browse/YARN-8393 Project: Hadoop YA

[jira] [Commented] (YARN-6677) Preempt opportunistic containers when root container cgroup goes over memory limit

2018-06-04 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16501039#comment-16501039 ] Haibo Chen commented on YARN-6677: -- Updated the patch to address all but one checkstyle i

[jira] [Updated] (YARN-6677) Preempt opportunistic containers when root container cgroup goes over memory limit

2018-06-04 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-6677: - Attachment: YARN-6677.03.patch > Preempt opportunistic containers when root container cgroup goes over mem

[jira] [Commented] (YARN-6677) Preempt opportunistic containers when root container cgroup goes over memory limit

2018-06-04 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16501010#comment-16501010 ] Haibo Chen commented on YARN-6677: -- I think the findbugs warning is bogus because the wra

[jira] [Updated] (YARN-6677) Preempt opportunistic containers when root container cgroup goes over memory limit

2018-06-04 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-6677: - Attachment: YARN-6677.02.patch > Preempt opportunistic containers when root container cgroup goes over mem

[jira] [Commented] (YARN-8388) TestCGroupElasticMemoryController.testNormalExit() hangs on Linux

2018-06-04 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16500650#comment-16500650 ] Haibo Chen commented on YARN-8388: -- +1 on the latest patch (02) pending Jenkins. > TestC

[jira] [Updated] (YARN-8240) Add queue-level control to allow all applications in a queue to opt-out

2018-06-04 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-8240: - Attachment: YARN-8240-YARN-1011.02.patch > Add queue-level control to allow all applications in a queue to

[jira] [Commented] (YARN-8240) Add queue-level control to allow all applications in a queue to opt-out

2018-06-04 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16500617#comment-16500617 ] Haibo Chen commented on YARN-8240: -- Thanks for the review [~miklos.szeg...@cloudera.com],

[jira] [Commented] (YARN-8388) TestCGroupElasticMemoryController.testNormalExit() hangs on Linux

2018-06-04 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16500570#comment-16500570 ] Haibo Chen commented on YARN-8388: -- Two minor comments 1) Let's add a comment above ` wh

[jira] [Commented] (YARN-8390) Fix API incompatible changes in FairScheduler's AllocationFileLoaderService

2018-06-04 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16500451#comment-16500451 ] Haibo Chen commented on YARN-8390: -- Thanks [~grepas] for the quick fix. I have checked in

[jira] [Updated] (YARN-8391) Investigate AllocationFileLoaderService.reloadListener locking issue

2018-06-04 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-8391: - Description: Per findbugs report in YARN-8390, there is some inconsistent locking of reloadListener h

[jira] [Created] (YARN-8391) Investigate AllocationFileLoaderService.reloadListener locking issue

2018-06-04 Thread Haibo Chen (JIRA)
Haibo Chen created YARN-8391: Summary: Investigate AllocationFileLoaderService.reloadListener locking issue Key: YARN-8391 URL: https://issues.apache.org/jira/browse/YARN-8391 Project: Hadoop YARN

[jira] [Commented] (YARN-8390) Fix API incompatible changes in FairScheduler's AllocationFileLoaderService

2018-06-04 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16500437#comment-16500437 ] Haibo Chen commented on YARN-8390: -- +1. The findbugs issue is independent of this patch, t

[jira] [Updated] (YARN-8191) Fair scheduler: queue deletion without RM restart

2018-06-04 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-8191: - Hadoop Flags: Incompatible change,Reviewed (was: Reviewed) > Fair scheduler: queue deletion without RM re

[jira] [Updated] (YARN-8240) Add queue-level control to allow all applications in a queue to opt-out

2018-06-04 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-8240: - Attachment: YARN-8240-YARN-1011.01.patch > Add queue-level control to allow all applications in a queue to

[jira] [Commented] (YARN-8240) Add queue-level control to allow all applications in a queue to opt-out

2018-06-04 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16500315#comment-16500315 ] Haibo Chen commented on YARN-8240: -- TestCapacityOverTimePolicy.testAllocation is flaky an

[jira] [Created] (YARN-8388) TestCGroupElasticMemoryController.testNormalExit() hangs on Linux

2018-06-01 Thread Haibo Chen (JIRA)
Haibo Chen created YARN-8388: Summary: TestCGroupElasticMemoryController.testNormalExit() hangs on Linux Key: YARN-8388 URL: https://issues.apache.org/jira/browse/YARN-8388 Project: Hadoop YARN

[jira] [Commented] (YARN-8375) TestCGroupElasticMemoryController fails surefire build

2018-06-01 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498696#comment-16498696 ] Haibo Chen commented on YARN-8375: -- [~miklos.szeg...@cloudera.com] and I debugged this to

[jira] [Updated] (YARN-8240) Add queue-level control to allow all applications in a queue to opt-out

2018-06-01 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-8240: - Attachment: YARN-8240-YARN-1011.00.patch > Add queue-level control to allow all applications in a queue to

[jira] [Updated] (YARN-6677) Preempt opportunistic containers when root container cgroup goes over memory limit

2018-05-31 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-6677: - Issue Type: Improvement (was: Sub-task) Parent: (was: YARN-1011) > Preempt opportunistic cont

[jira] [Commented] (YARN-8250) Create another implementation of ContainerScheduler to support NM overallocation

2018-05-31 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16496872#comment-16496872 ] Haibo Chen commented on YARN-8250: -- [~asuresh], [~leftnoteasy] and I had an offline discu

[jira] [Commented] (YARN-6677) Preempt opportunistic containers when root container cgroup goes over memory limit

2018-05-31 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16496638#comment-16496638 ] Haibo Chen commented on YARN-6677: -- I updated the patch without addressing the getCGroups

[jira] [Updated] (YARN-6677) Preempt opportunistic containers when root container cgroup goes over memory limit

2018-05-31 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-6677: - Attachment: YARN-6677.01.patch > Preempt opportunistic containers when root container cgroup goes over mem

[jira] [Updated] (YARN-8373) RM Received RMFatalEvent of type CRITICAL_THREAD_CRASH

2018-05-30 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-8373: - Component/s: fairscheduler > RM Received RMFatalEvent of type CRITICAL_THREAD_CRASH > ---

[jira] [Assigned] (YARN-8375) TestCGroupElasticMemoryController fails surefire build

2018-05-29 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen reassigned YARN-8375: Assignee: Miklos Szegedi > TestCGroupElasticMemoryController fails surefire build > ---

[jira] [Commented] (YARN-8375) TestCGroupElasticMemoryController fails surefire build

2018-05-29 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16493985#comment-16493985 ] Haibo Chen commented on YARN-8375: -- Thanks [~jlowe] for reporting. We'll take a look at t

[jira] [Commented] (YARN-6677) Preempt opportunistic containers when root container cgroup goes over memory limit

2018-05-29 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16493926#comment-16493926 ] Haibo Chen commented on YARN-6677: -- Thanks [~miklos.szeg...@cloudera.com] for the comment

[jira] [Commented] (YARN-8191) Fair scheduler: queue deletion without RM restart

2018-05-24 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16489751#comment-16489751 ] Haibo Chen commented on YARN-8191: -- +1 on the latest patch. Will check it in later today.

[jira] [Commented] (YARN-8191) Fair scheduler: queue deletion without RM restart

2018-05-24 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16489410#comment-16489410 ] Haibo Chen commented on YARN-8191: -- Findbugs is complaining about `return !Boolean.FALSE.

[jira] [Updated] (YARN-6677) Preempt opportunistic containers when root container cgroup goes over memory limit

2018-05-23 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-6677: - Summary: Preempt opportunistic containers when root container cgroup goes over memory limit (was: Preempt

[jira] [Updated] (YARN-6677) Preempt opportunistic containers when root container cgroup goes over memory limit

2018-05-23 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-6677: - Attachment: YARN-6677.00.patch > Preempt opportunistic containers when root container cgroup goes over memo

[jira] [Commented] (YARN-4599) Set OOM control for memory cgroups

2018-05-23 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16488210#comment-16488210 ] Haibo Chen commented on YARN-4599: -- Thanks [~sandflee] for the initial proposal, [~miklos.

[jira] [Commented] (YARN-4599) Set OOM control for memory cgroups

2018-05-23 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16487749#comment-16487749 ] Haibo Chen commented on YARN-4599: -- +1 on the latest patch. Will check it in later today i

[jira] [Commented] (YARN-8338) TimelineService V1.5 doesn't come up after HADOOP-15406

2018-05-23 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16487710#comment-16487710 ] Haibo Chen commented on YARN-8338: -- [~jlowe] [~vinodkv], I did not understand who else was

[jira] [Commented] (YARN-8191) Fair scheduler: queue deletion without RM restart

2018-05-23 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16487621#comment-16487621 ] Haibo Chen commented on YARN-8191: -- +1 pending the findbugs fix. The TestAMRestart.testPree

[jira] [Commented] (YARN-8191) Fair scheduler: queue deletion without RM restart

2018-05-22 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16484491#comment-16484491 ] Haibo Chen commented on YARN-8191: -- {quote}The point here is that the set of removed queue

[jira] [Commented] (YARN-8248) Job hangs when a job requests a resource that its queue does not have

2018-05-21 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16482576#comment-16482576 ] Haibo Chen commented on YARN-8248: -- I'm okay with not fixing this one. But it is indeed a

[jira] [Commented] (YARN-8248) Job hangs when a job requests a resource that its queue does not have

2018-05-20 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16482168#comment-16482168 ] Haibo Chen commented on YARN-8248: -- [~snemeth] Can you please address the checkstyle issue

[jira] [Commented] (YARN-8191) Fair scheduler: queue deletion without RM restart

2018-05-18 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16481341#comment-16481341 ] Haibo Chen commented on YARN-8191: -- Thanks [~grepas] for the patch! I have some comments.

[jira] [Created] (YARN-8325) Miscellaneous QueueManager code clean up

2018-05-18 Thread Haibo Chen (JIRA)
Haibo Chen created YARN-8325: Summary: Miscellaneous QueueManager code clean up Key: YARN-8325 URL: https://issues.apache.org/jira/browse/YARN-8325 Project: Hadoop YARN Issue Type: Improvement

[jira] [Created] (YARN-8323) FairScheduler.allocConf should be declared as volatile

2018-05-18 Thread Haibo Chen (JIRA)
Haibo Chen created YARN-8323: Summary: FairScheduler.allocConf should be declared as volatile Key: YARN-8323 URL: https://issues.apache.org/jira/browse/YARN-8323 Project: Hadoop YARN Issue Type:

[jira] [Created] (YARN-8322) Change log level when there is an IOException when the allocation file is loaded

2018-05-18 Thread Haibo Chen (JIRA)
Haibo Chen created YARN-8322: Summary: Change log level when there is an IOException when the allocation file is loaded Key: YARN-8322 URL: https://issues.apache.org/jira/browse/YARN-8322 Project: Hadoop

[jira] [Created] (YARN-8321) AllocationFileLoaderService.getAllocationFile() should be declared as VisibleForTest

2018-05-18 Thread Haibo Chen (JIRA)
Haibo Chen created YARN-8321: Summary: AllocationFileLoaderService.getAllocationFile() should be declared as VisibleForTest Key: YARN-8321 URL: https://issues.apache.org/jira/browse/YARN-8321 Project: Ha

[jira] [Commented] (YARN-8248) Job hangs when a job requests a resource that its queue does not have

2018-05-18 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16481009#comment-16481009 ] Haibo Chen commented on YARN-8248: -- +1 pending Jenkins. > Job hangs when a job requests a

[jira] [Commented] (YARN-8248) Job hangs when a job requests a resource that its queue does not have

2018-05-18 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16480931#comment-16480931 ] Haibo Chen commented on YARN-8248: -- Thanks [~snemeth] for updating the patch! I agree that

[jira] [Commented] (YARN-8248) Job hangs when a job requests a resource that its queue does not have

2018-05-17 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16479664#comment-16479664 ] Haibo Chen commented on YARN-8248: -- Thanks [~snemeth] for the update! I have a few follow-

[jira] [Commented] (YARN-4599) Set OOM control for memory cgroups

2018-05-17 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16479588#comment-16479588 ] Haibo Chen commented on YARN-4599: -- Two more comments: 1) The oomHandler will still have

[jira] [Updated] (YARN-8248) Job hangs when a job requests a resource that its queue does not have

2018-05-17 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-8248: - Component/s: (was: yarn) > Job hangs when a job requests a resource that its queue does not have >

[jira] [Updated] (YARN-8248) Job hangs when a job requests a resource that its queue does not have

2018-05-17 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-8248: - Target Version/s: 3.2.0, 3.1.1 (was: 3.1.1) > Job hangs when a job requests a resource that its queue does

[jira] [Commented] (YARN-8270) Adding JMX Metrics for Timeline Collector and Reader

2018-05-17 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16479178#comment-16479178 ] Haibo Chen commented on YARN-8270: -- The getInstance() methods are not thread-safe, if two

[jira] [Commented] (YARN-8270) Adding JMX Metrics for Timeline Collector and Reader

2018-05-17 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16479169#comment-16479169 ] Haibo Chen commented on YARN-8270: -- Thanks [~Sushil-K-S] for the patch! I don't know much

[jira] [Assigned] (YARN-8270) Adding JMX Metrics for Timeline Collector and Reader

2018-05-17 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen reassigned YARN-8270: Assignee: Sushil Kumar S > Adding JMX Metrics for Timeline Collector and Reader > --

[jira] [Commented] (YARN-7933) [atsv2 read acls] Add TimelineWriter#writeDomain

2018-05-16 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16477991#comment-16477991 ] Haibo Chen commented on YARN-7933: -- Thanks [~rohithsharma] for the patch! I have checked i

[jira] [Commented] (YARN-4599) Set OOM control for memory cgroups

2018-05-16 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16477826#comment-16477826 ] Haibo Chen commented on YARN-4599: -- +1 on the latest patch, pending one minor comment.

[jira] [Commented] (YARN-7933) [atsv2 read acls] Add TimelineWriter#writeDomain

2018-05-16 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1648#comment-1648 ] Haibo Chen commented on YARN-7933: -- +1 on the 06 patch. Will commit it later today. > [at

[jira] [Commented] (YARN-4599) Set OOM control for memory cgroups

2018-05-15 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16476682#comment-16476682 ] Haibo Chen commented on YARN-4599: -- Thanks [~miklos.szeg...@cloudera.com] for updating the

[jira] [Commented] (YARN-8250) Create another implementation of ContainerScheduler to support NM overallocation

2018-05-15 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16476384#comment-16476384 ] Haibo Chen commented on YARN-8250: -- [~leftnoteasy] I can introduce a pluggable policy th

[jira] [Commented] (YARN-8250) Create another implementation of ContainerScheduler to support NM overallocation

2018-05-15 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16476376#comment-16476376 ] Haibo Chen commented on YARN-8250: -- Thanks [~asuresh] for the response. {quote} the whole

[jira] [Commented] (YARN-8250) Create another implementation of ContainerScheduler to support NM overallocation

2018-05-14 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16475100#comment-16475100 ] Haibo Chen commented on YARN-8250: -- [~leftnoteasy] Thanks for your comments. I agree that

[jira] [Updated] (YARN-8248) Job hangs when a job requests a resource that its queue does not have

2018-05-14 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-8248: - Summary: Job hangs when a job requests a resource that its queue does not have (was: Job hangs when a queu

[jira] [Commented] (YARN-8248) Job hangs when a queue is specified and the maxResources of the queue cannot satisfy the AM resource request

2018-05-14 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16475058#comment-16475058 ] Haibo Chen commented on YARN-8248: -- {quote}as {{RMAppManager.validateAndCreateResourceRequ

[jira] [Commented] (YARN-8250) Create another implementation of ContainerScheduler to support NM overallocation

2018-05-14 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16475011#comment-16475011 ] Haibo Chen commented on YARN-8250: -- My understanding of SHED_QUEUED_CONTAINERS is to notif

[jira] [Commented] (YARN-7933) [atsv2 read acls] Add TimelineWriter#writeDomain

2018-05-14 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16474761#comment-16474761 ] Haibo Chen commented on YARN-7933: -- I am okay with just removing the TODO comment, and have

[jira] [Commented] (YARN-7933) [atsv2 read acls] Add TimelineWriter#writeDomain

2018-05-14 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16474759#comment-16474759 ] Haibo Chen commented on YARN-7933: -- If it's not in our design, I am inclined to remove it

[jira] [Updated] (YARN-6677) Preempt all opportunistic containers when root container cgroup goes over memory limit

2018-05-14 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-6677: - Summary: Preempt all opportunistic containers when root container cgroup goes over memory limit (was: Paus

[jira] [Commented] (YARN-8250) Create another implementation of ContainerScheduler to support NM overallocation

2018-05-14 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16474715#comment-16474715 ] Haibo Chen commented on YARN-8250: -- [~asuresh] Did you get a chance to look at the patch?

[jira] [Commented] (YARN-8130) Race condition when container events are published for KILLED applications

2018-05-14 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16474552#comment-16474552 ] Haibo Chen commented on YARN-8130: -- Checking this in later today if no objection > Race c

[jira] [Commented] (YARN-4599) Set OOM control for memory cgroups

2018-05-14 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16474543#comment-16474543 ] Haibo Chen commented on YARN-4599: -- Thanks [~miklos.szeg...@cloudera.com] for the patch! T

[jira] [Commented] (YARN-3610) FairScheduler: Add steady-fair-shares to the REST API documentation

2018-05-11 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16472657#comment-16472657 ] Haibo Chen commented on YARN-3610: -- +1. Checking this in shortly. > FairScheduler: Add st

[jira] [Commented] (YARN-8248) Job hangs when a queue is specified and the maxResources of the queue cannot satisfy the AM resource request

2018-05-11 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16472598#comment-16472598 ] Haibo Chen commented on YARN-8248: -- Thanks [~snemeth] for updating the patch. I have a few

[jira] [Updated] (YARN-8248) Job hangs when a queue is specified and the maxResources of the queue cannot satisfy the AM resource request

2018-05-11 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-8248: - Summary: Job hangs when a queue is specified and the maxResources of the queue cannot satisfy the AM resour

[jira] [Commented] (YARN-8268) Fair scheduler: reservable queue is configured both as parent and leaf queue

2018-05-11 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16472452#comment-16472452 ] Haibo Chen commented on YARN-8268: -- Thanks [~grepas] for the contribution, [~wilfreds] for

[jira] [Commented] (YARN-8268) Fair scheduler: reservable queue is configured both as parent and leaf queue

2018-05-11 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16472423#comment-16472423 ] Haibo Chen commented on YARN-8268: -- +1. Checking this in shortly. > Fair scheduler: reser

[jira] [Commented] (YARN-8130) Race condition when container events are published for KILLED applications

2018-05-11 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16472387#comment-16472387 ] Haibo Chen commented on YARN-8130: -- +1 pending jenkins. > Race condition when container e

[jira] [Commented] (YARN-8248) Job hangs when queue is specified and that queue has 0 capability of a resource

2018-05-11 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16472335#comment-16472335 ] Haibo Chen commented on YARN-8248: -- Thanks [~snemeth] for the clarification. I think in g

[jira] [Commented] (YARN-8130) Race condition when container events are published for KILLED applications

2018-05-11 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16472205#comment-16472205 ] Haibo Chen commented on YARN-8130: -- Yes > Race condition when container events are publis

[jira] [Commented] (YARN-7933) [atsv2 read acls] Add TimelineWriter#writeDomain

2018-05-11 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16472082#comment-16472082 ] Haibo Chen commented on YARN-7933: -- While thinking of the appId issue, there's one questio

[jira] [Commented] (YARN-7933) [atsv2 read acls] Add TimelineWriter#writeDomain

2018-05-11 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16472077#comment-16472077 ] Haibo Chen commented on YARN-7933: -- [~rohithsharma] There is a TODO comment in TimelineColle

[jira] [Commented] (YARN-8130) Race condition when container events are published for KILLED applications

2018-05-11 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16472010#comment-16472010 ] Haibo Chen commented on YARN-8130: -- [~rohithsharma] Do you think it's viable that we just

[jira] [Commented] (YARN-8248) Job hangs when queue is specified and that queue has 0 capability of a resource

2018-05-10 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16471309#comment-16471309 ] Haibo Chen commented on YARN-8248: -- Thanks [~snemeth] for the patch. I have some questions

[jira] [Updated] (YARN-7715) Support NM promotion/demotion of running containers.

2018-05-10 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-7715: - Summary: Support NM promotion/demotion of running containers. (was: Update CPU and Memory cgroups params o

[jira] [Commented] (YARN-7715) Update CPU and Memory cgroups params on container update as well.

2018-05-10 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16470866#comment-16470866 ] Haibo Chen commented on YARN-7715: -- +1. Checking this in shortly. > Update CPU and Memory

[jira] [Commented] (YARN-7933) [atsv2 read acls] Add TimelineWriter#writeDomain

2018-05-10 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16470490#comment-16470490 ] Haibo Chen commented on YARN-7933: -- {quote} For storing in TimelineEntity, we should discu

[jira] [Updated] (YARN-8130) Race condition when container events are published for KILLED applications

2018-05-10 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-8130: - Issue Type: Sub-task (was: Bug) Parent: YARN-7055 > Race condition when container events are publi

[jira] [Updated] (YARN-8270) Adding JMX Metrics for Timeline Collector and Reader

2018-05-10 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-8270: - Issue Type: Sub-task (was: Improvement) Parent: YARN-7055 > Adding JMX Metrics for Timeline Collec

[jira] [Updated] (YARN-8253) HTTPS Ats v2 api call fails with "bad HTTP parsed"

2018-05-10 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-8253: - Issue Type: Sub-task (was: Bug) Parent: YARN-7055 > HTTPS Ats v2 api call fails with "bad HTTP par

[jira] [Updated] (YARN-8215) ATS v2 returns invalid YARN_CONTAINER_ALLOCATED_HOST_HTTP_ADDRESS from NM

2018-05-10 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-8215: - Issue Type: Sub-task (was: Bug) Parent: YARN-7055 > ATS v2 returns invalid YARN_CONTAINER_ALLOCATE

[jira] [Updated] (YARN-8132) Final Status of applications shown as UNDEFINED in ATS app queries

2018-05-10 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-8132: - Issue Type: Sub-task (was: Bug) Parent: YARN-7055 > Final Status of applications shown as UNDEFINE

[jira] [Updated] (YARN-8247) Incorrect HTTP status code returned by ATSv2 for non-whitelisted users

2018-05-10 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-8247: - Issue Type: Sub-task (was: Bug) Parent: YARN-7055 > Incorrect HTTP status code returned by ATSv2 f

[jira] [Updated] (YARN-8107) Give an informative message when incorrect format is used in ATSv2 filter attributes

2018-05-10 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-8107: - Parent Issue: YARN-7055 (was: YARN-1011) > Give an informative message when incorrect format is used in AT

[jira] [Updated] (YARN-8129) Improve error message for invalid value in fields attribute

2018-05-10 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-8129: - Parent Issue: YARN-7055 (was: YARN-1011) > Improve error message for invalid value in fields attribute > -

[jira] [Updated] (YARN-8129) Improve error message for invalid value in fields attribute

2018-05-10 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-8129: - Issue Type: Sub-task (was: Improvement) Parent: YARN-1011 > Improve error message for invalid valu

[jira] [Updated] (YARN-8107) Give an informative message when incorrect format is used in ATSv2 filter attributes

2018-05-10 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-8107: - Issue Type: Sub-task (was: Bug) Parent: YARN-1011 > Give an informative message when incorrect for

[jira] [Commented] (YARN-8132) Final Status of applications shown as UNDEFINED in ATS app queries

2018-05-10 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16470462#comment-16470462 ] Haibo Chen commented on YARN-8132: -- Both YARN_APPLICATION_STATE and YARN_APPLICATON_FINAL_

[jira] [Commented] (YARN-8130) Race condition when container events are published for KILLED applications

2018-05-10 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16470442#comment-16470442 ] Haibo Chen commented on YARN-8130: -- [~rohithsharma] I have one question about the race con
