[jira] [Commented] (YARN-8130) Race condition when container events are published for KILLED applications

2018-05-11 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16472387#comment-16472387 ] Haibo Chen commented on YARN-8130: -- +1 pending Jenkins. > Race condition when container events are

[jira] [Commented] (YARN-8268) Fair scheduler: reservable queue is configured both as parent and leaf queue

2018-05-11 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16472452#comment-16472452 ] Haibo Chen commented on YARN-8268: -- Thanks [~grepas] for the contribution, [~wilfreds] for the additional

[jira] [Commented] (YARN-8130) Race condition when container events are published for KILLED applications

2018-05-11 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16472205#comment-16472205 ] Haibo Chen commented on YARN-8130: -- Yes > Race condition when container events are published for KILLED

[jira] [Commented] (YARN-8248) Job hangs when queue is specified and that queue has 0 capability of a resource

2018-05-11 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16472335#comment-16472335 ] Haibo Chen commented on YARN-8248: -- Thanks [~snemeth] for the clarification. I think in general it's okay

[jira] [Updated] (YARN-8248) Job hangs when a queue is specified and the maxResources of the queue cannot satisfy the AM resource request

2018-05-11 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-8248: - Summary: Job hangs when a queue is specified and the maxResources of the queue cannot satisfy the AM

[jira] [Commented] (YARN-8248) Job hangs when a queue is specified and the maxResources of the queue cannot satisfy the AM resource request

2018-05-11 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16472598#comment-16472598 ] Haibo Chen commented on YARN-8248: -- Thanks [~snemeth] for updating the patch. I have a few more

[jira] [Commented] (YARN-3610) FairScheduler: Add steady-fair-shares to the REST API documentation

2018-05-11 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16472657#comment-16472657 ] Haibo Chen commented on YARN-3610: -- +1. Checking this in shortly. > FairScheduler: Add steady-fair-shares

[jira] [Assigned] (YARN-8270) Adding JMX Metrics for Timeline Collector and Reader

2018-05-17 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen reassigned YARN-8270: Assignee: Sushil Kumar S > Adding JMX Metrics for Timeline Collector and Reader >

[jira] [Commented] (YARN-8270) Adding JMX Metrics for Timeline Collector and Reader

2018-05-17 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479169#comment-16479169 ] Haibo Chen commented on YARN-8270: -- Thanks [~Sushil-K-S] for the patch! I don't know much about the metric

[jira] [Commented] (YARN-8270) Adding JMX Metrics for Timeline Collector and Reader

2018-05-17 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479178#comment-16479178 ] Haibo Chen commented on YARN-8270: -- The getInstance() methods are not thread-safe: if two threads happen

[jira] [Updated] (YARN-8248) Job hangs when a job requests a resource that its queue does not have

2018-05-17 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-8248: - Target Version/s: 3.2.0, 3.1.1 (was: 3.1.1) > Job hangs when a job requests a resource that its queue

[jira] [Updated] (YARN-8248) Job hangs when a job requests a resource that its queue does not have

2018-05-17 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-8248: - Component/s: (was: yarn) > Job hangs when a job requests a resource that its queue does not have >

[jira] [Commented] (YARN-4599) Set OOM control for memory cgroups

2018-05-16 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16477826#comment-16477826 ] Haibo Chen commented on YARN-4599: -- +1 on the latest patch, pending one minor comment. In

[jira] [Commented] (YARN-7933) [atsv2 read acls] Add TimelineWriter#writeDomain

2018-05-16 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1648#comment-1648 ] Haibo Chen commented on YARN-7933: -- +1 on the 06 patch. Will commit it later today. > [atsv2 read acls]

[jira] [Commented] (YARN-8250) Create another implementation of ContainerScheduler to support NM overallocation

2018-05-15 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16476376#comment-16476376 ] Haibo Chen commented on YARN-8250: -- Thanks [~asuresh] for the response. {quote} the whole point of

[jira] [Commented] (YARN-8250) Create another implementation of ContainerScheduler to support NM overallocation

2018-05-15 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16476384#comment-16476384 ] Haibo Chen commented on YARN-8250: -- [~leftnoteasy] I can introduce a pluggable policy that encapsulates

[jira] [Commented] (YARN-8248) Job hangs when a job requests a resource that its queue does not have

2018-05-21 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16482576#comment-16482576 ] Haibo Chen commented on YARN-8248: -- I'm okay with not fixing this one. But it is indeed a convention to

[jira] [Commented] (YARN-4599) Set OOM control for memory cgroups

2018-05-23 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488210#comment-16488210 ] Haibo Chen commented on YARN-4599: -- Thanks [~sandflee] for the initial proposal,

[jira] [Updated] (YARN-6677) Preempt opportunistic containers when root container cgroup goes over memory limit

2018-05-23 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-6677: - Summary: Preempt opportunistic containers when root container cgroup goes over memory limit (was: Preempt

[jira] [Updated] (YARN-6677) Preempt opportunistic containers when root container cgroup goes over memory limit

2018-05-23 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-6677: - Attachment: YARN-6677.00.patch > Preempt opportunistic containers when root container cgroup goes over

[jira] [Commented] (YARN-8191) Fair scheduler: queue deletion without RM restart

2018-05-22 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484491#comment-16484491 ] Haibo Chen commented on YARN-8191: -- {quote}The point here is that the set of removed queues can be

[jira] [Commented] (YARN-8191) Fair scheduler: queue deletion without RM restart

2018-05-24 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489410#comment-16489410 ] Haibo Chen commented on YARN-8191: -- Findbugs is complaining about `return

[jira] [Commented] (YARN-8191) Fair scheduler: queue deletion without RM restart

2018-05-24 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489751#comment-16489751 ] Haibo Chen commented on YARN-8191: -- +1 on the latest patch. Will check it in later today. > Fair

[jira] [Commented] (YARN-4599) Set OOM control for memory cgroups

2018-05-17 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479588#comment-16479588 ] Haibo Chen commented on YARN-4599: -- Two more comments: 1) The oomHandler will still have the flag as

[jira] [Created] (YARN-8323) FairScheduler.allocConf should be declared as volatile

2018-05-18 Thread Haibo Chen (JIRA)
Haibo Chen created YARN-8323: Summary: FairScheduler.allocConf should be declared as volatile Key: YARN-8323 URL: https://issues.apache.org/jira/browse/YARN-8323 Project: Hadoop YARN Issue Type:

[jira] [Commented] (YARN-8248) Job hangs when a job requests a resource that its queue does not have

2018-05-18 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16480931#comment-16480931 ] Haibo Chen commented on YARN-8248: -- Thanks [~snemeth] for updating the patch! I agree that it's valuable

[jira] [Created] (YARN-8321) AllocationFileLoaderService. getAllocationFile() should be declared as VisibleForTest

2018-05-18 Thread Haibo Chen (JIRA)
Haibo Chen created YARN-8321: Summary: AllocationFileLoaderService. getAllocationFile() should be declared as VisibleForTest Key: YARN-8321 URL: https://issues.apache.org/jira/browse/YARN-8321 Project:

[jira] [Commented] (YARN-8248) Job hangs when a job requests a resource that its queue does not have

2018-05-18 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16481009#comment-16481009 ] Haibo Chen commented on YARN-8248: -- +1 pending Jenkins. > Job hangs when a job requests a resource that

[jira] [Created] (YARN-8322) Change log level when there is an IOException when the allocation file is loaded

2018-05-18 Thread Haibo Chen (JIRA)
Haibo Chen created YARN-8322: Summary: Change log level when there is an IOException when the allocation file is loaded Key: YARN-8322 URL: https://issues.apache.org/jira/browse/YARN-8322 Project: Hadoop

[jira] [Commented] (YARN-8248) Job hangs when a job requests a resource that its queue does not have

2018-05-17 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479664#comment-16479664 ] Haibo Chen commented on YARN-8248: -- Thanks [~snemeth] for the update! I have a few follow-up comments. 1) 

[jira] [Commented] (YARN-7933) [atsv2 read acls] Add TimelineWriter#writeDomain

2018-05-16 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16477991#comment-16477991 ] Haibo Chen commented on YARN-7933: -- Thanks [~rohithsharma] for the patch! I have checked it in trunk. >

[jira] [Commented] (YARN-4599) Set OOM control for memory cgroups

2018-05-15 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16476682#comment-16476682 ] Haibo Chen commented on YARN-4599: -- Thanks [~miklos.szeg...@cloudera.com] for updating the patch! The new

[jira] [Commented] (YARN-8191) Fair scheduler: queue deletion without RM restart

2018-05-23 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487621#comment-16487621 ] Haibo Chen commented on YARN-8191: -- +1 pending the findbug fix. The

[jira] [Commented] (YARN-8248) Job hangs when a job requests a resource that its queue does not have

2018-05-20 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16482168#comment-16482168 ] Haibo Chen commented on YARN-8248: -- [~snemeth] Can you please address the checkstyle issues? > Job hangs

[jira] [Commented] (YARN-4599) Set OOM control for memory cgroups

2018-05-23 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487749#comment-16487749 ] Haibo Chen commented on YARN-4599: -- +1 on the latest patch. Will check it in later today if no objections

[jira] [Commented] (YARN-8338) TimelineService V1.5 doesn't come up after HADOOP-15406

2018-05-23 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487710#comment-16487710 ] Haibo Chen commented on YARN-8338: -- [~jlowe] [~vinodkv], I did not understand who else was depending on

[jira] [Created] (YARN-8325) Miscellaneous QueueManager code clean up

2018-05-18 Thread Haibo Chen (JIRA)
Haibo Chen created YARN-8325: Summary: Miscellaneous QueueManager code clean up Key: YARN-8325 URL: https://issues.apache.org/jira/browse/YARN-8325 Project: Hadoop YARN Issue Type: Improvement

[jira] [Commented] (YARN-8191) Fair scheduler: queue deletion without RM restart

2018-05-18 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16481341#comment-16481341 ] Haibo Chen commented on YARN-8191: -- Thanks [~grepas] for the patch! I have some comments. 1) In

[jira] [Commented] (YARN-6677) Preempt opportunistic containers when root container cgroup goes over memory limit

2018-06-08 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16506151#comment-16506151 ] Haibo Chen commented on YARN-6677: -- Thanks for the review, [~szegedim]! > Preempt opportunistic

[jira] [Commented] (YARN-8250) Create another implementation of ContainerScheduler to support NM overallocation

2018-06-08 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16506296#comment-16506296 ] Haibo Chen commented on YARN-8250: -- Ping [~asuresh].  > Create another implementation of

[jira] [Resolved] (YARN-6800) Add opportunity to start containers while periodically checking for preemption

2018-06-12 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen resolved YARN-6800. -- Resolution: Implemented Fix Version/s: YARN-1011 Resolving this as it is already done as part of

[jira] [Assigned] (YARN-6409) RM does not blacklist node for AM launch failures

2018-06-12 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen reassigned YARN-6409: Assignee: Kanwaljeet Sachdev (was: Haibo Chen) > RM does not blacklist node for AM launch

[jira] [Updated] (YARN-8427) Don't start opportunistic containers at container scheduler/finish event with over-allocation

2018-06-14 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-8427: - Issue Type: Sub-task (was: Improvement) Parent: YARN-1011 > Don't start opportunistic containers

[jira] [Created] (YARN-8427) Don't start opportunistic containers at container scheduler/finish event with over-allocation

2018-06-14 Thread Haibo Chen (JIRA)
Haibo Chen created YARN-8427: Summary: Don't start opportunistic containers at container scheduler/finish event with over-allocation Key: YARN-8427 URL: https://issues.apache.org/jira/browse/YARN-8427

[jira] [Commented] (YARN-8250) Create another implementation of ContainerScheduler to support NM overallocation

2018-06-14 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16512924#comment-16512924 ] Haibo Chen commented on YARN-8250: -- Thanks [~kkaranasos] for your comments and suggestions! {quote}I

[jira] [Commented] (YARN-6586) YARN to facilitate HTTPS in AM web server

2018-06-14 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16513038#comment-16513038 ] Haibo Chen commented on YARN-6586: -- [~rkanter]  I have one question about setting the lifetime of the

[jira] [Commented] (YARN-8250) Create another implementation of ContainerScheduler to support NM overallocation

2018-06-11 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508382#comment-16508382 ] Haibo Chen commented on YARN-8250: -- My apologies, Arun. I did not mean to indicate in any way you are

[jira] [Commented] (YARN-6931) Make the aggregation interval in AppLevelTimelineCollector configurable

2018-06-11 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508447#comment-16508447 ] Haibo Chen commented on YARN-6931: -- Thanks [~abmodi] for the patch. I have one minor comment.  We can

[jira] [Commented] (YARN-8250) Create another implementation of ContainerScheduler to support NM overallocation

2018-06-11 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508385#comment-16508385 ] Haibo Chen commented on YARN-8250: -- In the meantime, I'll try to see how well YARN-6675 works with some

[jira] [Commented] (YARN-8321) AllocationFileLoaderService. getAllocationFile() should be declared as VisibleForTest

2018-06-11 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508479#comment-16508479 ] Haibo Chen commented on YARN-8321: -- +1. Checking it in shortly. > AllocationFileLoaderService.

[jira] [Commented] (YARN-8325) Miscellaneous QueueManager code clean up

2018-06-11 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508485#comment-16508485 ] Haibo Chen commented on YARN-8325: -- [~snemeth] the patch no longer applies. Can you update it? >

[jira] [Commented] (YARN-8323) FairScheduler.allocConf should be declared as volatile

2018-06-11 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508458#comment-16508458 ] Haibo Chen commented on YARN-8323: -- +1. Checking in shortly. > FairScheduler.allocConf should be

[jira] [Commented] (YARN-8322) Change log level when there is an IOException when the allocation file is loaded

2018-06-11 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508475#comment-16508475 ] Haibo Chen commented on YARN-8322: -- +1 > Change log level when there is an IOException when the

[jira] [Updated] (YARN-6794) Fair Scheduler to explicitly promote OPPORTUNISITIC containers locally at the node where they're running

2018-06-19 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-6794: - Attachment: YARN-6794-YARN-1011.01.patch > Fair Scheduler to explicitly promote OPPORTUNISITIC containers

[jira] [Commented] (YARN-6794) Fair Scheduler to explicitly promote OPPORTUNISITIC containers locally at the node where they're running

2018-06-19 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517513#comment-16517513 ] Haibo Chen commented on YARN-6794: -- Thanks [~szegedim] for the review! {quote}I think this should refer

[jira] [Commented] (YARN-8270) Adding JMX Metrics for Timeline Collector and Reader

2018-06-13 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16511938#comment-16511938 ] Haibo Chen commented on YARN-8270: -- Thanks [~Sushil-K-S] for updating the patch! The unit test failures

[jira] [Commented] (YARN-6931) Make the aggregation interval in AppLevelTimelineCollector configurable

2018-06-12 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509884#comment-16509884 ] Haibo Chen commented on YARN-6931: -- +1. Checking in shortly. > Make the aggregation interval in

[jira] [Commented] (YARN-8325) Miscellaneous QueueManager code clean up

2018-06-12 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509893#comment-16509893 ] Haibo Chen commented on YARN-8325: -- +1. Checking in shortly. > Miscellaneous QueueManager code clean up

[jira] [Commented] (YARN-6677) Preempt opportunistic containers when root container cgroup goes over memory limit

2018-05-29 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16493926#comment-16493926 ] Haibo Chen commented on YARN-6677: -- Thanks [~miklos.szeg...@cloudera.com] for the comment. {quote}Could

[jira] [Commented] (YARN-8375) TestCGroupElasticMemoryController fails surefire build

2018-05-29 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16493985#comment-16493985 ] Haibo Chen commented on YARN-8375: -- Thanks [~jlowe] for reporting. We'll take a look at this. >

[jira] [Assigned] (YARN-8375) TestCGroupElasticMemoryController fails surefire build

2018-05-29 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen reassigned YARN-8375: Assignee: Miklos Szegedi > TestCGroupElasticMemoryController fails surefire build >

[jira] [Updated] (YARN-8373) RM Received RMFatalEvent of type CRITICAL_THREAD_CRASH

2018-05-30 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-8373: - Component/s: fairscheduler > RM Received RMFatalEvent of type CRITICAL_THREAD_CRASH >

[jira] [Commented] (YARN-8250) Create another implementation of ContainerScheduler to support NM overallocation

2018-05-31 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16496872#comment-16496872 ] Haibo Chen commented on YARN-8250: -- [~asuresh], [~leftnoteasy] and I had an offline discussion about this

[jira] [Updated] (YARN-6677) Preempt opportunistic containers when root container cgroup goes over memory limit

2018-06-04 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-6677: - Attachment: YARN-6677.02.patch > Preempt opportunistic containers when root container cgroup goes over

[jira] [Updated] (YARN-8240) Add queue-level control to allow all applications in a queue to opt-out

2018-06-04 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-8240: - Attachment: YARN-8240-YARN-1011.01.patch > Add queue-level control to allow all applications in a queue

[jira] [Commented] (YARN-8240) Add queue-level control to allow all applications in a queue to opt-out

2018-06-04 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16500315#comment-16500315 ] Haibo Chen commented on YARN-8240: -- TestCapacityOverTimePolicy.testAllocation is flaky and unrelated to

[jira] [Commented] (YARN-6677) Preempt opportunistic containers when root container cgroup goes over memory limit

2018-06-04 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16501010#comment-16501010 ] Haibo Chen commented on YARN-6677: -- I think the findbugs warning is bogus because the wrapper class is

[jira] [Commented] (YARN-6677) Preempt opportunistic containers when root container cgroup goes over memory limit

2018-06-04 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16501039#comment-16501039 ] Haibo Chen commented on YARN-6677: -- Updated the patch to address all but one of the checkstyle issues. >

[jira] [Updated] (YARN-6677) Preempt opportunistic containers when root container cgroup goes over memory limit

2018-06-04 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-6677: - Attachment: YARN-6677.03.patch > Preempt opportunistic containers when root container cgroup goes over

[jira] [Updated] (YARN-6677) Preempt opportunistic containers when root container cgroup goes over memory limit

2018-05-31 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-6677: - Issue Type: Improvement (was: Sub-task) Parent: (was: YARN-1011) > Preempt opportunistic

[jira] [Commented] (YARN-8375) TestCGroupElasticMemoryController fails surefire build

2018-06-01 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498696#comment-16498696 ] Haibo Chen commented on YARN-8375: -- [~miklos.szeg...@cloudera.com] and I debugged this together. We were

[jira] [Created] (YARN-8388) TestCGroupElasticMemoryController.testNormalExit() hangs on Linux

2018-06-01 Thread Haibo Chen (JIRA)
Haibo Chen created YARN-8388: Summary: TestCGroupElasticMemoryController.testNormalExit() hangs on Linux Key: YARN-8388 URL: https://issues.apache.org/jira/browse/YARN-8388 Project: Hadoop YARN

[jira] [Updated] (YARN-8240) Add queue-level control to allow all applications in a queue to opt-out

2018-06-01 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-8240: - Attachment: YARN-8240-YARN-1011.00.patch > Add queue-level control to allow all applications in a queue

[jira] [Created] (YARN-8393) timeline flow runs API createdtimestart/createdtimeend parameter does not work

2018-06-04 Thread Haibo Chen (JIRA)
Haibo Chen created YARN-8393: Summary: timeline flow runs API createdtimestart/createdtimeend parameter does not work Key: YARN-8393 URL: https://issues.apache.org/jira/browse/YARN-8393 Project: Hadoop

[jira] [Updated] (YARN-6677) Preempt opportunistic containers when root container cgroup goes over memory limit

2018-06-06 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-6677: - Attachment: YARN-6677.04.patch > Preempt opportunistic containers when root container cgroup goes over

[jira] [Commented] (YARN-6677) Preempt opportunistic containers when root container cgroup goes over memory limit

2018-06-06 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16504204#comment-16504204 ] Haibo Chen commented on YARN-6677: -- Yes, attached a new patch to address the findbugs warning. > Preempt

[jira] [Updated] (YARN-6794) Fair Scheduler to explicitly promote OPPORTUNISITIC containers locally at the node where they're running

2018-06-06 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-6794: - Attachment: YARN-6794-YARN-1011.prelim.patch > Fair Scheduler to explicitly promote OPPORTUNISITIC

[jira] [Updated] (YARN-6794) Fair Scheduler to explicitly promote OPPORTUNISITIC containers locally at the node where they're running

2018-06-07 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-6794: - Attachment: YARN-6794-YARN-1011.00.patch > Fair Scheduler to explicitly promote OPPORTUNISITIC containers

[jira] [Commented] (YARN-1014) Configure OOM Killer to kill OPPORTUNISTIC containers first

2018-06-06 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16503822#comment-16503822 ] Haibo Chen commented on YARN-1014: -- Based on the previous description of this jira, yes. > Configure OOM

[jira] [Updated] (YARN-6677) Preempt opportunistic containers when root container cgroup goes over memory limit

2018-06-07 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-6677: - Attachment: YARN-6677.05.patch > Preempt opportunistic containers when root container cgroup goes over

[jira] [Commented] (YARN-6677) Preempt opportunistic containers when root container cgroup goes over memory limit

2018-06-07 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16504288#comment-16504288 ] Haibo Chen commented on YARN-6677: -- Updated the patch again to address the new findbugs issue > Preempt

[jira] [Updated] (YARN-6677) Preempt opportunistic containers when root container cgroup goes over memory limit

2018-05-31 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-6677: - Attachment: YARN-6677.01.patch > Preempt opportunistic containers when root container cgroup goes over

[jira] [Commented] (YARN-6677) Preempt opportunistic containers when root container cgroup goes over memory limit

2018-05-31 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16496638#comment-16496638 ] Haibo Chen commented on YARN-6677: -- I updated the patch without addressing the getCGroupsHandler()

[jira] [Commented] (YARN-8390) Fix API incompatible changes in FairScheduler's AllocationFileLoaderService

2018-06-04 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16500437#comment-16500437 ] Haibo Chen commented on YARN-8390: -- +1. The findbug issue is independent of this patch, that we should

[jira] [Created] (YARN-8391) Investigate AllocationFileLoaderService.reloadListener locking issue

2018-06-04 Thread Haibo Chen (JIRA)
Haibo Chen created YARN-8391: Summary: Investigate AllocationFileLoaderService.reloadListener locking issue Key: YARN-8391 URL: https://issues.apache.org/jira/browse/YARN-8391 Project: Hadoop YARN

[jira] [Updated] (YARN-8391) Investigate AllocationFileLoaderService.reloadListener locking issue

2018-06-04 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-8391: - Description: Per findbugs report in YARN-8390, there is some inconsistent locking of reloadListener

[jira] [Commented] (YARN-8390) Fix API incompatible changes in FairScheduler's AllocationFileLoaderService

2018-06-04 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16500451#comment-16500451 ] Haibo Chen commented on YARN-8390: -- Thanks [~grepas] for the quick fix. I have checked in the patch to

[jira] [Updated] (YARN-8191) Fair scheduler: queue deletion without RM restart

2018-06-04 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-8191: - Hadoop Flags: Incompatible change,Reviewed (was: Reviewed) > Fair scheduler: queue deletion without RM

[jira] [Commented] (YARN-8388) TestCGroupElasticMemoryController.testNormalExit() hangs on Linux

2018-06-04 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16500570#comment-16500570 ] Haibo Chen commented on YARN-8388: -- Two minor comments 1) Let's add a comment above `

[jira] [Commented] (YARN-8240) Add queue-level control to allow all applications in a queue to opt-out

2018-06-04 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16500617#comment-16500617 ] Haibo Chen commented on YARN-8240: -- Thanks for the review [~miklos.szeg...@cloudera.com], I updated the

[jira] [Updated] (YARN-8240) Add queue-level control to allow all applications in a queue to opt-out

2018-06-04 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-8240: - Attachment: YARN-8240-YARN-1011.02.patch > Add queue-level control to allow all applications in a queue

[jira] [Commented] (YARN-8388) TestCGroupElasticMemoryController.testNormalExit() hangs on Linux

2018-06-04 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16500650#comment-16500650 ] Haibo Chen commented on YARN-8388: -- +1 on the latest patch (02) pending Jenkins. >

[jira] [Commented] (YARN-7334) Add documentation of oversubscription

2018-06-29 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16528267#comment-16528267 ] Haibo Chen commented on YARN-7334: -- Sure, [~elgoiri] I will get on this once we are done with more

[jira] [Commented] (YARN-6672) Add NM preemption of opportunistic containers when utilization goes high

2018-06-29 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16528269#comment-16528269 ] Haibo Chen commented on YARN-6672: -- Thanks [~elgoiri] for the review! {quote} * In

[jira] [Updated] (YARN-6672) Add NM preemption of opportunistic containers when utilization goes high

2018-06-29 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-6672: - Attachment: YARN-6672-YARN-1011.06.patch > Add NM preemption of opportunistic containers when utilization

[jira] [Updated] (YARN-8462) Resource Manager shutdown with FATAL Exception

2018-06-26 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-8462: - Component/s: capacity scheduler > Resource Manager shutdown with FATAL Exception >

[jira] [Commented] (YARN-1013) CS should watch resource utilization of containers and allocate speculative containers if appropriate

2018-06-26 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16524129#comment-16524129 ] Haibo Chen commented on YARN-1013: -- To add a bit of context to YARN-1015 (or YARN-1011 as a whole), we

[jira] [Updated] (YARN-8461) Support strict memory control on individual container with elastic control memory mechanism

2018-06-26 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-8461: - Attachment: YARN-8461.01.patch > Support strict memory control on individual container with elastic

[jira] [Commented] (YARN-8461) Support strict memory control on individual container with elastic control memory mechanism

2018-06-26 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16523992#comment-16523992 ] Haibo Chen commented on YARN-8461: -- Thanks for the review, [~szegedim]! I have updated the patch to

[jira] [Updated] (YARN-8461) Support strict memory control on individual container with elastic control memory mechanism

2018-06-26 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-8461: - Attachment: YARN-8461.02.patch > Support strict memory control on individual container with elastic
