[jira] [Commented] (YARN-8141) YARN Native Service: Respect YARN_CONTAINER_RUNTIME_DOCKER_LOCAL_RESOURCE_MOUNTS specified in service spec

2018-05-10 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16470877#comment-16470877 ] Wangda Tan commented on YARN-8141: -- Thanks [~csingh],  Overall patch looks good, it gonna be better to

[jira] [Commented] (YARN-8141) YARN Native Service: Respect YARN_CONTAINER_RUNTIME_DOCKER_LOCAL_RESOURCE_MOUNTS specified in service spec

2018-05-09 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16469758#comment-16469758 ] Wangda Tan commented on YARN-8141: -- [~csingh], Thanks for working on the fix. I think we don't need to

[jira] [Created] (YARN-8272) Several items are missing from Hadoop 3.1.0 documentation

2018-05-09 Thread Wangda Tan (JIRA)
Wangda Tan created YARN-8272: Summary: Several items are missing from Hadoop 3.1.0 documentation Key: YARN-8272 URL: https://issues.apache.org/jira/browse/YARN-8272 Project: Hadoop YARN Issue

[jira] [Assigned] (YARN-8108) RM metrics rest API throws GSSException in kerberized environment

2018-05-08 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan reassigned YARN-8108: Assignee: Eric Yang > RM metrics rest API throws GSSException in kerberized environment >

[jira] [Commented] (YARN-8255) Allow option to disable flex for a service component

2018-05-07 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16466857#comment-16466857 ] Wangda Tan commented on YARN-8255: -- [~eyang],  Thanks for commenting, your suggestion makes sense, and

[jira] [Commented] (YARN-8257) Native service should automatically adding escapes for environment/launch cmd before sending to YARN

2018-05-07 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16466656#comment-16466656 ] Wangda Tan commented on YARN-8257: -- Just took a closer look:  Since both of environment/launch command

[jira] [Commented] (YARN-8257) Native service should automatically adding escapes for environment/launch cmd before sending to YARN

2018-05-07 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16466634#comment-16466634 ] Wangda Tan commented on YARN-8257: -- Talked to [~gsaha], who mentioned he will help if he gets a chance.

[jira] [Created] (YARN-8257) Native service should automatically adding escapes for environment/launch cmd before sending to YARN

2018-05-07 Thread Wangda Tan (JIRA)
Wangda Tan created YARN-8257: Summary: Native service should automatically adding escapes for environment/launch cmd before sending to YARN Key: YARN-8257 URL: https://issues.apache.org/jira/browse/YARN-8257

[jira] [Commented] (YARN-7892) Revisit NodeAttribute class structure

2018-05-07 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16466618#comment-16466618 ] Wangda Tan commented on YARN-7892: -- Thanks [~Naganarasimha], For id(identifier) and key, I think they're

[jira] [Commented] (YARN-8141) YARN Native Service: Respect YARN_CONTAINER_RUNTIME_DOCKER_LOCAL_RESOURCE_MOUNTS specified in service spec

2018-05-07 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16466576#comment-16466576 ] Wangda Tan commented on YARN-8141: -- Thanks [~shaneku...@gmail.com], I think we should consolidate the two,

[jira] [Assigned] (YARN-8141) YARN Native Service: Respect YARN_CONTAINER_RUNTIME_DOCKER_LOCAL_RESOURCE_MOUNTS specified in service spec

2018-05-07 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan reassigned YARN-8141: Assignee: Chandni Singh > YARN Native Service: Respect >

[jira] [Commented] (YARN-8255) Allow option to disable flex for a service component

2018-05-07 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16466462#comment-16466462 ] Wangda Tan commented on YARN-8255: -- Thanks [~suma.shivaprasad] for filing the JIRA and suggestions from

[jira] [Updated] (YARN-8251) Clicking on app link at the header goes to Diagnostics Tab instead of AppAttempt Tab

2018-05-04 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8251: - Reporter: Sumana Sathish (was: Yesha Vora) > Clicking on app link at the header goes to Diagnostics Tab

[jira] [Updated] (YARN-8223) ClassNotFoundException when auxiliary service is loaded from HDFS

2018-05-04 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8223: - Fix Version/s: 3.1.1 3.2.0 > ClassNotFoundException when auxiliary service is loaded

[jira] [Commented] (YARN-8234) Improve RM system metrics publisher's performance by pushing events to timeline server in batch

2018-05-03 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463331#comment-16463331 ] Wangda Tan commented on YARN-8234: -- [~ziqian hu], would you mind checking the Jenkins report? > Improve RM system

[jira] [Commented] (YARN-8232) RMContainer lost queue name when RM HA happens

2018-05-03 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463329#comment-16463329 ] Wangda Tan commented on YARN-8232: -- +1, thanks [~ziqian hu], will commit tomorrow if no objections. >

[jira] [Commented] (YARN-4606) CapacityScheduler: applications could get starved because computation of #activeUsers considers pending apps

2018-05-03 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463324#comment-16463324 ] Wangda Tan commented on YARN-4606: -- Thanks [~maniraj...@gmail.com], Some questions: 1) Does this patch

[jira] [Updated] (YARN-8242) YARN NM: OOM error while reading back the state store on recovery

2018-05-02 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8242: - Target Version/s: 3.1.1 Priority: Blocker (was: Major) > YARN NM: OOM error while reading

[jira] [Assigned] (YARN-4781) Support intra-queue preemption for fairness ordering policy.

2018-04-30 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan reassigned YARN-4781: Assignee: Eric Payne (was: Wangda Tan) > Support intra-queue preemption for fairness ordering

[jira] [Commented] (YARN-8232) RMContainer lost queue name when RM HA happens

2018-04-30 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16458691#comment-16458691 ] Wangda Tan commented on YARN-8232: -- Thanks [~ziqian hu], could you add a unit test to avoid regression in

[jira] [Commented] (YARN-8234) Improve RM system metrics publisher's performance by pushing events to timeline server in batch

2018-04-29 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16458292#comment-16458292 ] Wangda Tan commented on YARN-8234: -- Thanks [~ziqian hu], this is an interesting fix. I think it is

[jira] [Commented] (YARN-8232) RMContainer lost queue name when RM HA happens

2018-04-29 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16458291#comment-16458291 ] Wangda Tan commented on YARN-8232: -- Thanks [~ziqian hu] for reporting and working on the patch. Could you

[jira] [Assigned] (YARN-8232) RMContainer lost queue name when RM HA happens

2018-04-29 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan reassigned YARN-8232: Assignee: Hu Ziqian > RMContainer lost queue name when RM HA happens >

[jira] [Assigned] (YARN-8234) Improve RM system metrics publisher's performance by pushing events to timeline server in batch

2018-04-29 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan reassigned YARN-8234: Assignee: Hu Ziqian > Improve RM system metrics publisher's performance by pushing events to >

[jira] [Comment Edited] (YARN-8005) Add unit tests for queue priority with dominant resource calculator

2018-04-27 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16456996#comment-16456996 ] Wangda Tan edited comment on YARN-8005 at 4/27/18 8:25 PM: --- Committed to trunk,

[jira] [Commented] (YARN-8005) Add unit tests for queue priority with dominant resource calculator

2018-04-27 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457004#comment-16457004 ] Wangda Tan commented on YARN-8005: -- Update: Before pushing to branch-3.0, I found it causes compilation

[jira] [Updated] (YARN-8005) Add unit tests for queue priority with dominant resource calculator

2018-04-27 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8005: - Target Version/s: 3.0.3 > Add unit tests for queue priority with dominant resource calculator   >

[jira] [Updated] (YARN-8005) Add unit tests for queue priority with dominant resource calculator

2018-04-27 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8005: - Fix Version/s: (was: 3.0.3) > Add unit tests for queue priority with dominant resource calculator   >

[jira] [Commented] (YARN-7574) Add support for Node Labels on Auto Created Leaf Queue Template

2018-04-27 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16456925#comment-16456925 ] Wangda Tan commented on YARN-7574: -- [~suma.shivaprasad], this patch added

[jira] [Commented] (YARN-8005) Add unit tests for queue priority with dominant resource calculator

2018-04-27 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16456855#comment-16456855 ] Wangda Tan commented on YARN-8005: -- +1, thanks [~Zian Chen]. will commit shortly. > Add unit tests for

[jira] [Commented] (YARN-8225) YARN precommit build failing in TestPlacementConstraintTransformations

2018-04-27 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16456852#comment-16456852 ] Wangda Tan commented on YARN-8225: -- +1, thanks [~shaneku...@gmail.com], will commit shortly. > YARN

[jira] [Comment Edited] (YARN-8079) Support specify files to be downloaded (localized) before containers launched by YARN

2018-04-26 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16455609#comment-16455609 ] Wangda Tan edited comment on YARN-8079 at 4/27/18 12:49 AM: I'm too packed

[jira] [Commented] (YARN-8079) Support specify files to be downloaded (localized) before containers launched by YARN

2018-04-26 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16455609#comment-16455609 ] Wangda Tan commented on YARN-8079: -- I'm a bit packed recently to finish this patch, discussed with

[jira] [Assigned] (YARN-8080) YARN native service should support component restart policy

2018-04-26 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan reassigned YARN-8080: Assignee: Suma Shivaprasad (was: Wangda Tan) > YARN native service should support component

[jira] [Assigned] (YARN-8079) Support specify files to be downloaded (localized) before containers launched by YARN

2018-04-26 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan reassigned YARN-8079: Assignee: Suma Shivaprasad (was: Wangda Tan) > Support specify files to be downloaded (localized)

[jira] [Commented] (YARN-8079) Support specify files to be downloaded (localized) before containers launched by YARN

2018-04-26 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16455562#comment-16455562 ] Wangda Tan commented on YARN-8079: -- Found previous spec mentioned has some issues, for the files part, it

[jira] [Commented] (YARN-8080) YARN native service should support component restart policy

2018-04-26 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16455547#comment-16455547 ] Wangda Tan commented on YARN-8080: -- Following spec can be used to do tests: {code} { "version": "100",

[jira] [Commented] (YARN-8079) Support specify files to be downloaded (localized) before containers launched by YARN

2018-04-26 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16455541#comment-16455541 ] Wangda Tan commented on YARN-8079: -- Attached ver.6 patch, fixed all issues. Spec mentioned by Eric above:

[jira] [Updated] (YARN-8079) Support specify files to be downloaded (localized) before containers launched by YARN

2018-04-26 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8079: - Attachment: YARN-8079.006.patch > Support specify files to be downloaded (localized) before containers

[jira] [Commented] (YARN-8080) YARN native service should support component restart policy

2018-04-26 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16455539#comment-16455539 ] Wangda Tan commented on YARN-8080: -- Attached ver.6 patch, addressed all comments from Gour except

[jira] [Updated] (YARN-8080) YARN native service should support component restart policy

2018-04-26 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8080: - Attachment: YARN-8080.006.patch > YARN native service should support component restart policy >

[jira] [Commented] (YARN-8210) AMRMClient logging on every heartbeat to track updation of AM RM token causes too many log lines to be generated in AM logs

2018-04-26 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16455511#comment-16455511 ] Wangda Tan commented on YARN-8210: -- +1, thanks [~suma.shivaprasad] > AMRMClient logging on every

[jira] [Commented] (YARN-8213) Add Capacity Scheduler metrics

2018-04-26 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16454579#comment-16454579 ] Wangda Tan commented on YARN-8213: -- Thanks [~cheersyang], took a quick look, haven't checked details of

[jira] [Commented] (YARN-4606) CapacityScheduler: applications could get starved because computation of #activeUsers considers pending apps

2018-04-25 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16453297#comment-16453297 ] Wangda Tan commented on YARN-4606: -- Thanks [~eepayne] / [~maniraj...@gmail.com], Here's my understanding

[jira] [Commented] (YARN-8193) YARN RM hangs abruptly (stops allocating resources) when running successive applications.

2018-04-25 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16453046#comment-16453046 ] Wangda Tan commented on YARN-8193: -- +1, thanks [~Zian Chen], will commit by today if no objections. >

[jira] [Updated] (YARN-8183) Fix ConcurrentModificationException inside RMAppAttemptMetrics#convertAtomicLongMaptoLongMap

2018-04-24 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8183: - Summary: Fix ConcurrentModificationException inside RMAppAttemptMetrics#convertAtomicLongMaptoLongMap

[jira] [Commented] (YARN-8200) Backport resource types/GPU features to branch-2

2018-04-24 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16451385#comment-16451385 ] Wangda Tan commented on YARN-8200: -- +1 to have a branch for this which we can easier know which patches

[jira] [Commented] (YARN-8183) yClient for Kill Application stuck in infinite loop with message "Waiting for Application to be killed"

2018-04-24 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16451339#comment-16451339 ] Wangda Tan commented on YARN-8183: -- Thanks [~suma.shivaprasad], +1, pending Jenkins. > yClient for Kill

[jira] [Commented] (YARN-8200) Backport resource types/GPU features to branch-2

2018-04-23 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449312#comment-16449312 ] Wangda Tan commented on YARN-8200: -- [~chris.douglas], I think [~sunilg] has already pointed out, the

[jira] [Commented] (YARN-4606) CapacityScheduler: applications could get starved because computation of #activeUsers considers pending apps

2018-04-23 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449075#comment-16449075 ] Wangda Tan commented on YARN-4606: -- Thanks [~eepayne] / [~maniraj...@gmail.com] for working on the fix. I

[jira] [Assigned] (YARN-4606) CapacityScheduler: applications could get starved because computation of #activeUsers considers pending apps

2018-04-23 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan reassigned YARN-4606: Assignee: (was: Wangda Tan) > CapacityScheduler: applications could get starved because

[jira] [Commented] (YARN-8200) Backport resource types/GPU features to branch-2

2018-04-23 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16448943#comment-16448943 ] Wangda Tan commented on YARN-8200: -- [~jhung], I would suggest to try use 3.x instead back porting this to

[jira] [Commented] (YARN-8183) yClient for Kill Application stuck in infinite loop with message "Waiting for Application to be killed"

2018-04-23 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16448511#comment-16448511 ] Wangda Tan commented on YARN-8183: -- Thanks [~suma.shivaprasad],  Overall the fix looks good. Not related

[jira] [Commented] (YARN-8169) Review RackResolver.java

2018-04-18 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16442456#comment-16442456 ] Wangda Tan commented on YARN-8169: -- [~belugabehr], thanks for the clarification, very helpful! > Review

[jira] [Commented] (YARN-8169) Review RackResolver.java

2018-04-18 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16442087#comment-16442087 ] Wangda Tan commented on YARN-8169: -- [~belugabehr],  it's better to keep the: {code:java} if

[jira] [Commented] (YARN-8135) Hadoop {Submarine} Project: Simple and scalable deployment of deep learning training / serving jobs on Hadoop

2018-04-14 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16438548#comment-16438548 ] Wangda Tan commented on YARN-8135: -- And just attached the WIP POC patch (poc.001), I know this is very

[jira] [Updated] (YARN-8135) Hadoop {Submarine} Project: Simple and scalable deployment of deep learning training / serving jobs on Hadoop

2018-04-14 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8135: - Attachment: YARN-8135.poc.001.patch > Hadoop {Submarine} Project: Simple and scalable deployment of deep

[jira] [Commented] (YARN-8135) Hadoop {Submarine} Project: Simple and scalable deployment of deep learning training / serving jobs on Hadoop

2018-04-14 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16438543#comment-16438543 ] Wangda Tan commented on YARN-8135: -- I just removed some contents from description, and put a link to the

[jira] [Updated] (YARN-8135) Hadoop {Submarine} Project: Simple and scalable deployment of deep learning training / serving jobs on Hadoop

2018-04-14 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8135: - Description: Description: *Goals:* - Allow infra engineer / data scientist to run *unmodified*

[jira] [Updated] (YARN-8135) Hadoop {Submarine} Project: Simple and scalable deployment of deep learning training / serving jobs on Hadoop

2018-04-14 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8135: - Description: Description: *Goals:* - Allow infra engineer / data scientist to run *unmodified*

[jira] [Updated] (YARN-8135) Hadoop {Submarine} Project: Simple and scalable deployment of deep learning training / serving jobs on Hadoop

2018-04-14 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8135: - Attachment: (was: image-2018-04-09-14-44-41-101.png) > Hadoop {Submarine} Project: Simple and scalable

[jira] [Updated] (YARN-8135) Hadoop {Submarine} Project: Simple and scalable deployment of deep learning training / serving jobs on Hadoop

2018-04-14 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8135: - Attachment: (was: image-2018-04-09-14-35-16-778.png) > Hadoop {Submarine} Project: Simple and scalable

[jira] [Updated] (YARN-8138) Add unit test to validate queue priority preemption works under node partition.

2018-04-14 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8138: - Issue Type: Sub-task (was: Bug) Parent: YARN-8159 > Add unit test to validate queue priority

[jira] [Updated] (YARN-8159) [Umbrella] Fixes for Multiple Resource Type Preemption in Capacity Scheduler

2018-04-14 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8159: - Description: There're a couple of JIRAs open for multiple resource types preemption in CS. It might be

[jira] [Updated] (YARN-8159) [Umbrella] Fixes for Multiple Resource Type Preemption in Capacity Scheduler

2018-04-14 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8159: - Description: We see a couple of  > [Umbrella] Fixes for Multiple Resource Type Preemption in Capacity

[jira] [Updated] (YARN-6538) Inter Queue preemption is not happening when DRF is configured

2018-04-13 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-6538: - Issue Type: Sub-task (was: Bug) Parent: YARN-8159 > Inter Queue preemption is not happening when

[jira] [Updated] (YARN-8020) when DRF is used, preemption does not trigger due to incorrect idealAssigned

2018-04-13 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8020: - Issue Type: Sub-task (was: Bug) Parent: YARN-8159 > when DRF is used, preemption does not trigger

[jira] [Commented] (YARN-8149) Revisit behavior of Re-Reservation in Capacity Scheduler

2018-04-12 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16436488#comment-16436488 ] Wangda Tan commented on YARN-8149: -- [~tgraves],  Preemption for large reserved container is already

[jira] [Commented] (YARN-8138) Add unit test to validate queue priority preemption works under node partition.

2018-04-12 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16436342#comment-16436342 ] Wangda Tan commented on YARN-8138: -- [~Zian Chen], mind to check the failed unit tests as well as

[jira] [Commented] (YARN-8149) Revisit behavior of Re-Reservation in Capacity Scheduler

2018-04-12 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16436327#comment-16436327 ] Wangda Tan commented on YARN-8149: -- Thanks [~tgraves] for the suggestions.  To your question: {quote}are

[jira] [Commented] (YARN-7930) Add configuration to initialize RM with configured labels.

2018-04-12 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16436322#comment-16436322 ] Wangda Tan commented on YARN-7930: -- [~asuresh], [~abmodi],  We thought about this when do initial design

[jira] [Commented] (YARN-8149) Revisit behavior of Re-Reservation in Capacity Scheduler

2018-04-12 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16436109#comment-16436109 ] Wangda Tan commented on YARN-8149: -- Thanks [~cheersyang] for pointing to the original Jira.  I would say

[jira] [Updated] (YARN-8138) Add unit test to validate queue priority preemption works under node partition.

2018-04-11 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8138: - Target Version/s: 3.2.0, 3.1.1 > Add unit test to validate queue priority preemption works under node >

[jira] [Updated] (YARN-8138) Add unit test to validate queue priority preemption works under node partition.

2018-04-11 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8138: - Summary: Add unit test to validate queue priority preemption works under node partition. (was: No

[jira] [Updated] (YARN-8138) No containers pre-empted from another queue when using node labels

2018-04-11 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8138: - Priority: Minor (was: Blocker) > No containers pre-empted from another queue when using node labels >

[jira] [Commented] (YARN-8018) Yarn Service Upgrade: Add support for initiating service upgrade

2018-04-11 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434524#comment-16434524 ] Wangda Tan commented on YARN-8018: -- Thanks [~eyang] , I think the whole service related APIs are marked as

[jira] [Commented] (YARN-8127) Resource leak when async scheduling is enabled

2018-04-11 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434388#comment-16434388 ] Wangda Tan commented on YARN-8127: -- Nice catch! Thanks [~Tao Yang] / [~cheersyang]! > Resource leak

[jira] [Commented] (YARN-8149) Revisit behavior of Re-Reservation in Capacity Scheduler

2018-04-11 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434342#comment-16434342 ] Wangda Tan commented on YARN-8149: -- [~jlowe] / [~eepayne] / [~cheersyang] / [~Tao Yang] / [~sunilg]. 

[jira] [Created] (YARN-8149) Revisit behavior of Re-Reservation in Capacity Scheduler

2018-04-11 Thread Wangda Tan (JIRA)
Wangda Tan created YARN-8149: Summary: Revisit behavior of Re-Reservation in Capacity Scheduler Key: YARN-8149 URL: https://issues.apache.org/jira/browse/YARN-8149 Project: Hadoop YARN Issue

[jira] [Commented] (YARN-7142) Support placement policy in yarn native services

2018-04-11 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434284#comment-16434284 ] Wangda Tan commented on YARN-7142: -- [~cheersyang], thanks for reviewing this Jira, I agree with [~gsaha]:

[jira] [Commented] (YARN-7402) Federation V2: Global Optimizations

2018-04-10 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16433404#comment-16433404 ] Wangda Tan commented on YARN-7402: -- [~curino] / [~subru], thanks for working on this improvement, is there

[jira] [Updated] (YARN-8133) Doc link broken for yarn-service from overview page.

2018-04-10 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8133: - Fix Version/s: 3.2.0 > Doc link broken for yarn-service from overview page. >

[jira] [Created] (YARN-8141) YARN Native Service: Respect YARN_CONTAINER_RUNTIME_DOCKER_LOCAL_RESOURCE_MOUNTS specified in service spec

2018-04-10 Thread Wangda Tan (JIRA)
Wangda Tan created YARN-8141: Summary: YARN Native Service: Respect YARN_CONTAINER_RUNTIME_DOCKER_LOCAL_RESOURCE_MOUNTS specified in service spec Key: YARN-8141 URL: https://issues.apache.org/jira/browse/YARN-8141

[jira] [Commented] (YARN-7494) Add muti node lookup support for better placement

2018-04-10 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16432970#comment-16432970 ] Wangda Tan commented on YARN-7494: -- Thanks [~sunilg], In general the change looks good. Could you check UT

[jira] [Commented] (YARN-8116) Nodemanager fails with NumberFormatException: For input string: ""

2018-04-10 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16432938#comment-16432938 ] Wangda Tan commented on YARN-8116: -- +1, thanks [~csingh], will commit shortly. > Nodemanager fails with

[jira] [Commented] (YARN-7530) hadoop-yarn-services-api should be part of hadoop-yarn-services

2018-04-10 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16432934#comment-16432934 ] Wangda Tan commented on YARN-7530: -- [~eyang], thanks for sharing your thoughts. To me, for the current scope

[jira] [Commented] (YARN-7974) Allow updating application tracking url after registration

2018-04-10 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16432922#comment-16432922 ] Wangda Tan commented on YARN-7974: -- [~jhung], Thanks for working on the feature, I can see its value.

[jira] [Commented] (YARN-8135) Hadoop {Submarine} Project: Simple and scalable deployment of deep learning training / serving jobs on Hadoop

2018-04-10 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16432492#comment-16432492 ] Wangda Tan commented on YARN-8135: -- [~oliverhuh...@gmail.com], There are no technical issues to make TF

[jira] [Updated] (YARN-8079) Support specify files to be downloaded (localized) before containers launched by YARN

2018-04-09 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8079: - Summary: Support specify files to be downloaded (localized) before containers launched by YARN (was: YARN

[jira] [Commented] (YARN-7530) hadoop-yarn-services-api should be part of hadoop-yarn-services

2018-04-09 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16431548#comment-16431548 ] Wangda Tan commented on YARN-7530: -- A quick proposal for this:   - ApiServerClient/ServiceClient ->

[jira] [Commented] (YARN-8135) Hadoop {Submarine} Project: Simple and scalable deployment of deep learning training / serving jobs on Hadoop

2018-04-09 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16431369#comment-16431369 ] Wangda Tan commented on YARN-8135: -- [~oliverhuh...@gmail.com],  Thanks for the responses,  {quote}what

[jira] [Updated] (YARN-8135) Hadoop {Submarine} Project: Simple and scalable deployment of deep learning training / serving jobs on Hadoop

2018-04-09 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8135: - Description: Description: *Goals:* - Allow infra engineer / data scientist to run *unmodified*

[jira] [Updated] (YARN-8135) Hadoop {Submarine} Project: Simple and scalable deployment of deep learning training / serving jobs on Hadoop

2018-04-09 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8135: - Description: Description: *Goals:* - Allow infra engineer / data scientist to run *unmodified*

[jira] [Updated] (YARN-8135) Hadoop {Submarine} Project: Simple and scalable deployment of deep learning training / serving jobs on Hadoop

2018-04-09 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8135: - Attachment: image-2018-04-09-14-44-41-101.png > Hadoop {Submarine} Project: Simple and scalable deployment

[jira] [Commented] (YARN-8135) Hadoop {Submarine} Project: Simple and scalable deployment of deep learning training / serving jobs on Hadoop

2018-04-09 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16431333#comment-16431333 ] Wangda Tan commented on YARN-8135: -- I'm currently working on a design doc and a prototype, will share more

[jira] [Created] (YARN-8135) Hadoop {Submarine} Project: Simple and scalable deployment of deep learning training / serving jobs on Hadoop

2018-04-09 Thread Wangda Tan (JIRA)
Wangda Tan created YARN-8135: Summary: Hadoop {Submarine} Project: Simple and scalable deployment of deep learning training / serving jobs on Hadoop Key: YARN-8135 URL: https://issues.apache.org/jira/browse/YARN-8135

[jira] [Commented] (YARN-8116) Nodemanager fails with NumberFormatException: For input string: ""

2018-04-09 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16431063#comment-16431063 ] Wangda Tan commented on YARN-8116: -- [~csingh], thanks for working on the fix. It's better to include a

[jira] [Updated] (YARN-8133) Doc link broken for yarn-service from overview page.

2018-04-09 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8133: - Priority: Blocker (was: Major) > Doc link broken for yarn-service from overview page. >

[jira] [Updated] (YARN-8133) Doc link broken for yarn-service from overview page.

2018-04-09 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8133: - Target Version/s: 3.1.1 > Doc link broken for yarn-service from overview page. >

[jira] [Reopened] (YARN-7142) Support placement policy in yarn native services

2018-04-05 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan reopened YARN-7142: -- Thanks [~gsaha], reopening to run Jenkins > Support placement policy in yarn native services >
