[jira] [Updated] (YARN-9173) FairShare calculation broken for large values after YARN-8833

2019-01-21 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-9173: - Fix Version/s: (was: 3.1.3) 3.1.2 > FairShare calculation broken for large values

[jira] [Commented] (YARN-9173) FairShare calculation broken for large values after YARN-8833

2019-01-21 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16748107#comment-16748107 ] Wangda Tan commented on YARN-9173: -- Cherry-picked to branch-3.1.2 as well. Updated fix version >

[jira] [Updated] (YARN-9205) When using custom resource type, application will fail to run due to the CapacityScheduler throws InvalidResourceRequestException(GREATER_THEN_MAX_ALLOCATION)

2019-01-21 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-9205: - Target Version/s: 3.1.2, 3.2.1 (was: 3.1.2) > When using custom resource type, application will fail to

[jira] [Commented] (YARN-9205) When using custom resource type, application will fail to run due to the CapacityScheduler throws InvalidResourceRequestException(GREATER_THEN_MAX_ALLOCATION)

2019-01-21 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16748090#comment-16748090 ] Wangda Tan commented on YARN-9205: -- Gotcha,  Thanks [~tangzhankun], then ver.2 patch looks good, could u

[jira] [Updated] (YARN-9205) When using custom resource type, application will fail to run due to the CapacityScheduler throws InvalidResourceRequestException(GREATER_THEN_MAX_ALLOCATION)

2019-01-21 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-9205: - Target Version/s: 3.1.2 > When using custom resource type, application will fail to run due to the >

[jira] [Commented] (YARN-9204) yarn.scheduler.capacity..accessible-node-labels..capacity can not support absolute resource value

2019-01-20 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16747644#comment-16747644 ] Wangda Tan commented on YARN-9204: -- [~cheersyang], this sounds like a critical rather than a blocker, I can

[jira] [Commented] (YARN-9210) YARN UI can not display node info

2019-01-18 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16746513#comment-16746513 ] Wangda Tan commented on YARN-9210: -- LGTM, +1 to the patch. > YARN UI can not display node info >

[jira] [Commented] (YARN-9205) When using custom resource type, application will fail to run due to the CapacityScheduler throws InvalidResourceRequestException(GREATER_THEN_MAX_ALLOCATION)

2019-01-18 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16746510#comment-16746510 ] Wangda Tan commented on YARN-9205: -- [~tangzhankun], Thanks for troubleshooting and finding the root cause.

[jira] [Updated] (YARN-9194) Invalid event: REGISTERED and LAUNCH_FAILED at FAILED, and NullPointerException happens in RM while shutdown a NM

2019-01-17 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-9194: - Fix Version/s: 3.1.3 3.2.1 3.3.0 > Invalid event: REGISTERED and

[jira] [Commented] (YARN-9206) RMServerUtils does not count SHUTDOWN as an accepted state

2019-01-17 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16745628#comment-16745628 ] Wangda Tan commented on YARN-9206: -- [~kshukla], could u please add a method to NodeState such as

[jira] [Commented] (YARN-9204) yarn.scheduler.capacity..accessible-node-labels..capacity can not support absolute resource value

2019-01-17 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16745625#comment-16745625 ] Wangda Tan commented on YARN-9204: -- [~yangjiandan], thanks, could you please provide a UT to prevent this

[jira] [Commented] (YARN-9195) RM Queue's pending container number might get decreased unexpectedly or even become negative once RM failover

2019-01-17 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16745611#comment-16745611 ] Wangda Tan commented on YARN-9195: -- [~ssy], thanks for filing the issue and providing analysis. We

[jira] [Commented] (YARN-9074) Docker container rm command should be executed after stop

2019-01-17 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16745607#comment-16745607 ] Wangda Tan commented on YARN-9074: -- [~shaneku...@gmail.com], could u get the patch committed if you're

[jira] [Commented] (YARN-9197) NPE in service AM when failed to launch container

2019-01-17 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16745606#comment-16745606 ] Wangda Tan commented on YARN-9197: -- Thanks [~kyungwan nam] for filing and working on the patch.

[jira] [Commented] (YARN-9205) When using custom resource type, application will fail to run due to the CapacityScheduler throws InvalidResourceRequestException(GREATER_THEN_MAX_ALLOCATION)

2019-01-17 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16745591#comment-16745591 ] Wangda Tan commented on YARN-9205: -- [~tangzhankun], From the log it looks like by design. There's a

[jira] [Commented] (YARN-9200) Enable resource configuration of queue capacity for different resources independently

2019-01-15 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16743248#comment-16743248 ] Wangda Tan commented on YARN-9200: -- [~aihuaxu], adding percentage to different resource types is what we

[jira] [Commented] (YARN-9194) Invalid event: REGISTERED at FAILED

2019-01-15 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16743240#comment-16743240 ] Wangda Tan commented on YARN-9194: -- Fix LGTM, thanks [~xiaoheipangzi], will commit the patch tomorrow if

[jira] [Updated] (YARN-9194) Invalid event: REGISTERED at FAILED, and NullPointerException happens in RM while shutdown a NM

2019-01-15 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-9194: - Target Version/s: 3.2.1, 3.1.3 Priority: Critical (was: Major) Summary: Invalid

[jira] [Commented] (YARN-9199) Compatible issue: AM throws NoSuchMethodError when 2.7.2 client submits mr job to 3.1.3 RM

2019-01-15 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16743226#comment-16743226 ] Wangda Tan commented on YARN-9199: -- [~yangjiandan], Thanks for reporting the issue, but I'm not sure how

[jira] [Commented] (YARN-9116) Capacity Scheduler: add the default maximum-allocation-mb and maximum-allocation-vcores for the queues

2019-01-08 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737856#comment-16737856 ] Wangda Tan commented on YARN-9116: -- Maybe we can support maximum-allocation-mb/vcores to parent queues

[jira] [Commented] (YARN-9116) Capacity Scheduler: add the default maximum-allocation-mb and maximum-allocation-vcores for the queues

2019-01-08 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737855#comment-16737855 ] Wangda Tan commented on YARN-9116: -- [~cheersyang], [~aihuaxu],  The proposal from Weiwei sounds good to

[jira] [Updated] (YARN-9161) Absolute resources of capacity scheduler doesn't support GPU and FPGA

2019-01-08 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-9161: - Component/s: capacity scheduler > Absolute resources of capacity scheduler doesn't support GPU and FPGA >

[jira] [Commented] (YARN-6695) Race condition in RM for publishing container events vs appFinished events causes NPE

2019-01-07 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16736803#comment-16736803 ] Wangda Tan commented on YARN-6695: -- Thanks [~eyang]/[~rohithsharma], I'm going to update target version

[jira] [Updated] (YARN-6695) Race condition in RM for publishing container events vs appFinished events causes NPE

2019-01-07 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-6695: - Target Version/s: 3.2.1, 3.1.3 (was: 3.2.0, 3.1.2) > Race condition in RM for publishing container

[jira] [Updated] (YARN-6695) Race condition in RM for publishing container events vs appFinished events causes NPE

2019-01-07 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-6695: - Target Version/s: 3.2.0, 3.1.2 > Race condition in RM for publishing container events vs appFinished

[jira] [Commented] (YARN-8822) Nvidia-docker v2 support for YARN GPU feature

2019-01-07 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16736322#comment-16736322 ] Wangda Tan commented on YARN-8822: -- Thanks [~Charo Zhang] for the patch and [~tangzhankun] for

[jira] [Updated] (YARN-8822) Nvidia-docker v2 support for YARN GPU feature

2019-01-07 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8822: - Target Version/s: 3.2.0 (was: 3.1.2, 3.3.0, 3.2.1) > Nvidia-docker v2 support for YARN GPU feature >

[jira] [Updated] (YARN-8822) Nvidia-docker v2 support for YARN GPU feature

2019-01-07 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8822: - Fix Version/s: 3.2.1 3.3.0 3.1.2 > Nvidia-docker v2 support for

[jira] [Updated] (YARN-8822) Nvidia-docker v2 support for YARN GPU feature

2019-01-07 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8822: - Attachment: YARN-8822-branch-3.2.0-001.patch > Nvidia-docker v2 support for YARN GPU feature >

[jira] [Updated] (YARN-8822) Nvidia-docker v2 support for YARN GPU feature

2019-01-07 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8822: - Summary: Nvidia-docker v2 support for YARN GPU feature (was: Nvidia-docker v2 support) > Nvidia-docker

[jira] [Commented] (YARN-8822) Nvidia-docker v2 support

2019-01-07 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16736107#comment-16736107 ] Wangda Tan commented on YARN-8822: -- Thanks [~Charo Zhang],   Latest patches LGTM, will get them

[jira] [Updated] (YARN-9053) Support set environment variables for Docker Containers In nonEntryPoint mode

2019-01-07 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-9053: - Target Version/s: 3.1.3 (was: 3.1.2) > Support set environment variables for Docker Containers In

[jira] [Updated] (YARN-8822) Nvidia-docker v2 support

2019-01-07 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8822: - Target Version/s: 3.1.2, 3.3.0, 3.2.1 (was: 3.3.0, 3.2.1, 3.1.3) > Nvidia-docker v2 support >

[jira] [Commented] (YARN-9160) [Submarine] Document "PYTHONPATH" environment variable setting when using -localization options

2019-01-06 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16735285#comment-16735285 ] Wangda Tan commented on YARN-9160: -- Committed to trunk, thanks [~tangzhankun]. > [Submarine] Document

[jira] [Updated] (YARN-9141) [submarine] JobStatus outputs with system UTC clock, not local clock

2019-01-06 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-9141: - Fix Version/s: 3.3.0 > [submarine] JobStatus outputs with system UTC clock, not local clock >

[jira] [Updated] (YARN-9141) [submarine] JobStatus outputs with system UTC clock, not local clock

2019-01-06 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-9141: - Fix Version/s: 3.2.1 > [submarine] JobStatus outputs with system UTC clock, not local clock >

[jira] [Comment Edited] (YARN-9160) [Submarine] Document "PYTHONPATH" environment variable setting when using -localization options

2019-01-06 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16735285#comment-16735285 ] Wangda Tan edited comment on YARN-9160 at 1/6/19 7:17 PM: -- Committed to trunk and

[jira] [Updated] (YARN-9160) [Submarine] Document "PYTHONPATH" environment variable setting when using -localization options

2019-01-06 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-9160: - Fix Version/s: 3.2.1 > [Submarine] Document "PYTHONPATH" environment variable setting when using >

[jira] [Comment Edited] (YARN-9141) [submarine] JobStatus outputs with system UTC clock, not local clock

2019-01-06 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16735286#comment-16735286 ] Wangda Tan edited comment on YARN-9141 at 1/6/19 7:17 PM: -- Thanks [~yuan_zac],

[jira] [Updated] (YARN-9160) [Submarine] Document "PYTHONPATH" environment variable setting when using -localization options

2019-01-06 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-9160: - Fix Version/s: 3.3.0 > [Submarine] Document "PYTHONPATH" environment variable setting when using >

[jira] [Commented] (YARN-9141) [submarine] JobStatus outputs with system UTC clock, not local clock

2019-01-06 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16735286#comment-16735286 ] Wangda Tan commented on YARN-9141: -- Thanks [~yuan_zac], committed to trunk. > [submarine] JobStatus

[jira] [Updated] (YARN-8822) Nvidia-docker v2 support

2019-01-06 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8822: - Target Version/s: 3.2.1, 3.1.3 (was: 3.1.3) > Nvidia-docker v2 support > > >

[jira] [Commented] (YARN-8822) Nvidia-docker v2 support

2019-01-06 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16735273#comment-16735273 ] Wangda Tan commented on YARN-8822: -- I moved this Jira out of 3.1.2, I will work on RC for 3.1.2 tomorrow.

[jira] [Updated] (YARN-8822) Nvidia-docker v2 support

2019-01-06 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8822: - Target Version/s: 3.3.0, 3.2.1, 3.1.3 (was: 3.2.1, 3.1.3) > Nvidia-docker v2 support >

[jira] [Updated] (YARN-8822) Nvidia-docker v2 support

2019-01-06 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8822: - Target Version/s: 3.1.3 (was: 3.1.2) > Nvidia-docker v2 support > > >

[jira] [Commented] (YARN-8822) Nvidia-docker v2 support

2019-01-06 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16735272#comment-16735272 ] Wangda Tan commented on YARN-8822: -- [~Charo Zhang], latest patch looks good, apologies for my late

[jira] [Commented] (YARN-9141) [submarine] JobStatus outputs with system UTC clock, not local clock

2019-01-06 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16735241#comment-16735241 ] Wangda Tan commented on YARN-9141: -- straightforward fix, +1, will commit later. > [submarine] JobStatus

[jira] [Commented] (YARN-8489) Need to support "dominant" component concept inside YARN service

2019-01-06 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16735239#comment-16735239 ] Wangda Tan commented on YARN-8489: -- [~yuan_zac],  Thanks for working on this ticket. 1)

[jira] [Commented] (YARN-9160) [Submarine] Document "PYTHONPATH" environment variable setting when using -localization options

2019-01-06 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16735246#comment-16735246 ] Wangda Tan commented on YARN-9160: -- Straightforward fix, +1. Thanks [~tangzhankun]. > [Submarine]

[jira] [Commented] (YARN-9155) Can't re-run a submarine job, if the previous job with the same service name has finished

2019-01-06 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16735244#comment-16735244 ] Wangda Tan commented on YARN-9155: -- [~yuan_zac]. can we just add a Submarine cli option to remove old job

[jira] [Commented] (YARN-9144) WebAppProxyServlet can't redirect to ATS V1.5 when a yarn native service app is finished

2019-01-06 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16735240#comment-16735240 ] Wangda Tan commented on YARN-9144: -- Thanks [~yuan_zac] for working on this Jira.

[jira] [Commented] (YARN-8967) Change FairScheduler to use PlacementRule interface

2019-01-03 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16733580#comment-16733580 ] Wangda Tan commented on YARN-8967: -- Thanks  [~wilfreds], I'm very glad to see that the original YARN-3635

[jira] [Commented] (YARN-9161) Absolute resources of capacity scheduler doesn't support GPU and FPGA

2019-01-02 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732270#comment-16732270 ] Wangda Tan commented on YARN-9161: -- [~yuan_zac], thanks for reporting this issue. [~sunilg], do you

[jira] [Commented] (YARN-9163) Deadlock when use yarn rmadmin -refreshQueues

2019-01-02 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732255#comment-16732255 ] Wangda Tan commented on YARN-9163: -- [~ziqian hu], could u upload jstack or at least 3 full stacktrace of

[jira] [Commented] (YARN-9090) [Submarine] Adjust the submarine installation script document

2018-12-24 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728538#comment-16728538 ] Wangda Tan commented on YARN-9090: -- +1, thanks [~liuxun323]. > [Submarine] Adjust the submarine

[jira] [Commented] (YARN-9120) Need to have a way to turn off GPU auto-discovery in GpuDiscoverer

2018-12-14 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16721650#comment-16721650 ] Wangda Tan commented on YARN-9120: -- [~snemeth] / [~tangzhankun], I prefer to make GPU plugin can be

[jira] [Commented] (YARN-9116) Capacity Scheduler: add the default maximum-allocation-mb and maximum-allocation-vcores for the queues

2018-12-12 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719676#comment-16719676 ] Wangda Tan commented on YARN-9116: -- [~aihuaxu],  This sounds like a plan, but existing maximum memory,

[jira] [Commented] (YARN-9055) Capacity Scheduler: allow larger queue level maximum-allocation-mb to override the cluster configuration

2018-12-12 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719658#comment-16719658 ] Wangda Tan commented on YARN-9055: -- [~aihuaxu], I agree with Thomas, this looks like a change of

[jira] [Commented] (YARN-9015) [DevicePlugin] Add an interface for device plugin to provide customized scheduler

2018-12-12 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719376#comment-16719376 ] Wangda Tan commented on YARN-9015: -- Committed to trunk, thanks [~tangzhankun]! > [DevicePlugin] Add an

[jira] [Updated] (YARN-8885) [DevicePlugin] Support NM APIs to query device resource allocation

2018-12-12 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8885: - Summary: [DevicePlugin] Support NM APIs to query device resource allocation (was: Phase 1 - Support NM

[jira] [Updated] (YARN-9015) [DevicePlugin] Add an interface for device plugin to provide customized scheduler

2018-12-12 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-9015: - Summary: [DevicePlugin] Add an interface for device plugin to provide customized scheduler (was: Phase 1

[jira] [Commented] (YARN-9112) [Submarine] Support polling applicationId when it's not ready in cluster

2018-12-12 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719349#comment-16719349 ] Wangda Tan commented on YARN-9112: -- LGTM, +1. Thanks [~tangzhankun]. > [Submarine] Support polling

[jira] [Commented] (YARN-8885) Phase 1 - Support NM APIs to query device resource allocation

2018-12-12 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719347#comment-16719347 ] Wangda Tan commented on YARN-8885: -- Thanks [~tangzhankun], patch LGTM, will commit by today. > Phase 1 -

[jira] [Commented] (YARN-9078) [Submarine] Clean up the code of CliUtils#parseResourcesString

2018-12-12 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719354#comment-16719354 ] Wangda Tan commented on YARN-9078: -- Change looks good. Thanks [~tangzhankun].  > [Submarine] Clean up

[jira] [Commented] (YARN-9015) Phase 1 - Add an interface for device plugin to provide customized scheduler

2018-12-12 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719351#comment-16719351 ] Wangda Tan commented on YARN-9015: -- Thanks [~tangzhankun], latest patch LGTM, +1. > Phase 1 - Add an

[jira] [Commented] (YARN-9075) Dynamically add or remove auxiliary services

2018-12-12 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719307#comment-16719307 ] Wangda Tan commented on YARN-9075: -- Thanks [~billie.rinaldi],  The overall code flow looks good to me.

[jira] [Commented] (YARN-8714) [Submarine] Support files/tarballs to be localized for a training job.

2018-12-11 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16718234#comment-16718234 ] Wangda Tan commented on YARN-8714: -- [~tangzhankun], sounds like a plan, but let's try to solve the issue

[jira] [Commented] (YARN-9087) Better logging for initialization of Resource plugins

2018-12-06 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711837#comment-16711837 ] Wangda Tan commented on YARN-9087: -- [~snemeth], Device plugin framework is for the future plugins. We

[jira] [Updated] (YARN-8822) Nvidia-docker v2 support

2018-12-05 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8822: - Priority: Critical (was: Major) > Nvidia-docker v2 support > > >

[jira] [Commented] (YARN-8822) Nvidia-docker v2 support

2018-12-05 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16710646#comment-16710646 ] Wangda Tan commented on YARN-8822: -- [~Charo Zhang], Thanks for the patch, apologies, I missed this Jira. I

[jira] [Updated] (YARN-8822) Nvidia-docker v2 support

2018-12-05 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8822: - Fix Version/s: (was: 3.1.2) > Nvidia-docker v2 support > > >

[jira] [Commented] (YARN-8870) [Submarine] Add submarine installation scripts

2018-12-04 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16709341#comment-16709341 ] Wangda Tan commented on YARN-8870: -- As we discussed offline, reverted the patch from branches. It's

[jira] [Updated] (YARN-8870) [Submarine] Add submarine installation scripts

2018-12-04 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8870: - Target Version/s: (was: 3.2.0) > [Submarine] Add submarine installation scripts >

[jira] [Updated] (YARN-8870) [Submarine] Add submarine installation scripts

2018-12-04 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8870: - Fix Version/s: (was: 3.2.0) > [Submarine] Add submarine installation scripts >

[jira] [Commented] (YARN-8714) [Submarine] Support files/tarballs to be localized for a training job.

2018-12-04 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16709178#comment-16709178 ] Wangda Tan commented on YARN-8714: -- Thanks [~tangzhankun], what I remember is YARN doesn't support

[jira] [Commented] (YARN-8714) [Submarine] Support files/tarballs to be localized for a training job.

2018-12-03 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16707602#comment-16707602 ] Wangda Tan commented on YARN-8714: -- [~liuxun323], fair enough.  [~tangzhankun], I think we can add a

[jira] [Commented] (YARN-9015) Phase 1 - Add an interface for device plugin to provide customized scheduler

2018-12-03 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16707587#comment-16707587 ] Wangda Tan commented on YARN-9015: -- [~tangzhankun], 1) DevicePluginScheduler: Why use Integer instead

[jira] [Commented] (YARN-8885) Phase 1 - Support NM APIs to query device resource allocation

2018-12-03 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16707577#comment-16707577 ] Wangda Tan commented on YARN-8885: -- [~tangzhankun], could u provide example output of the API? Thanks,

[jira] [Commented] (YARN-9078) [Submarine] Clean up the code of CliUtils#parseResourcesString

2018-12-03 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16707574#comment-16707574 ] Wangda Tan commented on YARN-9078: -- [~tangzhankun], I'm wondering if {code} 82 if

[jira] [Commented] (YARN-9050) Usability improvements for scheduler activities

2018-12-03 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16707553#comment-16707553 ] Wangda Tan commented on YARN-9050: -- [~Tao Yang], makes sense to me. Once you figure out the details, I can

[jira] [Commented] (YARN-8870) [Submarine] Add submarine installation scripts

2018-12-01 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16706070#comment-16706070 ] Wangda Tan commented on YARN-8870: -- [~liuxun323], I figured out how to do it manually First you need to

[jira] [Commented] (YARN-9010) Fix the incorrect trailing slash deletion in constructor method of CGroupsHandlerImpl

2018-11-29 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16703996#comment-16703996 ] Wangda Tan commented on YARN-9010: -- Committed to trunk, thanks [~tangzhankun]. > Fix the incorrect

[jira] [Updated] (YARN-9010) Fix the incorrect trailing slash deletion in constructor method of CGroupsHandlerImpl

2018-11-29 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-9010: - Priority: Major (was: Minor) > Fix the incorrect trailing slash deletion in constructor method of >

[jira] [Commented] (YARN-8870) [Submarine] Add submarine installation scripts

2018-11-29 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16703895#comment-16703895 ] Wangda Tan commented on YARN-8870: -- That's my bad, [~liuxun323], could u work on an addendum patch to get

[jira] [Commented] (YARN-9050) Usability improvements for scheduler activities

2018-11-29 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16703553#comment-16703553 ] Wangda Tan commented on YARN-9050: -- [~Tao Yang], thanks for filing the JIRA. The all issues you

[jira] [Commented] (YARN-9060) [YARN-8851] Phase 1 - Support device isolation in native container-executor

2018-11-29 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16703546#comment-16703546 ] Wangda Tan commented on YARN-9060: -- [~tangzhankun], explanation makes sense, and the issue about GPU

[jira] [Resolved] (YARN-8975) [Submarine] Use predefined Charset object StandardCharsets.UTF_8 instead of String "UTF-8"

2018-11-28 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan resolved YARN-8975. -- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.3.0 Committed to trunk, thanks

[jira] [Commented] (YARN-8989) Move DockerCommandPlugin volume related APIs' invocation from DockerLinuxContainerRuntime#prepareContainer to #launchContainer

2018-11-28 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16702480#comment-16702480 ] Wangda Tan commented on YARN-8989: -- LGTM, thanks [~tangzhankun], committing. > Move DockerCommandPlugin

[jira] [Updated] (YARN-8882) [YARN-8851] Add a shared device mapping manager (scheduler) for device plugins

2018-11-28 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8882: - Summary: [YARN-8851] Add a shared device mapping manager (scheduler) for device plugins (was: Phase 1 -

[jira] [Commented] (YARN-9061) Improve the GPU/FPGA module log message of container-executor

2018-11-28 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16702451#comment-16702451 ] Wangda Tan commented on YARN-9061: -- +1, thanks [~tangzhankun], committing. > Improve the GPU/FPGA module

[jira] [Commented] (YARN-7277) Container Launch expand environment needs to consider bracket matching

2018-11-28 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16702450#comment-16702450 ] Wangda Tan commented on YARN-7277: -- [~tangzhankun], typically what you can do is add a new line or empty

[jira] [Comment Edited] (YARN-7277) Container Launch expand environment needs to consider bracket matching

2018-11-28 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16702450#comment-16702450 ] Wangda Tan edited comment on YARN-7277 at 11/28/18 10:30 PM: - [~tangzhankun],

[jira] [Comment Edited] (YARN-9060) [YARN-8851] Phase 1 - Support device isolation in native container-executor

2018-11-28 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16702438#comment-16702438 ] Wangda Tan edited comment on YARN-9060 at 11/28/18 10:26 PM: - [~tangzhankun],

[jira] [Commented] (YARN-9060) [YARN-8851] Phase 1 - Support device isolation in native container-executor

2018-11-28 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16702438#comment-16702438 ] Wangda Tan commented on YARN-9060: -- [~tangzhankun], just want to understand some high level

[jira] [Commented] (YARN-8882) Phase 1 - Add a shared device mapping manager for device plugin to use

2018-11-28 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16702429#comment-16702429 ] Wangda Tan commented on YARN-8882: -- Thanks [~tangzhankun], existing code looks good, committing .. >

[jira] [Commented] (YARN-8714) [Submarine] Support files/tarballs to be localized for a training job.

2018-11-28 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16702424#comment-16702424 ] Wangda Tan commented on YARN-8714: -- Thanks [~tangzhankun] for working on the patch, several comments:

[jira] [Updated] (YARN-9030) Log aggregation changes to handle filesystems which do not support setting permissions

2018-11-21 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-9030: - Summary: Log aggregation changes to handle filesystems which do not support setting permissions (was:

[jira] [Commented] (YARN-8882) Phase 1 - Add a shared device mapping manager for device plugin to use

2018-11-21 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16695450#comment-16695450 ] Wangda Tan commented on YARN-8882: -- [~tangzhankun], why rename "device-scheduler" to

[jira] [Commented] (YARN-9030) Log aggregation changes to handle filesystems which do not support permissions

2018-11-20 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16693852#comment-16693852 ] Wangda Tan commented on YARN-9030: -- Thanks [~suma.shivaprasad], +1, will get it committed later today.

[jira] [Commented] (YARN-8881) [YARN-8851] Add basic pluggable device plugin framework

2018-11-19 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16692007#comment-16692007 ] Wangda Tan commented on YARN-8881: -- [~tangzhankun], patch committed to trunk. Thanks for reviews from
