[jira] [Commented] (YARN-8851) [Umbrella] A new pluggable device plugin framework to ease vendor plugin development
[ https://issues.apache.org/jira/browse/YARN-8851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16692020#comment-16692020 ]

Hudson commented on YARN-8851:
------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #15462 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/15462/])
YARN-8881. [YARN-8851] Add basic pluggable device plugin framework. (wangda: rev 63578036450f660d49ae204327efcd629d9dd137)
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/resourceplugin/ResourcePluginManager.java
* (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/api/deviceplugin/DeviceRegisterRequest.java
* (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/resourceplugin/deviceframework/package-info.java
* (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/resourceplugin/deviceframework/FakeTestDevicePlugin2.java
* (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/resourceplugin/package-info.java
* (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/resourceplugin/deviceframework/TestDevicePluginAdapter.java
* (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/resourceplugin/deviceframework/FakeTestDevicePlugin3.java
* (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/resource-types-pluggable-devices.xml
* (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/api/deviceplugin/package-info.java
* (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/api/deviceplugin/DevicePlugin.java
* (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/api/deviceplugin/MountDeviceSpec.java
* (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/api/deviceplugin/VolumeSpec.java
* (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/resourceplugin/deviceframework/DevicePluginAdapter.java
* (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/api/deviceplugin/Device.java
* (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/resourceplugin/deviceframework/FakeTestDevicePlugin4.java
* (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/api/deviceplugin/YarnRuntimeType.java
* (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/api/deviceplugin/DeviceRuntimeSpec.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/resourceplugin/TestResourcePluginManager.java
* (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/api/deviceplugin/MountVolumeSpec.java
* (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/resourceplugin/deviceframework/DeviceResourceUpdaterImpl.java
* (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/resourceplugin/deviceframework/FakeTestDevicePlugin1.java

> [Umbrella] A new pluggable device plugin framework to ease vendor plugin development
>
> Key: YARN-8851
> URL: https://issues.apache.org/jira/browse/YARN-8851
> Project: Hadoop YARN
> Issue
[jira] [Commented] (YARN-8851) [Umbrella] A new pluggable device plugin framework to ease vendor plugin development
[ https://issues.apache.org/jira/browse/YARN-8851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677733#comment-16677733 ]

Hadoop QA commented on YARN-8851:
---------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 13s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 7 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 3m 47s | Maven dependency ordering for branch |
| +1 | mvninstall | 20m 56s | trunk passed |
| +1 | compile | 7m 39s | trunk passed |
| +1 | checkstyle | 1m 17s | trunk passed |
| +1 | mvnsite | 2m 4s | trunk passed |
| -1 | shadedclient | 13m 37s | branch has errors when building and testing our client artifacts. |
| +1 | findbugs | 3m 43s | trunk passed |
| +1 | javadoc | 1m 41s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 14s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 52s | the patch passed |
| +1 | compile | 7m 5s | the patch passed |
| +1 | javac | 7m 5s | the patch passed |
| -0 | checkstyle | 1m 16s | hadoop-yarn-project/hadoop-yarn: The patch generated 175 new + 240 unchanged - 3 fixed = 415 total (was 243) |
| +1 | mvnsite | 1m 55s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 2s | The patch has no ill-formed XML file. |
| -1 | shadedclient | 10m 42s | patch has errors when building and testing our client artifacts. |
| +1 | findbugs | 4m 13s | the patch passed |
| +1 | javadoc | 1m 56s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 0m 44s | hadoop-yarn-api in the patch failed. |
| -1 | unit | 1m 2s | hadoop-yarn-common in the patch failed. |
| -1 | unit | 1m 35s | hadoop-yarn-server-nodemanager in the patch failed. |
| -1 | asflicense | 0m 41s | The patch generated 1 ASF License warnings. |
| | | 87m 17s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-8851 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12947182/YARN-8851-trunk.002.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml |
| uname | Linux c3679116819d 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality |
[jira] [Commented] (YARN-8851) [Umbrella] A new pluggable device plugin framework to ease vendor plugin development
[ https://issues.apache.org/jira/browse/YARN-8851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16676074#comment-16676074 ]

Zhankun Tang commented on YARN-8851:
------------------------------------

{quote}1) Regarding to the NM_PLUGGABLE_DEVICE_FRAMEWORK_PREFER_CUSTOMIZED_SCHEDULER, should we just use default scheduler if device plugin doesn't provide their customized scheduler? We should assume that load device plugin runs "trusted" code, we may not need to add extra protection here.
{quote}
Zhankun -> Agree.
{quote}2) DeviceSchedulerManager, it sounds like "manages scheduler", however it handles how to map device to containers, and scheduler is just implementation details. How about call it DeviceMappingManager?
{quote}
-
{quote}internalAssignDevices should be private, and it is a bit long, might be better for future maintenance if you can break it down to multiple methods.
{quote}
Zhankun -> Good idea. Will do that.
{quote}I think we could move to make this POC to sub tasks and get them done piece by piece. It gonna be helpful if you can highlight subtasks required.
{quote}
Zhankun -> YARN-8880, YARN-8881, YARN-8882, YARN-8883 and YARN-8885 are our highlighted Phase 1 subtasks.

Thanks for the review! [~leftnoteasy]

> [Umbrella] A new pluggable device plugin framework to ease vendor plugin development
>
> Key: YARN-8851
> URL: https://issues.apache.org/jira/browse/YARN-8851
> Project: Hadoop YARN
> Issue Type: New Feature
> Components: yarn
> Reporter: Zhankun Tang
> Assignee: Zhankun Tang
> Priority: Major
> Attachments: YARN-8851-WIP2-trunk.001.patch, YARN-8851-WIP3-trunk.001.patch, YARN-8851-WIP4-trunk.001.patch, YARN-8851-WIP5-trunk.001.patch, YARN-8851-WIP6-trunk.001.patch, YARN-8851-WIP7-trunk.001.patch, YARN-8851-WIP8-trunk.001.patch, YARN-8851-WIP9-trunk.001.patch, YARN-8851-trunk.001.patch, [YARN-8851] YARN_New_Device_Plugin_Framework_Design_Proposal-3.pdf, [YARN-8851] YARN_New_Device_Plugin_Framework_Design_Proposal-4.pdf, [YARN-8851] YARN_New_Device_Plugin_Framework_Design_Proposal.pdf
>
> At present, we support GPU/FPGA device in YARN through a native, coupling way. But it's difficult for a vendor to implement such a device plugin because the developer needs much knowledge of YARN internals. And this brings burden to the community to maintain both YARN core and vendor-specific code.
> Here we propose a new device plugin framework to ease vendor device plugin development and provide a more flexible way to integrate with YARN NM.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)

To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8851) [Umbrella] A new pluggable device plugin framework to ease vendor plugin development
[ https://issues.apache.org/jira/browse/YARN-8851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16675606#comment-16675606 ]

Wangda Tan commented on YARN-8851:
----------------------------------

Thanks [~tangzhankun],

1) Regarding NM_PLUGGABLE_DEVICE_FRAMEWORK_PREFER_CUSTOMIZED_SCHEDULER: should we just use the default scheduler if a device plugin doesn't provide its own customized scheduler? We should assume that a loaded device plugin runs "trusted" code, so we may not need to add extra protection here.

2) DeviceSchedulerManager: the name sounds like it "manages schedulers", but it actually handles how devices are mapped to containers, and the scheduler is just an implementation detail. How about calling it DeviceMappingManager?
- internalAssignDevices should be private. It is also a bit long; it would be easier to maintain if you broke it down into multiple methods.

I think we can now split this POC into subtasks and get them done piece by piece. It would be helpful if you could highlight the subtasks required.
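The two review points above (fall back to a default scheduler when the plugin supplies none, and keep the assignment entry point short by splitting it into private helpers) can be sketched as follows. This is only an illustration under stated assumptions: `PluginSchedulerSketch`, `DeviceMappingManagerSketch`, the use of plain string device IDs, and the "first devices in sorted order" default policy are all hypothetical and are not taken from the patch.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical optional scheduler a vendor plugin may supply (device IDs as strings). */
interface PluginSchedulerSketch {
  Set<String> allocateDevices(Set<String> availableDevices, int count);
}

/** Hypothetical stand-in for the proposed DeviceMappingManager. */
final class DeviceMappingManagerSketch {
  private final PluginSchedulerSketch customScheduler; // null when the plugin has none

  DeviceMappingManagerSketch(PluginSchedulerSketch customScheduler) {
    this.customScheduler = customScheduler;
  }

  /** Entry point stays short; each step lives in its own private method. */
  Set<String> assignDevices(Set<String> available, int count) {
    checkRequest(available, count);
    return customScheduler != null
        ? customScheduler.allocateDevices(available, count)
        : defaultAssign(available, count);
  }

  private static void checkRequest(Set<String> available, int count) {
    if (count < 0 || count > available.size()) {
      throw new IllegalArgumentException(
          "requested " + count + " of " + available.size() + " devices");
    }
  }

  /** Default policy assumed here: take the first devices in sorted order. */
  private static Set<String> defaultAssign(Set<String> available, int count) {
    Set<String> picked = new TreeSet<String>();
    Iterator<String> it = new TreeSet<String>(available).iterator();
    while (picked.size() < count) {
      picked.add(it.next());
    }
    return picked;
  }
}

final class MappingDemo {
  public static void main(String[] args) {
    Set<String> available = new TreeSet<String>(Arrays.asList("dev0", "dev1", "dev2"));
    // No custom scheduler supplied: the default policy is used.
    System.out.println(new DeviceMappingManagerSketch(null).assignDevices(available, 2));
    // Custom scheduler supplied: this one prefers the highest device IDs.
    PluginSchedulerSketch highestFirst = (avail, n) -> {
      Set<String> picked = new TreeSet<String>();
      Iterator<String> it = new TreeSet<String>(avail).descendingIterator();
      while (picked.size() < n) {
        picked.add(it.next());
      }
      return picked;
    };
    System.out.println(new DeviceMappingManagerSketch(highestFirst).assignDevices(available, 2));
  }
}
```

Because the plugin is treated as trusted code, the sketch does not re-validate what a custom scheduler returns; whether the real framework should do so is a separate design question.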
[jira] [Commented] (YARN-8851) [Umbrella] A new pluggable device plugin framework to ease vendor plugin development
[ https://issues.apache.org/jira/browse/YARN-8851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16675015#comment-16675015 ]

Hadoop QA commented on YARN-8851:
---------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 22s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 6 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 0m 13s | Maven dependency ordering for branch |
| +1 | mvninstall | 19m 59s | trunk passed |
| +1 | compile | 8m 41s | trunk passed |
| +1 | checkstyle | 1m 31s | trunk passed |
| +1 | mvnsite | 2m 18s | trunk passed |
| +1 | shadedclient | 16m 11s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 3m 51s | trunk passed |
| +1 | javadoc | 1m 55s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 13s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 51s | the patch passed |
| +1 | compile | 7m 54s | the patch passed |
| +1 | javac | 7m 54s | the patch passed |
| -0 | checkstyle | 1m 33s | hadoop-yarn-project/hadoop-yarn: The patch generated 186 new + 229 unchanged - 3 fixed = 415 total (was 232) |
| +1 | mvnsite | 2m 11s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 3s | The patch has no ill-formed XML file. |
| +1 | shadedclient | 13m 0s | patch has no errors when building and testing our client artifacts. |
| -1 | findbugs | 1m 11s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager generated 5 new + 0 unchanged - 0 fixed = 5 total (was 0) |
| +1 | javadoc | 1m 51s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 0m 47s | hadoop-yarn-api in the patch passed. |
| +1 | unit | 3m 21s | hadoop-yarn-common in the patch passed. |
| -1 | unit | 18m 39s | hadoop-yarn-server-nodemanager in the patch failed. |
| -1 | asflicense | 0m 38s | The patch generated 2 ASF License warnings. |
| | | 109m 42s | |

|| Reason || Tests ||
| FindBugs | module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager |
| | Null pointer dereference of allocated in org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.deviceframework.DeviceResourceDockerRuntimePluginImpl.getAllocatedDevices(Container, Set) Dereferenced at
[jira] [Commented] (YARN-8851) [Umbrella] A new pluggable device plugin framework to ease vendor plugin development
[ https://issues.apache.org/jira/browse/YARN-8851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673972#comment-16673972 ]

Zhankun Tang commented on YARN-8851:
------------------------------------

[~leftnoteasy] [~csingh], thanks for the review.

After an offline discussion with Wangda, we prefer to provide the single interface below, which lets the plugin prepare the devices and return a DeviceRuntimeSpec after the devices are allocated. We can improve it as time goes by.
{code:java}
/**
 * Asks how these devices should be prepared/used
 * before/when the container launches. A plugin can do some tasks on its own or
 * define them in DeviceRuntimeSpec to let the framework do them.
 * For instance, define {@code VolumeSpec} to let the
 * framework create a volume before running the container.
 *
 * @param allocatedDevices A set of allocated {@link Device}.
 * @param yarnRuntime Indicates which runtime YARN will use.
 *        Could be {@code docker} or {@code default}
 *        in {@link DeviceRuntimeSpec} constants
 * @return a {@link DeviceRuntimeSpec} description about environment,
 *         {@link VolumeSpec}, {@link MountVolumeSpec}, etc.
 * */
DeviceRuntimeSpec onDeviceAllocated(Set<Device> allocatedDevices, String yarnRuntime);{code}
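A rough idea of what a vendor-side implementation of the post-allocation hook could look like is sketched below. The types are simplified stand-ins written for this example only: `SketchDevice`, `SketchRuntimeSpec`, `FakeVendorPlugin`, their fields, the `FAKE_VISIBLE_DEVICES` variable and the `/dev/fake*` paths are all assumptions, not the framework's actual `Device` or `DeviceRuntimeSpec` classes.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

/** Simplified stand-in for the framework's Device class (fields are assumptions). */
final class SketchDevice implements Comparable<SketchDevice> {
  final int id;
  final String devPath;

  SketchDevice(int id, String devPath) {
    this.id = id;
    this.devPath = devPath;
  }

  @Override
  public int compareTo(SketchDevice other) {
    return Integer.compare(id, other.id);
  }
}

/** Simplified stand-in for DeviceRuntimeSpec: environment variables plus device mounts. */
final class SketchRuntimeSpec {
  final Map<String, String> envs = new LinkedHashMap<String, String>();
  final Set<String> deviceMounts = new TreeSet<String>();
}

/** Hypothetical vendor plugin showing only the post-allocation hook. */
final class FakeVendorPlugin {
  SketchRuntimeSpec onDeviceAllocated(Set<SketchDevice> allocatedDevices, String yarnRuntime) {
    SketchRuntimeSpec spec = new SketchRuntimeSpec();
    if (allocatedDevices == null || allocatedDevices.isEmpty()) {
      return spec; // nothing to prepare
    }
    // Made-up variable name; a real plugin would use whatever its driver expects.
    String ids = allocatedDevices.stream()
        .map(d -> String.valueOf(d.id))
        .collect(Collectors.joining(","));
    spec.envs.put("FAKE_VISIBLE_DEVICES", ids);
    if ("docker".equals(yarnRuntime)) {
      // In this sketch only the Docker runtime needs explicit device mounts.
      for (SketchDevice d : allocatedDevices) {
        spec.deviceMounts.add(d.devPath);
      }
    }
    return spec;
  }
}

final class RuntimeSpecDemo {
  public static void main(String[] args) {
    Set<SketchDevice> devices = new TreeSet<SketchDevice>();
    devices.add(new SketchDevice(0, "/dev/fake0"));
    devices.add(new SketchDevice(1, "/dev/fake1"));
    SketchRuntimeSpec spec = new FakeVendorPlugin().onDeviceAllocated(devices, "docker");
    System.out.println(spec.envs + " " + spec.deviceMounts);
  }
}
```

The point of the shape is that the plugin only describes what the container needs; actually creating volumes or mounting devices is left to the framework.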
[jira] [Commented] (YARN-8851) [Umbrella] A new pluggable device plugin framework to ease vendor plugin development
[ https://issues.apache.org/jira/browse/YARN-8851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16667341#comment-16667341 ]

Zhankun Tang commented on YARN-8851:
------------------------------------

Updated the patch.
# Add a sanity check to fail fast on an incompatible plugin
# Add topology information to the Device class
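The message does not say how the sanity check is implemented. One plausible shape, sketched here purely as an assumption, is to verify at load time that the configured class implements the expected plugin interface and returns usable registration data, and to reject it immediately otherwise. `VendorPluginContract`, `PluginLoaderSketch`, `GoodPlugin` and the resource name are hypothetical names invented for the example.

```java
/** Hypothetical minimal contract a plugin must satisfy. */
interface VendorPluginContract {
  String resourceName();
}

/** Hypothetical loader that rejects incompatible plugin classes up front. */
final class PluginLoaderSketch {
  static VendorPluginContract load(String className) {
    try {
      Class<?> clazz = Class.forName(className);
      // Fail fast: the class must implement the plugin contract.
      if (!VendorPluginContract.class.isAssignableFrom(clazz)) {
        throw new IllegalArgumentException(
            className + " does not implement " + VendorPluginContract.class.getName());
      }
      VendorPluginContract plugin =
          (VendorPluginContract) clazz.getDeclaredConstructor().newInstance();
      // Fail fast: registration data must be usable.
      String name = plugin.resourceName();
      if (name == null || name.isEmpty()) {
        throw new IllegalArgumentException(className + " returned an empty resource name");
      }
      return plugin;
    } catch (ReflectiveOperationException e) {
      throw new IllegalArgumentException("Cannot load plugin " + className, e);
    }
  }
}

/** A compatible plugin used by the demo. */
final class GoodPlugin implements VendorPluginContract {
  GoodPlugin() {
  }

  @Override
  public String resourceName() {
    return "example.com/fakedevice";
  }
}

final class SanityCheckDemo {
  public static void main(String[] args) {
    System.out.println(PluginLoaderSketch.load(GoodPlugin.class.getName()).resourceName());
    try {
      PluginLoaderSketch.load("java.lang.String"); // not a plugin
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

Rejecting a bad plugin when the NodeManager starts gives the operator one clear error instead of a failure later, when the first container asks for the device.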
[jira] [Commented] (YARN-8851) [Umbrella] A new pluggable device plugin framework to ease vendor plugin development
[ https://issues.apache.org/jira/browse/YARN-8851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665981#comment-16665981 ]

Zhankun Tang commented on YARN-8851:
------------------------------------

[~csingh],
{quote}However, the DevicePluginAPI can have a method
{code:java}
onDevicesAllocated(Set<Device> allocatedDevices)
{code}
Let's say the user does NOT want to provide a custom DevicePluginScheduler and use the default one. This will be call back after the devices get allocated. It also seems to complete the current DevicePluginAPI which has a {{onDevicesReleased(devices)}} method.
{quote}
Yeah, very good point. Letting the plugin do something after YARN has allocated these devices is OK. But from our current GPU/FPGA use cases, we have no idea yet what a plugin would do after we allocate the devices. The name "onDeviceAllocated" was used in my prior patch to let the plugin do some preparation and provide a runtime spec. Based on feedback, I changed it from "preLaunchContainer" to "onDeviceAllocated", then to "onDeviceUse", and then to "getDeviceRuntimeSpec". But I admit that the method's meaning and name are still confusing.

Let's go over the interface again. The current status of my latest code:
{code:java}
/**
 * A must interface for vendor plugin to implement.
 * */
public interface DevicePlugin {
  /**
   * Called first when device plugin framework wants to register
   * @return DeviceRegisterRequest {@link DeviceRegisterRequest}
   * */
  DeviceRegisterRequest getRegisterRequestInfo();

  /**
   * Called when update node resource
   * @return a set of {@link Device}, {@link java.util.TreeSet} recommended
   * */
  Set<Device> getDevices();

  /**
   * Asking how these devices should be prepared/used before/when container launch.
   * @param allocatedDevices A set of allocated {@link Device}.
   *        Note that it could be null which means no device allocated.
   *        Only {@code volumeClaims} in it will be handled to create volume.
   * @param runtime Indicate which runtime the framework will use
   *        Could be {@code RUNTIME_CGROUPS} or {@code RUNTIME_DOCKER}
   *        in {@link DeviceRuntimeSpec}
   * @return a {@link DeviceRuntimeSpec} description about environment,
   *         {@link VolumeSpec}, {@link MountVolumeSpec}. etc
   * */
  // This is called onDeviceAllocated in prior patches.
  DeviceRuntimeSpec getDeviceRuntimeSpec(Set<Device> allocatedDevices, String runtime);

  /**
   * Called after device released.
   * */
  void onDevicesReleased(Set<Device> releasedDevices);
}{code}
"getRegisterRequestInfo" and "getDevices" are quite clear. "onDevicesReleased" is also clear: it is a hook called when the container finishes and the devices go back to YARN, so a plugin may do some cleanup or reset the devices.

But the name "_getDeviceRuntimeSpec_" is still a little confusing now that I think about it again, so I should explain it in more detail. The intention of my original name "_onDeviceUse_" was to tell a plugin that YARN is going to *USE* the allocated devices (no matter who allocated them) and to ask how this runtime should use them (which environment variables to set, which volumes/devices to mount, which volumes to create). The confusion comes when the allocation can be null:

"If allocatedDevices is null and the runtime is Docker, the plugin can do some preparation itself or ask YARN to do it; for instance, the Nvidia GPU Docker case needs a Docker volume to be created, which requires permission. If allocatedDevices is not null and the runtime is Docker, the plugin should tell YARN which devices and volumes to mount and which environment variables to set."

This explanation is confusing and exposes a limitation of our internal YARN plugin lifecycle management. I shouldn't keep our internals simple at the cost of such a complex method parameter. I passed a null and expected the plugin to understand the intention and return a volume creation request. This is silly. :)

The details of the shortcoming in the current internal YARN lifecycle management follow. You can skip them and go directly to the end if this is too much detail.

The internal "_DockerCommandPlugin_"'s "_getCreateDockerVolumeCommand_" is called in _DockerLinuxContainerRuntime_'s "_prepareContainer_". At this time, the devices haven't been allocated to the container (we could do the allocation in _DockerCommandPlugin_'s getCreateDockerVolumeCommand, but that would be weird), so I pass a null value to the hook. Maybe the volume creation hook should be moved to launchContainer (before updating the Docker run command). In that case, we can allocate the devices in _ResourceHandlerChain_ before the _DockerCommandPlugin_'s method is invoked. The allocation passed to the vendor plugin then won't be null, and the _DeviceRuntimeSpec_ can be shared and passed to DockerCommandPlugin to do the volume creation and docker run work later.

Based on the above discussion, I now prefer to keep the original "onDeviceUse" intention/scope unchanged but try to find a better name for it. If we agree this API's scope, we should change YARN
[jira] [Commented] (YARN-8851) [Umbrella] A new pluggable device plugin framework to ease vendor plugin development
[ https://issues.apache.org/jira/browse/YARN-8851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665382#comment-16665382 ]

Chandni Singh commented on YARN-8851:
-------------------------------------

Thanks [~tangzhankun]
{quote}
"DeviceRegisterRequest getDeviceResourceInfo" and "DeviceRegisterRequest getRegisterRequestInfo()". Maybe the later one is more acurate since the "DeviceRegisterRequest" may contains more info besides resource name & version we currently want?
{quote}
{{getRegisterRequestInfo()}} is good.
{quote}
We have another interface "DevicePluginScheduler" to do this.
{quote}
{code}
Set<Device> allocateDevices(Set<Device> availableDevices, Integer count);
{code}
I saw the above API. As I understand it, if a custom scheduler implementation is provided, that implementation will allocate the devices, which is fine.

However, the DevicePluginAPI can have a method
{code}
onDevicesAllocated(Set<Device> allocatedDevices)
{code}
Let's say the user does NOT want to provide a custom DevicePluginScheduler and uses the default one. This method would be called back after the devices get allocated. It also seems to complete the current DevicePluginAPI, which already has an {{onDevicesReleased(devices)}} method.
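The symmetry suggested above (a callback after allocation to match the existing callback after release, with the framework keeping the books when no custom scheduler is supplied) can be sketched as below. Everything here is a hypothetical illustration: `LifecycleHooks`, `BookkeepingSketch`, the string device IDs and the container ID are assumptions and are not part of the proposed API.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical pair of lifecycle callbacks; both are optional for the plugin. */
interface LifecycleHooks {
  default void onDevicesAllocated(Set<String> allocatedDevices) {
  }

  default void onDevicesReleased(Set<String> releasedDevices) {
  }
}

/** Hypothetical framework-side bookkeeping that drives the callbacks. */
final class BookkeepingSketch {
  private final Set<String> free = new TreeSet<String>();
  private final Map<String, Set<String>> usedByContainer = new HashMap<String, Set<String>>();
  private final LifecycleHooks hooks;

  BookkeepingSketch(Collection<String> devices, LifecycleHooks hooks) {
    this.free.addAll(devices);
    this.hooks = hooks;
  }

  /** Default allocation: take free devices in sorted order, then notify the plugin. */
  Set<String> allocate(String containerId, int count) {
    if (count < 0 || count > free.size()) {
      throw new IllegalStateException("not enough free devices for " + containerId);
    }
    Set<String> picked = new TreeSet<String>();
    Iterator<String> it = free.iterator();
    while (picked.size() < count) {
      picked.add(it.next());
      it.remove();
    }
    usedByContainer.put(containerId, picked);
    hooks.onDevicesAllocated(Collections.unmodifiableSet(picked));
    return picked;
  }

  /** Return a container's devices to the free pool, then notify the plugin. */
  void release(String containerId) {
    Set<String> released = usedByContainer.remove(containerId);
    if (released == null) {
      return; // unknown container or already released
    }
    free.addAll(released);
    hooks.onDevicesReleased(Collections.unmodifiableSet(released));
  }

  int freeCount() {
    return free.size();
  }
}

final class LifecycleDemo {
  public static void main(String[] args) {
    LifecycleHooks printing = new LifecycleHooks() {
      @Override
      public void onDevicesAllocated(Set<String> allocatedDevices) {
        System.out.println("allocated " + allocatedDevices);
      }

      @Override
      public void onDevicesReleased(Set<String> releasedDevices) {
        System.out.println("released " + releasedDevices);
      }
    };
    BookkeepingSketch book = new BookkeepingSketch(Arrays.asList("d0", "d1", "d2"), printing);
    book.allocate("container_1", 2);
    book.release("container_1");
  }
}
```

Making both callbacks default methods keeps them optional, so a plugin that has nothing to do after allocation does not need to implement anything.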
[jira] [Commented] (YARN-8851) [Umbrella] A new pluggable device plugin framework to ease vendor plugin development
[ https://issues.apache.org/jira/browse/YARN-8851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16664540#comment-16664540 ]

Zhankun Tang commented on YARN-8851:
------------------------------------

[~csingh], thanks for the review!
{quote}1.
{code:java}
DeviceRegisterRequest register();
{code}
This is misleading. {{register()}} would mean that the device plugin is registering itself. However, here we need some information from the device plugin. Maybe, it can be changed to something like
{code:java}
DeviceResourceInfo getDeviceResourceInfo()
{code}
{quote}
Zhankun-> Yeah. Weiwei also mentioned this problem. "getDeviceResourceInfo" is also very good. Now we have two names for it. :) "DeviceRegisterRequest getDeviceResourceInfo" and "DeviceRegisterRequest getRegisterRequestInfo()". Maybe the latter one is more accurate since the "DeviceRegisterRequest" may contain more info besides the resource name & version we currently want?
{quote}2.
{code:java}
DeviceRuntimeSpec onDevicesUse(Set<Device> allocatedDevices, String runtime);
{code}
If this is to get the {{DeviceRuntimeSpec}}, then should it be called {{getDeviceRuntimeSpec()}}?
{quote}
Zhankun-> That's a good idea.
{quote}3. Since we have a callback for devices released, do we also need a callback for devices allocated?
{{void onDevicesAllocated(Set<Device> allocatedDevices)}}
{quote}
Zhankun-> We have another interface, "DevicePluginScheduler", to do this. One may ask why there are two interfaces: the intention is that the scheduler interface is optional, while the other one is mandatory.
{code:java}
/**
 * Called when allocating devices. The framework will do all device book keeping
 * and fail recovery. So this hook should only do scheduling based on available devices
 * passed in. This method could be invoked multiple times.
 * @param availableDevices Devices allowed to be chosen from.
 * @param count Number of device to be allocated.
 * @return a set of {@link Device}
 */
Set<Device> allocateDevices(Set<Device> availableDevices, Integer count);
{code}
{quote}4. Just a suggestion about logging.
Use the slf4j logging format, since that's the framework we are using and it improves the readability of logging statements. E.g., instead of
{{LOG.info("Adapter of " + pluginClassName + " created. Initializing..");}}
we can use
{code:java}
LOG.info("Adapter of {} created. Initializing..", pluginClassName);
{code}
{quote}
Zhankun-> Yeah. I also noted that we're using slf4j in this "ResourcePluginManager" instead of log4j. Will change it.
[jira] [Commented] (YARN-8851) [Umbrella] A new pluggable device plugin framework to ease vendor plugin development
[ https://issues.apache.org/jira/browse/YARN-8851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16664228#comment-16664228 ]

Chandni Singh commented on YARN-8851:
-------------------------------------

[~tangzhankun] Thanks for working on this. I have a few initial comments about the Device Plugin API.

1.
{code:java}
DeviceRegisterRequest register();
{code}
This is misleading. {{register()}} would mean that the device plugin is registering itself. However, here we need some information from the device plugin. Maybe, it can be changed to something like
{code:java}
DeviceResourceInfo getDeviceResourceInfo()
{code}
2.
{code:java}
DeviceRuntimeSpec onDevicesUse(Set<Device> allocatedDevices, String runtime);
{code}
If this is to get the {{DeviceRuntimeSpec}}, then should it be called {{getDeviceRuntimeSpec()}}?

3. Since we have a callback for devices released, do we also need a callback for devices allocated?
{{void onDevicesAllocated(Set<Device> allocatedDevices)}}

4. Just a suggestion about logging.
Use the slf4j logging format, since that's the framework we are using and it improves the readability of logging statements. E.g., instead of
{{LOG.info("Adapter of " + pluginClassName + " created. Initializing..");}}
we can use:
{{LOG.info("Adapter of {} created. Initializing..", pluginClassName);}}
[jira] [Commented] (YARN-8851) [Umbrella] A new pluggable device plugin framework to ease vendor plugin development
[ https://issues.apache.org/jira/browse/YARN-8851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16663495#comment-16663495 ]

Zhankun Tang commented on YARN-8851:
------------------------------------

Sorry that I missed your comments. Thanks [~cheersyang]. :)
[jira] [Commented] (YARN-8851) [Umbrella] A new pluggable device plugin framework to ease vendor plugin development
[ https://issues.apache.org/jira/browse/YARN-8851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16661657#comment-16661657 ]

Zhankun Tang commented on YARN-8851:
------------------------------------

[~leftnoteasy], thanks for the review. Answers below:
{quote}1) From a user perspective, what needs to be implemented? Is it just following two?
DevicePlugin (required)
DevicePluginScheduler (optional)
{quote}
Zhankun-> Yeah. Just the following two.
{quote}2) It's good to see you added an examples package; it will be useful for users to start with. However, instead of providing a fake implementation, can we implement a demo device plugin that can actually be configured and tested on a single node cluster? This will give users a better sense of how to implement their own plugin. Further, it will be good if you can provide a sanity test suite to verify whether a device plugin is compatible.
{quote}
Zhankun-> The fake device plugin can actually be configured and tested. The only problem I see is that the example is just a class, not a Maven project with its own pom.xml. Should we add the pom.xml dependencies to the document and to the example device plugin's code comments? For the sanity test suite, will do that.
{quote}3) Some high-level comments about the APIs in DevicePlugin
{{DeviceRegisterRequest register();}}
This is a bit confusing. A register() function is normally a two-sided call, e.g. a slave registers itself to a master. But here it simply returns a DeviceRegisterRequest; it looks more like a getDeviceInfo() API to me.
{{Set<Device> getDevices();}}
Is this supposed to return a set of available devices? If so, is it better to rename it to "getAvailableDevices"?
{quote}
Zhankun-> The DeviceRegisterRequest contains the name of the resource type that the plugin wants to register, and maybe other info in the future. How about "DeviceRegisterRequest getRegisterInfo()"?

Yeah, "getAvailableDevices" is more concrete. But I'm afraid that once we support monitoring the devices, this method will be called regularly. The name is also a little confusing for a plugin that has scheduling logic: it may wonder what "available" means, and whether devices already in use should be counted. I guess what we are actually asking for is the allowed devices. How about "Set<Device> getAllowedDevices"?
{quote}4) It is interesting to allow a customized DevicePluginScheduler; how can failure recovery be done? Does that mean the user needs to implement all the logic for persisting and recovering allocated resources in the NM store? In that case, we are exposing too much of YARN's internals in a plugin framework.
{quote}
Zhankun-> YARN will do the bookkeeping, persistence, and recovery for all allocations made by a customized device plugin scheduler. The DevicePluginScheduler should be stateless. Check the API description below; we ensure that the "availableDevices" passed into the API is an immutable set, so calling the API won't affect YARN's stability. Here we ask the plugin: "hey, there are some available devices at hand, choose N for me". The vendor plugin developer can inspect them and do customized scheduling based on topology, utilization, virtualization, or health status, using vendor-specific knowledge that we don't have.
{code:java}
/**
 * Called when allocating devices. The framework will do all device book keeping
 * and fail recovery. So this hook should only do scheduling based on available devices
 * passed in. This method could be invoked multiple times.
 * @param availableDevices Devices allowed to be chosen from.
 * @param count Number of device to be allocated.
 * @return a set of {@link Device}
 */
Set<Device> allocateDevices(Set<Device> availableDevices, Integer count);
{code}
{quote}5) DevicePluginAdapter doesn't look like an adaptor; it looks more like a base class of ResourcePlugin to me. Pls correct me if I misunderstood this.
{quote}
Zhankun-> I'm afraid not. One device plugin instance is wrapped by one DevicePluginAdapter so that it can be integrated into YARN's ResourcePlugin handling process. From this angle, the DevicePluginAdapter adapts YARN's requirements to the plugin instance. I haven't found a better name for it. The previous implementation of DevicePluginAdapter implemented 4 interfaces; now it only implements ResourcePlugin. How about "DeviceResourceImpl"?
{quote}6) It is confusing that DevicePluginAdapter has a reference to ResourcePluginManager, could you remove that? From what I can see, ResourcePluginManager manages all ResourcePlugins, and each ResourcePlugin can be instantiated by a DevicePluginAdapter.
{quote}
Zhankun-> Yeah, it's a leftover from the WIP patch. Will remove that. One thing to clarify is that the DevicePluginAdapter itself is actually a ResourcePlugin. It is added to ResourcePluginManager's pluginMap.
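A minimal sketch of a stateless scheduler in the spirit of the {{allocateDevices}} hook discussed above. It picks the {{count}} devices with the lowest IDs from the set it is handed and never modifies that set, so it works even when the framework passes an immutable view. {{Device}} is a simplified stand-in, and {{SketchDevicePluginScheduler}} and {{LowestIdScheduler}} are hypothetical names; a real plugin would more likely choose by topology, utilization, or health.

```java
import java.util.Comparator;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical, simplified stand-in for the framework's Device class.
final class Device {
  private final int id;
  Device(int id) { this.id = id; }
  int getId() { return id; }
}

// Hypothetical interface mirroring the signature quoted in the thread.
interface SketchDevicePluginScheduler {
  Set<Device> allocateDevices(Set<Device> availableDevices, Integer count);
}

// Stateless policy: keeps no fields, so the framework can own all bookkeeping.
final class LowestIdScheduler implements SketchDevicePluginScheduler {
  @Override
  public Set<Device> allocateDevices(Set<Device> availableDevices, Integer count) {
    // Copy into a sorted set; the input itself is treated as read-only.
    TreeSet<Device> sorted = new TreeSet<>(Comparator.comparingInt(Device::getId));
    sorted.addAll(availableDevices);

    Set<Device> chosen = new LinkedHashSet<>();
    for (Device d : sorted) {
      if (chosen.size() >= count) {
        break;
      }
      chosen.add(d);
    }
    return chosen;
  }
}
```

If fewer than {{count}} devices are available, this sketch simply returns what it has; how the framework should react to a short allocation is not specified in the thread.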
[jira] [Commented] (YARN-8851) [Umbrella] A new pluggable device plugin framework to ease vendor plugin development
[ https://issues.apache.org/jira/browse/YARN-8851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16660906#comment-16660906 ]

Weiwei Yang commented on YARN-8851:
-----------------------------------

Hi [~tangzhankun]

Thanks for the design doc and patch. I have some high-level comments.

1) From a user perspective, what needs to be implemented? Is it just the following two?
* DevicePlugin (required)
* DevicePluginScheduler (optional)

2) It's good to see you added an *examples* package; it will be useful for users to start with. However, instead of providing a fake implementation, can we implement a demo device plugin that can actually be configured and tested on a single node cluster? This will give users a better sense of how to implement their own plugin. Further, it will be good if you can provide a sanity test suite to verify whether a device plugin is compatible.

3) Some high-level comments about the APIs in {{DevicePlugin}}
{code:java}
DeviceRegisterRequest register();
{code}
This is a bit confusing. A register() function is normally a two-sided call, e.g. a slave registers itself to a master. But here it simply returns a {{DeviceRegisterRequest}}; it looks more like a {{getDeviceInfo()}} API to me.
{code:java}
Set<Device> getDevices();
{code}
Is this supposed to return a set of available devices? If so, is it better to rename it to "getAvailableDevices"?

4) It is interesting to allow a customized {{DevicePluginScheduler}}; how can failure recovery be done? Does that mean the user needs to implement all the logic for persisting and recovering allocated resources in the NM store? In that case, we are exposing too much of YARN's internals in a plugin framework.

5) {{DevicePluginAdapter}} doesn't look like an adaptor; it looks more like a base class of {{ResourcePlugin}} to me. Pls correct me if I misunderstood this.

6) It is confusing that DevicePluginAdapter has a reference to ResourcePluginManager, could you remove that? From what I can see, ResourcePluginManager manages all ResourcePlugins, and each ResourcePlugin can be instantiated by a DevicePluginAdapter.

Let me know if these make sense.

Thanks
[jira] [Commented] (YARN-8851) [Umbrella] A new pluggable device plugin framework to ease vendor plugin development
[ https://issues.apache.org/jira/browse/YARN-8851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16660804#comment-16660804 ]

Zhankun Tang commented on YARN-8851:
------------------------------------

[~leftnoteasy], agreed, and I have updated the patch. Please review:

1. Changed one API of DevicePlugin: added a runtime parameter, as below.
{code:java}
/**
 * Asking how these devices should be prepared/used before container launch.
 * @param allocatedDevices A set of allocated {@link Device}.
 *        Note that it could be null which means no device allocated.
 *        Only {@code volumeClaims} in it will be handled to create volume.
 * @param runtime Indicate which runtime the framework will use
 *        Could be {@code RUNTIME_CGROUPS} or {@code RUNTIME_DOCKER}
 *        in {@link DeviceRuntimeSpec}
 * @return a {@link DeviceRuntimeSpec} description about environment,
 *         {@link VolumeSpec}, {@link MountVolumeSpec}. etc
 */
DeviceRuntimeSpec onDevicesUse(Set<Device> allocatedDevices, String runtime);
{code}
2. Added some code to show how to get the DeviceRuntimeSpec and use it.

The onDevicesUse method above is called in ResourceHandler's preStart and in all three methods of DockerCommandPlugin. When the runtime is Docker, DockerCommandPlugin's getCreateDockerVolumeCommand method is called before ResourceHandler's preStart, so allocatedDevices is passed in as null at that point. In that case the device plugin can return a DeviceRuntimeSpec containing only a VolumeSpec, which asks YARN to create the Docker volume. Or it can return an empty object if it creates the volume in its own way; then YARN does nothing for Docker volume creation.

This API might be a little complex. We could instead add a separate method like the one below, but I'm not sure whether having both "onDevicePreparation" and "onDevicesUse" would cause confusion. In theory, we can change our internals to make the allocation happen earlier and be visible to DockerCommandPlugin. Then the allocation would never be null and onDevicesUse would be clear on its own, so I have not added the extra method.
{code:java}
VolumeSpec onDevicePreparation(String runtime)
{code}
Please let me know your thoughts. Thanks.
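To make the null-allocation case described above concrete, here is a rough Java sketch of a plugin-side {{onDevicesUse}}. Everything in it is illustrative: {{SketchRuntimeSpec}} is a heavily simplified stand-in for DeviceRuntimeSpec, the string values of the runtime constants are assumed, and the volume name and environment variable name are made up.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified stand-in for the framework's Device class.
final class Device {
  private final int id;
  Device(int id) { this.id = id; }
  int getId() { return id; }
}

// Hypothetical, heavily simplified stand-in for DeviceRuntimeSpec.
final class SketchRuntimeSpec {
  static final String RUNTIME_CGROUPS = "cgroups"; // assumed value
  static final String RUNTIME_DOCKER = "docker";   // assumed value
  final Map<String, String> envs = new LinkedHashMap<>();
  final Set<String> volumeClaims = new LinkedHashSet<>();
}

final class SketchVendorPlugin {
  SketchRuntimeSpec onDevicesUse(Set<Device> allocatedDevices, String runtime) {
    SketchRuntimeSpec spec = new SketchRuntimeSpec();
    if (allocatedDevices == null) {
      // Called before allocation (the Docker volume step): only claim a volume.
      if (SketchRuntimeSpec.RUNTIME_DOCKER.equals(runtime)) {
        spec.volumeClaims.add("vendor_driver_volume"); // made-up volume name
      }
      return spec;
    }
    // Devices are known: expose their IDs through a made-up environment variable.
    StringBuilder ids = new StringBuilder();
    for (Device d : allocatedDevices) {
      if (ids.length() > 0) {
        ids.append(',');
      }
      ids.append(d.getId());
    }
    spec.envs.put("VENDOR_VISIBLE_DEVICES", ids.toString());
    return spec;
  }
}
```

The branch on null is exactly the complexity the comment worries about; making allocation visible earlier, as proposed, would let the first branch disappear.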
[jira] [Commented] (YARN-8851) [Umbrella] A new pluggable device plugin framework to ease vendor plugin development
[ https://issues.apache.org/jira/browse/YARN-8851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659747#comment-16659747 ]

Wangda Tan commented on YARN-8851:
----------------------------------

[~tangzhankun],

Thanks for updating the patch, the latest patch looks much better now.

A couple of suggestions:
* The DevicePluginAdapter extends/implements 4 interfaces. Instead of doing that, is it possible to make the Adapter implement only the ResourcePlugin interface, and add several "sub-adapters" that implement ResourceHandler, DockerCommandPlugin, and NMResourceUpdaterPlugin? By doing this, we get more granular interface definitions that stay very close to the ResourcePlugin interface, so fewer changes to the integration code are required.
* I can understand that most of the DevicePluginAdapter logic should be similar to the GPUResourcePlugin implementation, but some parts will come from DeviceRuntimeSpec. It would help to have a more concrete implementation to see whether our APIs are properly designed.

I haven't dug into the details of the code logic, naming, etc. while we're still trying to sort out the overall code structure.
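The sub-adapter idea from the review comment above can be sketched as plain composition: the adapter implements one interface and hands out small objects for each concern. The interfaces below are deliberately tiny, hypothetical stand-ins (prefixed with "Sketch") and do not reproduce the real ResourcePlugin, ResourceHandler, DockerCommandPlugin, or NMResourceUpdaterPlugin signatures.

```java
// Hypothetical, minimal stand-ins for the three concerns named in the comment.
interface SketchResourceHandler {
  String preStart(String containerId);
}

interface SketchDockerCommandPlugin {
  String updateDockerRunCommand(String command);
}

interface SketchResourceUpdater {
  int countDevices();
}

// The single interface the adapter itself implements.
interface SketchResourcePlugin {
  SketchResourceHandler getResourceHandler();
  SketchDockerCommandPlugin getDockerCommandPlugin();
  SketchResourceUpdater getResourceUpdater();
}

// Adapter built by composition: each getter returns a small sub-adapter.
final class SketchDevicePluginAdapter implements SketchResourcePlugin {
  private final String resourceName;
  private final int deviceCount;

  SketchDevicePluginAdapter(String resourceName, int deviceCount) {
    this.resourceName = resourceName;
    this.deviceCount = deviceCount;
  }

  @Override
  public SketchResourceHandler getResourceHandler() {
    return containerId -> "prepare " + resourceName + " for " + containerId;
  }

  @Override
  public SketchDockerCommandPlugin getDockerCommandPlugin() {
    // Only tags the command; real code would add vendor-specific arguments.
    return command -> command + " # with " + resourceName;
  }

  @Override
  public SketchResourceUpdater getResourceUpdater() {
    return () -> deviceCount;
  }
}
```

Each sub-adapter can then be tested and evolved separately, which is the granularity argument made in the comment.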
[jira] [Commented] (YARN-8851) [Umbrella] A new pluggable device plugin framework to ease vendor plugin development
[ https://issues.apache.org/jira/browse/YARN-8851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16658763#comment-16658763 ]

Zhankun Tang commented on YARN-8851:
------------------------------------

[~leftnoteasy], [~sunilg], [~cheersyang]. Updated the patch for your review. The key changes are:

1. Added _DeviceRuntimeSpec_ related classes.

2. Added a new interface, "DevicePluginScheduler", which a vendor can implement to provide their own scheduling logic in the method _"Set<Device> allocateDevices(Set<Device> availableDevices, Integer count)"_. The framework uses the plugin's scheduling logic if "yarn.nodemanager.pluggable-device-framework.prefer-customized-scheduler" is set to true; otherwise it uses the internal scheduling logic.

3. Changed the method names of the current "DevicePlugin" interface:
"preLaunchContainer => OnDevicesAllocated
postCompleteContainer => OnDeviceReleased"

4. Renamed "DeviceLocalScheduler" to "DeviceSchedulerManager".

5. Added some unit tests to check the basic workflow of DevicePlugin, DevicePluginAdapter, and DeviceSchedulerManager.
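The configuration switch described in item 2 above amounts to a small selection step. The sketch below uses a plain {{Map}} in place of Hadoop's configuration object, and {{SketchScheduler}} and {{SketchSchedulerManager}} are hypothetical names. The default of "false" when the key is absent, and the fallback when no custom scheduler exists, are assumptions rather than anything stated in the thread.

```java
import java.util.Map;

// Hypothetical scheduler abstraction; only a name is needed for this sketch.
interface SketchScheduler {
  String name();
}

final class SketchSchedulerManager {
  // Property name as quoted in the comment above.
  static final String PREFER_CUSTOM =
      "yarn.nodemanager.pluggable-device-framework.prefer-customized-scheduler";

  // Returns the plugin's scheduler only when the flag is true and one exists.
  static SketchScheduler choose(Map<String, String> conf,
                                SketchScheduler custom,
                                SketchScheduler builtIn) {
    // Assumption: a missing key behaves like "false".
    boolean preferCustom =
        Boolean.parseBoolean(conf.getOrDefault(PREFER_CUSTOM, "false"));
    return (preferCustom && custom != null) ? custom : builtIn;
  }
}
```

Falling back to the built-in scheduler when the plugin supplies none keeps the optional interface truly optional.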
[jira] [Commented] (YARN-8851) [Umbrella] A new pluggable device plugin framework to ease vendor plugin development
[ https://issues.apache.org/jira/browse/YARN-8851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16656188#comment-16656188 ]

Zhankun Tang commented on YARN-8851:
------------------------------------

[~leftnoteasy], many thanks for the offline discussion.

Agreed that we remove the unnecessary APIVersion field, since we can throw an exception if the plugin is not compatible.

As for the Factory pattern for creating the device adapter or device plugin instance, we'll keep it as a future option in case we encounter too much complexity in the current design.

So we'll go with the current DevicePlugin interface (keeping the adapter invisible to the vendor developer) and the shared device local scheduler, but we'll try to leave an interface for vendors to plug in their own device scheduling logic.
[jira] [Commented] (YARN-8851) [Umbrella] A new pluggable device plugin framework to ease vendor plugin development
[ https://issues.apache.org/jira/browse/YARN-8851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16654594#comment-16654594 ]

Zhankun Tang commented on YARN-8851:
------------------------------------

[~leftnoteasy], for question 6, I'll try to answer here and we can talk offline if it's not clear to you.

The design principle I'm trying to follow is to make the vendor completely agnostic of YARN internals. Simpler for them, better for YARN's device plugin ecosystem. Actually, I'm not sure whether this will bring us a huge, uncontrollable amount of complexity. But my idea is this:

The vendor developer only needs to use libraries that YARN provides to describe the requirements related to their devices. The *_DevicePlugin_* interface defines the hooks, which are the only opportunities for the vendor to tell YARN what devices they have and how to use them. It is the only interface that the vendor needs to know, and the specs can only be created with the library provided by us.

Sorry that the *_DevicePluginAdapter_* name is confusing. This class acts as a bridge between the NM and the vendor plugin. When the NM wants to get devices, the DevicePluginAdapter delegates the request to the vendor plugin and hands back the result. When the NM wants to use these devices, the DevicePluginAdapter allocates devices, delegates to the vendor plugin to find out how to use them, and tells YARN in YARN's language. The DevicePluginAdapter has a 1-to-1 relation with DevicePlugin: each DevicePlugin instance needs a DevicePluginAdapter instance to help it. So it's not a problem that the DevicePlugin interface is not similar to DevicePluginAdapter. The DevicePluginAdapter knows the NM well, and it makes use of the DevicePlugin. Maybe "DevicePluginWrapper" or "ResourcePluginAdapter" is a more proper name?

For the device scheduler, I'm now using a shared device scheduler to handle all DevicePluginAdapters' allocation requests before container launch. The various types of resources are allocated one by one in this shared scheduler, which is, in essence, the same as the current independent scheduler inside each GPU plugin/FPGA plugin. As for whether we should accept a vendor's customized scheduler, it's a good idea. But from my experience, I guess a shared scheduler supporting FIFO and topology scheduling might be enough for most vendors in the long term?
[jira] [Commented] (YARN-8851) [Umbrella] A new pluggable device plugin framework to ease vendor plugin development
[ https://issues.apache.org/jira/browse/YARN-8851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16654557#comment-16654557 ]

Zhankun Tang commented on YARN-8851:

[~leftnoteasy]

5) DeviceRuntimeSpec is empty, what you plan to add?

{color:#d04437}Zhankun -->{color} It's a set of classes, described more clearly in the UML figure of the design doc. In general, it is returned by the vendor plugin's implementation of the "OnDeviceAllocated" hook to describe requirements such as environment variables, volume creation, or Docker command updates. The "DevicePluginAdapter" translates this DeviceRuntimeSpec into YARN internal operations.

For instance, GPUv1 might require a volume to be created before container launch. In the DeviceRuntimeSpec this is expressed as a "volumeClaim", which tells the NM to create the volume. Another example: GPUv2 needs additional environment variables when running a Docker container, and these are described by "envs". For cgroups device isolation, the devices are described by "MountDeviceSpec".

The class looks like this:
{code:java}
class DeviceRuntimeSpec {
  Map envs;          // environment variables needed before using devices
  Set volumeMounts;  // volumes that need to be mounted before using devices
  Set devices;       // devices that need to be mounted
  Set volumeClaim;   // volumes to be created/deleted before using devices
}
{code}

> [Umbrella] A new pluggable device plugin framework to ease vendor plugin development
>
> Key: YARN-8851
> URL: https://issues.apache.org/jira/browse/YARN-8851
> Project: Hadoop YARN
> Issue Type: New Feature
> Components: yarn
> Reporter: Zhankun Tang
> Assignee: Zhankun Tang
> Priority: Major
> Attachments: YARN-8851-WIP2-trunk.001.patch, YARN-8851-WIP3-trunk.001.patch, YARN-8851-WIP4-trunk.001.patch, [YARN-8851] YARN_New_Device_Plugin_Framework_Design_Proposal-3.pdf, [YARN-8851] YARN_New_Device_Plugin_Framework_Design_Proposal.pdf
>
> At present, we support GPU/FPGA device in YARN through a native, coupling way. But it's difficult for a vendor to implement such a device plugin because the developer needs much knowledge of YARN internals. And this brings burden to the community to maintain both YARN core and vendor-specific code.
> Here we propose a new device plugin framework to ease vendor device plugin development and provide a more flexible way to integrate with YARN NM.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
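As a rough illustration of the reply to item 5, the sketch below shows one way an adapter could turn a runtime spec into extra "docker run" arguments. Everything here is hypothetical: DeviceRuntimeSpecSketch and DockerArgsSketch are invented names, the element types are plain strings rather than the classes from the design doc, and volume claims are left out. Only the Docker flags themselves (-e, -v, --device) are real.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified stand-in for the DeviceRuntimeSpec described in the comment.
// Plain strings replace the spec classes from the design doc; volumeClaim is omitted.
final class DeviceRuntimeSpecSketch {
  final Map<String, String> envs = new LinkedHashMap<>();  // environment variables
  final Set<String> volumeMounts = new LinkedHashSet<>();  // "source:target[:mode]" entries
  final Set<String> devices = new LinkedHashSet<>();       // host device paths
}

// Hypothetical adapter step: translate the spec into extra "docker run" arguments.
final class DockerArgsSketch {
  static List<String> toDockerRunArgs(DeviceRuntimeSpecSketch spec) {
    List<String> args = new ArrayList<>();
    for (Map.Entry<String, String> env : spec.envs.entrySet()) {
      args.add("-e");                                  // pass an environment variable
      args.add(env.getKey() + "=" + env.getValue());
    }
    for (String mount : spec.volumeMounts) {
      args.add("-v");                                  // mount a volume
      args.add(mount);
    }
    for (String device : spec.devices) {
      args.add("--device");                            // expose a host device
      args.add(device);
    }
    return args;
  }
}
```

The point of the indirection is that the vendor plugin only fills in the spec, while the knowledge of how to express it for a given container runtime stays in one place on the NM side.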
[jira] [Commented] (YARN-8851) [Umbrella] A new pluggable device plugin framework to ease vendor plugin development
[ https://issues.apache.org/jira/browse/YARN-8851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16654534#comment-16654534 ]

Zhankun Tang commented on YARN-8851:

[~leftnoteasy] Thanks for the review! Very helpful comments!

1.
{code:java}
// Check version for compatibility
String pluginVersion = request.getVersion();
if (!isVersionCompatible(pluginVersion)) {
  LOG.error("Class: " + pluginClassName + " version: " + pluginVersion
      + " is not compatible. Expected: " + DeviceConstants.version);
}
{code}
What's the use case for this? My understanding is, version match should happen when requests come to NM. And I'm not sure if it is the best idea to limit format of version, maybe we should just treat it as an identifier in addition to name?

{color:#FF}Zhankun -->{color} Sorry for the misleading name "pluginVersion". It should in fact be "APIVersion". Its format follows semantic versioning, "MAJOR.MINOR.PATCH". A vendor plugin should report which DevicePlugin API version it uses. Given a version number MAJOR.MINOR.PATCH, increment the:
* MAJOR version when you make incompatible API changes,
* MINOR version when you add functionality in a backwards-compatible manner, and
* PATCH version when you make backwards-compatible bug fixes.

When the NM gets the register request from a vendor plugin, this "APIVersion" is used to check whether the plugin was developed against a compatible version of "org.apache.hadoop.yarn.server.nodemanager.api.deviceplugin". For instance, if the NM uses "1.0.0" but the plugin's APIVersion is "0.1.0" (meaning the plugin was developed against the 0.1.0 APIs), we should reject the register request because the APIs it uses may be deprecated (major version 0 < 1).

We could also add a "pluginVersion" field for the plugin to indicate its own version, but I guess this is not that important to YARN.

2. Instead of adding two configs:
{code:java}
@Private
public static final String NM_RESOURCE_PLUGINS_ENABLE_PLUGGABLE_DEVICE_FRAMEWORK =
    NM_RESOURCE_PLUGINS + ".pluggable-device-framework.enable";

@Private
public static final String NM_RESOURCE_PLUGINS_PLUGGABLE_CLASS =
    NM_RESOURCE_PLUGINS + ".pluggable-class";
{code}
Maybe leaving the pluggable-class is sufficient?

{color:#FF}Zhankun -->{color} Ah ha, I think leaving only this one is OK for now. But I'm not sure whether there will be more configurations related to the device framework, so maybe keeping a switch here makes it easier for the administrator to turn the whole framework on or off?

3. Set getAndWatch(), I'm not sure what does the "Watch" mean? Should it be just getDevices?

{color:#FF}Zhankun -->{color} Good idea.

4. It looks like you try to make DevicePlugin agnostic to Container itself, maybe we should change the name:
preLaunchContainer => allocateDevices
postCompleteContainer => releaseDevices?

{color:#FF}Zhankun -->{color} Yeah. This name is confusing. How about this? Since we want the vendor plugin developer to know that these two are hooks which will be invoked by the NM (more accurately, by the DevicePluginAdapter).
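The semantic-versioning check described in the reply to item 1 could look roughly like the sketch below. It is not taken from the patch: ApiVersionCheckSketch and majorOf are hypothetical names, and the rule "compatible when the MAJOR numbers are equal" is an assumption based on the "0.1.0 versus 1.0.0" example in the comment. A real check might also compare MINOR versions.

```java
// Hypothetical helper illustrating the API version check discussed in the comment.
// Assumption: two versions are compatible when their MAJOR numbers are equal.
final class ApiVersionCheckSketch {

  // Returns the MAJOR component of a strict "MAJOR.MINOR.PATCH" string.
  static int majorOf(String version) {
    if (version == null || !version.matches("\\d+\\.\\d+\\.\\d+")) {
      throw new IllegalArgumentException("Not MAJOR.MINOR.PATCH: " + version);
    }
    return Integer.parseInt(version.substring(0, version.indexOf('.')));
  }

  // True when the plugin was built against the same MAJOR API version as the NM.
  static boolean isVersionCompatible(String pluginApiVersion, String nmApiVersion) {
    try {
      return majorOf(pluginApiVersion) == majorOf(nmApiVersion);
    } catch (IllegalArgumentException e) {
      // Malformed versions (including numeric overflow) are treated as incompatible.
      return false;
    }
  }
}
```

With this rule, a plugin reporting "0.1.0" is rejected by an NM at "1.0.0", while "1.2.3" is accepted. Treating a malformed version as incompatible keeps the register path from failing on bad vendor input.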
[jira] [Commented] (YARN-8851) [Umbrella] A new pluggable device plugin framework to ease vendor plugin development
[ https://issues.apache.org/jira/browse/YARN-8851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16654192#comment-16654192 ]

Wangda Tan commented on YARN-8851:

Thanks [~tangzhankun], mostly high-level comments. Item #6 is the most important and the most fundamental to the feature.

1) Regarding version compatibility:
{code:java}
// Check version for compatibility
String pluginVersion = request.getVersion();
if (!isVersionCompatible(pluginVersion)) {
  LOG.error("Class: " + pluginClassName + " version: " + pluginVersion
      + " is not compatible. Expected: " + DeviceConstants.version);
}
{code}
What's the use case for this? My understanding is, version match should happen when requests come to NM. And I'm not sure if it is the best idea to limit format of version, maybe we should just treat it as an identifier in addition to name?

2) Instead of adding two configs:
{code:java}
@Private
public static final String NM_RESOURCE_PLUGINS_ENABLE_PLUGGABLE_DEVICE_FRAMEWORK =
    NM_RESOURCE_PLUGINS + ".pluggable-device-framework.enable";

@Private
public static final String NM_RESOURCE_PLUGINS_PLUGGABLE_CLASS =
    NM_RESOURCE_PLUGINS + ".pluggable-class";
{code}
Maybe leaving the pluggable-class is sufficient?

3) Set getAndWatch(), I'm not sure what does the "Watch" mean? Should it be just getDevices?

4) It looks like you try to make DevicePlugin agnostic to Container itself, maybe we should change the name:
preLaunchContainer => allocateDevices
postCompleteContainer => releaseDevices?

5) DeviceRuntimeSpec is empty, what you plan to add?

6) The purpose of {{DevicePluginAdapter}} is to handle all resource plugins. However, the DevicePlugin interface and DevicePluginAdapter do not quite match, so it is very likely that we will need customized logic in DevicePluginAdapter. For example, how to manipulate the Docker command could be quite different for GPU and FPGA. So instead of only making a pluggable interface for DevicePlugin itself, should we use the Factory pattern to make all required interfaces pluggable?

What I mean is, change:
{code:java}
.pluggable-class
{code}
to {{.pluggable-factory-class}}. The device provider should then provide a factory that can return {{DevicePluginAdapter}} and {{DevicePlugin}} instances.

I also feel it would be better to make the scheduler part of the factory, given that how to allocate resources could differ between devices. So the Factory interface could have the following methods:
{code:java}
interface DevicePluginFactory {
  DevicePlugin getDevicePlugin();
  DevicePluginAdapter getDevicePluginAdapter();
  DevicePluginScheduler getDevicePluginScheduler();
}
{code}
Or, if you think DevicePlugin/DevicePluginScheduler should be internal implementation details of getDevicePluginAdapter, we can leave only getDevicePluginAdapter, and maybe rename it to getDevicePlugin().

I think it is going to be fine to leave a common implementation of PluginAdapter inside the NM, but the DevicePlugin interface should be at least close to the PluginAdapter interface, otherwise it is very hard to bridge the two interfaces.
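To make the factory suggestion in item 6 concrete, here is a minimal sketch of how the NM might load a factory from a configured class name. The three getter names come from the comment. Everything else is a placeholder for illustration: the empty interfaces, FakeVendorFactory, and FactoryLoaderSketch are hypothetical, and the real framework would read the class name from a configuration key such as the proposed ".pluggable-factory-class".

```java
// Empty placeholder interfaces; the real ones would carry the plugin methods.
interface DevicePlugin {}
interface DevicePluginAdapter {}
interface DevicePluginScheduler {}

// Factory shape proposed in the comment.
interface DevicePluginFactory {
  DevicePlugin getDevicePlugin();
  DevicePluginAdapter getDevicePluginAdapter();
  DevicePluginScheduler getDevicePluginScheduler();
}

// Hypothetical vendor implementation returning trivial instances.
final class FakeVendorFactory implements DevicePluginFactory {
  public DevicePlugin getDevicePlugin() { return new DevicePlugin() {}; }
  public DevicePluginAdapter getDevicePluginAdapter() { return new DevicePluginAdapter() {}; }
  public DevicePluginScheduler getDevicePluginScheduler() { return new DevicePluginScheduler() {}; }
}

// Hypothetical loader: instantiate the configured factory class by reflection.
final class FactoryLoaderSketch {
  static DevicePluginFactory load(String factoryClassName) {
    try {
      Class<?> clazz = Class.forName(factoryClassName);
      if (!DevicePluginFactory.class.isAssignableFrom(clazz)) {
        throw new IllegalArgumentException(
            factoryClassName + " does not implement DevicePluginFactory");
      }
      // Requires an accessible no-argument constructor on the factory class.
      return (DevicePluginFactory) clazz.getDeclaredConstructor().newInstance();
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("Cannot load factory " + factoryClassName, e);
    }
  }
}
```

A single factory entry point lets a vendor replace the adapter and scheduler together with the plugin, which is the flexibility item 6 asks for. The cost is a larger surface that vendors must implement.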