[ 
https://issues.apache.org/jira/browse/YARN-8851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16654192#comment-16654192
 ] 

Wangda Tan commented on YARN-8851:
----------------------------------

Thanks [~tangzhankun],  mostly high level comments.  item #6 will be most 
important and fundamental of the feature. 

1) Regarding to version compatibility:
{code:java}
 
 // Check version for compatibility
 String pluginVersion = request.getVersion();
 if (!isVersionCompatible(pluginVersion)) {
 LOG.error("Class: " + pluginClassName + " version: " + pluginVersion +
 " is not compatible. Expected: " + DeviceConstants.version);
 }
{code}
What's the use case for this? My understanding is, version match should happen 
when requests come to NM. And I'm not sure if it is the best idea to limit 
format of version, maybe we should just treat it as an identifier in addition 
to name?

2) Instead of adding two configs:
{code:java}
 @Private
 public static final String 
NM_RESOURCE_PLUGINS_ENABLE_PLUGGABLE_DEVICE_FRAMEWORK =
 NM_RESOURCE_PLUGINS + ".pluggable-device-framework.enable";


 @Private
 public static final String NM_RESOURCE_PLUGINS_PLUGGABLE_CLASS =
 NM_RESOURCE_PLUGINS + ".pluggable-class";
{code}
Maybe leaving the ....pluggable-class is sufficient?

3) Set<Device> getAndWatch(), 
 I'm not sure what does the "Watch" mean? Should it be just getDevices?

4) It looks like you try to make DevicePlugin agnostic to Container itself, 
maybe we should change the name:
 preLaunchContainer => allocateDevices 
 postCompleteContainer => releaseDevices?

5) DeviceRuntimeSpec is empty, what you plan to add?

6) The purpose of {{DevicePluginAdapter}} is to handle all resource plugins, 
however, given DevicePlugin interface and DevicePluginAdapter are not quite 
matching. It is very likely that we need customized logic for 
DevicePluginAdapter. Such as how to manipulate Docker command could be quite 
different for GPU and FPGA. So instead of only make pluggable interface for 
DevicePlugin itself, should we use Factory pattern to make all required 
interfaces pluggable?

What I meant is,
 Change:
{code:java}
.pluggable-class
{code}
To {{.pluggable-factory-class}}. And device provider should provide a factory 
method which can returns {{DevicePluginAdapter}} and {{DevicePlugin}} instances.

I also felt it will be better if we can make the scheduler to be part of the 
factory given how to allocate resources for different devices could be 
different.

So the Factory interface could have following method.
{code:java}
 
DevicePluginFactory {
 DevicePlugin getDevicePlugin();
 DevicePluginAdapter getDevicePluginAdapter();
 DevicePluginScheduler getDevicePluginScheduler();

}
{code}
Or, if you think DevicePlugin/DevicePluginScheduler should be internal 
implementation details of getDevicePluginAdapter, we can only leave 
getDevicePluginAdapter, and maybe rename it to getDevicePlugin().

And I think it gonna be fine to leave a common implementation for PluginAdapter 
which exists inside NM, but the DevicePlugin interface should be at least close 
to the PluginAdapter interface, otherwise it is very hard to bridge the two 
interfaces.

> [Umbrella] A new pluggable device plugin framework to ease vendor plugin 
> development
> ------------------------------------------------------------------------------------
>
>                 Key: YARN-8851
>                 URL: https://issues.apache.org/jira/browse/YARN-8851
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: yarn
>            Reporter: Zhankun Tang
>            Assignee: Zhankun Tang
>            Priority: Major
>         Attachments: YARN-8851-WIP2-trunk.001.patch, 
> YARN-8851-WIP3-trunk.001.patch, YARN-8851-WIP4-trunk.001.patch, [YARN-8851] 
> YARN_New_Device_Plugin_Framework_Design_Proposal-3.pdf, [YARN-8851] 
> YARN_New_Device_Plugin_Framework_Design_Proposal.pdf
>
>
> At present, we support GPU/FPGA device in YARN through a native, coupling 
> way. But it's difficult for a vendor to implement such a device plugin 
> because the developer needs much knowledge of YARN internals. And this brings 
> burden to the community to maintain both YARN core and vendor-specific code.
> Here we propose a new device plugin framework to ease vendor device plugin 
> development and provide a more flexible way to integrate with YARN NM.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

Reply via email to