[ 
https://issues.apache.org/jira/browse/YARN-8851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16654534#comment-16654534
 ] 

Zhankun Tang edited comment on YARN-8851 at 10/18/18 2:22 AM:
--------------------------------------------------------------

[~leftnoteasy] Thanks for the review! Very helpful comments!

1. 
{code:java}
 
 // Check version for compatibility
 String pluginVersion = request.getVersion();
 if (!isVersionCompatible(pluginVersion)) {
 LOG.error("Class: " + pluginClassName + " version: " + pluginVersion +
 " is not compatible. Expected: " + DeviceConstants.version);
 }
{code}
What's the use case for this? My understanding is, version match should happen 
when requests come to NM. And I'm not sure if it is the best idea to limit 
format of version, maybe we should just treat it as an identifier in addition 
to name?

{color:#ff0000}Zhankun -->{color} Sorry for the misleading name 
"pluginVersion". It should be "APIVersion" in fact. The format of it follows 
semantic versioning which is "Major.Mino.patch". A vendor plugin should report 
which DevicePlugin API version it is using.

Given a version number MAJOR.MINOR.PATCH, increment the:
MAJOR version when you make incompatible API changes,
MINOR version when you add functionality in a backwards-compatible manner, and
PATCH version when you make backwards-compatible bug fixes.

When NM gets the request from vendor plugin, this "APIVersion" is used to check 
if the vendor plugin is developed by a compatible version of 
"org.apache.hadoop.yarn.server.nodemanager.api.deviceplugin". For instance, the 
NM uses a "1.0.0" but the plugin's APIversion is "0.1.0"(which means this 
vendor plugin is developed by 0.1.0 APIs), we should reject this register 
request because the APIs it used maybe deprecated (major version 0 < 1).

And we can add a field of "pluginVersion" for the plugin to indicate its own 
version. But I guess this not that important to YARN.

2. 

Instead of adding two configs:
{code:java}
 @Private
 public static final String 
NM_RESOURCE_PLUGINS_ENABLE_PLUGGABLE_DEVICE_FRAMEWORK =
 NM_RESOURCE_PLUGINS + ".pluggable-device-framework.enable";


 @Private
 public static final String NM_RESOURCE_PLUGINS_PLUGGABLE_CLASS =
 NM_RESOURCE_PLUGINS + ".pluggable-class";
{code}
Maybe leaving the ....pluggable-class is sufficient?

{color:#ff0000}Zhankun -->{color} Ah ha, I think leave only this one is ok for 
now. But I'm not sure if there'll be more configurations related to the device 
framework. So maybe leave a switch here is more easy for the administrator to 
open/close the whole?

3.

Set<Device> getAndWatch(), 
I'm not sure what does the "Watch" mean? Should it be just getDevices?

{color:#ff0000}Zhankun–>{color} Good idea.

4. It looks like you try to make DevicePlugin agnostic to Container itself, 
maybe we should change the name:
preLaunchContainer => allocateDevices 
postCompleteContainer => releaseDevices?

{color:#ff0000}Zhankun–> {color}  Yeah. This name is confusing. How about this? 
Since we want the vendor plugindeveloper to know these two are hooks which will 
be invoked by NM (more accurate, DevicePluginAdapter).

 

 


was (Author: tangzhankun):
[~leftnoteasy] Thanks for the review! Very helpful comments!

1. 
{code:java}
 
 // Check version for compatibility
 String pluginVersion = request.getVersion();
 if (!isVersionCompatible(pluginVersion)) {
 LOG.error("Class: " + pluginClassName + " version: " + pluginVersion +
 " is not compatible. Expected: " + DeviceConstants.version);
 }
{code}
What's the use case for this? My understanding is, version match should happen 
when requests come to NM. And I'm not sure if it is the best idea to limit 
format of version, maybe we should just treat it as an identifier in addition 
to name?

{color:#ff0000}Zhankun -->{color} Sorry for the misleading name 
"pluginVersion". It should be "APIVersion" in fact. The format of it follows 
semantic versioning which is "Major.Mino.patch". A vendor plugin should report 
which DevicePlugin API version it is using.

Given a version number MAJOR.MINOR.PATCH, increment the:
MAJOR version when you make incompatible API changes,
MINOR version when you add functionality in a backwards-compatible manner, and
PATCH version when you make backwards-compatible bug fixes.

When NM gets the request from vendor plugin, this "APIVersion" is used to check 
if the vendor plugin is developed by a compatible version of 
"org.apache.hadoop.yarn.server.nodemanager.api.deviceplugin". For instance, the 
NM uses a "1.0.0" but the plugin's APIversion is "0.1.0"(which means this 
vendor plugin is developed by 0.1.0 APIs), we should reject this register 
request because the APIs it used maybe deprecated (major version 0 < 1).

And we can add a field of "pluginVersion" for the plugin to indicate its own 
version. But I guess this not that important to YARN.

2. 

Instead of adding two configs:
{code:java}
 @Private
 public static final String 
NM_RESOURCE_PLUGINS_ENABLE_PLUGGABLE_DEVICE_FRAMEWORK =
 NM_RESOURCE_PLUGINS + ".pluggable-device-framework.enable";


 @Private
 public static final String NM_RESOURCE_PLUGINS_PLUGGABLE_CLASS =
 NM_RESOURCE_PLUGINS + ".pluggable-class";
{code}
Maybe leaving the ....pluggable-class is sufficient?

{color:#ff0000}Zhankun -->{color} Ah ha, I think leave only this one is ok for 
now. But I'm not sure if there'll be more configurations related to the device 
framework. So maybe leave a switch here is more easy for the administrator to 
open/close the whole?

3.

Set<Device> getAndWatch(), 
I'm not sure what does the "Watch" mean? Should it be just getDevices?

{color:#ff0000}Zhankun–>{color} Good idea.

4. It looks like you try to make DevicePlugin agnostic to Container itself, 
maybe we should change the name:
preLaunchContainer => allocateDevices 
postCompleteContainer => releaseDevices?

{color:#ff0000}Zhankun–> {color:#333333}Yeah. This name is confusing. How about 
this? Since we want the vendor plugin {color}{color}developer{color:#333333} to 
know these two are hooks which will be invoked by NM (more accurate, 
DevicePluginAdapter).{color}

 

 

> [Umbrella] A new pluggable device plugin framework to ease vendor plugin 
> development
> ------------------------------------------------------------------------------------
>
>                 Key: YARN-8851
>                 URL: https://issues.apache.org/jira/browse/YARN-8851
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: yarn
>            Reporter: Zhankun Tang
>            Assignee: Zhankun Tang
>            Priority: Major
>         Attachments: YARN-8851-WIP2-trunk.001.patch, 
> YARN-8851-WIP3-trunk.001.patch, YARN-8851-WIP4-trunk.001.patch, [YARN-8851] 
> YARN_New_Device_Plugin_Framework_Design_Proposal-3.pdf, [YARN-8851] 
> YARN_New_Device_Plugin_Framework_Design_Proposal.pdf
>
>
> At present, we support GPU/FPGA device in YARN through a native, coupling 
> way. But it's difficult for a vendor to implement such a device plugin 
> because the developer needs much knowledge of YARN internals. And this brings 
> burden to the community to maintain both YARN core and vendor-specific code.
> Here we propose a new device plugin framework to ease vendor device plugin 
> development and provide a more flexible way to integrate with YARN NM.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

Reply via email to