[ 
https://issues.apache.org/jira/browse/YARN-8851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16654594#comment-16654594
 ] 

Zhankun Tang edited comment on YARN-8851 at 10/18/18 3:56 AM:
--------------------------------------------------------------

[~leftnoteasy], for question 6, I'll try to answer here, and we can talk 
offline if anything is unclear.

The design principle I'm trying to follow is to keep the vendor completely 
agnostic of YARN internals. That is simpler for them and better for YARN's 
device plugin ecosystem. To be honest, I'm not sure whether this will bring 
unmanageable complexity for us, but my idea is as follows:

The vendor developer only needs to use the libraries YARN provides to describe 
the requirements of their devices. The *_DevicePlugin_* interface defines the 
hooks, which are the vendor's only opportunities to tell YARN what devices they 
have and how to use them. It is the only interface the vendor needs to know, 
and the specs can only be created with the library we provide.
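
To make the idea concrete, here is a minimal sketch of what such a vendor-facing 
contract could look like. All names in it (Device, DeviceRuntimeSpec, 
getDevices, onDevicesAllocated, FakeVendorPlugin) are assumptions for 
illustration, not the interface in the attached patches:

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Small demo: the vendor reports devices, then says how to use them.
final class DevicePluginSketch {
    public static void main(String[] args) {
        DevicePlugin plugin = new FakeVendorPlugin();
        DeviceRuntimeSpec spec = plugin.onDevicesAllocated(plugin.getDevices());
        System.out.println(spec.getDeviceMounts());
    }
}

// Hypothetical library type: describes one device the vendor exposes.
final class Device {
    private final int id;
    private final String devPath;

    Device(int id, String devPath) {
        this.id = id;
        this.devPath = devPath;
    }

    int getId() { return id; }

    String getDevPath() { return devPath; }
}

// Hypothetical library type: the spec that tells YARN how a container
// uses the devices. The vendor can only build it through this class.
final class DeviceRuntimeSpec {
    private final Set<String> deviceMounts = new LinkedHashSet<>();

    DeviceRuntimeSpec addDeviceMount(String path) {
        deviceMounts.add(path);
        return this;
    }

    Set<String> getDeviceMounts() { return deviceMounts; }
}

// The single contract the vendor implements (method names are assumptions).
interface DevicePlugin {
    // Hook 1: tell YARN which devices exist on this node.
    Set<Device> getDevices();

    // Hook 2: tell YARN how a container should use the allocated devices.
    DeviceRuntimeSpec onDevicesAllocated(Set<Device> allocated);
}

// A made-up vendor plugin that knows nothing about YARN internals.
final class FakeVendorPlugin implements DevicePlugin {
    @Override
    public Set<Device> getDevices() {
        Set<Device> devices = new LinkedHashSet<>();
        devices.add(new Device(0, "/dev/fake0"));
        devices.add(new Device(1, "/dev/fake1"));
        return devices;
    }

    @Override
    public DeviceRuntimeSpec onDevicesAllocated(Set<Device> allocated) {
        DeviceRuntimeSpec spec = new DeviceRuntimeSpec();
        for (Device d : allocated) {
            spec.addDeviceMount(d.getDevPath());
        }
        return spec;
    }
}
```

Note that the vendor code only touches the two library types and the one 
interface; nothing in it refers to the NM.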

Sorry that the *_DevicePluginAdapter_* name is confusing. This class acts as a 
bridge between the NM and the vendor plugin. When the NM wants to get devices, 
the DevicePluginAdapter delegates the request to the vendor plugin and returns 
the result. When the NM wants to use these devices, the DevicePluginAdapter 
allocates them, asks the vendor plugin how to use them, and tells YARN in 
YARN's own terms. DevicePluginAdapter has a one-to-one relationship with 
DevicePlugin: each DevicePlugin instance needs its own DevicePluginAdapter 
instance. So it's not a problem that the DevicePlugin interface does not 
resemble DevicePluginAdapter. The DevicePluginAdapter knows YARN internals 
well and should not be touched by the vendor.
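
The bridging role described above can be sketched roughly as follows. This is 
a simplified illustration, not the class in the patch: devices are reduced to 
plain paths, the method names (discover, allocate) are made up, and the 
"--device=" launch argument merely stands in for whatever YARN-side 
representation the real adapter produces.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Small demo of the two NM-facing operations.
final class AdapterSketch {
    public static void main(String[] args) {
        DevicePluginAdapter adapter = new DevicePluginAdapter(new FakePlugin());
        System.out.println("discovered: " + adapter.discover());
        System.out.println("launch args: " + adapter.allocate(1));
    }
}

// Vendor-facing contract, simplified so a device is just its path (assumption).
interface DevicePlugin {
    Set<String> getDevices();

    // Returns the paths a container must see to use the allocated devices.
    Set<String> onDevicesAllocated(Set<String> allocated);
}

// YARN-side bridge: exactly one adapter instance per plugin instance.
final class DevicePluginAdapter {
    private final DevicePlugin plugin;
    private final Set<String> inUse = new LinkedHashSet<>();

    DevicePluginAdapter(DevicePlugin plugin) {
        this.plugin = plugin;
    }

    // NM wants to get devices: delegate to the vendor plugin, return the result.
    Set<String> discover() {
        return new LinkedHashSet<>(plugin.getDevices());
    }

    // NM wants to use devices: the adapter picks free ones itself, asks the
    // vendor how to use them, and translates the answer into launch arguments.
    List<String> allocate(int count) {
        Set<String> chosen = new LinkedHashSet<>();
        for (String device : discover()) {
            if (chosen.size() == count) {
                break;
            }
            if (!inUse.contains(device)) {
                chosen.add(device);
            }
        }
        if (chosen.size() < count) {
            throw new IllegalStateException("not enough free devices");
        }
        inUse.addAll(chosen);

        List<String> launchArgs = new ArrayList<>();
        for (String path : plugin.onDevicesAllocated(chosen)) {
            launchArgs.add("--device=" + path); // hypothetical argument format
        }
        return launchArgs;
    }
}

// A made-up vendor plugin used only for the demo.
final class FakePlugin implements DevicePlugin {
    @Override
    public Set<String> getDevices() {
        Set<String> devices = new LinkedHashSet<>();
        devices.add("/dev/fake0");
        devices.add("/dev/fake1");
        return devices;
    }

    @Override
    public Set<String> onDevicesAllocated(Set<String> allocated) {
        return allocated;
    }
}
```

The bookkeeping of which devices are in use lives entirely in the adapter, 
which is why the plugin interface and the adapter do not need to look alike.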

Maybe "DevicePluginWrapper" or "ResourcePluginAdapter" would be a more suitable 
name?

 

For the device scheduler, I'm now using a shared device scheduler to handle 
all DevicePluginAdapter allocation requests before container launch. The 
various resource types are allocated one by one in this shared scheduler, 
which is, in essence, the same as the current independent scheduler inside 
each GPU/FPGA plugin.
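
As a rough illustration of a shared scheduler serving several resource types 
one request at a time, consider the sketch below. The class and method names 
are hypothetical, and the strict FIFO behaviour (a request that cannot be 
satisfied blocks the ones behind it) is an assumption of this sketch, not a 
statement about the patch.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Small demo: two resource types share one scheduler.
final class SchedulerSketch {
    public static void main(String[] args) {
        SharedDeviceScheduler scheduler = new SharedDeviceScheduler();
        scheduler.register("gpu", List.of(0, 1));
        scheduler.register("fpga", List.of(0));
        scheduler.submit("c1", "gpu", 1);
        scheduler.submit("c2", "fpga", 1);
        System.out.println(scheduler.drain());
    }
}

// One scheduler shared by all adapters (hypothetical sketch).
final class SharedDeviceScheduler {
    // Free device ids per resource type, e.g. "gpu" -> [0, 1].
    private final Map<String, Deque<Integer>> free = new HashMap<>();
    // Allocation requests in arrival order.
    private final Deque<Request> pending = new ArrayDeque<>();

    private static final class Request {
        final String containerId;
        final String resourceName;
        final int count;

        Request(String containerId, String resourceName, int count) {
            this.containerId = containerId;
            this.resourceName = resourceName;
            this.count = count;
        }
    }

    // Called once per adapter to hand its devices to the shared pool.
    void register(String resourceName, List<Integer> deviceIds) {
        free.computeIfAbsent(resourceName, k -> new ArrayDeque<>()).addAll(deviceIds);
    }

    void submit(String containerId, String resourceName, int count) {
        pending.addLast(new Request(containerId, resourceName, count));
    }

    // Serves requests strictly in arrival order and stops at the first one
    // that cannot be satisfied. Keys are "containerId:resourceName".
    Map<String, List<Integer>> drain() {
        Map<String, List<Integer>> granted = new LinkedHashMap<>();
        while (!pending.isEmpty()) {
            Request r = pending.peekFirst();
            Deque<Integer> pool = free.get(r.resourceName);
            if (pool == null || pool.size() < r.count) {
                break; // head-of-line request waits for devices to be freed
            }
            List<Integer> ids = new ArrayList<>();
            for (int i = 0; i < r.count; i++) {
                ids.add(pool.pollFirst());
            }
            granted.put(r.containerId + ":" + r.resourceName, ids);
            pending.pollFirst();
        }
        return granted;
    }
}
```

A topology-aware policy would replace the simple pollFirst() choice with one 
that inspects how the devices are connected.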

As for whether we should accept a vendor's customized scheduler, it's a good 
idea. But from my experience, I'd guess that a shared scheduler supporting FIFO 
and topology scheduling (topology can be described in _Device_; see the design 
doc) might be enough for most vendors in the long term.

 

 



> [Umbrella] A new pluggable device plugin framework to ease vendor plugin 
> development
> ------------------------------------------------------------------------------------
>
>                 Key: YARN-8851
>                 URL: https://issues.apache.org/jira/browse/YARN-8851
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: yarn
>            Reporter: Zhankun Tang
>            Assignee: Zhankun Tang
>            Priority: Major
>         Attachments: YARN-8851-WIP2-trunk.001.patch, 
> YARN-8851-WIP3-trunk.001.patch, YARN-8851-WIP4-trunk.001.patch, [YARN-8851] 
> YARN_New_Device_Plugin_Framework_Design_Proposal-3.pdf, [YARN-8851] 
> YARN_New_Device_Plugin_Framework_Design_Proposal.pdf
>
>
> At present, we support GPU/FPGA devices in YARN in a native, tightly coupled 
> way. But it's difficult for a vendor to implement such a device plugin 
> because the developer needs deep knowledge of YARN internals. This also 
> burdens the community with maintaining both YARN core and vendor-specific code.
> Here we propose a new device plugin framework to ease vendor device plugin 
> development and provide a more flexible way to integrate with the YARN NM.


