[ 
https://issues.apache.org/jira/browse/YARN-8881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16685355#comment-16685355
 ] 

Zhankun Tang edited comment on YARN-8881 at 11/13/18 3:20 PM:
--------------------------------------------------------------

[~sunilg], Many thanks for the review!
{quote}In Device class, any reason why using Boxed type Integer. It may be 
better to use int
{quote}
 Zhankun => Yeah. int is better for performance. Here the integer can do a null 
check to ensure the plugin to set it. If the plugin doesn't set it, int would 
lead to 0 by default. We cannot distinguish whether it's intentional or not.
{quote}DevicePlugin is placed in hadoop-yarn-server-nodemanager package. Is 
that correct?
{quote}
Zhankun => DevicePlugin.java is placed in nodemanager package.  Do you mean the 
real vendor device plugin? We haven't decided whether to put the vendor plugin 
maven project.
{quote}DevicePlugin: do we need validation interface api to ensure basic sanity 
is done during allocation?
{quote}
Zhankun => Yeah. For now the "checkInterfaceCompatibility(DevicePlugin.class, 
pluginClazz)" will do a very basic check to ensure the interfaces implemented 
is correct. More complex checking like if the plugin methods are stateless, 
return result .etc is valid will be wrapped into a new specific class in the 
future. And can be used as a seperate tool. Please see YARN-8946.
{quote}Why DeviceRegisterRequest need to serializable?
{quote}
Zhankun => Good point. I did this incase we have serialization in the future. 
This is overkill. I'll delete it.
{quote}who will really invoke adapter.getNodeResourceHandlerInstance() ?
{quote}
Zhankun => It's the "NodeStatusUpdaterImpl"#serviceInit.
{quote}DevicePluginAdapter cleanup is missing? Does it need to be handled by 
ResourcePluginManager?
{quote}
Zhankun => It's not missing. We haven't seen scenarios that need logic here. So 
it's empty.
{quote}In below case, do we need to throw exception?

if (pluginMap.containsKey(resourceName))
 Unknown macro: \{          throw new YarnRuntimeException(resourceName         
    + " already registered! Please change resource type name"); 162       }
{quote}
Zhankun => Yeah. I'm afraid so. This indicates that the plugin registered a 
resourceName that already exists. We could just log an error and ignore it but 
I'm afraid that the admin would think his configuration or plugin is working 
correctly although it's not. So we throw an exception to tell that there's 
something wrong and the admin needs to know this.

The error message is not good enough, I'll refine it to "... Please change 
resource type name for <pluginClass> or configure correct resource name in 
resource-types.xml".


was (Author: tangzhankun):
[~sunilg], Many thanks for the review!
{quote}In Device class, any reason why using Boxed type Integer. It may be 
better to use int
{quote}
 Zhankun => Yeah. int is better for performance. Here the integer can do a null 
check to ensure the plugin to set it. If the plugin doesn't set it, int would 
lead to 0 by default. We cannot distinguish whether it's intentional or not.
{quote}DevicePlugin is placed in hadoop-yarn-server-nodemanager package. Is 
that correct?
{quote}
Zhankun => DevicePlugin.java is placed in nodemanager package.  Do you mean the 
real vendor device plugin? We haven't decided whether to put the vendor plugin 
maven project.
{quote}DevicePlugin: do we need validation interface api to ensure basic sanity 
is done during allocation?
{quote}
Zhankun => Yeah. For now the "checkInterfaceCompatibility(DevicePlugin.class, 
pluginClazz)" will do a very basic check to ensure the interfaces implemented 
is correct. More complex checking like if the plugin methods are stateless, 
return result .etc is valid will be wrapped into a new specific class in the 
future. And can be used as a seperate tool. Please see YARN-8946.
{quote}Why DeviceRegisterRequest need to serializable?
{quote}
Zhankun => Good point. I did this incase we have serialization in the future. 
This is overkill. I'll delete it.
{quote}who will really invoke adapter.getNodeResourceHandlerInstance() ?
{quote}
Zhankun => It's the "NodeStatusUpdaterImpl"#serviceInit.
{quote}DevicePluginAdapter cleanup is missing? Does it need to be handled by 
ResourcePluginManager?
{quote}
Zhankun => It's not missing. We haven't seen scenarios that need logic here. So 
it's empty.
{quote}In below case, do we need to throw exception?

if (pluginMap.containsKey(resourceName))
 Unknown macro: \{          throw new YarnRuntimeException(resourceName         
    + " already registered! Please change resource type name"); 162       }
{quote}
Zhankun => Yeah. I'm afraid so. This indicates that the plugin registered a 
resourceName that already exists. We could just log an error and ignore it but 
I'm afraid that the admin would think his configuration or plugin is working 
correctly although it's not. So we throw an exception to tell that there's 
something wrong and the admin needs to know this.

The error message is not good enough, I'll refine it to "... Please change 
resource type name for <pluginClass> or configure correct resource name for the 
plugin".

> Phase 1 - Add basic pluggable device plugin framework
> -----------------------------------------------------
>
>                 Key: YARN-8881
>                 URL: https://issues.apache.org/jira/browse/YARN-8881
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Zhankun Tang
>            Assignee: Zhankun Tang
>            Priority: Major
>         Attachments: YARN-8881-trunk.001.patch, YARN-8881-trunk.002.patch, 
> YARN-8881-trunk.003.patch, YARN-8881-trunk.004.patch, 
> YARN-8881-trunk.005.patch, YARN-8881-trunk.006.patch, 
> YARN-8881-trunk.007.patch, YARN-8881-trunk.008.patch
>
>
> It includes adding support in "ResourcePluginManager" to load plugin classes 
> based on configuration, an interface for the vendor to implement and the 
> adapter to decouple plugin and YARN internals. And the vendor device resource 
> discovery will be ready after this support



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

Reply via email to