[ 
https://issues.apache.org/jira/browse/YARN-9699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16938306#comment-16938306
 ] 

Peter Bacsko edited comment on YARN-9699 at 9/26/19 6:14 AM:
-------------------------------------------------------------

Had a discussion with [~sunilg], [~Prabhu Joseph], [~snemeth].

A couple of things to add:
* Development of this tool should happen in stages. The stage #1 converter will 
not support every FS feature/property, simply because some of them are missing 
in CS. Those which are missing should be added gradually (see YARN-9840 and 
YARN-9841 for example). What we have in the POC is already a good starting point.

* Users should be able to define a "rule" file, which imposes certain limits on 
various things (e.g. no more than 100 queues are allowed on the same level) and 
also defines what should happen if the tool encounters an unsupported feature. 
For example, CS does not support max running apps per user, so we can have the 
following settings:
{noformat}
maximumQueuesPerLevel=100
maxAppsPerUser=warning
{noformat}
In this case, "warning" means that the user will be warned that this particular 
setting is not supported in CS and won't be migrated. Another possible setting 
could be "error", which aborts the conversion immediately with an error message.

* We also need strict validation of certain things: the sum of capacities must 
be 100.0 (unless a capacity is defined as a mem/vcore pair) and no two leaf 
queues are allowed to have the same name.

* [~sunilg]'s idea is that only {{capacity-scheduler.xml}} should be generated 
and not {{yarn-site.xml}} (only the scheduler class should be changed in this 
file). Looking at the current settings and mappings, I believe this is not 
possible, because there are properties that should be placed in 
{{yarn-site.xml}} - see the {{convertSiteProperties()}} method in the POC. Even 
if those properties can reside in {{capacity-scheduler.xml}} (someone must 
confirm this), their FS counterparts should be removed from the site config.

* Output of the tool is most likely going to be one or more files, but having 
stdout as an option is preferred.
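
To make the rule file and the strict validation points above more concrete, here is a rough Java sketch of how they could fit together. This is not code from the POC: the class name, method names and exception type are all hypothetical, and two behaviours are my own assumptions (a feature with no rule defaults to "warning", and an unset {{maximumQueuesPerLevel}} means no limit). The capacity check only covers percentage capacities; the caller would have to skip it for queues that use a mem/vcore pair.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.Set;

/** Hypothetical sketch; none of these names come from the POC patch. */
class ConversionRulesSketch {

    /** Thrown when a rule is set to "error" or a strict check fails. */
    static class ConversionException extends RuntimeException {
        ConversionException(String message) {
            super(message);
        }
    }

    private final Properties rules = new Properties();
    private final List<String> warnings = new ArrayList<>();

    /** Parses rule file content in key=value form, as in the example above. */
    ConversionRulesSketch(String ruleFileContent) {
        try {
            rules.load(new StringReader(ruleFileContent));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    List<String> getWarnings() {
        return warnings;
    }

    /** Applies the configured action for an FS feature that CS lacks. */
    void handleUnsupported(String ruleKey) {
        // Assumption: a feature without an explicit rule defaults to "warning".
        String action = rules.getProperty(ruleKey, "warning").trim();
        String msg = ruleKey + " is not supported in CS and will not be migrated";
        if ("error".equals(action)) {
            throw new ConversionException(msg);
        } else if ("warning".equals(action)) {
            warnings.add(msg);
        } else {
            throw new ConversionException(
                "Unknown action '" + action + "' for rule " + ruleKey);
        }
    }

    /** Enforces maximumQueuesPerLevel for the children of one parent queue. */
    void checkQueueCount(String parent, int childCount) {
        String raw = rules.getProperty("maximumQueuesPerLevel");
        if (raw == null) {
            return; // Assumption: no limit when the rule is not set.
        }
        int max = Integer.parseInt(raw.trim());
        if (childCount > max) {
            throw new ConversionException(parent + " has " + childCount
                + " child queues, the limit is " + max);
        }
    }

    /** Strict validation: percentage capacities of siblings sum to 100.0. */
    static void checkCapacitySum(String parent, Map<String, Double> capacities) {
        double sum = 0.0;
        for (double capacity : capacities.values()) {
            sum += capacity;
        }
        // Small tolerance for floating-point rounding.
        if (Math.abs(sum - 100.0) > 1e-3) {
            throw new ConversionException(
                "Capacities under " + parent + " sum to " + sum + ", expected 100.0");
        }
    }

    /** Strict validation: no two leaf queues may share the same name. */
    static void checkUniqueLeafNames(List<String> leafQueueNames) {
        Set<String> seen = new HashSet<>();
        for (String name : leafQueueNames) {
            if (!seen.add(name)) {
                throw new ConversionException("Duplicate leaf queue name: " + name);
            }
        }
    }
}
```

With the two-line rule file shown earlier, {{handleUnsupported("maxAppsPerUser")}} would only record a warning, while changing the value to "error" would abort with an exception. Collecting the warnings in a list rather than printing them right away keeps the choice between file and stdout output open.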

Not sure if I missed something, feel free to correct me if I'm wrong.



> Migration tool that help to generate CS configs based on FS
> -----------------------------------------------------------
>
>                 Key: YARN-9699
>                 URL: https://issues.apache.org/jira/browse/YARN-9699
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Wanqiang Ji
>            Assignee: Gergely Pollak
>            Priority: Major
>         Attachments: FS_to_CS_migration_POC.patch
>
>



