[ https://issues.apache.org/jira/browse/YARN-8561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16574994#comment-16574994 ]
Sunil Govindan commented on YARN-8561:
--------------------------------------

Thanks [~leftnoteasy]

>> I'm not quite sure about this suggestion, it seems to me that we should add
>> the getServiceResourceFromYarnResource method to service.Resource instead. I
>> don't want to touch any service classes in this patch. Should we do it in a
>> separate JIRA?

Yes. This makes sense.

>> I think we can push this to the future patch, one possible solution is to
>> include a yaml file to describe job configs and user can reuse it instead of
>> passing 10+ params to CLI. This could be a followup jira.

I agree.

A few more comments:

1. In the {{Cli}} class, could we use Options for CLI parsing? This will make parsing and adding help text much easier.

2. We have a bunch of constants defined in {{CliConstants}}, and going forward we will add more here. I am not sure whether this is a good idea. Could we load these from a config file, something like resource-types.xml, where all such commands can be defined? A new command could then be added with a definition in this spec file, without a code change.

3. RemoteDirectoryManager holds information about dirs and the fs. Currently it is HDFS only; could we also support local files?

4. CliUtils#replacePatternsInLaunchCommand
{code:java}
String newCli = specifiedCli;
for (Map.Entry<String, String> replace : replacePattern.entrySet()) {
  newCli = newCli.replace(replace.getKey(), replace.getValue());
}{code}
I didn't understand this very clearly. We want to replace each key found in specifiedCli with its value to produce newCli, correct? The key will be the string which starts with %.

5. I think StringUtils#strip is better for trimming chars like '[' and ']' in {{parseResourcesString}}.

6. {{!resource.matches("^[^=]+=\\d+\\s?\\w*$")}} In parseResourcesString, it's better to put this regex in a constant string.

7. Yes, UnitsConversionUtil is more exhaustive and might have a slightly different meaning. But could we refactor the unit conversion code in parseResourcesString into a util, as I think it might help some other apps too?

8. Resource profile support in the cli? It may help to avoid specifying whole resources.

9. The Submarine cli might need to support app priority, app timeout, queue mapping etc.?

10. Submarine Kerberos support will be in a subsequent jira, correct?

11. In the cli, CliConstants.ENV helps to add envs to a job. But we also need to specify ENVs at the per-component level, correct?

12. Checksum verification may be needed for FSBasedSubmarineStorageImpl.

13. I think we need a translation from ServiceState to JobState, because more and more states are being added to native service, and it will be tough to map them to submarine.

14. In general I think this is a great effort in getting the cli, job tracking, store etc. into one framework. Parameter validation and error handling are always tougher, but is there a way to cleanly surface errors that come from native service (for example, the image doesn't exist, is corrupted, or there is no permission) at the submarine level?

> [Submarine] Add initial implementation: training job submission and job history retrieve.
> -----------------------------------------------------------------------------------------
>
> Key: YARN-8561
> URL: https://issues.apache.org/jira/browse/YARN-8561
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Wangda Tan
> Assignee: Wangda Tan
> Priority: Major
> Attachments: YARN-8561.001.patch, YARN-8561.002.patch, YARN-8561.003.patch, YARN-8561.004.patch
>
>
> Added following parts:
> 1) New subcomponent of YARN, under applications/ project.
> 2) Tensorflow training job submission, including training (single node and distributed).
> - Supported Docker container.
> - Support GPU isolation.
> - Support YARN registry DNS.
> 3) Retrieve job history.
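To make the comments on CliUtils#replacePatternsInLaunchCommand and parseResourcesString concrete, here is a minimal sketch of how those pieces could look. It is not the code from the patch: the class name {{CliSketch}}, the method names, the map-of-strings return type and the {{%input_path%}} placeholder are all assumptions made for illustration. Only the replace loop and the regex are taken from the comment above. The bracket stripping is hand-written so the sketch needs nothing beyond the JDK; {{StringUtils.strip(str, "[]")}} from Apache Commons Lang should behave the same way.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration only; not the code from the patch. */
final class CliSketch {

  // The validation regex kept in one named, precompiled constant.
  static final Pattern RESOURCE_PAIR = Pattern.compile("^[^=]+=\\d+\\s?\\w*$");

  private CliSketch() {
  }

  /** Replaces every key of replacePattern found in specifiedCli with its value. */
  static String replacePatterns(String specifiedCli,
      Map<String, String> replacePattern) {
    String newCli = specifiedCli;
    for (Map.Entry<String, String> replace : replacePattern.entrySet()) {
      newCli = newCli.replace(replace.getKey(), replace.getValue());
    }
    return newCli;
  }

  /** Strips any run of '[' or ']' characters from both ends of s. */
  static String stripBrackets(String s) {
    final String stripChars = "[]";
    int start = 0;
    int end = s.length();
    while (start < end && stripChars.indexOf(s.charAt(start)) >= 0) {
      start++;
    }
    while (end > start && stripChars.indexOf(s.charAt(end - 1)) >= 0) {
      end--;
    }
    return s.substring(start, end);
  }

  /**
   * Splits "[name=amount, ...]" into name/amount pairs. The amount keeps its
   * unit suffix as a string; unit conversion is deliberately left out.
   */
  static Map<String, String> parseResources(String resourcesStr) {
    Map<String, String> result = new LinkedHashMap<>();
    for (String resource : stripBrackets(resourcesStr.trim()).split(",")) {
      resource = resource.trim();
      if (!RESOURCE_PAIR.matcher(resource).matches()) {
        throw new IllegalArgumentException(
            "\"" + resource + "\" is not a valid name=amount pair");
      }
      int eq = resource.indexOf('=');
      result.put(resource.substring(0, eq).trim(),
          resource.substring(eq + 1).trim());
    }
    return result;
  }
}
```

With these assumptions, {{CliSketch.replacePatterns(cmd, patterns)}} returns the command with each {{%...%}} key substituted, and {{CliSketch.parseResources("[memory=2G, vcores=2]")}} yields the pairs in input order and rejects an entry such as {{memory=}}.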