[ https://issues.apache.org/jira/browse/YARN-8220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16508408#comment-16508408 ]
Eric Yang commented on YARN-8220:
---------------------------------

[~leftnoteasy] Based on RunTensorflowJobUsingNativeServiceSpec.md, the code can be changed to:

{code}
{
  "name": "single-node-tensorflow",
  "version": "1.0.0",
  "components": [
    {
      "artifact": {
        "id": "<docker-image-name>",
        "type": "DOCKER"
      },
      "name": "worker",
      "dependencies": [],
      "resource": {
        "cpus": 1,
        "memory": "4096",
        "additional": {
          "yarn.io/gpu": {
            "value": 2
          }
        }
      },
      "launch_command": "--data-dir=hdfs://default/tmp/cifar-10-data,--job-dir=hdfs://default/tmp/cifar-10-jobdir,--num-gpus=1,--train-batch-size=16,--train-steps=40000",
      "number_of_containers": 1,
      "run_privileged_container": false,
      "configuration": {
        "env": {
          "HADOOP_HOME": "/hadoop-3.1.0",
          "HADOOP_HDFS_HOME": "",
          "HADOOP_YARN_HOME": "",
          "HADOOP_CONF_DIR": "/etc/hadoop/conf",
          "YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE": "true"
        }
      }
    }
  ],
  "kerberos_principal": {
    "principal_name": "test-u...@example.com",
    "keytab": "file:///etc/security/keytabs/test-user.headless.keytab"
  }
}
{code}

JAVA_HOME, LD_LIBRARY_PATH, and CLASSPATH can be defined in /etc/profile.d or in the Dockerfile, so they do not have to be specified externally. Likewise, {{cd /test/cifar10_estimator}} can be replaced with a WORKDIR directive in the Dockerfile. The Dockerfile defines:

{code}
WORKDIR /test/models/tutorials/image/cifar10_estimator
ENTRYPOINT ["/usr/bin/python", "cifar10_main.py"]
{code}

This would improve the readability of the configuration.

> Running Tensorflow on YARN with GPU and Docker - Examples
> ---------------------------------------------------------
>
>                 Key: YARN-8220
>                 URL: https://issues.apache.org/jira/browse/YARN-8220
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: yarn-native-services
>            Reporter: Sunil Govindan
>            Assignee: Sunil Govindan
>            Priority: Critical
>         Attachments: YARN-8220.001.patch, YARN-8220.002.patch
>
>
> Tensorflow could be run on YARN and could leverage YARN's distributed
> features.
> This spec file will help to run Tensorflow on YARN with GPU/Docker.