[jira] [Updated] (YARN-8799) [Submarine] Correct the default directory path in HDFS for "checkout_path"
[ https://issues.apache.org/jira/browse/YARN-8799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sunil Govindan updated YARN-8799:
---------------------------------
    Fix Version/s:     (was: 3.2.0)

> [Submarine] Correct the default directory path in HDFS for "checkout_path"
> --------------------------------------------------------------------------
>
>                 Key: YARN-8799
>                 URL: https://issues.apache.org/jira/browse/YARN-8799
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Zhankun Tang
>            Assignee: Zhankun Tang
>            Priority: Major
>
> {code:java}
> yarn jar $HADOOP_BASE_DIR/home/share/hadoop/yarn/hadoop-yarn-submarine-3.2.0-SNAPSHOT.jar job run \
>  -verbose \
>  -wait_job_finish \
>  -keep_staging_dir \
>  --env DOCKER_JAVA_HOME=/usr/lib/jvm/java-8-oracle \
>  --env DOCKER_HADOOP_HDFS_HOME=/hadoop-3.2.0-SNAPSHOT \
>  --name tf-job-001 \
>  --docker_image tangzhankun/tensorflow \
>  --input_path hdfs://default/user/yarn/cifar-10-data \
>  --worker_resources memory=4G,vcores=2 \
>  --worker_launch_cmd "cd /cifar10_estimator && python cifar10_main.py --data-dir=%input_path% --job-dir=%checkpoint_path% --num-gpus=0 --train-steps=5"
> {code}
>
> The above script should work, but in my testing the job failed because an
> invalid path was passed to "--job-dir". It should be a URI starting with
> "hdfs://".
> {code:java}
> 2018-09-19 23:19:34,729 INFO yarnservice.YarnServiceJobSubmitter: Worker command =[cd /cifar10_estimator && python cifar10_main.py --data-dir=hdfs://default/user/yarn/cifar-10-data --job-dir=submarine/jobs/tf-job-001/staging/checkpoint_path --num-gpus=0 --train-steps=2]
> {code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8799) [Submarine] Correct the default directory path in HDFS for "checkout_path"
[ https://issues.apache.org/jira/browse/YARN-8799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhankun Tang updated YARN-8799:
-------------------------------
    Description:
{code:java}
yarn jar $HADOOP_BASE_DIR/home/share/hadoop/yarn/hadoop-yarn-submarine-3.2.0-SNAPSHOT.jar job run \
 -verbose \
 -wait_job_finish \
 -keep_staging_dir \
 --env DOCKER_JAVA_HOME=/usr/lib/jvm/java-8-oracle \
 --env DOCKER_HADOOP_HDFS_HOME=/hadoop-3.2.0-SNAPSHOT \
 --name tf-job-001 \
 --docker_image tangzhankun/tensorflow \
 --input_path hdfs://default/user/yarn/cifar-10-data \
 --worker_resources memory=4G,vcores=2 \
 --worker_launch_cmd "cd /cifar10_estimator && python cifar10_main.py --data-dir=%input_path% --job-dir=%checkpoint_path% --num-gpus=0 --train-steps=5"
{code}

The above script should work, but in my testing the job failed because an invalid path was passed to "--job-dir". It should be a URI starting with "hdfs://".
{code:java}
2018-09-19 23:19:34,729 INFO yarnservice.YarnServiceJobSubmitter: Worker command =[cd /cifar10_estimator && python cifar10_main.py --data-dir=hdfs://default/user/yarn/cifar-10-data --job-dir=submarine/jobs/tf-job-001/staging/checkpoint_path --num-gpus=0 --train-steps=2]
{code}

  was:
It might be simpler for users to use "--checkout_path" if we provide a default path in HDFS for checkout_path. It could be under the current staging dir or another place that makes sense.
{code:java}
yarn jar $HADOOP_BASE_DIR/home/share/hadoop/yarn/hadoop-yarn-submarine-3.2.0-SNAPSHOT.jar job run \
 -verbose \
 -wait_job_finish \
 -keep_staging_dir \
 --env DOCKER_JAVA_HOME=/usr/lib/jvm/java-8-oracle \
 --env DOCKER_HADOOP_HDFS_HOME=/hadoop-3.2.0-SNAPSHOT \
 --name tf-job-001 \
 --docker_image tangzhankun/tensorflow \
 --input_path hdfs://default/user/yarn/cifar-10-data \
 --worker_resources memory=4G,vcores=2 \
 --worker_launch_cmd "cd /cifar10_estimator && python cifar10_main.py --data-dir=%input_path% --job-dir=%checkpoint_path% --num-gpus=0 --train-steps=5"
{code}
The above script should work, but in my testing the job failed because an invalid path was passed to "--job-dir". It should be a URI starting with "hdfs://".

> [Submarine] Correct the default directory path in HDFS for "checkout_path"
> --------------------------------------------------------------------------
>
>                 Key: YARN-8799
>                 URL: https://issues.apache.org/jira/browse/YARN-8799
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Zhankun Tang
>            Assignee: Zhankun Tang
>            Priority: Major
>             Fix For: 3.2.0
>
> {code:java}
> yarn jar $HADOOP_BASE_DIR/home/share/hadoop/yarn/hadoop-yarn-submarine-3.2.0-SNAPSHOT.jar job run \
>  -verbose \
>  -wait_job_finish \
>  -keep_staging_dir \
>  --env DOCKER_JAVA_HOME=/usr/lib/jvm/java-8-oracle \
>  --env DOCKER_HADOOP_HDFS_HOME=/hadoop-3.2.0-SNAPSHOT \
>  --name tf-job-001 \
>  --docker_image tangzhankun/tensorflow \
>  --input_path hdfs://default/user/yarn/cifar-10-data \
>  --worker_resources memory=4G,vcores=2 \
>  --worker_launch_cmd "cd /cifar10_estimator && python cifar10_main.py --data-dir=%input_path% --job-dir=%checkpoint_path% --num-gpus=0 --train-steps=5"
> {code}
>
> The above script should work, but in my testing the job failed because an
> invalid path was passed to "--job-dir". It should be a URI starting with
> "hdfs://".
> {code:java}
> 2018-09-19 23:19:34,729 INFO yarnservice.YarnServiceJobSubmitter: Worker command =[cd /cifar10_estimator && python cifar10_main.py --data-dir=hdfs://default/user/yarn/cifar-10-data --job-dir=submarine/jobs/tf-job-001/staging/checkpoint_path --num-gpus=0 --train-steps=2]
> {code}
[jira] [Updated] (YARN-8799) [Submarine] Correct the default directory path in HDFS for "checkout_path"
[ https://issues.apache.org/jira/browse/YARN-8799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhankun Tang updated YARN-8799:
-------------------------------
    Summary: [Submarine] Correct the default directory path in HDFS for "checkout_path"  (was: [Submarine] A default directory path in HDFS for "checkout_path"?)

> [Submarine] Correct the default directory path in HDFS for "checkout_path"
> --------------------------------------------------------------------------
>
>                 Key: YARN-8799
>                 URL: https://issues.apache.org/jira/browse/YARN-8799
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Zhankun Tang
>            Assignee: Zhankun Tang
>            Priority: Major
>             Fix For: 3.2.0
>
> It might be simpler for users to use "--checkout_path" if we provide a
> default path in HDFS for checkout_path. It could be under the current
> staging dir or another place that makes sense.
> {code:java}
> yarn jar $HADOOP_BASE_DIR/home/share/hadoop/yarn/hadoop-yarn-submarine-3.2.0-SNAPSHOT.jar job run \
>  -verbose \
>  -wait_job_finish \
>  -keep_staging_dir \
>  --env DOCKER_JAVA_HOME=/usr/lib/jvm/java-8-oracle \
>  --env DOCKER_HADOOP_HDFS_HOME=/hadoop-3.2.0-SNAPSHOT \
>  --name tf-job-001 \
>  --docker_image tangzhankun/tensorflow \
>  --input_path hdfs://default/user/yarn/cifar-10-data \
>  --worker_resources memory=4G,vcores=2 \
>  --worker_launch_cmd "cd /cifar10_estimator && python cifar10_main.py --data-dir=%input_path% --job-dir=%checkpoint_path% --num-gpus=0 --train-steps=5"
> {code}
> The above script should work, but in my testing the job failed because an
> invalid path was passed to "--job-dir". It should be a URI starting with
> "hdfs://".
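
The failure reported in this thread comes down to %checkpoint_path% being expanded to a path with no scheme. The sketch below illustrates the general idea of qualifying such a path before substituting it into the worker launch command. It is not Submarine's actual code: the function names `qualify` and `expand_launch_cmd` are hypothetical, and the base URI `hdfs://default/user/yarn` is only an assumption inferred from the --input_path used in the report.

```python
from urllib.parse import urlparse


def qualify(path: str, base_uri: str) -> str:
    """Return `path` unchanged if it already has a URI scheme,
    otherwise join it onto `base_uri` (hypothetical helper)."""
    if urlparse(path).scheme:
        return path
    return base_uri.rstrip("/") + "/" + path.lstrip("/")


def expand_launch_cmd(cmd: str, params: dict) -> str:
    """Replace each %name% placeholder in `cmd` with its value."""
    for name, value in params.items():
        cmd = cmd.replace(f"%{name}%", value)
    return cmd


# Assumed base URI; a real deployment would take this from its own config.
BASE_URI = "hdfs://default/user/yarn"

params = {
    "input_path": "hdfs://default/user/yarn/cifar-10-data",
    # The scheme-less value seen in the log above, qualified before use.
    "checkpoint_path": qualify(
        "submarine/jobs/tf-job-001/staging/checkpoint_path", BASE_URI
    ),
}

launch_cmd = expand_launch_cmd(
    "python cifar10_main.py --data-dir=%input_path% --job-dir=%checkpoint_path%",
    params,
)
```

With the path qualified first, the value passed to --job-dir starts with "hdfs://", which is what the report says the training script expects.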