[jira] [Commented] (YARN-9160) [Submarine] Document "PYTHONPATH" environment variable setting when using -localization options
[ https://issues.apache.org/jira/browse/YARN-9160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16735290#comment-16735290 ]

Hudson commented on YARN-9160:
------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #15716 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/15716/])
YARN-9160. [Submarine] Document 'PYTHONPATH' environment variable (wangda: rev 2c02aa6ec259128934cc5468cf66104a624d88a7)
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/src/site/markdown/QuickStart.md

> [Submarine] Document "PYTHONPATH" environment variable setting when using -localization options
> -----------------------------------------------------------------------------------------------
>
> Key: YARN-9160
> URL: https://issues.apache.org/jira/browse/YARN-9160
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Zhankun Tang
> Assignee: Zhankun Tang
> Priority: Major
> Fix For: 3.3.0, 3.2.1
>
> Attachments: YARN-9160-trunk.001.patch
>
>
> An infra platform might want to provide the user a Zeppelin notebook and
> execute the user's job with the user's command input, like "python entry_script.py ...".
> This is better for the end user, because "entry_script.py" appears to be in the local workbench.
> In the platform, this may translate to the submarine command below when submitting the job:
>
> {code:java}
> ... job run
>  --localization entry_script.py:./
>  --localization dependency_script1.py:./
>  --localization dependency_script2.py:./
>  --worker_launch_cmd "python entry_script.py ..."
> {code}
> Or
>
> {code:java}
> ... job run
>  --localization entry_script.py:./
>  --localization dependency_scripts_dir:./
>  --worker_launch_cmd "python entry_script.py ..."
> {code}
>
> Both of the above commands fail with a module import error raised from entry_script.py.
> This is because YARN only creates symbolic links in the container's work dir
> (the real script files are in different cache folders), and Python's module import
> does not know that.
> One possible solution is to localize a directory containing all the scripts and
> change the worker_launch_cmd to "cd scripts_dir && python entry_script.py".
> But this makes for a poor user experience, because it no longer feels like a local workbench.
> Another solution is to use the "PYTHONPATH" environment variable. This keeps the
> user experience good and needs no changes to YARN localization internals.
> {code:java}
> ... job run
>  # the entry point
>  --localization entry_script.py:/entry_script.py
>  # the dependency Python scripts of the entry point
>  --localization dependency_scripts_dir:/dependency_scripts_dir
>  # the PYTHONPATH env to make dependency available to entry script
>  --env PYTHONPATH="/dependency_scripts_dir"
>  --worker_launch_cmd "python /entry_script.py ..."
> {code}
> And we should document this.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
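The import failure and the PYTHONPATH workaround described in the issue can be reproduced locally without YARN. The sketch below is only a simulation under stated assumptions: the directory names (cache_a, cache_b, workdir) and the module name dependency_script1 are hypothetical stand-ins for YARN's cache folders and the user's scripts. It relies on the documented CPython behaviour that, when a script is run by path, the first entry of sys.path is the directory of the script with symbolic links resolved, so a symlinked entry script does not see sibling symlinks in the work dir.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def run_entry(workdir: Path, extra_env: dict) -> subprocess.CompletedProcess:
    """Run `python entry_script.py` from workdir, as a launch command would."""
    # Start from a clean slate so an inherited PYTHONPATH cannot mask the failure.
    env = {k: v for k, v in os.environ.items() if k != "PYTHONPATH"}
    env.update(extra_env)
    return subprocess.run(
        [sys.executable, "entry_script.py"],
        cwd=workdir, env=env, capture_output=True, text=True,
    )


def demo() -> tuple:
    """Return (exit code without PYTHONPATH, exit code with it, stdout with it)."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # Hypothetical stand-ins for separate cache folders and the container work dir.
        cache_a = root / "cache_a"
        deps_real = root / "cache_b" / "dependency_scripts_dir"
        workdir = root / "workdir"
        for d in (cache_a, deps_real, workdir):
            d.mkdir(parents=True)

        (cache_a / "entry_script.py").write_text(
            "import dependency_script1\nprint(dependency_script1.VALUE)\n"
        )
        (deps_real / "dependency_script1.py").write_text("VALUE = 'imported'\n")

        # The work dir contains only symbolic links to the real files.
        os.symlink(cache_a / "entry_script.py", workdir / "entry_script.py")
        os.symlink(deps_real, workdir / "dependency_scripts_dir",
                   target_is_directory=True)

        without = run_entry(workdir, {})
        with_path = run_entry(
            workdir, {"PYTHONPATH": str(workdir / "dependency_scripts_dir")}
        )
        return without.returncode, with_path.returncode, with_path.stdout.strip()


if __name__ == "__main__":
    rc_without, rc_with, out = demo()
    print("exit code without PYTHONPATH:", rc_without)
    print("exit code with PYTHONPATH:", rc_with)
    print("output with PYTHONPATH:", out)
```

In this simulation the first run fails on the import and the second succeeds, which matches the reasoning in the issue: setting PYTHONPATH to the localized dependency directory makes the modules importable regardless of where the real entry script lives.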
[jira] [Commented] (YARN-9160) [Submarine] Document "PYTHONPATH" environment variable setting when using -localization options
[ https://issues.apache.org/jira/browse/YARN-9160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16735285#comment-16735285 ]

Wangda Tan commented on YARN-9160:
----------------------------------

Committed to trunk, thanks [~tangzhankun].
[jira] [Commented] (YARN-9160) [Submarine] Document "PYTHONPATH" environment variable setting when using -localization options
[ https://issues.apache.org/jira/browse/YARN-9160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16735246#comment-16735246 ]

Wangda Tan commented on YARN-9160:
----------------------------------

Straightforward fix, +1. Thanks [~tangzhankun].
[jira] [Commented] (YARN-9160) [Submarine] Document "PYTHONPATH" environment variable setting when using -localization options
[ https://issues.apache.org/jira/browse/YARN-9160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16730229#comment-16730229 ]

Hadoop QA commented on YARN-9160:
---------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 13s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 30m 51s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 22s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 29s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 25s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 44m 12s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-9160 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12953203/YARN-9160-trunk.001.patch |
| Optional Tests | dupname asflicense mvnsite |
| uname | Linux 88d152296104 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 128f340 |
| maven | version: Apache Maven 3.3.9 |
| whitespace | https://builds.apache.org/job/PreCommit-YARN-Build/22953/artifact/out/whitespace-eol.txt |
| Max. process+thread count | 447 (vs. ulimit of 1) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/22953/console |
| Powered by | Apache Yetus 0.8.0 http://yetus.apache.org |

This message was automatically generated.

> [Submarine] Document "PYTHONPATH" environment variable setting when using -localization options
> -----------------------------------------------------------------------------------------------
>
> Key: YARN-9160
> URL: https://issues.apache.org/jira/browse/YARN-9160
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Zhankun Tang
> Assignee: Zhankun Tang
> Priority: Major
> Attachments: YARN-9160-trunk.001.patch
>
>
> An infra platform might want to provide the user a Zeppelin notebook and
> execute the user's job with the user's command input, like "python entry_script.py ...".
> This is better for the end user, because "entry_script.py" appears to be in the local workbench.
> In the platform, this may translate to the submarine command below when submitting the job:
>
> {code:java}
> ... job run
>  --localization entry_script.py:./
>  --localization dependency_script1.py:./
>  --localization dependency_script2.py:./
>  --worker_launch_cmd "python entry_script.py ..."
> {code}
> Or
>
> {code:java}
> ... job run
>  --localization entry_script.py:./
>  --localization dependency_scripts_dir:./
>  --worker_launch_cmd "python entry_script.py ..."
> {code}
>
> Both of the above commands fail with a module import error raised from entry_script.py.
> This is because YARN only creates symbolic links in the container's work dir
> (the real script files are in different cache folders), and Python's module import
> does not know that.
> One possible solution is s