Hi Manuel,

It looks like you are using Spark's virtualenv support, which creates a
Python environment on each executor.

>>> --conf spark.pyspark.virtualenv.bin.path=/home/mansop/hail-test/python-2.7.2/bin/activate \
This configuration is not correct: spark.pyspark.virtualenv.bin.path should
point to the virtualenv executable itself, not to the activate script, and
virtualenv must be installed on every node of the cluster. The following
link explains how to use virtualenv with PySpark in more detail:

https://community.hortonworks.com/articles/104949/using-virtualenv-with-pyspark-1.html
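
For example, a cluster-mode submission along the following lines should work.
This is only a sketch: /usr/bin/virtualenv below is an assumed location, so
point it at wherever the virtualenv executable actually lives on your nodes.

spark-submit --master yarn \
    --deploy-mode cluster \
    --conf spark.pyspark.virtualenv.enabled=true \
    --conf spark.pyspark.virtualenv.type=native \
    --conf spark.pyspark.virtualenv.requirements=/home/mansop/requirements.txt \
    --conf spark.pyspark.virtualenv.bin.path=/usr/bin/virtualenv \
    --jars $HAIL_HOME/build/libs/hail-all-spark.jar \
    --py-files $HAIL_HOME/build/distributions/hail-python.zip \
    test.py

With type=native, each container runs the virtualenv binary locally to create
a temporary environment and installs the packages from requirements.txt into
it, so both the virtualenv binary and the Python it wraps (2.7 in your case)
must exist at the same path on every node; a path under /home/mansop, or an
activate script, will not be found on the workers. If you also set
spark.pyspark.python, the interpreter it points to must likewise be present
on every node.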



On Tue, Jan 16, 2018 at 8:02 AM, Manuel Sopena Ballesteros <manuel...@garvan.org.au> wrote:

> Apologies, I copied the wrong spark-submit output from the cluster run.
> Please find the correct output for my question below:
>
>
>
> -bash-4.1$ spark-submit --master yarn \
>
> >     --deploy-mode cluster \
>
> >     --driver-memory 4g \
>
> >     --executor-memory 2g \
>
> >     --executor-cores 4 \
>
> >     --queue default \
>
> >     --conf spark.pyspark.virtualenv.enabled=true \
>
> >     --conf spark.pyspark.virtualenv.type=native \
>
> >     --conf
> spark.pyspark.virtualenv.requirements=/home/mansop/requirements.txt \
>
> >     --conf
> spark.pyspark.virtualenv.bin.path=/home/mansop/hail-test/python-2.7.2/bin/activate
> \
>
> >     --jars $HAIL_HOME/build/libs/hail-all-spark.jar \
>
> >     --py-files $HAIL_HOME/build/distributions/hail-python.zip \
>
> >     test.py
>
>
>
> 18/01/16 10:42:49 WARN NativeCodeLoader: Unable to load native-hadoop
> library for your platform... using builtin-java classes where applicable
>
> 18/01/16 10:42:50 WARN DomainSocketFactory: The short-circuit local reads
> feature cannot be used because libhadoop cannot be loaded.
>
> 18/01/16 10:42:50 INFO RMProxy: Connecting to ResourceManager at
> wp-hdp-ctrl03-mlx.mlx/10.0.1.206:8050
>
> 18/01/16 10:42:50 INFO Client: Requesting a new application from cluster
> with 4 NodeManagers
>
> 18/01/16 10:42:50 INFO Client: Verifying our application has not requested
> more than the maximum memory capability of the cluster (450560 MB per
> container)
>
> 18/01/16 10:42:50 INFO Client: Will allocate AM container, with 4505 MB
> memory including 409 MB overhead
>
> 18/01/16 10:42:50 INFO Client: Setting up container launch context for our
> AM
>
> 18/01/16 10:42:50 INFO Client: Setting up the launch environment for our
> AM container
>
> 18/01/16 10:42:50 INFO Client: Preparing resources for our AM container
>
> 18/01/16 10:42:51 INFO Client: Use hdfs cache file as spark.yarn.archive
> for HDP,
> hdfsCacheFile:hdfs://wp-hdp-ctrl01-mlx.mlx:8020/hdp/apps/2.6.3.0-235/spark2/spark2-hdp-yarn-archive.tar.gz
>
> 18/01/16 10:42:51 INFO Client: Source and destination file systems are the
> same. Not copying
> hdfs://wp-hdp-ctrl01-mlx.mlx:8020/hdp/apps/2.6.3.0-235/spark2/spark2-hdp-yarn-archive.tar.gz
>
> 18/01/16 10:42:51 INFO Client: Uploading resource
> file:/home/mansop/hail-test2/hail/build/libs/hail-all-spark.jar ->
> hdfs://wp-hdp-ctrl01-mlx.mlx:8020/user/mansop/.sparkStaging/application_1512016123441_0045/hail-all-spark.jar
>
> 18/01/16 10:42:51 INFO Client: Uploading resource
> file:/home/mansop/requirements.txt ->
> hdfs://wp-hdp-ctrl01-mlx.mlx:8020/user/mansop/.sparkStaging/application_1512016123441_0045/requirements.txt
>
> 18/01/16 10:42:51 INFO Client: Uploading resource
> file:/home/mansop/test.py ->
> hdfs://wp-hdp-ctrl01-mlx.mlx:8020/user/mansop/.sparkStaging/application_1512016123441_0045/test.py
>
> 18/01/16 10:42:51 INFO Client: Uploading resource
> file:/usr/hdp/2.6.3.0-235/spark2/python/lib/pyspark.zip ->
> hdfs://wp-hdp-ctrl01-mlx.mlx:8020/user/mansop/.sparkStaging/application_1512016123441_0045/pyspark.zip
>
> 18/01/16 10:42:51 INFO Client: Uploading resource
> file:/usr/hdp/2.6.3.0-235/spark2/python/lib/py4j-0.10.4-src.zip ->
> hdfs://wp-hdp-ctrl01-mlx.mlx:8020/user/mansop/.sparkStaging/application_1512016123441_0045/py4j-0.10.4-src.zip
>
> 18/01/16 10:42:51 INFO Client: Uploading resource
> file:/home/mansop/hail-test2/hail/build/distributions/hail-python.zip ->
> hdfs://wp-hdp-ctrl01-mlx.mlx:8020/user/mansop/.sparkStaging/application_1512016123441_0045/hail-python.zip
>
> 18/01/16 10:42:52 INFO Client: Uploading resource
> file:/tmp/spark-592e7e0f-6faa-4c3c-ab0f-7dd1cff21d17/__spark_conf__8493747840734310444.zip
> ->
> hdfs://wp-hdp-ctrl01-mlx.mlx:8020/user/mansop/.sparkStaging/application_1512016123441_0045/__spark_conf__.zip
>
> 18/01/16 10:42:52 INFO SecurityManager: Changing view acls to: mansop
>
> 18/01/16 10:42:52 INFO SecurityManager: Changing modify acls to: mansop
>
> 18/01/16 10:42:52 INFO SecurityManager: Changing view acls groups to:
>
> 18/01/16 10:42:52 INFO SecurityManager: Changing modify acls groups to:
>
> 18/01/16 10:42:52 INFO SecurityManager: SecurityManager: authentication
> disabled; ui acls disabled; users  with view permissions: Set(mansop);
> groups with view permissions: Set(); users  with modify permissions:
> Set(mansop); groups with modify permissions: Set()
>
> 18/01/16 10:42:52 INFO Client: Submitting application
> application_1512016123441_0045 to ResourceManager
>
> 18/01/16 10:42:52 INFO YarnClientImpl: Submitted application
> application_1512016123441_0045
>
> 18/01/16 10:42:53 INFO Client: Application report for
> application_1512016123441_0045 (state: ACCEPTED)
>
> 18/01/16 10:42:53 INFO Client:
>
>          client token: N/A
>
>          diagnostics: AM container is launched, waiting for AM container
> to Register with RM
>
>          ApplicationMaster host: N/A
>
>          ApplicationMaster RPC port: -1
>
>          queue: default
>
>          start time: 1516059772092
>
>          final status: UNDEFINED
>
>          tracking URL:
> http://wp-hdp-ctrl03-mlx.mlx:8088/proxy/application_1512016123441_0045/
>
>          user: mansop
>
> 18/01/16 10:42:54 INFO Client: Application report for
> application_1512016123441_0045 (state: ACCEPTED)
>
> 18/01/16 10:42:55 INFO Client: Application report for
> application_1512016123441_0045 (state: ACCEPTED)
>
> 18/01/16 10:42:56 INFO Client: Application report for
> application_1512016123441_0045 (state: ACCEPTED)
>
> 18/01/16 10:42:57 INFO Client: Application report for
> application_1512016123441_0045 (state: ACCEPTED)
>
> 18/01/16 10:42:58 INFO Client: Application report for
> application_1512016123441_0045 (state: ACCEPTED)
>
> 18/01/16 10:42:59 INFO Client: Application report for
> application_1512016123441_0045 (state: ACCEPTED)
>
> 18/01/16 10:43:00 INFO Client: Application report for
> application_1512016123441_0045 (state: ACCEPTED)
>
> 18/01/16 10:43:01 INFO Client: Application report for
> application_1512016123441_0045 (state: FAILED)
>
> 18/01/16 10:43:01 INFO Client:
>
>          client token: N/A
>
>          diagnostics: Application application_1512016123441_0045 failed 2
> times due to AM Container for appattempt_1512016123441_0045_000002 exited
> with  exitCode: 15
>
> For more detailed output, check the application tracking page:
> http://wp-hdp-ctrl03-mlx.mlx:8088/cluster/app/application_1512016123441_0045
> Then click on links to logs of each attempt.
>
> Diagnostics: Exception from container-launch.
>
> Container id: container_1512016123441_0045_02_000001
>
> Exit code: 15
>
>
>
> Container exited with a non-zero exit code 15. Error file: prelaunch.err.
>
> Last 4096 bytes of prelaunch.err :
>
> Last 4096 bytes of stderr :
>
> 52)
>
>         at org.apache.spark.deploy.PythonRunner.main(PythonRunner.scala)
>
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>
>         at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>
>         at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>
>         at java.lang.reflect.Method.invoke(Method.java:498)
>
>         at
> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$3.run(ApplicationMaster.scala:646)
>
> Caused by: java.io.IOException: error=2, No such file or directory
>
>         at java.lang.UNIXProcess.forkAndExec(Native Method)
>
>         at java.lang.UNIXProcess.<init>(UNIXProcess.java:247)
>
>         at java.lang.ProcessImpl.start(ProcessImpl.java:134)
>
>         at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
>
>         ... 9 more
>
> 18/01/16 10:43:00 INFO ApplicationMaster: Final app status: FAILED,
> exitCode: 15, (reason: User class threw exception: java.io.IOException:
> Cannot run program
> "/d0/hadoop/yarn/local/usercache/mansop/appcache/application_1512016123441_0045/container_1512016123441_0045_02_000001/tmp/1516059780057-0/bin/python":
> error=2, No such file or directory)
>
> 18/01/16 10:43:00 ERROR ApplicationMaster: Uncaught exception:
>
> org.apache.spark.SparkException: Exception thrown in awaitResult:
>
>         at
> org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:205)
>
>         at
> org.apache.spark.deploy.yarn.ApplicationMaster.runDriver(ApplicationMaster.scala:423)
>
>         at
> org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:282)
>
>         at
> org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$main$1.apply$mcV$sp(ApplicationMaster.scala:768)
>
>         at
> org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:67)
>
>         at
> org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:66)
>
>         at java.security.AccessController.doPrivileged(Native Method)
>
>        at javax.security.auth.Subject.doAs(Subject.java:422)
>
>         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
>
>         at
> org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:66)
>
>         at
> org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:766)
>
>         at
> org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
>
> Caused by: java.io.IOException: Cannot run program
> "/d0/hadoop/yarn/local/usercache/mansop/appcache/application_1512016123441_0045/container_1512016123441_0045_02_000001/tmp/1516059780057-0/bin/python":
> error=2, No such file or directory
>
>         at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
>
>         at
> org.apache.spark.api.python.VirtualEnvFactory.execCommand(VirtualEnvFactory.scala:103)
>
>         at
> org.apache.spark.api.python.VirtualEnvFactory.setupVirtualEnv(VirtualEnvFactory.scala:91)
>
>         at
> org.apache.spark.deploy.PythonRunner$.main(PythonRunner.scala:52)
>
>         at org.apache.spark.deploy.PythonRunner.main(PythonRunner.scala)
>
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>
>         at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>
>         at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>
>         at java.lang.reflect.Method.invoke(Method.java:498)
>
>         at
> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$3.run(ApplicationMaster.scala:646)
>
> Caused by: java.io.IOException: error=2, No such file or directory
>
>         at java.lang.UNIXProcess.forkAndExec(Native Method)
>
>         at java.lang.UNIXProcess.<init>(UNIXProcess.java:247)
>
>         at java.lang.ProcessImpl.start(ProcessImpl.java:134)
>
>         at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
>
>         ... 9 more
>
> 18/01/16 10:43:00 INFO ApplicationMaster: Unregistering ApplicationMaster
> with FAILED (diag message: User class threw exception: java.io.IOException:
> Cannot run program
> "/d0/hadoop/yarn/local/usercache/mansop/appcache/application_1512016123441_0045/container_1512016123441_0045_02_000001/tmp/1516059780057-0/bin/python":
> error=2, No such file or directory)
>
> 18/01/16 10:43:00 INFO ApplicationMaster: Deleting staging directory
> hdfs://wp-hdp-ctrl01-mlx.mlx:8020/user/mansop/.sparkStaging/application_1512016123441_0045
>
> 18/01/16 10:43:00 INFO ShutdownHookManager: Shutdown hook called
>
>
>
> Failing this attempt. Failing the application.
>
>          ApplicationMaster host: N/A
>
>          ApplicationMaster RPC port: -1
>
>          queue: default
>
>          start time: 1516059772092
>
>          final status: FAILED
>
>          tracking URL:
> http://wp-hdp-ctrl03-mlx.mlx:8088/cluster/app/application_1512016123441_0045
>
>          user: mansop
>
> Exception in thread "main" org.apache.spark.SparkException: Application
> application_1512016123441_0045 finished with failed status
>
>         at org.apache.spark.deploy.yarn.Client.run(Client.scala:1187)
>
>         at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1233)
>
>         at org.apache.spark.deploy.yarn.Client.main(Client.scala)
>
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>
>         at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>
>         at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>
>         at java.lang.reflect.Method.invoke(Method.java:498)
>
>         at
> org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:782)
>
>         at
> org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
>
>         at
> org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
>
>         at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
>
>         at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>
> 18/01/16 10:43:01 INFO ShutdownHookManager: Shutdown hook called
>
> 18/01/16 10:43:01 INFO ShutdownHookManager: Deleting directory
> /tmp/spark-592e7e0f-6faa-4c3c-ab0f-7dd1cff21d17
>
>
>
> QUESTION:
>
> Why can't Spark/YARN find this file
> */d0/hadoop/yarn/local/usercache/mansop/appcache/application_1512016123441_0045/container_1512016123441_0045_02_000001/tmp/1516059780057-0/bin/python*?
> Who is supposed to copy it there, and from where? And what do I need to do
> to make my spark-submit job run?
>
>
>
> Thank you
>
>
>
> Manuel
>
>
>
> *From:* Manuel Sopena Ballesteros
> *Sent:* Tuesday, January 16, 2018 10:53 AM
> *To:* user@spark.apache.org
> *Subject:* spark-submit can find python?
>
>
>
> Hi all,
>
>
>
> I am quite new to spark and need some help troubleshooting the execution
> of an application running on a spark cluster…
>
>
>
> My Spark environment is deployed using Ambari (HDP); YARN is the resource
> scheduler and HDFS is the file system.
>
>
>
> The application I am trying to run is a python script (test.py).
>
>
>
> The worker nodes have python 2.6 so I am asking spark to spin up a virtual
> environment based on python 2.7.
>
>
>
> I can successfully run this test app on a single node (see below):
>
>
>
> -bash-4.1$ spark-submit \
>
> > --conf spark.pyspark.virtualenv.type=native \
>
> > --conf
> spark.pyspark.virtualenv.requirements=/home/mansop/requirements.txt \
>
> > --conf
> spark.pyspark.virtualenv.bin.path=/home/mansop/hail-test/python-2.7.2/bin/activate
> \
>
> > --conf
> spark.pyspark.python=/home/mansop/hail-test/python-2.7.2/bin/python \
>
> > --jars $HAIL_HOME/build/libs/hail-all-spark.jar \
>
> > --py-files $HAIL_HOME/build/distributions/hail-python.zip \
>
> > test.py
>
> hail: info: SparkUI: http://192.168.10.201:4040
>
> Welcome to
>
>      __  __     <>__
>
>     / /_/ /__  __/ /
>
>    / __  / _ `/ / /
>
>   /_/ /_/\_,_/_/_/   version 0.1-0320a61
>
> [Stage 2:==================================================>     (91 + 4) / 100]
> Summary(samples=3, variants=308, call_rate=1.000000, contigs=['1'],
> multiallelics=0, snps=308, mnps=0, insertions=0, deletions=0, complex=0,
> star=0, max_alleles=2)
>
>
>
>
>
> However, Spark crashes while trying to run my test script in cluster mode
> (full output below), throwing an error about this missing file:
> */d0/hadoop/yarn/local/usercache/mansop/appcache/application_1512016123441_0032/container_1512016123441_0032_02_000001/tmp/1515989862748-0/bin/python*
>
>
>
> -bash-4.1$ spark-submit --master yarn \
>
> >     --deploy-mode cluster \
>
> >     --driver-memory 4g \
>
> >     --executor-memory 2g \
>
> >     --executor-cores 4 \
>
> >     --queue default \
>
> >     --conf spark.pyspark.virtualenv.type=native \
>
> >     --conf
> spark.pyspark.virtualenv.requirements=/home/mansop/requirements.txt \
>
> >     --conf
> spark.pyspark.virtualenv.bin.path=/home/mansop/hail-test/python-2.7.2/bin/activate
> \
>
> >     --jars $HAIL_HOME/build/libs/hail-all-spark.jar \
>
> >     --py-files $HAIL_HOME/build/distributions/hail-python.zip \
>
> >     test.py
>
> 18/01/16 09:55:17 WARN NativeCodeLoader: Unable to load native-hadoop
> library for your platform... using builtin-java classes where applicable
>
> 18/01/16 09:55:18 WARN DomainSocketFactory: The short-circuit local reads
> feature cannot be used because libhadoop cannot be loaded.
>
> 18/01/16 09:55:18 INFO RMProxy: Connecting to ResourceManager at
> wp-hdp-ctrl03-mlx.mlx/10.0.1.206:8050
>
> 18/01/16 09:55:18 INFO Client: Requesting a new application from cluster
> with 4 NodeManagers
>
> 18/01/16 09:55:18 INFO Client: Verifying our application has not requested
> more than the maximum memory capability of the cluster (450560 MB per
> container)
>
> 18/01/16 09:55:18 INFO Client: Will allocate AM container, with 4505 MB
> memory including 409 MB overhead
>
> 18/01/16 09:55:18 INFO Client: Setting up container launch context for our
> AM
>
> 18/01/16 09:55:18 INFO Client: Setting up the launch environment for our
> AM container
>
> 18/01/16 09:55:18 INFO Client: Preparing resources for our AM container
>
> 18/01/16 09:55:19 INFO Client: Use hdfs cache file as spark.yarn.archive
> for HDP,
> hdfsCacheFile:hdfs://wp-hdp-ctrl01-mlx.mlx:8020/hdp/apps/2.6.3.0-235/spark2/spark2-hdp-yarn-archive.tar.gz
>
> 18/01/16 09:55:19 INFO Client: Source and destination file systems are the
> same. Not copying
> hdfs://wp-hdp-ctrl01-mlx.mlx:8020/hdp/apps/2.6.3.0-235/spark2/spark2-hdp-yarn-archive.tar.gz
>
> 18/01/16 09:55:19 INFO Client: Uploading resource
> file:/home/mansop/hail-test2/hail/build/libs/hail-all-spark.jar ->
> hdfs://wp-hdp-ctrl01-mlx.mlx:8020/user/mansop/.sparkStaging/application_1512016123441_0043/hail-all-spark.jar
>
> 18/01/16 09:55:20 INFO Client: Uploading resource
> file:/home/mansop/requirements.txt ->
> hdfs://wp-hdp-ctrl01-mlx.mlx:8020/user/mansop/.sparkStaging/application_1512016123441_0043/requirements.txt
>
> 18/01/16 09:55:20 INFO Client: Uploading resource
> file:/home/mansop/test.py ->
> hdfs://wp-hdp-ctrl01-mlx.mlx:8020/user/mansop/.sparkStaging/application_1512016123441_0043/test.py
>
> 18/01/16 09:55:20 INFO Client: Uploading resource
> file:/usr/hdp/2.6.3.0-235/spark2/python/lib/pyspark.zip ->
> hdfs://wp-hdp-ctrl01-mlx.mlx:8020/user/mansop/.sparkStaging/application_1512016123441_0043/pyspark.zip
>
> 18/01/16 09:55:20 INFO Client: Uploading resource
> file:/usr/hdp/2.6.3.0-235/spark2/python/lib/py4j-0.10.4-src.zip ->
> hdfs://wp-hdp-ctrl01-mlx.mlx:8020/user/mansop/.sparkStaging/application_1512016123441_0043/py4j-0.10.4-src.zip
>
> 18/01/16 09:55:20 INFO Client: Uploading resource
> file:/home/mansop/hail-test2/hail/build/distributions/hail-python.zip ->
> hdfs://wp-hdp-ctrl01-mlx.mlx:8020/user/mansop/.sparkStaging/application_1512016123441_0043/hail-python.zip
>
> 18/01/16 09:55:20 INFO Client: Uploading resource
> file:/tmp/spark-888af623-c81d-4ff1-ac8a-15f25112cc4a/__spark_conf__1173722187739681647.zip
> ->
> hdfs://wp-hdp-ctrl01-mlx.mlx:8020/user/mansop/.sparkStaging/application_1512016123441_0043/__spark_conf__.zip
>
> 18/01/16 09:55:20 INFO SecurityManager: Changing view acls to: mansop
>
> 18/01/16 09:55:20 INFO SecurityManager: Changing modify acls to: mansop
>
> 18/01/16 09:55:20 INFO SecurityManager: Changing view acls groups to:
>
> 18/01/16 09:55:20 INFO SecurityManager: Changing modify acls groups to:
>
> 18/01/16 09:55:20 INFO SecurityManager: SecurityManager: authentication
> disabled; ui acls disabled; users  with view permissions: Set(mansop);
> groups with view permissions: Set(); users  with modify permissions:
> Set(mansop); groups with modify permissions: Set()
>
> 18/01/16 09:55:20 INFO Client: Submitting application
> application_1512016123441_0043 to ResourceManager
>
> 18/01/16 09:55:20 INFO YarnClientImpl: Submitted application
> application_1512016123441_0043
>
> 18/01/16 09:55:21 INFO Client: Application report for
> application_1512016123441_0043 (state: ACCEPTED)
>
> 18/01/16 09:55:21 INFO Client:
>
>          client token: N/A
>
>          diagnostics: AM container is launched, waiting for AM container
> to Register with RM
>
>          ApplicationMaster host: N/A
>
>          ApplicationMaster RPC port: -1
>
>          queue: default
>
>          start time: 1516056920515
>
>          final status: UNDEFINED
>
>          tracking URL:
> http://wp-hdp-ctrl03-mlx.mlx:8088/proxy/application_1512016123441_0043/
>
>          user: mansop
>
> 18/01/16 09:55:22 INFO Client: Application report for
> application_1512016123441_0043 (state: ACCEPTED)
>
> 18/01/16 09:55:23 INFO Client: Application report for
> application_1512016123441_0043 (state: ACCEPTED)
>
> 18/01/16 09:55:24 INFO Client: Application report for
> application_1512016123441_0043 (state: ACCEPTED)
>
> 18/01/16 09:55:25 INFO Client: Application report for
> application_1512016123441_0043 (state: ACCEPTED)
>
> 18/01/16 09:55:26 INFO Client: Application report for
> application_1512016123441_0043 (state: ACCEPTED)
>
> 18/01/16 09:55:27 INFO Client: Application report for
> application_1512016123441_0043 (state: ACCEPTED)
>
> 18/01/16 09:55:28 INFO Client: Application report for
> application_1512016123441_0043 (state: ACCEPTED)
>
> 18/01/16 09:55:29 INFO Client: Application report for
> application_1512016123441_0043 (state: ACCEPTED)
>
> 18/01/16 09:55:30 INFO Client: Application report for
> application_1512016123441_0043 (state: FAILED)
>
> 18/01/16 09:55:30 INFO Client:
>
>          client token: N/A
>
>          diagnostics: Application application_1512016123441_0043 failed 2
> times due to AM Container for appattempt_1512016123441_0043_000002 exited
> with  exitCode: 1
>
> For more detailed output, check the application tracking page:
> http://wp-hdp-ctrl03-mlx.mlx:8088/cluster/app/application_1512016123441_0043
> Then click on links to logs of each attempt.
>
> Diagnostics: Exception from container-launch.
>
> Container id: container_1512016123441_0043_02_000001
>
> Exit code: 1
>
>
>
> Container exited with a non-zero exit code 1. Error file: prelaunch.err.
>
> Last 4096 bytes of prelaunch.err :
>
> Last 4096 bytes of stderr :
>
> SLF4J: Class path contains multiple SLF4J bindings.
>
> SLF4J: Found binding in
> [jar:file:/d1/hadoop/yarn/local/filecache/11/spark2-hdp-yarn-archive.tar.gz/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>
> SLF4J: Found binding in
> [jar:file:/usr/hdp/2.6.3.0-235/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
> explanation.
>
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
>
> 18/01/16 09:55:27 INFO SignalUtils: Registered signal handler for TERM
>
> 18/01/16 09:55:27 INFO SignalUtils: Registered signal handler for HUP
>
> 18/01/16 09:55:27 INFO SignalUtils: Registered signal handler for INT
>
> 18/01/16 09:55:28 INFO ApplicationMaster: Preparing Local resources
>
> 18/01/16 09:55:28 INFO ApplicationMaster: ApplicationAttemptId:
> appattempt_1512016123441_0043_000002
>
> 18/01/16 09:55:28 INFO SecurityManager: Changing view acls to: yarn,mansop
>
> 18/01/16 09:55:28 INFO SecurityManager: Changing modify acls to:
> yarn,mansop
>
> 18/01/16 09:55:28 INFO SecurityManager: Changing view acls groups to:
>
> 18/01/16 09:55:28 INFO SecurityManager: Changing modify acls groups to:
>
> 18/01/16 09:55:28 INFO SecurityManager: SecurityManager: authentication
> disabled; ui acls disabled; users  with view permissions: Set(yarn,
> mansop); groups with view permissions: Set(); users  with modify
> permissions: Set(yarn, mansop); groups with modify permissions: Set()
>
> 18/01/16 09:55:28 INFO ApplicationMaster: Starting the user application in
> a separate Thread
>
> 18/01/16 09:55:28 INFO ApplicationMaster: Waiting for spark context
> initialization...
>
> 18/01/16 09:55:29 ERROR ApplicationMaster: User application exited with
> status 1
>
> 18/01/16 09:55:29 INFO ApplicationMaster: Final app status: FAILED,
> exitCode: 1, (reason: User application exited with status 1)
>
> 18/01/16 09:55:29 ERROR ApplicationMaster: Uncaught exception:
>
> org.apache.spark.SparkException: Exception thrown in awaitResult:
>
>         at
> org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:205)
>
>         at
> org.apache.spark.deploy.yarn.ApplicationMaster.runDriver(ApplicationMaster.scala:423)
>
>         at
> org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:282)
>
>         at
> org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$main$1.apply$mcV$sp(ApplicationMaster.scala:768)
>
>         at
> org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:67)
>
>         at
> org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:66)
>
>         at java.security.AccessController.doPrivileged(Native Method)
>
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>
>         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
>
>         at
> org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:66)
>
>         at
> org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:766)
>
>         at
> org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
>
> Caused by: org.apache.spark.SparkUserAppException: User application exited
> with 1
>
>         at
> org.apache.spark.deploy.PythonRunner$.main(PythonRunner.scala:105)
>
>         at org.apache.spark.deploy.PythonRunner.main(PythonRunner.scala)
>
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>
>         at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>
>         at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>
>         at java.lang.reflect.Method.invoke(Method.java:498)
>
>         at
> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$3.run(ApplicationMaster.scala:646)
>
> 18/01/16 09:55:29 INFO ApplicationMaster: Unregistering ApplicationMaster
> with FAILED (diag message: User application exited with status 1)
>
> 18/01/16 09:55:29 INFO ApplicationMaster: Deleting staging directory
> hdfs://wp-hdp-ctrl01-mlx.mlx:8020/user/mansop/.sparkStaging/application_1512016123441_0043
>
> 18/01/16 09:55:29 INFO ShutdownHookManager: Shutdown hook called
>
>
>
> Failing this attempt. Failing the application.
>
>          ApplicationMaster host: N/A
>
>          ApplicationMaster RPC port: -1
>
>          queue: default
>
>          start time: 1516056920515
>
>          final status: FAILED
>
>          tracking URL:
> http://wp-hdp-ctrl03-mlx.mlx:8088/cluster/app/application_1512016123441_0043
>
>          user: mansop
>
> Exception in thread "main" org.apache.spark.SparkException: Application
> application_1512016123441_0043 finished with failed status
>
>         at org.apache.spark.deploy.yarn.Client.run(Client.scala:1187)
>
>         at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1233)
>
>         at org.apache.spark.deploy.yarn.Client.main(Client.scala)
>
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>
>         at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>
>         at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>
>         at java.lang.reflect.Method.invoke(Method.java:498)
>
>         at
> org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:782)
>
>         at
> org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
>
>         at
> org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
>
>         at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
>
>         at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>
> 18/01/16 09:55:30 INFO ShutdownHookManager: Shutdown hook called
>
> 18/01/16 09:55:30 INFO ShutdownHookManager: Deleting directory
> /tmp/spark-888af623-c81d-4ff1-ac8a-15f25112cc4a
>
>
>
> QUESTION:
>
> Why can't Spark/YARN find this file
> */d0/hadoop/yarn/local/usercache/mansop/appcache/application_1512016123441_0032/container_1512016123441_0032_02_000001/tmp/1515989862748-0/bin/python*?
> Who is supposed to copy it there, and from where? And what do I need to do
> to make my spark-submit job run?
>
>
>
> Thank you very much
>
>
>
>
>
> *Manuel Sopena Ballesteros* | Big Data Engineer
> *Garvan Institute of Medical Research*
> The Kinghorn Cancer Centre, 370 Victoria Street, Darlinghurst, NSW 2010
> *T:* +61 (0)2 9355 5760 | *F:* +61 (0)2 9295 8507 | *E:* manuel...@garvan.org.au
>
>
> NOTICE
> Please consider the environment before printing this email. This message
> and any attachments are intended for the addressee named and may contain
> legally privileged/confidential/copyright information. If you are not the
> intended recipient, you should not read, use, disclose, copy or distribute
> this communication. If you have received this message in error please
> notify us at once by return email and then delete both messages. We accept
> no liability for the distribution of viruses or similar in electronic
> communications. This notice should not be removed.
>
