Hi Jeff,
Sorry, I forgot to mention that the same Java code works fine if we replace
the Python pi.py file with the jar version of the Pi example.



From:    Jeff Zhang <zjf...@gmail.com>
Date:    2016/03/16 11:05 AM
To:      sychu...@tsmc.com
Cc:      user <user@spark.apache.org>
Subject: Re: Job failed while submitting python to yarn programatically




Could you try yarn-cluster mode? Also make sure your cluster nodes can reach
your client machine and that no firewall is blocking the connection.
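
One way to check reachability from a cluster node is a plain TCP connect test against the driver's host and port. The following is only a minimal sketch under stated assumptions: the class name DriverReachability and the method canConnect are hypothetical, and the host and port must be supplied by you (they are not taken from this thread).

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of Spark: checks whether a TCP connection
// to host:port can be opened within the given timeout.
class DriverReachability {

    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            // connect() throws an IOException on refusal, timeout,
            // or an unresolvable host name.
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        if (args.length < 2) {
            System.err.println("usage: DriverReachability <host> <port>");
            return;
        }
        String host = args[0];
        int port = Integer.parseInt(args[1]);
        System.out.println(host + ":" + port + " reachable: "
                + canConnect(host, port, 3000));
    }
}
```

Run it on a cluster node with the client machine's host name and the port the driver listens on. A false result only says that this one connection attempt failed; it does not tell you whether a firewall, routing, or a wrong address is the cause.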

On Wed, Mar 16, 2016 at 10:54 AM, <sychu...@tsmc.com> wrote:

  Hi all,

  We're trying to submit a Python file, pi.py in this case, to YARN from
  Java code, but it keeps failing (Spark 1.6.0).
  It seems the AM uses the argument we passed to pi.py as the driver
  address.
  Could someone help me figure out how to get the job done? Thanks in
  advance.

  The Java code looks like this:

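            // Classes used by this fragment:
            //   org.apache.hadoop.conf.Configuration
            //   org.apache.spark.SparkConf
            //   org.apache.spark.deploy.yarn.Client
            //   org.apache.spark.deploy.yarn.ClientArguments

            // Arguments for the YARN client. Named submitArgs so it does not
            // clash with the args parameter of the enclosing main(String[] args).
            String[] submitArgs = new String[]{
                "--name",
                "Test Submit Python To Yarn From Java",
                "--primary-py-file",
                SPARK_HOME + "/examples/src/main/python/pi.py",
                "--num-executors",
                "5",
                "--driver-memory",
                "512m",
                "--executor-memory",
                "512m",
                "--executor-cores",
                "1",
                "--arg",
                args[0]  // command-line argument forwarded to pi.py
            };

            Configuration config = new Configuration();
            SparkConf sparkConf = new SparkConf();
            ClientArguments clientArgs = new ClientArguments(submitArgs, sparkConf);
            // Submit the application to YARN.
            Client client = new Client(clientArgs, config, sparkConf);
            client.run();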


  The jar is submitted with spark-submit:
  ./bin/spark-submit --class SubmitPyYARNJobFromJava --master yarn-client \
    TestSubmitPythonFromJava.jar 10


  The job submitted to YARN just stays in ACCEPTED until it fails.
  What I can't figure out is why the YARN log shows that the AM couldn't
  reach the driver at 10:0, where 10 is the argument I passed to pi.py.
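
To illustrate how a bare argument such as 10 could end up printed as 10:0, here is a minimal sketch of a host:port parser that falls back to port 0 when no colon is present. This is an assumption about the kind of parsing involved, not Spark's actual code; the class HostPortSketch and the method describe are hypothetical names.

```java
// Hypothetical illustration, not Spark's implementation.
class HostPortSketch {

    // Splits "host:port". A value without a colon is kept whole as the
    // host, and the port falls back to 0.
    static String describe(String hostPort) {
        int idx = hostPort.lastIndexOf(':');
        String host = idx < 0 ? hostPort : hostPort.substring(0, idx);
        int port = idx < 0 ? 0 : Integer.parseInt(hostPort.substring(idx + 1));
        return host + ":" + port;
    }

    public static void main(String[] args) {
        // The pi.py argument from the spark-submit command above.
        System.out.println(describe("10"));
        // A well-formed address for comparison (made-up values).
        System.out.println(describe("192.168.1.5:43210"));
    }
}
```

Under that assumption, the 10:0 in the log would mean that the value after --arg was read as the driver address rather than handed to pi.py.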

  SLF4J: Class path contains multiple SLF4J bindings.
  SLF4J: Found binding in [jar:file:/data/1/yarn/local/usercache/root/filecache/2084/spark-assembly-1.6.0-hadoop2.6.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
  SLF4J: Found binding in [jar:file:/usr/lib/zookeeper/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
  SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
  SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
  16/03/15 17:54:44 INFO yarn.ApplicationMaster: Registered signal handlers for [TERM, HUP, INT]
  16/03/15 17:54:45 INFO yarn.ApplicationMaster: ApplicationAttemptId: appattempt_1458023046377_0499_000001
  16/03/15 17:54:45 INFO spark.SecurityManager: Changing view acls to: yarn,root
  16/03/15 17:54:45 INFO spark.SecurityManager: Changing modify acls to: yarn,root
  16/03/15 17:54:45 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(yarn, root); users with modify permissions: Set(yarn, root)
  16/03/15 17:54:45 INFO yarn.ApplicationMaster: Waiting for Spark driver to be reachable.
  16/03/15 17:54:45 ERROR yarn.ApplicationMaster: Failed to connect to driver at 10:0, retrying ...
  16/03/15 17:54:46 ERROR yarn.ApplicationMaster: Failed to connect to driver at 10:0, retrying ...
  16/03/15 17:54:46 ERROR yarn.ApplicationMaster: Failed to connect to driver at 10:0, retrying ...
  .........
  16/03/15 17:56:25 ERROR yarn.ApplicationMaster: Failed to connect to driver at 10:0, retrying ...
  16/03/15 17:56:26 ERROR yarn.ApplicationMaster: Uncaught exception:
  org.apache.spark.SparkException: Failed to connect to driver!
      at org.apache.spark.deploy.yarn.ApplicationMaster.waitForSparkDriver(ApplicationMaster.scala:484)
      at org.apache.spark.deploy.yarn.ApplicationMaster.runExecutorLauncher(ApplicationMaster.scala:345)
      at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:187)
      at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$main$1.apply$mcV$sp(ApplicationMaster.scala:653)
      at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:69)
      at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:68)
      at java.security.AccessController.doPrivileged(Native Method)
      at javax.security.auth.Subject.doAs(Subject.java:422)
      at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
      at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:68)
      at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:651)
      at org.apache.spark.deploy.yarn.ExecutorLauncher$.main(ApplicationMaster.scala:674)
      at org.apache.spark.deploy.yarn.ExecutorLauncher.main(ApplicationMaster.scala)
  16/03/15 17:56:26 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 10, (reason: Uncaught exception: org.apache.spark.SparkException: Failed to connect to driver!)
  16/03/15 17:56:26 INFO util.ShutdownHookManager: Shutdown hook called

  Best regards,

  S.Y. Chung 鍾學毅
  F14MITD
  Taiwan Semiconductor Manufacturing Company, Ltd.
  Tel: 06-5056688 Ext: 734-6325

  ---------------------------------------------------------------------------

                                                           TSMC PROPERTY
   This email communication (and any attachments) is proprietary information
   for the sole use of its intended recipient. Any unauthorized review, use
   or distribution by anyone other than the intended recipient is strictly
   prohibited.  If you are not the intended recipient, please notify the
   sender by replying to this email, and then delete this email and any
   copies of it immediately. Thank you.

  ---------------------------------------------------------------------------






--
Best Regards

Jeff Zhang
