Hi Jeff,

Given that this PR is merged, I'm trying to see if I can run yarn cluster mode from a master build. I built Zeppelin master from this commit:
commit 3655c12b875884410224eca5d6155287d51916ac
Author: Jongyoul Lee <jongy...@gmail.com>
Date:   Mon Apr 1 15:37:57 2019 +0900

    [MINOR] Refactor CronJob class (#3335)

While I can successfully run the Spark interpreter in yarn client mode, I'm having trouble getting yarn cluster mode to work. Specifically, the interpreter job is accepted by yarn, but it fails after 1-2 minutes with the exception below. Do you have any idea why this is happening?

DEBUG [2019-04-07 06:57:00,314] ({main} Logging.scala[logDebug]:58) - Created SSL options for fs: SSLOptions{enabled=false, keyStore=None, keyStorePassword=None, trustStore=None, trustStorePassword=None, protocol=None, enabledAlgorithms=Set()}
 INFO [2019-04-07 06:57:00,323] ({main} Logging.scala[logInfo]:54) - Starting the user application in a separate Thread
 INFO [2019-04-07 06:57:00,350] ({main} Logging.scala[logInfo]:54) - Waiting for spark context initialization...
 INFO [2019-04-07 06:57:00,403] ({Driver} RemoteInterpreterServer.java[<init>]:148) - Starting remote interpreter server on port 0, intpEventServerAddress: 172.17.0.1:45128
ERROR [2019-04-07 06:57:00,408] ({Driver} Logging.scala[logError]:91) - User class threw exception: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused (Connection refused)
org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused (Connection refused)
        at org.apache.thrift.transport.TSocket.open(TSocket.java:226)
        at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer.<init>(RemoteInterpreterServer.java:154)
        at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer.<init>(RemoteInterpreterServer.java:139)
        at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer.main(RemoteInterpreterServer.java:285)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:635)
Caused by: java.net.ConnectException: Connection refused (Connection refused)
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:589)
        at org.apache.thrift.transport.TSocket.open(TSocket.java:221)
        ... 8 more

Thanks,
- Ethan
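P.S. One thing I noticed in the log: the intpEventServerAddress, 172.17.0.1, looks like a Docker bridge (docker0) address, so my guess is that the driver running on a yarn node simply can't route back to the Zeppelin server. If that's the cause, pinning Zeppelin to a routable address in zeppelin-env.sh might help. ZEPPELIN_LOCAL_IP is my reading of how Zeppelin picks the host address to advertise, so please treat this as an untested sketch with a placeholder IP, not a confirmed fix:

    # zeppelin-env.sh -- untested sketch: advertise an address the yarn
    # nodes can actually reach, instead of the docker0 bridge (172.17.0.1)
    export ZEPPELIN_LOCAL_IP=10.20.30.40   # placeholder; use a host IP routable from the cluster

If that env variable isn't honored in this build, the same idea applies: whatever address Zeppelin binds its thrift event server to has to be reachable from the yarn nodes.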
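P.P.S. Re: my earlier --jars question in the quoted message below: since in yarn cluster mode the driver runs on a cluster node rather than the Zeppelin host, my current plan (untested, all paths are placeholders) is to stage the jars on HDFS and reference them by hdfs:// URI, which the Spark docs say --jars accepts:

    # untested sketch with placeholder paths: put the jars somewhere every
    # yarn node can read, then point --jars at the hdfs:// URIs
    hdfs dfs -mkdir -p /user/ethan/libs
    hdfs dfs -put /opt/libs/mylib.jar /user/ethan/libs/
    export SPARK_SUBMIT_OPTIONS="--driver-memory 12G --jars hdfs:///user/ethan/libs/mylib.jar"

I'd expect the same hdfs:// URIs to also work via the spark.jars property in the interpreter setting, but I haven't verified that in cluster mode.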
On Wed, Feb 27, 2019 at 4:24 PM Jeff Zhang <zjf...@gmail.com> wrote:

> Here's the PR
> https://github.com/apache/zeppelin/pull/3308
>
> Y. Ethan Guo <guoyi...@uber.com> wrote on Thu, Feb 28, 2019 at 2:50 AM:
>
>> Hi All,
>>
>> I'm trying to use the new yarn cluster mode feature to run Spark 2.4.0
>> jobs on Zeppelin 0.8.1. I've set the SPARK_HOME, SPARK_SUBMIT_OPTIONS, and
>> HADOOP_CONF_DIR env variables in zeppelin-env.sh so that the Spark
>> interpreter can be started in the cluster. I used `--jars` in
>> SPARK_SUBMIT_OPTIONS to add local jars. However, when I tried to import a
>> class from those jars in a Spark paragraph, the interpreter complained that
>> it cannot find the package and class ("<console>:23: error: object ... is
>> not a member of package ..."). Looks like the jars are not properly
>> loaded.
>>
>> I followed the instructions here
>> <https://zeppelin.apache.org/docs/0.8.1/interpreter/spark.html#2-loading-spark-properties>
>> to add the jars, but it seems this doesn't work in cluster mode.
>> The issue seems to be related to this bug:
>> https://jira.apache.org/jira/browse/ZEPPELIN-3986. Is there any update
>> on fixing it? What is the right way to add local jars in yarn cluster mode?
>> Any help or update is much appreciated.
>>
>> Here's the SPARK_SUBMIT_OPTIONS I used (packages and jars paths omitted):
>>
>> export SPARK_SUBMIT_OPTIONS="--driver-memory 12G --packages ... --jars
>> ... --repositories
>> https://repository.cloudera.com/artifactory/public/,https://repository.cloudera.com/content/repositories/releases/,http://repo.spring.io/plugins-release/
>> "
>>
>> Thanks,
>> - Ethan
>>
>> --
>> Best,
>> - Ethan
>>
>
> --
> Best Regards
>
> Jeff Zhang