It looks like the interpreter process cannot connect to the Zeppelin server
process. This is likely a network issue. Can you check whether the nodes in
the YARN cluster can reach the Zeppelin server host?
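For example, from one of the YARN NodeManager hosts you could probe the port with something like the sketch below. The host and port are taken from the interpreter log further down in this thread (intpEventServerAddress: 172.17.0.1:45128); adjust them to whatever your log reports.

```shell
# Host/port of the Zeppelin interpreter event server, as reported in the
# interpreter log (intpEventServerAddress).  Adjust to your environment.
ZEPPELIN_HOST=172.17.0.1
ZEPPELIN_PORT=45128

# Probe the TCP port using bash's /dev/tcp redirection.  "NOT reachable"
# here corresponds to the "Connection refused" in the stack trace below.
if timeout 5 bash -c "exec 3<>/dev/tcp/${ZEPPELIN_HOST}/${ZEPPELIN_PORT}"; then
  echo "reachable"
else
  echo "NOT reachable"
fi
```

If this prints "NOT reachable" on the YARN node but "reachable" on the Zeppelin host itself, the problem is likely a firewall rule or the server binding to a non-routable address (172.17.0.1 is a common Docker bridge address, which other hosts cannot reach).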

Y. Ethan Guo <guoyi...@uber.com> wrote on Sun, Apr 7, 2019 at 3:31 PM:

> Hi Jeff,
>
> Given this PR is merged, I'm trying to see if I can run yarn-cluster mode
> from a master build.  I built Zeppelin master from this commit:
>
> commit 3655c12b875884410224eca5d6155287d51916ac
> Author: Jongyoul Lee <jongy...@gmail.com>
> Date:   Mon Apr 1 15:37:57 2019 +0900
>     [MINOR] Refactor CronJob class (#3335)
>
> While I can successfully run the Spark interpreter in yarn-client mode, I'm
> having trouble getting yarn-cluster mode to work.  Specifically, while
> the interpreter job was accepted in yarn, the job failed after 1-2 minutes
> because of this exception (see below).  Do you have any idea why this
> is happening?
>
> DEBUG [2019-04-07 06:57:00,314] ({main} Logging.scala[logDebug]:58) -
> Created SSL options for fs: SSLOptions{enabled=false, keyStore=None,
> keyStorePassword=None, trustStore=None, trustStorePassword=None,
> protocol=None, enabledAlgorithms=Set()}
>  INFO [2019-04-07 06:57:00,323] ({main} Logging.scala[logInfo]:54) -
> Starting the user application in a separate Thread
>  INFO [2019-04-07 06:57:00,350] ({main} Logging.scala[logInfo]:54) -
> Waiting for spark context initialization...
>  INFO [2019-04-07 06:57:00,403] ({Driver}
> RemoteInterpreterServer.java[<init>]:148) - Starting remote interpreter
> server on port 0, intpEventServerAddress: 172.17.0.1:45128
> ERROR [2019-04-07 06:57:00,408] ({Driver} Logging.scala[logError]:91) -
> User class threw exception:
> org.apache.thrift.transport.TTransportException: java.net.ConnectException:
> Connection refused (Connection refused)
> org.apache.thrift.transport.TTransportException:
> java.net.ConnectException: Connection refused (Connection refused)
> at org.apache.thrift.transport.TSocket.open(TSocket.java:226)
> at
> org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer.<init>(RemoteInterpreterServer.java:154)
> at
> org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer.<init>(RemoteInterpreterServer.java:139)
> at
> org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer.main(RemoteInterpreterServer.java:285)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at
> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:635)
> Caused by: java.net.ConnectException: Connection refused (Connection
> refused)
> at java.net.PlainSocketImpl.socketConnect(Native Method)
> at
> java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
> at
> java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
> at
> java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
> at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
> at java.net.Socket.connect(Socket.java:589)
> at org.apache.thrift.transport.TSocket.open(TSocket.java:221)
> ... 8 more
>
> Thanks,
> - Ethan
>
> On Wed, Feb 27, 2019 at 4:24 PM Jeff Zhang <zjf...@gmail.com> wrote:
>
>> Here's the PR
>> https://github.com/apache/zeppelin/pull/3308
>>
>> Y. Ethan Guo <guoyi...@uber.com> wrote on Thu, Feb 28, 2019 at 2:50 AM:
>>
>>> Hi All,
>>>
>>> I'm trying to use the new feature of yarn cluster mode to run Spark
>>> 2.4.0 jobs on Zeppelin 0.8.1. I've set the SPARK_HOME,
>>> SPARK_SUBMIT_OPTIONS, and HADOOP_CONF_DIR env variables in zeppelin-env.sh
>>> so that the Spark interpreter can be started in the cluster. I used
>>> `--jars` in SPARK_SUBMIT_OPTIONS to add local jars. However, when I tried
>>> to import a class from the jars in a Spark paragraph, the interpreter
>>> complained that it could not find the package and class ("<console>:23: error:
>>> object ... is not a member of package ...").  It looks like the jars are not
>>> properly loaded.
>>>
>>> I followed the instructions here
>>> <https://zeppelin.apache.org/docs/0.8.1/interpreter/spark.html#2-loading-spark-properties>
>>> to add the jars, but it seems that this is not working in cluster mode.
>>> This issue also seems to be related to this bug:
>>> https://jira.apache.org/jira/browse/ZEPPELIN-3986.  Is there any update
>>> on fixing it? What is the right way to add local jars in yarn-cluster mode?
>>> Any help or updates would be much appreciated.
>>>
>>>
>>> Here's the SPARK_SUBMIT_OPTIONS I used (packages and jars paths omitted):
>>>
>>> export SPARK_SUBMIT_OPTIONS="--driver-memory 12G --packages ... --jars
>>> ... --repositories
>>> https://repository.cloudera.com/artifactory/public/,https://repository.cloudera.com/content/repositories/releases/,http://repo.spring.io/plugins-release/
>>> "
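One workaround that is sometimes suggested for yarn-cluster mode is sketched below. It is not verified against this Zeppelin build, and the paths and jar names are illustrative only: in yarn-cluster mode the driver runs on a YARN node, so jars given as local paths on the Zeppelin host may not be visible there. Staging the jars on HDFS and referencing them by `hdfs://` URI sidesteps that.

```shell
# Sketch only: paths and jar names are hypothetical, not from the original
# message.  Stage the local jars on HDFS so every YARN node can fetch them:
hdfs dfs -mkdir -p /user/zeppelin/jars
hdfs dfs -put /path/to/my-lib.jar /user/zeppelin/jars/

# Then reference the HDFS URIs in --jars instead of local paths, e.g. in
# zeppelin-env.sh:
export SPARK_SUBMIT_OPTIONS="--driver-memory 12G --jars hdfs:///user/zeppelin/jars/my-lib.jar"
```

Alternatively, the same `--jars` list with `hdfs://` URIs can be set as the `spark.jars` property in the Spark interpreter settings; whether that behaves differently from SPARK_SUBMIT_OPTIONS in cluster mode is exactly what ZEPPELIN-3986 appears to cover.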
>>>
>>> Thanks,
>>> - Ethan
>>> --
>>> Best,
>>> - Ethan
>>>
>>
>>
>> --
>> Best Regards
>>
>> Jeff Zhang
>>
>

-- 
Best Regards

Jeff Zhang
