It is supposed to be fixed in 0.9.0-SNAPSHOT as well. If you hit this issue
on master, then it is a bug; please file a ticket and describe the details.
Thanks



On Mon, Apr 8, 2019 at 4:42 PM Y. Ethan Guo <guoyi...@uber.com> wrote:

> I'm still partially hitting this issue in 0.9.0-SNAPSHOT for Spark
> interpreters with other names, so I'm not sure the ZEPPELIN-3986 issue is
> completely resolved.  I'm using multiple Spark interpreters with different
> Spark confs that share the same SPARK_SUBMIT_OPTIONS, including a `--jars`
> option, and it seems that only one of them works.  Anyway, shall we follow
> up on the ticket and see how to fix it?
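>
> For context, here is a minimal sketch of the setup (the jar path and
> interpreter names below are placeholders, not my actual config):
>
>   # conf/zeppelin-env.sh, shared by every interpreter in the spark group
>   export SPARK_HOME=/opt/spark
>   export HADOOP_CONF_DIR=/etc/hadoop/conf
>   export SPARK_SUBMIT_OPTIONS="--jars /opt/jars/common-udfs.jar"
>
>   # "spark" and "spark_abc" both belong to the spark interpreter group
>   # with different spark confs; only one appears to pick up the --jars.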
>
> Thanks,
> - Ethan
>
> On Mon, Apr 8, 2019 at 1:34 AM Jeff Zhang <zjf...@gmail.com> wrote:
>
>> Hi Ethan,
>>
>> This behavior is not expected. Maybe you are hitting this issue, which
>> is fixed in 0.8.2:
>> https://jira.apache.org/jira/browse/ZEPPELIN-3986
>>
>>
>> On Mon, Apr 8, 2019 at 4:26 PM Y. Ethan Guo <guoyi...@uber.com> wrote:
>>
>>> Hi Jeff, Dave,
>>>
>>> Thanks for the suggestion.  I was able to successfully run the Spark
>>> interpreter in yarn cluster mode on another machine running Zeppelin.  The
>>> previous problem was probably due to network issues.
>>>
>>> I have two observations:
>>> (1) I'm able to use the "--jars" option in SPARK_SUBMIT_OPTIONS in the
>>> "spark" interpreter with yarn cluster mode configured.  I verified that the
>>> jars are pushed to the driver and executors by successfully running a job
>>> that uses some classes from the jars.  However, if I create a new "spark_abc"
>>> interpreter under the spark interpreter group, the new interpreter doesn't
>>> seem to pick up SPARK_SUBMIT_OPTIONS and the jars option, leading to errors
>>> about not being able to access packages/classes in the jars.
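>>>
>>> For reference, one way to confirm whether the jars reach the driver is to
>>> grep the YARN application logs for the jar name; the application id and
>>> jar name below are placeholders:
>>>
>>>   yarn logs -applicationId application_1554000000000_0001 | grep common-udfs.jar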
>>>
>>> (2) Once I restart the Spark interpreters from the interpreter settings
>>> page, the corresponding Spark jobs on the YARN cluster first transition from
>>> the "RUNNING" state back to "ACCEPTED", and then end up in the "FAILED" state.
>>>
>>> I'm wondering whether the above behaviors are expected and whether they
>>> are known limitations of the current 0.9.0-SNAPSHOT version.
>>>
>>> Thanks,
>>> - Ethan
>>>
>>> On Sun, Apr 7, 2019 at 9:59 AM Dave Boyd <db...@incadencecorp.com>
>>> wrote:
>>>
>>>> From the connection-refused message, I wonder if it is an SSL error.  I
>>>> note that none of the SSL information (truststore, keystore, etc.) is set.
>>>> I would think the YARN cluster requires some form of authentication.
>>>> On 4/7/19 9:27 AM, Jeff Zhang wrote:
>>>>
>>>> It looks like the interpreter process cannot connect to the Zeppelin
>>>> server process.  I guess it is due to some network issue.  Can you check
>>>> whether the nodes in the yarn cluster can connect to the Zeppelin server host?
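>>>>
>>>> For example, from one of the YARN nodes, you could try something like the
>>>> following (host and port are placeholders; the actual intpEventServerAddress
>>>> shows up in the interpreter log, as in the trace quoted below):
>>>>
>>>>   nc -vz <zeppelin-server-host> <intp-event-server-port>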
>>>>
>>>> On Sun, Apr 7, 2019 at 3:31 PM Y. Ethan Guo <guoyi...@uber.com> wrote:
>>>>
>>>>> Hi Jeff,
>>>>>
>>>>> Given that this PR is merged, I'm trying to see if I can run yarn cluster
>>>>> mode from a master build.  I built Zeppelin master from this commit:
>>>>>
>>>>> commit 3655c12b875884410224eca5d6155287d51916ac
>>>>> Author: Jongyoul Lee <jongy...@gmail.com>
>>>>> Date:   Mon Apr 1 15:37:57 2019 +0900
>>>>>     [MINOR] Refactor CronJob class (#3335)
>>>>>
>>>>> While I can successfully run the Spark interpreter in yarn client mode, I'm
>>>>> having trouble making yarn cluster mode work.  Specifically, while the
>>>>> interpreter job was accepted in YARN, it failed after 1-2 minutes because
>>>>> of the exception below.  Do you have any idea why this is happening?
>>>>>
>>>>> DEBUG [2019-04-07 06:57:00,314] ({main} Logging.scala[logDebug]:58) -
>>>>> Created SSL options for fs: SSLOptions{enabled=false, keyStore=None,
>>>>> keyStorePassword=None, trustStore=None, trustStorePassword=None,
>>>>> protocol=None, enabledAlgorithms=Set()}
>>>>>  INFO [2019-04-07 06:57:00,323] ({main} Logging.scala[logInfo]:54) -
>>>>> Starting the user application in a separate Thread
>>>>>  INFO [2019-04-07 06:57:00,350] ({main} Logging.scala[logInfo]:54) -
>>>>> Waiting for spark context initialization...
>>>>>  INFO [2019-04-07 06:57:00,403] ({Driver}
>>>>> RemoteInterpreterServer.java[<init>]:148) - Starting remote interpreter
>>>>> server on port 0, intpEventServerAddress: 172.17.0.1:45128
>>>>> ERROR [2019-04-07 06:57:00,408] ({Driver} Logging.scala[logError]:91)
>>>>> - User class threw exception:
>>>>> org.apache.thrift.transport.TTransportException: 
>>>>> java.net.ConnectException:
>>>>> Connection refused (Connection refused)
>>>>> org.apache.thrift.transport.TTransportException:
>>>>> java.net.ConnectException: Connection refused (Connection refused)
>>>>> at org.apache.thrift.transport.TSocket.open(TSocket.java:226)
>>>>> at
>>>>> org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer.<init>(RemoteInterpreterServer.java:154)
>>>>> at
>>>>> org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer.<init>(RemoteInterpreterServer.java:139)
>>>>> at
>>>>> org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer.main(RemoteInterpreterServer.java:285)
>>>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>> at
>>>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>>>> at
>>>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>> at java.lang.reflect.Method.invoke(Method.java:498)
>>>>> at
>>>>> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:635)
>>>>> Caused by: java.net.ConnectException: Connection refused (Connection
>>>>> refused)
>>>>> at java.net.PlainSocketImpl.socketConnect(Native Method)
>>>>> at
>>>>> java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
>>>>> at
>>>>> java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
>>>>> at
>>>>> java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
>>>>> at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>>>>> at java.net.Socket.connect(Socket.java:589)
>>>>> at org.apache.thrift.transport.TSocket.open(TSocket.java:221)
>>>>> ... 8 more
>>>>>
>>>>> Thanks,
>>>>> - Ethan
>>>>>
>>>>> On Wed, Feb 27, 2019 at 4:24 PM Jeff Zhang <zjf...@gmail.com> wrote:
>>>>>
>>>>>> Here's the PR
>>>>>> https://github.com/apache/zeppelin/pull/3308
>>>>>>
>>>>>> On Thu, Feb 28, 2019 at 2:50 AM Y. Ethan Guo <guoyi...@uber.com> wrote:
>>>>>>
>>>>>>> Hi All,
>>>>>>>
>>>>>>> I'm trying to use the new yarn cluster mode feature to run Spark
>>>>>>> 2.4.0 jobs on Zeppelin 0.8.1.  I've set the SPARK_HOME,
>>>>>>> SPARK_SUBMIT_OPTIONS, and HADOOP_CONF_DIR env variables in
>>>>>>> zeppelin-env.sh so that the Spark interpreter can be started in the
>>>>>>> cluster.  I used `--jars` in SPARK_SUBMIT_OPTIONS to add local jars.
>>>>>>> However, when I tried to import a class from the jars in a Spark
>>>>>>> paragraph, the interpreter complained that it could not find the
>>>>>>> package and class ("<console>:23: error: object ... is not a member of
>>>>>>> package ...").  It looks like the jars are not properly loaded.
>>>>>>>
>>>>>>> I followed the instructions here
>>>>>>> <https://zeppelin.apache.org/docs/0.8.1/interpreter/spark.html#2-loading-spark-properties>
>>>>>>> to add the jars, but they don't seem to work in cluster mode.  This
>>>>>>> issue seems related to this bug:
>>>>>>> https://jira.apache.org/jira/browse/ZEPPELIN-3986.  Is there any
>>>>>>> update on fixing it?  What is the right way to add local jars in yarn
>>>>>>> cluster mode?  Any help is much appreciated.
>>>>>>>
>>>>>>>
>>>>>>> Here's the SPARK_SUBMIT_OPTIONS I used (package and jar paths
>>>>>>> omitted):
>>>>>>>
>>>>>>> export SPARK_SUBMIT_OPTIONS="--driver-memory 12G --packages ...
>>>>>>> --jars ... --repositories
>>>>>>> https://repository.cloudera.com/artifactory/public/,https://repository.cloudera.com/content/repositories/releases/,http://repo.spring.io/plugins-release/
>>>>>>> "
>>>>>>>
>>>>>>> Thanks,
>>>>>>> - Ethan
>>>>>>> --
>>>>>>> Best,
>>>>>>> - Ethan
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Best Regards
>>>>>>
>>>>>> Jeff Zhang
>>>>>>
>>>>>
>>>>
>>>> --
>>>> Best Regards
>>>>
>>>> Jeff Zhang
>>>>
>>>> --
>>>> ========= mailto:db...@incadencecorp.com ============
>>>> David W. Boyd
>>>> VP,  Data Solutions
>>>> 10432 Balls Ford, Suite 240
>>>> Manassas, VA 20109
>>>> office:   +1-703-552-2862
>>>> cell:     +1-703-402-7908
>>>> ============== http://www.incadencecorp.com/ ============
>>>> ISO/IEC JTC1 WG9, editor ISO/IEC 20547 Big Data Reference Architecture
>>>> Chair ANSI/INCITS TC Big Data
>>>> Co-chair NIST Big Data Public Working Group Reference Architecture
>>>> First Robotic Mentor - FRC, FTC - www.iliterobotics.org
>>>> Board Member- USSTEM Foundation - www.usstem.org
>>>>
>>>> The information contained in this message may be privileged
>>>> and/or confidential and protected from disclosure.
>>>> If the reader of this message is not the intended recipient
>>>> or an employee or agent responsible for delivering this message
>>>> to the intended recipient, you are hereby notified that any
>>>> dissemination, distribution or copying of this communication
>>>> is strictly prohibited.  If you have received this communication
>>>> in error, please notify the sender immediately by replying to
>>>> this message and deleting the material from any computer.
>>>>
>>>>
>>>>
>>>>
>>
>> --
>> Best Regards
>>
>> Jeff Zhang
>>
>

-- 
Best Regards

Jeff Zhang
