Hi Jhang,

Not clear on that. I thought spark-submit was done when we run a paragraph; how does the .sh file come into play?

Thanks
Ankit

On Tue, Mar 13, 2018 at 5:43 PM, Jeff Zhang <zjf...@gmail.com> wrote:

> spark-submit is called in bin/interpreter.sh. I didn't try standalone
> cluster mode. It is expected to run the driver on a separate host, but it
> is not guaranteed that Zeppelin supports this.
>
> On Wed, Mar 14, 2018 at 8:34 AM, Ankit Jain <ankitjain....@gmail.com> wrote:
>
>> Hi Jhang,
>> What is the expected behavior with standalone cluster mode? Should we see
>> separate driver processes in the cluster (one per user) or multiple
>> SparkSubmit processes?
>>
>> I was trying to dig into the Zeppelin code and didn't see where Zeppelin
>> does the spark-submit to the cluster. Can you please point to it?
>>
>> Thanks
>> Ankit
>>
>> On Mar 13, 2018, at 5:25 PM, Jeff Zhang <zjf...@gmail.com> wrote:
>>
>> ZEPPELIN-2898 <https://issues.apache.org/jira/browse/ZEPPELIN-2898> is
>> for yarn cluster mode. Zeppelin has an integration test for yarn mode,
>> so that is guaranteed to work. There is no test for standalone, so I am
>> not sure about the behavior of standalone mode.
>>
>> On Wed, Mar 14, 2018 at 8:06 AM, Ruslan Dautkhanov <dautkha...@gmail.com> wrote:
>>
>>> https://github.com/apache/zeppelin/pull/2577 mentions yarn-cluster in
>>> its title, so I assume it's only yarn-cluster.
>>> Never used standalone-cluster myself.
>>>
>>> Which distro of Hadoop do you use?
>>> Cloudera desupported standalone in CDH 5.5 and will remove it in CDH 6.
>>> https://www.cloudera.com/documentation/enterprise/release-notes/topics/rg_deprecated.html
>>>
>>> --
>>> Ruslan Dautkhanov
>>>
>>> On Tue, Mar 13, 2018 at 5:45 PM, Jhon Anderson Cardenas Diaz <
>>> jhonderson2...@gmail.com> wrote:
>>>
>>>> Does this new feature work only for yarn-cluster, or for Spark
>>>> standalone too?
>>>>
>>>> On Tue, Mar 13, 2018 at 18:34, Ruslan Dautkhanov <
>>>> dautkha...@gmail.com> wrote:
>>>>
>>>>> > Zeppelin version: 0.8.0 (merged at September 2017 version)
>>>>>
>>>>> https://issues.apache.org/jira/browse/ZEPPELIN-2898 was merged at the
>>>>> end of September, so I am not sure if you have that.
>>>>>
>>>>> Check out
>>>>> https://medium.com/@zjffdu/zeppelin-0-8-0-new-features-ea53e8810235
>>>>> for how to set this up.
>>>>>
>>>>> --
>>>>> Ruslan Dautkhanov
>>>>>
>>>>> On Tue, Mar 13, 2018 at 5:24 PM, Jhon Anderson Cardenas Diaz <
>>>>> jhonderson2...@gmail.com> wrote:
>>>>>
>>>>>> Hi Zeppelin users!
>>>>>>
>>>>>> I am working with Zeppelin pointing to a Spark standalone cluster. I
>>>>>> am trying to figure out a way to make Zeppelin run the Spark driver
>>>>>> outside of the client process that submits the application.
>>>>>>
>>>>>> According to the documentation
>>>>>> (http://spark.apache.org/docs/2.1.1/spark-standalone.html):
>>>>>>
>>>>>> *For standalone clusters, Spark currently supports two deploy modes.
>>>>>> In client mode, the driver is launched in the same process as the client
>>>>>> that submits the application. In cluster mode, however, the driver is
>>>>>> launched from one of the Worker processes inside the cluster, and the
>>>>>> client process exits as soon as it fulfills its responsibility of
>>>>>> submitting the application without waiting for the application to
>>>>>> finish.*
>>>>>>
>>>>>> The problem is that, even when I set the master to the Spark
>>>>>> standalone cluster and the deploy mode to cluster, the driver still
>>>>>> runs on the Zeppelin machine (according to the Executors page of the
>>>>>> Spark UI). These are the properties that I am setting for the Spark
>>>>>> interpreter:
>>>>>>
>>>>>> master: spark://<master-name>:7077
>>>>>> spark.submit.deployMode: cluster
>>>>>> spark.executor.memory: 16g
>>>>>>
>>>>>> Any ideas would be appreciated.
>>>>>>
>>>>>> Thank you
>>>>>>
>>>>>> Details:
>>>>>> Spark version: 2.1.1
>>>>>> Zeppelin version: 0.8.0 (merged at September 2017 version)

--
Thanks & Regards,
Ankit.
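
For comparison, the interpreter properties quoted in the original post map onto ordinary spark-submit options. The sketch below only assembles and prints such a command for a standalone master in cluster deploy mode; it does not run anything and it is not how Zeppelin builds its own command line. The main class and jar path are hypothetical placeholders, and `<master-name>` is left unfilled as in the thread. Submitting a small application this way, outside Zeppelin, may help confirm whether the cluster itself launches the driver on a worker.

```python
import shlex


def build_spark_submit(master, deploy_mode, conf, main_class, app_jar):
    """Assemble a spark-submit argument list (nothing is executed here)."""
    cmd = ["spark-submit", "--master", master, "--deploy-mode", deploy_mode]
    # Each Spark property is passed as its own --conf key=value pair.
    for key, value in sorted(conf.items()):
        cmd += ["--conf", f"{key}={value}"]
    cmd += ["--class", main_class, app_jar]
    return cmd


cmd = build_spark_submit(
    master="spark://<master-name>:7077",    # placeholder, as in the thread
    deploy_mode="cluster",                  # driver should start on a worker
    conf={"spark.executor.memory": "16g"},
    main_class="com.example.MyApp",         # hypothetical class name
    app_jar="/path/to/my-app.jar",          # hypothetical jar path
)

# Print a shell-safe rendering of the command for copy and paste.
print(" ".join(shlex.quote(part) for part in cmd))
```

In standalone cluster mode the application jar generally has to be reachable from the worker that ends up hosting the driver, so a path that exists only on the submitting machine may not be enough.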