Hi Philipp, okay, I just realized my HUGE misunderstanding!

The "double-spark-submit" pattern is just the standard Spark-on-K8s way of 
running Spark applications in cluster mode:
the 1st "spark-submit", in cluster mode, is started from the client (the 
Zeppelin host, in our case); then the 2nd "spark-submit", in client mode, is 
started by the "/opt/entrypoint.sh" script inside the standard Spark Docker 
image.

At this point I can ask a more precise question:

I see that interpreter.sh starts the RemoteInterpreterServer with, in 
particular, the following parameters: CALLBACK_HOST / PORT.
They refer to the Zeppelin host and its RPC port.

Moreover, when the interpreter starts, it runs a Thrift server on some random 
port.

So, I ask: which communications are supposed to happen, so that I can 
correctly set up my firewall/routing rules?

1. Must the Zeppelin server connect to the interpreter's Thrift server?
2. Must the interpreter's Thrift server connect to the Zeppelin server?
3. Both?

- Which ports must the Zeppelin server / the Thrift server find open on the 
other host?
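To make the question concrete, here is a toy sketch of the handshake as I 
currently understand it (plain Python sockets, NOT Zeppelin's actual Thrift 
code; all names, ports and messages are made up): the interpreter dials the 
Zeppelin callback/RPC port to register itself and advertise its Thrift port, 
and Zeppelin then dials back to that port. If this picture is right, both 
directions must be allowed by the firewall.

```python
import socket
import threading

# Shared state for the toy demo (illustrative only).
state = {"ready": threading.Event(), "callback_port": None}
registered, served = [], []

def zeppelin_server():
    # Plays the Zeppelin server: listens on the RPC/callback port,
    # receives the interpreter's advertised Thrift port, then dials back.
    with socket.socket() as srv:
        srv.bind(("127.0.0.1", 0))                 # the RPC/callback port
        srv.listen(1)
        state["callback_port"] = srv.getsockname()[1]
        state["ready"].set()
        conn, _ = srv.accept()
        with conn:
            thrift_port = int(conn.recv(16))       # 1. registration arrives
        with socket.socket() as back:
            back.connect(("127.0.0.1", thrift_port))  # 2. connect back
            back.sendall(b"interpret")
            registered.append(thrift_port)

def interpreter_pod():
    # Plays the interpreter pod: opens a "Thrift" server on a random port,
    # then registers that port with the Zeppelin server via the callback port.
    with socket.socket() as thrift:
        thrift.bind(("127.0.0.1", 0))              # random Thrift port
        thrift.listen(1)
        port = thrift.getsockname()[1]
        with socket.socket() as reg:
            reg.connect(("127.0.0.1", state["callback_port"]))
            reg.sendall(str(port).encode())        # 1. register with Zeppelin
        conn, _ = thrift.accept()                  # 2. Zeppelin dials back
        with conn:
            served.append(conn.recv(16).decode())

t = threading.Thread(target=zeppelin_server)
t.start()
state["ready"].wait()
interpreter_pod()
t.join()
print(registered, served)
```

If that is correct, then the answer to my question 3 would be "both": the 
pod-to-Zeppelin direction needs the RPC port open, and the Zeppelin-to-pod 
direction needs the (random) Thrift port reachable.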

Thank you everybody!

Fabrizio




On 2021/10/26 11:40:24, Philipp Dallig <philipp.dal...@gmail.com> wrote: 
> Hi Fabrizio,
> 
> At the moment I think Zeppelin does not support running Spark jobs in 
> cluster mode. But in fact K8s mode simulates cluster mode, because the 
> Zeppelin interpreter is already started as a pod in K8s, just as a manual 
> spark-submit execution in cluster mode would do.
> 
> Spark-submit is called only once during the start of the Zeppelin 
> interpreter. You will find the call in these lines: 
> https://github.com/apache/zeppelin/blob/2f55fe8ed277b28d71f858633f9c9d76fd18f0c3/bin/interpreter.sh#L303-L305
> 
> Best Regards
> Philipp
> 
> 
> Am 25.10.21 um 21:58 schrieb Fabrizio Fab:
> > Dear All, I have been struggling for more than a week with the following problem.
> > My Zeppelin Server is running outside the k8s cluster (there is a reason 
> > for this) and I am able to run Spark zeppelin notes in Client mode but not 
> > in Cluster mode.
> >
> > I see that, at first, a pod for the interpreter (RemoteInterpreterServer) 
> > is created on the cluster by spark-submit from the Zeppelin host, with 
> > deployMode=cluster (and this happens without errors), then the interpreter 
> > itself runs another spark-submit  (this time from the Pod) with 
> > deployMode=client.
> >
> > Specifically, the following is the command line submitted by the interpreter 
> > from its pod:
> >
> > /opt/spark/bin/spark-submit \
> > --conf spark.driver.bindAddress=<ip address of the interpreter pod> \
> > --deploy-mode client \
> > --properties-file /opt/spark/conf/spark.properties \
> > --class org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer \
> > spark-internal \
> > <ZEPPELIN_HOST> \
> > <ZEPPELIN_SERVER_RPC_PORT> \
> > <interpreter_name>-<user name>
> >
> > At this point, the interpreter Pod remains in "Running" state, while the 
> > Zeppelin note remains in "Pending" forever.
> >
> > The log of the Interpreter (level = DEBUG) at the end only says:
> >   INFO [2021-10-25 18:16:58,229] ({RemoteInterpreterServer-Thread} 
> > RemoteInterpreterServer.java[run]:194) Launching ThriftServer at <ip 
> > address of the interpreter pod>:<random port>
> >   INFO [2021-10-25 18:16:58,229] ({RegisterThread} 
> > RemoteInterpreterServer.java[run]:592) Start registration
> >   INFO [2021-10-25 18:16:58,332] ({RegisterThread} 
> > RemoteInterpreterServer.java[run]:606) Registering interpreter process
> >   INFO [2021-10-25 18:16:58,356] ({RegisterThread} 
> > RemoteInterpreterServer.java[run]:608) Registered interpreter process
> >   INFO [2021-10-25 18:16:58,356] ({RegisterThread} 
> > RemoteInterpreterServer.java[run]:629) Registration finished
> > (I replaced the true ip and port with a placeholder to make the log more 
> > clear for you)
> >
> > I am stuck at this point.
> > Can anyone help me? Thank you in advance. Fabrizio
> >
> 
