Hey Spark Community,

Our JupyterHub/JupyterLab (with the Spark client) runs behind two layers of
HAProxy, and the YARN cluster runs remotely. We want to use deploy mode
'client' so that we can capture the output of any Spark SQL query in
JupyterLab. I'm aware of alternatives like Livy and Spark Connect;
however, we want to avoid using any of these for the moment.

With Spark 3, we have seen that during Spark session creation itself, the
*ApplicationMaster*'s attempts to talk back to the driver fail with the
exception *'awaitResults - Too Large Frame: 5211803372140375592 -
Connection closed'*. This doesn't look like the real cause, because it
happens while merely creating the Spark session, without any query load.

We are setting the configs *spark.driver.host*, *spark.driver.port*,
*spark.driver.blockManager.port* and *spark.driver.bindAddress*. We have
verified that the host and the ports used in these configs are already
reachable from outside.
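For context, the configs are applied roughly as below when building the
session (the hostname and port numbers here are placeholders, not our real
values):

```python
# Placeholder driver-side settings; spark.driver.host is the address the
# ApplicationMaster should connect back to (the HAProxy-facing one), while
# spark.driver.bindAddress is what the driver binds to locally in the
# JupyterLab container.
driver_conf = {
    "spark.driver.host": "jupyter.example.com",  # placeholder external host
    "spark.driver.port": "41000",                # placeholder, forwarded by HAProxy
    "spark.driver.blockManager.port": "41001",   # placeholder, forwarded by HAProxy
    "spark.driver.bindAddress": "0.0.0.0",       # bind on all local interfaces
}

# These are then passed to the SparkSession builder roughly like:
#   from pyspark.sql import SparkSession
#   builder = (SparkSession.builder
#              .master("yarn")
#              .config("spark.submit.deployMode", "client"))
#   for key, value in driver_conf.items():
#       builder = builder.config(key, value)
#   spark = builder.getOrCreate()
for key, value in driver_conf.items():
    print(f"{key}={value}")
```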

Does Spark support this kind of proxied driver communication? Any
suggestions on how to debug this further?

-- 
Thanks,
Sunayan Saikia
