Hi.
I have been having trouble connecting to a Spark cluster remotely. The
cluster is configured to run on YARN.
Can anyone guide me or provide any step-by-step instructions for connecting
remotely via spark-shell?
Here's the setup that I am using:
Each node of the Spark cluster runs as a Docker container hosted on a VM,
and the cluster uses YARN to schedule resources for computations.
I have a dedicated Docker container acting as a Spark client. On it I have
spark-shell installed (a standalone Spark binary distribution), and the
Hadoop and YARN config directories are set so that spark-shell can
coordinate with the ResourceManager for resources.
With all of this set, I tried the following command:

spark-shell --master yarn --deploy-mode client

This gives me a Scala-based console; however, when I check the
ResourceManager UI on the cluster, there seems to be no application/Spark
session running.
I was expecting the driver to run on the client machine and the executors
to run in the cluster, but that doesn't seem to happen.
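For example, I would expect even a trivial action like the one below, typed
at the spark-shell prompt, to show up as a running application in the
ResourceManager UI, with executors allocated on the cluster nodes:

scala> sc.master                          // should print "yarn" if the shell is really talking to YARN
scala> sc.parallelize(1 to 1000).count()  // a trivial action that should need executors on the cluster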

How can I achieve this?
Is what I am trying to do feasible, and if so, is it good practice?

Thanks & Regards,
Rishikesh
