Hi. I have been having trouble connecting to a Spark cluster remotely. The cluster is configured to run on YARN. Can anyone guide me or provide step-by-step instructions for connecting remotely via spark-shell?

Here is the setup I am using: each node of the Spark cluster runs as a Docker container hosted on a VM, and the cluster uses YARN to schedule resources for computations. I have a dedicated Docker container acting as a Spark client, on which I have spark-shell installed (the Spark binaries in a standalone setup) and the Hadoop and YARN config directories set, so that spark-shell can coordinate with the ResourceManager for resources.
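For reference, the environment in the client container looks roughly like this (the exact paths are specific to my setup, so treat them as illustrative):

    # Point spark-shell at the cluster's Hadoop/YARN configuration
    export HADOOP_CONF_DIR=/etc/hadoop/conf   # core-site.xml, hdfs-site.xml
    export YARN_CONF_DIR=/etc/hadoop/conf     # yarn-site.xml with the ResourceManager address
    export SPARK_HOME=/opt/spark
    export PATH="$SPARK_HOME/bin:$PATH"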
With all of this in place, I tried the following command:

    spark-shell --master yarn --deploy-mode client

This gives me a Scala console; however, when I check the ResourceManager UI on the cluster, there is no application/Spark session running. I expected the driver to run on the client machine and the executors to run in the cluster, but that does not seem to happen. How can I achieve this? Is what I am trying feasible, and if so, is it good practice?

Thanks & Regards,
Rishikesh
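P.S. For anyone trying to help diagnose this, the two checks I know of are standard spark-submit/YARN options, nothing specific to my setup: launching the shell with --verbose so it prints the configuration it actually resolved, and listing applications with the YARN CLI on the cluster side:

    # Prints the parsed arguments and Spark properties, including which
    # master the shell actually picked up
    spark-shell --master yarn --deploy-mode client --verbose

    # On a cluster node: should show the spark-shell session as RUNNING
    yarn application -list -appStates RUNNING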