To put it simply, what configurations need to be done on the client machine
so that it can run the driver on itself and the executors on the
Spark-on-YARN cluster nodes?
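
To make the question concrete: with HADOOP_CONF_DIR and YARN_CONF_DIR already
pointing at the cluster's config (as described in the setup below), is it
essentially just a matter of making the driver reachable from the cluster
nodes, along these lines? (The hostname and ports here are placeholders, not
values from an actual setup.)

    # spark.driver.host must be an address of this client that the YARN nodes
    # can route back to; pinning the driver/block-manager ports makes them
    # easier to open through the Docker/VM networking in between.
    spark-shell \
      --master yarn \
      --deploy-mode client \
      --conf spark.driver.host=client.example.com \
      --conf spark.driver.port=40000 \
      --conf spark.blockManager.port=40010

Or is something more than that needed?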

On Mon, Apr 22, 2019, 8:22 PM Rishikesh Gawade <rishikeshg1...@gmail.com>
wrote:

> Hi.
> I have been experiencing trouble while trying to connect to a Spark
> cluster remotely. This Spark cluster is configured to run using YARN.
> Can anyone guide me or provide any step-by-step instructions for
> connecting remotely via spark-shell?
> Here's the setup that I am using:
> The Spark cluster is running with each node as a Docker container hosted
> on a VM. It uses YARN for scheduling resources for computations.
> I have a dedicated Docker container acting as a Spark client, on which I
> have spark-shell installed (Spark binary in a standalone setup) and also
> the Hadoop and YARN config directories set so that spark-shell can
> coordinate with the RM for resources.
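> (Roughly speaking, by "config directories set" I mean the container's
> environment has something like the following, where the path is only an
> example, pointing at copies of the cluster's core-site.xml, hdfs-site.xml
> and yarn-site.xml:
>
>     # location of the cluster's Hadoop/YARN client configuration on this container
>     export HADOOP_CONF_DIR=/opt/hadoop/conf
>     export YARN_CONF_DIR=/opt/hadoop/conf
> )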
> With all of this set, I tried using the following command:
>
> spark-shell --master yarn --deploy-mode client
>
> This results in spark-shell giving me a Scala-based console; however,
> when I check the ResourceManager UI on the cluster, there seems to be no
> application/Spark session running.
> I was expecting the driver to run on the client machine and the executors
> to run in the cluster, but that doesn't seem to happen.
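> (I assume that, if the shell had actually registered with YARN, it would
> also show up on the ResourceManager node under something like:
>
>     # list running YARN applications from a node that has the yarn CLI
>     yarn application -list -appStates RUNNING
>
> but I may be checking the wrong thing.)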
>
> How can I achieve this?
> Is what I am trying to do feasible, and if so, is it a good practice?
>
> Thanks & Regards,
> Rishikesh
>
