Standalone mode already implies you are running in cluster (distributed)
mode, i.e. it's one of the four available cluster manager options. The
difference is that Standalone uses its own resource manager rather than,
for example, YARN.
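
To illustrate (the master host, class name and jar path below are just
placeholders, not from this thread), submitting an app to a standalone
master with deploy mode "cluster" looks roughly like:

./bin/spark-submit \
  --master spark://<master-host>:7077 \
  --deploy-mode cluster \
  --class com.example.MyApp \
  /path/to/my-app.jar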
If you are running Docker on a single machine then you are limited to that
machine, but if you run Docker on a cluster and deploy your Spark containers
across it, then you get your distribution and cluster mode.
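
For example, on whichever host/container you pick as the master, you would
start it roughly like this (7077 and 8080 are the standalone defaults; the
sbin path depends on where Spark is installed in your image):

./sbin/start-master.sh
# master URL becomes spark://<master-hostname>:7077, web UI on port 8080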
If you are referring to scalability, then you register worker nodes when you
need to scale. You do that by registering a VM/container as a worker node,
as per the docs, using:

./sbin/start-worker.sh <master-spark-URL>

You can create a new Docker container from your base image and run the
above command at bootstrap; that registers a new worker node and scales
your cluster out whenever you want.
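
A minimal sketch, assuming your image already has Spark under /opt/spark and
the master is reachable as spark-master (image name and paths are
placeholders); SPARK_NO_DAEMONIZE keeps the worker process in the foreground
so the container doesn't exit:

docker run -d --name spark-worker-1 \
  -e SPARK_NO_DAEMONIZE=1 \
  my-spark-base:latest \
  /opt/spark/sbin/start-worker.sh spark://spark-master:7077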

And if you kill those containers you scale down (I think this is roughly how
Databricks autoscaling works). I am not sure about k8s, TBH; perhaps it
handles this more gracefully.
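
For the scale-down part, stopping the worker container is enough and the
master will eventually mark that worker as DEAD, e.g. (container name taken
from the earlier sketch):

docker stop spark-worker-1 && docker rm spark-worker-1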


On Sat, Jul 24, 2021 at 3:38 PM Dinakar Chennubotla <
chennu.bigd...@gmail.com> wrote:

> Hi Khalid Mammadov,
>
> Thank you for your response,
> Yes, I did; I built a standalone apache spark cluster on docker containers.
>
> But I am looking for a distributed spark cluster,
> where spark workers are scalable and spark "deployment mode = cluster".
>
> Source url I used to built standalone apache spark cluster
> https://www.kdnuggets.com/2020/07/apache-spark-cluster-docker.html
>
> If you have documentation on the distributed spark setup I am looking for,
> could you please send it to me?
>
>
> Thanks,
> Dinakar
>
> On Sat, 24 Jul, 2021, 19:32 Khalid Mammadov, <khalidmammad...@gmail.com>
> wrote:
>
>> Hi,
>>
>> Have you checked out docs?
>> https://spark.apache.org/docs/latest/spark-standalone.html
>>
>> Thanks,
>> Khalid
>>
>> On Sat, Jul 24, 2021 at 1:45 PM Dinakar Chennubotla <
>> chennu.bigd...@gmail.com> wrote:
>>
>>> Hi All,
>>>
>>> I am Dinakar, Hadoop admin,
>>> could someone help me here,
>>>
>>> 1. I have a DEV-POC task to do,
>>> 2. Need to install a distributed apache-spark cluster in cluster mode
>>> on Docker containers,
>>> 3. with scalable spark-worker containers.
>>> 4. We have a 9-node cluster with some other services and tools.
>>>
>>> Thanks,
>>> Dinakar
>>>
>>
