Interesting.

Spark supports the following cluster managers:


   - Standalone: a simple cluster manager shipped with Spark; limited in
   features but easy to set up.
   - Apache Hadoop YARN: the most widely used resource manager, not just
   for Spark but for the wider Hadoop ecosystem. YARN is used extensively
   on-premise, and in the cloud it underpins managed Hadoop/Spark offerings
   such as Google Dataproc.
   - Kubernetes (k8s): Spark has run natively on Kubernetes since
   Spark 2.3.
   - Apache Mesos: an open-source cluster manager that was once popular
   but is now in decline (deprecated as of Spark 3.2).
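
Each of these corresponds to a distinct --master URL at submission time. A quick sketch (host names and ports below are placeholders, not from any particular setup):

```shell
# Master URLs for each cluster manager; hosts/ports are placeholders.
spark-submit --master spark://master-host:7077 app.py       # Standalone
spark-submit --master yarn app.py                           # YARN (reads HADOOP_CONF_DIR)
spark-submit --master k8s://https://api-server:6443 app.py  # Kubernetes (Spark 2.3+)
spark-submit --master mesos://mesos-host:5050 app.py        # Mesos
```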

Now, as I understand it, you are utilising both Spark Standalone and k8s.
What is perhaps missing is an architecture diagram of your setup. Do you
have one, or can you create such a diagram?
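
In the meantime, since the workers cannot reach the driver's randomly chosen ports, one avenue worth trying is to pin the driver's RPC and block-manager ports and advertise a host name that is routable from outside the k8s cluster, then expose exactly those two ports. This is only a sketch; the master host, DNS name and port numbers are assumptions, not taken from your setup:

```shell
# Sketch only: pin the two callback ports the standalone workers need
# and advertise an address reachable from outside the k8s cluster.
# driver-svc.apps.example.com and the port numbers are assumptions.
spark-submit \
  --master spark://standalone-master:7077 \
  --conf spark.driver.bindAddress=0.0.0.0 \
  --conf spark.driver.host=driver-svc.apps.example.com \
  --conf spark.driver.port=35000 \
  --conf spark.driver.blockManager.port=35001 \
  my_app.py
```

Note that worker-to-driver traffic is raw TCP (Spark's internal RPC), not HTTP, which is why a standard HTTP Ingress cannot route it; you would need a TCP-capable entry point such as a LoadBalancer service or an ingress controller configured for TCP streams.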

HTH

Mich Talebzadeh,
Solutions Architect/Engineering Lead
London
United Kingdom


   view my Linkedin profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>


 https://en.everybodywiki.com/Mich_Talebzadeh



*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




On Mon, 21 Aug 2023 at 08:31, Крюков Виталий Семенович
<vkryu...@fil-it.ru.invalid> wrote:

>
> Good afternoon.
> Perhaps you will be discouraged by what I write below, but nevertheless
> I ask for help in solving my problem. Perhaps the architecture of our
> solution will not seem correct to you.
> There are backend services that communicate with a service embedding the
> spark-driver. When that service starts, spark-submit is executed and the
> Spark session lives until the service stops. The service runs constantly.
> We ran into problems when we began to deploy our solution in k8s. The
> services sit inside the k8s cluster, while the Spark Standalone cluster
> is outside k8s. When the driver service starts, spark-submit is executed
> and the application duly appears on the master's UI. But on the workers
> we get an error: the worker cannot connect back to a random port on the
> spark-driver. We have learned to override and pin those ports, but they
> must then be reachable from outside the k8s cluster. We found a solution
> that exposes them via a NodePort, and it works. BUT this is not
> acceptable to most customers due to internal regulations, and we never
> found a way to solve this through an Ingress.
>
>
> with best regards,
>
> Vitaly Kryukov
>