Re: Spark on Kubernetes

2024-04-30 Thread Mich Talebzadeh
On Tue, 30 Apr 2024 at 04:29, Tarun raghav wrote: > Respected Sir/Madam, > I am Tarunraghav. I have a query regarding spark on kubernetes. > > We have an eks cluster, within which we have spark installed in the pods. > We set the executor memory as 1G

Spark on Kubernetes

2024-04-29 Thread Tarun raghav
Respected Sir/Madam, I am Tarunraghav. I have a query regarding spark on kubernetes. We have an eks cluster, within which we have spark installed in the pods. We set the executor memory as 1GB and set the executor instances as 2, I have also set dynamic allocation as true. So when I try to read

Re: [Spark on Kubernetes]: Seeking Guidance on Handling Persistent Executor Failures

2024-02-19 Thread Mich Talebzadeh
Thanks for your kind words, Sri. Well, it is true that Spark on Kubernetes is not yet on par with Spark on YARN in maturity, and essentially Spark on Kubernetes is still a work in progress.* So in the first place, IMO, one needs to think about why executors are failing. What causes this behaviour

Re: [Spark on Kubernetes]: Seeking Guidance on Handling Persistent Executor Failures

2024-02-19 Thread Cheng Pan
> On Feb 19, 2024, at 23:59, Sri Potluri wrote: > > Hello Spark Community, > > I am currently leveraging Spark on Kubernetes, managed by the Spark Operator, > for running various Spark applications. While the system generally works > well, I've encountered a challenge

Re: [Spark on Kubernetes]: Seeking Guidance on Handling Persistent Executor Failures

2024-02-19 Thread Sri Potluri
eh-ph-d-5205b2/> >> >> >> https://en.everybodywiki.com/Mich_Talebzadeh >> >> >> >> *Disclaimer:* The information provided is correct to the best of my >> knowledge but of course cannot be guaranteed . It is essential to note >> that, as w

Re: [Spark on Kubernetes]: Seeking Guidance on Handling Persistent Executor Failures

2024-02-19 Thread Mich Talebzadeh
> On Mon, 19 Feb 2024 at 18:34, Sri Potluri wrote: > >> Hello Spark Community, >>

Re: [Spark on Kubernetes]: Seeking Guidance on Handling Persistent Executor Failures

2024-02-19 Thread Mich Talebzadeh
On Mon, 19 Feb 2024 at 18:34, Sri Potluri wrote: > Hello Spark Community, > > I am currently leveraging Spark on Kubernetes, managed by the Spark > Operator, for running various Spark applications. While

[Spark on Kubernetes]: Seeking Guidance on Handling Persistent Executor Failures

2024-02-19 Thread Sri Potluri
Hello Spark Community, I am currently leveraging Spark on Kubernetes, managed by the Spark Operator, for running various Spark applications. While the system generally works well, I've encountered a challenge related to how Spark applications handle executor failures, specifically in scenarios

Re: Regarding Spark on Kubernetes(EKS)

2024-02-19 Thread Jagannath Majhi
Yes, I have gone through it. So please give me the setup. More context: my jar file is written in Java. On Mon, Feb 19, 2024, 8:53 PM Mich Talebzadeh wrote: > Sure but first it would be beneficial to understand the way Spark works on > Kubernetes and the concepts. > > Have a look at

Re: Regarding Spark on Kubernetes(EKS)

2024-02-19 Thread Jagannath Majhi
I am not using any private Docker image. I am only running the jar file in EMR using the spark-submit command. Now I want to run this jar file in EKS, so can you please tell me how to set this up? On Mon, Feb 19, 2024, 8:06 PM Jagannath Majhi < jagannath.ma...@cloud.cbnits.com> wrote: >

Re: Regarding Spark on Kubernetes(EKS)

2024-02-19 Thread Mich Talebzadeh
Sure, but first it would be beneficial to understand the way Spark works on Kubernetes and the concepts. Have a look at this article of mine: Spark on Kubernetes, A Practitioner’s Guide <https://www.linkedin.com/pulse/spark-kubernetes-practitioners-guide-mich-talebzadeh-ph-d-%3Ftrackin

Re: Regarding Spark on Kubernetes(EKS)

2024-02-19 Thread Mich Talebzadeh
OK, you have a jar file that you want to run using Spark on k8s (EKS) as the execution engine, as opposed to YARN on EMR as the execution engine?

Re: Regarding Spark on Kubernetes(EKS)

2024-02-19 Thread Mich Talebzadeh
Where is your docker file? In the ECR container registry? If you are going to use EKS, then it needs to be accessible to all nodes of the cluster. When you build your docker image, put your jar under the $SPARK_HOME directory. Then add a line to your docker build file as below. Here I am accessing Google

Re: Regarding Spark on Kubernetes(EKS)

2024-02-19 Thread Richard Smith
I run my Spark jobs in GCP with Google Dataproc using GCS buckets. I've not used AWS, but its EMR product offers similar functionality to Dataproc. The title of your post implies your Spark cluster runs on EKS. You might be better off using EMR, see links below: EMR

Regarding Spark on Kubernetes(EKS)

2024-02-19 Thread Jagannath Majhi
Dear Spark Community, I hope this email finds you well. I am reaching out to seek assistance and guidance regarding a task I'm currently working on involving Apache Spark. I have developed a JAR file that contains some Spark applications and functionality, and I need to run this JAR file within

Elasticity and scalability for Spark in Kubernetes

2023-10-30 Thread Mich Talebzadeh
I was thinking along the lines of elasticity and autoscaling for Spark in the context of Kubernetes. My experience with Kubernetes and Spark on the so-called autopilot has not been that great. This is mainly because in autopilot you let the choice of nodes be decided by the vendor's

Re: Seeking Guidance on Spark on Kubernetes Secrets Configuration

2023-10-01 Thread Jon Rodríguez Aranguren
this message finds you all in good health and spirits. >> >> I'm reaching out to the collective expertise of this esteemed community >> with a query regarding Spark on Kubernetes. As a newcomer, I have always >> admired the depth and breadth of knowledge shared within thi

Re: Seeking Guidance on Spark on Kubernetes Secrets Configuration

2023-10-01 Thread Jörn Franke
e of this esteemed community with a query regarding Spark on Kubernetes. As a newcomer, I have always admired the depth and breadth of knowledge shared within this forum, and it is my hope that some of you might have insights on a specific challenge I'm facing. I am currently trying to configure multiple

Re: Seeking Guidance on Spark on Kubernetes Secrets Configuration

2023-10-01 Thread Jörn Franke
eb Jon Rodríguez Aranguren <jon.r.arangu...@gmail.com>: Dear Spark Community Members, I trust this message finds you all in good health and spirits. I'm reaching out to the collective expertise of this esteemed community with a query regarding Spark on Kubernetes. As a newcomer, I have alwa

Re: Seeking Guidance on Spark on Kubernetes Secrets Configuration

2023-10-01 Thread Mich Talebzadeh
>>> jon.r.arangu...@gmail.com>: >> >> Dear Spark Community Members, >> >> I trust this message finds you all in good health and spirits. >> >> I'm reaching out to the collective expertise of this esteemed community >> with a query re

Re: Seeking Guidance on Spark on Kubernetes Secrets Configuration

2023-09-30 Thread Jayabindu Singh
> Dear Spark Community Members, > > I trust this message finds you all in good health and spirits. > > I'm reaching out to the collective expertise of this esteemed community > with a query regarding Spark on Kubernetes. As a newcomer, I have always > admired the depth and b

Re: Seeking Guidance on Spark on Kubernetes Secrets Configuration

2023-09-30 Thread Jörn Franke
uez Aranguren > : > > Dear Spark Community Members, > > I trust this message finds you all in good health and spirits. > > I'm reaching out to the collective expertise of this esteemed community with > a query regarding Spark on Kubernetes. As a newcomer, I ha

Seeking Guidance on Spark on Kubernetes Secrets Configuration

2023-09-29 Thread Jon Rodríguez Aranguren
Dear Spark Community Members, I trust this message finds you all in good health and spirits. I'm reaching out to the collective expertise of this esteemed community with a query regarding Spark on Kubernetes. As a newcomer, I have always admired the depth and breadth of knowledge shared within

SPIP: Adding work load identity to Spark on Kubernetes documents (supersedes Secret Management)

2023-02-20 Thread Mich Talebzadeh
Hi, I would like to propose that the current Secret Management <https://spark.apache.org/docs/latest/running-on-kubernetes.html#secret-management> section in the Spark Kubernetes documentation include the more secure credentials (Workload Identity) for Spark to access Kubernetes services. Both

Re: Running Spark on Kubernetes (GKE) - failing on spark-submit

2023-02-16 Thread Mich Talebzadeh
>>> It may help to check this article of mine >> >> >> Spark on Kubernetes, A Practitioner’s Guide >> <https://www.linkedin.com/pulse/spark-kubernetes-practitioners-guide-mich-talebzadeh-ph-d-/?trackingId=FDQORri0TBeJl02p3D%2B2JA%3D%3D> >> >> >> HTH

Re: Running Spark on Kubernetes (GKE) - failing on spark-submit

2023-02-15 Thread karan alang
Thanks, Mich. Let me check this. On Wed, Feb 15, 2023 at 1:42 AM Mich Talebzadeh wrote: > > It may help to check this article of mine > > > Spark on Kubernetes, A Practitioner’s Guide > <https://www.linkedin.com/pulse/spark-kubernetes-practitioners-guide-mich-talebza

Re: Running Spark on Kubernetes (GKE) - failing on spark-submit

2023-02-15 Thread Mich Talebzadeh
It may help to check this article of mine: Spark on Kubernetes, A Practitioner’s Guide <https://www.linkedin.com/pulse/spark-kubernetes-practitioners-guide-mich-talebzadeh-ph-d-/?trackingId=FDQORri0TBeJl02p3D%2B2JA%3D%3D> HTH

Re: Running Spark on Kubernetes (GKE) - failing on spark-submit

2023-02-15 Thread Mich Talebzadeh
Your submit command spark-submit --master k8s://https://34.74.22.140:7077 --deploy-mode cluster --name pyspark-example --conf spark.kubernetes.container.image=pyspark-example:0.1 --conf spark.kubernetes.file.upload.path=/myexample src/StructuredStream-on-gke.py pay attention to what it says

Re: Running Spark on Kubernetes (GKE) - failing on spark-submit

2023-02-14 Thread karan alang
Hi Ye, This is the error I get when I don't set spark.kubernetes.file.upload.path. Any ideas on how to fix this? ``` Exception in thread "main" org.apache.spark.SparkException: Please specify spark.kubernetes.file.upload.path property. at

Re: Running Spark on Kubernetes (GKE) - failing on spark-submit

2023-02-14 Thread Ye Xianjin
The configuration of ‘…file.upload.path’ is wrong. It specifies a distributed filesystem path where your archives, resources, and jars are stored temporarily before Spark distributes them to drivers and executors. In your case, you don’t need to set this configuration. On Feb 14, 2023, at 5:43 AM, karan alang

Re: Running Spark on Kubernetes (GKE) - failing on spark-submit

2023-02-14 Thread Khalid Mammadov
I am not a k8s expert, but I think you have a permission issue. Try 777 as an example to see if it works. On Mon, 13 Feb 2023, 21:42 karan alang, wrote: > Hello All, > > I'm trying to run a simple application on GKE (Kubernetes), and it is > failing: > Note : I have spark(bitnami spark chart)

Running Spark on Kubernetes (GKE) - failing on spark-submit

2023-02-13 Thread karan alang
Hello All, I'm trying to run a simple application on GKE (Kubernetes), and it is failing: Note : I have spark(bitnami spark chart) installed on GKE using helm install Here is what is done : 1. created a docker image using Dockerfile Dockerfile : ``` FROM python:3.7-slim RUN apt-get update &&

Re: spark on kubernetes

2022-10-16 Thread Qian Sun
Glad to hear it! On Sun, Oct 16, 2022 at 2:37 PM Mohammad Abdollahzade Arani < mamadazar...@gmail.com> wrote: > Hi Qian, > Thanks for the reply, and I'm so sorry for the late response. > I found the answer. My mistake was token conversion. I had to > base64-decode the service account's token and
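The fix mentioned in this reply, base64-decoding the service account token before handing it to Spark, might look roughly like the sketch below. It assumes the token was read from a Kubernetes Secret, where values in the `data` field are stored base64-encoded. The helper name `decode_secret_value` and the sample token are hypothetical, and the use of `spark.kubernetes.authenticate.submission.oauthToken` is an assumption, since the thread snippet does not show which setting the poster used.

```python
import base64


def decode_secret_value(encoded: str) -> str:
    """Decode a base64-encoded value as found in a Kubernetes Secret's `data` field."""
    return base64.b64decode(encoded).decode("utf-8")


# Hypothetical placeholder standing in for a real service account token.
raw_token = "example-service-account-token"

# Simulate what the Secret would contain: the token in base64 form.
encoded_token = base64.b64encode(raw_token.encode("utf-8")).decode("ascii")

# Decode before use; passing the still-encoded string would fail authentication.
decoded_token = decode_secret_value(encoded_token)

# Assumed target setting for the decoded token (not confirmed by the thread).
submit_conf = {
    "spark.kubernetes.authenticate.submission.oauthToken": decoded_token,
}
```

The point of the sketch is only the decode step: the value stored in the Secret and the value the API server expects as a bearer token are not the same string.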

Re: spark on kubernetes

2022-10-15 Thread Qian Sun
Hi Mohammad Did you try this command? ./bin/spark-submit \ --master k8s://https://vm13:6443 \ --class com.example.WordCounter \ --conf spark.kubernetes.authenticate.driver.serviceAccountName=default \ --conf

spark on kubernetes

2022-10-15 Thread Mohammad Abdollahzade Arani
I have a k8s cluster and a spark cluster. My question is as below: https://stackoverflow.com/questions/74053948/how-to-resolve-pods-is-forbidden-user-systemanonymous-cannot-watch-resourc I have searched and I found lots of other similar questions on stackoverflow without an answer like

trouble using spark in kubernetes

2022-05-03 Thread Andreas Klos
; ) spark = SparkSession.builder.config(conf=spark_conf).getOrCreate() sc = spark.sparkContext t = sc.parallelize(range(10)) r = t.sumApprox(3) print('Approximate sum: %s' % r) Unfortunately, It does not work... with kubectl describe po podname-exec-1 I get the following error message: Error:

Re: One click to run Spark on Kubernetes

2022-02-23 Thread bo yang
gt;>>> However, that depends on setting up access permission, use of service >>>>> accounts, pulling the correct dockerfiles for the driver and the >>>>> executors. >>>>> Those details add to the complexity. >>>>> >>>&

Re: One click to run Spark on Kubernetes

2022-02-23 Thread Bjørn Jørgensen

Re: One click to run Spark on Kubernetes

2022-02-23 Thread bo yang
; >>> >>> https://en.everybodywiki.com/Mich_Talebzadeh >>> >>> >>> >>> *Disclaimer:* Use it at your own risk. Any and all responsibility for >>> any loss, damage or destruction of data or any other property which may >>> arise

Re: One click to run Spark on Kubernetes

2022-02-23 Thread Sarath Annareddy

Re: One click to run Spark on Kubernetes

2022-02-23 Thread bo yang

Re: One click to run Spark on Kubernetes

2022-02-23 Thread Sarath Annareddy

Re: One click to run Spark on Kubernetes

2022-02-23 Thread Bitfox

Re: One click to run Spark on Kubernetes

2022-02-23 Thread bo yang
> On Wed, 23 Feb 2022 at 04:06, bo yang wrote: > >> Hi Spark Community, >> >> We built an open source tool to deploy and run Spark on Kubernetes with a >> one click

Re: One click to run Spark on Kubernetes

2022-02-22 Thread Mich Talebzadeh
On Wed, 23 Feb 2022 at 04:06, bo yang wrote: > Hi Spark Community, > > We built an open source tool to deploy and run Spark on Kubernetes wit

Re: One click to run Spark on Kubernetes

2022-02-22 Thread bo yang
>>> >>> Thanks >>> >>> On Wed, Feb 23, 2022 at 12:21 PM bo yang wrote: >>> >>>> It is not a standalone spark cluster. In some details, it deploys a >>>> Spark Operator ( >>>> https://github.com/GoogleCloudPlatform/spar

Re: One click to run Spark on Kubernetes

2022-02-22 Thread amihay gonen
12:21 PM bo yang wrote: >> >>> It is not a standalone spark cluster. In some details, it deploys a >>> Spark Operator ( >>> https://github.com/GoogleCloudPlatform/spark-on-k8s-operator) and an >>> extra REST Service. When people submit Spark application t

Re: One click to run Spark on Kubernetes

2022-02-22 Thread bo yang
rm/spark-on-k8s-operator) >> and an extra REST Service. When people submit Spark application to that >> REST Service, the REST Service will create a CRD inside the >> Kubernetes cluster. Then Spark Operator will pick up the CRD and launch the >> Spark application. The one cl

Re: One click to run Spark on Kubernetes

2022-02-22 Thread Prasad Paravatha
an extra REST Service. When people submit Spark application to that > REST Service, the REST Service will create a CRD inside the > Kubernetes cluster. Then Spark Operator will pick up the CRD and launch the > Spark application. The one click tool intends to hide these details, so > peo

Re: One click to run Spark on Kubernetes

2022-02-22 Thread Bitfox
ome details, it deploys a Spark > Operator (https://github.com/GoogleCloudPlatform/spark-on-k8s-operator) > and an extra REST Service. When people submit Spark application to that > REST Service, the REST Service will create a CRD inside the > Kubernetes cluster. Then Spark Operat

Re: One click to run Spark on Kubernetes

2022-02-22 Thread bo yang
ion of spark? or just the standalone node? > > Thanks > > On Wed, Feb 23, 2022 at 12:06 PM bo yang wrote: > >> Hi Spark Community, >> >> We built an open source tool to deploy and run Spark on Kubernetes with a >> one click command. For example, on AWS, it co

Re: One click to run Spark on Kubernetes

2022-02-22 Thread Bitfox
Can it be a cluster installation of spark? or just the standalone node? Thanks On Wed, Feb 23, 2022 at 12:06 PM bo yang wrote: > Hi Spark Community, > > We built an open source tool to deploy and run Spark on Kubernetes with a > one click command. For example, on AWS, it could a

One click to run Spark on Kubernetes

2022-02-22 Thread bo yang
Hi Spark Community, We built an open source tool to deploy and run Spark on Kubernetes with a one click command. For example, on AWS, it could automatically create an EKS cluster, node group, NGINX ingress, and Spark Operator. Then you will be able to use curl or a CLI tool to submit Spark

Shuffle in Spark with Kubernetes

2021-10-27 Thread Mich Talebzadeh
As I understand it, Spark releases > 3 currently do not support external shuffle. Is there any timeline for when this could be available? For now we have two parameters for Dynamic Resource Allocation. These are --conf spark.dynamicAllocation.enabled=true \ --conf

Re: Spark on Kubernetes scheduler variety

2021-07-08 Thread Mich Talebzadeh
her such an approach and the so-called democratization >>>> of Spark on whatever platform is really should be of great focus. >>>> >>>> Having worked on Google Dataproc <https://cloud.google.com/dataproc> (A >>>> fully >>>> managed and h

Re: Spark on Kubernetes scheduler variety

2021-07-08 Thread Holden Karau
tch scheduling on Kubernetes could be rewarding. However, if I >>> may say I doubt whether such an approach and the so-called democratization >>> of Spark on whatever platform is really should be of great focus. >>> >>> Having worked on Google Dataproc <https:

Re: Spark on Kubernetes scheduler variety

2021-06-24 Thread Mich Talebzadeh
Hi Holden, Thank you for your points. I guess coming from a corporate world I had an oversight on how an open source project like Spark does leverage resources and interest :). As @KlausMa kindly volunteered it would be good to hear scheduling ideas on Spark on Kubernetes and of course as I am

Re: Spark on Kubernetes scheduler variety

2021-06-24 Thread Holden Karau
>> of Spark on whatever platform is really should be of great focus. >> >> Having worked on Google Dataproc <https://cloud.google.com/dataproc> (A fully >> managed and highly scalable service for running Apache Spark, Hadoop and >> more recently other artefacts) fo

Re: Spark on Kubernetes scheduler variety

2021-06-24 Thread Lalwani, Jayesh
You can always chain aggregations by chaining multiple Structured Streaming jobs. It’s not a showstopper. Getting Spark on Kubernetes is important for organizations that want to pursue a multi-cloud strategy From: Mich Talebzadeh Date: Wednesday, June 23, 2021 at 11:27 AM To: "user @

Re: Spark on Kubernetes scheduler variety

2021-06-24 Thread John Zhuge
<https://cloud.google.com/dataproc> (A fully >> managed and highly scalable service for running Apache Spark, Hadoop and >> more recently other artefacts) for that past two years, and Spark on >> Kubernetes on-premise, I have come to the conclusion that Spark is not a >> bea

Re: Spark on Kubernetes scheduler variety

2021-06-24 Thread Mich Talebzadeh
ully >> managed and highly scalable service for running Apache Spark, Hadoop and >> more recently other artefacts) for that past two years, and Spark on >> Kubernetes on-premise, I have come to the conclusion that Spark is not a >> beast that one can fully commoditiz

Re: Spark on Kubernetes scheduler variety

2021-06-23 Thread Klaus Ma
> Having worked on Google Dataproc <https://cloud.google.com/dataproc> (A fully > managed and highly scalable service for running Apache Spark, Hadoop and > more recently other artefacts) for that past two years, and Spark on > Kubernetes on-premise, I have come to the conclusion

Re: Spark on Kubernetes scheduler variety

2021-06-23 Thread Mich Talebzadeh
democratization of Spark on whatever platform is really should be of great focus. Having worked on Google Dataproc <https://cloud.google.com/dataproc> (A fully managed and highly scalable service for running Apache Spark, Hadoop and more recently other artefacts) for that past two years, and

Re: Question on spark on Kubernetes

2021-05-20 Thread Gourav Sengupta
at 9:50 PM Mithalee Mohapatra < mithaleemohapa...@gmail.com> wrote: > Hi, > I am currently trying to run spark submit in Kubernetes. I have set up the > IAM roles for serviceaccount and generated the ARN. I am trying to use the > "spark.hadoop.fs.s3a.

Question on spark on Kubernetes

2021-05-20 Thread Mithalee Mohapatra
Hi, I am currently trying to run spark submit in Kubernetes. I have set up the IAM roles for serviceaccount and generated the ARN. I am trying to use the "spark.hadoop.fs.s3a.fast.upload=true --conf spark.hadoop.fs.s3a.aws.credentials.provider=com.amazonaws.auth.WebIdentityTokenCredentialsPro

Re: [Spark in Kubernetes] Question about running in client mode

2021-04-27 Thread Shiqi Sun
n read here: > https://spark.apache.org/docs/latest/submitting-applications.html > > Best Regards, > Attila > > On Tue, Apr 27, 2021 at 12:03 AM Shiqi Sun wrote: > >> Hi Spark User group, >> >> I have a couple of quick questions about running Spark i

Re: [Spark in Kubernetes] Question about running in client mode

2021-04-26 Thread Attila Zsolt Piros
stions about running Spark in Kubernetes > between different deploy modes. > > As specified in > https://spark.apache.org/docs/latest/running-on-kubernetes.html#client-mode, > since Spark 2.4, client mode support is available when running in > Kubernetes, and it says "when y

[Spark in Kubernetes] Question about running in client mode

2021-04-26 Thread Shiqi Sun
Hi Spark User group, I have a couple of quick questions about running Spark in Kubernetes between different deploy modes. As specified in https://spark.apache.org/docs/latest/running-on-kubernetes.html#client-mode, since Spark 2.4, client mode support is available when running in Kubernetes

Re: Dynamic Allocation Backlog Property in Spark on Kubernetes

2021-04-10 Thread ranju goel
chedulerbacklogtimeout (say 15 mins) and speeds up the >> job. >> >> >> [image: image.png] >> >> Best Regards >> >> >> >> >> >> *From:* Attila Zsolt Piros >> *Sent:* Friday, April 9, 2021 11:11 AM >> *To:* Ranju

Re: Dynamic Allocation Backlog Property in Spark on Kubernetes

2021-04-10 Thread Attila Zsolt Piros
rds > > > > > > *From:* Attila Zsolt Piros > *Sent:* Friday, April 9, 2021 11:11 AM > *To:* Ranju Jain > *Cc:* user@spark.apache.org > *Subject:* Re: Dynamic Allocation Backlog Property in Spark on Kubernetes > > > > You should not set "spark.dynamicAllocatio

Re: Dynamic Allocation Backlog Property in Spark on Kubernetes

2021-04-10 Thread ranju goel
Piros *Sent:* Friday, April 9, 2021 11:11 AM *To:* Ranju Jain *Cc:* user@spark.apache.org *Subject:* Re: Dynamic Allocation Backlog Property in Spark on Kubernetes You should not set "spark.dynamicAllocation.schedulerBacklogTimeout" so high and the purpose of this config is very diff

Re: Dynamic Allocation Backlog Property in Spark on Kubernetes

2021-04-08 Thread Attila Zsolt Piros
> > Regards > > Ranju > > > > > > *From:* Attila Zsolt Piros > *Sent:* Friday, April 9, 2021 12:13 AM > *To:* Ranju Jain > *Cc:* user@spark.apache.org > *Subject:* Re: Dynamic Allocation Backlog Property in Spark on Kubernetes > > > > Hi! >

RE: Dynamic Allocation Backlog Property in Spark on Kubernetes

2021-04-08 Thread Ranju Jain
To: Ranju Jain Cc: user@spark.apache.org Subject: Re: Dynamic Allocation Backlog Property in Spark on Kubernetes Hi! For dynamic allocation you do not need to run the Spark jobs in parallel. Dynamic allocation simply means Spark scales up by requesting more executors when there are pending tasks
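To make the scale-up behaviour discussed in this thread concrete, here is a deliberately simplified model of how dynamic allocation requests executors in rounds. It assumes the behaviour described in Spark's job scheduling documentation: the first request fires once tasks have been pending for `spark.dynamicAllocation.schedulerBacklogTimeout`, later requests fire every `spark.dynamicAllocation.sustainedSchedulerBacklogTimeout`, and the number requested doubles each round. The function name is hypothetical, and the model ignores the caps that real Spark applies (pending task count, `maxExecutors`).

```python
def executor_request_rounds(backlog_seconds, backlog_timeout=1.0, sustained_timeout=1.0):
    """Return (time_in_seconds, executors_requested) for each request round.

    Simplified sketch: assumes tasks stay backlogged for `backlog_seconds`
    and that nothing caps the number of executors requested.
    """
    rounds = []
    request_time = backlog_timeout  # first request after the initial backlog timeout
    to_request = 1                  # round sizes grow 1, 2, 4, 8, ...
    while request_time <= backlog_seconds:
        rounds.append((request_time, to_request))
        request_time += sustained_timeout
        to_request *= 2
    return rounds


# Default-like timeouts (1s): three rounds fit into a 3.5s backlog.
quick = executor_request_rounds(3.5)

# A 15-minute backlog timeout: a 10-minute backlog never triggers a request.
slow = executor_request_rounds(600, backlog_timeout=900)
```

Under this model, a very large backlog timeout simply delays the first request, which is consistent with the advice in the thread not to set it to something like 15 minutes.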

Re: Dynamic Allocation Backlog Property in Spark on Kubernetes

2021-04-08 Thread Attila Zsolt Piros
ynchronize different Spark jobs but it is about tasks. Best regards, Attila On Tue, Apr 6, 2021 at 1:59 PM Ranju Jain wrote: > Hi All, > > > > I have set dynamic allocation enabled while running spark on Kubernetes . > But new executors are requested if pending tasks are backlogge

Dynamic Allocation Backlog Property in Spark on Kubernetes

2021-04-06 Thread Ranju Jain
Hi All, I have enabled dynamic allocation while running Spark on Kubernetes. But new executors are requested only if pending tasks are backlogged for more than the duration configured in the property "spark.dynamicAllocation.schedulerBacklogTimeout". My Use Case is: There are number of par

RE: Spark on Kubernetes | 3.0.1 | Shared Volume or NFS

2021-03-11 Thread Ranju Jain
Ok! Thanks for all guidance :-) Regards Ranju From: Mich Talebzadeh Sent: Thursday, March 11, 2021 11:07 PM To: Ranju Jain Cc: user@spark.apache.org Subject: Re: Spark on Kubernetes | 3.0.1 | Shared Volume or NFS I don't have any specific reference. However, you can do a Google search. best

Re: Spark on Kubernetes | 3.0.1 | Shared Volume or NFS

2021-03-11 Thread Mich Talebzadeh
> > > Do you have any reference or links where I can check out the Shared > Volumes ? > > > > Regards > > Ranju > > > > *From:* Mich Talebzadeh > *Sent:* Thursday, March 11, 2021 5:38 PM > *Cc:* user@spark.apache.org > *Subject:* Re: Spark on Kuber

RE: Spark on Kubernetes | 3.0.1 | Shared Volume or NFS

2021-03-11 Thread Ranju Jain
: user@spark.apache.org Subject: Re: Spark on Kubernetes | 3.0.1 | Shared Volume or NFS Well your mileage varies so to speak. The only way to find out is setting an NFS mount and testing it. The performance will depend on the mounted file system and the amount of cache it has. File cache

Re: Spark on Kubernetes | 3.0.1 | Shared Volume or NFS

2021-03-11 Thread Mich Talebzadeh
ch 11, 2021 5:22 PM > *To:* Ranju Jain > *Cc:* user@spark.apache.org > *Subject:* Re: Spark on Kubernetes | 3.0.1 | Shared Volume or NFS > > > > Ok this is on Google Cloud correct? > > > > > > > > > LinkedIn > *http

RE: Spark on Kubernetes | 3.0.1 | Shared Volume or NFS

2021-03-11 Thread Ranju Jain
the other sides [drawback]. Regards Ranju From: Mich Talebzadeh Sent: Thursday, March 11, 2021 5:22 PM To: Ranju Jain Cc: user@spark.apache.org Subject: Re: Spark on Kubernetes | 3.0.1 | Shared Volume or NFS Ok this is on Google Cloud correct? LinkedIn https://www.linkedin.com/profile

Re: Spark on Kubernetes | 3.0.1 | Shared Volume or NFS

2021-03-11 Thread Mich Talebzadeh
Ok this is on Google Cloud correct? LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw * *Disclaimer:* Use it at your own risk. Any and all responsibility for any

Spark on Kubernetes | 3.0.1 | Shared Volume or NFS

2021-03-11 Thread Ranju Jain
Hi, I need to write the data from all executor pods to a common location that can be accessed and retrieved by the driver pod. I was first planning to go with NFS, but I think a Shared Volume is equally good. Please advise: is there any major drawback in using a Shared Volume instead of NFS when many pods

Re: vm.swappiness value for Spark on Kubernetes

2021-02-16 Thread Sean Owen
> kubernetes worker nodes the value of this parameter is '60'. > > Now my question is if it is OK to keep such a high value of > 'vm.swappiness'=60 in kubernetes environment for Spark workloads. > > Will such high value of this kernel parameter have performance impact on >

vm.swappiness value for Spark on Kubernetes

2021-02-16 Thread Jahar Tyagi
are moving to kubernetes and on kubernetes worker nodes the value of this parameter is '60'. Now my question is if it is OK to keep such a high value of 'vm.swappiness'=60 in kubernetes environment for Spark workloads. Will such high value of this kernel parameter have performance impact on Spark PODs

[Spark on Kubernetes] Spark Application dependency management Question.

2021-02-03 Thread xgong
Hey Team: We are currently upgrading Spark from 2.4 to 3.0, but we found that applications which work in Spark 2.4 keep failing with Spark 3.0. We are running Spark on Kubernetes in cluster mode. In spark-submit, we have "--jars local:///apps-dep/spark-extra

RE: Spark on Kubernetes : unable to write files to HDFS

2020-12-16 Thread Loic DESCOTTE
Everything is working fine now. Thanks again, Loïc From: German Schiavon Sent: Wednesday, 16 December 2020 19:23 To: Loic DESCOTTE Cc: user@spark.apache.org Subject: Re: Spark on Kubernetes : unable to write files to HDFS We've all been there! No reason

Re: Spark on Kubernetes : unable to write files to HDFS

2020-12-16 Thread German Schiavon
ember 2020 18:01 > *To:* Loic DESCOTTE > *Cc:* user@spark.apache.org > *Subject:* Re: Spark on Kubernetes : unable to write files to HDFS > > Hi, > > seems that you have a typo no? > > Exception in thread "main" java.io.IOException: No FileSystem for scheme: >

RE: Spark on Kubernetes : unable to write files to HDFS

2020-12-16 Thread Loic DESCOTTE
Oh thank you, you're right!! I feel ashamed. From: German Schiavon Sent: Wednesday, 16 December 2020 18:01 To: Loic DESCOTTE Cc: user@spark.apache.org Subject: Re: Spark on Kubernetes : unable to write files to HDFS Hi, seems that you have a typo

Re: Spark on Kubernetes : unable to write files to HDFS

2020-12-16 Thread German Schiavon
.appName("Hello Spark 7") > .config("fs.hdfs.impl", classOf[org.apache.hadoop.hdfs. > DistributedFileSystem].getName) > .getOrCreate() > > > But still the same error... > > -- > *From:* Sean Owen

RE: Spark on Kubernetes : unable to write files to HDFS

2020-12-16 Thread Loic DESCOTTE
impl", classOf[org.apache.hadoop.hdfs.DistributedFileSystem].getName) .getOrCreate() But still the same error... From: Sean Owen Sent: Wednesday, 16 December 2020 14:27 To: Loic DESCOTTE Subject: Re: Spark on Kubernetes : unable to write files to HDFS

Spark on Kubernetes : unable to write files to HDFS

2020-12-16 Thread Loic DESCOTTE
Hello, I am using Spark On Kubernetes and I have the following error when I try to write data on HDFS : "no filesystem for scheme hdfs" More details : I am submitting my application with Spark submit like this : spark-submit --master k8s://https://myK8SMaster:6443 \ --deploy-mo

Spark on Kubernetes

2020-11-13 Thread Arti Pande
Hi, Is it recommended to use Spark on K8S in production? Spark operator for Kubernetes seems to be in beta state. https://github.com/GoogleCloudPlatform/spark-on-k8s-operator#:~:text=The%20Kubernetes%20Operator%20for%20Apache%20Spark%20aims%20to%20make%20specifying,surfacing%20status%20of

Re: Hive on Spark in Kubernetes.

2020-10-07 Thread Yuri Oleynikov (‫יורי אולייניקוב‬‎)
Thank you very much! Sent from my iPhone > On 7 Oct 2020, at 17:38, mykidong wrote: > > Hi all, > > I have recently written a blog about hive on spark in kubernetes > environment: > - https://itnext.io/hive-on-spark-in-kubernetes-115c8e9fa5c1 > > In this bl

Hive on Spark in Kubernetes.

2020-10-07 Thread mykidong
Hi all, I have recently written a blog about hive on spark in kubernetes environment: - https://itnext.io/hive-on-spark-in-kubernetes-115c8e9fa5c1 In this blog, you can find how to run hive on kubernetes using spark thrift server compatible with hive server2. Cheers, - Kidong. -- Sent from

Re: [Spark 3.0 Kubernetes] Does Spark 3.0 support production deployment

2020-07-12 Thread Prashant Sharma
> *Cc:* Sean Owen ; Ramani, Sai (DI SW CAS MP AFC ARC) < > sai.ram...@siemens.com>; Varshney, Vaibhav (DI SW CAS MP AFC ARC) < > vaibhav.varsh...@siemens.com> > *Subject:* Re: [Spark 3.0 Kubernetes] Does Spark 3.0 support production > deployment > > > > Hi, >

RE: [Spark 3.0 Kubernetes] Does Spark 3.0 support production deployment

2020-07-10 Thread Varshney, Vaibhav
: [Spark 3.0 Kubernetes] Does Spark 3.0 support production deployment Hi, Whether it is a blocker or not is up to you to decide. But Spark on k8s supports dynamic allocation through a different mechanism, that is, without using an external shuffle service. https://issues.apache.org/jira
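As an illustration of dynamic allocation without an external shuffle service, the sketch below assembles the relevant `--conf` flags for spark-submit. It assumes the mechanism referred to in this reply is shuffle tracking (`spark.dynamicAllocation.shuffleTracking.enabled`, available since Spark 3.0); the snippet's JIRA link is truncated, so that mapping is an assumption. The helper names and the executor bounds are hypothetical.

```python
def dynamic_allocation_conf(min_executors: int, max_executors: int) -> dict:
    """Build Spark settings for dynamic allocation based on shuffle tracking."""
    if min_executors > max_executors:
        raise ValueError("min_executors must not exceed max_executors")
    return {
        "spark.dynamicAllocation.enabled": "true",
        # Assumed mechanism: track shuffle files on executors instead of
        # relying on an external shuffle service.
        "spark.dynamicAllocation.shuffleTracking.enabled": "true",
        "spark.dynamicAllocation.minExecutors": str(min_executors),
        "spark.dynamicAllocation.maxExecutors": str(max_executors),
    }


def to_submit_args(conf: dict) -> list:
    """Render settings as a flat list of spark-submit arguments."""
    args = []
    for key, value in conf.items():
        args.extend(["--conf", f"{key}={value}"])
    return args


# Example bounds chosen purely for illustration.
submit_args = to_submit_args(dynamic_allocation_conf(1, 4))
```

The resulting list can be appended to a spark-submit command line; the remaining arguments (master URL, container image, application file) are outside the scope of this sketch.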

Re: [Spark 3.0 Kubernetes] Does Spark 3.0 support production deployment

2020-07-09 Thread Prashant Sharma
ache.org; Ramani, Sai (DI SW CAS MP AFC ARC) < > sai.ram...@siemens.com> > Subject: Re: [Spark 3.0 Kubernetes] Does Spark 3.0 support production > deployment > > I haven't used the K8S scheduler personally, but, just based on that > comment I wouldn't worry too much. It's been ar

RE: [Spark 3.0 Kubernetes] Does Spark 3.0 support production deployment

2020-07-09 Thread Varshney, Vaibhav
Vaibhav V -Original Message- From: Sean Owen Sent: Thursday, July 9, 2020 3:20 PM To: Varshney, Vaibhav (DI SW CAS MP AFC ARC) Cc: user@spark.apache.org; Ramani, Sai (DI SW CAS MP AFC ARC) Subject: Re: [Spark 3.0 Kubernetes] Does Spark 3.0 support production deployment I haven't used the K8S

Re: [Spark 3.0 Kubernetes] Does Spark 3.0 support production deployment

2020-07-09 Thread Sean Owen
that could be added to it and more work going on, and maybe people closer to that work can comment. But yeah you shouldn't be afraid to try it. On Thu, Jul 9, 2020 at 3:18 PM Varshney, Vaibhav wrote: > > Hi Spark Experts, > > > > We are trying to deploy spark on Kub

[Spark 3.0 Kubernetes] Does Spark 3.0 support production deployment

2020-07-09 Thread Varshney, Vaibhav
Hi Spark Experts, We are trying to deploy spark on Kubernetes. As per doc http://spark.apache.org/docs/latest/running-on-kubernetes.html, it looks like K8s deployment is experimental. "The Kubernetes scheduler is currently experimental ". Spark 3.0 does not support production deploy
