RE: Apache Spark Operator for Kubernetes?

2022-10-28 Thread Jim Halfpenny
Hi Clayton,
I’m not aware of an official Apache operator, but I can recommend taking a look 
at the one we’ve created at Stackable.

https://github.com/stackabletech/spark-k8s-operator

It’s actively maintained and we’d be happy to receive feedback if you have 
feature requests.

Kind regards,
Jim


On 2022/10/14 15:28:55 Clayton Wohl wrote:
> My company has been exploring the Google Spark Operator for running Spark
> jobs on a Kubernetes cluster, but we've found lots of limitations and
> problems, and the product seems weakly supported.
> 
> Is there any official Apache option, or plans for such an option, to run
> Spark jobs on Kubernetes? Is there perhaps an official Apache Spark
> Operator in the works?
> 
> We currently run jobs on both Databricks and on Amazon EMR, but it would be
> nice to have a good option for running Spark directly on our Kubernetes
> clusters.
> 
> thanks :)
> 



Re: Apache Spark Operator for Kubernetes?

2022-10-14 Thread Artemis User
If you have the hardware resources, it isn't difficult to set up Spark 
in a Kubernetes cluster.  The online doc describes everything you would 
need (https://spark.apache.org/docs/latest/running-on-kubernetes.html).
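For anyone who hasn't tried it, the basic path from that doc is just 
spark-submit pointed at the Kubernetes API server in cluster mode. A rough 
sketch (the API server address, container image, and jar path below are 
placeholders you would replace with your own values):

    ./bin/spark-submit \
      --master k8s://https://<k8s-apiserver-host>:<port> \
      --deploy-mode cluster \
      --name spark-pi \
      --class org.apache.spark.examples.SparkPi \
      --conf spark.executor.instances=2 \
      --conf spark.kubernetes.container.image=<your-spark-image> \
      local:///path/to/examples.jar

Beyond that it's mostly a matter of building (or pulling) a Spark container 
image and giving a service account permission to create pods, since the 
driver pod launches its own executor pods.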


You're right, neither AWS EMR nor Google's environment is flexible, and 
they aren't cheap.  At one time, we ended up spending over $1,800 per month 
on EMR.  If you have the hardware resources and know how to configure and 
optimize your own OS and networks, going with an in-house solution will 
always be the best.


On 10/14/22 11:28 AM, Clayton Wohl wrote:
My company has been exploring the Google Spark Operator for running 
Spark jobs on a Kubernetes cluster, but we've found lots of 
limitations and problems, and the product seems weakly supported.


Is there any official Apache option, or plans for such an option, to 
run Spark jobs on Kubernetes? Is there perhaps an official Apache 
Spark Operator in the works?


We currently run jobs on both Databricks and on Amazon EMR, but it 
would be nice to have a good option for running Spark directly on our 
Kubernetes clusters.


thanks :)


