+1 for creating an official Kubernetes operator for Apache Spark

On Fri, Nov 10, 2023 at 12:38 AM huaxin gao <huaxin.ga...@gmail.com> wrote:
> +1
>
> On Thu, Nov 9, 2023 at 3:14 PM DB Tsai <dbt...@dbtsai.com> wrote:
>
>> +1
>>
>> To be completely transparent, I am employed in the same department as
>> Zhou at Apple.
>>
>> I support this proposal, provided that we witness community adoption
>> following the release of the Flink Kubernetes operator, streamlining Flink
>> deployment on Kubernetes.
>>
>> A well-maintained official Spark Kubernetes operator is essential for our
>> Spark community as well.
>>
>> DB Tsai | https://www.dbtsai.com/ | PGP 42E5B25A8F7A82C1
>>
>> On Nov 9, 2023, at 12:05 PM, Zhou Jiang <zhou.c.ji...@gmail.com> wrote:
>>
>> Hi Spark community,
>>
>> I'm reaching out to initiate a conversation about the possibility of
>> developing a Java-based Kubernetes operator for Apache Spark. Following the
>> operator pattern
>> (https://kubernetes.io/docs/concepts/extend-kubernetes/operator/), Spark
>> users may manage applications and related components seamlessly using
>> native tools like kubectl. The primary goal is to simplify the Spark user
>> experience on Kubernetes, minimizing the learning curve and operational
>> complexity so that users can focus on Spark application development.
>>
>> Although there are several open-source Spark on Kubernetes operators
>> available, none of them are officially integrated into the Apache Spark
>> project. As a result, these operators may lack active support and
>> development for new features. Within this proposal, our aim is to introduce
>> a Java-based Spark operator as an integral component of the Apache Spark
>> project. This solution has been employed internally at Apple for multiple
>> years, operating millions of executors in real production environments. The
>> use of Java in this solution is intended to accommodate a wider user and
>> contributor audience, especially those who are familiar with Scala.
>>
>> Ideally, this operator should have its own dedicated repository, similar to
>> Spark Connect Golang or Spark Docker, allowing it to maintain a loose
>> connection with the Spark release cycle. This model is also followed by the
>> Apache Flink Kubernetes operator.
>>
>> We believe that this project holds the potential to evolve into a
>> thriving community project over the long run. A comparison can be drawn
>> with the Flink Kubernetes Operator: Apple has open-sourced its internal Flink
>> Kubernetes operator, making it a part of the Apache Flink project
>> (https://github.com/apache/flink-kubernetes-operator). This move has
>> gained wide industry adoption and contributions from the community. In a
>> mere year, the Flink operator has garnered more than 600 stars and has
>> attracted contributions from over 80 contributors. This showcases the level
>> of community interest and collaborative momentum that can be achieved in
>> similar scenarios.
>>
>> More details can be found in the SPIP doc: Spark Kubernetes Operator
>> https://docs.google.com/document/d/1f5mm9VpSKeWC72Y9IiKN2jbBn32rHxjWKUfLRaGEcLE
>>
>> Thanks,
>> --
>> *Zhou JIANG*