Re: Spark on Kubernetes scheduler variety

2021-07-08 Thread Mich Talebzadeh
Splendid. Please invite me to the next meeting mich.talebza...@gmail.com Timezone London, UK *GMT+1* Thanks,

Re: Spark on Kubernetes scheduler variety

2021-07-08 Thread Holden Karau
Hi Y'all, We had an initial meeting which went well, got some more context around Volcano and its near-term roadmap. Talked about the impact around scheduler deadlocking and some ways that we could potentially improve integration from the Spark side and Volcano sides respectively. I'm going to

Re: Spark on Kubernetes scheduler variety

2021-07-01 Thread Mich Talebzadeh
Hi, A rather simple question. As Kubernetes is a specialist area that takes some effort to set up properly, do we have a dev/test bed for conducting development work? What I am trying to get at is whether, if there is official support for Volcano, a vendor can provide free cluster usage in

Re: Spark on Kubernetes scheduler variety

2021-06-30 Thread Mich Talebzadeh
Hi Klaus, Thanks https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/issues/1289

Re: Spark on Kubernetes scheduler variety

2021-06-30 Thread Klaus Ma
Hi Mich, Would you help to open an issue at spark-on-k8s-operator repo? We're going to submit a PR to update the install steps :) -- Klaus On Wed, Jun 30, 2021 at 12:24 AM Mich Talebzadeh wrote: > Hi Yikun > > In reference > > >

Re: Spark on Kubernetes scheduler variety

2021-06-30 Thread Mich Talebzadeh
Hi Michel, Thanks for the link. I am familiar with G-Research as I met them at my presentation in London back in October 2019. The Armada project seems to create super-scheduling on top of Kubernetes clusters and I quote: "Armada is an application to achieve high throughput of run-to-completion

Re: Spark on Kubernetes scheduler variety

2021-06-29 Thread Mich Talebzadeh
Hi Yikun, With reference to https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/blob/master/docs/volcano-integration.md: trying to install Volcano, I am getting this error: helm repo add incubator http://storage.googleapis.com/kubernetes-charts-incubator Error: looks like

Re: Spark on Kubernetes scheduler variety

2021-06-29 Thread Mich Talebzadeh
Cool, thanks!

Re: Spark on Kubernetes scheduler variety

2021-06-29 Thread Yikun Jiang
> Is this the correct link for integrating Volcano with Spark? Yes, it is the Kubernetes operator style of integrating Volcano. If you just want to use the spark-submit style to submit a job with native support, see [2] for reference. [1]

Re: Spark on Kubernetes scheduler variety

2021-06-28 Thread Mich Talebzadeh
Hi Yikun, Is this the correct link for integrating Volcano with Spark? spark-on-k8s-operator/volcano-integration.md at master · GoogleCloudPlatform/spark-on-k8s-operator · GitHub Thanks Mich

Re: Spark on Kubernetes scheduler variety

2021-06-25 Thread Yikun Jiang
Oops, sorry for the wrong link, it should be: We will also prepare to propose an initial design and POC[3] on a shared branch (based on the Spark master branch) where we can collaborate on it, so I created the spark-volcano[1] org on GitHub to make it happen. [3]

Re: Spark on Kubernetes scheduler variety

2021-06-24 Thread John Zhuge
Thanks Yikun! On Thu, Jun 24, 2021 at 8:54 PM Yikun Jiang wrote: > Hi, folks. > > As @Klaus mentioned, we have some work on Spark on k8s with Volcano native > support. There has also been some production deployment validation from > our partners in China, like JingDong, XiaoHongShu, VIPshop.

Re: Spark on Kubernetes scheduler variety

2021-06-24 Thread Yikun Jiang
Hi, folks. As @Klaus mentioned, we have some work on Spark on k8s with Volcano native support. There has also been some production deployment validation from our partners in China, like JingDong, XiaoHongShu, VIPshop. We will also prepare to propose an initial design and POC[3] on a shared

Re: Spark on Kubernetes scheduler variety

2021-06-24 Thread Mich Talebzadeh
Hi Holden, Thank you for your points. I guess coming from a corporate world I had an oversight on how an open source project like Spark does leverage resources and interest :). As @KlausMa kindly volunteered it would be good to hear scheduling ideas on Spark on Kubernetes and of course as I am

Re: Spark on Kubernetes scheduler variety

2021-06-24 Thread Holden Karau
Hi Mich, I certainly think making Spark on Kubernetes run well is going to be a challenge. However I think, and I could be wrong about this as well, that in terms of cluster managers Kubernetes is likely to be our future. Talking with people I don't hear about new standalone, YARN or mesos

Re: Spark on Kubernetes scheduler variety

2021-06-24 Thread Holden Karau
That's awesome, I'm just starting to get context around Volcano but maybe we can schedule an initial meeting for all of us interested in pursuing this to get on the same page. On Wed, Jun 23, 2021 at 6:54 PM Klaus Ma wrote: > Hi team, > > I'm kube-batch/Volcano founder, and I'm excited to hear

Re: Spark on Kubernetes scheduler variety

2021-06-24 Thread John Zhuge
Thanks Klaus! I am interested in more details. On Wed, Jun 23, 2021 at 6:54 PM Klaus Ma wrote: > Hi team, > > I'm kube-batch/Volcano founder, and I'm excited to hear that the spark > community also has such requirements :) > > Volcano provides several features for batch workload, e.g.

Re: Spark on Kubernetes scheduler variety

2021-06-24 Thread Mich Talebzadeh
Thanks Klaus. That will be great. It will also be intuitive if you elaborate the need for this feature in line with the limitation of the current batch workload. Regards, Mich

Re: Spark on Kubernetes scheduler variety

2021-06-23 Thread Klaus Ma
Hi team, I'm the kube-batch/Volcano founder, and I'm excited to hear that the Spark community also has such requirements :) Volcano provides several features for batch workloads, e.g. fair-share, queue, reservation, preemption/reclaim and so on. It has been used in several production environments with

Re: Spark on Kubernetes scheduler variety

2021-06-23 Thread Mich Talebzadeh
Please allow me to be diverse and express a different point of view on this roadmap. I believe from a technical point of view spending time and effort plus talent on batch scheduling on Kubernetes could be rewarding. However, if I may say I doubt whether such an approach and the so-called

Re: Spark on Kubernetes scheduler variety

2021-06-18 Thread Holden Karau
I think these approaches are good, but there are limitations (eg dynamic scaling) without us making changes inside of the Spark Kube scheduler. Certainly whichever scheduler extensions we add support for we should collaborate with the people developing those extensions insofar as they are

Re: Spark on Kubernetes scheduler variety

2021-06-18 Thread Mich Talebzadeh
Hi, Regarding your point and I quote ".. I know that one of the Spark on Kube operators supports volcano/kube-batch so I was thinking that might be a place I would start exploring..." There seems to be ongoing work on say Volcano as part of Cloud Native Computing Foundation

Spark on Kubernetes scheduler variety

2021-06-17 Thread Holden Karau
Hi Folks, I'm continuing my adventures to make Spark on containers party and I was wondering if folks have experience with the different batch scheduler options that they prefer? I was thinking so that we can better support dynamic allocation it might make sense for us to support using different