Re: Volcano in spark distro
On Wed, Aug 23, 2023 at 12:10 AM Dongjoon Hyun wrote:

> In any case, I'd like to say that the root cause of the difference is those
> scheduler designs instead of Apache Spark itself. For example, Apache
> YuniKorn doesn't force us to add a new dependency at all while Volcano did.

This makes sense!

> These days, I prefer and invest more in Apache YuniKorn and, for Apache
> Spark 4, we are going to add and focus more on K8s native features like
> `Scheduling Gate` additionally.

This is very useful to know! Are there any tickets/SPIPs for this yet?

Thanks for sharing these bits, and kind regards
Santosh
Re: Volcano in spark distro
Of course, we can make the Apache Spark distribution bigger and bigger, but I'm a little neutral about Volcano.

In any case, I'd like to say that the root cause of the difference is those scheduler designs instead of Apache Spark itself. For example, Apache YuniKorn doesn't force us to add a new dependency at all while Volcano did.

> It would be useful to support volcano in spark distro itself just like yunikorn.

These days, I prefer and invest more in Apache YuniKorn and, for Apache Spark 4, we are going to add and focus more on K8s native features like `Scheduling Gate` additionally.

Dongjoon.
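The dependency difference mentioned above shows up in the submit-time configuration for the two schedulers. The following is a minimal sketch for illustration only, not something posted in the thread: the helper names `scheduler_conf` and `to_submit_args` and the default queue value are hypothetical, and the `spark.kubernetes.*` keys follow the "Running Spark on Kubernetes" documentation, so they should be checked against the Spark version in use.

```python
from typing import Dict, List

# Class shipped in Spark's optional Volcano module (per the Spark docs).
VOLCANO_FEATURE_STEP = "org.apache.spark.deploy.k8s.features.VolcanoFeatureStep"


def scheduler_conf(scheduler: str, queue: str = "root.default") -> Dict[str, str]:
    """Return illustrative Spark conf entries for a custom K8s scheduler.

    Hypothetical helper: the keys mirror the Spark on K8s docs, the
    queue default is an assumption.
    """
    if scheduler == "yunikorn":
        # Only generic pod label/annotation settings: no extra Spark class needed.
        return {
            "spark.kubernetes.scheduler.name": "yunikorn",
            "spark.kubernetes.driver.label.queue": queue,
            "spark.kubernetes.executor.label.queue": queue,
            "spark.kubernetes.driver.annotation.yunikorn.apache.org/app-id": "{{APP_ID}}",
            "spark.kubernetes.executor.annotation.yunikorn.apache.org/app-id": "{{APP_ID}}",
        }
    if scheduler == "volcano":
        # References a feature-step class that must be on Spark's classpath.
        return {
            "spark.kubernetes.scheduler.name": "volcano",
            "spark.kubernetes.driver.pod.featureSteps": VOLCANO_FEATURE_STEP,
            "spark.kubernetes.executor.pod.featureSteps": VOLCANO_FEATURE_STEP,
        }
    raise ValueError(f"unsupported scheduler: {scheduler}")


def to_submit_args(conf: Dict[str, str]) -> List[str]:
    """Flatten conf entries into `--conf key=value` arguments for spark-submit."""
    args: List[str] = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args
```

The Volcano entries name a class from a Spark module that the documentation describes building with the `-Pvolcano` profile, which is why a distribution built without that profile cannot use them as-is. The YuniKorn entries rely only on generic label and annotation settings.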
Re: Volcano in spark distro
@Santosh

We tried to add this in v3.3.0 [1]. The main reasons for not adding it at that time were:

1. Volcano did not support multi-arch before v1.7.0. (Already upgraded to 1.7.0 since Spark 3.4.0.)
2. Spark on K8s + Volcano was experimental. (We have since removed the experimental label [2].)

The Spark Volcano integration has already been running stably for a long time in both the Spark community (since Spark 3.4.0) [3] and the Volcano community (since Spark 3.3.0) [4], so I think it's stable enough.

So I believe we have the capability to enable the volcano module in Apache Spark now (master / maybe Apache Spark 4.0?).

[1] https://github.com/apache/spark/pull/35922
[2] https://github.com/apache/spark/pull/40152
[3] https://github.com/apache/spark/blob/master/.github/workflows/build_and_test.yml#L1090
[4] https://github.com/volcano-sh/volcano/blob/master/.github/workflows/e2e_spark.yaml#L12

Regards,
Yikun

On Tue, Aug 22, 2023 at 8:14 PM Santosh Pingale wrote:

> Hey all
>
> It would be useful to support volcano in spark distro itself just like
> yunikorn. So I am wondering what is the reason behind this decision of not
> packaging it already.
>
> https://spark.apache.org/docs/latest/running-on-kubernetes.html#using-volcano-as-customized-scheduler-for-spark-on-kubernetes
>
> Can we package it to make it easily available and hence usable?
>
> Kind regards
> Santosh
Re: Volcano in spark distro
Hi Santosh,

We had a Google team discussion about k8s back in February and it was mentioned then. My personal experience with Volcano was not that impressive. Do you have some stats to show that it is worth adding?

Anyone else is welcome to comment.

HTH

Mich Talebzadeh,
Distinguished Technologist, Solutions Architect & Engineer
London
United Kingdom

On Tue, 22 Aug 2023 at 13:14, Santosh Pingale wrote:

> Hey all
>
> It would be useful to support volcano in spark distro itself just like
> yunikorn. So I am wondering what is the reason behind this decision of not
> packaging it already.
>
> https://spark.apache.org/docs/latest/running-on-kubernetes.html#using-volcano-as-customized-scheduler-for-spark-on-kubernetes
>
> Can we package it to make it easily available and hence usable?
>
> Kind regards
> Santosh
Volcano in spark distro
Hey all

It would be useful to support volcano in spark distro itself just like yunikorn. So I am wondering what is the reason behind this decision of not packaging it already.

https://spark.apache.org/docs/latest/running-on-kubernetes.html#using-volcano-as-customized-scheduler-for-spark-on-kubernetes

Can we package it to make it easily available and hence usable?

Kind regards
Santosh