Hi, folks! Wishing you all the best in 2022.

I'd like to share the current status on "Support Customized K8S Scheduler
in Spark".

https://docs.google.com/document/d/1xgQGRpaHQX6-QH_J9YV2C2Dh6RpXefUpLM7KGkzL6Fg/edit#heading=h.1quyr1r2kr5n

Framework/Common support

- The Volcano and Yunikorn teams joined the discussion and completed the
initial doc on the framework/common part.

- SPARK-37145 <https://issues.apache.org/jira/browse/SPARK-37145> (under
review): We propose supporting customized schedulers through a custom
feature step alone. Once it is merged, it should meet the requirements of
customized schedulers, and users will be able to enable the feature step
and scheduler like this:

spark-submit \
    --conf spark.kubernetes.scheduler.name=volcano \
    --conf spark.kubernetes.driver.pod.featureSteps=org.apache.spark.deploy.k8s.features.scheduler.VolcanoFeatureStep \
    --conf spark.kubernetes.job.queue=xxx

(In the example above, the VolcanoFeatureStep sets the Spark scheduler
queue according to the user-specified conf.)

- SPARK-37331 <https://issues.apache.org/jira/browse/SPARK-37331>: Added
the ability to create Kubernetes resources before driver pod creation.

- SPARK-36059 <https://issues.apache.org/jira/browse/SPARK-36059>: Add the
ability to specify a scheduler for the driver/executor.

Once all of the above are merged, the framework/common support should be
ready for most customized schedulers.
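
To make the pieces above concrete, here is a small dry-run sketch that
assembles the options from the spark-submit example and prints them instead
of submitting anything. It assumes the `spark.kubernetes.*` spelling and the
`--conf key=value` form. The job queue property and the VolcanoFeatureStep
class path come from the proposal under review and may change. The helper
name `build_submit_args` and the queue name `default` are hypothetical.

```shell
#!/bin/sh
# Sketch only: prints the spark-submit options for a customized scheduler.
# Property names follow the proposal under review and may change.

FEATURE_STEP="org.apache.spark.deploy.k8s.features.scheduler.VolcanoFeatureStep"

# build_submit_args SCHEDULER QUEUE  (hypothetical helper)
# Prints one "--conf key=value" pair per line.
build_submit_args() {
    scheduler="$1"
    queue="$2"
    printf '%s\n' "--conf spark.kubernetes.scheduler.name=${scheduler}"
    printf '%s\n' "--conf spark.kubernetes.driver.pod.featureSteps=${FEATURE_STEP}"
    printf '%s\n' "--conf spark.kubernetes.job.queue=${queue}"
}

# Dry run: show the options that would be passed to spark-submit.
build_submit_args volcano default
```

Printing the options first makes it easy to check the property names before
pointing a real submission at a cluster.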

Volcano part:

- SPARK-37258 <https://issues.apache.org/jira/browse/SPARK-37258>: Upgrade
kubernetes-client to 5.11.1 to add volcano scheduler API support.

- SPARK-36061 <https://issues.apache.org/jira/browse/SPARK-36061>: Add a
VolcanoFeatureStep to help users create a PodGroup with user-specified
minimum required resources. There is also a WIP commit that previews this:
<https://github.com/Yikun/spark/pull/45/commits/81bf6f98edb5c00ebd0662dc172bc73f980b6a34>
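
For readers unfamiliar with PodGroup, the sketch below renders roughly the
kind of resource such a feature step could create. The `queue`, `minMember`
and `minResources` fields belong to Volcano's PodGroup CRD
(`scheduling.volcano.sh/v1beta1`); the helper name `render_podgroup` and all
values are hypothetical, and this is not necessarily what the WIP commit
emits.

```shell
#!/bin/sh
# Sketch only: prints an illustrative Volcano PodGroup manifest.

# render_podgroup NAME QUEUE MIN_CPU MIN_MEMORY  (hypothetical helper)
render_podgroup() {
    cat <<EOF
apiVersion: scheduling.volcano.sh/v1beta1
kind: PodGroup
metadata:
  name: $1
spec:
  queue: $2
  minMember: 1
  minResources:
    cpu: "$3"
    memory: "$4"
EOF
}

# Example values only; nothing is applied to a cluster.
render_podgroup spark-job-podgroup default 4 8Gi
```

The scheduler would hold the job's pods until the stated minimum resources
are available in the queue, which is the YARN-like behaviour the proposal
is after.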

Yunikorn part:

- @WeiweiYang is completing the doc for the Yunikorn part and working on
its implementation.

Regards,
Yikun


Weiwei Yang <w...@apache.org> wrote on Thu, Dec 2, 2021 at 02:00:

> Thank you Yikun for the info, and thanks for inviting me to a meeting to
> discuss this.
> I appreciate your effort to put these together, and I agree that the
> purpose is to make Spark easy/flexible enough to support other K8s
> schedulers (not just for Volcano).
> As discussed, could you please help to abstract out the things in common
> and allow Spark to plug different implementations? I'd be happy to work
> with you guys on this issue.
>
>
> On Tue, Nov 30, 2021 at 6:49 PM Yikun Jiang <yikunk...@gmail.com> wrote:
>
>> @Weiwei @Chenya
>>
>> > Thanks for bringing this up. This is quite interesting, we definitely
>> should participate more in the discussions.
>>
>> Thanks for your reply, and you are welcome to join the discussion. I think
>> the input from Yunikorn is critical.
>>
>> > The main thing here is, the Spark community should make Spark pluggable
>> in order to support other schedulers, not just for Volcano. It looks like
>> this proposal is pushing really hard for adopting PodGroup, which isn't
>> part of K8s yet, that to me is problematic.
>>
>> Definitely yes, we are on the same page.
>>
>> I think we have the same goal: propose a general and reasonable mechanism
>> to make spark on k8s with a custom scheduler more usable.
>>
>> As for PodGroup, allow me to give a brief introduction:
>> - The PodGroup definition has been approved by Kubernetes officially in
>> KEP-583. [1]
>> - It can be regarded as a general concept/standard in Kubernetes rather
>> than a concept specific to Volcano; there are other implementations as
>> well, such as [2][3].
>> - Kubernetes recommends using CRDs for extensions, so that implementers
>> can build what they need. [4]
>> - Volcano, as an extension, provides an interface to maintain the life
>> cycle of the PodGroup CRD and uses volcano-scheduler to complete the
>> scheduling.
>>
>> [1]
>> https://github.com/kubernetes/enhancements/tree/master/keps/sig-scheduling/583-coscheduling
>> [2]
>> https://github.com/kubernetes-sigs/scheduler-plugins/tree/master/pkg/coscheduling#podgroup
>> [3] https://github.com/kubernetes-sigs/kube-batch
>> [4]
>> https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/
>>
>> Regards,
>> Yikun
>>
>>
>> Weiwei Yang <w...@apache.org> wrote on Wed, Dec 1, 2021 at 5:57 AM:
>>
>>> Hi Chenya
>>>
>>> Thanks for bringing this up. This is quite interesting, we definitely
>>> should participate more in the discussions.
>>> The main thing here is, the Spark community should make Spark pluggable
>>> in order to support other schedulers, not just for Volcano. It looks like
>>> this proposal is pushing really hard for adopting PodGroup, which isn't
>>> part of K8s yet, that to me is problematic.
>>>
>>> On Tue, Nov 30, 2021 at 9:21 AM Prasad Paravatha <
>>> prasad.parava...@gmail.com> wrote:
>>>
>>>> This is a great feature/idea.
>>>> I'd love to get involved in some form (testing and/or documentation).
>>>> This could be my 1st contribution to Spark!
>>>>
>>>> On Tue, Nov 30, 2021 at 10:46 PM John Zhuge <jzh...@apache.org> wrote:
>>>>
>>>>> +1 Kudos to Yikun and the community for starting the discussion!
>>>>>
>>>>> On Tue, Nov 30, 2021 at 8:47 AM Chenya Zhang <
>>>>> chenyazhangche...@gmail.com> wrote:
>>>>>
>>>>>> Thanks folks for bringing up the topic of natively integrating
>>>>>> Volcano and other alternative schedulers into Spark!
>>>>>>
>>>>>> +Weiwei, Wilfred, Chaoran. We would love to contribute to the
>>>>>> discussion as well.
>>>>>>
>>>>>> From our side, we have been using and improving on one alternative
>>>>>> resource scheduler, Apache YuniKorn (https://yunikorn.apache.org/),
>>>>>> for Spark on Kubernetes in production at Apple with solid results in the
>>>>>> past year. It is capable of supporting Gang scheduling (similar to
>>>>>> PodGroups), multi-tenant resource queues (similar to YARN), FIFO, and
>>>>>> other handy features like bin packing to enable efficient
>>>>>> autoscaling, etc.
>>>>>>
>>>>>> Natively integrating with Spark would provide more flexibility for
>>>>>> users and reduce the extra cost and potential inconsistency of
>>>>>> maintaining different layers of resource strategies. One interesting
>>>>>> topic we hope to discuss further is dynamic allocation, which would
>>>>>> benefit from native coordination between Spark and resource schedulers
>>>>>> in K8s and cloud environments for optimal resource efficiency.
>>>>>>
>>>>>>
>>>>>> On Tue, Nov 30, 2021 at 8:10 AM Holden Karau <hol...@pigscanfly.ca>
>>>>>> wrote:
>>>>>>
>>>>>>> Thanks for putting this together, I’m really excited for us to add
>>>>>>> better batch scheduling integrations.
>>>>>>>
>>>>>>> On Tue, Nov 30, 2021 at 12:46 AM Yikun Jiang <yikunk...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hey everyone,
>>>>>>>>
>>>>>>>> I'd like to start a discussion on "Support Volcano/Alternative
>>>>>>>> Schedulers Proposal".
>>>>>>>>
>>>>>>>> This SPIP proposes making Spark K8s schedulers provide more
>>>>>>>> YARN-like features (such as queues and minimum resources before
>>>>>>>> scheduling jobs) that many folks want on Kubernetes.
>>>>>>>>
>>>>>>>> The goal of this SPIP is to improve the current Spark K8s scheduler
>>>>>>>> implementations, add the ability to do batch scheduling, and support
>>>>>>>> Volcano as one of the implementations.
>>>>>>>>
>>>>>>>> Design doc:
>>>>>>>> https://docs.google.com/document/d/1xgQGRpaHQX6-QH_J9YV2C2Dh6RpXefUpLM7KGkzL6Fg
>>>>>>>> JIRA: https://issues.apache.org/jira/browse/SPARK-36057
>>>>>>>> Some of the PRs:
>>>>>>>> Ability to create resources: https://github.com/apache/spark/pull/34599
>>>>>>>> Add PodGroupFeatureStep: https://github.com/apache/spark/pull/34456
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Yikun
>>>>>>>>
>>>>>>> --
>>>>>>> Twitter: https://twitter.com/holdenkarau
>>>>>>> Books (Learning Spark, High Performance Spark, etc.):
>>>>>>> https://amzn.to/2MaRAG9  <https://amzn.to/2MaRAG9>
>>>>>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>> John Zhuge
>>>>>
>>>>
>>>>
>>>> --
>>>> Regards,
>>>> Prasad Paravatha
>>>>
>>>
