> And maybe we also could ping Yikun Jiang who has done similar things in
> Spark.
Thanks for the ping, @wangyang. Yes, I was involved in Spark's customized
scheduler support work, as its main implementer.
For customized scheduler support, I can share what a scheduler requires
here:
1. Help the scheduler to *specify* the scheduler name
2. Help the scheduler to create the *scheduler-related label/annotation/CRD*,
such as
- Yunikorn needs labels/annotations
<https://yunikorn.apache.org/docs/user_guide/labels_and_annotations_in_yunikorn/>
(maybe task group CRD in future or not)
- Volcano needs annotations and CRD <https://volcano.sh/en/docs/podgroup/>
- Kube-batch needs annotations/CRD
<https://github.com/kubernetes-sigs/kube-batch/tree/master/config/crds>
- Kueue needs annotation support
<https://github.com/kubernetes-sigs/kueue/blob/888cedb6e62c315e008916086308a893cd21dd66/config/samples/sample-job.yaml#L6>
and
cluster level CRD
3. Help the scheduler to create the scheduler meta/CRD at the *right time*.
For example, if users want to avoid pods hitting max pending, we need to
create the scheduler-required CRD before pod creation.
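The three requirements above can be sketched with plain Python dicts standing in for Kubernetes manifests. This is only an illustration, not how Spark or Flink implement it: the helper names, the pod name "jobmanager", and the group name "demo-podgroup" are hypothetical, and the Volcano scheduler name, PodGroup API version, and group-name annotation are written as I understand them from the Volcano docs linked above, so please check them against your Volcano version.

```python
import copy


def with_scheduler_name(pod, scheduler_name):
    """Requirement 1: set spec.schedulerName on a copy of the pod manifest."""
    pod = copy.deepcopy(pod)
    pod.setdefault("spec", {})["schedulerName"] = scheduler_name
    return pod


def with_annotations(pod, annotations):
    """Requirement 2: attach scheduler-specific annotations to a copy of the pod."""
    pod = copy.deepcopy(pod)
    pod.setdefault("metadata", {}).setdefault("annotations", {}).update(annotations)
    return pod


def submission_order(pre_resources, pod):
    """Requirement 3: scheduler resources (e.g. a CRD instance) go before the pod."""
    return list(pre_resources) + [pod]


# Hypothetical base pod; a real one would carry containers, resources, etc.
base_pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "jobmanager"},
    "spec": {"containers": []},
}

# Assumed shape of a Volcano PodGroup; the names and values are illustrative.
pod_group = {
    "apiVersion": "scheduling.volcano.sh/v1beta1",
    "kind": "PodGroup",
    "metadata": {"name": "demo-podgroup"},
    "spec": {"minMember": 1},
}

pod = with_scheduler_name(base_pod, "volcano")
pod = with_annotations(pod, {"scheduling.k8s.io/group-name": "demo-podgroup"})
ordered = submission_order([pod_group], pod)
```

The point of `submission_order` is only the ordering: whatever the scheduler needs must exist before the pod that refers to it is created.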
For complex requirements, Spark uses feature steps (Flink decorators look
very similar to them).
For simple requirements, users can just use configuration or a Pod Template.
[1]
https://spark.apache.org/docs/latest/running-on-kubernetes.html#customized-kubernetes-schedulers-for-spark-on-kubernetes
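To make the simple versus complex split concrete, here is a small sketch that assembles `spark-submit` flags. The helper `scheduler_confs` and the scheduler name "my-scheduler" are hypothetical. The configuration keys and the `VolcanoFeatureStep` class name are taken from the Spark documentation in [1] as I remember it, so treat them as assumptions to verify against your Spark version.

```python
def scheduler_confs(scheduler_name, feature_step=None):
    """Build spark-submit --conf flags for a customized scheduler.

    Simple case: only the scheduler name is set.
    Complex case: a feature step class is also registered for driver and executor.
    """
    confs = {"spark.kubernetes.scheduler.name": scheduler_name}
    if feature_step is not None:
        confs["spark.kubernetes.driver.pod.featureSteps"] = feature_step
        confs["spark.kubernetes.executor.pod.featureSteps"] = feature_step
    args = []
    for key, value in confs.items():
        args += ["--conf", f"{key}={value}"]
    return args


# Simple requirement: configuration only (hypothetical scheduler name).
simple_args = scheduler_confs("my-scheduler")

# Complex requirement: a feature step that can create extra resources.
volcano_args = scheduler_confs(
    "volcano",
    feature_step="org.apache.spark.deploy.k8s.features.VolcanoFeatureStep",
)
```

In the simple case nothing beyond the scheduler name is needed. The feature step only comes in when the scheduler needs extra resources created for it.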
From the FLIP, I can see the above requirements are covered.
BTW, I think the existing and newly added interfaces of the Flink decorators
already cover all Kubernetes requirements, so I personally think the
K8s-related scheduler requirements can also be well covered by them.
Regards,
Yikun
On Thu, Jul 14, 2022 at 5:11 PM Yang Wang wrote:
> I think we could go over the customized scheduler plugin mechanism again
> with YuniKorn to make sure that it is common enough.
> But the implementation could be deferred.
>
> And maybe we also could ping Yikun Jiang who has done similar things in
> Spark.
>
> For the e2e tests, I admit that they could be improved. But I am not sure
> whether we really need the Java implementation instead.
> This is out of the scope of this FLIP and let's keep the discussion
> under FLINK-20392.
>
>
> Best,
> Yang
>
> On Thu, Jul 14, 2022 at 15:28, Martijn Visser wrote:
>
> > Hi Bo,
> >
> > Thanks for the info! I think I see that you've already updated the FLIP
> > to reflect how customized schedulers are beneficial for both batch and
> > streaming jobs.
> >
> > The reason why I'm not too happy that we would only create a reference
> > implementation for Volcano is that we don't know if the generic support
> > for customized scheduler plugins will also work for others. We think it
> > will, but since there would be no other implementation available, we are
> > not sure. My concern is that when someone tries to add support for another
> > scheduler, we notice that we actually made a mistake or should improve
> > the generic support.
> >
> > Best regards,
> >
> > Martijn
> >
> >
> >
> > On Thu, Jul 14, 2022 at 05:30, bo zhaobo <bzhaojyathousa...@gmail.com> wrote:
> >
> > > Hi Martijn,
> > >
> > > Thank you for your comments. I will answer the questions one by one.
> > >
> > > ""
> > > * Regarding the motivation, it mentions that the development trend is
> > > that Flink supports both batch and stream processing. I think the vision
> > > and trend is that we have unified batch- and stream processing. What I'm
> > > missing is the vision on what's the impact for customized Kubernetes
> > > schedulers on stream processing. Could there be some elaboration on
> > > that?
> > > ""
> > >
> > > >>
> > >
> > > We very much agree with you and with the dev trend that Flink supports
> > > both batch and stream processing. Actually, using a customized K8S
> > > scheduler is beneficial for streaming scenarios too, for example to
> > > avoid resource deadlock and other problems. Suppose the remaining
> > > resources in the K8S cluster are only enough to run one job, but we
> > > submitted two. With the default K8S scheduler, both jobs request
> > > resources at the same time, block each other, and hang. In this case,
> > > the customized scheduler Volcano won’t schedule overcommitted pods if
> > > the idle resources cannot fit all of the pods that follow. So the
> > > benefits mentioned in FLIP are
> > > not only