Re: [Discuss] FLIP-362: Support minimum resource limitation

2023-10-04 Thread xiangyu feng


Re: [Discuss] FLIP-362: Support minimum resource limitation

2023-10-04 Thread David Morávek


Re: [Discuss] FLIP-362: Support minimum resource limitation

2023-10-03 Thread xiangyu feng
Hi David,

Thx for your feedback.

First of all, for keeping some spare resources around, do you mean
'Redundant TaskManagers'[1]? If not, what is the difference between the
spare resources and redundant taskmanagers?

Secondly, IMHO the difference between min-reserved resources and spare
resources is that we could configure a rather large min-reserved resource
for use cases that submit lots of short-lived jobs concurrently, but we
don't want to configure a large spare resource, since this might double the
total resource usage and lead to resource waste.
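
A rough sketch of this comparison (plain Python, not Flink code; the
function names and the numbers are made up for illustration, and the spare
policy is modeled simply as a fixed buffer on top of current demand, which
is only one possible reading of the suggestion):

```python
def workers_with_min_reserved(demand: int, min_reserved: int) -> int:
    """Workers kept alive when a static floor is configured.

    The floor only matters while demand is below it; above it, the
    cluster simply follows demand.
    """
    return max(demand, min_reserved)


def workers_with_spare(demand: int, spare: int) -> int:
    """Workers kept alive when a buffer of idle workers sits on top of demand."""
    return demand + spare


# Hypothetical burst: 100 workers' worth of short-lived jobs at once.
burst = 100
print(workers_with_min_reserved(burst, min_reserved=100))  # 100
print(workers_with_spare(burst, spare=100))                # 200
```

With equal settings, the floor is absorbed by the burst, while the buffer is
added on top of it. That is the "double the total resource usage" case.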

Looking forward to hearing from you.

Regards,
Xiangyu

[1] https://issues.apache.org/jira/browse/FLINK-18625

David Morávek wrote on Tue, Oct 3, 2023 at 05:00:

> Hi Xiangyu,
>
> The sentiment of the FLIP makes sense, but I keep wondering whether this
> is the best way to think about the problem. I assume that "interactive
> session cluster" users always want to keep some spare resources around (up
> to a configured threshold) to reduce cold start instead of statically
> configuring the minimum.
>
> It's just a tiny change from the original proposal, but it could make all
> the difference (eliminate overprovisioning, maintain latencies with a
> growing # of jobs, ..)
>
> WDYT?
>
> Best,
> D.
>
> On Mon, Sep 25, 2023 at 5:11 PM Jing Ge 
> wrote:
>
>> Hi Yangze,
>>
>> Thanks for the clarification. The example of two batch jobs teaming up with
>> one streaming job is interesting.
>>
>> Best regards,
>> Jing
>>
>> On Wed, Sep 20, 2023 at 7:19 PM Yangze Guo  wrote:
>>
>> > Thanks for the comments, Jing.
>> >
>> > > Will the minimum resource configuration also take effect for streaming
>> > jobs in application mode?
>> > > Since it is not recommended to configure
>> slotmanager.number-of-slots.max
>> > for streaming jobs, does it make sense to disable it for common
>> streaming
>> > jobs? At least disable the check for avoiding the oscillation?
>> >
>> > Yes. The minimum resource configuration will only be disabled in
>> > standalone clusters atm. I agree it makes sense to disable it for a pure
>> > streaming job, however:
>> > - By default, the minimum resource is configured to 0. If users do not
>> > proactively set it, either the oscillation check or the minimum
>> > restriction can be considered as disabled.
>> > - The minimum resource is a cluster-level configuration rather than a
>> > job-level configuration. If a user has an application with two batch
>> > jobs preceding the streaming job, they may also require this
>> > configuration to accelerate the execution of batch jobs.
>> >
>> > WDYT?
>> >
>> > Best,
>> > Yangze Guo
>> >
>> > On Thu, Sep 21, 2023 at 4:49 AM Jing Ge 
>> > wrote:
>> > >
>> > > Hi Xiangyu,
>> > >
>> > > Thanks for driving it! There is one thing I am not sure I
>> > > understand correctly.
>> > >
>> > > According to the FLIP: "The minimum resource limitation will be
>> > implemented
>> > > in the DefaultResourceAllocationStrategy of FineGrainedSlotManager.
>> > >
>> > > Each time when SlotManager needs to reconcile the cluster resources or
>> > > fulfill job resource requirements, the
>> DefaultResourceAllocationStrategy
>> > > will check if the minimum resource requirement has been fulfilled. If
>> it
>> > is
>> > > not, DefaultResourceAllocationStrategy will request new
>> > PendingTaskManagers
>> > > and FineGrainedSlotManager will allocate new worker resources
>> > accordingly."
>> > >
>> > > "To avoid this oscillation, we need to check the worker number derived
>> > from
>> > > minimum and maximum resource configuration is consistent before
>> starting
>> > > SlotManager."
>> > >
>> > > Will the minimum resource configuration also take effect for streaming
>> > jobs
>> > > in application mode? Since it is not recommended to
>> > > configure slotmanager.number-of-slots.max for streaming jobs, does it
>> > make
>> > > sense to disable it for common streaming jobs? At least disable the
>> check
>> > > for avoiding the oscillation?
>> > >
>> > > Best regards,
>> > > Jing
>> > >
>> > >
>> > > On Tue, Sep 19, 2023 at 4:58 PM Chen Zhanghao <
>> zhanghao.c...@outlook.com
>> > >

Re: [Discuss] FLIP-362: Support minimum resource limitation

2023-10-02 Thread David Morávek
Hi Xiangyu,

The sentiment of the FLIP makes sense, but I keep wondering whether this is
the best way to think about the problem. I assume that "interactive session
cluster" users always want to keep some spare resources around (up to a
configured threshold) to reduce cold start instead of statically
configuring the minimum.

It's just a tiny change from the original proposal, but it could make all
the difference (eliminate overprovisioning, maintain latencies with a
growing # of jobs, ..)

WDYT?

Best,
D.

On Mon, Sep 25, 2023 at 5:11 PM Jing Ge  wrote:

> Hi Yangze,
>
> Thanks for the clarification. The example of two batch jobs teaming up with
> one streaming job is interesting.
>
> Best regards,
> Jing
>
> On Wed, Sep 20, 2023 at 7:19 PM Yangze Guo  wrote:
>
> > Thanks for the comments, Jing.
> >
> > > Will the minimum resource configuration also take effect for streaming
> > jobs in application mode?
> > > Since it is not recommended to configure
> slotmanager.number-of-slots.max
> > for streaming jobs, does it make sense to disable it for common streaming
> > jobs? At least disable the check for avoiding the oscillation?
> >
> > Yes. The minimum resource configuration will only be disabled in
> > standalone clusters atm. I agree it makes sense to disable it for a pure
> > streaming job, however:
> > - By default, the minimum resource is configured to 0. If users do not
> > proactively set it, either the oscillation check or the minimum
> > restriction can be considered as disabled.
> > - The minimum resource is a cluster-level configuration rather than a
> > job-level configuration. If a user has an application with two batch
> > jobs preceding the streaming job, they may also require this
> > configuration to accelerate the execution of batch jobs.
> >
> > WDYT?
> >
> > Best,
> > Yangze Guo
> >
> > On Thu, Sep 21, 2023 at 4:49 AM Jing Ge 
> > wrote:
> > >
> > > Hi Xiangyu,
> > >
> > > Thanks for driving it! There is one thing I am not sure I
> > > understand correctly.
> > >
> > > According to the FLIP: "The minimum resource limitation will be
> > implemented
> > > in the DefaultResourceAllocationStrategy of FineGrainedSlotManager.
> > >
> > > Each time when SlotManager needs to reconcile the cluster resources or
> > > fulfill job resource requirements, the
> DefaultResourceAllocationStrategy
> > > will check if the minimum resource requirement has been fulfilled. If
> it
> > is
> > > not, DefaultResourceAllocationStrategy will request new
> > PendingTaskManagers
> > > and FineGrainedSlotManager will allocate new worker resources
> > accordingly."
> > >
> > > "To avoid this oscillation, we need to check the worker number derived
> > from
> > > minimum and maximum resource configuration is consistent before
> starting
> > > SlotManager."
> > >
> > > Will the minimum resource configuration also take effect for streaming
> > jobs
> > > in application mode? Since it is not recommended to
> > > configure slotmanager.number-of-slots.max for streaming jobs, does it
> > make
> > > sense to disable it for common streaming jobs? At least disable the
> check
> > > for avoiding the oscillation?
> > >
> > > Best regards,
> > > Jing
> > >
> > >
> > > On Tue, Sep 19, 2023 at 4:58 PM Chen Zhanghao <
> zhanghao.c...@outlook.com
> > >
> > > wrote:
> > >
> > > > Thanks for driving this, Xiangyu. We use Session clusters for quick
> SQL
> > > > debugging internally, and found cold-start job submission slow due to
> > lack
> > > > of the exact minimum resource reservation feature proposed here. This
> > > > should improve the experience a lot for running short-lived jobs in
> > session
> > > > clusters.
> > > >
> > > > Best,
> > > > Zhanghao Chen
> > > > 
> > > > From: Yangze Guo
> > > > Sent: September 19, 2023 13:10
> > > > To: xiangyu feng
> > > > Cc: dev@flink.apache.org
> > > > Subject: Re: [Discuss] FLIP-362: Support minimum resource limitation
> > > >
> > > > Thanks for driving this @Xiangyu. This is a feature that many users
> > > > have requested for a long time. +1 for the overall proposal.
> > > >
> > > > Best,
> > > > Yangze Guo
> > > >
> > 

Re: [Discuss] FLIP-362: Support minimum resource limitation

2023-09-25 Thread Jing Ge
Hi Yangze,

Thanks for the clarification. The example of two batch jobs teaming up with
one streaming job is interesting.

Best regards,
Jing

On Wed, Sep 20, 2023 at 7:19 PM Yangze Guo  wrote:

> Thanks for the comments, Jing.
>
> > Will the minimum resource configuration also take effect for streaming
> > jobs in application mode?
> > Since it is not recommended to configure slotmanager.number-of-slots.max
> > for streaming jobs, does it make sense to disable it for common streaming
> > jobs? At least disable the check for avoiding the oscillation?
>
> Yes. The minimum resource configuration will only be disabled in
> standalone clusters atm. I agree it makes sense to disable it for a pure
> streaming job, however:
> - By default, the minimum resource is configured to 0. If users do not
> proactively set it, either the oscillation check or the minimum
> restriction can be considered as disabled.
> - The minimum resource is a cluster-level configuration rather than a
> job-level configuration. If a user has an application with two batch
> jobs preceding the streaming job, they may also require this
> configuration to accelerate the execution of batch jobs.
>
> WDYT?
>
> Best,
> Yangze Guo
>
> On Thu, Sep 21, 2023 at 4:49 AM Jing Ge 
> wrote:
> >
> > Hi Xiangyu,
> >
> > Thanks for driving it! There is one thing I am not sure I
> > understand correctly.
> >
> > According to the FLIP: "The minimum resource limitation will be
> > implemented in the DefaultResourceAllocationStrategy of
> > FineGrainedSlotManager.
> >
> > Each time when SlotManager needs to reconcile the cluster resources or
> > fulfill job resource requirements, the DefaultResourceAllocationStrategy
> > will check if the minimum resource requirement has been fulfilled. If it
> > is not, DefaultResourceAllocationStrategy will request new
> > PendingTaskManagers and FineGrainedSlotManager will allocate new worker
> > resources accordingly."
> >
> > "To avoid this oscillation, we need to check the worker number derived
> > from minimum and maximum resource configuration is consistent before
> > starting SlotManager."
> >
> > Will the minimum resource configuration also take effect for streaming
> > jobs in application mode? Since it is not recommended to configure
> > slotmanager.number-of-slots.max for streaming jobs, does it make sense
> > to disable it for common streaming jobs? At least disable the check
> > for avoiding the oscillation?
> >
> > Best regards,
> > Jing
> >
> >
> > On Tue, Sep 19, 2023 at 4:58 PM Chen Zhanghao
> > wrote:
> >
> > > Thanks for driving this, Xiangyu. We use Session clusters for quick SQL
> > > debugging internally, and found cold-start job submission slow due to
> > > the lack of the exact minimum resource reservation feature proposed
> > > here. This should improve the experience a lot for running short-lived
> > > jobs in session clusters.
> > >
> > > Best,
> > > Zhanghao Chen
> > > 
> > > From: Yangze Guo
> > > Sent: September 19, 2023 13:10
> > > To: xiangyu feng
> > > Cc: dev@flink.apache.org
> > > Subject: Re: [Discuss] FLIP-362: Support minimum resource limitation
> > >
> > > Thanks for driving this @Xiangyu. This is a feature that many users
> > > have requested for a long time. +1 for the overall proposal.
> > >
> > > Best,
> > > Yangze Guo
> > >
> > > On Tue, Sep 19, 2023 at 11:48 AM xiangyu feng 
> > > wrote:
> > > >
> > > > Hi Devs,
> > > >
> > > > I'm opening this thread to discuss FLIP-362: Support minimum resource
> > > > limitation. The design doc can be found at:
> > > > FLIP-362: Support minimum resource limitation
> > > >
> > > > Currently, the Flink cluster only requests Task Managers (TMs) when
> > > > there is a resource requirement, and idle TMs are released after a
> > > > certain period of time. However, in certain scenarios, such as running
> > > > short-lived jobs in a session cluster and scheduling batch jobs stage
> > > > by stage, we need to improve the efficiency of job execution by
> > > > maintaining a certain number of available workers in the cluster at
> > > > all times.
> > > >
> > > > After discussing this with Yangze, we introduced this new feature. The
> > > > newly added public options and proposed changes are described in this
> > > > FLIP.
> > > >
> > > > Looking forward to your feedback, thanks.
> > > >
> > > > Best regards,
> > > > Xiangyu
> > > >
> > >
>


Re: [Discuss] FLIP-362: Support minimum resource limitation

2023-09-22 Thread xiangyu feng
Hi all,

Thank you for the comments.

If there are no further comments, we will open the voting thread in 3 days.

Regards,
Xiangyu

Yangze Guo wrote on Thu, Sep 21, 2023 at 11:19:

> Thanks for the reply, Shammon.
>
> As in the example described in my last response, an application could
> contain multiple jobs, both batch and streaming. I am not inclined to
> disable it in Application mode, in case users want to leverage it to
> accelerate the preceding batch jobs in their application.
>
> Best,
> Yangze Guo
>
> On Thu, Sep 21, 2023 at 11:15 AM Shammon FY  wrote:
> >
> > Hi,
> >
> > I agree that `minimum resource limitation` will bring value to Flink
> > session clusters, but for `Application Mode`, is it useful for streaming
> > and batch jobs? Do we need to explicitly disable it in application mode,
> > rather than relying on the default value of 0?
> >
> > Best,
> > Shammon FY
> >
> > On Thu, Sep 21, 2023 at 10:18 AM Yangze Guo  wrote:
> >
> > > Thanks for the comments, Jing.
> > >
> > > > Will the minimum resource configuration also take effect for
> streaming
> > > jobs in application mode?
> > > > Since it is not recommended to configure
> slotmanager.number-of-slots.max
> > > for streaming jobs, does it make sense to disable it for common
> streaming
> > > jobs? At least disable the check for avoiding the oscillation?
> > >
> > > Yes. The minimum resource configuration will only be disabled in
> > > standalone clusters atm. I agree it makes sense to disable it for a pure
> > > streaming job, however:
> > > - By default, the minimum resource is configured to 0. If users do not
> > > proactively set it, either the oscillation check or the minimum
> > > restriction can be considered as disabled.
> > > - The minimum resource is a cluster-level configuration rather than a
> > > job-level configuration. If a user has an application with two batch
> > > jobs preceding the streaming job, they may also require this
> > > configuration to accelerate the execution of batch jobs.
> > >
> > > WDYT?
> > >
> > > Best,
> > > Yangze Guo
> > >
> > > On Thu, Sep 21, 2023 at 4:49 AM Jing Ge 
> > > wrote:
> > > >
> > > > Hi Xiangyu,
> > > >
> > > > Thanks for driving it! There is one thing I am not sure I
> > > > understand correctly.
> > > >
> > > > According to the FLIP: "The minimum resource limitation will be
> > > implemented
> > > > in the DefaultResourceAllocationStrategy of FineGrainedSlotManager.
> > > >
> > > > Each time when SlotManager needs to reconcile the cluster resources
> or
> > > > fulfill job resource requirements, the
> DefaultResourceAllocationStrategy
> > > > will check if the minimum resource requirement has been fulfilled.
> If it
> > > is
> > > > not, DefaultResourceAllocationStrategy will request new
> > > PendingTaskManagers
> > > > and FineGrainedSlotManager will allocate new worker resources
> > > accordingly."
> > > >
> > > > "To avoid this oscillation, we need to check the worker number
> derived
> > > from
> > > > minimum and maximum resource configuration is consistent before
> starting
> > > > SlotManager."
> > > >
> > > > Will the minimum resource configuration also take effect for
> streaming
> > > jobs
> > > > in application mode? Since it is not recommended to
> > > > configure slotmanager.number-of-slots.max for streaming jobs, does it
> > > make
> > > > sense to disable it for common streaming jobs? At least disable the
> check
> > > > for avoiding the oscillation?
> > > >
> > > > Best regards,
> > > > Jing
> > > >
> > > >
> > > > On Tue, Sep 19, 2023 at 4:58 PM Chen Zhanghao <
> zhanghao.c...@outlook.com
> > > >
> > > > wrote:
> > > >
> > > > > Thanks for driving this, Xiangyu. We use Session clusters for
> quick SQL
> > > > > debugging internally, and found cold-start job submission slow due
> to
> > > lack
> > > > > of the exact minimum resource reservation feature proposed here.
> This
> > > > > should improve the experience a lot for running short lived-jobs in
> > > session
> > > > > clusters.
> > > > >

Re: [Discuss] FLIP-362: Support minimum resource limitation

2023-09-20 Thread Yangze Guo
Thanks for the reply, Shammon.

As in the example described in my last response, an application could
contain multiple jobs, both batch and streaming. I am not inclined to
disable it in Application mode, in case users want to leverage it to
accelerate the preceding batch jobs in their application.
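
As a rough illustration of that case (plain Python, not Flink code; the
stage demands are invented, and it assumes every idle worker above the
floor is released before the next job starts):

```python
from typing import List


def workers_requested_on_demand(stage_demands: List[int], min_workers: int) -> int:
    """Count workers that have to be requested while a job is already waiting.

    Assumes the cluster falls back to `min_workers` between jobs, so each
    job only waits for the workers it needs beyond that floor.
    """
    return sum(max(0, demand - min_workers) for demand in stage_demands)


# Hypothetical application: two batch jobs, then one streaming job.
demands = [4, 4, 2]
print(workers_requested_on_demand(demands, min_workers=0))  # 10
print(workers_requested_on_demand(demands, min_workers=4))  # 0
```

Because the floor is a cluster-level setting, it helps every job in the
application that fits under it, not only the last one.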

Best,
Yangze Guo

On Thu, Sep 21, 2023 at 11:15 AM Shammon FY  wrote:
>
> Hi,
>
> I agree that `minimum resource limitation` will bring value to Flink
> session clusters, but for `Application Mode`, is it useful for streaming
> and batch jobs? Do we need to explicitly disable it in application mode,
> rather than relying on the default value of 0?
>
> Best,
> Shammon FY
>
> On Thu, Sep 21, 2023 at 10:18 AM Yangze Guo  wrote:
>
> > Thanks for the comments, Jing.
> >
> > > Will the minimum resource configuration also take effect for streaming
> > jobs in application mode?
> > > Since it is not recommended to configure slotmanager.number-of-slots.max
> > for streaming jobs, does it make sense to disable it for common streaming
> > jobs? At least disable the check for avoiding the oscillation?
> >
> > Yes. The minimum resource configuration will only be disabled in
> > standalone clusters atm. I agree it makes sense to disable it for a pure
> > streaming job, however:
> > - By default, the minimum resource is configured to 0. If users do not
> > proactively set it, either the oscillation check or the minimum
> > restriction can be considered as disabled.
> > - The minimum resource is a cluster-level configuration rather than a
> > job-level configuration. If a user has an application with two batch
> > jobs preceding the streaming job, they may also require this
> > configuration to accelerate the execution of batch jobs.
> >
> > WDYT?
> >
> > Best,
> > Yangze Guo
> >
> > On Thu, Sep 21, 2023 at 4:49 AM Jing Ge 
> > wrote:
> > >
> > > Hi Xiangyu,
> > >
> > > Thanks for driving it! There is one thing I am not sure I
> > > understand correctly.
> > >
> > > According to the FLIP: "The minimum resource limitation will be
> > implemented
> > > in the DefaultResourceAllocationStrategy of FineGrainedSlotManager.
> > >
> > > Each time when SlotManager needs to reconcile the cluster resources or
> > > fulfill job resource requirements, the DefaultResourceAllocationStrategy
> > > will check if the minimum resource requirement has been fulfilled. If it
> > is
> > > not, DefaultResourceAllocationStrategy will request new
> > PendingTaskManagers
> > > and FineGrainedSlotManager will allocate new worker resources
> > accordingly."
> > >
> > > "To avoid this oscillation, we need to check the worker number derived
> > from
> > > minimum and maximum resource configuration is consistent before starting
> > > SlotManager."
> > >
> > > Will the minimum resource configuration also take effect for streaming
> > jobs
> > > in application mode? Since it is not recommended to
> > > configure slotmanager.number-of-slots.max for streaming jobs, does it
> > make
> > > sense to disable it for common streaming jobs? At least disable the check
> > > for avoiding the oscillation?
> > >
> > > Best regards,
> > > Jing
> > >
> > >
> > > On Tue, Sep 19, 2023 at 4:58 PM Chen Zhanghao
> > > wrote:
> > >
> > > > Thanks for driving this, Xiangyu. We use Session clusters for quick SQL
> > > > debugging internally, and found cold-start job submission slow due to
> > lack
> > > > of the exact minimum resource reservation feature proposed here. This
> > > > should improve the experience a lot for running short-lived jobs in
> > session
> > > > clusters.
> > > >
> > > > Best,
> > > > Zhanghao Chen
> > > > 
> > > > From: Yangze Guo
> > > > Sent: September 19, 2023 13:10
> > > > To: xiangyu feng
> > > > Cc: dev@flink.apache.org
> > > > Subject: Re: [Discuss] FLIP-362: Support minimum resource limitation
> > > >
> > > > Thanks for driving this @Xiangyu. This is a feature that many users
> > > > have requested for a long time. +1 for the overall proposal.
> > > >
> > > > Best,
> > > > Yangze Guo
> > > >
> > > > On Tue, Sep 19, 2023 at 11:48 AM xiangyu feng 
> > > > wrote:
> > > > >

Re: [Discuss] FLIP-362: Support minimum resource limitation

2023-09-20 Thread Shammon FY
Hi,

I agree that `minimum resource limitation` will bring value to Flink
session clusters, but for `Application Mode`, is it useful for streaming
and batch jobs? Do we need to explicitly disable it in application mode,
rather than relying on the default value of 0?
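
For reference, a minimal sketch of why the default of 0 behaves like "off"
(plain Python, not Flink code; it reduces the minimum and maximum to slot
counts only, and the function names are invented):

```python
import math


def workers_for_min(min_slots: int, slots_per_worker: int) -> int:
    """Smallest number of workers whose slots cover the configured minimum."""
    return math.ceil(min_slots / slots_per_worker)


def workers_for_max(max_slots: int, slots_per_worker: int) -> int:
    """Largest number of workers whose slots stay within the configured maximum."""
    return max_slots // slots_per_worker


def is_consistent(min_slots: int, max_slots: int, slots_per_worker: int) -> bool:
    """True if the minimum can be met without exceeding the maximum."""
    return workers_for_min(min_slots, slots_per_worker) <= workers_for_max(
        max_slots, slots_per_worker
    )


def workers_to_request(current_workers: int, min_slots: int, slots_per_worker: int) -> int:
    """Extra workers needed to reach the minimum; never negative."""
    return max(0, workers_for_min(min_slots, slots_per_worker) - current_workers)


print(workers_to_request(0, min_slots=0, slots_per_worker=4))       # 0
print(is_consistent(min_slots=0, max_slots=8, slots_per_worker=4))  # True
print(is_consistent(min_slots=6, max_slots=7, slots_per_worker=4))  # False
```

With a minimum of 0, nothing is ever requested proactively and the
consistency check passes for any non-negative maximum. The last call shows
the kind of setting the check is meant to reject: two workers are needed to
cover 6 slots, but two workers already exceed a maximum of 7.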

Best,
Shammon FY

On Thu, Sep 21, 2023 at 10:18 AM Yangze Guo  wrote:

> Thanks for the comments, Jing.
>
> > Will the minimum resource configuration also take effect for streaming
> jobs in application mode?
> > Since it is not recommended to configure slotmanager.number-of-slots.max
> for streaming jobs, does it make sense to disable it for common streaming
> jobs? At least disable the check for avoiding the oscillation?
>
> Yes. The minimum resource configuration will only be disabled in
> standalone clusters atm. I agree it makes sense to disable it for a pure
> streaming job, however:
> - By default, the minimum resource is configured to 0. If users do not
> proactively set it, either the oscillation check or the minimum
> restriction can be considered as disabled.
> - The minimum resource is a cluster-level configuration rather than a
> job-level configuration. If a user has an application with two batch
> jobs preceding the streaming job, they may also require this
> configuration to accelerate the execution of batch jobs.
>
> WDYT?
>
> Best,
> Yangze Guo
>
> On Thu, Sep 21, 2023 at 4:49 AM Jing Ge 
> wrote:
> >
> > Hi Xiangyu,
> >
> > Thanks for driving it! There is one thing I am not sure I
> > understand correctly.
> >
> > According to the FLIP: "The minimum resource limitation will be
> implemented
> > in the DefaultResourceAllocationStrategy of FineGrainedSlotManager.
> >
> > Each time when SlotManager needs to reconcile the cluster resources or
> > fulfill job resource requirements, the DefaultResourceAllocationStrategy
> > will check if the minimum resource requirement has been fulfilled. If it
> is
> > not, DefaultResourceAllocationStrategy will request new
> PendingTaskManagers
> > and FineGrainedSlotManager will allocate new worker resources
> accordingly."
> >
> > "To avoid this oscillation, we need to check the worker number derived
> from
> > minimum and maximum resource configuration is consistent before starting
> > SlotManager."
> >
> > Will the minimum resource configuration also take effect for streaming
> jobs
> > in application mode? Since it is not recommended to
> > configure slotmanager.number-of-slots.max for streaming jobs, does it
> make
> > sense to disable it for common streaming jobs? At least disable the check
> > for avoiding the oscillation?
> >
> > Best regards,
> > Jing
> >
> >
> > On Tue, Sep 19, 2023 at 4:58 PM Chen Zhanghao
> > wrote:
> >
> > > Thanks for driving this, Xiangyu. We use Session clusters for quick SQL
> > > debugging internally, and found cold-start job submission slow due to
> lack
> > > of the exact minimum resource reservation feature proposed here. This
> > > should improve the experience a lot for running short lived-jobs in
> session
> > > clusters.
> > >
> > > Best,
> > > Zhanghao Chen
> > > 
> > > 发件人: Yangze Guo 
> > > 发送时间: 2023年9月19日 13:10
> > > 收件人: xiangyu feng 
> > > 抄送: dev@flink.apache.org 
> > > 主题: Re: [Discuss] FLIP-362: Support minimum resource limitation
> > >
> > > Thanks for driving this @Xiangyu. This is a feature that many users
> > > have requested for a long time. +1 for the overall proposal.
> > >
> > > Best,
> > > Yangze Guo
> > >
> > > On Tue, Sep 19, 2023 at 11:48 AM xiangyu feng 
> > > wrote:
> > > >
> > > > Hi Devs,
> > > >
> > > > I'm opening this thread to discuss FLIP-362: Support minimum resource
> > > limitation. The design doc can be found at:
> > > > FLIP-362: Support minimum resource limitation
> > > >
> > > > Currently, the Flink cluster only requests Task Managers (TMs) when
> > > there is a resource requirement, and idle TMs are released after a
> certain
> > > period of time. However, in certain scenarios, such as running short
> > > lived-jobs in session cluster and scheduling batch jobs stage by
> stage, we
> > > need to improve the efficiency of job execution by maintaining a
> certain
> > > number of available workers in the cluster all the time.
> > > >
> > > > After discussed with Yangze, we introduced this new feature. The new
> > > added public options and proposed changes are described in this FLIP.
> > > >
> > > > Looking forward to your feedback, thanks.
> > > >
> > > > Best regards,
> > > > Xiangyu
> > > >
> > >
>


Re: [Discuss] FLIP-362: Support minimum resource limitation

2023-09-20 Thread xiangyu feng
Hi Jing,

Thanks for pointing this out. As described by Yangze, the min resource
option will be set to 0 by default, in which case the oscillation check is
disabled.
In most cases, common streaming jobs won't be affected by this newly added
option.

I've updated the FLIP to explain this.

Thx,
Xiangyu



Re: [Discuss] FLIP-362: Support minimum resource limitation

2023-09-20 Thread Yangze Guo
Thanks for the comments, Jing.

> Will the minimum resource configuration also take effect for streaming jobs 
> in application mode?
> Since it is not recommended to configure slotmanager.number-of-slots.max for 
> streaming jobs, does it make sense to disable it for common streaming jobs? 
> At least disable the check for avoiding the oscillation?

Yes. The minimum resource configuration will only be disabled in
standalone clusters atm. I agree it makes sense to disable it for a pure
streaming job, however:
- By default, the minimum resource is configured to 0. If users do not
proactively set it, both the oscillation check and the minimum
restriction can be considered disabled.
- The minimum resource is a cluster-level configuration rather than a
job-level configuration. If a user has an application with two batch
jobs preceding the streaming job, they may also require this
configuration to accelerate the execution of batch jobs.
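
To make the two points above concrete, here is a minimal Java sketch of a
reconcile-time check. It assumes the minimum is expressed as a slot count and
that every worker offers the same number of slots. The names
MinResourceSketch and additionalWorkersNeeded are hypothetical and are not
Flink APIs; the actual logic in DefaultResourceAllocationStrategy may differ.

```java
/** Hypothetical illustration only; not part of Flink. */
final class MinResourceSketch {

    /**
     * Returns how many extra workers would be needed so that the cluster
     * holds at least minSlots slots. With the default minimum of 0 this
     * always returns 0, which is why the restriction counts as disabled
     * unless a user sets it explicitly.
     */
    static int additionalWorkersNeeded(int minSlots, int currentSlots, int slotsPerWorker) {
        if (slotsPerWorker <= 0) {
            throw new IllegalArgumentException("slotsPerWorker must be positive");
        }
        if (minSlots <= currentSlots) {
            return 0; // minimum already fulfilled (or not configured)
        }
        int missingSlots = minSlots - currentSlots;
        // Round up: a worker is requested as a whole even if only
        // some of its slots are needed to reach the minimum.
        return (missingSlots + slotsPerWorker - 1) / slotsPerWorker;
    }
}
```

Because the inputs describe the whole cluster rather than a single job, the
same check would apply no matter which job in an application is running.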

WDYT?

Best,
Yangze Guo



Re: [Discuss] FLIP-362: Support minimum resource limitation

2023-09-20 Thread Jing Ge
Hi Xiangyu,

Thanks for driving it! There is one thing I am not sure I
understand correctly.

According to the FLIP: "The minimum resource limitation will be implemented
in the DefaultResourceAllocationStrategy of FineGrainedSlotManager.

Each time when SlotManager needs to reconcile the cluster resources or
fulfill job resource requirements, the DefaultResourceAllocationStrategy
will check if the minimum resource requirement has been fulfilled. If it is
not, DefaultResourceAllocationStrategy will request new PendingTaskManagers
and FineGrainedSlotManager will allocate new worker resources accordingly."

"To avoid this oscillation, we need to check the worker number derived from
minimum and maximum resource configuration is consistent before starting
SlotManager."

Will the minimum resource configuration also take effect for streaming jobs
in application mode? Since it is not recommended to
configure slotmanager.number-of-slots.max for streaming jobs, does it make
sense to disable it for common streaming jobs? At least disable the check
for avoiding the oscillation?
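
For readers following along, one possible reading of the "consistent" check
quoted above can be sketched in Java as follows. This is an assumption, not
the FLIP's definition: it treats the minimum and maximum as slot counts,
assumes uniform workers, and calls the configuration consistent when the
number of workers needed to reach the minimum does not exceed the number of
workers the maximum allows. The names MinMaxConsistencySketch and
isConsistent are hypothetical and are not Flink APIs.

```java
/** Hypothetical illustration only; not part of Flink. */
final class MinMaxConsistencySketch {

    /**
     * Returns true if the workers needed to satisfy minSlots fit within the
     * workers permitted by maxSlots. An unbounded maximum is modelled here
     * as Integer.MAX_VALUE (an assumption of this sketch).
     */
    static boolean isConsistent(int minSlots, int maxSlots, int slotsPerWorker) {
        if (slotsPerWorker <= 0) {
            throw new IllegalArgumentException("slotsPerWorker must be positive");
        }
        if (minSlots == 0) {
            return true; // minimum not configured, so there is nothing to check
        }
        // Workers needed to reach the minimum, rounded up (long avoids overflow).
        long workersForMin = ((long) minSlots + slotsPerWorker - 1) / slotsPerWorker;
        // Whole workers that fit under the maximum, rounded down.
        long workersAllowedByMax = maxSlots / slotsPerWorker;
        return workersForMin <= workersAllowedByMax;
    }
}
```

Under this reading, a minimum of 5 slots and a maximum of 6 slots with 4
slots per worker would be rejected, since reaching the minimum takes two
workers (8 slots) while the maximum only admits one.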

Best regards,
Jing


On Tue, Sep 19, 2023 at 4:58 PM Chen Zhanghao 
wrote:

> Thanks for driving this, Xiangyu. We use Session clusters for quick SQL
> debugging internally, and found cold-start job submission slow due to the lack
> of the exact minimum resource reservation feature proposed here. This
> should improve the experience a lot for running short-lived jobs in session
> clusters.
>
> Best,
> Zhanghao Chen
> 
> From: Yangze Guo
> Sent: September 19, 2023 13:10
> To: xiangyu feng
> Cc: dev@flink.apache.org
> Subject: Re: [Discuss] FLIP-362: Support minimum resource limitation
>
> Thanks for driving this @Xiangyu. This is a feature that many users
> have requested for a long time. +1 for the overall proposal.
>
> Best,
> Yangze Guo
>
> On Tue, Sep 19, 2023 at 11:48 AM xiangyu feng 
> wrote:
> >
> > Hi Devs,
> >
> > I'm opening this thread to discuss FLIP-362: Support minimum resource
> > limitation. The design doc can be found at:
> > FLIP-362: Support minimum resource limitation
> >
> > Currently, the Flink cluster only requests Task Managers (TMs) when
> > there is a resource requirement, and idle TMs are released after a certain
> > period of time. However, in certain scenarios, such as running
> > short-lived jobs in a session cluster and scheduling batch jobs stage by
> > stage, we need to improve the efficiency of job execution by maintaining
> > a certain number of available workers in the cluster at all times.
> >
> > After discussing with Yangze, we introduced this new feature. The newly
> > added public options and proposed changes are described in this FLIP.
> >
> > Looking forward to your feedback, thanks.
> >
> > Best regards,
> > Xiangyu
> >
>


Re: [Discuss] FLIP-362: Support minimum resource limitation

2023-09-18 Thread Yangze Guo
Thanks for driving this @Xiangyu. This is a feature that many users
have requested for a long time. +1 for the overall proposal.

Best,
Yangze Guo

On Tue, Sep 19, 2023 at 11:48 AM xiangyu feng  wrote:
>
> Hi Devs,
>
> I'm opening this thread to discuss FLIP-362: Support minimum resource
> limitation. The design doc can be found at:
> FLIP-362: Support minimum resource limitation
>
> Currently, the Flink cluster only requests Task Managers (TMs) when there is
> a resource requirement, and idle TMs are released after a certain period of
> time. However, in certain scenarios, such as running short-lived jobs in a
> session cluster and scheduling batch jobs stage by stage, we need to improve
> the efficiency of job execution by maintaining a certain number of available
> workers in the cluster at all times.
>
> After discussing with Yangze, we introduced this new feature. The newly added
> public options and proposed changes are described in this FLIP.
>
> Looking forward to your feedback, thanks.
>
> Best regards,
> Xiangyu
>