Hi Dong!

Let me try to answer the questions :)

1: busyTimeMsPerSecond is not specific to CPU. If I understand correctly,
it measures the time an operator spends in its main record processing
loop, and this includes IO operations too.

2: I agree, we should add this to the FLIP. It would be a Duration config
holding the expected catch-up time after rescaling (let's say 5 minutes).
The catch-up time could be computed from the current data rate and the
calculated max processing rate after the rescale.
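
To make the idea concrete, here is a rough sketch of the arithmetic. It is
not part of the FLIP: the class name, the method names and the backlog size
input are assumptions made only for illustration.

```java
import java.time.Duration;

// Hypothetical helper, not taken from the FLIP or from Flink itself.
public class CatchUpEstimate {

    /**
     * Processing rate (records/sec) needed to keep up with the incoming
     * data and also drain the backlog within the configured catch-up time.
     */
    public static double requiredProcessingRate(
            double inputRatePerSec, double backlogRecords, Duration catchUpTarget) {
        double seconds = catchUpTarget.toMillis() / 1000.0;
        if (seconds <= 0) {
            throw new IllegalArgumentException("catchUpTarget must be positive");
        }
        return inputRatePerSec + backlogRecords / seconds;
    }

    /**
     * Estimated time to drain the backlog, given the max processing rate
     * calculated for the rescaled job. Only the capacity left over after
     * handling the incoming data goes towards the backlog.
     */
    public static Duration estimatedCatchUp(
            double inputRatePerSec, double backlogRecords, double maxProcessingRatePerSec) {
        double spareRate = maxProcessingRatePerSec - inputRatePerSec;
        if (spareRate <= 0) {
            throw new IllegalArgumentException(
                    "max processing rate must exceed the input rate to catch up");
        }
        return Duration.ofMillis(Math.round(backlogRecords / spareRate * 1000.0));
    }

    public static void main(String[] args) {
        Duration target = Duration.ofMinutes(5);
        double rate = requiredProcessingRate(1000.0, 600_000.0, target);
        System.out.println("required rate: " + rate);
        System.out.println("catch-up time: " + estimatedCatchUp(1000.0, 600_000.0, rate));
    }
}
```

The two methods are inverses of each other: one starts from the configured
Duration, the other from a candidate processing rate.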

3: The current proposal doesn't have per-operator configs. The target
utilization would apply uniformly to all operators.

4: It should be configurable, yes.

5,6: The names haven't been finalized but I think these are minor details.
We could add concrete names to the FLIP :)

Cheers,
Gyula


On Sun, Nov 6, 2022 at 5:19 PM Dong Lin <lindon...@gmail.com> wrote:

> Hi Max,
>
> Thank you for the proposal. The proposal tackles a very important issue
> for Flink users and the design looks promising overall!
>
> I have some questions to better understand the proposed public interfaces
> and the algorithm.
>
> 1) The proposal seems to assume that the operator's busyTimeMsPerSecond
> could reach 1 sec. I believe this is mostly true for cpu-bound operators.
> Could you confirm that this can also be true for io-bound operators such as
> sinks? For example, suppose a Kafka Sink subtask has reached an I/O
> bottleneck when flushing data out to the Kafka cluster, will
> busyTimeMsPerSecond reach 1 sec?
>
> 2) It is said that "users can configure a maximum time to fully process
> the backlog". The configuration section does not seem to provide this
> config. Could you specify this? And any chance this proposal can provide
> the formula for calculating the new processing rate?
>
> 3) How are users expected to specify the per-operator configs (e.g. target
> utilization)? For example, should users specify it programmatically in a
> DataStream/Table/SQL API?
>
> 4) How often will the Flink Kubernetes operator query metrics from
> JobManager? Is this configurable?
>
> 5) Could you specify the config name and default value for the proposed
> configs?
>
> 6) Could you add the name/mbean/type for the proposed metrics?
>
>
> Cheers,
> Dong
>
>
>
