[
https://issues.apache.org/jira/browse/FLINK-36640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matyas Orhidi updated FLINK-36640:
----------------------------------
Description:
FLINK-32589 introduced a feature to carry over parallelism overrides between
application upgrades. This works well on the happy path, preventing
users from clearing the actual parallelism settings on regular updates. However,
in certain scenarios users may actually want to reset the parallelism to an
initial or desired state to force the application out of an unwanted scaling
state.
Such an unwanted state can be, for example, overly large parallelisms that
cause throttling. The current workaround needs two upgrade steps:
setting {{job.autoscaler.enabled: false}} and then {{job.autoscaler.enabled:
true}} again. We could combine the two steps into an {{autoscalerReset
nonce}}. We could also reintroduce the config with manual vertex parallelism
overrides to be able to reset to a desired scaling state, similar to what we do
when starting a job from specific Kafka offsets.
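
To make the nonce idea concrete, here is a minimal sketch of how the decision could work. It is only an illustration under stated assumptions, not the operator's actual implementation: the field names {{autoscaler_reset_nonce}} and {{manual_overrides}} and the helper {{overrides_after_upgrade}} are hypothetical and do not exist in the Kubernetes Operator today.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class JobSpec:
    # Hypothetical field mirroring the proposed autoscalerReset nonce.
    autoscaler_reset_nonce: Optional[int] = None
    # Hypothetical manual per-vertex parallelism (vertex id -> parallelism)
    # to reset to; empty means "fall back to the job's initial parallelism".
    manual_overrides: Dict[str, int] = field(default_factory=dict)


def overrides_after_upgrade(
    prev_spec: JobSpec, new_spec: JobSpec, carried: Dict[str, int]
) -> Dict[str, int]:
    """Pick the parallelism overrides to apply after an upgrade.

    If the nonce changed, drop the carried-over autoscaler overrides and
    use the manual ones from the new spec. Otherwise keep carrying the
    overrides over, as on a regular update.
    """
    if new_spec.autoscaler_reset_nonce != prev_spec.autoscaler_reset_nonce:
        return dict(new_spec.manual_overrides)
    return dict(carried)


# Overrides the autoscaler had applied before the upgrade (made-up values).
carried = {"vertex-a": 64, "vertex-b": 128}

# Regular update: nonce unchanged, overrides are carried over.
regular = overrides_after_upgrade(JobSpec(1), JobSpec(1), carried)

# Reset: nonce bumped, overrides replaced by the desired state.
reset = overrides_after_upgrade(
    JobSpec(1), JobSpec(2, {"vertex-a": 4}), carried
)
```

In this sketch a single spec change (bumping the nonce) replaces the current disable-then-enable sequence, and leaving the manual overrides empty resets to the initial state.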
was:
FLINK-32589 introduced a feature to carry over parallelism overrides between
application upgrades. This serves pretty well on the happy path preventing
users from clearing the actual parallelism settings on regular updates. However
in certain scenarios users may actually want to reset the parallelism to an
initial or desired state to force the application out of an unwanted scaling
state.
Such unwanted state can be for example having too large parallelisms that
causes throttling, or similar. The current workaround needs to upgrade steps by
setting `job.autoscaler.enabled: false` and then `job.autoscaler.enabled: true`
again. We could combine it with a `autoscalerReset` nonce. We could also
reintroduce the config with manual vertex parallelism settings to be able to
reset to a configurable state, similar to what we do when starting a job from
specific Kafka offsets.
> Provide an easy way to reset autoscaling
> ----------------------------------------
>
> Key: FLINK-36640
> URL: https://issues.apache.org/jira/browse/FLINK-36640
> Project: Flink
> Issue Type: Improvement
> Components: Kubernetes Operator
> Reporter: Matyas Orhidi
> Priority: Major
>
> FLINK-32589 introduced a feature to carry over parallelism overrides between
> application upgrades. This works well on the happy path, preventing
> users from clearing the actual parallelism settings on regular updates.
> However, in certain scenarios users may actually want to reset the parallelism
> to an initial or desired state to force the application out of an unwanted
> scaling state.
> Such an unwanted state can be, for example, overly large parallelisms that
> cause throttling. The current workaround needs two upgrade steps:
> setting {{job.autoscaler.enabled: false}} and then {{job.autoscaler.enabled:
> true}} again. We could combine the two steps into an {{autoscalerReset
> nonce}}. We could also reintroduce the config with manual vertex
> parallelism overrides to be able to reset to a desired scaling state, similar
> to what we do when starting a job from specific Kafka offsets.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)