Bowen Li created FLINK-39990:
--------------------------------
Summary: Expose DynamicKafkaSource.activeSplitCount metric per
source reader
Key: FLINK-39990
URL: https://issues.apache.org/jira/browse/FLINK-39990
Project: Flink
Issue Type: New Feature
Components: Connectors / Kafka
Affects Versions: 2.3.0
Reporter: Bowen Li
Fix For: 2.4.0
{{DynamicKafkaSource}} should expose a per-reader (per-subtask) metric named
{{DynamicKafkaSource.activeSplitCount}}.
The Flink Kubernetes Operator autoscaler will use the metric to detect when
cluster or partition removal leaves subtasks with no active splits, or reduces
the total active partition count below the job parallelism.
Metric semantics:
* Gauge name: {{DynamicKafkaSource.activeSplitCount}}
* Scope: per {{DynamicKafkaSource}} reader/subtask
* Value: number of active splits currently assigned to the reader
* Report 0 when no active splits are assigned
* Removed partitions that are retained must not count toward this metric; only
their offset/state is kept
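
The semantics above can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the connector's implementation: {{ActiveSplitTracker}}, its methods, and the split ID strings are hypothetical names, and a {{java.util.function.Supplier}} stands in for Flink's gauge type so that the snippet compiles without Flink on the classpath.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Supplier;

/** Hypothetical helper for illustration; not part of the Flink Kafka connector. */
final class ActiveSplitTracker {
    // IDs of splits currently assigned to this reader and still being read.
    private final Set<String> activeSplits = new HashSet<>();
    // IDs of removed partitions whose offset/state is retained; never counted.
    private final Set<String> retainedSplits = new HashSet<>();

    /** Assigns a split; a previously retained split becomes active again. */
    synchronized void assign(String splitId) {
        retainedSplits.remove(splitId);
        activeSplits.add(splitId);
    }

    /** Removes a split from the active set but keeps it as retained state. */
    synchronized void remove(String splitId) {
        if (activeSplits.remove(splitId)) {
            retainedSplits.add(splitId);
        }
    }

    /** Value the gauge would report: active splits only, 0 when there are none. */
    synchronized int activeSplitCount() {
        return activeSplits.size();
    }

    /** Stand-in for a metric gauge (assumption: Supplier replaces Flink's Gauge). */
    Supplier<Integer> asGauge() {
        return this::activeSplitCount;
    }
}
```

In a real reader the supplier would presumably be registered through Flink's {{MetricGroup#gauge}}. How the dotted metric name maps onto metric groups is not specified in this ticket.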
Autoscaler assumptions:
- sum(activeSplitCount) < parallelism indicates the job should scale down
- any subtask with activeSplitCount == 0 indicates the job should rebalance or
restart
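
The two assumptions above amount to simple checks over the per-subtask gauge values. The following sketch only illustrates that arithmetic; {{ActiveSplitSignals}} and its method names are hypothetical, and the actual autoscaler logic belongs to the Flink Kubernetes Operator and may differ.

```java
import java.util.List;

/** Hypothetical illustration of the autoscaler assumptions; not operator code. */
final class ActiveSplitSignals {
    private ActiveSplitSignals() {}

    /** True when sum(activeSplitCount) < parallelism, i.e. a scale-down signal. */
    static boolean suggestsScaleDown(List<Integer> perSubtaskCounts, int parallelism) {
        int total = perSubtaskCounts.stream().mapToInt(Integer::intValue).sum();
        return total < parallelism;
    }

    /** True when any subtask reports 0, i.e. a rebalance-or-restart signal. */
    static boolean hasEmptySubtask(List<Integer> perSubtaskCounts) {
        return perSubtaskCounts.stream().anyMatch(count -> count == 0);
    }
}
```

For example, per-subtask counts of 2, 1, 0, and 0 at parallelism 4 sum to 3, so both checks would fire: the total is below the parallelism and two subtasks are empty.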
Scope:
This ticket only adds the source metric. Autoscaler behavior remains owned by
the Flink Kubernetes Operator.
Acceptance Criteria:
- DynamicKafkaSource exposes activeSplitCount for every reader/subtask
- Metric reports 0 before assignment and after all active splits are removed
- Metric updates after split assignment, removal, and reactivation
- Retained removed partitions are excluded from the count
- Tests cover assignment, removal, empty reader, retained-state exclusion, and
reactivation
- Metrics documentation is updated
--
This message was sent by Atlassian Jira
(v8.20.10#820010)