Xintong Song created FLINK-19068:
------------------------------------

             Summary: Filter verbose pod events for KubernetesResourceManagerDriver
                 Key: FLINK-19068
                 URL: https://issues.apache.org/jira/browse/FLINK-19068
             Project: Flink
          Issue Type: Improvement
          Components: Deployment / Kubernetes
            Reporter: Xintong Song

The status of a Kubernetes pod consists of many detailed fields. Currently, Flink receives pod {{MODIFIED}} events from the {{KubernetesPodsWatcher}} on every single change to these fields, many of which Flink does not care about.

The verbose events do not affect Flink's functionality, but they pollute the logs with repeated messages, because Flink only looks at the fields it is interested in, and those fields are identical across the events.

E.g., when a task manager is stopped due to idle timeout, Flink receives 3 events:
* MODIFIED: container terminated
* MODIFIED: {{deletionGracePeriodSeconds}} changes from 30 to 0, which is a Kubernetes internal status change after containers are gracefully terminated
* DELETED: Flink removes metadata of the terminated pod

Among the 3 events, Flink is only interested in the 1st MODIFIED event, but it will try to process all of them because the container status is terminated in each.

I propose to filter the verbose events in {{KubernetesResourceManagerDriver.PodCallbackHandlerImpl}}, so that only the status changes Flink is interested in are processed. This probably requires recording the status of all living pods and comparing it with incoming events to detect status changes.
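
A minimal sketch of the proposed filtering, for discussion only. It assumes the only field Flink cares about is whether the pod is terminated. {{PodEventFilter}}, {{EventType}} and {{shouldProcess}} are hypothetical names, not existing Flink or Kubernetes client APIs, and the pod name {{tm-1}} is made up for the example. The idea is to remember the last observed status per living pod and to report an event as interesting only when that status actually changes.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper, not part of Flink: filters out pod events that carry no relevant change. */
final class PodEventFilter {

    /** Event types as named in this issue. */
    enum EventType { ADDED, MODIFIED, DELETED }

    // Last observed "terminated" flag of each living pod, keyed by pod name.
    private final Map<String, Boolean> lastTerminated = new HashMap<>();

    /**
     * Records the event and returns true only if the tracked field
     * (assumed here: whether the pod is terminated) differs from the recorded value.
     */
    boolean shouldProcess(EventType type, String podName, boolean terminated) {
        if (type == EventType.DELETED) {
            // The pod is gone, so drop its record. The deletion is only
            // interesting if the termination has not been seen before.
            Boolean previous = lastTerminated.remove(podName);
            return previous == null || !previous;
        }
        Boolean previous = lastTerminated.put(podName, terminated);
        return previous == null || previous != terminated;
    }

    public static void main(String[] args) {
        PodEventFilter filter = new PodEventFilter();
        // Pod creation, followed by the idle-timeout sequence described above.
        System.out.println(filter.shouldProcess(EventType.ADDED, "tm-1", false));    // true
        System.out.println(filter.shouldProcess(EventType.MODIFIED, "tm-1", true));  // true
        System.out.println(filter.shouldProcess(EventType.MODIFIED, "tm-1", true));  // false
        System.out.println(filter.shouldProcess(EventType.DELETED, "tm-1", true));   // false
    }
}
```

With this sketch, only the first MODIFIED event of the idle-timeout sequence would be processed. The record of a pod is removed on DELETED, so the map only holds living pods. A real implementation would likely need to compare more than a single boolean.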