[ https://issues.apache.org/jira/browse/FLINK-27944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Zhu Zhu updated FLINK-27944:
----------------------------
    Description:
When a task has union inputs, some IO metrics (numBytesIn* and numBuffersIn*) of the different inputs may collide and fail to be registered.

The problem can be reproduced with a simple job like:
{code:java}
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
DataStream<String> source1 = env.fromElements("abc");
DataStream<String> source2 = env.fromElements("123");
source1.union(source2).print();
env.execute(); // submit the job so the sink task registers its IO metrics
{code}

Logs of collisions:
{code:java}
2022-06-08 00:59:01,629 WARN org.apache.flink.metrics.MetricGroup [] - Name collision: Group already contains a Metric with the name 'numBytesInLocal'. Metric will not be reported.[, taskmanager, fa9f270e-e904-4f69-8227-8d6e26e1be62, WordCount, Sink: Print to Std. Out, 0, Shuffle, Netty, Input]
2022-06-08 00:59:01,629 WARN org.apache.flink.metrics.MetricGroup [] - Name collision: Group already contains a Metric with the name 'numBytesInLocalPerSecond'. Metric will not be reported.[, taskmanager, fa9f270e-e904-4f69-8227-8d6e26e1be62, WordCount, Sink: Print to Std. Out, 0, Shuffle, Netty, Input]
2022-06-08 00:59:01,629 WARN org.apache.flink.metrics.MetricGroup [] - Name collision: Group already contains a Metric with the name 'numBytesInLocal'. Metric will not be reported.[, taskmanager, fa9f270e-e904-4f69-8227-8d6e26e1be62, WordCount, Sink: Print to Std. Out, 0]
2022-06-08 00:59:01,629 WARN org.apache.flink.metrics.MetricGroup [] - Name collision: Group already contains a Metric with the name 'numBytesInLocalPerSecond'. Metric will not be reported.[, taskmanager, fa9f270e-e904-4f69-8227-8d6e26e1be62, WordCount, Sink: Print to Std. Out, 0]
2022-06-08 00:59:01,630 WARN org.apache.flink.metrics.MetricGroup [] - Name collision: Group already contains a Metric with the name 'numBytesInRemote'. Metric will not be reported.[, taskmanager, fa9f270e-e904-4f69-8227-8d6e26e1be62, WordCount, Sink: Print to Std. Out, 0, Shuffle, Netty, Input]
2022-06-08 00:59:01,630 WARN org.apache.flink.metrics.MetricGroup [] - Name collision: Group already contains a Metric with the name 'numBytesInRemotePerSecond'. Metric will not be reported.[, taskmanager, fa9f270e-e904-4f69-8227-8d6e26e1be62, WordCount, Sink: Print to Std. Out, 0, Shuffle, Netty, Input]
2022-06-08 00:59:01,630 WARN org.apache.flink.metrics.MetricGroup [] - Name collision: Group already contains a Metric with the name 'numBytesInRemote'. Metric will not be reported.[, taskmanager, fa9f270e-e904-4f69-8227-8d6e26e1be62, WordCount, Sink: Print to Std. Out, 0]
2022-06-08 00:59:01,630 WARN org.apache.flink.metrics.MetricGroup [] - Name collision: Group already contains a Metric with the name 'numBytesInRemotePerSecond'. Metric will not be reported.[, taskmanager, fa9f270e-e904-4f69-8227-8d6e26e1be62, WordCount, Sink: Print to Std. Out, 0]
2022-06-08 00:59:01,630 WARN org.apache.flink.metrics.MetricGroup [] - Name collision: Group already contains a Metric with the name 'numBuffersInLocal'. Metric will not be reported.[, taskmanager, fa9f270e-e904-4f69-8227-8d6e26e1be62, WordCount, Sink: Print to Std. Out, 0, Shuffle, Netty, Input]
2022-06-08 00:59:01,630 WARN org.apache.flink.metrics.MetricGroup [] - Name collision: Group already contains a Metric with the name 'numBuffersInLocalPerSecond'. Metric will not be reported.[, taskmanager, fa9f270e-e904-4f69-8227-8d6e26e1be62, WordCount, Sink: Print to Std. Out, 0, Shuffle, Netty, Input]
2022-06-08 00:59:01,630 WARN org.apache.flink.metrics.MetricGroup [] - Name collision: Group already contains a Metric with the name 'numBuffersInLocal'. Metric will not be reported.[, taskmanager, fa9f270e-e904-4f69-8227-8d6e26e1be62, WordCount, Sink: Print to Std. Out, 0]
2022-06-08 00:59:01,630 WARN org.apache.flink.metrics.MetricGroup [] - Name collision: Group already contains a Metric with the name 'numBuffersInLocalPerSecond'. Metric will not be reported.[, taskmanager, fa9f270e-e904-4f69-8227-8d6e26e1be62, WordCount, Sink: Print to Std. Out, 0]
2022-06-08 00:59:01,630 WARN org.apache.flink.metrics.MetricGroup [] - Name collision: Group already contains a Metric with the name 'numBuffersInRemote'. Metric will not be reported.[, taskmanager, fa9f270e-e904-4f69-8227-8d6e26e1be62, WordCount, Sink: Print to Std. Out, 0, Shuffle, Netty, Input]
2022-06-08 00:59:01,630 WARN org.apache.flink.metrics.MetricGroup [] - Name collision: Group already contains a Metric with the name 'numBuffersInRemotePerSecond'. Metric will not be reported.[, taskmanager, fa9f270e-e904-4f69-8227-8d6e26e1be62, WordCount, Sink: Print to Std. Out, 0, Shuffle, Netty, Input]
2022-06-08 00:59:01,630 WARN org.apache.flink.metrics.MetricGroup [] - Name collision: Group already contains a Metric with the name 'numBuffersInRemote'. Metric will not be reported.[, taskmanager, fa9f270e-e904-4f69-8227-8d6e26e1be62, WordCount, Sink: Print to Std. Out, 0]
2022-06-08 00:59:01,630 WARN org.apache.flink.metrics.MetricGroup [] - Name collision: Group already contains a Metric with the name 'numBuffersInRemotePerSecond'. Metric will not be reported.[, taskmanager, fa9f270e-e904-4f69-8227-8d6e26e1be62, WordCount, Sink: Print to Std. Out, 0]
{code}

> IO metric collision happens when a task has union inputs
> --------------------------------------------------------
>
>                 Key: FLINK-27944
>                 URL: https://issues.apache.org/jira/browse/FLINK-27944
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Metrics
>    Affects Versions: 1.15.0
>            Reporter: Zhu Zhu
>            Priority: Critical
>             Fix For: 1.16.0

--
This message was sent by Atlassian Jira
(v8.20.7#820007)