Zhilong Hong created FLINK-22863:
------------------------------------

             Summary: ArrayIndexOutOfBoundsException may happen when building rescale edges
                 Key: FLINK-22863
                 URL: https://issues.apache.org/jira/browse/FLINK-22863
             Project: Flink
          Issue Type: Bug
          Components: Runtime / Coordination
    Affects Versions: 1.13.1, 1.13.0
            Reporter: Zhilong Hong
             Fix For: 1.14.0, 1.13.2
         Attachments: image-2021-06-03-15-06-09-301.png

EdgeManagerBuildUtil, introduced in FLINK-21326, may throw an ArrayIndexOutOfBoundsException while constructing rescale edges:

!image-2021-06-03-15-06-09-301.png|width=938,height=200!

The cause is the limited precision of {{double}} in Java. In EdgeManagerBuildUtil#connectPointwise, when the upstream parallelism is smaller than the downstream parallelism, the indices of the downstream vertices that connect to each upstream partition are calculated like this:

{code:java}
int start = (int) (Math.ceil(partitionNum * factor));
int end = (int) (Math.ceil((partitionNum + 1) * factor));
{code}

The index range is [{{start}}, {{end}}). In some cases {{end}} exceeds the downstream parallelism, and accessing the downstream vertex at that index throws the ArrayIndexOutOfBoundsException.

For example, take an upstream parallelism of 7 and a downstream parallelism of 29. For the last upstream partition (whose {{partitionNum}} is 6), {{(partitionNum + 1) * factor}}, i.e. 7 * (29 / 7) computed in {{double}}, evaluates to roughly 29.000000000000004, which is slightly larger than 29 because of rounding. {{end}} then becomes {{Math.ceil(29.000000000000004)}}, which is 30. The range therefore includes index 29, while the valid downstream indices are 0 to 28, so the exception is thrown.

To solve this issue, we need to add an extra check for the boundary condition:

{code:java}
int end = Math.min(targetCount, (int) (Math.ceil((partitionNum + 1) * factor)));
{code}

This affects release-1.13.0 and release-1.13.1.
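
The arithmetic in the example above can be reproduced in isolation. The following is a minimal sketch, not the actual Flink code: the class name {{RescaleEdgeBounds}} and the helper methods are hypothetical, and {{factor}} is assumed to be the downstream parallelism divided by the upstream parallelism as a {{double}}, which is consistent with the 7/29 example but is not shown in this report.

```java
public class RescaleEdgeBounds {

    // Exclusive end index as computed in the snippet quoted in this report.
    // Assumption: factor = targetCount / sourceCount, evaluated in double.
    public static int uncheckedEnd(int partitionNum, int sourceCount, int targetCount) {
        double factor = ((double) targetCount) / sourceCount;
        return (int) Math.ceil((partitionNum + 1) * factor);
    }

    // Same computation with the proposed boundary check applied.
    public static int clampedEnd(int partitionNum, int sourceCount, int targetCount) {
        return Math.min(targetCount, uncheckedEnd(partitionNum, sourceCount, targetCount));
    }

    public static void main(String[] args) {
        int sourceCount = 7;   // upstream parallelism from the example
        int targetCount = 29;  // downstream parallelism from the example
        int last = sourceCount - 1;

        double factor = ((double) targetCount) / sourceCount;
        // The product lands just above 29 because 29/7 is not exactly representable.
        System.out.println("(last + 1) * factor = " + (last + 1) * factor);
        System.out.println("unchecked end = " + uncheckedEnd(last, sourceCount, targetCount));
        System.out.println("clamped end   = " + clampedEnd(last, sourceCount, targetCount));
    }
}
```

With these inputs the unchecked end index is 30 and the clamped one is 29, so the range [{{start}}, {{end}}) stays within the 29 downstream vertices once the {{Math.min}} check is in place.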