Gen Luo created FLINK-24965: ------------------------------- Summary: Improper usage of Map.Entry after Entry Iterator.remove in TaskLocaStateStoreImpl#pruneCheckpoints Key: FLINK-24965 URL: https://issues.apache.org/jira/browse/FLINK-24965 Project: Flink Issue Type: Bug Components: Runtime / State Backends Reporter: Gen Luo
In TaskLocaStateStoreImpl#pruneCheckpoints, a list is created to store snapshots which should be removed, and entries to remove are directly add into the list. The code is like this. {code:java} Iterator<Map.Entry<Long, TaskStateSnapshot>> entryIterator = storedTaskStateByCheckpointID.entrySet().iterator(); while (entryIterator.hasNext()) { Map.Entry<Long, TaskStateSnapshot> snapshotEntry = entryIterator.next(); long entryCheckpointId = snapshotEntry.getKey(); if (pruningChecker.test(entryCheckpointId)) { toRemove.add(snapshotEntry); entryIterator.remove(); } else if (breakOnceCheckerFalse) { break; } } {code} However, according to the javadoc of Map.Entry, {code:java} the behavior of a map entry is undefined if the backing map has been modified after the entry was returned by the iterator, except through the setValue operation on the map entry. {code} entries should not be reserved for further usage after iterator.remove is called. In this case, where the map is a TreeMap, if the first entry is skipped and removal happens from the second element in `storedTaskStateByCheckpointID`, entries return by entryIterator.next will be the same object and the list will be filled with it. A possible fix is to use `new AbstractMap.SimpleEntry<>(snapshotEntry.getKey(), snapshotEntry.getValue())` instead of snapshotEntry itself to add into the list. The issue is a minor one since all usage of this method seems safe so far. -- This message was sent by Atlassian Jira (v8.20.1#820001)