Jiayi Liao created FLINK-21413:
----------------------------------

             Summary: TtlMapState and TtlListState cannot be cleaned up completely with Filesystem StateBackend
                 Key: FLINK-21413
                 URL: https://issues.apache.org/jira/browse/FLINK-21413
             Project: Flink
          Issue Type: Bug
          Components: Runtime / State Backends
    Affects Versions: 1.9.0
            Reporter: Jiayi Liao


Take the #TtlMapState as an example,

 
{code:java}
public Map<UK, TtlValue<UV>> getUnexpiredOrNull(@Nonnull Map<UK, TtlValue<UV>> ttlValue) {
        Map<UK, TtlValue<UV>> unexpired = new HashMap<>();
        TypeSerializer<TtlValue<UV>> valueSerializer =
                ((MapSerializer<UK, TtlValue<UV>>) original.getValueSerializer()).getValueSerializer();
        for (Map.Entry<UK, TtlValue<UV>> e : ttlValue.entrySet()) {
                if (!expired(e.getValue())) {
                        // we have to do the defensive copy to update the value
                        unexpired.put(e.getKey(), valueSerializer.copy(e.getValue()));
                }
        }
        return ttlValue.size() == unexpired.size() ? ttlValue : unexpired;
}
{code}
 

The returned value is never null: even when every entry has expired, the 
method returns an empty map rather than null. As a result the #StateEntry 
exists forever, which leads to a memory leak if the key range of the stream 
is very large. Below we can see that 20+ million uncleared TtlMapState 
entries can take up several GB of memory.
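
The behavior can be reproduced outside Flink with a minimal sketch. Everything below is hypothetical and for illustration only: TimestampedValue stands in for TtlValue, the expiry rule (last access + TTL <= now) is an assumption, and the defensive copy is omitted. unexpiredNeverNull mirrors the return logic quoted above. unexpiredOrNull shows one possible change, assuming the caller drops the state entry when it receives null, as the method name suggests. It is not necessarily the fix that should be applied.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for Flink's TtlValue: a user value plus its last-access timestamp.
final class TimestampedValue<V> {
    final V value;
    final long lastAccessTimestamp;

    TimestampedValue(V value, long lastAccessTimestamp) {
        this.value = value;
        this.lastAccessTimestamp = lastAccessTimestamp;
    }
}

// Hypothetical sketch class; not part of Flink.
final class TtlMapCleanupSketch {

    // Assumed expiry rule: an entry is expired once lastAccess + ttl <= now.
    static <V> boolean expired(TimestampedValue<V> v, long ttlMillis, long nowMillis) {
        return v.lastAccessTimestamp + ttlMillis <= nowMillis;
    }

    // Mirrors the return logic from the issue: the result is never null,
    // so a fully expired map comes back as an empty map.
    static <K, V> Map<K, TimestampedValue<V>> unexpiredNeverNull(
            Map<K, TimestampedValue<V>> ttlValue, long ttlMillis, long nowMillis) {
        Map<K, TimestampedValue<V>> unexpired = new HashMap<>();
        for (Map.Entry<K, TimestampedValue<V>> e : ttlValue.entrySet()) {
            if (!expired(e.getValue(), ttlMillis, nowMillis)) {
                unexpired.put(e.getKey(), e.getValue());
            }
        }
        return ttlValue.size() == unexpired.size() ? ttlValue : unexpired;
    }

    // One possible change: report "nothing left" as null so that a caller
    // could remove the whole state entry instead of keeping an empty map.
    static <K, V> Map<K, TimestampedValue<V>> unexpiredOrNull(
            Map<K, TimestampedValue<V>> ttlValue, long ttlMillis, long nowMillis) {
        Map<K, TimestampedValue<V>> unexpired =
                unexpiredNeverNull(ttlValue, ttlMillis, nowMillis);
        return unexpired.isEmpty() ? null : unexpired;
    }
}
```

With a map whose entries have all expired, unexpiredNeverNull returns an empty (non-null) map, while unexpiredOrNull returns null.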

 

!image-2021-02-19-11-13-38-691.png!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
