[ https://issues.apache.org/jira/browse/FLINK-9601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16514841#comment-16514841 ]
ASF GitHub Bot commented on FLINK-9601:
---------------------------------------

Github user StefanRRichter commented on the issue:

    https://github.com/apache/flink/pull/6174

    Thanks for the contribution. I think the general idea is good, but the
    implementation can be improved. How about changing
    `CopyOnWriteStateTable::snapshotTableArrays()` in such a way that the
    returned snapshot array is already created with a size of
    `max(whateverTheCurrentCodeDoes, size())`? This ensures we never have to
    create another array, in particular in this corner case where we are
    dealing with huge arrays. Then the remaining code should automagically
    work without further changes. It might also help to add a comment noting
    that our optimized snapshotting algorithm assumes the array can hold the
    flattened entries. What do you think? (Two sketches illustrating the
    failure and the suggested sizing follow the quoted description below.)

> Snapshot of CopyOnWriteStateTable will fail when the number of records is
> more than MAXIMUM_CAPACITY
> -------------------------------------------------------------------------
>
>                 Key: FLINK-9601
>                 URL: https://issues.apache.org/jira/browse/FLINK-9601
>             Project: Flink
>          Issue Type: Bug
>          Components: State Backends, Checkpointing
>    Affects Versions: 1.6.0
>            Reporter: Sihua Zhou
>            Assignee: Sihua Zhou
>            Priority: Major
>             Fix For: 1.6.0
>
> In short, the problem is that we reuse `snapshotData` as the output array
> when partitioning the input data, but the maximum length of `snapshotData`
> is `1 << 30`. So when the number of records in `CopyOnWriteStateTable` is
> more than `1 << 30` (e.g. `(1 << 30) + 1`), the check
> `Preconditions.checkState(partitioningDestination.length >= numberOfElements);`
> can fail.
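To make the failure mode concrete, here is a small illustration of the quoted
check. The method name `checkPartitioningFits` is hypothetical; the parameter
names `partitioningDestination` and `numberOfElements` mirror the precondition
quoted above, but this is not Flink's actual API:

```java
// Illustration only: the reused snapshot array (partitioningDestination)
// can never be longer than 1 << 30, so the check below fails as soon as
// numberOfElements reaches (1 << 30) + 1.
final class PartitioningPreconditionDemo {

    // Stand-in for Preconditions.checkState(partitioningDestination.length >= numberOfElements)
    static void checkPartitioningFits(Object[] partitioningDestination, int numberOfElements) {
        if (partitioningDestination.length < numberOfElements) {
            throw new IllegalStateException(
                    "destination length " + partitioningDestination.length
                            + " < number of elements " + numberOfElements);
        }
    }

    public static void main(String[] args) {
        // Small numbers stand in for the real limits: a destination of
        // length 4 plays the role of an array capped at 1 << 30.
        Object[] destination = new Object[4];
        checkPartitioningFits(destination, 4); // passes
        checkPartitioningFits(destination, 5); // throws, analogous to (1 << 30) + 1 records
    }
}
```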
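And a minimal sketch of the sizing change suggested in the comment above,
assuming entries can be flattened into the snapshot array in place; the class
and field names (`bucketTable`, `numEntries`) are hypothetical stand-ins, not
the actual members of `CopyOnWriteStateTable`:

```java
// Sketch only, not the actual Flink implementation.
final class SnapshotArraySizingSketch {

    private Object[] bucketTable = new Object[16]; // length capped at 1 << 30 in the real table
    private int numEntries;                        // may exceed bucketTable.length (chained buckets)

    Object[] snapshotTableArrays() {
        // Size the snapshot array to max(current length, size()) so the
        // optimized snapshot algorithm can always flatten all entries into
        // it without ever allocating a second array.
        final int snapshotLength = Math.max(bucketTable.length, numEntries);
        final Object[] snapshot = new Object[snapshotLength];
        System.arraycopy(bucketTable, 0, snapshot, 0, bucketTable.length);
        return snapshot;
    }
}
```

With this sizing, `Preconditions.checkState(partitioningDestination.length >= numberOfElements)`
holds by construction, which is why the remaining code should need no further
changes.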