yunfengzhou-hub commented on code in PR #97: URL: https://github.com/apache/flink-ml/pull/97#discussion_r888753223
##########
flink-ml-iteration/src/main/java/org/apache/flink/iteration/datacache/nonkeyed/DataCacheSnapshot.java:
##########
@@ -90,18 +90,18 @@ public void writeTo(OutputStream checkpointOutputStream) throws IOException {
         }
         dos.writeBoolean(fileSystem.isDistributedFS());
+        for (Segment segment : segments) {
+            persistSegmentToDisk(segment);
+        }
         if (fileSystem.isDistributedFS()) {
             // We only need to record the segments itself
             serializeSegments(segments, dos);
         } else {
             // We have to copy the whole streams.
-            int totalRecords = segments.stream().mapToInt(Segment::getCount).sum();
-            long totalSize = segments.stream().mapToLong(Segment::getSize).sum();
-            checkState(totalRecords >= 0, "overflowed: " + totalRecords);
-            dos.writeInt(totalRecords);
-            dos.writeLong(totalSize);
-
+            dos.writeInt(segments.size());
             for (Segment segment : segments) {
+                dos.writeInt(segment.getCount());

Review Comment:
   Because the maximum size of a single segment is limited, for example by the maximum file size that the underlying filesystem allows. If we merged all segments into one during the snapshot, the merged segment could exceed such a limit and cause errors on restore. Writing each segment separately avoids those errors and adds little overhead to the snapshot process: one extra int for the number of segments, plus a record count per segment.
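
   To make the layout concrete, here is a minimal sketch of per-segment serialization as described above. It is only an illustration under stated assumptions: `SegmentSnapshotSketch`, its nested `Segment` class, and the `writeSegments` / `readSegments` helpers are hypothetical stand-ins, not the actual `DataCacheSnapshot` code, and the per-segment size field is assumed here so that the reader knows how many bytes belong to each segment.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; these are not Flink ML's real classes.
final class SegmentSnapshotSketch {

    /** Hypothetical segment: a record count plus the bytes that would live in one file. */
    static final class Segment {
        final int count;
        final byte[] data;

        Segment(int count, byte[] data) {
            this.count = count;
            this.data = data;
        }
    }

    /** Writes the number of segments, then each segment's count, size and bytes. */
    static void writeSegments(List<Segment> segments, OutputStream out) throws IOException {
        DataOutputStream dos = new DataOutputStream(out);
        dos.writeInt(segments.size());
        for (Segment segment : segments) {
            dos.writeInt(segment.count);
            dos.writeLong(segment.data.length); // assumed field: byte length of this segment
            dos.write(segment.data);
        }
        dos.flush();
    }

    /** Reads the segments back one by one, so segment boundaries are preserved. */
    static List<Segment> readSegments(InputStream in) throws IOException {
        DataInputStream dis = new DataInputStream(in);
        int numSegments = dis.readInt();
        List<Segment> segments = new ArrayList<>(numSegments);
        for (int i = 0; i < numSegments; i++) {
            int count = dis.readInt();
            long size = dis.readLong();
            // Sketch only: buffers the segment in memory, so its size must fit in an int.
            byte[] data = new byte[Math.toIntExact(size)];
            dis.readFully(data);
            segments.add(new Segment(count, data));
        }
        return segments;
    }
}
```

   Because each segment is restored on its own, no restored segment is larger than the one that was originally written, which is the property the per-segment format is meant to keep.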