yunfengzhou-hub commented on code in PR #97:
URL: https://github.com/apache/flink-ml/pull/97#discussion_r884537389


##########
flink-ml-iteration/src/main/java/org/apache/flink/iteration/datacache/nonkeyed/Segment.java:
##########
@@ -18,38 +18,49 @@
 
 package org.apache.flink.iteration.datacache.nonkeyed;
 
+import org.apache.flink.api.common.typeutils.TypeSerializer;
 import org.apache.flink.core.fs.Path;
 
+import java.io.IOException;
 import java.io.Serializable;
+import java.util.List;
 import java.util.Objects;
 
-/** A segment represents a single file for the cache. */
+/** A segment contains the information about a cache unit. */
 public class Segment implements Serializable {
 
-    private final Path path;
+    /** The pre-allocated path on disk to persist the records. */
+    Path path;
 
-    /** The count of the records in the file. */
-    private final int count;
+    /** The number of records in the file. */
+    int count;
 
-    /** The total length of file. */
-    private final long size;
+    /** The size of the records in file. */
+    long fsSize;
 
-    public Segment(Path path, int count, long size) {
+    /** The size of the records in memory. */
+    transient long inMemorySize;
+
+    /** The cached records in memory. */
+    transient List<Object> cache;
+
+    /** The serializer for the records. */
+    transient TypeSerializer<Object> serializer;
+
+    Segment() {}
+
+    Segment(Path path, int count, long fsSize) {
         this.path = path;
         this.count = count;
-        this.size = size;
-    }
-
-    public Path getPath() {
-        return path;
+        this.fsSize = fsSize;
     }
 
-    public int getCount() {
-        return count;
+    boolean isOnDisk() throws IOException {

Review Comment:
   We need to represent a segment that is cached in memory and persisted on 
disk at the same time. If we split the class into MemorySegment and FsSegment, 
I'm not sure how we would handle that case.
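
To make the concern concrete, here is a minimal sketch of a single class that can be in both states at once. It is a simplified, hypothetical stand-in and not the actual `Segment` in this PR: `SegmentSketch`, `cachedOnly`, `markPersisted`, `isCached` and `evictCache` are made-up names, the path is a plain `String` instead of `org.apache.flink.core.fs.Path`, and "on disk" is assumed to mean that a positive `fsSize` has been recorded rather than checking the file system.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified stand-in for Segment; not the class from this PR. */
final class SegmentSketch {
    /** Pre-allocated location on disk (plain string here, by assumption). */
    String path;

    /** Number of records in the segment. */
    int count;

    /** Size of the persisted records; 0 is assumed to mean "not persisted yet". */
    long fsSize;

    /** In-memory copy of the records; null when the segment is not cached. */
    List<Object> cache;

    /** Creates a segment that only lives in memory so far. */
    static SegmentSketch cachedOnly(String preAllocatedPath, List<Object> records) {
        SegmentSketch segment = new SegmentSketch();
        segment.path = preAllocatedPath;
        segment.count = records.size();
        segment.cache = new ArrayList<>(records);
        return segment;
    }

    /** Records that the data has also been written to disk; the cache is kept. */
    void markPersisted(long writtenBytes) {
        this.fsSize = writtenBytes;
    }

    /** Drops the in-memory copy, e.g. under memory pressure. */
    void evictCache() {
        this.cache = null;
    }

    boolean isOnDisk() {
        return fsSize > 0;
    }

    boolean isCached() {
        return cache != null;
    }
}
```

With one class the two states are independent flags, so "cached only", "persisted only" and "both" are all representable. With separate MemorySegment and FsSegment classes, the "both" state would need either a third type or a pair of objects that have to be kept in sync.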



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.