voonhous commented on code in PR #19002:
URL: https://github.com/apache/hudi/pull/19002#discussion_r3415163077


##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieAppendHandle.java:
##########
@@ -564,14 +564,15 @@ public List<WriteStatus> close() {
         writer = null;
       }
 
-      // update final size, once for all log files
-      // TODO we can actually deduce file size purely from AppendResult (based on offset and size
-      //      of the appended block)
+      // Set the final on-disk size of each log file. Appends within an append handle are contiguous,
+      // so a log file's length equals its start offset plus the total bytes appended to it. That is
+      // exactly what fs.getFileStatus().getLength() returns, and both values are already captured by
+      // the AppendResult stats (logOffset and the accumulated fileSizeInBytes). Deriving the size this
+      // way avoids a getPathInfo/HEAD per log file, which is a remote round trip per file group on
+      // object stores.
       for (WriteStatus status : statuses) {
-        long logFileSize = storage.getPathInfo(
-            new StoragePath(config.getBasePath(), status.getStat().getPath()))
-            .getLength();
-        status.getStat().setFileSizeInBytes(logFileSize);
+        HoodieDeltaWriteStat stat = (HoodieDeltaWriteStat) status.getStat();
+        stat.setFileSizeInBytes(stat.getLogOffset() + stat.getFileSizeInBytes());

Review Comment:
   Done, extracted `appendedBytes` so the before-state is explicit.
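
   For readers following the thread, here is a minimal sketch of what that extraction might look like. `DeltaStat` is a hypothetical stand-in for `HoodieDeltaWriteStat` that carries only the two fields used in the hunk above, and `finalizeSizes` is an illustrative name; the code actually pushed to the PR may differ.

```java
import java.util.List;

public class LogSizeSketch {
  // Hypothetical stand-in for HoodieDeltaWriteStat (assumption, not the Hudi class).
  static final class DeltaStat {
    private final long logOffset;   // offset at which this handle started appending
    private long fileSizeInBytes;   // before finalization: bytes appended by this handle

    DeltaStat(long logOffset, long appendedBytes) {
      this.logOffset = logOffset;
      this.fileSizeInBytes = appendedBytes;
    }

    long getLogOffset() {
      return logOffset;
    }

    long getFileSizeInBytes() {
      return fileSizeInBytes;
    }

    void setFileSizeInBytes(long fileSizeInBytes) {
      this.fileSizeInBytes = fileSizeInBytes;
    }
  }

  // Derive each log file's final size without a storage round trip.
  static void finalizeSizes(List<DeltaStat> stats) {
    for (DeltaStat stat : stats) {
      // Read the accumulated value first so the "before" state is explicit.
      long appendedBytes = stat.getFileSizeInBytes();
      stat.setFileSizeInBytes(stat.getLogOffset() + appendedBytes);
    }
  }

  public static void main(String[] args) {
    DeltaStat stat = new DeltaStat(1024L, 256L);
    finalizeSizes(List.of(stat));
    System.out.println(stat.getFileSizeInBytes()); // prints 1280
  }
}
```

   Naming the intermediate value makes it clear that `fileSizeInBytes` holds only the appended bytes until it is overwritten with the final size.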


