hudi-agent commented on code in PR #17785:
URL: https://github.com/apache/hudi/pull/17785#discussion_r3346422725


##########
hudi-hadoop-common/src/main/java/org/apache/hudi/common/table/log/HoodieLogFormatWriter.java:
##########
@@ -207,23 +206,29 @@ private int getLogBlockLength(int contentLength, int 
headerLength, int footerLen
 
   private void rolloverIfNeeded() throws IOException {
     // Roll over if the size is past the threshold
-    if (getCurrentSize() > sizeThreshold) {
-      log.info("CurrentSize {} has reached threshold {}. Rolling over to the 
next version", getCurrentSize(), sizeThreshold);
+    if (getCurrentSize() > getSizeThreshold()) {
+      log.info("CurrentSize {} has reached threshold {}. Rolling over to the 
next version", getCurrentSize(), getSizeThreshold());
       rollOver();
     }
   }
 
   private void rollOver() throws IOException {
     closeStream();
-    this.logFile = logFile.rollOver(rolloverLogWriteToken);
+    this.logFile = getLogFile().rollOver(getLogWriteToken());
     this.closed = false;
   }
 
   private void createNewFile() throws IOException {
-    fileCreationHook.preFileCreation(this.logFile);
-    this.output = new FSDataOutputStream(
-        storage.create(this.logFile.getPath(), false, bufferSize, replication, 
WriterBuilder.DEFAULT_SIZE_THRESHOLD),
-        new FileSystem.Statistics(storage.getScheme())
+    getFileCreationCallback().preFileCreation(this.getLogFile());
+    this.outputStream = new FSDataOutputStream(
+        getStorage().create(
+            this.getLogFile().getPath(),
+            false,
+            getBufferSize(),
+            getReplication(),

Review Comment:
   🤖 Heads-up: this changes the HDFS block size argument from the previous 
fixed `WriterBuilder.DEFAULT_SIZE_THRESHOLD` (512 MB) to the user-configurable 
`getSizeThreshold()`. Since `HoodieHadoopStorage.create()` passes this through 
to `FileSystem.create(..., blockSize, ...)`, a user (or test) setting a small 
`withSizeThreshold(...)` would now create the underlying file with that tiny 
block size, which HDFS rejects when it falls below 
`dfs.namenode.fs-limits.min-block-size` (default 1 MB). Could you keep using 
the 512 MB constant here as the block size, since that's a different concept 
from the log rollover threshold? @yihua
   
   <sub><i>- AI-generated; verify before applying. React 👍/👎 to flag 
quality.</i></sub>



##########
hudi-common/src/main/java/org/apache/hudi/common/table/log/HoodieLogFormat.java:
##########
@@ -60,31 +60,140 @@ public interface HoodieLogFormat {
 
   String DEFAULT_WRITE_TOKEN = "0-0-0";
 
-  String DEFAULT_LOG_FORMAT_WRITER = 
"org.apache.hudi.common.table.log.HoodieLogFormatWriter";
-
   /**
-   * Writer interface to allow appending block to this file format.
+   * Abstract base class for appending blocks to the Hoodie log format.
+   * Subclasses provide specific implementations for writing to different 
storage layers.
    */
-  interface Writer extends Closeable {
+  @Getter
+  @Slf4j
+  abstract class Writer implements Closeable {
+
+    // Default max log file size 512 MB
+    public static final long DEFAULT_SIZE_THRESHOLD = 512 * 1024 * 1024L;
+
+    // Buffer size
+    protected Integer bufferSize;
+    // FileSystem
+    protected HoodieStorage storage;
+    // Size threshold for the log file. Useful when used with a rolling log 
appender
+    protected Long sizeThreshold;
+    // Log File extension. Could be .avro.delta or .avro.commits etc
+    protected String fileExtension;
+    // File Id
+    protected String logFileId;
+    // File Commit Time stamp
+    protected String instantTime;
+    // version number for this log file. If not specified, then the current 
version will be
+    // computed by inspecting the file system
+    protected Integer logVersion;
+    // file len of this log file

Review Comment:
   🤖 nit: the comment just above (`// file len of this log file`) is a leftover 
from when this field was called `fileLen` — could you update it to match the 
new `fileSize` name? On the same line, the `= 0L` inline initializer is 
redundant since the constructor always assigns `this.fileSize = fileSize != 
null ? fileSize : 0L;`; all other fields in this class have no inline default, 
so this one stands out and could mislead a reader into thinking the default is 
set here rather than in the constructor.
   
   <sub><i>- AI-generated; verify before applying. React 👍/👎 to flag 
quality.</i></sub>



##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/rollback/RollbackHelperV1.java:
##########
@@ -311,9 +312,9 @@ List<Pair<String, HoodieRollbackStat>> 
maybeDeleteAndCollectStats(HoodieEngineCo
           // Let's emit markers for rollback as well. markers are emitted 
under rollback instant time.
           WriteMarkers writeMarkers = 
WriteMarkersFactory.get(config.getMarkersType(), table, instantTime);
 
-          HoodieLogFormat.WriterBuilder writerBuilder = 
HoodieLogFormat.newWriterBuilder()
-              
.onParentPath(FSUtils.constructAbsolutePath(metaClient.getBasePath(), 
partitionPath))
-              .withFileId(fileId)
+          HoodieLogFormatWriter.HoodieLogFormatWriterBuilder writerBuilder = 
HoodieLogFormatWriter.builder()

Review Comment:
   🤖 nit: `HoodieLogFormatWriter.HoodieLogFormatWriterBuilder` is a 
Lombok-generated inner-class name leaking into production code — callers are 
now coupled to that implementation detail. The old code used 
`HoodieLogFormat.WriterBuilder` as a stable, publicly-declared type. Could we 
keep a named builder type in `HoodieLogFormat` (even a simple alias or thin 
interface) so call sites like this one don't need to reference the Lombok 
internals?
   
   <sub><i>- AI-generated; verify before applying. React 👍/👎 to flag 
quality.</i></sub>



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to