voonhous commented on code in PR #17785:
URL: https://github.com/apache/hudi/pull/17785#discussion_r3346814589
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/rollback/RollbackHelperV1.java:
##########
@@ -311,9 +312,9 @@ List<Pair<String, HoodieRollbackStat>>
maybeDeleteAndCollectStats(HoodieEngineCo
// Let's emit markers for rollback as well. markers are emitted
under rollback instant time.
WriteMarkers writeMarkers =
WriteMarkersFactory.get(config.getMarkersType(), table, instantTime);
- HoodieLogFormat.WriterBuilder writerBuilder =
HoodieLogFormat.newWriterBuilder()
-
.onParentPath(FSUtils.constructAbsolutePath(metaClient.getBasePath(),
partitionPath))
- .withFileId(fileId)
+ HoodieLogFormatWriter.HoodieLogFormatWriterBuilder writerBuilder =
HoodieLogFormatWriter.builder()
Review Comment:
I follow the concern about the Lombok-generated name, but I'd prefer to keep
it as-is. The builder belongs to `HoodieLogFormatWriter`, which lives in
`hudi-hadoop-common` and references hadoop types (`FSDataOutputStream`). The
old `HoodieLogFormat.WriterBuilder` only lived in `hudi-common` by loading the
writer reflectively via `ReflectionUtils`, the brittleness this PR removes
(#11207). Re-adding a named builder type in `HoodieLogFormat` would mean either
bringing back that reflection or re-coupling `hudi-common` to hadoop. The
generated type is referenced in exactly one internal call site
(`RollbackHelperV1`), where we hold the builder to conditionally set the
pre-computed log version before `build()`, so the leak is contained to a single
internal caller.
##########
hudi-common/src/main/java/org/apache/hudi/common/table/log/HoodieLogFormat.java:
##########
@@ -60,31 +60,140 @@ public interface HoodieLogFormat {
String DEFAULT_WRITE_TOKEN = "0-0-0";
- String DEFAULT_LOG_FORMAT_WRITER =
"org.apache.hudi.common.table.log.HoodieLogFormatWriter";
-
/**
- * Writer interface to allow appending block to this file format.
+ * Abstract base class for appending blocks to the Hoodie log format.
+ * Subclasses provide specific implementations for writing to different
storage layers.
*/
- interface Writer extends Closeable {
+ @Getter
+ @Slf4j
+ abstract class Writer implements Closeable {
+
+ // Default max log file size 512 MB
+ public static final long DEFAULT_SIZE_THRESHOLD = 512 * 1024 * 1024L;
+
+ // Buffer size
+ protected Integer bufferSize;
+ // FileSystem
+ protected HoodieStorage storage;
+ // Size threshold for the log file. Useful when used with a rolling log
appender
+ protected Long sizeThreshold;
+ // Log File extension. Could be .avro.delta or .avro.commits etc
+ protected String fileExtension;
+ // File Id
+ protected String logFileId;
+ // File Commit Time stamp
+ protected String instantTime;
+ // version number for this log file. If not specified, then the current
version will be
+ // computed by inspecting the file system
+ protected Integer logVersion;
+ // file len of this log file
Review Comment:
Done, updated the comment to "file size" and removed the redundant `= 0L`
inline initializer; the default is set in the constructor.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]