[GitHub] [hudi] yihua commented on a diff in pull request #6670: [HUDI-4842] Support compression strategy based on delte file length

2022-09-16 Thread GitBox


yihua commented on code in PR #6670:
URL: https://github.com/apache/hudi/pull/6670#discussion_r973534451


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieCompactionConfig.java:
##
@@ -106,6 +106,12 @@ public class HoodieCompactionConfig extends HoodieConfig {
   .withDocumentation("Only if the log file size is greater than the 
threshold in bytes,"
   + " the file group will be compacted.");
 
+  public static final ConfigProperty 
COMPACTION_LOG_FILE_LENGTH_THRESHOLD = ConfigProperty
+  .key("hoodie.compaction.logfile.length.threshold")
+  .defaultValue(0L)

Review Comment:
   Got it.  That makes sense.  Then do we even need this config?  The new 
compaction strategy prioritizes the compaction of file groups with more log 
files, and should include all file groups for compaction nevertheless.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] yihua commented on a diff in pull request #6670: [HUDI-4842] Support compression strategy based on delte file length

2022-09-15 Thread GitBox


yihua commented on code in PR #6670:
URL: https://github.com/apache/hudi/pull/6670#discussion_r972540738


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieCompactionConfig.java:
##
@@ -106,6 +106,12 @@ public class HoodieCompactionConfig extends HoodieConfig {
   .withDocumentation("Only if the log file size is greater than the 
threshold in bytes,"
   + " the file group will be compacted.");
 
+  public static final ConfigProperty 
COMPACTION_LOG_FILE_LENGTH_THRESHOLD = ConfigProperty
+  .key("hoodie.compaction.logfile.length.threshold")
+  .defaultValue(0L)

Review Comment:
   Should this be set to a reasonable value like `5` for example?  Otherwise, 
it falls back to the behavior where all file groups are compacted.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] yihua commented on a diff in pull request #6670: [HUDI-4842] Support compression strategy based on delte file length

2022-09-15 Thread GitBox


yihua commented on code in PR #6670:
URL: https://github.com/apache/hudi/pull/6670#discussion_r972532363


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieCompactionConfig.java:
##
@@ -241,41 +255,61 @@ public class HoodieCompactionConfig extends HoodieConfig {
*/
   @Deprecated
   public static final String COMPACTION_STRATEGY_PROP = 
COMPACTION_STRATEGY.key();
-  /** @deprecated Use {@link #COMPACTION_STRATEGY} and its methods instead */
+  /**
+   * @deprecated Use {@link #COMPACTION_STRATEGY} and its methods instead
+   */
   @Deprecated
   public static final String DEFAULT_COMPACTION_STRATEGY = 
COMPACTION_STRATEGY.defaultValue();
-  /** @deprecated Use {@link #COMPACTION_LAZY_BLOCK_READ_ENABLE} and its 
methods instead */
+  /**
+   * @deprecated Use {@link #COMPACTION_LAZY_BLOCK_READ_ENABLE} and its 
methods instead
+   */
   @Deprecated
   public static final String COMPACTION_LAZY_BLOCK_READ_ENABLED_PROP = 
COMPACTION_LAZY_BLOCK_READ_ENABLE.key();
-  /** @deprecated Use {@link #COMPACTION_LAZY_BLOCK_READ_ENABLE} and its 
methods instead */
+  /**
+   * @deprecated Use {@link #COMPACTION_LAZY_BLOCK_READ_ENABLE} and its 
methods instead
+   */
   @Deprecated
   public static final String DEFAULT_COMPACTION_LAZY_BLOCK_READ_ENABLED = 
COMPACTION_REVERSE_LOG_READ_ENABLE.defaultValue();
-  /** @deprecated Use {@link #COMPACTION_REVERSE_LOG_READ_ENABLE} and its 
methods instead */
+  /**
+   * @deprecated Use {@link #COMPACTION_REVERSE_LOG_READ_ENABLE} and its 
methods instead
+   */
   @Deprecated
   public static final String COMPACTION_REVERSE_LOG_READ_ENABLED_PROP = 
COMPACTION_REVERSE_LOG_READ_ENABLE.key();
-  /** @deprecated Use {@link #COMPACTION_REVERSE_LOG_READ_ENABLE} and its 
methods instead */
+  /**
+   * @deprecated Use {@link #COMPACTION_REVERSE_LOG_READ_ENABLE} and its 
methods instead
+   */
   @Deprecated
   public static final String DEFAULT_COMPACTION_REVERSE_LOG_READ_ENABLED = 
COMPACTION_REVERSE_LOG_READ_ENABLE.defaultValue();
+  /**
+   * @deprecated Use {@link #TARGET_PARTITIONS_PER_DAYBASED_COMPACTION} and 
its methods instead
+   */
+  @Deprecated
+  public static final String TARGET_PARTITIONS_PER_DAYBASED_COMPACTION_PROP = 
TARGET_PARTITIONS_PER_DAYBASED_COMPACTION.key();
+  /**
+   * @deprecated Use {@link #TARGET_PARTITIONS_PER_DAYBASED_COMPACTION} and 
its methods instead
+   */
+  @Deprecated
+  public static final String DEFAULT_TARGET_PARTITIONS_PER_DAYBASED_COMPACTION 
= TARGET_PARTITIONS_PER_DAYBASED_COMPACTION.defaultValue();
   /**
* @deprecated Use {@link #INLINE_COMPACT} and its methods instead
*/
   @Deprecated
   private static final String DEFAULT_INLINE_COMPACT = 
INLINE_COMPACT.defaultValue();
-  /** @deprecated Use {@link #INLINE_COMPACT_NUM_DELTA_COMMITS} and its 
methods instead */
+  /**
+   * @deprecated Use {@link #INLINE_COMPACT_NUM_DELTA_COMMITS} and its methods 
instead
+   */
   @Deprecated
   private static final String DEFAULT_INLINE_COMPACT_NUM_DELTA_COMMITS = 
INLINE_COMPACT_NUM_DELTA_COMMITS.defaultValue();
-  /** @deprecated Use {@link #INLINE_COMPACT_TIME_DELTA_SECONDS} and its 
methods instead */
+  /**
+   * @deprecated Use {@link #INLINE_COMPACT_TIME_DELTA_SECONDS} and its 
methods instead
+   */
   @Deprecated
   private static final String DEFAULT_INLINE_COMPACT_TIME_DELTA_SECONDS = 
INLINE_COMPACT_TIME_DELTA_SECONDS.defaultValue();
-  /** @deprecated Use {@link #INLINE_COMPACT_TRIGGER_STRATEGY} and its methods 
instead */
+  /**

Review Comment:
   Could you avoid code style changes to reduce review overhead?  It's OK to 
have a separate PR to reformat the code.



##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieCompactionConfig.java:
##
@@ -381,6 +415,11 @@ public Builder 
withLogFileSizeThresholdBasedCompaction(long logFileSizeThreshold
   return this;
 }
 
+public Builder withLogFileLengthThresholdBasedCompaction(int 
logFileLengthThreshold) {
+  compactionConfig.setValue(COMPACTION_LOG_FILE_LENGTH_THRESHOLD, 
String.valueOf(logFileLengthThreshold));
+  return this;

Review Comment:
   Same here for method name and variable renaming.



##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieCompactionConfig.java:
##
@@ -106,6 +106,12 @@ public class HoodieCompactionConfig extends HoodieConfig {
   .withDocumentation("Only if the log file size is greater than the 
threshold in bytes,"
   + " the file group will be compacted.");
 
+  public static final ConfigProperty 
COMPACTION_LOG_FILE_LENGTH_THRESHOLD = ConfigProperty
+  .key("hoodie.compaction.logfile.length.threshold")

Review Comment:
   `hoodie.compaction.logfile.length.threshold`
   ->
   `hoodie.compaction.logfile.num.threshold`
   `length` can be confused with the size of the log file.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub