pvary commented on code in PR #15566:
URL: https://github.com/apache/iceberg/pull/15566#discussion_r2906797077


##########
flink/v2.1/flink/src/main/java/org/apache/iceberg/flink/sink/IcebergSink.java:
##########
@@ -682,6 +698,14 @@ IcebergSink build() {
       FlinkMaintenanceConfig flinkMaintenanceConfig =
           new FlinkMaintenanceConfig(table, writeOptions, readableConfig);
 
+      // If the "compactMode" option is configured,
+      // add it as a separate maintenance task
+      if (flinkWriteConf.compactMode()) {
+        RewriteDataFilesConfig rewriteDataFilesConfig =
+            flinkMaintenanceConfig.createRewriteDataFilesConfig();
+        
maintenanceTasks.add(RewriteDataFiles.builder().config(rewriteDataFilesConfig));
+      }

Review Comment:
   What is the typical use-case?
   The most wide config settings are like this:
   ```
     public String orcCompressionStrategy() {
       return confParser
           .stringConf()
           .option(FlinkWriteOptions.COMPRESSION_STRATEGY.key())
           .flinkConfig(FlinkWriteOptions.COMPRESSION_STRATEGY)
           .tableProperty(TableProperties.ORC_COMPRESSION_STRATEGY)
           .defaultValue(TableProperties.ORC_COMPRESSION_STRATEGY_DEFAULT)
           .parse();
     }
   ```
   
   Sometimes we allow the user to override the settings by manually adding 
options:
   ```
       @Override
       public Builder overwrite(boolean newOverwrite) {
         writeOptions.put(FlinkWriteOptions.OVERWRITE_MODE.key(), 
Boolean.toString(newOverwrite));
         return this;
       }
   ```
   
   We might even validate against setting them twice with a different java 
method:
   ```
       public Builder<T> watermarkColumn(String columnName) {
         Preconditions.checkArgument(
             splitAssignerFactory == null,
             "Watermark column and SplitAssigner should not be set in the same 
source");
         readOptions.put(FlinkReadOptions.WATERMARK_COLUMN, columnName);
         return this;
       }
   ```
   
   To support the property based solutions we likely to have:
   - ExpireSnapshotsConfig
   - DeleteOrphanFilesConfig
   
   This means that setting `ExpireSnapshotsConfig` and also adding a new 
ExpireSnapshots tasks should only effect each other. If we create a generic 
`maintenance` method on `IcebergSink`, then we will have hard time matching 
those.
   
   I think we need to think a bit more about this.
   
   CC: @Guosmilesmile 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to