[ 
https://issues.apache.org/jira/browse/HUDI-7646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17839541#comment-17839541
 ] 

Geser Dugarov edited comment on HUDI-7646 at 4/22/24 8:11 AM:
--------------------------------------------------------------

The main question is which naming is preferable for these options: ".inline" 
or ".async". The current distribution is as follows.

Using ".inline":
* hoodie.compact.inline
* hoodie.compact.schedule.inline
* hoodie.log.compaction.inline
* hoodie.clustering.inline
* hoodie.clustering.schedule.inline
* hoodie.partition.ttl.inline

Using ".async":
* hoodie.clean.async.enabled
* clean.async.enabled
* compaction.async.enabled
* hoodie.kafka.compaction.async.enable
* hoodie.clustering.async.enabled
* clustering.async.enabled
* hoodie.archive.async
* hoodie.embed.timeline.server.async
* hoodie.metadata.index.async
* hoodie.datasource.compaction.async.enable

By count (10 keys vs. 6), it looks preferable to move toward the ".async" naming.

Also, from a user's point of view, ".async" is more self-explanatory than 
".inline", which requires explaining the Hudi write process to the user.
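
The tally above can be reproduced with a small script. This is only a sketch: the key list is copied from this comment, and the rule "a key counts as inline/async if that word appears as a dot-separated token" is an assumption made for illustration, not something Hudi itself defines.

```python
from collections import Counter

# Option keys exactly as listed in the comment above.
OPTIONS = [
    "hoodie.compact.inline",
    "hoodie.compact.schedule.inline",
    "hoodie.log.compaction.inline",
    "hoodie.clustering.inline",
    "hoodie.clustering.schedule.inline",
    "hoodie.partition.ttl.inline",
    "hoodie.clean.async.enabled",
    "clean.async.enabled",
    "compaction.async.enabled",
    "hoodie.kafka.compaction.async.enable",
    "hoodie.clustering.async.enabled",
    "clustering.async.enabled",
    "hoodie.archive.async",
    "hoodie.embed.timeline.server.async",
    "hoodie.metadata.index.async",
    "hoodie.datasource.compaction.async.enable",
]


def naming_style(key: str) -> str:
    """Classify a key by whether 'inline' or 'async' is one of its dot-separated tokens."""
    tokens = key.split(".")
    if "inline" in tokens:
        return "inline"
    if "async" in tokens:
        return "async"
    return "other"


counts = Counter(naming_style(key) for key in OPTIONS)
print("inline:", counts["inline"])
print("async:", counts["async"])
```

The same script also shows a smaller inconsistency within the ".async" group: some keys end in ".enabled", some in ".enable", and some have no suffix at all.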


> Consistent naming in Compaction service
> ---------------------------------------
>
>                 Key: HUDI-7646
>                 URL: https://issues.apache.org/jira/browse/HUDI-7646
>             Project: Apache Hudi
>          Issue Type: Improvement
>            Reporter: Geser Dugarov
>            Priority: Minor
>
> The set of configuration parameters for the Compaction service is confusing.
> In HoodieCompactionConfig:
> * hoodie.compact.inline
> * hoodie.compact.schedule.inline
> * hoodie.log.compaction.enable
> * hoodie.log.compaction.inline
> * hoodie.compact.inline.max.delta.commits
> * hoodie.compact.inline.max.delta.seconds
> * hoodie.compact.inline.trigger.strategy
> * hoodie.parquet.small.file.limit
> * hoodie.record.size.estimation.threshold
> * hoodie.compaction.target.io
> * hoodie.compaction.logfile.size.threshold
> * hoodie.compaction.logfile.num.threshold
> * hoodie.compaction.strategy
> * hoodie.compaction.daybased.target.partitions
> * hoodie.copyonwrite.insert.split.size
> * hoodie.copyonwrite.insert.auto.split
> * hoodie.copyonwrite.record.size.estimate
> * hoodie.log.compaction.blocks.threshold
> In FlinkOptions:
> * compaction.async.enabled
> * compaction.schedule.enabled
> * compaction.delta_commits
> * compaction.delta_seconds
> * compaction.trigger.strategy
> * compaction.target_io
> * compaction.max_memory
> * compaction.tasks
> * compaction.timeout.seconds
> The naming needs to be refactored while preserving backward compatibility.
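
One common way to rename keys without breaking existing jobs is to keep the old names as deprecated alternatives of the new one. The sketch below illustrates that idea in plain Python; it is not Hudi's implementation. The canonical key "hoodie.compaction.max.delta.commits" is hypothetical, and it is assumed here that the two existing keys mapped to it configure the same setting.

```python
import warnings

# Canonical key -> older keys still accepted for backward compatibility.
# The canonical name is hypothetical and chosen only for illustration.
ALTERNATIVES = {
    "hoodie.compaction.max.delta.commits": (
        "hoodie.compact.inline.max.delta.commits",
        "compaction.delta_commits",
    ),
}


def resolve(options, key, default=None):
    """Return the value for `key`, falling back to its deprecated alternatives."""
    if key in options:
        return options[key]
    for old_key in ALTERNATIVES.get(key, ()):
        if old_key in options:
            warnings.warn(
                f"'{old_key}' is deprecated, use '{key}' instead",
                DeprecationWarning,
                stacklevel=2,
            )
            return options[old_key]
    return default


# A job that still uses the old Flink-style key keeps working.
legacy_job = {"compaction.delta_commits": "5"}
print(resolve(legacy_job, "hoodie.compaction.max.delta.commits"))
```

When both the new key and an old key are set, the new key wins, so users can migrate one option at a time.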


