[ https://issues.apache.org/jira/browse/HUDI-2873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17477539#comment-17477539 ]
shibei commented on HUDI-2873:
------------------------------

[~xiaotaotao] Two things need to be clarified:

1) Compaction is closer to the semantics of "optimize" in the data world. At present, compaction cannot sort data when merging log files into the base file, but the example given above means the optimize command should trigger data sorting, which implies it should be implemented on top of clustering. Is that correct?
2) Is it possible to optimize the existing clustering operation instead of introducing a new write operation?

> Support optimize data layout by sql and make the build more fast
> ----------------------------------------------------------------
>
>                 Key: HUDI-2873
>                 URL: https://issues.apache.org/jira/browse/HUDI-2873
>             Project: Apache Hudi
>          Issue Type: Task
>          Components: Performance, spark
>            Reporter: tao meng
>            Priority: Critical
>              Labels: sev:high
>             Fix For: 0.11.0
>
>

--
This message was sent by Atlassian Jira
(v8.20.1#820001)