[jira] [Commented] (HUDI-2873) Support optimize data layout by sql and make the build more fast

shibei (Jira) Mon, 17 Jan 2022 19:24:06 -0800


    [ 
https://issues.apache.org/jira/browse/HUDI-2873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17477539#comment-17477539
 ]


shibei commented on HUDI-2873:
------------------------------

[~xiaotaotao] Two things need to be clarified:
1) Compaction is closer to what "optimize" usually means in the data world. At 
present, compaction cannot sort data when merging log files into base files, 
but the example given above implies that the optimize command should trigger 
data sorting, which means it would have to be implemented on top of 
clustering. Is that correct?

2) Is it possible to optimize the existing clustering operation instead of 
introducing a new write operation?
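
For context on the first point, here is a rough sketch of how sorting can be 
requested through clustering write options today. This is only an 
illustration, not the proposal in this issue: the helper name 
clustering_sort_options and the column names are hypothetical, and the option 
keys should be checked against the Hudi version in use.

```python
def clustering_sort_options(sort_columns, max_commits=4):
    """Build Hudi write options that enable inline clustering with sorting.

    Hypothetical helper for illustration; only the option keys come from Hudi.
    """
    if not sort_columns:
        raise ValueError("at least one sort column is required")
    return {
        # Run clustering inline as part of the write path.
        "hoodie.clustering.inline": "true",
        # Schedule clustering after this many commits.
        "hoodie.clustering.inline.max.commits": str(max_commits),
        # Columns the clustering strategy sorts by when rewriting files.
        "hoodie.clustering.plan.strategy.sort.columns": ",".join(sort_columns),
    }


# Example column names are assumptions, not taken from the issue.
options = clustering_sort_options(["city", "ts"])

# With a Spark DataFrame `df` and a table path `base_path`, the options
# would typically be passed along with the usual Hudi write options:
# df.write.format("hudi").options(**options).mode("append").save(base_path)
```

Compaction has no comparable sort setting, which is why the sorting behavior 
in the example above points toward clustering.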

 

> Support optimize data layout by sql and make the build more fast
> ----------------------------------------------------------------
>
>                 Key: HUDI-2873
>                 URL: https://issues.apache.org/jira/browse/HUDI-2873
>             Project: Apache Hudi
>          Issue Type: Task
>          Components: Performance, spark
>            Reporter: tao meng
>            Priority: Critical
>              Labels: sev:high
>             Fix For: 0.11.0
>
>
--
This message was sent by Atlassian Jira
(v8.20.1#820001)