[ https://issues.apache.org/jira/browse/HUDI-2414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17413952#comment-17413952 ]
Jian Feng commented on HUDI-2414:
---------------------------------

[~liujinhui] Yes, MOR is better than COW, but it still needs to compact data into old partitions. Based on the test I did, MOR's commit duration is 50% of COW's with the default configuration.

> enable Hot and cold data separate when ingest data
> --------------------------------------------------
>
>                 Key: HUDI-2414
>                 URL: https://issues.apache.org/jira/browse/HUDI-2414
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: Writer Core
>            Reporter: Jian Feng
>            Assignee: Jian Feng
>            Priority: Major
>
> When using Hudi to ingest an e-commerce company's item data, there are massive
> updates to old partitions. If one record needs an update, the whole
> file it belongs to must be rewritten, so nearly every commit rewrites the
> whole table.
> I'm wondering whether Hudi could provide a hot and cold data separation tool that works with
> specific columns (such as create time and update time) to distinguish hot data
> from cold data, then rebuilds the table to separate them into different file
> groups. After the table is recreated, the performance would be much better.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)