[ https://issues.apache.org/jira/browse/HUDI-5012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
sivabalan narayanan updated HUDI-5012:
--------------------------------------
    Status: In Progress  (was: Open)

> Fix clean planning for very large partitions
> --------------------------------------------
>
>                 Key: HUDI-5012
>                 URL: https://issues.apache.org/jira/browse/HUDI-5012
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: cleaning
>            Reporter: sivabalan narayanan
>            Assignee: sivabalan narayanan
>            Priority: Critical
>              Labels: pull-request-available
>             Fix For: 0.12.1
>
>
> Within the clean planning phase, we call map() over the partitions and
> trigger planning for each partition inside that call.
>
> For a very large number of partitions, if the cleaner shuffle parallelism is
> small, this results in mostly sequential planning. We can improve this by
> switching to a mapPartitions call.
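
The difference between the two call shapes can be sketched in plain Python, without Spark or Hudi. This is only an illustration, not the Hudi implementation: the ticket does not spell out why mapPartitions helps, so the sketch assumes one plausible reading, namely that each task can plan its whole batch of partitions in one call and pay any per-call setup once per task instead of once per partition. All names here (split_into_slices, open_planner, SetupCounter, the partition paths) are hypothetical.

```python
from typing import Callable, List


class SetupCounter:
    """Counts how many times the (hypothetical) planner setup runs."""

    def __init__(self) -> None:
        self.setups = 0


def split_into_slices(items: List[str], parallelism: int) -> List[List[str]]:
    """Distribute items round-robin into at most `parallelism` non-empty slices."""
    slices = [items[i::parallelism] for i in range(parallelism)]
    return [s for s in slices if s]


def open_planner(counter: SetupCounter) -> Callable[[str], str]:
    """Hypothetical stand-in for per-call setup work done before planning."""
    counter.setups += 1
    return lambda partition: f"plan:{partition}"


def plan_with_map(partitions: List[str], parallelism: int,
                  counter: SetupCounter) -> List[str]:
    """map()-style: the function runs once per partition, so setup does too."""
    results = []
    for task_slice in split_into_slices(partitions, parallelism):
        for partition in task_slice:
            plan = open_planner(counter)  # setup repeated for every partition
            results.append(plan(partition))
    return sorted(results)


def plan_with_map_partitions(partitions: List[str], parallelism: int,
                             counter: SetupCounter) -> List[str]:
    """mapPartitions()-style: the function runs once per slice of partitions."""
    results = []
    for task_slice in split_into_slices(partitions, parallelism):
        plan = open_planner(counter)  # setup shared by the whole slice
        results.extend(plan(partition) for partition in task_slice)
    return sorted(results)


if __name__ == "__main__":
    table_partitions = [f"2022/10/{day:02d}" for day in range(1, 11)]
    per_element, per_slice = SetupCounter(), SetupCounter()
    plan_with_map(table_partitions, 3, per_element)
    plan_with_map_partitions(table_partitions, 3, per_slice)
    print("setups with map():", per_element.setups)
    print("setups with mapPartitions():", per_slice.setups)
```

Both functions produce the same set of plans; only the number of setup calls differs, and it scales with the partition count in the first case and with the parallelism in the second.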