[ https://issues.apache.org/jira/browse/HUDI-5261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alexey Kudinkin reassigned HUDI-5261:
-------------------------------------

    Assignee: Jonathan Vexler

> Use proper parallelism for engine context APIs
> ----------------------------------------------
>
>                 Key: HUDI-5261
>                 URL: https://issues.apache.org/jira/browse/HUDI-5261
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: performance
>            Reporter: Raymond Xu
>            Assignee: Jonathan Vexler
>            Priority: Critical
>             Fix For: 0.12.2
>
>
> Do a global search for usages of these APIs:
> - org.apache.hudi.common.engine.HoodieEngineContext#flatMap
> - org.apache.hudi.common.engine.HoodieEngineContext#map
> and similar ones that take in a parallelism argument.
> Many occurrences pass the number of items as the parallelism, which hurts
> performance. Parallelism should instead be based on the number of cores
> available in the cluster and set by the user via parallelism configs.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
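
The issue description above contrasts passing the item count as parallelism with using a user-configured value. The following is a minimal sketch of one possible reading of that guidance, not the actual Hudi change: the class `ParallelismSketch`, the helper `boundedParallelism`, and the value 200 are all hypothetical and are not part of the Hudi API. It assumes the configured parallelism acts as an upper bound, while small inputs still use fewer tasks.

```java
public class ParallelismSketch {

    // Hypothetical helper (not part of Hudi): choose the parallelism to pass
    // to an engine-context call. The user-configured value acts as a cap, so
    // a very large item count no longer translates into a very large number
    // of tasks. The result is never below 1, even for an empty input.
    static int boundedParallelism(int numItems, int configuredParallelism) {
        if (configuredParallelism < 1) {
            throw new IllegalArgumentException("configuredParallelism must be >= 1");
        }
        return Math.max(1, Math.min(numItems, configuredParallelism));
    }

    public static void main(String[] args) {
        // Assumed value of a user-facing parallelism config, for illustration only.
        int configured = 200;

        System.out.println(boundedParallelism(100_000, configured)); // capped by the config
        System.out.println(boundedParallelism(12, configured));      // fewer items than the cap
        System.out.println(boundedParallelism(0, configured));       // empty input still yields 1
    }
}
```

Under this assumption, a call site that previously passed `items.size()` as the parallelism would pass `boundedParallelism(items.size(), configured)` instead.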