[ https://issues.apache.org/jira/browse/HUDI-5363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alexey Kudinkin closed HUDI-5363.
---------------------------------
    Resolution: Fixed

> Remove default parallelism values for all ops
> ---------------------------------------------
>
>                 Key: HUDI-5363
>                 URL: https://issues.apache.org/jira/browse/HUDI-5363
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: writer-core
>   Affects Versions: 0.12.1
>            Reporter: Alexey Kudinkin
>            Assignee: Alexey Kudinkin
>            Priority: Blocker
>              Labels: pull-request-available
>             Fix For: 0.13.0
>
>
> Currently, we always override the parallelism of the incoming datasets:
> # If the user specified the shuffle parallelism explicitly, we use it to
> override the original one
> # If the user did NOT specify the shuffle parallelism, we use the default
> value of 200
> The second case is problematic: we blindly override the "natural"
> parallelism of the data (determined by the source of the data) and replace
> it with a static, unrelated value.
> Instead, we should only override the parallelism in the following cases:
> # The user provided an overriding value explicitly



--
This message was sent by Atlassian Jira
(v8.20.10#820010)