Rajesh Balamohan created HIVE-27050:
---------------------------------------

             Summary: Iceberg: MOR: Restrict reducer extrapolation to contain number of small files being created
                 Key: HIVE-27050
                 URL: https://issues.apache.org/jira/browse/HIVE-27050
             Project: Hive
          Issue Type: Improvement
          Components: Iceberg integration
            Reporter: Rajesh Balamohan


Scenario:
 # Create a simple table in Iceberg (MOR mode), e.g. store_sales_delete_1.
 # Insert some data into it.
 # Run an update statement as follows:
 ## "update store_sales_delete_1 set ss_sold_time_sk=699060 where ss_sold_time_sk=69906"

Hive estimates the number of reducers as "1". However, "hive.tez.max.partition.factor", which defaults to "2.0", doubles that number.

To put this in perspective, the update then writes very small positional delete files spread across the different reducers. This causes problems at read time, because every one of those files has to be opened.

 # When Iceberg MOR tables are involved in update/delete/merge statements, disable "hive.tez.max.partition.factor", or set it to "1.0" irrespective of the user setting.
 # Add explicit logs for easier debugging, so that users are not left wondering why their setting is not taking effect.
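
The arithmetic and the proposed override can be sketched roughly as follows. This is a simplified Python model for illustration only, not Hive's actual code: the function names `effective_max_partition_factor` and `extrapolated_reducers`, the `is_iceberg_mor` flag, and the log message are all hypothetical, and the extrapolation is assumed to be a plain multiplication of the estimate by the factor, as the scenario above describes.

```python
import logging

logger = logging.getLogger("reducer_sketch")  # hypothetical logger name

# Default of hive.tez.max.partition.factor, as stated in the issue.
DEFAULT_MAX_PARTITION_FACTOR = 2.0

# Row-level operations that write positional delete files on MOR tables.
ROW_LEVEL_OPERATIONS = {"update", "delete", "merge"}


def effective_max_partition_factor(user_factor, is_iceberg_mor, operation):
    """Hypothetical helper: pin the factor to 1.0 for Iceberg MOR row-level writes."""
    if is_iceberg_mor and operation in ROW_LEVEL_OPERATIONS:
        if user_factor != 1.0:
            # Explicit log so the user can see why the setting was ignored.
            logger.info(
                "Ignoring hive.tez.max.partition.factor=%s for Iceberg MOR %s; using 1.0",
                user_factor,
                operation,
            )
        return 1.0
    return user_factor


def extrapolated_reducers(estimated, factor):
    """Assumed model: scale the estimate by the factor, never going below 1."""
    return max(1, int(estimated * factor))


# Today: an estimate of 1 reducer is doubled by the default factor.
current = extrapolated_reducers(1, DEFAULT_MAX_PARTITION_FACTOR)

# Proposed: the factor is forced to 1.0 for an update on an Iceberg MOR table.
proposed = extrapolated_reducers(
    1, effective_max_partition_factor(DEFAULT_MAX_PARTITION_FACTOR, True, "update")
)
```

Under these assumptions `current` is 2 and `proposed` is 1, so the update would write its positional deletes from a single reducer instead of splitting them into two small files. Statements against other tables keep whatever factor the user configured.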