okumin commented on code in PR #4477:
URL: https://github.com/apache/hive/pull/4477#discussion_r1265275497
##########
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergSerDe.java:
##########
@@ -148,6 +148,14 @@ public void initialize(@Nullable Configuration configuration, Properties serDePr
 // TODO: remove once we have both Fanout and ClusteredWriter available: HIVE-25948
 HiveConf.setIntVar(configuration, HiveConf.ConfVars.HIVEOPTSORTDYNAMICPARTITIONTHRESHOLD, 1);
 HiveConf.setVar(configuration, HiveConf.ConfVars.DYNAMICPARTITIONINGMODE, "nonstrict");
+
+ Context.Operation operation = HiveCustomStorageHandlerUtils.getWriteOperation(configuration,
+ serDeProperties.getProperty(Catalogs.NAME));
+
+ if (operation != null) {
+ HiveConf.setFloatVar(configuration, HiveConf.ConfVars.TEZ_MAX_PARTITION_FACTOR, 1f);
Review Comment:
Some random thoughts; I consider all of these minor.
- Would it be better to disable over-provisioning only for DELETE reducers, for example via a Hive hook?
  - Over-provisioning might still work for INSERT or UPDATE.
  - It could also work for DELETE if rows are filtered.
- Maybe the optimization should apply only to the last reducer in a plan like Map -> Reduce -> Reduce?
- Could `hive.tez.auto.reducer.parallelism.min.threshold=0.0` be an option?
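
To illustrate the first point, here is a minimal sketch of scoping the override to DELETE only. It is not a proposed patch: a plain `Map` stands in for Hive's `Configuration`, the `Operation` enum is a hypothetical stand-in for `Context.Operation`, and the class and method names are made up for illustration. I am also assuming `hive.tez.max.partition.factor` is the property name behind `TEZ_MAX_PARTITION_FACTOR`. Where such a check would live (a hook or the SerDe) is left open.

```java
import java.util.HashMap;
import java.util.Map;

class PartitionFactorSketch {
  // Hypothetical stand-in for Hive's Context.Operation; not the real enum.
  enum Operation { DELETE, UPDATE, OTHER }

  // Assumed property name behind HiveConf.ConfVars.TEZ_MAX_PARTITION_FACTOR.
  static final String MAX_PARTITION_FACTOR = "hive.tez.max.partition.factor";

  // Pins the factor to 1 (no over-provisioning) only for DELETE.
  // Any other operation, or null, leaves the configuration untouched.
  static void limitOverProvisioning(Map<String, String> conf, Operation operation) {
    if (operation == Operation.DELETE) {
      conf.put(MAX_PARTITION_FACTOR, "1.0");
    }
  }
}
```

With this shape, INSERT and UPDATE keep whatever factor the user configured, and only DELETE is pinned to 1.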
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]