okumin commented on code in PR #4404:
URL: https://github.com/apache/hive/pull/4404#discussion_r1226183179


##########
ql/src/java/org/apache/hadoop/hive/ql/plan/TezEdgeProperty.java:
##########
@@ -51,19 +53,22 @@ public TezEdgeProperty(HiveConf hiveConf, EdgeType edgeType,
   }
 
   public TezEdgeProperty(HiveConf hiveConf, EdgeType edgeType, boolean isAutoReduce,
-      boolean isSlowStart, int minReducer, int maxReducer, long bytesPerReducer) {
+      boolean isSlowStart, int minReducer, int maxReducer, long bytesPerReducer,
+      float minSrcFraction, float maxSrcFraction) {
     this(hiveConf, edgeType, -1);
-    setAutoReduce(hiveConf, isAutoReduce, minReducer, maxReducer, bytesPerReducer);
+    setAutoReduce(hiveConf, isAutoReduce, minReducer, maxReducer, bytesPerReducer, minSrcFraction, maxSrcFraction);
     this.isSlowStart = isSlowStart;
   }
 
   public void setAutoReduce(HiveConf hiveConf, boolean isAutoReduce, int minReducer,
-      int maxReducer, long bytesPerReducer) {
+      int maxReducer, long bytesPerReducer, float minSrcFraction, float maxSrcFraction) {
     this.hiveConf = hiveConf;
     this.minReducer = minReducer;
     this.maxReducer = maxReducer;
     this.isAutoReduce = isAutoReduce;
     this.inputSizePerReducer = bytesPerReducer;
+    this.minSrcFraction = minSrcFraction;
+    this.maxSrcFraction = maxSrcFraction;

Review Comment:
   @abstractdog Thanks for sharing your opinion. Actually, the other 3 params are already covered indirectly. The 2 newly added ones are the only ones we can't control at all.
   
https://github.com/apache/hive/blob/b55f9c513b25e8e61150f6bf0afd1a882780d098/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java#L1657-L1669
   
   - `tez.shuffle-vertex-manager.enable.auto-parallel`: This is configured when `hive.tez.auto.reducer.parallelism` is enabled and the reduce vertex can be parallelized
   - `tez.shuffle-vertex-manager.desired-task-input-size`: This is inherited from `hive.exec.reducers.bytes.per.reducer`
   - `tez.shuffle-vertex-manager.min-task-parallelism`: This will be `{estimated # of reducers} * hive.tez.min.partition.factor`
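
To make the indirect mapping in the list above concrete, here is a rough sketch. It is not the code in `DagUtils`; the class name, method name, and parameters are hypothetical, plain maps stand in for the real configuration objects, and the rounding of the min parallelism is my assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only; this is NOT the code in DagUtils.
class ShuffleVertexManagerConfSketch {

  /**
   * Derives the three Tez settings from Hive-side values.
   * All parameter names are hypothetical stand-ins for values Hive already has.
   */
  static Map<String, String> derive(boolean autoReducerParallelism,
      boolean vertexCanBeParallelized, long bytesPerReducer,
      int estimatedReducers, float minPartitionFactor) {
    Map<String, String> tezConf = new LinkedHashMap<>();
    if (!(autoReducerParallelism && vertexCanBeParallelized)) {
      // Auto parallelism is not configured for this vertex.
      return tezConf;
    }
    tezConf.put("tez.shuffle-vertex-manager.enable.auto-parallel", "true");
    // Inherited from hive.exec.reducers.bytes.per.reducer.
    tezConf.put("tez.shuffle-vertex-manager.desired-task-input-size",
        Long.toString(bytesPerReducer));
    // Assumption: truncate toward zero and keep at least one task.
    int minParallelism = Math.max(1, (int) (estimatedReducers * minPartitionFactor));
    tezConf.put("tez.shuffle-vertex-manager.min-task-parallelism",
        Integer.toString(minParallelism));
    return tezConf;
  }

  public static void main(String[] args) {
    System.out.println(derive(true, true, 256_000_000L, 100, 0.25f));
  }
}
```

The point is that a user never sets these three Tez keys directly; each one is computed from an existing Hive param.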
   
   If we support them, `tez.shuffle-vertex-manager.*` would override those 3 params. Is that preferable?
   Another option is to introduce new params such as `hive.tez.auto.reducer.parallelism.min-src-fraction`, treating `tez.shuffle-vertex-manager.*` as SPI-like params. That might be more consistent with the other params.
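
A minimal sketch of the second option, assuming Hive-level keys are translated into the Tez plugin configuration. Only the `min-src-fraction` Hive key is named above; the `max-src-fraction` Hive key, the class, the helper names, and the range check are hypothetical, and plain maps stand in for the real configuration objects.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only; not a proposed implementation.
class SrcFractionTranslationSketch {
  // Param name suggested in the comment above.
  static final String HIVE_MIN = "hive.tez.auto.reducer.parallelism.min-src-fraction";
  // Hypothetical counterpart, named by analogy.
  static final String HIVE_MAX = "hive.tez.auto.reducer.parallelism.max-src-fraction";
  static final String TEZ_MIN = "tez.shuffle-vertex-manager.min-src-fraction";
  static final String TEZ_MAX = "tez.shuffle-vertex-manager.max-src-fraction";

  /** Copies the Hive-level fractions, if set, into a Tez plugin configuration. */
  static Map<String, String> translate(Map<String, String> hiveConf) {
    Map<String, String> pluginConf = new LinkedHashMap<>();
    copyFraction(hiveConf, HIVE_MIN, pluginConf, TEZ_MIN);
    copyFraction(hiveConf, HIVE_MAX, pluginConf, TEZ_MAX);
    return pluginConf;
  }

  private static void copyFraction(Map<String, String> from, String fromKey,
      Map<String, String> to, String toKey) {
    String raw = from.get(fromKey);
    if (raw == null) {
      // Unset on the Hive side: leave the Tez key alone so Tez applies its own default.
      return;
    }
    float value = Float.parseFloat(raw);
    // Assumption: a fraction must lie in [0, 1].
    if (value < 0f || value > 1f) {
      throw new IllegalArgumentException(fromKey + " must be in [0, 1]: " + raw);
    }
    to.put(toKey, Float.toString(value));
  }
}
```

With this shape, users only ever touch `hive.*` keys, and the `tez.shuffle-vertex-manager.*` keys stay an internal detail.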



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

