okumin commented on code in PR #4404:
URL: https://github.com/apache/hive/pull/4404#discussion_r1226183179
##########
ql/src/java/org/apache/hadoop/hive/ql/plan/TezEdgeProperty.java:
##########
@@ -51,19 +53,22 @@ public TezEdgeProperty(HiveConf hiveConf, EdgeType edgeType,
   }
 
   public TezEdgeProperty(HiveConf hiveConf, EdgeType edgeType, boolean isAutoReduce,
-      boolean isSlowStart, int minReducer, int maxReducer, long bytesPerReducer) {
+      boolean isSlowStart, int minReducer, int maxReducer, long bytesPerReducer,
+      float minSrcFraction, float maxSrcFraction) {
     this(hiveConf, edgeType, -1);
-    setAutoReduce(hiveConf, isAutoReduce, minReducer, maxReducer, bytesPerReducer);
+    setAutoReduce(hiveConf, isAutoReduce, minReducer, maxReducer, bytesPerReducer, minSrcFraction, maxSrcFraction);
     this.isSlowStart = isSlowStart;
   }
 
   public void setAutoReduce(HiveConf hiveConf, boolean isAutoReduce, int minReducer,
-      int maxReducer, long bytesPerReducer) {
+      int maxReducer, long bytesPerReducer, float minSrcFraction, float maxSrcFraction) {
     this.hiveConf = hiveConf;
     this.minReducer = minReducer;
     this.maxReducer = maxReducer;
     this.isAutoReduce = isAutoReduce;
     this.inputSizePerReducer = bytesPerReducer;
+    this.minSrcFraction = minSrcFraction;
+    this.maxSrcFraction = maxSrcFraction;
Review Comment:
@abstractdog Thanks for sharing your opinion. Actually, the other three
params are already covered indirectly. The two newly added ones are the only
ones we can't control at all.
https://github.com/apache/hive/blob/b55f9c513b25e8e61150f6bf0afd1a882780d098/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java#L1657-L1669
- `tez.shuffle-vertex-manager.enable.auto-parallel`: This is configured when
`hive.tez.auto.reducer.parallelism` is enabled and the reduce vertex can be
parallelized
- `tez.shuffle-vertex-manager.desired-task-input-size`: This is inherited
from `hive.exec.reducers.bytes.per.reducer`
- `tez.shuffle-vertex-manager.min-task-parallelism`: This will be
`{estimated # of reducers} * hive.tez.min.partition.factor`
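
To make the mapping above concrete, here is a minimal sketch of how the three Tez params follow from Hive-side values. It is not the code in `DagUtils`; the class name, the method signature, the use of a plain `Map`, and the clamp to at least 1 are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: a hypothetical helper, not Hive's DagUtils implementation. */
final class AutoParallelConfSketch {

  /**
   * @param autoParallelAllowed true when hive.tez.auto.reducer.parallelism is enabled
   *                            and the reduce vertex can be parallelized
   * @param bytesPerReducer     value of hive.exec.reducers.bytes.per.reducer
   * @param estimatedReducers   estimated number of reducers for the vertex
   * @param minPartitionFactor  value of hive.tez.min.partition.factor
   */
  static Map<String, String> derive(boolean autoParallelAllowed, long bytesPerReducer,
      int estimatedReducers, float minPartitionFactor) {
    Map<String, String> tezConf = new LinkedHashMap<>();
    if (!autoParallelAllowed) {
      // Nothing is passed to the vertex manager in this sketch.
      return tezConf;
    }
    tezConf.put("tez.shuffle-vertex-manager.enable.auto-parallel", "true");
    tezConf.put("tez.shuffle-vertex-manager.desired-task-input-size",
        Long.toString(bytesPerReducer));
    // Assumption: truncate the product and never go below one task.
    int minParallelism = Math.max(1, (int) (estimatedReducers * minPartitionFactor));
    tezConf.put("tez.shuffle-vertex-manager.min-task-parallelism",
        Integer.toString(minParallelism));
    return tezConf;
  }
}
```

None of the three values is read from a user-supplied `tez.shuffle-vertex-manager.*` key; each one is computed from a Hive param.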
If we supported them as well, `tez.shuffle-vertex-manager.*` would override
the three Hive-derived params. Is that preferable?
Another option is to introduce new params like
`hive.tez.auto.reducer.parallelism.min-src-fraction`, treating
`tez.shuffle-vertex-manager.*` as SPI-like params. That might be more
consistent with the other params.
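
A rough sketch of what this second option could look like, assuming Hive-side params are translated into the Tez keys. The `max-src-fraction` Hive key, the class and method names, the `Map`-based config, and the validation rule are hypothetical; the fallback values 0.25 and 0.75 are the Tez defaults for the two fractions as far as I know.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: hypothetical translation of Hive-side params into Tez keys. */
final class SrcFractionSketch {
  // The first key is the name suggested in this comment; the second is assumed by analogy.
  static final String HIVE_MIN = "hive.tez.auto.reducer.parallelism.min-src-fraction";
  static final String HIVE_MAX = "hive.tez.auto.reducer.parallelism.max-src-fraction";

  static Map<String, String> toTezConf(Map<String, String> hiveConf) {
    // Fall back to what I believe are the Tez defaults when the Hive param is unset.
    float min = Float.parseFloat(hiveConf.getOrDefault(HIVE_MIN, "0.25"));
    float max = Float.parseFloat(hiveConf.getOrDefault(HIVE_MAX, "0.75"));
    // Assumed sanity check: both are fractions and min does not exceed max.
    if (min < 0f || max > 1f || min > max) {
      throw new IllegalArgumentException("Expected 0 <= min <= max <= 1, got " + min + ", " + max);
    }
    Map<String, String> tezConf = new LinkedHashMap<>();
    tezConf.put("tez.shuffle-vertex-manager.min-src-fraction", Float.toString(min));
    tezConf.put("tez.shuffle-vertex-manager.max-src-fraction", Float.toString(max));
    return tezConf;
  }
}
```

With this shape, users would only ever set `hive.*` params, and the `tez.shuffle-vertex-manager.*` keys would stay an internal detail, as they already are for the other three.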
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]