----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/71783/#review219661 -----------------------------------------------------------
ql/src/java/org/apache/hadoop/hive/ql/exec/TopNKeyFilter.java Lines 76 (patched) <https://reviews.apache.org/r/71783/#comment307871> This will include the hashcode of the object... Is this intended? ql/src/java/org/apache/hadoop/hive/ql/exec/TopNKeyFilter.java Lines 87 (patched) <https://reviews.apache.org/r/71783/#comment307870> * Ratio between the forwarded rows and the total incoming rows. * The higher the number is, the less is the efficiency of the filter. * 1 means all rows should be forwarded. ql/src/java/org/apache/hadoop/hive/ql/exec/TopNKeyOperator.java Lines 141 (patched) <https://reviews.apache.org/r/71783/#comment307873> _runTimeNumRows % conf.getCheckEfficiencyNumRows() == 0_ Could you leave a short comment here explaining when we are doing the efficiency check? ql/src/java/org/apache/hadoop/hive/ql/exec/TopNKeyOperator.java Lines 148 (patched) <https://reviews.apache.org/r/71783/#comment307875> Is it worth to have this if/else branch? ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorTopNKeyOperator.java Lines 181 (patched) <https://reviews.apache.org/r/71783/#comment307876> nit. Maybe method should not be static so we do not have pass the Logger instance around. ql/src/java/org/apache/hadoop/hive/ql/plan/TopNKeyDesc.java Lines 96 (patched) <https://reviews.apache.org/r/71783/#comment307872> Probably we do not need these at USER level (doubting we even need them at DEFAULT level). - Jesús Camacho Rodríguez On Feb. 25, 2020, 3:12 p.m., Attila Magyar wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/71783/ > ----------------------------------------------------------- > > (Updated Feb. 25, 2020, 3:12 p.m.) > > > Review request for hive, Gopal V, Jesús Camacho Rodríguez, Krisztian Kasa, > and Panos Garefalakis. > > > Bugs: HIVE-22925 > https://issues.apache.org/jira/browse/HIVE-22925 > > > Repository: hive-git > > > Description > ------- > > In certain cases the TopNKey filter might work in an inefficient way and adds > extra CPU overhead. For example if the rows are coming in an descending order > but the filter wants the top N smallest elements the filter will forward > everything. > > Inefficient should be detected in runtime so that the filter can be disabled > of the ration between forwarder_rows/total_rows is too high. > > > Diffs > ----- > > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java e419dc5eb3b > ql/src/java/org/apache/hadoop/hive/ql/exec/TopNKeyFilter.java 38d2e08b760 > ql/src/java/org/apache/hadoop/hive/ql/exec/TopNKeyOperator.java dd66dfcd72e > > ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorTopNKeyOperator.java > 7feadd3137d > > ql/src/java/org/apache/hadoop/hive/ql/optimizer/topnkey/TopNKeyProcessor.java > 3869ffa2b83 > ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java 31735c9ea3d > ql/src/java/org/apache/hadoop/hive/ql/plan/TopNKeyDesc.java 19910a341e0 > ql/src/test/org/apache/hadoop/hive/ql/exec/TestTopNKeyFilter.java > 95cd45978a8 > ql/src/test/results/clientpositive/llap/topnkey.q.out 345d481f18c > > > Diff: https://reviews.apache.org/r/71783/diff/2/ > > > Testing > ------- > > on dwx > > > Thanks, > > Attila Magyar > >
