-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71783/#review219661
-----------------------------------------------------------




ql/src/java/org/apache/hadoop/hive/ql/exec/TopNKeyFilter.java
Lines 76 (patched)
<https://reviews.apache.org/r/71783/#comment307871>

    This will include the hashcode of the object... Is this intended?



ql/src/java/org/apache/hadoop/hive/ql/exec/TopNKeyFilter.java
Lines 87 (patched)
<https://reviews.apache.org/r/71783/#comment307870>

    * Ratio between the forwarded rows and the total incoming rows.
       * The higher the number is, the less is the efficiency of the filter.
       * 1 means all rows should be forwarded.



ql/src/java/org/apache/hadoop/hive/ql/exec/TopNKeyOperator.java
Lines 141 (patched)
<https://reviews.apache.org/r/71783/#comment307873>

    _runTimeNumRows % conf.getCheckEfficiencyNumRows() == 0_
    
    Could you leave a short comment here explaining when we are doing the 
efficiency check?



ql/src/java/org/apache/hadoop/hive/ql/exec/TopNKeyOperator.java
Lines 148 (patched)
<https://reviews.apache.org/r/71783/#comment307875>

    Is it worth to have this if/else branch?



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorTopNKeyOperator.java
Lines 181 (patched)
<https://reviews.apache.org/r/71783/#comment307876>

    nit. Maybe method should not be static so we do not have pass the Logger 
instance around.



ql/src/java/org/apache/hadoop/hive/ql/plan/TopNKeyDesc.java
Lines 96 (patched)
<https://reviews.apache.org/r/71783/#comment307872>

    Probably we do not need these at USER level (doubting we even need them at 
DEFAULT level).


- Jesús Camacho Rodríguez


On Feb. 25, 2020, 3:12 p.m., Attila Magyar wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71783/
> -----------------------------------------------------------
> 
> (Updated Feb. 25, 2020, 3:12 p.m.)
> 
> 
> Review request for hive, Gopal V, Jesús Camacho Rodríguez, Krisztian Kasa, 
> and Panos Garefalakis.
> 
> 
> Bugs: HIVE-22925
>     https://issues.apache.org/jira/browse/HIVE-22925
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> In certain cases the TopNKey filter might work in an inefficient way and adds 
> extra CPU overhead. For example if the rows are coming in an descending order 
> but the filter wants the top N smallest elements the filter will forward 
> everything.
> 
> Inefficient should be detected in runtime so that the filter can be disabled 
> of the ration between forwarder_rows/total_rows is too high.
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java e419dc5eb3b 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/TopNKeyFilter.java 38d2e08b760 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/TopNKeyOperator.java dd66dfcd72e 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorTopNKeyOperator.java 
> 7feadd3137d 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/topnkey/TopNKeyProcessor.java 
> 3869ffa2b83 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java 31735c9ea3d 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/TopNKeyDesc.java 19910a341e0 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/TestTopNKeyFilter.java 
> 95cd45978a8 
>   ql/src/test/results/clientpositive/llap/topnkey.q.out 345d481f18c 
> 
> 
> Diff: https://reviews.apache.org/r/71783/diff/2/
> 
> 
> Testing
> -------
> 
> on dwx
> 
> 
> Thanks,
> 
> Attila Magyar
> 
>

Reply via email to