-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39836/#review105972
-----------------------------------------------------------



ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java (line 192)
<https://reviews.apache.org/r/39836/#comment164672>

    This issue is mostly a cosmetic one.
    
    Whoever uses the stats (e.g., auto-reducer parallelism) looks at the 
leaves of the operator tree, not at TS. Every operator after TS uses the 
getDSFromCS() function to compute its own DS from its parent's numrows and 
the col stats; the parent's DS is not used. So the value of DS in TS is 
irrelevant for planning. Because of this disconnect, explain output can show 
DS increasing after FIL is applied on TS. See the patch attached on 
HIVE-12181. This patch aims to fix that by using uniform logic for DS 
estimation so that explain output doesn't look stupid. Planning logic will 
not be affected by this.
    
    Further, the estimate is made using the existing function getDSFromCS(), 
which all other operators use, and no change is made to it w.r.t. 
incomplete/missing stats.
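
    To illustrate the disconnect described above, here is a minimal Java 
sketch. It is not Hive's code: the class DataSizeSketch, the method 
dataSizeFromColStats(), and all numbers are hypothetical. It assumes DS is 
estimated as numrows times the sum of average column widths, which is a 
simplification of what getDSFromCS() does.

```java
public class DataSizeSketch {

    /**
     * Hypothetical stand-in for a column-stats based estimate:
     * data size = number of rows * sum of average column widths (bytes).
     */
    static long dataSizeFromColStats(long numRows, double[] avgColLens) {
        double rowWidth = 0.0;
        for (double len : avgColLens) {
            rowWidth += len;
        }
        return Math.round(numRows * rowWidth);
    }

    public static void main(String[] args) {
        // Assumed column stats: a 4-byte int and a string averaging 12 bytes.
        double[] avgColLens = {4.0, 12.0};

        long tsRows = 1000;
        // Assumed TS estimate taken from on-disk size (e.g. compressed files),
        // i.e. NOT derived from column stats.
        long tsSizeFromFiles = 5000;

        // FIL keeps half the rows (assumed selectivity) and derives its DS
        // from its parent's row count and column stats, ignoring the parent's DS.
        long filRows = tsRows / 2;
        long filSize = dataSizeFromColStats(filRows, avgColLens);

        // With different logic at TS, DS appears to grow across the filter.
        System.out.println("TS  DS (file based):     " + tsSizeFromFiles);
        System.out.println("FIL DS (col stats):      " + filSize);

        // With the same logic at TS, DS shrinks across the filter as expected.
        long tsSizeUniform = dataSizeFromColStats(tsRows, avgColLens);
        System.out.println("TS  DS (col stats):      " + tsSizeUniform);
    }
}
```

    With these made-up inputs FIL reports 8000 bytes against a file-based TS 
value of 5000, while the uniform estimate gives TS 16000. Only the number 
displayed at TS changes; the FIL estimate is identical in both cases, which 
is why planning is unaffected.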


- Ashutosh Chauhan


On Oct. 31, 2015, 10:11 p.m., Ashutosh Chauhan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/39836/
> -----------------------------------------------------------
> 
> (Updated Oct. 31, 2015, 10:11 p.m.)
> 
> 
> Review request for hive and Prasanth_J.
> 
> 
> Bugs: HIVE-12309
>     https://issues.apache.org/jira/browse/HIVE-12309
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> TableScan should use column stats when available for better data size estimate
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java e1f8ebc 
>   ql/src/test/results/clientpositive/annotate_stats_deep_filters.q.out 
> fc4f294 
>   ql/src/test/results/clientpositive/annotate_stats_filter.q.out 054b573 
>   ql/src/test/results/clientpositive/annotate_stats_groupby.q.out 1b9ec68 
>   ql/src/test/results/clientpositive/annotate_stats_groupby2.q.out be3fa1d 
>   ql/src/test/results/clientpositive/annotate_stats_join.q.out bc44cc3 
>   ql/src/test/results/clientpositive/annotate_stats_join_pkfk.q.out c864c04 
>   ql/src/test/results/clientpositive/annotate_stats_limit.q.out 7300ea0 
>   ql/src/test/results/clientpositive/annotate_stats_part.q.out cf523cb 
>   ql/src/test/results/clientpositive/annotate_stats_select.q.out 877037d 
>   ql/src/test/results/clientpositive/annotate_stats_table.q.out ebc6c5b 
>   ql/src/test/results/clientpositive/annotate_stats_union.q.out e09dde3 
>   ql/src/test/results/clientpositive/cbo_rp_auto_join0.q.out d1bc6d4 
>   ql/src/test/results/clientpositive/cbo_rp_auto_join1.q.out 3b053fe 
>   ql/src/test/results/clientpositive/cbo_rp_join0.q.out a8bcc90 
>   ql/src/test/results/clientpositive/extrapolate_part_stats_full.q.out 
> f87a539 
>   ql/src/test/results/clientpositive/extrapolate_part_stats_partial.q.out 
> 5903cd1 
>   ql/src/test/results/clientpositive/extrapolate_part_stats_partial_ndv.q.out 
> 2ea1e6e 
>   ql/src/test/results/clientpositive/llap/llapdecider.q.out 676a0e4 
>   ql/src/test/results/clientpositive/spark/annotate_stats_join.q.out 8955a61 
>   ql/src/test/results/clientpositive/stats_ppr_all.q.out 7627f7a 
>   ql/src/test/results/clientpositive/tez/explainuser_1.q.out ec434f0 
>   ql/src/test/results/clientpositive/tez/llapdecider.q.out 676a0e4 
> 
> Diff: https://reviews.apache.org/r/39836/diff/
> 
> 
> Testing
> -------
> 
> Existing tests
> 
> 
> Thanks,
> 
> Ashutosh Chauhan
> 
>
