-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39836/#review105972
-----------------------------------------------------------

ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java (line 192)
<https://reviews.apache.org/r/39836/#comment164672>

    This issue is mostly a cosmetic one, because whoever consumes the stats (e.g., auto-reducer parallelism) looks at the leaves of the operator tree, not at the TableScan (TS). Every operator after the TS uses the getDSFromCS() function to compute its own data size (DS) from its parent's numRows and the column stats. The parent's DS is not used, so the value of DS in the TS is irrelevant for planning. Because of this disconnect, explain output can show DS increasing after a FIL is applied on a TS. See the patch attached to HIVE-12181.

    This patch aims to fix that by using uniform logic for DS estimation, so that explain output doesn't look stupid. Planning logic will not be affected by this. Further, the estimate is made with the existing function getDSFromCS(), which all other operators use, and no change is made to how it handles incomplete or missing stats.


- Ashutosh Chauhan


On Oct. 31, 2015, 10:11 p.m., Ashutosh Chauhan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/39836/
> -----------------------------------------------------------
> 
> (Updated Oct. 31, 2015, 10:11 p.m.)
> 
> 
> Review request for hive and Prasanth_J.
> 
> 
> Bugs: HIVE-12309
>     https://issues.apache.org/jira/browse/HIVE-12309
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> TableScan should use column stats when available for better data size estimate
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java e1f8ebc 
>   ql/src/test/results/clientpositive/annotate_stats_deep_filters.q.out fc4f294 
>   ql/src/test/results/clientpositive/annotate_stats_filter.q.out 054b573 
>   ql/src/test/results/clientpositive/annotate_stats_groupby.q.out 1b9ec68 
>   ql/src/test/results/clientpositive/annotate_stats_groupby2.q.out be3fa1d 
>   ql/src/test/results/clientpositive/annotate_stats_join.q.out bc44cc3 
>   ql/src/test/results/clientpositive/annotate_stats_join_pkfk.q.out c864c04 
>   ql/src/test/results/clientpositive/annotate_stats_limit.q.out 7300ea0 
>   ql/src/test/results/clientpositive/annotate_stats_part.q.out cf523cb 
>   ql/src/test/results/clientpositive/annotate_stats_select.q.out 877037d 
>   ql/src/test/results/clientpositive/annotate_stats_table.q.out ebc6c5b 
>   ql/src/test/results/clientpositive/annotate_stats_union.q.out e09dde3 
>   ql/src/test/results/clientpositive/cbo_rp_auto_join0.q.out d1bc6d4 
>   ql/src/test/results/clientpositive/cbo_rp_auto_join1.q.out 3b053fe 
>   ql/src/test/results/clientpositive/cbo_rp_join0.q.out a8bcc90 
>   ql/src/test/results/clientpositive/extrapolate_part_stats_full.q.out f87a539 
>   ql/src/test/results/clientpositive/extrapolate_part_stats_partial.q.out 5903cd1 
>   ql/src/test/results/clientpositive/extrapolate_part_stats_partial_ndv.q.out 2ea1e6e 
>   ql/src/test/results/clientpositive/llap/llapdecider.q.out 676a0e4 
>   ql/src/test/results/clientpositive/spark/annotate_stats_join.q.out 8955a61 
>   ql/src/test/results/clientpositive/stats_ppr_all.q.out 7627f7a 
>   ql/src/test/results/clientpositive/tez/explainuser_1.q.out ec434f0 
>   ql/src/test/results/clientpositive/tez/llapdecider.q.out 676a0e4 
> 
> Diff: https://reviews.apache.org/r/39836/diff/
> 
> 
> Testing
> -------
> 
> Existing tests
> 
> 
> Thanks,
> 
> Ashutosh Chauhan
> 
>
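
To illustrate the disconnect described in the review comment, here is a minimal Java sketch, not the actual Hive code. The class DataSizeSketch, the ColStat type, and dataSizeFromColStats() are hypothetical stand-ins: the real getDSFromCS() logic in StatsUtils handles many more cases, and all numbers below are made up. The sketch assumes only what the comment states, namely that a downstream operator derives its DS from its parent's row count and the column stats, never from the parent's DS.

```java
import java.util.Arrays;
import java.util.List;

public class DataSizeSketch {

    /** Hypothetical, simplified column statistic: just an average width in bytes. */
    public static final class ColStat {
        final String name;
        final double avgColLen;

        public ColStat(String name, double avgColLen) {
            this.name = name;
            this.avgColLen = avgColLen;
        }
    }

    /**
     * Simplified stand-in for getDSFromCS(): estimated data size is
     * the row count times the summed average column widths.
     */
    public static long dataSizeFromColStats(long numRows, List<ColStat> cols) {
        double rowWidth = 0.0;
        for (ColStat c : cols) {
            rowWidth += c.avgColLen;
        }
        return Math.round(numRows * rowWidth);
    }

    public static void main(String[] args) {
        // Made-up stats: two columns totalling 100 bytes per row.
        List<ColStat> cols = Arrays.asList(
                new ColStat("id", 4.0),
                new ColStat("name", 96.0));

        long tsRows = 1000;
        // Assumption: the TS data size came from a different source
        // (for example a stored raw size), so it disagrees with the column stats.
        long tsDataSizeOld = 20_000;

        // The filter keeps half the rows (assumed selectivity) and computes its
        // DS from the parent's numRows and the column stats only.
        long filRows = tsRows / 2;
        long filDataSize = dataSizeFromColStats(filRows, cols);    // 500 * 100 = 50000

        // With uniform logic, the TS uses the same function as every other operator.
        long tsDataSizeUniform = dataSizeFromColStats(tsRows, cols); // 1000 * 100 = 100000

        System.out.println("TS  DS (old):     " + tsDataSizeOld);
        System.out.println("FIL DS:           " + filDataSize);
        System.out.println("TS  DS (uniform): " + tsDataSizeUniform);
    }
}
```

In this sketch the filter's DS (50000) exceeds the old TS value (20000) even though the filter drops rows, which is the odd-looking explain output the comment mentions. Once the TS is estimated with the same function, DS shrinks from TS to FIL, and the FIL value itself is unchanged, consistent with the claim that planning is unaffected.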
