HIVE-14367 : Estimated size for constant nulls is 0 (Ashutosh Chauhan via Jesus Camacho Rodriguez)
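
For illustration only, the gist of the change can be sketched as below: the void type (constant NULL) is given a small non-zero width instead of 0, unknown types raise an error instead of silently returning 0, and a column that contains nulls is still counted at least once when sizing. This is a simplified, hypothetical stand-in and not the code in StatsUtils: the class name `NullSizeSketch`, the 4- and 8-byte widths, and the literal type-name strings are assumptions made for the example.

```java
import java.util.Locale;

/** Hypothetical, simplified sketch of the sizing rules in this patch; not Hive code. */
class NullSizeSketch {
    // Assumed byte widths, standing in for JavaDataModel.primitive1() / primitive2().
    static final int PRIMITIVE1 = 4;
    static final int PRIMITIVE2 = 8;

    /** Average column length for a few fixed-length types; "void" is no longer sized as 0. */
    static long avgColLenOfFixedLengthType(String colType) {
        String t = colType.toLowerCase(Locale.ROOT);
        switch (t) {
            case "tinyint": case "smallint": case "int":
            case "void": case "boolean": case "float":
                return PRIMITIVE1;
            case "double": case "bigint": case "interval_year_month": case "long":
                return PRIMITIVE2;
            default:
                // Fail loudly rather than report a size of 0 for an unknown type.
                throw new IllegalArgumentException("Size requested for unknown type: " + colType);
        }
    }

    /** Mirrors the patched expression: when nulls are present, count one extra row. */
    static long nonNullCount(long numRows, long numNulls) {
        return numNulls > 0 ? numRows - numNulls + 1 : numRows;
    }

    public static void main(String[] args) {
        System.out.println(avgColLenOfFixedLengthType("void")); // prints 4
        System.out.println(nonNullCount(1000, 1000));           // prints 1
    }
}
```

Under these assumed widths, 1000 rows of a void-typed expression come to 4000 bytes rather than 0.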
Signed-off-by: Ashutosh Chauhan <hashut...@apache.org>

Project: http://git-wip-us.apache.org/repos/asf/hive/repo
Commit: http://git-wip-us.apache.org/repos/asf/hive/commit/47dbc005
Tree: http://git-wip-us.apache.org/repos/asf/hive/tree/47dbc005
Diff: http://git-wip-us.apache.org/repos/asf/hive/diff/47dbc005

Branch: refs/heads/branch-2.2
Commit: 47dbc005cdf9761852ef717fea8d89ca54459a43
Parents: da395b5
Author: Ashutosh Chauhan <hashut...@apache.org>
Authored: Wed Jul 27 19:21:11 2016 -0700
Committer: Owen O'Malley <omal...@apache.org>
Committed: Tue Mar 28 15:27:50 2017 -0700

----------------------------------------------------------------------
 .../clientpositive/udaf_example_avg.q.out | 8 +-
 .../clientpositive/udaf_example_max.q.out | 8 +-
 .../clientpositive/udaf_example_max_n.q.out | 8 +-
 .../clientpositive/udaf_example_min.q.out | 8 +-
 .../clientpositive/udaf_example_min_n.q.out | 8 +-
 .../stats/annotation/StatsRulesProcFactory.java | 2 +-
 .../apache/hadoop/hive/ql/stats/StatsUtils.java | 49 +--
 .../hive/ql/udf/generic/GenericUDAFAverage.java | 15 +-
 .../hive/ql/udf/generic/GenericUDAFMax.java | 11 +-
 .../hive/ql/udf/generic/GenericUDAFMin.java | 10 +
 .../queries/clientpositive/vector_coalesce.q | 1 +
 .../clientnegative/udf_assert_true.q.out | 24 +-
 .../clientpositive/acid_table_stats.q.out | 8 +-
 .../annotate_stats_deep_filters.q.out | 6 +-
 .../clientpositive/annotate_stats_filter.q.out | 112 +++---
 .../clientpositive/annotate_stats_groupby.q.out | 12 +-
 .../clientpositive/annotate_stats_join.q.out | 16 +-
 .../annotate_stats_join_pkfk.q.out | 74 ++--
 .../clientpositive/annotate_stats_limit.q.out | 20 +-
 .../clientpositive/annotate_stats_select.q.out | 4 +-
 .../clientpositive/annotate_stats_union.q.out | 20 +-
 .../clientpositive/autoColumnStats_4.q.out | 8 +-
 .../clientpositive/autoColumnStats_7.q.out | 8 +-
 .../clientpositive/autoColumnStats_9.q.out | 10 +-
 .../cbo_rp_annotate_stats_groupby.q.out | 12 +-
 .../clientpositive/cbo_rp_auto_join0.q.out | 24 +-
 .../cbo_rp_groupby3_noskew_multi_distinct.q.out | 6 +-
 .../results/clientpositive/cbo_rp_join0.q.out | 50 +--
 .../cbo_rp_udaf_percentile_approx_23.q.out | 16 +-
 .../clientpositive/constantfolding.q.out | 4 +-
 .../clientpositive/create_genericudaf.q.out | 8 +-
 .../clientpositive/decimal_precision.q.out | 8 +-
 .../results/clientpositive/decimal_stats.q.out | 12 +-
 .../results/clientpositive/decimal_udf.q.out | 8 +-
 .../clientpositive/fetch_aggregation.q.out | 4 +-
 .../test/results/clientpositive/fold_case.q.out | 6 +-
 .../test/results/clientpositive/groupby3.q.out | 10 +-
 .../results/clientpositive/groupby3_map.q.out | 6 +-
 .../groupby3_map_multi_distinct.q.out | 6 +-
 .../clientpositive/groupby3_map_skew.q.out | 10 +-
 .../clientpositive/groupby3_noskew.q.out | 6 +-
 .../groupby3_noskew_multi_distinct.q.out | 6 +-
 .../clientpositive/interval_arithmetic.q.out | 8 +-
 .../clientpositive/literal_decimal.q.out | 4 +-
 .../llap/dynamic_partition_pruning.q.out | 215 +++++++-----
 .../vectorized_dynamic_partition_pruning.q.out | 150 ++++----
 .../results/clientpositive/metadataonly1.q.out | 34 +-
 .../clientpositive/num_op_type_conv.q.out | 4 +-
 .../results/clientpositive/perf/query13.q.out | 4 +-
 .../results/clientpositive/perf/query28.q.out | 18 +-
 .../reduceSinkDeDuplication_pRS_key_empty.q.out | 14 +-
 .../clientpositive/remove_exprs_stats.q.out | 46 +--
 .../spark/annotate_stats_join.q.out | 16 +-
 .../results/clientpositive/spark/groupby3.q.out | 10 +-
 .../clientpositive/spark/groupby3_map.q.out | 6 +-
 .../spark/groupby3_map_multi_distinct.q.out | 6 +-
 .../spark/groupby3_map_skew.q.out | 10 +-
 .../clientpositive/spark/groupby3_noskew.q.out | 6 +-
 .../spark/groupby3_noskew_multi_distinct.q.out | 6 +-
 .../clientpositive/spark/subquery_in.q.out | 16 +-
 .../spark/union_remove_6_subq.q.out | 8 +-
 .../clientpositive/spark/vectorization_0.q.out | 46 +--
 .../spark/vectorization_pushdown.q.out | 8 +-
 .../spark/vectorization_short_regress.q.out | 40 +--
 .../spark/vectorized_mapjoin.q.out | 8 +-
 .../spark/vectorized_shufflejoin.q.out | 12 +-
 .../spark/vectorized_timestamp_funcs.q.out | 10 +-
 .../results/clientpositive/subquery_in.q.out | 12 +-
 .../results/clientpositive/subquery_notin.q.out | 38 +-
 .../tez/dynamic_partition_pruning.q.out | 150 ++++----
 .../clientpositive/tez/explainuser_1.q.out | 346 +++++++++----------
 .../clientpositive/tez/explainuser_3.q.out | 4 +-
 .../results/clientpositive/tez/groupby3.q.out | 10 +-
 .../clientpositive/tez/metadataonly1.q.out | 40 +--
 .../clientpositive/tez/subquery_in.q.out | 12 +-
 .../clientpositive/tez/vector_aggregate_9.q.out | 8 +-
 .../tez/vector_aggregate_without_gby.q.out | 4 +-
 .../clientpositive/tez/vector_coalesce.q.out | 70 ++--
 .../tez/vector_decimal_precision.q.out | 8 +-
 .../clientpositive/tez/vector_decimal_udf.q.out | 8 +-
 .../tez/vector_interval_arithmetic.q.out | 16 +-
 .../tez/vector_null_projection.q.out | 26 +-
 .../clientpositive/tez/vectorization_0.q.out | 46 +--
 .../tez/vectorization_pushdown.q.out | 8 +-
 .../tez/vectorization_short_regress.q.out | 40 +--
 .../tez/vectorized_distinct_gby.q.out | 8 +-
 .../vectorized_dynamic_partition_pruning.q.out | 150 ++++----
 .../clientpositive/tez/vectorized_mapjoin.q.out | 8 +-
 .../tez/vectorized_shufflejoin.q.out | 12 +-
 .../tez/vectorized_timestamp_funcs.q.out | 10 +-
 .../clientpositive/udaf_number_format.q.out | 4 +-
 .../udaf_percentile_approx_23.q.out | 16 +-
 ql/src/test/results/clientpositive/udf3.q.out | 4 +-
 ql/src/test/results/clientpositive/udf4.q.out | 4 +-
 ql/src/test/results/clientpositive/udf7.q.out | 2 +-
 ql/src/test/results/clientpositive/udf8.q.out | 8 +-
 .../test/results/clientpositive/udf_case.q.out | 2 +-
 .../results/clientpositive/udf_coalesce.q.out | 2 +-
 .../test/results/clientpositive/udf_elt.q.out | 2 +-
 .../results/clientpositive/udf_greatest.q.out | 2 +-
 ql/src/test/results/clientpositive/udf_if.q.out | 2 +-
 .../test/results/clientpositive/udf_instr.q.out | 2 +-
 .../test/results/clientpositive/udf_least.q.out | 2 +-
 .../results/clientpositive/udf_locate.q.out | 2 +-
 .../test/results/clientpositive/udf_trunc.q.out | 4 +-
 .../test/results/clientpositive/udf_when.q.out | 2 +-
 .../results/clientpositive/udtf_stack.q.out | 6 +-
 .../clientpositive/union_remove_6_subq.q.out | 8 +-
 .../clientpositive/vector_aggregate_9.q.out | 8 +-
 .../vector_aggregate_without_gby.q.out | 8 +-
 .../clientpositive/vector_coalesce.q.out | 80 ++---
 .../vector_decimal_precision.q.out | 8 +-
 .../clientpositive/vector_decimal_udf.q.out | 8 +-
 .../results/clientpositive/vector_elt.q.out | 4 +-
 .../vector_interval_arithmetic.q.out | 16 +-
 .../clientpositive/vector_null_projection.q.out | 30 +-
 .../results/clientpositive/vector_nvl.q.out | 4 +-
 .../results/clientpositive/vector_udf1.q.out | 24 +-
 .../clientpositive/vectorization_0.q.out | 46 +--
 .../clientpositive/vectorization_pushdown.q.out | 8 +-
 .../vectorization_short_regress.q.out | 40 +--
 .../vectorized_distinct_gby.q.out | 4 +-
 .../clientpositive/vectorized_mapjoin.q.out | 8 +-
 .../clientpositive/vectorized_shufflejoin.q.out | 12 +-
 .../vectorized_timestamp_funcs.q.out | 10 +-
 125 files changed, 1430 insertions(+), 1353 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/hive/blob/47dbc005/contrib/src/test/results/clientpositive/udaf_example_avg.q.out
----------------------------------------------------------------------
diff --git a/contrib/src/test/results/clientpositive/udaf_example_avg.q.out b/contrib/src/test/results/clientpositive/udaf_example_avg.q.out
index 61926d4..4e6cb99 100644
--- a/contrib/src/test/results/clientpositive/udaf_example_avg.q.out
+++ b/contrib/src/test/results/clientpositive/udaf_example_avg.q.out
@@ -33,20 +33,20 @@ STAGE PLANS:
  aggregations: example_avg(_col0), example_avg(_col1)
  mode: hash
  outputColumnNames: _col0, _col1
- Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
+ Statistics: Num rows: 1 Data size: 128 Basic stats:
COMPLETE Column stats: NONE
  Reduce Output Operator
  sort order:
- Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
+ Statistics: Num rows: 1 Data size: 128 Basic stats: COMPLETE Column stats: NONE
  value expressions: _col0 (type: struct<mcount:bigint,msum:double>), _col1 (type: struct<mcount:bigint,msum:double>)
  Reduce Operator Tree:
  Group By Operator
  aggregations: example_avg(VALUE._col0), example_avg(VALUE._col1)
  mode: mergepartial
  outputColumnNames: _col0, _col1
- Statistics: Num rows: 1 Data size: 16 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 1 Data size: 128 Basic stats: COMPLETE Column stats: NONE
  File Output Operator
  compressed: false
- Statistics: Num rows: 1 Data size: 16 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 1 Data size: 128 Basic stats: COMPLETE Column stats: NONE
  table:
  input format: org.apache.hadoop.mapred.SequenceFileInputFormat
  output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat

http://git-wip-us.apache.org/repos/asf/hive/blob/47dbc005/contrib/src/test/results/clientpositive/udaf_example_max.q.out
----------------------------------------------------------------------
diff --git a/contrib/src/test/results/clientpositive/udaf_example_max.q.out b/contrib/src/test/results/clientpositive/udaf_example_max.q.out
index 932d8df..fc5a896 100644
--- a/contrib/src/test/results/clientpositive/udaf_example_max.q.out
+++ b/contrib/src/test/results/clientpositive/udaf_example_max.q.out
@@ -38,20 +38,20 @@ STAGE PLANS:
  aggregations: example_max(_col0), example_max(_col1)
  mode: hash
  outputColumnNames: _col0, _col1
- Statistics: Num rows: 1 Data size: 168 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 1 Data size: 368 Basic stats: COMPLETE Column stats: NONE
  Reduce Output Operator
  sort order:
- Statistics: Num rows: 1 Data size: 168 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 1 Data size: 368 Basic stats: COMPLETE Column stats: NONE
  value expressions: _col0 (type: string), _col1 (type: string)
  Reduce Operator Tree:
  Group By Operator
  aggregations: example_max(VALUE._col0), example_max(VALUE._col1)
  mode: mergepartial
  outputColumnNames: _col0, _col1
- Statistics: Num rows: 1 Data size: 168 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 1 Data size: 368 Basic stats: COMPLETE Column stats: NONE
  File Output Operator
  compressed: false
- Statistics: Num rows: 1 Data size: 168 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 1 Data size: 368 Basic stats: COMPLETE Column stats: NONE
  table:
  input format: org.apache.hadoop.mapred.SequenceFileInputFormat
  output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat

http://git-wip-us.apache.org/repos/asf/hive/blob/47dbc005/contrib/src/test/results/clientpositive/udaf_example_max_n.q.out
----------------------------------------------------------------------
diff --git a/contrib/src/test/results/clientpositive/udaf_example_max_n.q.out b/contrib/src/test/results/clientpositive/udaf_example_max_n.q.out
index 16ae212..47d8f52 100644
--- a/contrib/src/test/results/clientpositive/udaf_example_max_n.q.out
+++ b/contrib/src/test/results/clientpositive/udaf_example_max_n.q.out
@@ -33,20 +33,20 @@ STAGE PLANS:
  aggregations: example_max_n(_col0, 10), example_max_n(_col2, 10)
  mode: hash
  outputColumnNames: _col0, _col1
- Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
+ Statistics: Num rows: 1 Data size: 424 Basic stats: COMPLETE Column stats: NONE
  Reduce Output Operator
  sort order:
- Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
+ Statistics: Num rows: 1 Data size: 424 Basic stats: COMPLETE Column stats: NONE
  value expressions: _col0 (type: struct<a:array<double>,n:int>), _col1 (type: struct<a:array<double>,n:int>)
  Reduce Operator Tree:
  Group By Operator
  aggregations: example_max_n(VALUE._col0), example_max_n(VALUE._col1)
  mode: mergepartial
  outputColumnNames: _col0, _col1
- Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
+ Statistics: Num rows: 1 Data size: 424 Basic stats: COMPLETE Column stats: NONE
  File Output Operator
  compressed: false
- Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
+ Statistics: Num rows: 1 Data size: 424 Basic stats: COMPLETE Column stats: NONE
  table:
  input format: org.apache.hadoop.mapred.SequenceFileInputFormat
  output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat

http://git-wip-us.apache.org/repos/asf/hive/blob/47dbc005/contrib/src/test/results/clientpositive/udaf_example_min.q.out
----------------------------------------------------------------------
diff --git a/contrib/src/test/results/clientpositive/udaf_example_min.q.out b/contrib/src/test/results/clientpositive/udaf_example_min.q.out
index b0ffe53..feb2add 100644
--- a/contrib/src/test/results/clientpositive/udaf_example_min.q.out
+++ b/contrib/src/test/results/clientpositive/udaf_example_min.q.out
@@ -38,20 +38,20 @@ STAGE PLANS:
  aggregations: example_min(_col0), example_min(_col1)
  mode: hash
  outputColumnNames: _col0, _col1
- Statistics: Num rows: 1 Data size: 168 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 1 Data size: 368 Basic stats: COMPLETE Column stats: NONE
  Reduce Output Operator
  sort order:
- Statistics: Num rows: 1 Data size: 168 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 1 Data size: 368 Basic stats: COMPLETE Column stats: NONE
  value expressions: _col0 (type: string), _col1 (type: string)
  Reduce Operator Tree:
  Group By Operator
  aggregations: example_min(VALUE._col0), example_min(VALUE._col1)
  mode: mergepartial
  outputColumnNames: _col0, _col1
- Statistics: Num rows: 1 Data size: 168 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 1 Data size: 368 Basic stats: COMPLETE Column stats: NONE
  File Output Operator
  compressed: false
- Statistics: Num rows: 1 Data size: 168 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 1 Data size: 368 Basic stats: COMPLETE Column stats: NONE
  table:
  input format: org.apache.hadoop.mapred.SequenceFileInputFormat
  output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat

http://git-wip-us.apache.org/repos/asf/hive/blob/47dbc005/contrib/src/test/results/clientpositive/udaf_example_min_n.q.out
----------------------------------------------------------------------
diff --git a/contrib/src/test/results/clientpositive/udaf_example_min_n.q.out b/contrib/src/test/results/clientpositive/udaf_example_min_n.q.out
index 7e7dd84..16c4684 100644
--- a/contrib/src/test/results/clientpositive/udaf_example_min_n.q.out
+++ b/contrib/src/test/results/clientpositive/udaf_example_min_n.q.out
@@ -33,20 +33,20 @@ STAGE PLANS:
  aggregations: example_min_n(_col0, 10), example_min_n(_col2, 10)
  mode: hash
  outputColumnNames: _col0, _col1
- Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
+ Statistics: Num rows: 1 Data size: 424 Basic stats: COMPLETE Column stats: NONE
  Reduce Output Operator
  sort order:
- Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
+ Statistics: Num rows: 1 Data size: 424 Basic stats: COMPLETE Column stats: NONE
  value expressions: _col0 (type: struct<a:array<double>,n:int>), _col1 (type: struct<a:array<double>,n:int>)
  Reduce Operator Tree:
  Group By Operator
  aggregations: example_min_n(VALUE._col0), example_min_n(VALUE._col1)
  mode: mergepartial
  outputColumnNames: _col0, _col1
- Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
+ Statistics: Num rows: 1 Data size: 424 Basic stats: COMPLETE Column stats: NONE
  File Output Operator
  compressed: false
- Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
+ Statistics: Num rows: 1 Data size: 424 Basic stats: COMPLETE Column stats: NONE
  table:
  input format: org.apache.hadoop.mapred.SequenceFileInputFormat
  output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat

http://git-wip-us.apache.org/repos/asf/hive/blob/47dbc005/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java
----------------------------------------------------------------------
diff --git a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java b/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java
index 33778cd..d9f70a7 100644
--- a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java
+++ b/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java
@@ -1236,7 +1236,7 @@ public class StatsRulesProcFactory {
  ColStatistics cs = new ColStatistics(colName, colType);
  cs.setCountDistint(stats.getNumRows());
  cs.setNumNulls(0);
- cs.setAvgColLen(StatsUtils.getAvgColLenOfFixedLengthTypes(colType));
+ cs.setAvgColLen(StatsUtils.getAvgColLenOf(conf, ci.getObjectInspector(), colType));
  aggColStats.add(cs);
  }
  }

http://git-wip-us.apache.org/repos/asf/hive/blob/47dbc005/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java
----------------------------------------------------------------------
diff --git a/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java b/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java
index 0a2e4a5..0f1e3a3 100644
--- a/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java
+++ b/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java
@@ -432,7 +432,7 @@ public class StatsUtils {
  long numPartitions = getNDVPartitionColumn(partList.getPartitions(),
  ci.getInternalName());
  partCS.setCountDistint(numPartitions);
- partCS.setAvgColLen(StatsUtils.getAvgColLenOfVariableLengthTypes(conf,
+ partCS.setAvgColLen(StatsUtils.getAvgColLenOf(conf,
  ci.getObjectInspector(), partCS.getColumnType()));
  partCS.setRange(getRangePartitionColumn(partList.getPartitions(), ci.getInternalName(),
  ci.getType().getTypeName(), conf.getVar(ConfVars.DEFAULTPARTITIONNAME)));
@@ -551,7 +551,7 @@ public class StatsUtils {
  || colTypeLowerCase.startsWith(serdeConstants.MAP_TYPE_NAME)
  || colTypeLowerCase.startsWith(serdeConstants.STRUCT_TYPE_NAME)
  || colTypeLowerCase.startsWith(serdeConstants.UNION_TYPE_NAME)) {
- avgRowSize += getAvgColLenOfVariableLengthTypes(conf, oi, colTypeLowerCase);
+ avgRowSize += getAvgColLenOf(conf, oi, colTypeLowerCase);
  } else {
  avgRowSize += getAvgColLenOfFixedLengthTypes(colTypeLowerCase);
  }
@@ -841,7 +841,7 @@ public class StatsUtils {
  * - column type
  * @return raw data size
  */
- public static long getAvgColLenOfVariableLengthTypes(HiveConf conf, ObjectInspector oi,
+ public static long getAvgColLenOf(HiveConf conf, ObjectInspector oi,
  String colType) {
  long configVarLen = HiveConf.getIntVar(conf, HiveConf.ConfVars.HIVE_STATS_MAX_VARIABLE_LENGTH);
@@ -908,7 +908,7 @@ public class StatsUtils {
  return getSizeOfComplexTypes(conf, oi);
  }
- return 0;
+ throw new IllegalArgumentException("Size requested for unknown type: " + colType + " OI: " + oi.getTypeName());
  }
  /**
@@ -931,10 +931,10 @@ public class StatsUtils {
  if (colTypeLowerCase.equals(serdeConstants.STRING_TYPE_NAME)
  || colTypeLowerCase.startsWith(serdeConstants.VARCHAR_TYPE_NAME)
  || colTypeLowerCase.startsWith(serdeConstants.CHAR_TYPE_NAME)) {
- int avgColLen = (int) getAvgColLenOfVariableLengthTypes(conf, oi, colTypeLowerCase);
+ int avgColLen = (int) getAvgColLenOf(conf, oi, colTypeLowerCase);
  result += JavaDataModel.get().lengthForStringOfLength(avgColLen);
  } else if (colTypeLowerCase.equals(serdeConstants.BINARY_TYPE_NAME)) {
- int avgColLen = (int) getAvgColLenOfVariableLengthTypes(conf, oi, colTypeLowerCase);
+ int avgColLen = (int) getAvgColLenOf(conf, oi, colTypeLowerCase);
  result += JavaDataModel.get().lengthForByteArrayOfSize(avgColLen);
  } else {
  result += getAvgColLenOfFixedLengthTypes(colTypeLowerCase);
@@ -1025,11 +1025,13 @@ public class StatsUtils {
  if (colTypeLowerCase.equals(serdeConstants.TINYINT_TYPE_NAME)
  || colTypeLowerCase.equals(serdeConstants.SMALLINT_TYPE_NAME)
  || colTypeLowerCase.equals(serdeConstants.INT_TYPE_NAME)
+ || colTypeLowerCase.equals(serdeConstants.VOID_TYPE_NAME)
  || colTypeLowerCase.equals(serdeConstants.BOOLEAN_TYPE_NAME)
  || colTypeLowerCase.equals(serdeConstants.FLOAT_TYPE_NAME)) {
  return JavaDataModel.get().primitive1();
  } else if (colTypeLowerCase.equals(serdeConstants.DOUBLE_TYPE_NAME)
  || colTypeLowerCase.equals(serdeConstants.BIGINT_TYPE_NAME)
+ || colTypeLowerCase.equals(serdeConstants.INTERVAL_YEAR_MONTH_TYPE_NAME)
  || colTypeLowerCase.equals("long")) {
  return JavaDataModel.get().primitive2();
  } else if (colTypeLowerCase.equals(serdeConstants.TIMESTAMP_TYPE_NAME)) {
@@ -1038,8 +1040,10 @@ public class StatsUtils {
  return JavaDataModel.get().lengthOfDate();
  } else if (colTypeLowerCase.startsWith(serdeConstants.DECIMAL_TYPE_NAME)) {
  return JavaDataModel.get().lengthOfDecimal();
+ } else if (colTypeLowerCase.equals(serdeConstants.INTERVAL_DAY_TIME_TYPE_NAME)) {
+ return JavaDataModel.JAVA32_META;
  } else {
- return 0;
+ throw new IllegalArgumentException("Size requested for unknown type: " + colType);
  }
  }
@@ -1261,7 +1265,7 @@ public class StatsUtils {
  double avgColSize = 0;
  long countDistincts = 0;
  long numNulls = 0;
- ObjectInspector oi = null;
+ ObjectInspector oi = end.getWritableObjectInspector();
  long numRows = parentStats.getNumRows();
  if (end instanceof ExprNodeColumnDesc) {
@@ -1284,7 +1288,6 @@ public class StatsUtils {
  // virtual columns
  colType = encd.getTypeInfo().getTypeName();
  countDistincts = numRows;
- oi = encd.getWritableObjectInspector();
  } else {
  // clone the column stats and return
@@ -1303,16 +1306,13 @@ public class StatsUtils {
  // constant projection
  ExprNodeConstantDesc encd = (ExprNodeConstantDesc) end;
- // null projection
+ colName = encd.getName();
+ colType = encd.getTypeString();
  if (encd.getValue() == null) {
- colName = encd.getName();
- colType = serdeConstants.VOID_TYPE_NAME;
+ // null projection
  numNulls = numRows;
  } else {
- colName = encd.getName();
- colType = encd.getTypeString();
  countDistincts = 1;
- oi = encd.getWritableObjectInspector();
  }
  } else if (end instanceof ExprNodeGenericFuncDesc) {
  ExprNodeGenericFuncDesc engfd = (ExprNodeGenericFuncDesc) end;
@@ -1353,7 +1353,6 @@ public class StatsUtils {
  // fallback to default
  countDistincts = getNDVFor(engfd, numRows, parentStats);
- oi = engfd.getWritableObjectInspector();
  } else if (end instanceof ExprNodeColumnListDesc) {
  // column list
@@ -1361,7 +1360,6 @@ public class StatsUtils {
  colName = Joiner.on(",").join(encd.getCols());
  colType = serdeConstants.LIST_TYPE_NAME;
  countDistincts = numRows;
- oi = encd.getWritableObjectInspector();
  } else if (end instanceof ExprNodeFieldDesc) {
  // field within complex type
@@ -1369,25 +1367,12 @@ public class StatsUtils {
  colName = enfd.getFieldName();
  colType = enfd.getTypeString();
  countDistincts = numRows;
- oi = enfd.getWritableObjectInspector();
  } else {
  throw new IllegalArgumentException("not supported expr type " + end.getClass());
  }
  colType = colType.toLowerCase();
- if (colType.equals(serdeConstants.STRING_TYPE_NAME)
- || colType.equals(serdeConstants.BINARY_TYPE_NAME)
- || colType.startsWith(serdeConstants.VARCHAR_TYPE_NAME)
- || colType.startsWith(serdeConstants.CHAR_TYPE_NAME)
- || colType.startsWith(serdeConstants.LIST_TYPE_NAME)
- || colType.startsWith(serdeConstants.MAP_TYPE_NAME)
- || colType.startsWith(serdeConstants.STRUCT_TYPE_NAME)
- || colType.startsWith(serdeConstants.UNION_TYPE_NAME)) {
- avgColSize = getAvgColLenOfVariableLengthTypes(conf, oi, colType);
- } else {
- avgColSize = getAvgColLenOfFixedLengthTypes(colType);
- }
-
+ avgColSize = getAvgColLenOf(conf, oi, colType);
  ColStatistics colStats = new ColStatistics(colName, colType);
  colStats.setAvgColLen(avgColSize);
  colStats.setCountDistint(countDistincts);
@@ -1544,7 +1529,7 @@ public class StatsUtils {
  for (ColStatistics cs : colStats) {
  if (cs != null) {
  String colTypeLowerCase = cs.getColumnType().toLowerCase();
- long nonNullCount = numRows - cs.getNumNulls();
+ long nonNullCount = cs.getNumNulls() > 0 ? numRows - cs.getNumNulls() + 1 : numRows;
  double sizeOf = 0;
  if (colTypeLowerCase.equals(serdeConstants.TINYINT_TYPE_NAME)
  || colTypeLowerCase.equals(serdeConstants.SMALLINT_TYPE_NAME)

http://git-wip-us.apache.org/repos/asf/hive/blob/47dbc005/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFAverage.java
----------------------------------------------------------------------
diff --git a/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFAverage.java b/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFAverage.java
index 3c1ce26..1b65df3 100644
--- a/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFAverage.java
+++ b/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFAverage.java
@@ -18,6 +18,7 @@
 package org.apache.hadoop.hive.ql.udf.generic;
 import java.util.ArrayList;
+import java.util.HashSet;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
@@ -27,7 +28,9 @@ import org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException;
 import org.apache.hadoop.hive.ql.metadata.HiveException;
 import org.apache.hadoop.hive.ql.parse.SemanticException;
 import org.apache.hadoop.hive.ql.plan.ptf.WindowFrameDef;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.AbstractAggregationBuffer;
 import org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.AggregationBuffer;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.AggregationType;
 import org.apache.hadoop.hive.ql.util.JavaDataModel;
 import org.apache.hadoop.hive.serde2.io.DoubleWritable;
 import org.apache.hadoop.hive.serde2.io.HiveDecimalWritable;
@@ -332,10 +335,16 @@ public class GenericUDAFAverage extends AbstractGenericUDAFResolver {
  }
  }
- private static class AverageAggregationBuffer<TYPE> implements AggregationBuffer {
- private Object
previousValue;
- private long count;
+ @AggregationType(estimable = true)
+ private static class AverageAggregationBuffer<TYPE> extends AbstractAggregationBuffer {
+ private Object previousValue;
+ private long count;
  private TYPE sum;
+
+ @Override
+ public int estimate() {
+ return 2*JavaDataModel.PRIMITIVES2;
+ }
  };
  @SuppressWarnings("unchecked")

http://git-wip-us.apache.org/repos/asf/hive/blob/47dbc005/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFMax.java
----------------------------------------------------------------------
diff --git a/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFMax.java b/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFMax.java
index 43b23fa..763bfd5 100644
--- a/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFMax.java
+++ b/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFMax.java
@@ -31,6 +31,7 @@ import org.apache.hadoop.hive.ql.plan.ptf.BoundaryDef;
 import org.apache.hadoop.hive.ql.plan.ptf.WindowFrameDef;
 import org.apache.hadoop.hive.ql.udf.UDFType;
 import org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.AggregationBuffer;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.AggregationType;
 import org.apache.hadoop.hive.ql.util.JavaDataModel;
 import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
 import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils;
@@ -79,8 +80,13 @@ public class GenericUDAFMax extends AbstractGenericUDAFResolver {
  }
  /** class for storing the current max value */
+ @AggregationType(estimable = true)
  static class MaxAgg extends AbstractAggregationBuffer {
  Object o;
+ @Override
+ public int estimate() {
+ return JavaDataModel.PRIMITIVES2;
+ }
  }
  @Override
@@ -138,7 +144,7 @@ public class GenericUDAFMax extends AbstractGenericUDAFResolver {
  /*
  * Based on the Paper by Daniel Lemire: Streaming Max-Min filter using no more
  * than 3 comparisons per elem.
- *
+ *
  * 1. His algorithm works on fixed size windows up to the current row. For row
  * 'i' and window 'w' it computes the min/max for window (i-w, i). 2. The core
  * idea is to keep a queue of (max, idx) tuples. A tuple in the queue
@@ -150,7 +156,7 @@ public class GenericUDAFMax extends AbstractGenericUDAFResolver {
  * element at the front of the queue has reached its max range of influence;
  * i.e. frontTuple.idx + w > i. If yes we can remove it from the queue. - on
  * the ith step o/p the front of the queue as the max for the ith entry.
- *
+ *
  * Here we modify the algorithm: 1. to handle window's that are of the form
  * (i-p, i+f), where p is numPreceding,f = numFollowing - we start outputing
  * rows only after receiving f rows. - the formula for 'influence range' of an
@@ -192,6 +198,7 @@ public class GenericUDAFMax extends AbstractGenericUDAFResolver {
  + (3 * JavaDataModel.PRIMITIVES1);
  }
+ @Override
  protected void reset() {
  maxChain.clear();
  super.reset();

http://git-wip-us.apache.org/repos/asf/hive/blob/47dbc005/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFMin.java
----------------------------------------------------------------------
diff --git a/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFMin.java b/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFMin.java
index 70e0db1..132bad6 100644
--- a/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFMin.java
+++ b/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFMin.java
@@ -26,7 +26,9 @@ import org.apache.hadoop.hive.ql.parse.SemanticException;
 import org.apache.hadoop.hive.ql.plan.ptf.BoundaryDef;
 import org.apache.hadoop.hive.ql.plan.ptf.WindowFrameDef;
 import org.apache.hadoop.hive.ql.udf.UDFType;
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.AggregationType;
 import org.apache.hadoop.hive.ql.udf.generic.GenericUDAFMax.MaxStreamingFixedWindow;
+import org.apache.hadoop.hive.ql.util.JavaDataModel;
 import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
 import org.apache.hadoop.hive.serde2.objectinspector.FullMapEqualComparer;
 import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils;
@@ -76,8 +78,13 @@ public class GenericUDAFMin extends AbstractGenericUDAFResolver {
  }
  /** class for storing the current max value */
+ @AggregationType(estimable = true)
  static class MinAgg extends AbstractAggregationBuffer {
  Object o;
+ @Override
+ public int estimate() {
+ return JavaDataModel.PRIMITIVES2;
+ }
  }
  @Override
@@ -139,14 +146,17 @@ public class GenericUDAFMin extends AbstractGenericUDAFResolver {
  super(wrappedEval, wFrmDef);
  }
+ @Override
  protected ObjectInspector inputOI() {
  return ((GenericUDAFMinEvaluator) wrappedEval).inputOI;
  }
+ @Override
  protected ObjectInspector outputOI() {
  return ((GenericUDAFMinEvaluator) wrappedEval).outputOI;
  }
+ @Override
  protected boolean removeLast(Object in, Object last) {
  return isLess(in, last);
  }

http://git-wip-us.apache.org/repos/asf/hive/blob/47dbc005/ql/src/test/queries/clientpositive/vector_coalesce.q
----------------------------------------------------------------------
diff --git a/ql/src/test/queries/clientpositive/vector_coalesce.q b/ql/src/test/queries/clientpositive/vector_coalesce.q
index b1a7766..cfba7be 100644
--- a/ql/src/test/queries/clientpositive/vector_coalesce.q
+++ b/ql/src/test/queries/clientpositive/vector_coalesce.q
@@ -1,3 +1,4 @@
+set hive.stats.fetch.column.stats=true;
 set hive.explain.user=false;
 SET hive.vectorized.execution.enabled=true;

http://git-wip-us.apache.org/repos/asf/hive/blob/47dbc005/ql/src/test/results/clientnegative/udf_assert_true.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientnegative/udf_assert_true.q.out b/ql/src/test/results/clientnegative/udf_assert_true.q.out
index baa9074..7fc50d6 100644
--- a/ql/src/test/results/clientnegative/udf_assert_true.q.out
+++
b/ql/src/test/results/clientnegative/udf_assert_true.q.out @@ -28,13 +28,13 @@ STAGE PLANS: Select Operator expressions: assert_true((_col5 > 0)) (type: void) outputColumnNames: _col0 - Statistics: Num rows: 1000 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE + Statistics: Num rows: 1000 Data size: 4000 Basic stats: COMPLETE Column stats: COMPLETE Limit Number of rows: 2 - Statistics: Num rows: 2 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE + Statistics: Num rows: 2 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE File Output Operator compressed: false - Statistics: Num rows: 2 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE + Statistics: Num rows: 2 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat @@ -52,13 +52,13 @@ STAGE PLANS: Select Operator expressions: assert_true((_col5 > 0)) (type: void) outputColumnNames: _col0 - Statistics: Num rows: 1000 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE + Statistics: Num rows: 1000 Data size: 4000 Basic stats: COMPLETE Column stats: COMPLETE Limit Number of rows: 2 - Statistics: Num rows: 2 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE + Statistics: Num rows: 2 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE File Output Operator compressed: false - Statistics: Num rows: 2 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE + Statistics: Num rows: 2 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat @@ -105,13 +105,13 @@ STAGE PLANS: Select Operator expressions: assert_true((_col5 < 2)) (type: void) outputColumnNames: _col0 - Statistics: Num rows: 1000 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE + Statistics: Num rows: 1000 Data size: 4000 
Basic stats: COMPLETE Column stats: COMPLETE
  Limit
  Number of rows: 2
- Statistics: Num rows: 2 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE
+ Statistics: Num rows: 2 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE
  File Output Operator
  compressed: false
- Statistics: Num rows: 2 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE
+ Statistics: Num rows: 2 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE
  table:
  input format: org.apache.hadoop.mapred.SequenceFileInputFormat
  output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
@@ -129,13 +129,13 @@ STAGE PLANS:
  Select Operator
  expressions: assert_true((_col5 < 2)) (type: void)
  outputColumnNames: _col0
- Statistics: Num rows: 1000 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE
+ Statistics: Num rows: 1000 Data size: 4000 Basic stats: COMPLETE Column stats: COMPLETE
  Limit
  Number of rows: 2
- Statistics: Num rows: 2 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE
+ Statistics: Num rows: 2 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE
  File Output Operator
  compressed: false
- Statistics: Num rows: 2 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE
+ Statistics: Num rows: 2 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE
  table:
  input format: org.apache.hadoop.mapred.SequenceFileInputFormat
  output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat

http://git-wip-us.apache.org/repos/asf/hive/blob/47dbc005/ql/src/test/results/clientpositive/acid_table_stats.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/acid_table_stats.q.out b/ql/src/test/results/clientpositive/acid_table_stats.q.out
index 9a3a255..0177ad1 100644
--- a/ql/src/test/results/clientpositive/acid_table_stats.q.out
+++ b/ql/src/test/results/clientpositive/acid_table_stats.q.out
@@ -537,20 +537,20 @@ STAGE PLANS:
  aggregations: max(key)
  mode: hash
  outputColumnNames: _col0
-
Statistics: Num rows: 1 Data size: 84 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 1 Data size: 184 Basic stats: COMPLETE Column stats: NONE
  Reduce Output Operator
  sort order:
- Statistics: Num rows: 1 Data size: 84 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 1 Data size: 184 Basic stats: COMPLETE Column stats: NONE
  value expressions: _col0 (type: string)
  Reduce Operator Tree:
  Group By Operator
  aggregations: max(VALUE._col0)
  mode: mergepartial
  outputColumnNames: _col0
- Statistics: Num rows: 1 Data size: 84 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 1 Data size: 184 Basic stats: COMPLETE Column stats: NONE
  File Output Operator
  compressed: false
- Statistics: Num rows: 1 Data size: 84 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 1 Data size: 184 Basic stats: COMPLETE Column stats: NONE
  table:
  input format: org.apache.hadoop.mapred.SequenceFileInputFormat
  output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat

http://git-wip-us.apache.org/repos/asf/hive/blob/47dbc005/ql/src/test/results/clientpositive/annotate_stats_deep_filters.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/annotate_stats_deep_filters.q.out b/ql/src/test/results/clientpositive/annotate_stats_deep_filters.q.out
index b7a87fd..32644dc 100644
--- a/ql/src/test/results/clientpositive/annotate_stats_deep_filters.q.out
+++ b/ql/src/test/results/clientpositive/annotate_stats_deep_filters.q.out
@@ -118,12 +118,12 @@ STAGE PLANS:
  Map Operator Tree:
  TableScan
  alias: over1k
- Statistics: Num rows: 2098 Data size: 16736 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 2098 Data size: 16744 Basic stats: COMPLETE Column stats: COMPLETE
  Filter Operator
  predicate: (((t = 1) and (si = 2)) or ((t = 2) and (si = 3)) or ((t = 3) and (si = 4)) or ((t = 4) and (si = 5)) or ((t = 5) and (si = 6)) or ((t = 6) and (si = 7)) or
((t = 7) and (si = 8)) or ((t = 9) and (si = 10)) or ((t = 10) and (si = 11)) or ((t = 11) and (si = 12)) or ((t = 12) and (si = 13)) or ((t = 13) and (si = 14)) or ((t = 14) and (si = 15)) or ((t = 15) and (si = 16)) or ((t = 16) and (si = 17)) or ((t = 17) and (si = 18)) or ((t = 27) and (si = 28)) or ((t = 37) and (si = 38)) or ((t = 47) and (si = 48)) or ((t = 52) and (si = 53))) (type: boolean)
- Statistics: Num rows: 300 Data size: 2392 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 300 Data size: 2400 Basic stats: COMPLETE Column stats: COMPLETE
  Select Operator
- Statistics: Num rows: 300 Data size: 2392 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 300 Data size: 2400 Basic stats: COMPLETE Column stats: COMPLETE
  Group By Operator
  aggregations: count()
  mode: hash

http://git-wip-us.apache.org/repos/asf/hive/blob/47dbc005/ql/src/test/results/clientpositive/annotate_stats_filter.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/annotate_stats_filter.q.out b/ql/src/test/results/clientpositive/annotate_stats_filter.q.out
index 8e8dcc1..66dd92b 100644
--- a/ql/src/test/results/clientpositive/annotate_stats_filter.q.out
+++ b/ql/src/test/results/clientpositive/annotate_stats_filter.q.out
@@ -131,7 +131,7 @@ STAGE PLANS:
  Map Operator Tree:
  TableScan
  alias: loc_orc
- Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE
  Filter Operator
  predicate: (state = 'OH') (type: boolean)
  Statistics: Num rows: 1 Data size: 102 Basic stats: COMPLETE Column stats: COMPLETE
@@ -167,17 +167,17 @@ STAGE PLANS:
  Map Operator Tree:
  TableScan
  alias: loc_orc
- Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE
  Filter Operator
  predicate:
(state <> 'OH') (type: boolean)
- Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE
  Select Operator
  expressions: state (type: string), locid (type: int), zip (type: bigint), year (type: int)
  outputColumnNames: _col0, _col1, _col2, _col3
- Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE
  File Output Operator
  compressed: false
- Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE
  table:
  input format: org.apache.hadoop.mapred.SequenceFileInputFormat
  output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
@@ -203,17 +203,17 @@ STAGE PLANS:
  Map Operator Tree:
  TableScan
  alias: loc_orc
- Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE
  Filter Operator
  predicate: (state <> 'OH') (type: boolean)
- Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE
  Select Operator
  expressions: state (type: string), locid (type: int), zip (type: bigint), year (type: int)
  outputColumnNames: _col0, _col1, _col2, _col3
- Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE
  File Output Operator
  compressed: false
- Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE
  table:
  input format: org.apache.hadoop.mapred.SequenceFileInputFormat
  output format:
org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
@@ -239,17 +239,17 @@ STAGE PLANS:
  Map Operator Tree:
  TableScan
  alias: loc_orc
- Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE
  Filter Operator
  predicate: zip is null (type: boolean)
  Statistics: Num rows: 1 Data size: 102 Basic stats: COMPLETE Column stats: COMPLETE
  Select Operator
  expressions: state (type: string), locid (type: int), null (type: bigint), year (type: int)
  outputColumnNames: _col0, _col1, _col2, _col3
- Statistics: Num rows: 1 Data size: 94 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 1 Data size: 102 Basic stats: COMPLETE Column stats: COMPLETE
  File Output Operator
  compressed: false
- Statistics: Num rows: 1 Data size: 94 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 1 Data size: 102 Basic stats: COMPLETE Column stats: COMPLETE
  table:
  input format: org.apache.hadoop.mapred.SequenceFileInputFormat
  output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
@@ -275,17 +275,17 @@ STAGE PLANS:
  Map Operator Tree:
  TableScan
  alias: loc_orc
- Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE
  Filter Operator
  predicate: zip is null (type: boolean)
  Statistics: Num rows: 1 Data size: 102 Basic stats: COMPLETE Column stats: COMPLETE
  Select Operator
  expressions: state (type: string), locid (type: int), null (type: bigint), year (type: int)
  outputColumnNames: _col0, _col1, _col2, _col3
- Statistics: Num rows: 1 Data size: 94 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 1 Data size: 102 Basic stats: COMPLETE Column stats: COMPLETE
  File Output Operator
  compressed: false
- Statistics: Num rows: 1 Data size: 94 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 1 Data
size: 102 Basic stats: COMPLETE Column stats: COMPLETE
  table:
  input format: org.apache.hadoop.mapred.SequenceFileInputFormat
  output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
@@ -311,17 +311,17 @@ STAGE PLANS:
  Map Operator Tree:
  TableScan
  alias: loc_orc
- Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE
  Filter Operator
  predicate: zip is not null (type: boolean)
- Statistics: Num rows: 7 Data size: 702 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 7 Data size: 714 Basic stats: COMPLETE Column stats: COMPLETE
  Select Operator
  expressions: state (type: string), locid (type: int), zip (type: bigint), year (type: int)
  outputColumnNames: _col0, _col1, _col2, _col3
- Statistics: Num rows: 7 Data size: 702 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 7 Data size: 714 Basic stats: COMPLETE Column stats: COMPLETE
  File Output Operator
  compressed: false
- Statistics: Num rows: 7 Data size: 702 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 7 Data size: 714 Basic stats: COMPLETE Column stats: COMPLETE
  table:
  input format: org.apache.hadoop.mapred.SequenceFileInputFormat
  output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
@@ -347,17 +347,17 @@ STAGE PLANS:
  Map Operator Tree:
  TableScan
  alias: loc_orc
- Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE
  Filter Operator
  predicate: zip is not null (type: boolean)
- Statistics: Num rows: 7 Data size: 702 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 7 Data size: 714 Basic stats: COMPLETE Column stats: COMPLETE
  Select Operator
  expressions: state (type: string), locid (type: int), zip (type: bigint), year (type: int)
  outputColumnNames: _col0, _col1, _col2, _col3
-
Statistics: Num rows: 7 Data size: 702 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 7 Data size: 714 Basic stats: COMPLETE Column stats: COMPLETE
  File Output Operator
  compressed: false
- Statistics: Num rows: 7 Data size: 702 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 7 Data size: 714 Basic stats: COMPLETE Column stats: COMPLETE
  table:
  input format: org.apache.hadoop.mapred.SequenceFileInputFormat
  output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
@@ -383,11 +383,11 @@ STAGE PLANS:
  Processor Tree:
  TableScan
  alias: loc_orc
- Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE
  Select Operator
  expressions: state (type: string), locid (type: int), zip (type: bigint), year (type: int)
  outputColumnNames: _col0, _col1, _col2, _col3
- Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE
  ListSink
  PREHOOK: query: explain select * from loc_orc where !true
@@ -404,7 +404,7 @@ STAGE PLANS:
  Map Operator Tree:
  TableScan
  alias: loc_orc
- Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE
  Filter Operator
  predicate: false (type: boolean)
  Statistics: Num rows: 1 Data size: 102 Basic stats: COMPLETE Column stats: COMPLETE
@@ -440,11 +440,11 @@ STAGE PLANS:
  Processor Tree:
  TableScan
  alias: loc_orc
- Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE
  Select Operator
  expressions: state (type: string), locid (type: int), zip (type: bigint), year (type: int)
  outputColumnNames: _col0, _col1, _col2, _col3
- Statistics: Num rows: 8 Data size: 804 Basic
stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE
  ListSink
  PREHOOK: query: explain select * from loc_orc where 'foo'
@@ -461,17 +461,17 @@ STAGE PLANS:
  Map Operator Tree:
  TableScan
  alias: loc_orc
- Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE
  Filter Operator
  predicate: 'foo' (type: string)
- Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE
  Select Operator
  expressions: state (type: string), locid (type: int), zip (type: bigint), year (type: int)
  outputColumnNames: _col0, _col1, _col2, _col3
- Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE
  File Output Operator
  compressed: false
- Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE
  table:
  input format: org.apache.hadoop.mapred.SequenceFileInputFormat
  output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
@@ -497,11 +497,11 @@ STAGE PLANS:
  Processor Tree:
  TableScan
  alias: loc_orc
- Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE
  Select Operator
  expressions: state (type: string), locid (type: int), zip (type: bigint), year (type: int)
  outputColumnNames: _col0, _col1, _col2, _col3
- Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE
  ListSink
  PREHOOK: query: explain select * from loc_orc where false = true
@@ -518,7
+518,7 @@ STAGE PLANS:
  Map Operator Tree:
  TableScan
  alias: loc_orc
- Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE
  Filter Operator
  predicate: false (type: boolean)
  Statistics: Num rows: 1 Data size: 102 Basic stats: COMPLETE Column stats: COMPLETE
@@ -554,7 +554,7 @@ STAGE PLANS:
  Map Operator Tree:
  TableScan
  alias: loc_orc
- Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE
  Filter Operator
  predicate: false (type: boolean)
  Statistics: Num rows: 1 Data size: 102 Basic stats: COMPLETE Column stats: COMPLETE
@@ -590,7 +590,7 @@ STAGE PLANS:
  Map Operator Tree:
  TableScan
  alias: loc_orc
- Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE
  Filter Operator
  predicate: false (type: boolean)
  Statistics: Num rows: 1 Data size: 102 Basic stats: COMPLETE Column stats: COMPLETE
@@ -626,7 +626,7 @@ STAGE PLANS:
  Map Operator Tree:
  TableScan
  alias: loc_orc
- Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE
  Filter Operator
  predicate: ((state = 'OH') or (state = 'CA')) (type: boolean)
  Statistics: Num rows: 2 Data size: 204 Basic stats: COMPLETE Column stats: COMPLETE
@@ -662,7 +662,7 @@ STAGE PLANS:
  Map Operator Tree:
  TableScan
  alias: loc_orc
- Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE
  Filter Operator
  predicate: false (type: boolean)
  Statistics: Num rows: 1 Data size: 102 Basic stats: COMPLETE Column stats: COMPLETE
@@ -698,7 +698,7 @@ STAGE PLANS:
  Map Operator Tree:
  TableScan
  alias:
loc_orc
- Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE
  Filter Operator
  predicate: ((year = 2001) and (state = 'OH') and (state = 'FL')) (type: boolean)
  Statistics: Num rows: 1 Data size: 102 Basic stats: COMPLETE Column stats: COMPLETE
@@ -734,7 +734,7 @@ STAGE PLANS:
  Map Operator Tree:
  TableScan
  alias: loc_orc
- Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE
  Filter Operator
  predicate: (((year = 2001) and year is null) or (state = 'CA')) (type: boolean)
  Statistics: Num rows: 2 Data size: 204 Basic stats: COMPLETE Column stats: COMPLETE
@@ -770,7 +770,7 @@ STAGE PLANS:
  Map Operator Tree:
  TableScan
  alias: loc_orc
- Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE
  Filter Operator
  predicate: (((year = 2001) or year is null) and (state = 'CA')) (type: boolean)
  Statistics: Num rows: 1 Data size: 102 Basic stats: COMPLETE Column stats: COMPLETE
@@ -806,17 +806,17 @@ STAGE PLANS:
  Map Operator Tree:
  TableScan
  alias: loc_orc
- Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE
  Filter Operator
  predicate: (locid < 30) (type: boolean)
- Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE
  Select Operator
  expressions: state (type: string), locid (type: int), zip (type: bigint), year (type: int)
  outputColumnNames: _col0, _col1, _col2, _col3
- Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE
Column stats: COMPLETE
  File Output Operator
  compressed: false
- Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE
  table:
  input format: org.apache.hadoop.mapred.SequenceFileInputFormat
  output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
@@ -842,7 +842,7 @@ STAGE PLANS:
  Map Operator Tree:
  TableScan
  alias: loc_orc
- Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE
  Filter Operator
  predicate: (locid > 30) (type: boolean)
  Statistics: Num rows: 1 Data size: 102 Basic stats: COMPLETE Column stats: COMPLETE
@@ -878,17 +878,17 @@ STAGE PLANS:
  Map Operator Tree:
  TableScan
  alias: loc_orc
- Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE
  Filter Operator
  predicate: (locid <= 30) (type: boolean)
- Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE
  Select Operator
  expressions: state (type: string), locid (type: int), zip (type: bigint), year (type: int)
  outputColumnNames: _col0, _col1, _col2, _col3
- Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE
  File Output Operator
  compressed: false
- Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE
  table:
  input format: org.apache.hadoop.mapred.SequenceFileInputFormat
  output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
@@ -914,7 +914,7 @@ STAGE PLANS:
  Map Operator Tree:
  TableScan
  alias:
loc_orc
- Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE
  Filter Operator
  predicate: (locid >= 30) (type: boolean)
  Statistics: Num rows: 1 Data size: 102 Basic stats: COMPLETE Column stats: COMPLETE
@@ -950,7 +950,7 @@ STAGE PLANS:
  Map Operator Tree:
  TableScan
  alias: loc_orc
- Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE
  Filter Operator
  predicate: (locid < 3) (type: boolean)
  Statistics: Num rows: 2 Data size: 204 Basic stats: COMPLETE Column stats: COMPLETE
@@ -986,7 +986,7 @@ STAGE PLANS:
  Map Operator Tree:
  TableScan
  alias: loc_orc
- Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE
  Filter Operator
  predicate: (locid > 3) (type: boolean)
  Statistics: Num rows: 2 Data size: 204 Basic stats: COMPLETE Column stats: COMPLETE
@@ -1022,7 +1022,7 @@ STAGE PLANS:
  Map Operator Tree:
  TableScan
  alias: loc_orc
- Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE
  Filter Operator
  predicate: (locid <= 3) (type: boolean)
  Statistics: Num rows: 2 Data size: 204 Basic stats: COMPLETE Column stats: COMPLETE
@@ -1058,7 +1058,7 @@ STAGE PLANS:
  Map Operator Tree:
  TableScan
  alias: loc_orc
- Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE
  Filter Operator
  predicate: (locid >= 3) (type: boolean)
  Statistics: Num rows: 2 Data size: 204 Basic stats: COMPLETE Column stats: COMPLETE

http://git-wip-us.apache.org/repos/asf/hive/blob/47dbc005/ql/src/test/results/clientpositive/annotate_stats_groupby.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/annotate_stats_groupby.q.out b/ql/src/test/results/clientpositive/annotate_stats_groupby.q.out
index 24658f1..a8e4854 100644
--- a/ql/src/test/results/clientpositive/annotate_stats_groupby.q.out
+++ b/ql/src/test/results/clientpositive/annotate_stats_groupby.q.out
@@ -196,21 +196,21 @@ STAGE PLANS:
  Map Operator Tree:
  TableScan
  alias: loc_orc
- Statistics: Num rows: 8 Data size: 28 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 32 Basic stats: COMPLETE Column stats: COMPLETE
  Select Operator
  expressions: year (type: int)
  outputColumnNames: year
- Statistics: Num rows: 8 Data size: 28 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 32 Basic stats: COMPLETE Column stats: COMPLETE
  Group By Operator
  keys: year (type: int)
  mode: hash
  outputColumnNames: _col0
- Statistics: Num rows: 8 Data size: 28 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 32 Basic stats: COMPLETE Column stats: COMPLETE
  Reduce Output Operator
  key expressions: _col0 (type: int)
  sort order: +
  Map-reduce partition columns: _col0 (type: int)
- Statistics: Num rows: 8 Data size: 28 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 32 Basic stats: COMPLETE Column stats: COMPLETE
  Reduce Operator Tree:
  Group By Operator
  keys: KEY._col0 (type: int)
@@ -644,11 +644,11 @@ STAGE PLANS:
  Map Operator Tree:
  TableScan
  alias: loc_orc
- Statistics: Num rows: 8 Data size: 28 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 32 Basic stats: COMPLETE Column stats: COMPLETE
  Select Operator
  expressions: year (type: int)
  outputColumnNames: year
- Statistics: Num rows: 8 Data size: 28 Basic stats: COMPLETE Column stats:
COMPLETE
+ Statistics: Num rows: 8 Data size: 32 Basic stats: COMPLETE Column stats: COMPLETE
  Group By Operator
  keys: year (type: int)
  mode: hash

http://git-wip-us.apache.org/repos/asf/hive/blob/47dbc005/ql/src/test/results/clientpositive/annotate_stats_join.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/annotate_stats_join.q.out b/ql/src/test/results/clientpositive/annotate_stats_join.q.out
index eb8c6c5..5d4fe6c 100644
--- a/ql/src/test/results/clientpositive/annotate_stats_join.q.out
+++ b/ql/src/test/results/clientpositive/annotate_stats_join.q.out
@@ -514,19 +514,19 @@ STAGE PLANS:
  value expressions: _col1 (type: string)
  TableScan
  alias: l
- Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE
  Filter Operator
  predicate: locid is not null (type: boolean)
- Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE
  Select Operator
  expressions: state (type: string), locid (type: int), zip (type: bigint), year (type: int)
  outputColumnNames: _col0, _col1, _col2, _col3
- Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE
  Reduce Output Operator
  key expressions: _col1 (type: int)
  sort order: +
  Map-reduce partition columns: _col1 (type: int)
- Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE
  value expressions: _col0 (type: string), _col2 (type: bigint), _col3 (type: int)
  Reduce Operator Tree:
  Join Operator
@@ -598,19 +598,19 @@ STAGE PLANS:
  Statistics: Num rows: 6 Data size: 570 Basic stats: COMPLETE Column stats: COMPLETE
  TableScan
alias: l
- Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE
  Filter Operator
  predicate: (state is not null and locid is not null) (type: boolean)
- Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE
  Select Operator
  expressions: state (type: string), locid (type: int), zip (type: bigint), year (type: int)
  outputColumnNames: _col0, _col1, _col2, _col3
- Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE
  Reduce Output Operator
  key expressions: _col0 (type: string), _col1 (type: int)
  sort order: ++
  Map-reduce partition columns: _col0 (type: string), _col1 (type: int)
- Statistics: Num rows: 8 Data size: 804 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 816 Basic stats: COMPLETE Column stats: COMPLETE
  value expressions: _col2 (type: bigint), _col3 (type: int)
  Reduce Operator Tree:
  Join Operator

http://git-wip-us.apache.org/repos/asf/hive/blob/47dbc005/ql/src/test/results/clientpositive/annotate_stats_join_pkfk.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/annotate_stats_join_pkfk.q.out b/ql/src/test/results/clientpositive/annotate_stats_join_pkfk.q.out
index 5d3d809..71da136 100644
--- a/ql/src/test/results/clientpositive/annotate_stats_join_pkfk.q.out
+++ b/ql/src/test/results/clientpositive/annotate_stats_join_pkfk.q.out
@@ -379,19 +379,19 @@ STAGE PLANS:
  Statistics: Num rows: 12 Data size: 48 Basic stats: COMPLETE Column stats: COMPLETE
  TableScan
  alias: ss
- Statistics: Num rows: 1000 Data size: 3856 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 1000 Data size: 3860 Basic stats:
COMPLETE Column stats: COMPLETE
  Filter Operator
  predicate: ss_store_sk is not null (type: boolean)
- Statistics: Num rows: 964 Data size: 3716 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 964 Data size: 3720 Basic stats: COMPLETE Column stats: COMPLETE
  Select Operator
  expressions: ss_store_sk (type: int)
  outputColumnNames: _col0
- Statistics: Num rows: 964 Data size: 3716 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 964 Data size: 3720 Basic stats: COMPLETE Column stats: COMPLETE
  Reduce Output Operator
  key expressions: _col0 (type: int)
  sort order: +
  Map-reduce partition columns: _col0 (type: int)
- Statistics: Num rows: 964 Data size: 3716 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 964 Data size: 3720 Basic stats: COMPLETE Column stats: COMPLETE
  Reduce Operator Tree:
  Join Operator
  condition map:
@@ -509,19 +509,19 @@ STAGE PLANS:
  Statistics: Num rows: 12 Data size: 48 Basic stats: COMPLETE Column stats: COMPLETE
  TableScan
  alias: ss
- Statistics: Num rows: 1000 Data size: 3856 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 1000 Data size: 3860 Basic stats: COMPLETE Column stats: COMPLETE
  Filter Operator
  predicate: (ss_store_sk > 0) (type: boolean)
- Statistics: Num rows: 1000 Data size: 3856 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 1000 Data size: 3860 Basic stats: COMPLETE Column stats: COMPLETE
  Select Operator
  expressions: ss_store_sk (type: int)
  outputColumnNames: _col0
- Statistics: Num rows: 1000 Data size: 3856 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 1000 Data size: 3860 Basic stats: COMPLETE Column stats: COMPLETE
  Reduce Output Operator
  key expressions: _col0 (type: int)
  sort order: +
  Map-reduce partition columns: _col0 (type: int)
- Statistics: Num rows: 1000 Data size: 3856 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 1000 Data size: 3860 Basic stats: COMPLETE Column stats: COMPLETE
  Reduce Operator Tree:
  Join Operator
  condition map:
@@ -574,19 +574,19 @@ STAGE PLANS:
  Statistics: Num rows: 4 Data size: 16 Basic stats: COMPLETE Column stats: PARTIAL
  TableScan
  alias: ss
- Statistics: Num rows: 1000 Data size: 7668 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 1000 Data size: 7676 Basic stats: COMPLETE Column stats: COMPLETE
  Filter Operator
  predicate: ((ss_quantity > 10) and ss_store_sk is not null) (type: boolean)
- Statistics: Num rows: 321 Data size: 2460 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 321 Data size: 2468 Basic stats: COMPLETE Column stats: COMPLETE
  Select Operator
  expressions: ss_store_sk (type: int)
  outputColumnNames: _col0
- Statistics: Num rows: 321 Data size: 2460 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 321 Data size: 2468 Basic stats: COMPLETE Column stats: COMPLETE
  Reduce Output Operator
  key expressions: _col0 (type: int)
  sort order: +
  Map-reduce partition columns: _col0 (type: int)
- Statistics: Num rows: 321 Data size: 2460 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 321 Data size: 2468 Basic stats: COMPLETE Column stats: COMPLETE
  Reduce Operator Tree:
  Join Operator
  condition map:
@@ -639,19 +639,19 @@ STAGE PLANS:
  Statistics: Num rows: 12 Data size: 96 Basic stats: COMPLETE Column stats: COMPLETE
  TableScan
  alias: ss
- Statistics: Num rows: 1000 Data size: 3856 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 1000 Data size: 3860 Basic stats: COMPLETE Column stats: COMPLETE
  Filter Operator
  predicate: ss_store_sk is not null (type: boolean)
- Statistics: Num rows: 964 Data size: 3716 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 964 Data size: 3720 Basic stats: COMPLETE Column stats: COMPLETE
  Select Operator
  expressions: ss_store_sk (type: int)
  outputColumnNames: _col0
- Statistics: Num rows: 964 Data size: 3716 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows:
964 Data size: 3720 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: _col0 (type: int) sort order: + Map-reduce partition columns: _col0 (type: int) - Statistics: Num rows: 964 Data size: 3716 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 964 Data size: 3720 Basic stats: COMPLETE Column stats: COMPLETE Reduce Operator Tree: Join Operator condition map: @@ -704,19 +704,19 @@ STAGE PLANS: Statistics: Num rows: 12 Data size: 48 Basic stats: COMPLETE Column stats: COMPLETE TableScan alias: ss - Statistics: Num rows: 1000 Data size: 7668 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 1000 Data size: 7676 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: ((ss_quantity > 10) and ss_store_sk is not null) (type: boolean) - Statistics: Num rows: 321 Data size: 2460 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 321 Data size: 2468 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: ss_store_sk (type: int) outputColumnNames: _col0 - Statistics: Num rows: 321 Data size: 2460 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 321 Data size: 2468 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: _col0 (type: int) sort order: + Map-reduce partition columns: _col0 (type: int) - Statistics: Num rows: 321 Data size: 2460 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 321 Data size: 2468 Basic stats: COMPLETE Column stats: COMPLETE Reduce Operator Tree: Join Operator condition map: @@ -754,19 +754,19 @@ STAGE PLANS: Map Operator Tree: TableScan alias: ss - Statistics: Num rows: 1000 Data size: 3856 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 1000 Data size: 3860 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: ss_store_sk is not null (type: boolean) - Statistics: Num rows: 964 Data size: 3716 Basic stats: COMPLETE 
Column stats: COMPLETE + Statistics: Num rows: 964 Data size: 3720 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: ss_store_sk (type: int) outputColumnNames: _col0 - Statistics: Num rows: 964 Data size: 3716 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 964 Data size: 3720 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: _col0 (type: int) sort order: + Map-reduce partition columns: _col0 (type: int) - Statistics: Num rows: 964 Data size: 3716 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 964 Data size: 3720 Basic stats: COMPLETE Column stats: COMPLETE TableScan alias: s Statistics: Num rows: 12 Data size: 48 Basic stats: COMPLETE Column stats: COMPLETE @@ -840,7 +840,7 @@ STAGE PLANS: Map Operator Tree: TableScan alias: ss - Statistics: Num rows: 1000 Data size: 3856 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 1000 Data size: 3860 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: (ss_store_sk > 1000) (type: boolean) Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: COMPLETE @@ -926,19 +926,19 @@ STAGE PLANS: Map Operator Tree: TableScan alias: ss - Statistics: Num rows: 1000 Data size: 3856 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 1000 Data size: 3860 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: ss_store_sk is not null (type: boolean) - Statistics: Num rows: 964 Data size: 3716 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 964 Data size: 3720 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: ss_store_sk (type: int) outputColumnNames: _col0 - Statistics: Num rows: 964 Data size: 3716 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 964 Data size: 3720 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: _col0 (type: int) sort order: + 
Map-reduce partition columns: _col0 (type: int) - Statistics: Num rows: 964 Data size: 3716 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 964 Data size: 3720 Basic stats: COMPLETE Column stats: COMPLETE TableScan alias: s Statistics: Num rows: 12 Data size: 96 Basic stats: COMPLETE Column stats: COMPLETE @@ -1012,19 +1012,19 @@ STAGE PLANS: Map Operator Tree: TableScan alias: ss - Statistics: Num rows: 1000 Data size: 7668 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 1000 Data size: 7676 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: ((ss_quantity > 10) and ss_store_sk is not null) (type: boolean) - Statistics: Num rows: 321 Data size: 2460 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 321 Data size: 2468 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: ss_store_sk (type: int) outputColumnNames: _col0 - Statistics: Num rows: 321 Data size: 2460 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 321 Data size: 2468 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: _col0 (type: int) sort order: + Map-reduce partition columns: _col0 (type: int) - Statistics: Num rows: 321 Data size: 2460 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 321 Data size: 2468 Basic stats: COMPLETE Column stats: COMPLETE TableScan alias: s Statistics: Num rows: 12 Data size: 48 Basic stats: COMPLETE Column stats: COMPLETE @@ -1099,19 +1099,19 @@ STAGE PLANS: Map Operator Tree: TableScan alias: ss - Statistics: Num rows: 1000 Data size: 7656 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 1000 Data size: 7664 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: (ss_store_sk is not null and ss_addr_sk is not null) (type: boolean) - Statistics: Num rows: 916 Data size: 7012 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 916 Data size: 7020 
Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: ss_addr_sk (type: int), ss_store_sk (type: int) outputColumnNames: _col0, _col1 - Statistics: Num rows: 916 Data size: 7012 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 916 Data size: 7020 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: _col1 (type: int) sort order: + Map-reduce partition columns: _col1 (type: int) - Statistics: Num rows: 916 Data size: 7012 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 916 Data size: 7020 Basic stats: COMPLETE Column stats: COMPLETE value expressions: _col0 (type: int) TableScan alias: s