HIVE-18108: in case basic stats are missing; rowcount estimation depends on the selected columns size (Zoltan Haindrich, reviewed by Ashutosh Chauhan)
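
A minimal sketch of the arithmetic this change touches, assuming the missing row count is estimated as data size divided by an average row size. The class RowCountSketch, the method estimateRows, and the body of the scalar safeMult are hypothetical illustrations, not Hive's actual code; only the list form of safeMult mirrors the helper added in the diff.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; not part of Hive.
public class RowCountSketch {

    // Assumed behaviour: scale a size by a factor, saturating instead of overflowing.
    public static long safeMult(long a, float b) {
        double r = (double) a * b;
        return r >= Long.MAX_VALUE ? Long.MAX_VALUE : (long) r;
    }

    // Mirrors the list overload in the diff: scale every per-partition size.
    public static List<Long> safeMult(List<Long> sizes, float factor) {
        List<Long> ret = new ArrayList<>();
        for (Long s : sizes) {
            ret.add(safeMult(s, factor));
        }
        return ret;
    }

    // Assumed estimation rule: rows = dataSize / avgRowSize, or -1 if unknown.
    public static long estimateRows(long dataSize, int avgRowSize) {
        return avgRowSize > 0 ? dataSize / avgRowSize : -1;
    }

    public static void main(String[] args) {
        // Two partitions with on-disk sizes, scaled by a deserialization factor of 10.
        List<Long> scaled = safeMult(Arrays.asList(1000L, 3000L), 10.0f);
        long total = 0;
        for (long s : scaled) {
            total += s;
        }
        // Row size over the full schema vs. over only the selected columns:
        // the same table yields different row counts, which is the reported problem.
        System.out.println(estimateRows(total, 400)); // prints 100
        System.out.println(estimateRows(total, 100)); // prints 400
    }
}
```

Deriving the row size from the full schema keeps the estimate the same for every query on the table, whichever columns it selects.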

Signed-off-by: Zoltan Haindrich <k...@rxd.hu>

Project: http://git-wip-us.apache.org/repos/asf/hive/repo
Commit: http://git-wip-us.apache.org/repos/asf/hive/commit/d0fa7d55
Tree: http://git-wip-us.apache.org/repos/asf/hive/tree/d0fa7d55
Diff: http://git-wip-us.apache.org/repos/asf/hive/diff/d0fa7d55

Branch: refs/heads/master
Commit: d0fa7d5515bde6819fcdb54648d678b2448b7731
Parents: 8e32172
Author: Zoltan Haindrich <k...@rxd.hu>
Authored: Fri Jan 5 11:17:24 2018 +0100
Committer: Zoltan Haindrich <k...@rxd.hu>
Committed: Fri Jan 5 11:17:24 2018 +0100

----------------------------------------------------------------------
 .../apache/hadoop/hive/ql/stats/StatsUtils.java | 48 +-
 .../bucket_mapjoin_mismatch1.q.out | 36 +-
 .../clientpositive/acid_table_stats.q.out | 20 +-
 .../analyze_table_null_partition.q.out | 4 +-
 .../clientpositive/annotate_stats_part.q.out | 10 +-
 .../clientpositive/annotate_stats_table.q.out | 4 +-
 .../clientpositive/autoColumnStats_2.q.out | 4 +-
 .../clientpositive/autoColumnStats_5.q.out | 18 +-
 .../clientpositive/autoColumnStats_5a.q.out | 18 +-
 .../clientpositive/auto_sortmerge_join_1.q.out | 42 +-
 .../clientpositive/auto_sortmerge_join_11.q.out | 60 +--
 .../clientpositive/auto_sortmerge_join_12.q.out | 326 +++++++-------
 .../clientpositive/auto_sortmerge_join_2.q.out | 36 +-
 .../clientpositive/auto_sortmerge_join_3.q.out | 42 +-
 .../clientpositive/auto_sortmerge_join_4.q.out | 42 +-
 .../clientpositive/auto_sortmerge_join_7.q.out | 42 +-
 .../avro_schema_evolution_native.q.out | 10 +-
 .../clientpositive/beeline/smb_mapjoin_10.q.out | 4 +-
 .../clientpositive/bucket_map_join_spark1.q.out | 40 +-
 .../clientpositive/bucket_map_join_spark2.q.out | 40 +-
 .../clientpositive/bucket_map_join_spark3.q.out | 40 +-
 .../clientpositive/bucketcontext_1.q.out | 14 +-
 .../clientpositive/bucketcontext_2.q.out | 14 +-
 .../clientpositive/bucketcontext_3.q.out | 14 +-
 .../clientpositive/bucketcontext_4.q.out | 14 +-
 .../clientpositive/bucketcontext_6.q.out | 10 +-
 .../clientpositive/bucketcontext_7.q.out | 14 +-
 .../clientpositive/bucketcontext_8.q.out | 14 +-
 .../clientpositive/bucketmapjoin10.q.out | 10 +-
 .../clientpositive/bucketmapjoin11.q.out | 20 +-
 .../clientpositive/bucketmapjoin12.q.out | 20 +-
 .../results/clientpositive/bucketmapjoin5.q.out | 24 +-
 .../results/clientpositive/bucketmapjoin8.q.out | 20 +-
 .../results/clientpositive/bucketmapjoin9.q.out | 20 +-
 .../clientpositive/bucketmapjoin_negative.q.out | 12 +-
 .../bucketmapjoin_negative2.q.out | 12 +-
 .../clientpositive/columnstats_partlvl.q.out | 56 +--
 .../clientpositive/columnstats_partlvl_dp.q.out | 28 +-
 .../test/results/clientpositive/combine2.q.out | 12 +-
 .../clientpositive/groupby_sort_11.q.out | 8 +-
 .../results/clientpositive/groupby_sort_6.q.out | 16 +-
 .../infer_bucket_sort_dyn_part.q.out | 26 +-
 .../infer_bucket_sort_num_buckets.q.out | 22 +-
 .../insert1_overwrite_partitions.q.out | 38 +-
 .../insert2_overwrite_partitions.q.out | 48 +-
 .../results/clientpositive/insert_into2.q.out | 4 +-
 .../list_bucket_query_oneskew_1.q.out | 18 +-
 .../list_bucket_query_oneskew_2.q.out | 42 +-
 .../list_bucket_query_oneskew_3.q.out | 6 +-
 .../llap/auto_sortmerge_join_1.q.out | 42 +-
 .../llap/auto_sortmerge_join_11.q.out | 56 +--
 .../llap/auto_sortmerge_join_12.q.out | 102 ++---
 .../llap/auto_sortmerge_join_2.q.out | 28 +-
 .../llap/auto_sortmerge_join_3.q.out | 42 +-
 .../llap/auto_sortmerge_join_4.q.out | 42 +-
 .../llap/auto_sortmerge_join_7.q.out | 42 +-
 .../llap/auto_sortmerge_join_8.q.out | 42 +-
 .../clientpositive/llap/bucketmapjoin1.q.out | 32 +-
 .../clientpositive/llap/bucketmapjoin2.q.out | 72 +--
 .../clientpositive/llap/bucketmapjoin3.q.out | 48 +-
 .../clientpositive/llap/bucketmapjoin7.q.out | 26 +-
 .../llap/column_table_stats.q.out | 12 +-
 .../llap/dynamic_partition_pruning.q.out | 132 +++---
 .../llap/dynamic_partition_pruning_2.q.out | 6 +-
 .../llap/dynamic_semijoin_reduction.q.out | 38 +-
 .../llap/dynamic_semijoin_user_level.q.out | 4 +-
 .../llap/dynpart_sort_opt_vectorization.q.out | 8 +-
 .../clientpositive/llap/explainuser_1.q.out | 20 +-
 .../clientpositive/llap/insert_into2.q.out | 4 +-
 .../llap/join_reordering_no_stats.q.out | 20 +-
 .../clientpositive/llap/llap_partitioned.q.out | 2 +-
 .../llap/partition_multilevels.q.out | 8 +-
 .../results/clientpositive/llap/stats11.q.out | 32 +-
 .../llap/vector_partitioned_date_time.q.out | 16 +-
 .../vectorized_dynamic_partition_pruning.q.out | 126 +++---
 .../merge_dynamic_partition.q.out | 54 +--
 .../merge_dynamic_partition2.q.out | 18 +-
 .../merge_dynamic_partition3.q.out | 18 +-
 .../results/clientpositive/nullgroup3.q.out | 8 +-
 .../results/clientpositive/nullgroup5.q.out | 12 +-
 .../clientpositive/partition_boolexpr.q.out | 12 +-
 ql/src/test/results/clientpositive/pcs.q.out | 6 +-
 .../test/results/clientpositive/regex_col.q.out | 12 +-
 .../test/results/clientpositive/row__id.q.out | 18 +-
 .../results/clientpositive/smb_mapjoin_10.q.out | 4 +-
 .../spark/auto_sortmerge_join_1.q.out | 24 +-
 .../spark/auto_sortmerge_join_12.q.out | 196 ++++----
 .../spark/auto_sortmerge_join_2.q.out | 16 +-
 .../spark/auto_sortmerge_join_3.q.out | 24 +-
 .../spark/auto_sortmerge_join_4.q.out | 24 +-
 .../spark/auto_sortmerge_join_7.q.out | 24 +-
 .../spark/auto_sortmerge_join_8.q.out | 24 +-
 .../spark/bucket_map_join_spark1.q.out | 36 +-
 .../spark/bucket_map_join_spark2.q.out | 36 +-
 .../spark/bucket_map_join_spark3.q.out | 36 +-
 .../clientpositive/spark/bucketmapjoin1.q.out | 28 +-
 .../clientpositive/spark/bucketmapjoin10.q.out | 10 +-
 .../clientpositive/spark/bucketmapjoin11.q.out | 20 +-
 .../clientpositive/spark/bucketmapjoin12.q.out | 20 +-
 .../clientpositive/spark/bucketmapjoin2.q.out | 66 +--
 .../clientpositive/spark/bucketmapjoin3.q.out | 44 +-
 .../clientpositive/spark/bucketmapjoin5.q.out | 20 +-
 .../clientpositive/spark/bucketmapjoin7.q.out | 20 +-
 .../clientpositive/spark/bucketmapjoin8.q.out | 20 +-
 .../clientpositive/spark/bucketmapjoin9.q.out | 20 +-
 .../spark/bucketmapjoin_negative.q.out | 10 +-
 .../spark/bucketmapjoin_negative2.q.out | 10 +-
 .../clientpositive/spark/insert_into2.q.out | 4 +-
 .../clientpositive/spark/smb_mapjoin_10.q.out | 10 +-
 .../spark/spark_dynamic_partition_pruning.q.out | 450 +++++++++----------
 .../spark_dynamic_partition_pruning_2.q.out | 20 +-
 .../spark_dynamic_partition_pruning_4.q.out | 20 +-
 .../spark/spark_explainuser_1.q.out | 20 +-
 ...k_vectorized_dynamic_partition_pruning.q.out | 444 +++++++++---------
 .../results/clientpositive/spark/stats10.q.out | 2 +-
 .../results/clientpositive/spark/stats12.q.out | 2 +-
 .../results/clientpositive/spark/stats13.q.out | 2 +-
 .../results/clientpositive/spark/stats2.q.out | 2 +-
 .../results/clientpositive/spark/stats7.q.out | 2 +-
 .../results/clientpositive/spark/stats8.q.out | 10 +-
 .../results/clientpositive/spark/stats9.q.out | 2 +-
 .../clientpositive/spark/stats_noscan_2.q.out | 4 +-
 .../clientpositive/spark/union_remove_1.q.out | 16 +-
 .../clientpositive/spark/union_remove_10.q.out | 24 +-
 .../clientpositive/spark/union_remove_11.q.out | 24 +-
 .../clientpositive/spark/union_remove_12.q.out | 18 +-
 .../clientpositive/spark/union_remove_13.q.out | 24 +-
 .../clientpositive/spark/union_remove_14.q.out | 18 +-
 .../clientpositive/spark/union_remove_15.q.out | 20 +-
 .../clientpositive/spark/union_remove_16.q.out | 20 +-
 .../clientpositive/spark/union_remove_17.q.out | 16 +-
 .../clientpositive/spark/union_remove_19.q.out | 52 +--
 .../clientpositive/spark/union_remove_2.q.out | 24 +-
 .../clientpositive/spark/union_remove_20.q.out | 20 +-
 .../clientpositive/spark/union_remove_21.q.out | 16 +-
 .../clientpositive/spark/union_remove_22.q.out | 40 +-
 .../clientpositive/spark/union_remove_23.q.out | 38 +-
 .../clientpositive/spark/union_remove_24.q.out | 20 +-
 .../clientpositive/spark/union_remove_25.q.out | 16 +-
 .../clientpositive/spark/union_remove_3.q.out | 24 +-
 .../clientpositive/spark/union_remove_4.q.out | 16 +-
 .../clientpositive/spark/union_remove_5.q.out | 24 +-
 .../clientpositive/spark/union_remove_6.q.out | 20 +-
 .../spark/union_remove_6_subq.q.out | 20 +-
 .../clientpositive/spark/union_remove_7.q.out | 16 +-
 .../clientpositive/spark/union_remove_8.q.out | 24 +-
 .../clientpositive/spark/union_remove_9.q.out | 28 +-
 .../clientpositive/spark/union_view.q.out | 40 +-
 .../vectorization_parquet_projection.q.out | 4 +-
 .../test/results/clientpositive/stats10.q.out | 2 +-
 .../test/results/clientpositive/stats12.q.out | 2 +-
 .../test/results/clientpositive/stats13.q.out | 2 +-
 ql/src/test/results/clientpositive/stats2.q.out | 2 +-
 ql/src/test/results/clientpositive/stats7.q.out | 2 +-
 ql/src/test/results/clientpositive/stats8.q.out | 10 +-
 ql/src/test/results/clientpositive/stats9.q.out | 2 +-
 .../results/clientpositive/stats_noscan_2.q.out | 4 +-
 .../results/clientpositive/stats_ppr_all.q.out | 2 +-
 .../clientpositive/tez/explainanalyze_5.q.out | 2 +-
 .../results/clientpositive/union_remove_1.q.out | 24 +-
 .../clientpositive/union_remove_10.q.out | 24 +-
 .../clientpositive/union_remove_11.q.out | 30 +-
 .../clientpositive/union_remove_12.q.out | 18 +-
 .../clientpositive/union_remove_13.q.out | 24 +-
 .../clientpositive/union_remove_14.q.out | 18 +-
 .../clientpositive/union_remove_15.q.out | 28 +-
 .../clientpositive/union_remove_16.q.out | 28 +-
 .../clientpositive/union_remove_17.q.out | 20 +-
 .../clientpositive/union_remove_19.q.out | 76 ++--
 .../results/clientpositive/union_remove_2.q.out | 24 +-
 .../clientpositive/union_remove_20.q.out | 28 +-
 .../clientpositive/union_remove_21.q.out | 24 +-
 .../clientpositive/union_remove_22.q.out | 56 +--
 .../clientpositive/union_remove_23.q.out | 38 +-
 .../clientpositive/union_remove_24.q.out | 28 +-
 .../clientpositive/union_remove_25.q.out | 24 +-
 .../results/clientpositive/union_remove_3.q.out | 30 +-
 .../results/clientpositive/union_remove_4.q.out | 24 +-
 .../results/clientpositive/union_remove_5.q.out | 24 +-
 .../results/clientpositive/union_remove_6.q.out | 32 +-
 .../clientpositive/union_remove_6_subq.q.out | 32 +-
 .../results/clientpositive/union_remove_7.q.out | 24 +-
 .../results/clientpositive/union_remove_8.q.out | 24 +-
 .../results/clientpositive/union_remove_9.q.out | 32 +-
 .../results/clientpositive/union_view.q.out | 64 +--
 .../clientpositive/vector_gather_stats.q.out | 2 +-
 .../vectorization_parquet_projection.q.out | 4 +-
 187 files changed, 2898 insertions(+), 2886 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/hive/blob/d0fa7d55/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java
----------------------------------------------------------------------
diff --git a/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java b/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java
index 05c9380..e265863 100644
--- a/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java
+++ b/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java
@@ -189,11 +189,6 @@ public class StatsUtils {
    */
   public static long getNumRows(HiveConf conf, List<ColumnInfo> schema, Table table,
       PrunedPartitionList partitionList, AtomicInteger noColsMissingStats) {
-    //for non-partitioned table
-    List<String> neededColumns = new ArrayList<>();
-    for(ColumnInfo ci:schema) {
-      neededColumns.add(ci.getInternalName());
-    }
     boolean shouldEstimateStats = HiveConf.getBoolVar(conf, ConfVars.HIVE_STATS_ESTIMATE_STATS);
@@ -213,7 +208,7 @@ public class StatsUtils {
       }
       // go ahead with the estimation
       long ds = getDataSize(conf, table);
-      return getNumRows(conf, schema, neededColumns, table, ds);
+      return getNumRows(conf, schema, table, ds);
     } else {
       // partitioned table
       long nr = 0;
@@ -242,9 +237,12 @@ public class StatsUtils {
       ds = getSumIgnoreNegatives(dataSizes);
+      float deserFactor = HiveConf.getFloatVar(conf, HiveConf.ConfVars.HIVE_STATS_DESERIALIZATION_FACTOR);
+
       if (ds <= 0) {
         dataSizes = getBasicStatForPartitions(
             table, partitionList.getNotDeniedPartns(), StatsSetupConst.TOTAL_SIZE);
+        dataSizes = safeMult(dataSizes, deserFactor);
         ds = getSumIgnoreNegatives(dataSizes);
       }
@@ -252,13 +250,11 @@ public class StatsUtils {
       // sizes
       if (ds <= 0 && shouldEstimateStats) {
         dataSizes = getFileSizeForPartitions(conf, partitionList.getNotDeniedPartns());
+        dataSizes = safeMult(dataSizes, deserFactor);
+        ds = getSumIgnoreNegatives(dataSizes);
       }
-      ds = getSumIgnoreNegatives(dataSizes);
-      float deserFactor =
-          HiveConf.getFloatVar(conf, HiveConf.ConfVars.HIVE_STATS_DESERIALIZATION_FACTOR);
-      ds = (long) (ds * deserFactor);
-      int avgRowSize = estimateRowSizeFromSchema(conf, schema, neededColumns);
+      int avgRowSize = estimateRowSizeFromSchema(conf, schema);
       if (avgRowSize > 0) {
         setUnknownRcDsToAverage(rowCounts, dataSizes, avgRowSize);
         nr = getSumIgnoreNegatives(rowCounts);
@@ -296,14 +292,13 @@ public class StatsUtils {
     }
   }
-  private static long getNumRows(HiveConf conf, List<ColumnInfo> schema, List<String> neededColumns,
-      Table table, long ds) {
+  private static long getNumRows(HiveConf conf, List<ColumnInfo> schema, Table table, long ds) {
     long nr = getNumRows(table);
     // number of rows -1 means that statistics from metastore is not reliable
     // and 0 means statistics gathering is disabled
     // estimate only if num rows is -1 since 0 could be actual number of rows
     if (nr < 0) {
-      int avgRowSize = estimateRowSizeFromSchema(conf, schema, neededColumns);
+      int avgRowSize = estimateRowSizeFromSchema(conf, schema);
       if (avgRowSize > 0) {
         if (LOG.isDebugEnabled()) {
           LOG.debug("Estimated average row size: " + avgRowSize);
@@ -341,7 +336,7 @@ public class StatsUtils {
     //getDataSize tries to estimate stats if it doesn't exist using file size
     // we would like to avoid file system calls if it too expensive
     long ds = shouldEstimateStats? getDataSize(conf, table): getRawDataSize(table);
-    long nr = getNumRows(conf, schema, neededColumns, table, ds);
+    long nr = getNumRows(conf, schema, table, ds);
     List<ColStatistics> colStats = Lists.newArrayList();
     if (fetchColStats) {
       colStats = getTableColumnStats(table, schema, neededColumns, colStatsCache);
@@ -377,6 +372,7 @@ public class StatsUtils {
       ds = getSumIgnoreNegatives(dataSizes);
       if (ds <= 0) {
         dataSizes = getBasicStatForPartitions(table, partList.getNotDeniedPartns(), StatsSetupConst.TOTAL_SIZE);
+        dataSizes = safeMult(dataSizes, deserFactor);
         ds = getSumIgnoreNegatives(dataSizes);
       }
@@ -384,11 +380,11 @@ public class StatsUtils {
       // sizes
       if (ds <= 0 && shouldEstimateStats) {
         dataSizes = getFileSizeForPartitions(conf, partList.getNotDeniedPartns());
+        dataSizes = safeMult(dataSizes, deserFactor);
+        ds = getSumIgnoreNegatives(dataSizes);
       }
-      ds = getSumIgnoreNegatives(dataSizes);
-      ds = (long) (ds * deserFactor);
-      int avgRowSize = estimateRowSizeFromSchema(conf, schema, neededColumns);
+      int avgRowSize = estimateRowSizeFromSchema(conf, schema);
       if (avgRowSize > 0) {
         setUnknownRcDsToAverage(rowCounts, dataSizes, avgRowSize);
         nr = getSumIgnoreNegatives(rowCounts);
@@ -768,6 +764,14 @@ public class StatsUtils {
     }
   }
+  public static int estimateRowSizeFromSchema(HiveConf conf, List<ColumnInfo> schema) {
+    List<String> neededColumns = new ArrayList<>();
+    for (ColumnInfo ci : schema) {
+      neededColumns.add(ci.getInternalName());
+    }
+    return estimateRowSizeFromSchema(conf, schema, neededColumns);
+  }
+
   public static int estimateRowSizeFromSchema(HiveConf conf, List<ColumnInfo> schema, List<String> neededColumns) {
     int avgRowSize = 0;
@@ -1937,6 +1941,14 @@ public class StatsUtils {
     }
   }
+  public static List<Long> safeMult(List<Long> l, float b) {
+    List<Long> ret = new ArrayList<>();
+    for (Long a : l) {
+      ret.add(safeMult(a, b));
+    }
+    return ret;
+  }
+
   public static boolean hasDiscreteRange(ColStatistics colStat) {
     if (colStat.getRange() != null) {
       TypeInfo colType = TypeInfoUtils.getTypeInfoFromTypeString(colStat.getColumnType());

http://git-wip-us.apache.org/repos/asf/hive/blob/d0fa7d55/ql/src/test/results/clientnegative/bucket_mapjoin_mismatch1.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientnegative/bucket_mapjoin_mismatch1.q.out b/ql/src/test/results/clientnegative/bucket_mapjoin_mismatch1.q.out
index 79f1f93..53bbeae 100644
--- a/ql/src/test/results/clientnegative/bucket_mapjoin_mismatch1.q.out
+++ b/ql/src/test/results/clientnegative/bucket_mapjoin_mismatch1.q.out
@@ -94,35 +94,35 @@ STAGE PLANS:
  Map Operator Tree:
  TableScan
  alias: a
- Statistics: Num rows: 40 Data size: 4200 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 108 Data size: 42000 Basic stats: COMPLETE Column stats: NONE
  Filter Operator
  predicate: key is not null (type: boolean)
- Statistics: Num rows: 40 Data size: 4200 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 108 Data size: 42000 Basic stats: COMPLETE Column stats: NONE
  Select Operator
  expressions: key (type: int), value (type: string)
  outputColumnNames: _col0, _col1
- Statistics: Num rows: 40 Data size: 4200 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 108 Data size: 42000 Basic stats: COMPLETE Column stats: NONE
  Reduce Output Operator
  key expressions: _col0 (type: int)
  sort order: +
  Map-reduce partition columns: _col0 (type: int)
- Statistics: Num rows: 40 Data size: 4200 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 108 Data size: 42000 Basic stats: COMPLETE Column stats: NONE
  value expressions: _col1 (type: string)
  TableScan
  alias: b
- Statistics: Num rows: 29 Data size: 3062 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 78 Data size: 30620 Basic stats: COMPLETE Column stats: NONE
  Filter Operator
  predicate: key is not null (type: boolean)
- Statistics: Num rows: 29 Data size: 3062 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 78 Data size: 30620 Basic stats: COMPLETE Column stats: NONE
  Select Operator
  expressions: key (type: int), value (type: string)
  outputColumnNames: _col0, _col1
- Statistics: Num rows: 29 Data size: 3062 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 78 Data size: 30620 Basic stats: COMPLETE Column stats: NONE
  Reduce Output Operator
  key expressions: _col0 (type: int)
  sort order: +
  Map-reduce partition columns: _col0 (type: int)
- Statistics: Num rows: 29 Data size: 3062 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 78 Data size: 30620 Basic stats: COMPLETE Column stats: NONE
  value expressions: _col1 (type: string)
  Reduce Operator Tree:
  Join Operator
@@ -132,14 +132,14 @@ STAGE PLANS:
  0 _col0 (type: int)
  1 _col0 (type: int)
  outputColumnNames: _col0, _col1, _col4
- Statistics: Num rows: 44 Data size: 4620 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 118 Data size: 46200 Basic stats: COMPLETE Column stats: NONE
  Select Operator
  expressions: _col0 (type: int), _col1 (type: string), _col4 (type: string)
  outputColumnNames: _col0, _col1, _col2
- Statistics: Num rows: 44 Data size: 4620 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 118 Data size: 46200 Basic stats: COMPLETE Column stats: NONE
  File Output Operator
  compressed: false
- Statistics: Num rows: 44 Data size: 4620 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 118 Data size: 46200 Basic stats: COMPLETE Column stats: NONE
  table:
  input format: org.apache.hadoop.mapred.SequenceFileInputFormat
  output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
@@ -177,10 +177,10 @@ STAGE PLANS:
  b
  TableScan
  alias: b
- Statistics: Num rows: 29 Data size: 3062 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 102 Data size: 30620 Basic stats: COMPLETE Column stats: NONE
  Filter Operator
  predicate: key is not null (type: boolean)
- Statistics: Num rows: 29 Data size: 3062 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 102 Data size: 30620 Basic stats: COMPLETE Column stats: NONE
  HashTable Sink Operator
  keys:
  0 key (type: int)
@@ -191,10 +191,10 @@ STAGE PLANS:
  Map Operator Tree:
  TableScan
  alias: a
- Statistics: Num rows: 40 Data size: 4200 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 140 Data size: 42000 Basic stats: COMPLETE Column stats: NONE
  Filter Operator
  predicate: key is not null (type: boolean)
- Statistics: Num rows: 40 Data size: 4200 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 140 Data size: 42000 Basic stats: COMPLETE Column stats: NONE
  Map Join Operator
  condition map:
  Inner Join 0 to 1
@@ -202,14 +202,14 @@ STAGE PLANS:
  0 key (type: int)
  1 key (type: int)
  outputColumnNames: _col0, _col1, _col7
- Statistics: Num rows: 44 Data size: 4620 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 154 Data size: 46200 Basic stats: COMPLETE Column stats: NONE
  Select Operator
  expressions: _col0 (type: int), _col1 (type: string), _col7 (type: string)
  outputColumnNames: _col0, _col1, _col2
- Statistics: Num rows: 44 Data size: 4620 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 154 Data size: 46200 Basic stats: COMPLETE Column stats: NONE
  File Output Operator
  compressed: false
- Statistics: Num rows: 44 Data size: 4620 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 154 Data size: 46200 Basic stats: COMPLETE Column stats: NONE
  table:
  input format: org.apache.hadoop.mapred.SequenceFileInputFormat
  output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat

http://git-wip-us.apache.org/repos/asf/hive/blob/d0fa7d55/ql/src/test/results/clientpositive/acid_table_stats.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/acid_table_stats.q.out b/ql/src/test/results/clientpositive/acid_table_stats.q.out
index 05a03d2..74d4c44 100644
--- a/ql/src/test/results/clientpositive/acid_table_stats.q.out
+++ b/ql/src/test/results/clientpositive/acid_table_stats.q.out
@@ -133,27 +133,27 @@ STAGE PLANS:
  Map Operator Tree:
  TableScan
  alias: acid
- Statistics: Num rows: 1 Data size: 39500 Basic stats: PARTIAL Column stats: NONE
+ Statistics: Num rows: 81 Data size: 39500 Basic stats: COMPLETE Column stats: NONE
  Select Operator
- Statistics: Num rows: 1 Data size: 39500 Basic stats: PARTIAL Column stats: NONE
+ Statistics: Num rows: 81 Data size: 39500 Basic stats: COMPLETE Column stats: NONE
  Group By Operator
  aggregations: count()
  mode: hash
  outputColumnNames: _col0
- Statistics: Num rows: 1 Data size: 8 Basic stats: PARTIAL Column stats: NONE
+ Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
  Reduce Output Operator
  sort order:
- Statistics: Num rows: 1 Data size: 8 Basic stats: PARTIAL Column stats: NONE
+ Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
  value expressions: _col0 (type: bigint)
  Reduce Operator Tree:
  Group By Operator
  aggregations: count(VALUE._col0)
  mode: mergepartial
  outputColumnNames: _col0
- Statistics: Num rows: 1 Data size: 8 Basic stats: PARTIAL Column stats: NONE
+ Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
  File Output Operator
  compressed: false
- Statistics: Num rows: 1 Data size: 8 Basic stats: PARTIAL Column stats: NONE
+ Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
  table:
  input format: org.apache.hadoop.mapred.SequenceFileInputFormat
  output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
@@ -299,9 +299,9 @@ STAGE PLANS:
  Map Operator Tree:
  TableScan
  alias: acid
- Statistics: Num rows: 1000 Data size: 2080000 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 1000 Data size: 208000 Basic stats: COMPLETE Column stats: NONE
  Select Operator
- Statistics: Num rows: 1000 Data size: 2080000 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 1000 Data size: 208000 Basic stats: COMPLETE Column stats: NONE
  Group By Operator
  aggregations: count()
  mode: hash
@@ -460,9 +460,9 @@ STAGE PLANS:
  Map Operator Tree:
  TableScan
  alias: acid
- Statistics: Num rows: 2000 Data size: 4160000 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 2000 Data size: 416000 Basic stats: COMPLETE Column stats: NONE
  Select Operator
- Statistics: Num rows: 2000 Data size: 4160000 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 2000 Data size: 416000 Basic stats: COMPLETE Column stats: NONE
  Group By Operator
  aggregations: count()
  mode: hash

http://git-wip-us.apache.org/repos/asf/hive/blob/d0fa7d55/ql/src/test/results/clientpositive/analyze_table_null_partition.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/analyze_table_null_partition.q.out b/ql/src/test/results/clientpositive/analyze_table_null_partition.q.out
index d48df75..e7151b6 100644
--- a/ql/src/test/results/clientpositive/analyze_table_null_partition.q.out
+++ b/ql/src/test/results/clientpositive/analyze_table_null_partition.q.out
@@ -278,12 +278,12 @@ STAGE PLANS:
  Processor Tree:
  TableScan
  alias: test2
- Statistics: Num rows: 5 Data size: 111 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 5 Data size: 299 Basic stats: COMPLETE Column stats: NONE
  GatherStats: false
  Select Operator
  expressions: name (type: string), age (type: int)
  outputColumnNames: _col0, _col1
- Statistics: Num rows: 5 Data size: 111 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 5 Data size: 299 Basic stats: COMPLETE Column stats: NONE
  ListSink
  PREHOOK: query: DROP TABLE test1

http://git-wip-us.apache.org/repos/asf/hive/blob/d0fa7d55/ql/src/test/results/clientpositive/annotate_stats_part.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/annotate_stats_part.q.out b/ql/src/test/results/clientpositive/annotate_stats_part.q.out
index cba89a6..399ddb6 100644
--- a/ql/src/test/results/clientpositive/annotate_stats_part.q.out
+++ b/ql/src/test/results/clientpositive/annotate_stats_part.q.out
@@ -90,11 +90,11 @@ STAGE PLANS:
  Processor Tree:
  TableScan
  alias: loc_orc
- Statistics: Num rows: 6 Data size: 3060 Basic stats: COMPLETE Column stats: PARTIAL
+ Statistics: Num rows: 18 Data size: 14640 Basic stats: COMPLETE Column stats: PARTIAL
  Select Operator
  expressions: state (type: string), locid (type: int), zip (type: bigint), year (type: string)
  outputColumnNames: _col0, _col1, _col2, _col3
- Statistics: Num rows: 6 Data size: 2280 Basic stats: COMPLETE Column stats: PARTIAL
+ Statistics: Num rows: 18 Data size: 6840 Basic stats: COMPLETE Column stats: PARTIAL
  ListSink
  PREHOOK: query: analyze table loc_orc partition(year='2001') compute statistics
@@ -121,11 +121,11 @@ STAGE PLANS:
  Processor Tree:
  TableScan
  alias: loc_orc
- Statistics: Num rows: 3 Data size: 936 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 8 Data size: 5048 Basic stats: COMPLETE Column stats: NONE
  Select Operator
  expressions: state (type: string), locid (type: int), zip (type: bigint), '__HIVE_DEFAULT_PARTITION__' (type: string)
  outputColumnNames: _col0, _col1, _col2, _col3
- Statistics: Num rows: 3 Data size: 936 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 8 Data size: 5048 Basic stats: COMPLETE Column stats: NONE
  ListSink
  PREHOOK: query: explain select * from loc_orc
@@ -339,7 +339,7 @@ STAGE PLANS:
  Processor Tree:
  TableScan
  alias: loc_orc
- Statistics: Num rows: 8 Data size: 9212 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 8 Data size: 2246 Basic stats: COMPLETE Column stats: COMPLETE
  Select Operator
  expressions: year (type: string)
  outputColumnNames: _col0

http://git-wip-us.apache.org/repos/asf/hive/blob/d0fa7d55/ql/src/test/results/clientpositive/annotate_stats_table.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/annotate_stats_table.q.out b/ql/src/test/results/clientpositive/annotate_stats_table.q.out
index 83d241c..433a06b 100644
--- a/ql/src/test/results/clientpositive/annotate_stats_table.q.out
+++ b/ql/src/test/results/clientpositive/annotate_stats_table.q.out
@@ -81,11 +81,11 @@ STAGE PLANS:
  Processor Tree:
  TableScan
  alias: emp_orc
- Statistics: Num rows: 37 Data size: 6956 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 13 Data size: 2444 Basic stats: COMPLETE Column stats: NONE
  Select Operator
  expressions: lastname (type: string), deptid (type: int)
  outputColumnNames: _col0, _col1
- Statistics: Num rows: 37 Data size: 6956 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 13 Data size: 2444 Basic stats: COMPLETE Column stats: NONE
  ListSink
  PREHOOK: query: analyze table emp_orc compute statistics

http://git-wip-us.apache.org/repos/asf/hive/blob/d0fa7d55/ql/src/test/results/clientpositive/autoColumnStats_2.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/autoColumnStats_2.q.out b/ql/src/test/results/clientpositive/autoColumnStats_2.q.out
index b209ff0..4f63aad 100644
--- a/ql/src/test/results/clientpositive/autoColumnStats_2.q.out
+++ b/ql/src/test/results/clientpositive/autoColumnStats_2.q.out
@@ -783,11 +783,11 @@ STAGE PLANS:
  Processor Tree:
  TableScan
  alias: alter5
- Statistics: Num rows: 19 Data size: 1653 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 49 Data size: 4263 Basic stats: COMPLETE Column stats: COMPLETE
  Select Operator
  expressions: col1 (type: string), 'a' (type: string)
  outputColumnNames: _col0, _col1
- Statistics: Num rows: 19 Data size: 3268 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 49 Data size: 8428 Basic stats: COMPLETE Column stats: COMPLETE
  ListSink
  PREHOOK: query: drop table src_stat_part

http://git-wip-us.apache.org/repos/asf/hive/blob/d0fa7d55/ql/src/test/results/clientpositive/autoColumnStats_5.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/autoColumnStats_5.q.out b/ql/src/test/results/clientpositive/autoColumnStats_5.q.out
index 2655bfd..db5dd86 100644
--- a/ql/src/test/results/clientpositive/autoColumnStats_5.q.out
+++ b/ql/src/test/results/clientpositive/autoColumnStats_5.q.out
@@ -27,14 +27,14 @@ STAGE PLANS:
  Map Operator Tree:
  TableScan
  alias: values__tmp__table__1
- Statistics: Num rows: 2 Data size: 440 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 1 Data size: 440 Basic stats: COMPLETE Column stats: NONE
  Select Operator
  expressions: UDFToInteger(tmp_values_col1) (type: int), tmp_values_col2 (type: string)
  outputColumnNames: _col0, _col1
- Statistics: Num rows: 2 Data size: 440 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 1 Data size: 440 Basic stats: COMPLETE Column stats: NONE
  File Output Operator
  compressed: false
- Statistics: Num rows: 2 Data size: 440 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 1 Data size: 440 Basic stats: COMPLETE Column stats: NONE
  table:
  input format: org.apache.hadoop.mapred.TextInputFormat
  output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
@@ -43,18 +43,18 @@ STAGE PLANS:
  Select Operator
  expressions: _col0 (type: int), _col1 (type: string), UDFToInteger('1') (type: int)
  outputColumnNames: a, b, part
- Statistics: Num rows: 2 Data size: 440 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 1 Data size: 440 Basic stats: COMPLETE Column stats: NONE
  Group By Operator
  aggregations: compute_stats(a, 'hll'), compute_stats(b, 'hll')
  keys: part (type: int)
  mode: hash
  outputColumnNames: _col0, _col1, _col2
- Statistics: Num rows: 2 Data size: 440 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 1 Data size: 440 Basic stats: COMPLETE Column stats: NONE
  Reduce Output Operator
  key expressions: _col0 (type: int)
  sort order: +
  Map-reduce partition columns: _col0 (type: int)
- Statistics: Num rows: 2 Data size: 440 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 1 Data size: 440 Basic stats: COMPLETE Column stats: NONE
  value expressions: _col1 (type: struct<columntype:string,min:bigint,max:bigint,countnulls:bigint,bitvector:binary>), _col2 (type: struct<columntype:string,maxlength:bigint,sumlength:bigint,count:bigint,countnulls:bigint,bitvector:binary>)
  Reduce Operator Tree:
  Group By Operator
@@ -62,14 +62,14 @@ STAGE PLANS:
  keys: KEY._col0 (type: int)
  mode: mergepartial
  outputColumnNames: _col0, _col1, _col2
- Statistics: Num rows: 1 Data size: 220 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 1 Data size: 440 Basic stats: COMPLETE Column stats: NONE
  Select Operator
  expressions: _col1 (type: struct<columntype:string,min:bigint,max:bigint,countnulls:bigint,numdistinctvalues:bigint,ndvbitvector:binary>), _col2 (type: struct<columntype:string,maxlength:bigint,avglength:double,countnulls:bigint,numdistinctvalues:bigint,ndvbitvector:binary>), _col0 (type: int)
  outputColumnNames: _col0, _col1, _col2
- Statistics: Num rows: 1 Data size: 220 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 1 Data size: 440 Basic stats: COMPLETE Column stats: NONE
  File Output Operator
  compressed: false
- Statistics: Num rows: 1 Data size: 220 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 1 Data size: 440 Basic stats: COMPLETE Column stats: NONE
  table:
  input format: org.apache.hadoop.mapred.SequenceFileInputFormat
  output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat

http://git-wip-us.apache.org/repos/asf/hive/blob/d0fa7d55/ql/src/test/results/clientpositive/autoColumnStats_5a.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/autoColumnStats_5a.q.out b/ql/src/test/results/clientpositive/autoColumnStats_5a.q.out
index d173c98..40548d0 100644
--- a/ql/src/test/results/clientpositive/autoColumnStats_5a.q.out
+++ b/ql/src/test/results/clientpositive/autoColumnStats_5a.q.out
@@ -791,14 +791,14 @@ STAGE PLANS:
  Map Operator Tree:
  TableScan
  alias: values__tmp__table__5
- Statistics: Num rows: 2 Data size: 440 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 1 Data size: 440 Basic stats: COMPLETE Column stats: NONE
  Select Operator
  expressions: UDFToInteger(tmp_values_col1) (type: int), tmp_values_col2 (type: string)
  outputColumnNames: _col0, _col1
- Statistics: Num rows: 2 Data size: 440 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 1 Data size: 440 Basic stats: COMPLETE Column stats: NONE
  File Output Operator
  compressed: false
- Statistics: Num rows: 2 Data size: 440 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 1 Data size: 440 Basic stats: COMPLETE Column stats: NONE
  table:
  input format: org.apache.hadoop.mapred.TextInputFormat
  output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
@@ -807,18 +807,18 @@ STAGE PLANS:
  Select Operator
  expressions: _col0 (type: int), _col1 (type: string), UDFToInteger('1') (type: int)
  outputColumnNames: a, b, part
- Statistics: Num rows: 2 Data size: 440 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 1 Data size: 440 Basic stats: COMPLETE Column stats: NONE
  Group By Operator
  aggregations: compute_stats(a, 'hll'), compute_stats(b, 'hll')
  keys: part (type: int)
  mode: hash
  outputColumnNames: _col0, _col1, _col2
- Statistics: Num rows: 2 Data size: 440 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 1 Data size: 440 Basic stats: COMPLETE Column stats: NONE
  Reduce Output Operator
  key expressions: _col0 (type: int)
  sort order: +
  Map-reduce partition columns: _col0 (type: int)
- Statistics: Num rows: 2 Data size: 440 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 1 Data size: 440 Basic stats: COMPLETE Column stats: NONE
  value expressions: _col1 (type: struct<columntype:string,min:bigint,max:bigint,countnulls:bigint,bitvector:binary>), _col2 (type: struct<columntype:string,maxlength:bigint,sumlength:bigint,count:bigint,countnulls:bigint,bitvector:binary>)
  Reduce Operator Tree:
  Group By Operator
@@ -826,14 +826,14 @@ STAGE PLANS:
  keys: KEY._col0 (type: int)
  mode: mergepartial
  outputColumnNames: _col0, _col1, _col2
- Statistics: Num rows: 1 Data size: 220 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 1 Data size: 440 Basic stats: COMPLETE Column stats: NONE
  Select Operator
  expressions: _col1 (type: struct<columntype:string,min:bigint,max:bigint,countnulls:bigint,numdistinctvalues:bigint,ndvbitvector:binary>), _col2 (type: struct<columntype:string,maxlength:bigint,avglength:double,countnulls:bigint,numdistinctvalues:bigint,ndvbitvector:binary>), _col0 (type: int)
  outputColumnNames: _col0, _col1, _col2
- Statistics: Num rows: 1 Data size: 220 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 1 Data size: 440 Basic stats: COMPLETE Column stats: NONE
  File Output Operator
  compressed: false
- Statistics: Num rows: 1 Data size: 220 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 1 Data size: 440 Basic stats: COMPLETE Column stats: NONE
  table:
  input format: org.apache.hadoop.mapred.SequenceFileInputFormat
  output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat

http://git-wip-us.apache.org/repos/asf/hive/blob/d0fa7d55/ql/src/test/results/clientpositive/auto_sortmerge_join_1.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/auto_sortmerge_join_1.q.out b/ql/src/test/results/clientpositive/auto_sortmerge_join_1.q.out
index d847937..097822b 100644
--- a/ql/src/test/results/clientpositive/auto_sortmerge_join_1.q.out
+++ 
b/ql/src/test/results/clientpositive/auto_sortmerge_join_1.q.out @@ -113,16 +113,16 @@ STAGE PLANS: Map Operator Tree: TableScan alias: b - Statistics: Num rows: 116 Data size: 11624 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 240 Data size: 116240 Basic stats: COMPLETE Column stats: NONE GatherStats: false Filter Operator isSamplingPred: false predicate: key is not null (type: boolean) - Statistics: Num rows: 116 Data size: 11624 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 240 Data size: 116240 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: key (type: string) outputColumnNames: _col0 - Statistics: Num rows: 116 Data size: 11624 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 240 Data size: 116240 Basic stats: COMPLETE Column stats: NONE Sorted Merge Bucket Map Join Operator condition map: Inner Join 0 to 1 @@ -310,16 +310,16 @@ STAGE PLANS: Map Operator Tree: TableScan alias: a - Statistics: Num rows: 116 Data size: 11624 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 240 Data size: 116240 Basic stats: COMPLETE Column stats: NONE GatherStats: false Filter Operator isSamplingPred: false predicate: key is not null (type: boolean) - Statistics: Num rows: 116 Data size: 11624 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 240 Data size: 116240 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: key (type: string) outputColumnNames: _col0 - Statistics: Num rows: 116 Data size: 11624 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 240 Data size: 116240 Basic stats: COMPLETE Column stats: NONE Sorted Merge Bucket Map Join Operator condition map: Inner Join 0 to 1 @@ -568,16 +568,16 @@ STAGE PLANS: $hdt$_1:b TableScan alias: b - Statistics: Num rows: 1 Data size: 114 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 2 Data size: 1140 Basic stats: COMPLETE Column stats: NONE GatherStats: false Filter 
Operator isSamplingPred: false predicate: key is not null (type: boolean) - Statistics: Num rows: 1 Data size: 114 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 2 Data size: 1140 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: key (type: string) outputColumnNames: _col0 - Statistics: Num rows: 1 Data size: 114 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 2 Data size: 1140 Basic stats: COMPLETE Column stats: NONE HashTable Sink Operator keys: 0 _col0 (type: string) @@ -589,16 +589,16 @@ STAGE PLANS: Map Operator Tree: TableScan alias: a - Statistics: Num rows: 116 Data size: 11624 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 240 Data size: 116240 Basic stats: COMPLETE Column stats: NONE GatherStats: false Filter Operator isSamplingPred: false predicate: key is not null (type: boolean) - Statistics: Num rows: 116 Data size: 11624 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 240 Data size: 116240 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: key (type: string) outputColumnNames: _col0 - Statistics: Num rows: 116 Data size: 11624 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 240 Data size: 116240 Basic stats: COMPLETE Column stats: NONE Map Join Operator condition map: Inner Join 0 to 1 @@ -905,16 +905,16 @@ STAGE PLANS: $hdt$_0:a TableScan alias: a - Statistics: Num rows: 116 Data size: 11624 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 240 Data size: 116240 Basic stats: COMPLETE Column stats: NONE GatherStats: false Filter Operator isSamplingPred: false predicate: key is not null (type: boolean) - Statistics: Num rows: 116 Data size: 11624 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 240 Data size: 116240 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: key (type: string) outputColumnNames: _col0 - Statistics: Num rows: 116 Data size: 11624 Basic stats: COMPLETE 
Column stats: NONE + Statistics: Num rows: 240 Data size: 116240 Basic stats: COMPLETE Column stats: NONE HashTable Sink Operator keys: 0 _col0 (type: string) @@ -926,16 +926,16 @@ STAGE PLANS: Map Operator Tree: TableScan alias: b - Statistics: Num rows: 1 Data size: 114 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 2 Data size: 1140 Basic stats: COMPLETE Column stats: NONE GatherStats: false Filter Operator isSamplingPred: false predicate: key is not null (type: boolean) - Statistics: Num rows: 1 Data size: 114 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 2 Data size: 1140 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: key (type: string) outputColumnNames: _col0 - Statistics: Num rows: 1 Data size: 114 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 2 Data size: 1140 Basic stats: COMPLETE Column stats: NONE Map Join Operator condition map: Inner Join 0 to 1 @@ -1139,16 +1139,16 @@ STAGE PLANS: Map Operator Tree: TableScan alias: a - Statistics: Num rows: 116 Data size: 11624 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 240 Data size: 116240 Basic stats: COMPLETE Column stats: NONE GatherStats: false Filter Operator isSamplingPred: false predicate: key is not null (type: boolean) - Statistics: Num rows: 116 Data size: 11624 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 240 Data size: 116240 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: key (type: string) outputColumnNames: _col0 - Statistics: Num rows: 116 Data size: 11624 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 240 Data size: 116240 Basic stats: COMPLETE Column stats: NONE Sorted Merge Bucket Map Join Operator condition map: Inner Join 0 to 1 http://git-wip-us.apache.org/repos/asf/hive/blob/d0fa7d55/ql/src/test/results/clientpositive/auto_sortmerge_join_11.q.out ---------------------------------------------------------------------- diff --git 
a/ql/src/test/results/clientpositive/auto_sortmerge_join_11.q.out b/ql/src/test/results/clientpositive/auto_sortmerge_join_11.q.out index 25bac39..2bdf3b9 100644 --- a/ql/src/test/results/clientpositive/auto_sortmerge_join_11.q.out +++ b/ql/src/test/results/clientpositive/auto_sortmerge_join_11.q.out @@ -222,16 +222,16 @@ STAGE PLANS: $hdt$_1:b TableScan alias: b - Statistics: Num rows: 116 Data size: 11624 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 240 Data size: 116240 Basic stats: COMPLETE Column stats: NONE GatherStats: false Filter Operator isSamplingPred: false predicate: key is not null (type: boolean) - Statistics: Num rows: 116 Data size: 11624 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 240 Data size: 116240 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: key (type: string) outputColumnNames: _col0 - Statistics: Num rows: 116 Data size: 11624 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 240 Data size: 116240 Basic stats: COMPLETE Column stats: NONE HashTable Sink Operator keys: 0 _col0 (type: string) @@ -243,16 +243,16 @@ STAGE PLANS: Map Operator Tree: TableScan alias: a - Statistics: Num rows: 1 Data size: 114 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 2 Data size: 1140 Basic stats: COMPLETE Column stats: NONE GatherStats: false Filter Operator isSamplingPred: false predicate: key is not null (type: boolean) - Statistics: Num rows: 1 Data size: 114 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 2 Data size: 1140 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: key (type: string) outputColumnNames: _col0 - Statistics: Num rows: 1 Data size: 114 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 2 Data size: 1140 Basic stats: COMPLETE Column stats: NONE Map Join Operator condition map: Inner Join 0 to 1 @@ -260,7 +260,7 @@ STAGE PLANS: 0 _col0 (type: string) 1 _col0 (type: string) Position of Big 
Table: 0 - Statistics: Num rows: 127 Data size: 12786 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 264 Data size: 127864 Basic stats: COMPLETE Column stats: NONE Group By Operator aggregations: count() mode: hash @@ -567,16 +567,16 @@ STAGE PLANS: $hdt$_0:a TableScan alias: a - Statistics: Num rows: 1 Data size: 114 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 2 Data size: 1140 Basic stats: COMPLETE Column stats: NONE GatherStats: false Filter Operator isSamplingPred: false predicate: key is not null (type: boolean) - Statistics: Num rows: 1 Data size: 114 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 2 Data size: 1140 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: key (type: string) outputColumnNames: _col0 - Statistics: Num rows: 1 Data size: 114 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 2 Data size: 1140 Basic stats: COMPLETE Column stats: NONE HashTable Sink Operator keys: 0 _col0 (type: string) @@ -588,16 +588,16 @@ STAGE PLANS: Map Operator Tree: TableScan alias: b - Statistics: Num rows: 116 Data size: 11624 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 240 Data size: 116240 Basic stats: COMPLETE Column stats: NONE GatherStats: false Filter Operator isSamplingPred: false predicate: key is not null (type: boolean) - Statistics: Num rows: 116 Data size: 11624 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 240 Data size: 116240 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: key (type: string) outputColumnNames: _col0 - Statistics: Num rows: 116 Data size: 11624 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 240 Data size: 116240 Basic stats: COMPLETE Column stats: NONE Map Join Operator condition map: Inner Join 0 to 1 @@ -605,7 +605,7 @@ STAGE PLANS: 0 _col0 (type: string) 1 _col0 (type: string) Position of Big Table: 1 - Statistics: Num rows: 127 Data size: 12786 Basic stats: 
COMPLETE Column stats: NONE + Statistics: Num rows: 264 Data size: 127864 Basic stats: COMPLETE Column stats: NONE Group By Operator aggregations: count() mode: hash @@ -790,42 +790,42 @@ STAGE PLANS: Map Operator Tree: TableScan alias: a - Statistics: Num rows: 1 Data size: 114 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 2 Data size: 1140 Basic stats: COMPLETE Column stats: NONE GatherStats: false Filter Operator isSamplingPred: false predicate: key is not null (type: boolean) - Statistics: Num rows: 1 Data size: 114 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 2 Data size: 1140 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: key (type: string) outputColumnNames: _col0 - Statistics: Num rows: 1 Data size: 114 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 2 Data size: 1140 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: _col0 (type: string) null sort order: a sort order: + Map-reduce partition columns: _col0 (type: string) - Statistics: Num rows: 1 Data size: 114 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 2 Data size: 1140 Basic stats: COMPLETE Column stats: NONE tag: 0 auto parallelism: false TableScan alias: b - Statistics: Num rows: 116 Data size: 11624 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 240 Data size: 116240 Basic stats: COMPLETE Column stats: NONE GatherStats: false Filter Operator isSamplingPred: false predicate: key is not null (type: boolean) - Statistics: Num rows: 116 Data size: 11624 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 240 Data size: 116240 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: key (type: string) outputColumnNames: _col0 - Statistics: Num rows: 116 Data size: 11624 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 240 Data size: 116240 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: 
_col0 (type: string) null sort order: a sort order: + Map-reduce partition columns: _col0 (type: string) - Statistics: Num rows: 116 Data size: 11624 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 240 Data size: 116240 Basic stats: COMPLETE Column stats: NONE tag: 1 auto parallelism: false Path -> Alias: @@ -990,7 +990,7 @@ STAGE PLANS: keys: 0 _col0 (type: string) 1 _col0 (type: string) - Statistics: Num rows: 127 Data size: 12786 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 264 Data size: 127864 Basic stats: COMPLETE Column stats: NONE Group By Operator aggregations: count() mode: hash @@ -1052,16 +1052,16 @@ STAGE PLANS: Map Operator Tree: TableScan alias: b - Statistics: Num rows: 116 Data size: 11624 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 240 Data size: 116240 Basic stats: COMPLETE Column stats: NONE GatherStats: false Filter Operator isSamplingPred: false predicate: key is not null (type: boolean) - Statistics: Num rows: 116 Data size: 11624 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 240 Data size: 116240 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: key (type: string) outputColumnNames: _col0 - Statistics: Num rows: 116 Data size: 11624 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 240 Data size: 116240 Basic stats: COMPLETE Column stats: NONE Sorted Merge Bucket Map Join Operator condition map: Inner Join 0 to 1 @@ -1249,12 +1249,12 @@ STAGE PLANS: Map Operator Tree: TableScan alias: b - Statistics: Num rows: 116 Data size: 11624 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 240 Data size: 116240 Basic stats: COMPLETE Column stats: NONE GatherStats: false Filter Operator isSamplingPred: false predicate: key is not null (type: boolean) - Statistics: Num rows: 116 Data size: 11624 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 240 Data size: 116240 Basic stats: COMPLETE Column stats: NONE Sorted 
Merge Bucket Map Join Operator condition map: Inner Join 0 to 1 @@ -1442,12 +1442,12 @@ STAGE PLANS: Map Operator Tree: TableScan alias: c - Statistics: Num rows: 116 Data size: 11624 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 240 Data size: 116240 Basic stats: COMPLETE Column stats: NONE GatherStats: false Filter Operator isSamplingPred: false predicate: key is not null (type: boolean) - Statistics: Num rows: 116 Data size: 11624 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 240 Data size: 116240 Basic stats: COMPLETE Column stats: NONE Sorted Merge Bucket Map Join Operator condition map: Inner Join 0 to 1