HIVE-18149: Stats: rownum estimation from datasize underestimates in most cases (Zoltan Haindrich, reviewed by Ashutosh Chauhan)
Signed-off-by: Zoltan Haindrich <k...@rxd.hu>

Project: http://git-wip-us.apache.org/repos/asf/hive/repo
Commit: http://git-wip-us.apache.org/repos/asf/hive/commit/e26b9325
Tree: http://git-wip-us.apache.org/repos/asf/hive/tree/e26b9325
Diff: http://git-wip-us.apache.org/repos/asf/hive/diff/e26b9325

Branch: refs/heads/master
Commit: e26b932536e57ba11813b6bc96f9b9707538963a
Parents: b7ac74a
Author: Zoltan Haindrich <k...@rxd.hu>
Authored: Wed Dec 20 11:35:27 2017 +0100
Committer: Zoltan Haindrich <k...@rxd.hu>
Committed: Wed Dec 20 11:35:27 2017 +0100

----------------------------------------------------------------------
 .../org/apache/hadoop/hive/conf/HiveConf.java | 2 +-
 .../udf_example_arraymapstruct.q.out | 6 +-
 .../test/results/clientpositive/explain.q.out | 4 +-
 .../insert_into_dynamic_partitions.q.out | 10 +-
 .../clientpositive/insert_into_table.q.out | 8 +-
 .../insert_overwrite_dynamic_partitions.q.out | 10 +-
 .../clientpositive/insert_overwrite_table.q.out | 8 +-
 .../runtime_skewjoin_mapjoin_spark.q | 1 +
 .../spark_dynamic_partition_pruning_3.q | 3 +-
 .../clientpositive/acid_table_stats.q.out | 12 +-
 .../clientpositive/annotate_stats_part.q.out | 2 +-
 .../clientpositive/annotate_stats_table.q.out | 6 +-
 .../clientpositive/autoColumnStats_5.q.out | 54 +--
 .../clientpositive/autoColumnStats_5a.q.out | 54 +--
 .../clientpositive/auto_join_stats.q.out | 6 +-
 .../clientpositive/auto_join_stats2.q.out | 22 +-
 .../clientpositive/auto_sortmerge_join_12.q.out | 20 +-
 .../clientpositive/auto_sortmerge_join_5.q.out | 42 +-
 .../beeline/select_dummy_source.q.out | 14 +-
 .../clientpositive/beeline/smb_mapjoin_1.q.out | 20 +-
 .../clientpositive/beeline/smb_mapjoin_2.q.out | 20 +-
 .../clientpositive/beeline/smb_mapjoin_3.q.out | 20 +-
 .../clientpositive/binarysortable_1.q.out | Bin 4332 -> 4339 bytes
 .../clientpositive/bucket_map_join_1.q.out | 10 +-
 .../clientpositive/bucket_map_join_2.q.out | 10 +-
 .../clientpositive/bucketcontext_5.q.out | 14 +-
 .../clientpositive/bucketcontext_6.q.out | 4 +-
 .../results/clientpositive/bucketmapjoin5.q.out | 8 +-
 .../clientpositive/bucketmapjoin_negative.q.out | 4 +-
 .../bucketmapjoin_negative2.q.out | 4 +-
 .../bucketmapjoin_negative3.q.out | 126 +++---
 .../clientpositive/case_sensitivity.q.out | 10 +-
 .../results/clientpositive/cbo_rp_join1.q.out | 64 +--
 .../cbo_rp_udaf_percentile_approx_23.q.out | 8 +-
 .../columnarserde_create_shortcut.q.out | 10 +-
 .../clientpositive/columnstats_tbllvl.q.out | 16 +-
 .../test/results/clientpositive/combine2.q.out | 12 +-
 .../clientpositive/compute_stats_date.q.out | 4 +-
 .../test/results/clientpositive/concat_op.q.out | 4 +-
 .../clientpositive/correlationoptimizer5.q.out | 156 +++----
 .../clientpositive/decimal_precision.q.out | 4 +-
 .../clientpositive/decimal_precision2.q.out | 14 +-
 .../results/clientpositive/decimal_udf.q.out | 178 ++++----
 .../results/clientpositive/decimal_udf2.q.out | 16 +-
 .../display_colstats_tbllvl.q.out | 8 +-
 .../clientpositive/distinct_windowing.q.out | 60 +--
 .../distinct_windowing_no_cbo.q.out | 104 ++---
 .../clientpositive/drop_table_with_index.q.out | 12 +-
 .../clientpositive/filter_cond_pushdown2.q.out | 42 +-
 .../clientpositive/gen_udf_example_add10.q.out | 12 +-
 .../test/results/clientpositive/groupby10.q.out | 104 ++---
 .../results/clientpositive/groupby_cube1.q.out | 130 +++---
 .../clientpositive/groupby_grouping_id3.q.out | 32 +-
 .../clientpositive/groupby_grouping_sets1.q.out | 94 ++--
 .../clientpositive/groupby_grouping_sets2.q.out | 54 +--
 .../clientpositive/groupby_grouping_sets3.q.out | 46 +-
 .../clientpositive/groupby_grouping_sets4.q.out | 116 ++---
 .../clientpositive/groupby_grouping_sets5.q.out | 64 +--
 .../clientpositive/groupby_grouping_sets6.q.out | 24 +-
 .../groupby_grouping_sets_grouping.q.out | 222 ++++-----
 .../groupby_grouping_sets_limit.q.out | 92 ++--
 .../clientpositive/groupby_rollup1.q.out | 102 ++---
 .../clientpositive/groupby_sort_11.q.out | 8 +-
 .../results/clientpositive/index_serde.q.out | 8 +-
 .../clientpositive/index_skewtable.q.out | 12 +-
 .../infer_bucket_sort_map_operators.q.out | 8 +-
 .../clientpositive/infer_const_type.q.out | 32 +-
 .../test/results/clientpositive/input17.q.out | 14 +-
 .../test/results/clientpositive/input21.q.out | 10 +-
 .../test/results/clientpositive/input22.q.out | 12 +-
 .../results/clientpositive/input3_limit.q.out | 18 +-
 ql/src/test/results/clientpositive/input4.q.out | 2 +-
 ql/src/test/results/clientpositive/input5.q.out | 14 +-
 .../clientpositive/input_columnarserde.q.out | 10 +-
 .../clientpositive/input_dynamicserde.q.out | 6 +-
 .../clientpositive/input_lazyserde.q.out | 10 +-
 .../clientpositive/input_lazyserde2.q.out | 10 +-
 .../clientpositive/input_testxpath.q.out | 8 +-
 .../clientpositive/input_testxpath2.q.out | 10 +-
 .../clientpositive/input_testxpath3.q.out | 6 +-
 .../clientpositive/input_testxpath4.q.out | 16 +-
 .../results/clientpositive/insert_into1.q.out | 10 +-
 .../results/clientpositive/insert_into2.q.out | 4 +-
 .../results/clientpositive/join_hive_626.q.out | 34 +-
 .../results/clientpositive/join_reorder.q.out | 92 ++--
 .../results/clientpositive/join_reorder2.q.out | 68 +--
 .../results/clientpositive/join_reorder3.q.out | 68 +--
 .../results/clientpositive/join_reorder4.q.out | 72 +--
 .../test/results/clientpositive/join_star.q.out | 140 +++---
 .../results/clientpositive/join_thrift.q.out | 18 +-
 .../llap/auto_sortmerge_join_12.q.out | 8 +-
 .../llap/dynamic_partition_pruning.q.out | 132 +++---
 .../llap/dynamic_partition_pruning_2.q.out | 6 +-
 .../llap/dynamic_semijoin_reduction.q.out | 38 +-
 .../llap/dynamic_semijoin_user_level.q.out | 4 +-
 .../llap/dynpart_sort_opt_vectorization.q.out | 8 +-
 .../clientpositive/llap/explainuser_1.q.out | 8 +-
 .../clientpositive/llap/insert_into1.q.out | 4 +-
 .../clientpositive/llap/insert_into2.q.out | 4 +-
 .../clientpositive/llap/llap_partitioned.q.out | 2 +-
 .../llap/partition_multilevels.q.out | 8 +-
 .../llap/vector_complex_all.q.out | 4 +-
 .../llap/vector_partitioned_date_time.q.out | 16 +-
 .../vectorized_dynamic_partition_pruning.q.out | 126 +++---
 .../clientpositive/mapjoin_subquery2.q.out | 26 +-
 .../results/clientpositive/nullformatCTAS.q.out | 6 +-
 .../results/clientpositive/nullgroup3.q.out | 8 +-
 .../results/clientpositive/nullscript.q.out | 8 +-
 .../results/clientpositive/orc_merge5.q.out | 20 +-
 .../results/clientpositive/orc_merge6.q.out | 40 +-
 .../clientpositive/orc_merge_incompat1.q.out | 10 +-
 .../clientpositive/orc_merge_incompat2.q.out | 18 +-
 .../clientpositive/parallel_orderby.q.out | 10 +-
 .../clientpositive/partition_boolexpr.q.out | 12 +-
 ql/src/test/results/clientpositive/pcs.q.out | 6 +-
 .../results/clientpositive/ptf_matchpath.q.out | 42 +-
 .../results/clientpositive/quotedid_skew.q.out | 44 +-
 .../test/results/clientpositive/regex_col.q.out | 12 +-
 .../test/results/clientpositive/row__id.q.out | 18 +-
 .../clientpositive/select_dummy_source.q.out | 14 +-
 .../test/results/clientpositive/skewjoin.q.out | 72 +--
 .../clientpositive/skewjoin_mapjoin1.q.out | 136 +++---
 .../clientpositive/skewjoin_mapjoin11.q.out | 36 +-
 .../clientpositive/skewjoin_mapjoin2.q.out | 80 ++--
 .../clientpositive/skewjoin_mapjoin3.q.out | 36 +-
 .../clientpositive/skewjoin_mapjoin4.q.out | 48 +-
 .../clientpositive/skewjoin_mapjoin5.q.out | 50 +--
 .../clientpositive/skewjoin_mapjoin6.q.out | 8 +-
 .../clientpositive/skewjoin_mapjoin7.q.out | 80 ++--
 .../clientpositive/skewjoin_mapjoin8.q.out | 24 +-
 .../clientpositive/skewjoin_mapjoin9.q.out | 46 +-
 .../skewjoin_union_remove_1.q.out | 160 +++----
 .../skewjoin_union_remove_2.q.out | 56 +--
 .../results/clientpositive/skewjoinopt1.q.out | 168 +++----
 .../results/clientpositive/skewjoinopt10.q.out | 12 +-
 .../results/clientpositive/skewjoinopt11.q.out | 96 ++--
 .../results/clientpositive/skewjoinopt12.q.out | 44 +-
 .../results/clientpositive/skewjoinopt13.q.out | 32 +-
 .../results/clientpositive/skewjoinopt14.q.out | 56 +--
 .../results/clientpositive/skewjoinopt16.q.out | 44 +-
 .../results/clientpositive/skewjoinopt17.q.out | 88 ++--
 .../results/clientpositive/skewjoinopt18.q.out | 8 +-
 .../results/clientpositive/skewjoinopt19.q.out | 44 +-
 .../results/clientpositive/skewjoinopt2.q.out | 192 ++++----
 .../results/clientpositive/skewjoinopt20.q.out | 44 +-
 .../results/clientpositive/skewjoinopt21.q.out | 44 +-
 .../results/clientpositive/skewjoinopt3.q.out | 88 ++--
 .../results/clientpositive/skewjoinopt4.q.out | 88 ++--
 .../results/clientpositive/skewjoinopt5.q.out | 44 +-
 .../results/clientpositive/skewjoinopt6.q.out | 44 +-
 .../results/clientpositive/skewjoinopt7.q.out | 60 +--
 .../results/clientpositive/skewjoinopt8.q.out | 60 +--
 .../results/clientpositive/skewjoinopt9.q.out | 56 +--
 .../results/clientpositive/smb_mapjoin_1.q.out | 20 +-
 .../results/clientpositive/smb_mapjoin_2.q.out | 20 +-
 .../results/clientpositive/smb_mapjoin_25.q.out | 96 ++--
 .../results/clientpositive/smb_mapjoin_3.q.out | 20 +-
 .../clientpositive/spark/auto_join_stats.q.out | 6 +-
 .../clientpositive/spark/auto_join_stats2.q.out | 22 +-
 .../spark/auto_sortmerge_join_12.q.out | 8 +-
 .../spark/auto_sortmerge_join_5.q.out | 30 +-
 .../spark/bucket_map_join_1.q.out | 10 +-
 .../spark/bucket_map_join_2.q.out | 10 +-
 .../clientpositive/spark/bucketmapjoin1.q.out | 16 +-
 .../clientpositive/spark/bucketmapjoin4.q.out | 44 +-
 .../clientpositive/spark/bucketmapjoin5.q.out | 8 +-
 .../spark/bucketmapjoin_negative.q.out | 4 +-
 .../spark/bucketmapjoin_negative2.q.out | 4 +-
 .../spark/bucketmapjoin_negative3.q.out | 126 +++---
 .../spark/column_access_stats.q.out | 56 +--
 .../results/clientpositive/spark/count.q.out | 130 +++---
 .../spark/gen_udf_example_add10.q.out | 12 +-
 .../clientpositive/spark/groupby10.q.out | 72 +--
 .../clientpositive/spark/groupby_cube1.q.out | 128 +++---
 .../clientpositive/spark/groupby_rollup1.q.out | 100 ++---
 .../results/clientpositive/spark/input17.q.out | 12 +-
 .../clientpositive/spark/insert_into1.q.out | 10 +-
 .../clientpositive/spark/insert_into2.q.out | 4 +-
 .../clientpositive/spark/join_hive_626.q.out | 34 +-
 .../clientpositive/spark/join_nullsafe.q.out | 116 ++---
 .../clientpositive/spark/join_reorder.q.out | 92 ++--
 .../clientpositive/spark/join_reorder2.q.out | 68 +--
 .../clientpositive/spark/join_reorder3.q.out | 68 +--
 .../clientpositive/spark/join_reorder4.q.out | 72 +--
 .../clientpositive/spark/join_star.q.out | 140 +++---
 .../clientpositive/spark/join_thrift.q.out | 18 +-
 .../spark/mapjoin_subquery2.q.out | 26 +-
 .../clientpositive/spark/orc_merge5.q.out | 16 +-
 .../clientpositive/spark/orc_merge6.q.out | 16 +-
 .../clientpositive/spark/orc_merge7.q.out | 12 +-
 .../spark/orc_merge_incompat1.q.out | 8 +-
 .../spark/orc_merge_incompat2.q.out | 6 +-
 .../clientpositive/spark/parallel_orderby.q.out | 10 +-
 .../clientpositive/spark/ptf_matchpath.q.out | 42 +-
 .../results/clientpositive/spark/skewjoin.q.out | 72 +--
 .../spark/skewjoin_union_remove_1.q.out | 160 +++----
 .../spark/skewjoin_union_remove_2.q.out | 56 +--
 .../clientpositive/spark/skewjoinopt1.q.out | 152 +++----
 .../clientpositive/spark/skewjoinopt10.q.out | 12 +-
 .../clientpositive/spark/skewjoinopt11.q.out | 56 +--
 .../clientpositive/spark/skewjoinopt12.q.out | 40 +-
 .../clientpositive/spark/skewjoinopt13.q.out | 32 +-
 .../clientpositive/spark/skewjoinopt14.q.out | 52 +--
 .../clientpositive/spark/skewjoinopt16.q.out | 40 +-
 .../clientpositive/spark/skewjoinopt17.q.out | 80 ++--
 .../clientpositive/spark/skewjoinopt18.q.out | 8 +-
 .../clientpositive/spark/skewjoinopt19.q.out | 40 +-
 .../clientpositive/spark/skewjoinopt2.q.out | 176 ++++----
 .../clientpositive/spark/skewjoinopt20.q.out | 40 +-
 .../clientpositive/spark/skewjoinopt3.q.out | 80 ++--
 .../clientpositive/spark/skewjoinopt4.q.out | 80 ++--
 .../clientpositive/spark/skewjoinopt5.q.out | 40 +-
 .../clientpositive/spark/skewjoinopt6.q.out | 40 +-
 .../clientpositive/spark/skewjoinopt7.q.out | 56 +--
 .../clientpositive/spark/skewjoinopt8.q.out | 56 +--
 .../clientpositive/spark/skewjoinopt9.q.out | 44 +-
 .../clientpositive/spark/smb_mapjoin_1.q.out | 68 +--
 .../clientpositive/spark/smb_mapjoin_2.q.out | 68 +--
 .../clientpositive/spark/smb_mapjoin_25.q.out | 80 ++--
 .../clientpositive/spark/smb_mapjoin_3.q.out | 68 +--
 .../clientpositive/spark/smb_mapjoin_4.q.out | 310 ++++++-------
 .../clientpositive/spark/smb_mapjoin_5.q.out | 310 ++++++-------
 .../spark/spark_dynamic_partition_pruning.q.out | 450 +++++++++----------
 .../spark_dynamic_partition_pruning_2.q.out | 20 +-
 .../spark_dynamic_partition_pruning_4.q.out | 20 +-
 .../spark/spark_explainuser_1.q.out | 8 +-
 ...k_vectorized_dynamic_partition_pruning.q.out | 444 +++++++++---------
 .../results/clientpositive/spark/stats10.q.out | 2 +-
 .../results/clientpositive/spark/stats12.q.out | 2 +-
 .../results/clientpositive/spark/stats13.q.out | 2 +-
 .../results/clientpositive/spark/stats2.q.out | 2 +-
 .../results/clientpositive/spark/stats7.q.out | 2 +-
 .../results/clientpositive/spark/stats8.q.out | 10 +-
 .../results/clientpositive/spark/stats9.q.out | 2 +-
 .../clientpositive/spark/stats_noscan_2.q.out | 4 +-
 .../clientpositive/spark/subquery_multi.q.out | 362 +++++++--------
 .../clientpositive/spark/subquery_scalar.q.out | 132 +++---
 .../results/clientpositive/spark/union21.q.out | 8 +-
 .../clientpositive/spark/union_remove_1.q.out | 16 +-
 .../clientpositive/spark/union_remove_10.q.out | 24 +-
 .../clientpositive/spark/union_remove_11.q.out | 24 +-
 .../clientpositive/spark/union_remove_12.q.out | 24 +-
 .../clientpositive/spark/union_remove_13.q.out | 30 +-
 .../clientpositive/spark/union_remove_14.q.out | 24 +-
 .../clientpositive/spark/union_remove_15.q.out | 20 +-
 .../clientpositive/spark/union_remove_16.q.out | 20 +-
 .../clientpositive/spark/union_remove_17.q.out | 16 +-
 .../clientpositive/spark/union_remove_18.q.out | 20 +-
 .../clientpositive/spark/union_remove_19.q.out | 52 +--
 .../clientpositive/spark/union_remove_2.q.out | 24 +-
 .../clientpositive/spark/union_remove_20.q.out | 20 +-
 .../clientpositive/spark/union_remove_21.q.out | 16 +-
 .../clientpositive/spark/union_remove_22.q.out | 40 +-
 .../clientpositive/spark/union_remove_23.q.out | 38 +-
 .../clientpositive/spark/union_remove_24.q.out | 20 +-
 .../clientpositive/spark/union_remove_25.q.out | 16 +-
 .../clientpositive/spark/union_remove_3.q.out | 24 +-
 .../clientpositive/spark/union_remove_4.q.out | 16 +-
 .../clientpositive/spark/union_remove_5.q.out | 24 +-
 .../clientpositive/spark/union_remove_6.q.out | 20 +-
 .../spark/union_remove_6_subq.q.out | 20 +-
 .../clientpositive/spark/union_remove_7.q.out | 16 +-
 .../clientpositive/spark/union_remove_8.q.out | 24 +-
 .../clientpositive/spark/union_remove_9.q.out | 28 +-
 .../clientpositive/spark/union_view.q.out | 44 +-
 .../vectorization_parquet_projection.q.out | 4 +-
 .../test/results/clientpositive/stats10.q.out | 2 +-
 .../test/results/clientpositive/stats12.q.out | 2 +-
 .../test/results/clientpositive/stats13.q.out | 2 +-
 ql/src/test/results/clientpositive/stats2.q.out | 2 +-
 ql/src/test/results/clientpositive/stats7.q.out | 2 +-
 ql/src/test/results/clientpositive/stats8.q.out | 10 +-
 ql/src/test/results/clientpositive/stats9.q.out | 2 +-
 .../results/clientpositive/stats_noscan_2.q.out | 4 +-
 .../results/clientpositive/stats_ppr_all.q.out | 2 +-
 .../symlink_text_input_format.q.out | 48 +-
 .../temp_table_display_colstats_tbllvl.q.out | 8 +-
 .../clientpositive/tez/explainanalyze_5.q.out | 2 +-
 .../clientpositive/timestamp_literal.q.out | 6 +-
 .../results/clientpositive/timestamptz.q.out | 8 +-
 .../udaf_percentile_approx_23.q.out | 8 +-
 .../results/clientpositive/udf_add_months.q.out | 2 +-
 .../clientpositive/udf_aes_decrypt.q.out | 2 +-
 .../clientpositive/udf_aes_encrypt.q.out | 2 +-
 .../clientpositive/udf_bitwise_shiftleft.q.out | 2 +-
 .../clientpositive/udf_bitwise_shiftright.q.out | 2 +-
 .../udf_bitwise_shiftrightunsigned.q.out | 2 +-
 .../clientpositive/udf_case_thrift.q.out | 4 +-
 .../test/results/clientpositive/udf_cbrt.q.out | 2 +-
 .../clientpositive/udf_character_length.q.out | 8 +-
 .../results/clientpositive/udf_coalesce.q.out | 4 +-
 .../test/results/clientpositive/udf_crc32.q.out | 2 +-
 .../clientpositive/udf_current_database.q.out | 8 +-
 .../clientpositive/udf_date_format.q.out | 2 +-
 .../results/clientpositive/udf_decode.q.out | 2 +-
 .../results/clientpositive/udf_factorial.q.out | 2 +-
 .../clientpositive/udf_from_utc_timestamp.q.out | 2 +-
 .../results/clientpositive/udf_in_file.q.out | 8 +-
 .../clientpositive/udf_isnull_isnotnull.q.out | 8 +-
 .../results/clientpositive/udf_last_day.q.out | 2 +-
 .../results/clientpositive/udf_length.q.out | 4 +-
 .../clientpositive/udf_levenshtein.q.out | 2 +-
 .../test/results/clientpositive/udf_mask.q.out | 2 +-
 .../clientpositive/udf_mask_first_n.q.out | 2 +-
 .../results/clientpositive/udf_mask_hash.q.out | 2 +-
 .../clientpositive/udf_mask_last_n.q.out | 2 +-
 .../clientpositive/udf_mask_show_first_n.q.out | 2 +-
 .../clientpositive/udf_mask_show_last_n.q.out | 2 +-
 .../test/results/clientpositive/udf_md5.q.out | 2 +-
 .../clientpositive/udf_months_between.q.out | 2 +-
 .../results/clientpositive/udf_nullif.q.out | 6 +-
 .../clientpositive/udf_octet_length.q.out | 4 +-
 .../results/clientpositive/udf_quarter.q.out | 2 +-
 .../test/results/clientpositive/udf_sha1.q.out | 2 +-
 .../test/results/clientpositive/udf_sha2.q.out | 2 +-
 .../test/results/clientpositive/udf_size.q.out | 8 +-
 .../results/clientpositive/udf_soundex.q.out | 2 +-
 .../clientpositive/udf_substring_index.q.out | 2 +-
 .../clientpositive/udf_to_utc_timestamp.q.out | 2 +-
 .../test/results/clientpositive/udf_trunc.q.out | 24 +-
 .../clientpositive/udf_trunc_number.q.out | 20 +-
 .../clientpositive/udf_width_bucket.q.out | 2 +-
 .../results/clientpositive/udtf_stack.q.out | 2 +-
 .../test/results/clientpositive/union21.q.out | 18 +-
 .../results/clientpositive/union_remove_1.q.out | 24 +-
 .../clientpositive/union_remove_10.q.out | 24 +-
 .../clientpositive/union_remove_11.q.out | 30 +-
 .../clientpositive/union_remove_12.q.out | 24 +-
 .../clientpositive/union_remove_13.q.out | 30 +-
 .../clientpositive/union_remove_14.q.out | 24 +-
 .../clientpositive/union_remove_15.q.out | 28 +-
 .../clientpositive/union_remove_16.q.out | 28 +-
 .../clientpositive/union_remove_17.q.out | 20 +-
 .../clientpositive/union_remove_18.q.out | 28 +-
 .../clientpositive/union_remove_19.q.out | 76 ++--
 .../results/clientpositive/union_remove_2.q.out | 24 +-
 .../clientpositive/union_remove_20.q.out | 28 +-
 .../clientpositive/union_remove_21.q.out | 24 +-
 .../clientpositive/union_remove_22.q.out | 56 +--
 .../clientpositive/union_remove_23.q.out | 38 +-
 .../clientpositive/union_remove_24.q.out | 28 +-
 .../clientpositive/union_remove_25.q.out | 24 +-
 .../results/clientpositive/union_remove_3.q.out | 30 +-
 .../results/clientpositive/union_remove_4.q.out | 24 +-
 .../results/clientpositive/union_remove_5.q.out | 24 +-
 .../results/clientpositive/union_remove_6.q.out | 32 +-
 .../clientpositive/union_remove_6_subq.q.out | 32 +-
 .../results/clientpositive/union_remove_7.q.out | 24 +-
 .../results/clientpositive/union_remove_8.q.out | 24 +-
 .../results/clientpositive/union_remove_9.q.out | 32 +-
 .../results/clientpositive/union_view.q.out | 64 +--
 .../results/clientpositive/vector_bucket.q.out | 10 +-
 .../clientpositive/vector_decimal_10_0.q.out | 10 +-
 .../vector_decimal_precision.q.out | 4 +-
 .../clientpositive/vector_decimal_udf2.q.out | 16 +-
 .../clientpositive/vector_gather_stats.q.out | 2 +-
 .../vector_reduce_groupby_duplicate_cols.q.out | 2 +-
 .../vector_tablesample_rows.q.out | 6 +-
 .../vectorization_parquet_projection.q.out | 4 +-
 359 files changed, 6609 insertions(+), 6607 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/hive/blob/e26b9325/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
----------------------------------------------------------------------
diff --git a/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java b/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
index 8648a38..be83489 100644
--- a/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
+++ b/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
@@ -1832,7 +1832,7 @@ public class HiveConf extends Configuration {
     // annotation. But the file may be compressed, encoded and serialized which may be lesser in size
     // than the actual uncompressed/raw data size. This factor will be multiplied to file size to estimate
     // the raw data size.
-    HIVE_STATS_DESERIALIZATION_FACTOR("hive.stats.deserialization.factor", (float) 1.0,
+    HIVE_STATS_DESERIALIZATION_FACTOR("hive.stats.deserialization.factor", (float) 10.0,
         "Hive/Tez optimizer estimates the data size flowing through each of the operators. In the absence\n" +
         "of basic statistics like number of rows and data size, file size is used to estimate the number\n" +
         "of rows and data size. Since files in tables/partitions are serialized (and optionally\n" +

http://git-wip-us.apache.org/repos/asf/hive/blob/e26b9325/contrib/src/test/results/clientpositive/udf_example_arraymapstruct.q.out
----------------------------------------------------------------------
diff --git a/contrib/src/test/results/clientpositive/udf_example_arraymapstruct.q.out b/contrib/src/test/results/clientpositive/udf_example_arraymapstruct.q.out
index 0eaa229..32a12cf 100644
--- a/contrib/src/test/results/clientpositive/udf_example_arraymapstruct.q.out
+++ b/contrib/src/test/results/clientpositive/udf_example_arraymapstruct.q.out
@@ -34,14 +34,14 @@ STAGE PLANS:
       Map Operator Tree:
           TableScan
             alias: src_thrift
-            Statistics: Num rows: 11 Data size: 3070 Basic stats: COMPLETE Column stats: NONE
+            Statistics: Num rows: 11 Data size: 30700 Basic stats: COMPLETE Column stats: NONE
             Select Operator
               expressions: example_arraysum(lint) (type: double), example_mapconcat(mstringstring) (type: string), example_structprint(lintstring[0]) (type: string)
               outputColumnNames: _col0, _col1, _col2
-              Statistics: Num rows: 11 Data size: 3070 Basic stats: COMPLETE Column stats: NONE
+              Statistics: Num rows: 11 Data size: 30700 Basic stats: COMPLETE Column stats: NONE
               File Output Operator
                 compressed: false
-                Statistics: Num rows: 11 Data size: 3070 Basic stats: COMPLETE Column stats: NONE
+                Statistics: Num rows: 11 Data size: 30700 Basic stats: COMPLETE Column stats: NONE
                 table:
                     input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                     output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat

http://git-wip-us.apache.org/repos/asf/hive/blob/e26b9325/itests/hive-blobstore/src/test/results/clientpositive/explain.q.out
----------------------------------------------------------------------
diff --git a/itests/hive-blobstore/src/test/results/clientpositive/explain.q.out b/itests/hive-blobstore/src/test/results/clientpositive/explain.q.out
index 5d95dbd..3cfb314 100644
--- a/itests/hive-blobstore/src/test/results/clientpositive/explain.q.out
+++ b/itests/hive-blobstore/src/test/results/clientpositive/explain.q.out
@@ -46,9 +46,9 @@ STAGE PLANS:
       Map Operator Tree:
           TableScan
             alias: blobstore_table
-            Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: COMPLETE
+            Statistics: Num rows: 1 Data size: 40 Basic stats: COMPLETE Column stats: COMPLETE
             Select Operator
-              Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: COMPLETE
+              Statistics: Num rows: 1 Data size: 40 Basic stats: COMPLETE Column stats: COMPLETE
               Group By Operator
                 aggregations: count()
                 mode: hash

http://git-wip-us.apache.org/repos/asf/hive/blob/e26b9325/itests/hive-blobstore/src/test/results/clientpositive/insert_into_dynamic_partitions.q.out
----------------------------------------------------------------------
diff --git a/itests/hive-blobstore/src/test/results/clientpositive/insert_into_dynamic_partitions.q.out b/itests/hive-blobstore/src/test/results/clientpositive/insert_into_dynamic_partitions.q.out
index bbd81d1..ebf2daa 100644
--- a/itests/hive-blobstore/src/test/results/clientpositive/insert_into_dynamic_partitions.q.out
+++ b/itests/hive-blobstore/src/test/results/clientpositive/insert_into_dynamic_partitions.q.out
@@ -85,18 +85,18 @@ STAGE PLANS:
       Map Operator Tree:
           TableScan
             alias: values__tmp__table__3
-            Statistics: Num rows: 1 Data size: 30 Basic stats: COMPLETE Column stats: NONE
+            Statistics: Num rows: 1 Data size: 300 Basic stats: COMPLETE Column stats: NONE
             GatherStats: false
             Select Operator
               expressions: UDFToInteger(tmp_values_col1) (type: int), tmp_values_col2 (type: string)
               outputColumnNames: _col0, _col1
-              Statistics: Num rows: 1 Data size: 30 Basic stats: COMPLETE Column stats: NONE
+              Statistics: Num rows: 1 Data size: 300 Basic stats: COMPLETE Column stats: NONE
               Reduce Output Operator
                 key expressions: _col1 (type: string), '_bucket_number' (type: string)
                 null sort order: aa
                 sort order: ++
                 Map-reduce partition columns: _col1 (type: string)
-                Statistics: Num rows: 1 Data size: 30 Basic stats: COMPLETE Column stats: NONE
+                Statistics: Num rows: 1 Data size: 300 Basic stats: COMPLETE Column stats: NONE
                 tag: -1
                 value expressions: _col0 (type: int)
                 auto parallelism: false
@@ -144,14 +144,14 @@ STAGE PLANS:
         Select Operator
           expressions: VALUE._col0 (type: int), KEY._col1 (type: string), KEY.'_bucket_number' (type: string)
           outputColumnNames: _col0, _col1, '_bucket_number'
-          Statistics: Num rows: 1 Data size: 30 Basic stats: COMPLETE Column stats: NONE
+          Statistics: Num rows: 1 Data size: 300 Basic stats: COMPLETE Column stats: NONE
          File Output Operator
            compressed: false
            GlobalTableId: 1
            directory: ### BLOBSTORE_STAGING_PATH ###
            Dp Sort State: PARTITION_BUCKET_SORTED
            NumFilesPerFileSink: 1
-            Statistics: Num rows: 1 Data size: 30 Basic stats: COMPLETE Column stats: NONE
+            Statistics: Num rows: 1 Data size: 300 Basic stats: COMPLETE Column stats: NONE
            Stats Publishing Key Prefix: ### BLOBSTORE_STAGING_PATH ###
            table:
                input format: org.apache.hadoop.mapred.TextInputFormat

http://git-wip-us.apache.org/repos/asf/hive/blob/e26b9325/itests/hive-blobstore/src/test/results/clientpositive/insert_into_table.q.out
----------------------------------------------------------------------
diff --git a/itests/hive-blobstore/src/test/results/clientpositive/insert_into_table.q.out b/itests/hive-blobstore/src/test/results/clientpositive/insert_into_table.q.out
index 315aedb..40d2571 100644
--- a/itests/hive-blobstore/src/test/results/clientpositive/insert_into_table.q.out
+++ b/itests/hive-blobstore/src/test/results/clientpositive/insert_into_table.q.out
@@ -56,18 +56,18 @@ STAGE PLANS:
       Map Operator Tree:
           TableScan
             alias: values__tmp__table__3
-            Statistics: Num rows: 1 Data size: 2 Basic stats: COMPLETE Column stats: NONE
+            Statistics: Num rows: 1 Data size: 20 Basic stats: COMPLETE Column stats: NONE
             GatherStats: false
             Select Operator
               expressions: UDFToInteger(tmp_values_col1) (type: int)
               outputColumnNames: _col0
-              Statistics: Num rows: 1 Data size: 2 Basic stats: COMPLETE Column stats: NONE
+              Statistics: Num rows: 1 Data size: 20 Basic stats: COMPLETE Column stats: NONE
              File Output Operator
                compressed: false
                GlobalTableId: 1
                directory: ### BLOBSTORE_STAGING_PATH ###
                NumFilesPerFileSink: 1
-                Statistics: Num rows: 1 Data size: 2 Basic stats: COMPLETE Column stats: NONE
+                Statistics: Num rows: 1 Data size: 20 Basic stats: COMPLETE Column stats: NONE
                Stats Publishing Key Prefix: ### BLOBSTORE_STAGING_PATH ###
                table:
                    input format: org.apache.hadoop.mapred.TextInputFormat
@@ -98,7 +98,7 @@ STAGE PLANS:
         Select Operator
           expressions: _col0 (type: int)
           outputColumnNames: id
-          Statistics: Num rows: 1 Data size: 2 Basic stats: COMPLETE Column stats: NONE
+          Statistics: Num rows: 1 Data size: 20 Basic stats: COMPLETE Column stats: NONE
          Group By Operator
            aggregations: compute_stats(id, 'hll')
            mode: hash

http://git-wip-us.apache.org/repos/asf/hive/blob/e26b9325/itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_dynamic_partitions.q.out
----------------------------------------------------------------------
diff --git a/itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_dynamic_partitions.q.out b/itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_dynamic_partitions.q.out
index 2192e15..5cf69d8 100644
--- a/itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_dynamic_partitions.q.out
+++ b/itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_dynamic_partitions.q.out
@@ -103,18 +103,18 @@ STAGE PLANS:
       Map Operator Tree:
           TableScan
             alias: values__tmp__table__3
-            Statistics: Num rows: 1 Data size: 30 Basic stats: COMPLETE Column stats: NONE
+            Statistics: Num rows: 1 Data size: 300 Basic stats: COMPLETE Column stats: NONE
             GatherStats: false
             Select Operator
               expressions: UDFToInteger(tmp_values_col1) (type: int), tmp_values_col2 (type: string)
               outputColumnNames: _col0, _col1
-              Statistics: Num rows: 1 Data size: 30 Basic stats: COMPLETE Column stats: NONE
+              Statistics: Num rows: 1 Data size: 300 Basic stats: COMPLETE Column stats: NONE
               Reduce Output Operator
                 key expressions: _col1 (type: string), '_bucket_number' (type: string)
                 null sort order: aa
                 sort order: ++
                 Map-reduce partition columns: _col1 (type: string)
-                Statistics: Num rows: 1 Data size: 30 Basic stats: COMPLETE Column stats: NONE
+                Statistics: Num rows: 1 Data size: 300 Basic stats: COMPLETE Column stats: NONE
                 tag: -1
                 value expressions: _col0 (type: int)
                 auto parallelism: false
@@ -162,14 +162,14 @@ STAGE PLANS:
         Select Operator
           expressions: VALUE._col0 (type: int), KEY._col1 (type: string), KEY.'_bucket_number' (type: string)
           outputColumnNames: _col0, _col1, '_bucket_number'
-          Statistics: Num rows: 1 Data size: 30 Basic stats: COMPLETE Column stats: NONE
+          Statistics: Num rows: 1 Data size: 300 Basic stats: COMPLETE Column stats: NONE
          File Output Operator
            compressed: false
            GlobalTableId: 1
            directory: ### BLOBSTORE_STAGING_PATH ###
            Dp Sort State: PARTITION_BUCKET_SORTED
            NumFilesPerFileSink: 1
-            Statistics: Num rows: 1 Data size: 30 Basic stats: COMPLETE Column stats: NONE
+            Statistics: Num rows: 1 Data size: 300 Basic stats: COMPLETE Column stats: NONE
            Stats Publishing Key Prefix: ### BLOBSTORE_STAGING_PATH ###
            table:
                input format: org.apache.hadoop.mapred.TextInputFormat

http://git-wip-us.apache.org/repos/asf/hive/blob/e26b9325/itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_table.q.out
----------------------------------------------------------------------
diff --git a/itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_table.q.out b/itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_table.q.out
index 10911a5..bab88eb 100644
--- a/itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_table.q.out
+++ b/itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_table.q.out
@@ -64,18 +64,18 @@ STAGE PLANS:
       Map Operator Tree:
           TableScan
             alias: values__tmp__table__3
-            Statistics: Num rows: 1 Data size: 2 Basic stats: COMPLETE Column stats: NONE
+            Statistics: Num rows: 1 Data size: 20 Basic stats: COMPLETE Column stats: NONE
             GatherStats: false
             Select Operator
               expressions: UDFToInteger(tmp_values_col1) (type: int)
               outputColumnNames: _col0
-              Statistics: Num rows: 1 Data size: 2 Basic stats: COMPLETE Column stats: NONE
+              Statistics: Num rows: 1 Data size: 20 Basic stats: COMPLETE Column stats: NONE
              File Output Operator
                compressed: false
                GlobalTableId: 1
                directory: ### BLOBSTORE_STAGING_PATH ###
                NumFilesPerFileSink: 1
-                Statistics: Num rows: 1 Data size: 2 Basic stats: COMPLETE Column stats: NONE
+                Statistics: Num rows: 1 Data size: 20 Basic stats: COMPLETE Column stats: NONE
                Stats Publishing Key Prefix: ### BLOBSTORE_STAGING_PATH ###
                table:
                    input format: org.apache.hadoop.mapred.TextInputFormat
@@ -106,7 +106,7 @@ STAGE PLANS:
         Select Operator
           expressions: _col0 (type: int)
           outputColumnNames: id
-          Statistics: Num rows: 1 Data size: 2 Basic stats: COMPLETE Column stats: NONE
+          Statistics: Num rows: 1 Data size: 20 Basic stats: COMPLETE Column stats: NONE
          Group By Operator
            aggregations: compute_stats(id, 'hll')
            mode: hash

http://git-wip-us.apache.org/repos/asf/hive/blob/e26b9325/ql/src/test/queries/clientpositive/runtime_skewjoin_mapjoin_spark.q
----------------------------------------------------------------------
diff --git a/ql/src/test/queries/clientpositive/runtime_skewjoin_mapjoin_spark.q b/ql/src/test/queries/clientpositive/runtime_skewjoin_mapjoin_spark.q
index 2d12d08..ca9e9cf 100644
--- a/ql/src/test/queries/clientpositive/runtime_skewjoin_mapjoin_spark.q
+++ b/ql/src/test/queries/clientpositive/runtime_skewjoin_mapjoin_spark.q
@@ -1,3 +1,4 @@
+set hive.stats.deserialization.factor=1.0;
 set hive.mapred.mode=nonstrict;
 set hive.optimize.skewjoin = true;
 set hive.skewjoin.key = 4;
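[Editor's note] The HiveConf change above raises the default `hive.stats.deserialization.factor` from 1.0 to 10.0, and the updated q.out plans show the estimated `Data size` scaling by 10x (e.g. 3070 -> 30700 for `src_thrift`). A minimal sketch of the estimation described in the HiveConf comment, with hypothetical method names (this is not Hive's actual `StatsUtils` code):

```java
// Sketch of file-size-based stats estimation, assuming the behavior
// described in the HiveConf comment above. Names are illustrative only.
public class DeserializationFactorSketch {

    // Raw (uncompressed, deserialized) data size is estimated by
    // multiplying the on-disk file size by the deserialization factor.
    static long estimateRawDataSize(long fileSizeBytes, float deserializationFactor) {
        return (long) (fileSizeBytes * deserializationFactor);
    }

    // A row count can then be derived from the estimated raw size
    // and an average row size.
    static long estimateNumRows(long rawDataSize, long avgRowSizeBytes) {
        return Math.max(1L, rawDataSize / avgRowSizeBytes);
    }

    public static void main(String[] args) {
        long fileSize = 3070; // serialized bytes on disk (the src_thrift example above)
        System.out.println(estimateRawDataSize(fileSize, 1.0f));  // old default: 3070
        System.out.println(estimateRawDataSize(fileSize, 10.0f)); // new default: 30700
    }
}
```

The two `.q` files below pin the factor back to 1.0 so their plans keep the pre-patch estimates.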
http://git-wip-us.apache.org/repos/asf/hive/blob/e26b9325/ql/src/test/queries/clientpositive/spark_dynamic_partition_pruning_3.q ---------------------------------------------------------------------- diff --git a/ql/src/test/queries/clientpositive/spark_dynamic_partition_pruning_3.q b/ql/src/test/queries/clientpositive/spark_dynamic_partition_pruning_3.q index 8863cf4..2d622ae 100644 --- a/ql/src/test/queries/clientpositive/spark_dynamic_partition_pruning_3.q +++ b/ql/src/test/queries/clientpositive/spark_dynamic_partition_pruning_3.q @@ -1,3 +1,4 @@ +set hive.stats.deserialization.factor=1.0; CREATE TABLE partitioned_table1 (col int) PARTITIONED BY (part_col int); CREATE TABLE partitioned_table2 (col int) PARTITIONED BY (part_col int); CREATE TABLE partitioned_table3 (col int) PARTITIONED BY (part_col int); @@ -225,4 +226,4 @@ DROP TABLE partitioned_table3; DROP TABLE partitioned_table4; DROP TABLE partitioned_table5; DROP TABLE regular_table1; -DROP TABLE regular_table2; \ No newline at end of file +DROP TABLE regular_table2; http://git-wip-us.apache.org/repos/asf/hive/blob/e26b9325/ql/src/test/results/clientpositive/acid_table_stats.q.out ---------------------------------------------------------------------- diff --git a/ql/src/test/results/clientpositive/acid_table_stats.q.out b/ql/src/test/results/clientpositive/acid_table_stats.q.out index 8a25e5a..05a03d2 100644 --- a/ql/src/test/results/clientpositive/acid_table_stats.q.out +++ b/ql/src/test/results/clientpositive/acid_table_stats.q.out @@ -133,9 +133,9 @@ STAGE PLANS: Map Operator Tree: TableScan alias: acid - Statistics: Num rows: 1 Data size: 3950 Basic stats: PARTIAL Column stats: NONE + Statistics: Num rows: 1 Data size: 39500 Basic stats: PARTIAL Column stats: NONE Select Operator - Statistics: Num rows: 1 Data size: 3950 Basic stats: PARTIAL Column stats: NONE + Statistics: Num rows: 1 Data size: 39500 Basic stats: PARTIAL Column stats: NONE Group By Operator aggregations: count() mode: hash @@ 
-299,9 +299,9 @@ STAGE PLANS: Map Operator Tree: TableScan alias: acid - Statistics: Num rows: 1000 Data size: 208000 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1000 Data size: 2080000 Basic stats: COMPLETE Column stats: NONE Select Operator - Statistics: Num rows: 1000 Data size: 208000 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1000 Data size: 2080000 Basic stats: COMPLETE Column stats: NONE Group By Operator aggregations: count() mode: hash @@ -460,9 +460,9 @@ STAGE PLANS: Map Operator Tree: TableScan alias: acid - Statistics: Num rows: 2000 Data size: 416000 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 2000 Data size: 4160000 Basic stats: COMPLETE Column stats: NONE Select Operator - Statistics: Num rows: 2000 Data size: 416000 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 2000 Data size: 4160000 Basic stats: COMPLETE Column stats: NONE Group By Operator aggregations: count() mode: hash http://git-wip-us.apache.org/repos/asf/hive/blob/e26b9325/ql/src/test/results/clientpositive/annotate_stats_part.q.out ---------------------------------------------------------------------- diff --git a/ql/src/test/results/clientpositive/annotate_stats_part.q.out b/ql/src/test/results/clientpositive/annotate_stats_part.q.out index fed2a65..cba89a6 100644 --- a/ql/src/test/results/clientpositive/annotate_stats_part.q.out +++ b/ql/src/test/results/clientpositive/annotate_stats_part.q.out @@ -339,7 +339,7 @@ STAGE PLANS: Processor Tree: TableScan alias: loc_orc - Statistics: Num rows: 8 Data size: 2246 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 8 Data size: 9212 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: year (type: string) outputColumnNames: _col0 http://git-wip-us.apache.org/repos/asf/hive/blob/e26b9325/ql/src/test/results/clientpositive/annotate_stats_table.q.out ---------------------------------------------------------------------- diff 
--git a/ql/src/test/results/clientpositive/annotate_stats_table.q.out b/ql/src/test/results/clientpositive/annotate_stats_table.q.out index f61e8d8..83d241c 100644 --- a/ql/src/test/results/clientpositive/annotate_stats_table.q.out +++ b/ql/src/test/results/clientpositive/annotate_stats_table.q.out @@ -81,11 +81,11 @@ STAGE PLANS: Processor Tree: TableScan alias: emp_orc - Statistics: Num rows: 3 Data size: 564 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 37 Data size: 6956 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: lastname (type: string), deptid (type: int) outputColumnNames: _col0, _col1 - Statistics: Num rows: 3 Data size: 564 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 37 Data size: 6956 Basic stats: COMPLETE Column stats: NONE ListSink PREHOOK: query: analyze table emp_orc compute statistics @@ -295,7 +295,7 @@ STAGE PLANS: TableScan alias: _dummy_table Row Limit Per Split: 1 - Statistics: Num rows: 1 Data size: 1 Basic stats: COMPLETE Column stats: COMPLETE + Statistics: Num rows: 1 Data size: 10 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: 1 (type: int) outputColumnNames: _col0 http://git-wip-us.apache.org/repos/asf/hive/blob/e26b9325/ql/src/test/results/clientpositive/autoColumnStats_5.q.out ---------------------------------------------------------------------- diff --git a/ql/src/test/results/clientpositive/autoColumnStats_5.q.out b/ql/src/test/results/clientpositive/autoColumnStats_5.q.out index 196d18d..2655bfd 100644 --- a/ql/src/test/results/clientpositive/autoColumnStats_5.q.out +++ b/ql/src/test/results/clientpositive/autoColumnStats_5.q.out @@ -27,14 +27,14 @@ STAGE PLANS: Map Operator Tree: TableScan alias: values__tmp__table__1 - Statistics: Num rows: 1 Data size: 44 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 2 Data size: 440 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: UDFToInteger(tmp_values_col1) 
(type: int), tmp_values_col2 (type: string) outputColumnNames: _col0, _col1 - Statistics: Num rows: 1 Data size: 44 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 2 Data size: 440 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false - Statistics: Num rows: 1 Data size: 44 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 2 Data size: 440 Basic stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat @@ -43,18 +43,18 @@ STAGE PLANS: Select Operator expressions: _col0 (type: int), _col1 (type: string), UDFToInteger('1') (type: int) outputColumnNames: a, b, part - Statistics: Num rows: 1 Data size: 44 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 2 Data size: 440 Basic stats: COMPLETE Column stats: NONE Group By Operator aggregations: compute_stats(a, 'hll'), compute_stats(b, 'hll') keys: part (type: int) mode: hash outputColumnNames: _col0, _col1, _col2 - Statistics: Num rows: 1 Data size: 44 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 2 Data size: 440 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: _col0 (type: int) sort order: + Map-reduce partition columns: _col0 (type: int) - Statistics: Num rows: 1 Data size: 44 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 2 Data size: 440 Basic stats: COMPLETE Column stats: NONE value expressions: _col1 (type: struct<columntype:string,min:bigint,max:bigint,countnulls:bigint,bitvector:binary>), _col2 (type: struct<columntype:string,maxlength:bigint,sumlength:bigint,count:bigint,countnulls:bigint,bitvector:binary>) Reduce Operator Tree: Group By Operator @@ -62,14 +62,14 @@ STAGE PLANS: keys: KEY._col0 (type: int) mode: mergepartial outputColumnNames: _col0, _col1, _col2 - Statistics: Num rows: 1 Data size: 44 Basic stats: COMPLETE Column stats: NONE + Statistics: Num 
rows: 1 Data size: 220 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: _col1 (type: struct<columntype:string,min:bigint,max:bigint,countnulls:bigint,numdistinctvalues:bigint,ndvbitvector:binary>), _col2 (type: struct<columntype:string,maxlength:bigint,avglength:double,countnulls:bigint,numdistinctvalues:bigint,ndvbitvector:binary>), _col0 (type: int) outputColumnNames: _col0, _col1, _col2 - Statistics: Num rows: 1 Data size: 44 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 220 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false - Statistics: Num rows: 1 Data size: 44 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 220 Basic stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat @@ -267,14 +267,14 @@ STAGE PLANS: Map Operator Tree: TableScan alias: values__tmp__table__3 - Statistics: Num rows: 1 Data size: 60 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 600 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: UDFToInteger(tmp_values_col1) (type: int), tmp_values_col2 (type: string), UDFToInteger(tmp_values_col3) (type: int), tmp_values_col4 (type: string) outputColumnNames: _col0, _col1, _col2, _col3 - Statistics: Num rows: 1 Data size: 60 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 600 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false - Statistics: Num rows: 1 Data size: 60 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 600 Basic stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat @@ -283,18 +283,18 @@ STAGE PLANS: Select Operator expressions: _col0 (type: int), _col1 (type: 
string), _col2 (type: int), _col3 (type: string), UDFToInteger('2') (type: int) outputColumnNames: a, b, c, d, part - Statistics: Num rows: 1 Data size: 60 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 600 Basic stats: COMPLETE Column stats: NONE Group By Operator aggregations: compute_stats(a, 'hll'), compute_stats(b, 'hll'), compute_stats(c, 'hll'), compute_stats(d, 'hll') keys: part (type: int) mode: hash outputColumnNames: _col0, _col1, _col2, _col3, _col4 - Statistics: Num rows: 1 Data size: 60 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 600 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: _col0 (type: int) sort order: + Map-reduce partition columns: _col0 (type: int) - Statistics: Num rows: 1 Data size: 60 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 600 Basic stats: COMPLETE Column stats: NONE value expressions: _col1 (type: struct<columntype:string,min:bigint,max:bigint,countnulls:bigint,bitvector:binary>), _col2 (type: struct<columntype:string,maxlength:bigint,sumlength:bigint,count:bigint,countnulls:bigint,bitvector:binary>), _col3 (type: struct<columntype:string,min:bigint,max:bigint,countnulls:bigint,bitvector:binary>), _col4 (type: struct<columntype:string,maxlength:bigint,sumlength:bigint,count:bigint,countnulls:bigint,bitvector:binary>) Reduce Operator Tree: Group By Operator @@ -302,14 +302,14 @@ STAGE PLANS: keys: KEY._col0 (type: int) mode: mergepartial outputColumnNames: _col0, _col1, _col2, _col3, _col4 - Statistics: Num rows: 1 Data size: 60 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 600 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: _col1 (type: struct<columntype:string,min:bigint,max:bigint,countnulls:bigint,numdistinctvalues:bigint,ndvbitvector:binary>), _col2 (type: 
struct<columntype:string,maxlength:bigint,avglength:double,countnulls:bigint,numdistinctvalues:bigint,ndvbitvector:binary>), _col3 (type: struct<columntype:string,min:bigint,max:bigint,countnulls:bigint,numdistinctvalues:bigint,ndvbitvector:binary>), _col4 (type: struct<columntype:string,maxlength:bigint,avglength:double,countnulls:bigint,numdistinctvalues:bigint,ndvbitvector:binary>), _col0 (type: int) outputColumnNames: _col0, _col1, _col2, _col3, _col4 - Statistics: Num rows: 1 Data size: 60 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 600 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false - Statistics: Num rows: 1 Data size: 60 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 600 Basic stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat @@ -465,14 +465,14 @@ STAGE PLANS: Map Operator Tree: TableScan alias: values__tmp__table__5 - Statistics: Num rows: 1 Data size: 40 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 400 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: UDFToInteger(tmp_values_col1) (type: int), tmp_values_col2 (type: string), UDFToInteger(tmp_values_col3) (type: int), tmp_values_col4 (type: string) outputColumnNames: _col0, _col1, _col2, _col3 - Statistics: Num rows: 1 Data size: 40 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 400 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false - Statistics: Num rows: 1 Data size: 40 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 400 Basic stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat @@ -481,18 +481,18 @@ STAGE PLANS: Select Operator 
expressions: _col0 (type: int), _col1 (type: string), _col2 (type: int), _col3 (type: string), UDFToInteger('1') (type: int) outputColumnNames: a, b, c, d, part - Statistics: Num rows: 1 Data size: 40 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 400 Basic stats: COMPLETE Column stats: NONE Group By Operator aggregations: compute_stats(a, 'hll'), compute_stats(b, 'hll'), compute_stats(c, 'hll'), compute_stats(d, 'hll') keys: part (type: int) mode: hash outputColumnNames: _col0, _col1, _col2, _col3, _col4 - Statistics: Num rows: 1 Data size: 40 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 400 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: _col0 (type: int) sort order: + Map-reduce partition columns: _col0 (type: int) - Statistics: Num rows: 1 Data size: 40 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 400 Basic stats: COMPLETE Column stats: NONE value expressions: _col1 (type: struct<columntype:string,min:bigint,max:bigint,countnulls:bigint,bitvector:binary>), _col2 (type: struct<columntype:string,maxlength:bigint,sumlength:bigint,count:bigint,countnulls:bigint,bitvector:binary>), _col3 (type: struct<columntype:string,min:bigint,max:bigint,countnulls:bigint,bitvector:binary>), _col4 (type: struct<columntype:string,maxlength:bigint,sumlength:bigint,count:bigint,countnulls:bigint,bitvector:binary>) Reduce Operator Tree: Group By Operator @@ -500,14 +500,14 @@ STAGE PLANS: keys: KEY._col0 (type: int) mode: mergepartial outputColumnNames: _col0, _col1, _col2, _col3, _col4 - Statistics: Num rows: 1 Data size: 40 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 400 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: _col1 (type: struct<columntype:string,min:bigint,max:bigint,countnulls:bigint,numdistinctvalues:bigint,ndvbitvector:binary>), _col2 (type: 
struct<columntype:string,maxlength:bigint,avglength:double,countnulls:bigint,numdistinctvalues:bigint,ndvbitvector:binary>), _col3 (type: struct<columntype:string,min:bigint,max:bigint,countnulls:bigint,numdistinctvalues:bigint,ndvbitvector:binary>), _col4 (type: struct<columntype:string,maxlength:bigint,avglength:double,countnulls:bigint,numdistinctvalues:bigint,ndvbitvector:binary>), _col0 (type: int) outputColumnNames: _col0, _col1, _col2, _col3, _col4 - Statistics: Num rows: 1 Data size: 40 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 400 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false - Statistics: Num rows: 1 Data size: 40 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 400 Basic stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat http://git-wip-us.apache.org/repos/asf/hive/blob/e26b9325/ql/src/test/results/clientpositive/autoColumnStats_5a.q.out ---------------------------------------------------------------------- diff --git a/ql/src/test/results/clientpositive/autoColumnStats_5a.q.out b/ql/src/test/results/clientpositive/autoColumnStats_5a.q.out index d97e1c6..d173c98 100644 --- a/ql/src/test/results/clientpositive/autoColumnStats_5a.q.out +++ b/ql/src/test/results/clientpositive/autoColumnStats_5a.q.out @@ -29,19 +29,19 @@ STAGE PLANS: Map Operator Tree: TableScan alias: values__tmp__table__1 - Statistics: Num rows: 1 Data size: 11 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 110 Basic stats: COMPLETE Column stats: NONE GatherStats: false Select Operator expressions: UDFToInteger(tmp_values_col1) (type: int), tmp_values_col2 (type: string) outputColumnNames: _col0, _col1 - Statistics: Num rows: 1 Data size: 11 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 110 Basic stats: 
COMPLETE Column stats: NONE File Output Operator compressed: false GlobalTableId: 1 #### A masked pattern was here #### NumFilesPerFileSink: 1 Static Partition Specification: part=1/ - Statistics: Num rows: 1 Data size: 11 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 110 Basic stats: COMPLETE Column stats: NONE #### A masked pattern was here #### table: input format: org.apache.hadoop.mapred.TextInputFormat @@ -67,19 +67,19 @@ STAGE PLANS: Select Operator expressions: _col0 (type: int), _col1 (type: string), UDFToInteger('1') (type: int) outputColumnNames: a, b, part - Statistics: Num rows: 1 Data size: 11 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 110 Basic stats: COMPLETE Column stats: NONE Group By Operator aggregations: compute_stats(a, 'hll'), compute_stats(b, 'hll') keys: part (type: int) mode: hash outputColumnNames: _col0, _col1, _col2 - Statistics: Num rows: 1 Data size: 11 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 110 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: _col0 (type: int) null sort order: a sort order: + Map-reduce partition columns: _col0 (type: int) - Statistics: Num rows: 1 Data size: 11 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 110 Basic stats: COMPLETE Column stats: NONE tag: -1 value expressions: _col1 (type: struct<columntype:string,min:bigint,max:bigint,countnulls:bigint,bitvector:binary>), _col2 (type: struct<columntype:string,maxlength:bigint,sumlength:bigint,count:bigint,countnulls:bigint,bitvector:binary>) auto parallelism: false @@ -129,17 +129,17 @@ STAGE PLANS: keys: KEY._col0 (type: int) mode: mergepartial outputColumnNames: _col0, _col1, _col2 - Statistics: Num rows: 1 Data size: 11 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 110 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: _col1 (type: 
struct<columntype:string,min:bigint,max:bigint,countnulls:bigint,numdistinctvalues:bigint,ndvbitvector:binary>), _col2 (type: struct<columntype:string,maxlength:bigint,avglength:double,countnulls:bigint,numdistinctvalues:bigint,ndvbitvector:binary>), _col0 (type: int) outputColumnNames: _col0, _col1, _col2 - Statistics: Num rows: 1 Data size: 11 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 110 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false GlobalTableId: 0 #### A masked pattern was here #### NumFilesPerFileSink: 1 - Statistics: Num rows: 1 Data size: 11 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 110 Basic stats: COMPLETE Column stats: NONE #### A masked pattern was here #### table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat @@ -430,19 +430,19 @@ STAGE PLANS: Map Operator Tree: TableScan alias: values__tmp__table__3 - Statistics: Num rows: 1 Data size: 33 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 330 Basic stats: COMPLETE Column stats: NONE GatherStats: false Select Operator expressions: UDFToInteger(tmp_values_col1) (type: int), tmp_values_col2 (type: string) outputColumnNames: _col0, _col1 - Statistics: Num rows: 1 Data size: 33 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 330 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false GlobalTableId: 1 #### A masked pattern was here #### NumFilesPerFileSink: 1 Static Partition Specification: part=1/ - Statistics: Num rows: 1 Data size: 33 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 330 Basic stats: COMPLETE Column stats: NONE #### A masked pattern was here #### table: input format: org.apache.hadoop.mapred.TextInputFormat @@ -468,19 +468,19 @@ STAGE PLANS: Select Operator expressions: _col0 (type: int), _col1 (type: string), UDFToInteger('1') (type: int) outputColumnNames: 
a, b, part - Statistics: Num rows: 1 Data size: 33 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 330 Basic stats: COMPLETE Column stats: NONE Group By Operator aggregations: compute_stats(a, 'hll'), compute_stats(b, 'hll') keys: part (type: int) mode: hash outputColumnNames: _col0, _col1, _col2 - Statistics: Num rows: 1 Data size: 33 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 330 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: _col0 (type: int) null sort order: a sort order: + Map-reduce partition columns: _col0 (type: int) - Statistics: Num rows: 1 Data size: 33 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 330 Basic stats: COMPLETE Column stats: NONE tag: -1 value expressions: _col1 (type: struct<columntype:string,min:bigint,max:bigint,countnulls:bigint,bitvector:binary>), _col2 (type: struct<columntype:string,maxlength:bigint,sumlength:bigint,count:bigint,countnulls:bigint,bitvector:binary>) auto parallelism: false @@ -530,17 +530,17 @@ STAGE PLANS: keys: KEY._col0 (type: int) mode: mergepartial outputColumnNames: _col0, _col1, _col2 - Statistics: Num rows: 1 Data size: 33 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 330 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: _col1 (type: struct<columntype:string,min:bigint,max:bigint,countnulls:bigint,numdistinctvalues:bigint,ndvbitvector:binary>), _col2 (type: struct<columntype:string,maxlength:bigint,avglength:double,countnulls:bigint,numdistinctvalues:bigint,ndvbitvector:binary>), _col0 (type: int) outputColumnNames: _col0, _col1, _col2 - Statistics: Num rows: 1 Data size: 33 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 330 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false GlobalTableId: 0 #### A masked pattern was here #### NumFilesPerFileSink: 1 - Statistics: Num rows: 
1 Data size: 33 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 330 Basic stats: COMPLETE Column stats: NONE #### A masked pattern was here #### table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat @@ -791,14 +791,14 @@ STAGE PLANS: Map Operator Tree: TableScan alias: values__tmp__table__5 - Statistics: Num rows: 1 Data size: 44 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 2 Data size: 440 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: UDFToInteger(tmp_values_col1) (type: int), tmp_values_col2 (type: string) outputColumnNames: _col0, _col1 - Statistics: Num rows: 1 Data size: 44 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 2 Data size: 440 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false - Statistics: Num rows: 1 Data size: 44 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 2 Data size: 440 Basic stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat @@ -807,18 +807,18 @@ STAGE PLANS: Select Operator expressions: _col0 (type: int), _col1 (type: string), UDFToInteger('1') (type: int) outputColumnNames: a, b, part - Statistics: Num rows: 1 Data size: 44 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 2 Data size: 440 Basic stats: COMPLETE Column stats: NONE Group By Operator aggregations: compute_stats(a, 'hll'), compute_stats(b, 'hll') keys: part (type: int) mode: hash outputColumnNames: _col0, _col1, _col2 - Statistics: Num rows: 1 Data size: 44 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 2 Data size: 440 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: _col0 (type: int) sort order: + Map-reduce partition columns: _col0 (type: int) - Statistics: Num rows: 1 Data size: 44 Basic stats: COMPLETE Column stats: NONE + Statistics: 
Num rows: 2 Data size: 440 Basic stats: COMPLETE Column stats: NONE value expressions: _col1 (type: struct<columntype:string,min:bigint,max:bigint,countnulls:bigint,bitvector:binary>), _col2 (type: struct<columntype:string,maxlength:bigint,sumlength:bigint,count:bigint,countnulls:bigint,bitvector:binary>) Reduce Operator Tree: Group By Operator @@ -826,14 +826,14 @@ STAGE PLANS: keys: KEY._col0 (type: int) mode: mergepartial outputColumnNames: _col0, _col1, _col2 - Statistics: Num rows: 1 Data size: 44 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 220 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: _col1 (type: struct<columntype:string,min:bigint,max:bigint,countnulls:bigint,numdistinctvalues:bigint,ndvbitvector:binary>), _col2 (type: struct<columntype:string,maxlength:bigint,avglength:double,countnulls:bigint,numdistinctvalues:bigint,ndvbitvector:binary>), _col0 (type: int) outputColumnNames: _col0, _col1, _col2 - Statistics: Num rows: 1 Data size: 44 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 220 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false - Statistics: Num rows: 1 Data size: 44 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 220 Basic stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat http://git-wip-us.apache.org/repos/asf/hive/blob/e26b9325/ql/src/test/results/clientpositive/auto_join_stats.q.out ---------------------------------------------------------------------- diff --git a/ql/src/test/results/clientpositive/auto_join_stats.q.out b/ql/src/test/results/clientpositive/auto_join_stats.q.out index cb21718..1f5c74e 100644 --- a/ql/src/test/results/clientpositive/auto_join_stats.q.out +++ b/ql/src/test/results/clientpositive/auto_join_stats.q.out @@ -384,14 +384,14 @@ STAGE PLANS: 
$hdt$_3:smalltable2 TableScan alias: smalltable2 - Statistics: Num rows: 1 Data size: 30 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 300 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: key is not null (type: boolean) - Statistics: Num rows: 1 Data size: 30 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 300 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: key (type: string) outputColumnNames: _col0 - Statistics: Num rows: 1 Data size: 30 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 300 Basic stats: COMPLETE Column stats: NONE HashTable Sink Operator keys: 0 (UDFToDouble(_col0) + UDFToDouble(_col1)) (type: double) http://git-wip-us.apache.org/repos/asf/hive/blob/e26b9325/ql/src/test/results/clientpositive/auto_join_stats2.q.out ---------------------------------------------------------------------- diff --git a/ql/src/test/results/clientpositive/auto_join_stats2.q.out b/ql/src/test/results/clientpositive/auto_join_stats2.q.out index 1a3caa6..dc2a929 100644 --- a/ql/src/test/results/clientpositive/auto_join_stats2.q.out +++ b/ql/src/test/results/clientpositive/auto_join_stats2.q.out @@ -53,14 +53,14 @@ STAGE PLANS: $hdt$_2:smalltable TableScan alias: smalltable - Statistics: Num rows: 1 Data size: 30 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 300 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: key is not null (type: boolean) - Statistics: Num rows: 1 Data size: 30 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 300 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: key (type: string) outputColumnNames: _col0 - Statistics: Num rows: 1 Data size: 30 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 300 Basic stats: COMPLETE Column stats: NONE HashTable Sink Operator keys: 0 @@ -86,7 +86,7 @@ STAGE PLANS: 
0 1 outputColumnNames: _col0, _col1 - Statistics: Num rows: 500 Data size: 20812 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 500 Data size: 155812 Basic stats: COMPLETE Column stats: NONE Map Join Operator condition map: Inner Join 0 to 1 @@ -94,17 +94,17 @@ STAGE PLANS: 0 _col0 (type: string) 1 _col0 (type: string) outputColumnNames: _col0, _col1, _col2 - Statistics: Num rows: 550 Data size: 22893 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 550 Data size: 171393 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: ((UDFToDouble(_col2) + UDFToDouble(_col0)) = UDFToDouble(_col1)) (type: boolean) - Statistics: Num rows: 275 Data size: 11446 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 275 Data size: 85696 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: _col2 (type: string), _col0 (type: string), _col1 (type: string) outputColumnNames: _col0, _col1, _col2 - Statistics: Num rows: 275 Data size: 11446 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 275 Data size: 85696 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false - Statistics: Num rows: 275 Data size: 11446 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 275 Data size: 85696 Basic stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat @@ -212,14 +212,14 @@ STAGE PLANS: $hdt$_3:smalltable2 TableScan alias: smalltable2 - Statistics: Num rows: 1 Data size: 30 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 300 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: key is not null (type: boolean) - Statistics: Num rows: 1 Data size: 30 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 1 Data size: 300 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: key (type: 
string)
                      outputColumnNames: _col0
-                     Statistics: Num rows: 1 Data size: 30 Basic stats: COMPLETE Column stats: NONE
+                     Statistics: Num rows: 1 Data size: 300 Basic stats: COMPLETE Column stats: NONE
                      HashTable Sink Operator
                        keys:
                          0 (UDFToDouble(_col0) + UDFToDouble(_col1)) (type: double)

http://git-wip-us.apache.org/repos/asf/hive/blob/e26b9325/ql/src/test/results/clientpositive/auto_sortmerge_join_12.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/auto_sortmerge_join_12.q.out b/ql/src/test/results/clientpositive/auto_sortmerge_join_12.q.out
index 1ed3dd0..010f05d 100644
--- a/ql/src/test/results/clientpositive/auto_sortmerge_join_12.q.out
+++ b/ql/src/test/results/clientpositive/auto_sortmerge_join_12.q.out
@@ -667,10 +667,10 @@ STAGE PLANS:
         $hdt$_3:d
           TableScan
             alias: d
-            Statistics: Num rows: 1 Data size: 170 Basic stats: PARTIAL Column stats: NONE
+            Statistics: Num rows: 1 Data size: 1700 Basic stats: PARTIAL Column stats: NONE
             GatherStats: false
             Select Operator
-              Statistics: Num rows: 1 Data size: 170 Basic stats: PARTIAL Column stats: NONE
+              Statistics: Num rows: 1 Data size: 1700 Basic stats: PARTIAL Column stats: NONE
               HashTable Sink Operator
                 keys:
                   0
@@ -689,7 +689,7 @@ STAGE PLANS:
                   0
                   1
                 Position of Big Table: 0
-                Statistics: Num rows: 255 Data size: 69177 Basic stats: PARTIAL Column stats: NONE
+                Statistics: Num rows: 255 Data size: 459327 Basic stats: PARTIAL Column stats: NONE
                 Group By Operator
                   aggregations: count()
                   mode: hash
@@ -882,10 +882,10 @@ STAGE PLANS:
       Map Operator Tree:
           TableScan
             alias: d
-            Statistics: Num rows: 1 Data size: 170 Basic stats: PARTIAL Column stats: NONE
+            Statistics: Num rows: 1 Data size: 1700 Basic stats: PARTIAL Column stats: NONE
             GatherStats: false
             Select Operator
-              Statistics: Num rows: 1 Data size: 170 Basic stats: PARTIAL Column stats: NONE
+              Statistics: Num rows: 1 Data size: 1700 Basic stats: PARTIAL Column stats: NONE
              Map Join Operator
                condition map:
                     Inner Join 0 to 1
@@ -893,7 +893,7 @@ STAGE PLANS:
                  0
                  1
                Position of Big Table: 1
-               Statistics: Num rows: 255 Data size: 69177 Basic stats: PARTIAL Column stats: NONE
+               Statistics: Num rows: 255 Data size: 459327 Basic stats: PARTIAL Column stats: NONE
                Group By Operator
                  aggregations: count()
                  mode: hash
@@ -1009,14 +1009,14 @@ STAGE PLANS:
             auto parallelism: false
           TableScan
             alias: d
-            Statistics: Num rows: 1 Data size: 170 Basic stats: PARTIAL Column stats: NONE
+            Statistics: Num rows: 1 Data size: 1700 Basic stats: PARTIAL Column stats: NONE
             GatherStats: false
             Select Operator
-              Statistics: Num rows: 1 Data size: 170 Basic stats: PARTIAL Column stats: NONE
+              Statistics: Num rows: 1 Data size: 1700 Basic stats: PARTIAL Column stats: NONE
               Reduce Output Operator
                 null sort order:
                 sort order:
-                Statistics: Num rows: 1 Data size: 170 Basic stats: PARTIAL Column stats: NONE
+                Statistics: Num rows: 1 Data size: 1700 Basic stats: PARTIAL Column stats: NONE
                 tag: 1
                 auto parallelism: false
       Path -> Alias:
@@ -1104,7 +1104,7 @@ STAGE PLANS:
               keys:
                 0
                 1
-              Statistics: Num rows: 255 Data size: 69177 Basic stats: PARTIAL Column stats: NONE
+              Statistics: Num rows: 255 Data size: 459327 Basic stats: PARTIAL Column stats: NONE
               Group By Operator
                 aggregations: count()
                 mode: hash

http://git-wip-us.apache.org/repos/asf/hive/blob/e26b9325/ql/src/test/results/clientpositive/auto_sortmerge_join_5.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/auto_sortmerge_join_5.q.out b/ql/src/test/results/clientpositive/auto_sortmerge_join_5.q.out
index 0e6bbf1..cacc3d4 100644
--- a/ql/src/test/results/clientpositive/auto_sortmerge_join_5.q.out
+++ b/ql/src/test/results/clientpositive/auto_sortmerge_join_5.q.out
@@ -76,16 +76,16 @@ STAGE PLANS:
       Map Operator Tree:
           TableScan
             alias: b
-            Statistics: Num rows: 1 Data size: 2750 Basic stats: COMPLETE Column stats: NONE
+            Statistics: Num rows: 1 Data size: 27500 Basic stats: COMPLETE Column stats: NONE
             GatherStats: false
             Filter Operator
               isSamplingPred: false
               predicate: key is not null (type: boolean)
-              Statistics: Num rows: 1 Data size: 2750 Basic stats: COMPLETE Column stats: NONE
+              Statistics: Num rows: 1 Data size: 27500 Basic stats: COMPLETE Column stats: NONE
               Select Operator
                 expressions: key (type: string)
                 outputColumnNames: _col0
-                Statistics: Num rows: 1 Data size: 2750 Basic stats: COMPLETE Column stats: NONE
+                Statistics: Num rows: 1 Data size: 27500 Basic stats: COMPLETE Column stats: NONE
                 Sorted Merge Bucket Map Join Operator
                   condition map:
                        Inner Join 0 to 1
@@ -216,16 +216,16 @@ STAGE PLANS:
       Map Operator Tree:
           TableScan
             alias: a
-            Statistics: Num rows: 1 Data size: 2750 Basic stats: COMPLETE Column stats: NONE
+            Statistics: Num rows: 1 Data size: 27500 Basic stats: COMPLETE Column stats: NONE
             GatherStats: false
             Filter Operator
               isSamplingPred: false
               predicate: key is not null (type: boolean)
-              Statistics: Num rows: 1 Data size: 2750 Basic stats: COMPLETE Column stats: NONE
+              Statistics: Num rows: 1 Data size: 27500 Basic stats: COMPLETE Column stats: NONE
               Select Operator
                 expressions: key (type: string)
                 outputColumnNames: _col0
-                Statistics: Num rows: 1 Data size: 2750 Basic stats: COMPLETE Column stats: NONE
+                Statistics: Num rows: 1 Data size: 27500 Basic stats: COMPLETE Column stats: NONE
                 Sorted Merge Bucket Map Join Operator
                   condition map:
                        Inner Join 0 to 1
@@ -369,16 +369,16 @@ STAGE PLANS:
         $hdt$_1:b
           TableScan
             alias: b
-            Statistics: Num rows: 1 Data size: 226 Basic stats: COMPLETE Column stats: NONE
+            Statistics: Num rows: 1 Data size: 2260 Basic stats: COMPLETE Column stats: NONE
             GatherStats: false
             Filter Operator
               isSamplingPred: false
               predicate: key is not null (type: boolean)
-              Statistics: Num rows: 1 Data size: 226 Basic stats: COMPLETE Column stats: NONE
+              Statistics: Num rows: 1 Data size: 2260 Basic stats: COMPLETE Column stats: NONE
               Select Operator
                 expressions: key (type: string)
                 outputColumnNames: _col0
-                Statistics: Num rows: 1 Data size: 226 Basic stats: COMPLETE Column stats: NONE
+                Statistics: Num rows: 1 Data size: 2260 Basic stats: COMPLETE Column stats: NONE
                 HashTable Sink Operator
                   keys:
                     0 _col0 (type: string)
@@ -390,16 +390,16 @@ STAGE PLANS:
       Map Operator Tree:
           TableScan
             alias: a
-            Statistics: Num rows: 1 Data size: 2750 Basic stats: COMPLETE Column stats: NONE
+            Statistics: Num rows: 1 Data size: 27500 Basic stats: COMPLETE Column stats: NONE
             GatherStats: false
             Filter Operator
               isSamplingPred: false
               predicate: key is not null (type: boolean)
-              Statistics: Num rows: 1 Data size: 2750 Basic stats: COMPLETE Column stats: NONE
+              Statistics: Num rows: 1 Data size: 27500 Basic stats: COMPLETE Column stats: NONE
               Select Operator
                 expressions: key (type: string)
                 outputColumnNames: _col0
-                Statistics: Num rows: 1 Data size: 2750 Basic stats: COMPLETE Column stats: NONE
+                Statistics: Num rows: 1 Data size: 27500 Basic stats: COMPLETE Column stats: NONE
                 Map Join Operator
                   condition map:
                        Inner Join 0 to 1
@@ -551,16 +551,16 @@ STAGE PLANS:
         $hdt$_0:a
           TableScan
             alias: a
-            Statistics: Num rows: 1 Data size: 2750 Basic stats: COMPLETE Column stats: NONE
+            Statistics: Num rows: 1 Data size: 27500 Basic stats: COMPLETE Column stats: NONE
             GatherStats: false
             Filter Operator
               isSamplingPred: false
               predicate: key is not null (type: boolean)
-              Statistics: Num rows: 1 Data size: 2750 Basic stats: COMPLETE Column stats: NONE
+              Statistics: Num rows: 1 Data size: 27500 Basic stats: COMPLETE Column stats: NONE
               Select Operator
                 expressions: key (type: string)
                 outputColumnNames: _col0
-                Statistics: Num rows: 1 Data size: 2750 Basic stats: COMPLETE Column stats: NONE
+                Statistics: Num rows: 1 Data size: 27500 Basic stats: COMPLETE Column stats: NONE
                 HashTable Sink Operator
                   keys:
                     0 _col0 (type: string)
@@ -572,16 +572,16 @@ STAGE PLANS:
       Map Operator Tree:
           TableScan
             alias: b
-            Statistics: Num rows: 1 Data size: 226 Basic stats: COMPLETE Column stats: NONE
+            Statistics: Num rows: 1 Data size: 2260 Basic stats: COMPLETE Column stats: NONE
             GatherStats: false
             Filter Operator
               isSamplingPred: false
               predicate: key is not null (type: boolean)
-              Statistics: Num rows: 1 Data size: 226 Basic stats: COMPLETE Column stats: NONE
+              Statistics: Num rows: 1 Data size: 2260 Basic stats: COMPLETE Column stats: NONE
               Select Operator
                 expressions: key (type: string)
                 outputColumnNames: _col0
-                Statistics: Num rows: 1 Data size: 226 Basic stats: COMPLETE Column stats: NONE
+                Statistics: Num rows: 1 Data size: 2260 Basic stats: COMPLETE Column stats: NONE
                 Map Join Operator
                   condition map:
                        Inner Join 0 to 1
@@ -728,16 +728,16 @@ STAGE PLANS:
       Map Operator Tree:
           TableScan
             alias: a
-            Statistics: Num rows: 1 Data size: 2750 Basic stats: COMPLETE Column stats: NONE
+            Statistics: Num rows: 1 Data size: 27500 Basic stats: COMPLETE Column stats: NONE
             GatherStats: false
             Filter Operator
               isSamplingPred: false
               predicate: key is not null (type: boolean)
-              Statistics: Num rows: 1 Data size: 2750 Basic stats: COMPLETE Column stats: NONE
+              Statistics: Num rows: 1 Data size: 27500 Basic stats: COMPLETE Column stats: NONE
              Select Operator
                expressions: key (type: string)
                outputColumnNames: _col0
-               Statistics: Num rows: 1 Data size: 2750 Basic stats: COMPLETE Column stats: NONE
+               Statistics: Num rows: 1 Data size: 27500 Basic stats: COMPLETE Column stats: NONE
                Sorted Merge Bucket Map Join Operator
                  condition map:
                       Inner Join 0 to 1
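The pattern across every updated baseline above is that the raw on-disk data size feeding the estimator is scaled up by a factor of 10 (e.g. 2750 -> 27500, 226 -> 2260, 170 -> 1700), so a row-count estimate derived from data size no longer systematically underestimates. The following is a minimal illustrative sketch of that idea, not Hive's actual estimator code; the method name `estimateRows`, the average row size of 100 bytes, and treating the scale factor as a tunable knob are all assumptions made for this example.

```java
// Illustrative sketch (hypothetical, not Hive source): estimating row count
// from a file's raw data size. Serialized on-disk bytes understate the
// in-memory ("deserialized") size, so dividing raw bytes by the average row
// width underestimates rows; scaling the raw size first compensates.
public class RowEstimateSketch {

    // rawDataSize: serialized bytes on disk (as seen in the TableScan stats)
    // avgRowSize:  assumed average in-memory row width in bytes
    // deserFactor: multiplier applied to the raw size before estimating;
    //              the factor of 10 matches the before/after sizes in this diff
    static long estimateRows(long rawDataSize, long avgRowSize, double deserFactor) {
        long deserialized = (long) (rawDataSize * deserFactor);
        return Math.max(1, deserialized / Math.max(1, avgRowSize));
    }

    public static void main(String[] args) {
        long rawSize = 2750;   // on-disk size from the TableScans above
        long avgRowSize = 100; // hypothetical average row width

        // Old behavior: estimate directly from serialized size.
        System.out.println(estimateRows(rawSize, avgRowSize, 1.0));  // prints 27

        // New behavior: scale the size basis by 10 first.
        System.out.println(estimateRows(rawSize, avgRowSize, 10.0)); // prints 275
    }
}
```

Note that in the plans above `Num rows` often stays at 1 because these tiny test tables still round down to a single row; what the commit visibly changes in the baselines is the `Data size` basis that downstream operators (joins, group-bys) inherit for their own estimates.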