[jira] [Commented] (HIVE-11890) Create ORC module
[ https://issues.apache.org/jira/browse/HIVE-11890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15054814#comment-15054814 ] Owen O'Malley commented on HIVE-11890: -- Yeah, I still have the Reader & RecordReader to move. > Create ORC module > - > > Key: HIVE-11890 > URL: https://issues.apache.org/jira/browse/HIVE-11890 > Project: Hive > Issue Type: Sub-task > Components: ORC >Reporter: Owen O'Malley >Assignee: Owen O'Malley > Fix For: 2.0.0 > > Attachments: HIVE-11890.patch, HIVE-11890.patch, HIVE-11890.patch, > HIVE-11890.patch, HIVE-11890.patch, HIVE-11890.patch, HIVE-11890.patch, > HIVE-11890.patch > > > Start moving classes over to the ORC module. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12055) Create row-by-row shims for the write path
[ https://issues.apache.org/jira/browse/HIVE-12055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15054792#comment-15054792 ] Hive QA commented on HIVE-12055: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12777322/HIVE-12055.patch {color:green}SUCCESS:{color} +1 due to 6 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 19 failed/errored test(s), 9895 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union9 org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_merge10 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_merge11 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testFetchingPartitionsWithDifferentSchemas org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarDataNucleusUnCaching org.apache.hive.jdbc.TestSSL.testSSLVersion org.apache.hive.spark.client.TestSparkClient.testAddJarsAndFiles org.apache.hive.spark.client.TestSparkClient.testCounters org.apache.hive.spark.client.TestSparkClient.testErrorJob org.apache.hive.spark.client.TestSparkClient.testJobSubmission org.apache.hive.spark.client.TestSparkClient.testMetricsCollection org.apache.hive.spark.client.TestSparkClient.testRemoteClient org.apache.hive.spark.client.TestSparkClient.testSimpleSparkJob org.apache.hive.spark.client.TestSparkClient.testSyncRpc {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6338/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6338/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6338/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 19 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12777322 - PreCommit-HIVE-TRUNK-Build > Create row-by-row shims for the write path > --- > > Key: HIVE-12055 > URL: https://issues.apache.org/jira/browse/HIVE-12055 > Project: Hive > Issue Type: Sub-task > Components: ORC, Shims >Reporter: Owen O'Malley >Assignee: Owen O'Malley > Attachments: HIVE-12055.patch, HIVE-12055.patch, HIVE-12055.patch, > HIVE-12055.patch, HIVE-12055.patch, HIVE-12055.patch, HIVE-12055.patch > > > As part of removing the row-by-row writer, we'll need to shim out the higher > level API (OrcSerde and OrcOutputFormat) so that we maintain backwards > compatibility. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12656) Turn hive.compute.query.using.stats on by default
[ https://issues.apache.org/jira/browse/HIVE-12656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15054774#comment-15054774 ] Hive QA commented on HIVE-12656: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12777306/HIVE-12656.01.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 111 failed/errored test(s), 9865 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file TestMiniTezCliDriver-vector_decimal_round.q-cbo_windowing.q-tez_schema_evolution.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_merge_2_orc org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_merge_orc org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_coltype org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_select org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_udf_udaf org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_udf_udaf org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_combine2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynpart_sort_opt_vectorization org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynpart_sort_optimization org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynpart_sort_optimization_acid org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_escape1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_escape2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_explain_dependency2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_fold_case org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_orig_table org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_values_orig_table org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_6 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_7 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_8 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_9 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_merge_dynamic_partition4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_merge_dynamic_partition5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_merge1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_merge10 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_merge9 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_boolexpr org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_coltype_literals org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_date org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_decode_name org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_special_char org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_timestamp org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_varchar1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_plan_json org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_constant_where org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_rename_partition_location org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_select_unquote_and org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_select_unquote_not org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_select_unquote_or org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_18 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_19 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_20 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_noscan_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_u
[jira] [Updated] (HIVE-11775) Implement limit push down through union all in CBO
[ https://issues.apache.org/jira/browse/HIVE-11775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-11775: --- Attachment: HIVE-11775.07.patch > Implement limit push down through union all in CBO > -- > > Key: HIVE-11775 > URL: https://issues.apache.org/jira/browse/HIVE-11775 > Project: Hive > Issue Type: New Feature >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-11775.01.patch, HIVE-11775.02.patch, > HIVE-11775.03.patch, HIVE-11775.04.patch, HIVE-11775.05.patch, > HIVE-11775.06.patch, HIVE-11775.07.patch > > > Enlightened by HIVE-11684 (Kudos to [~jcamachorodriguez]), we can actually > push limit down through union all, which reduces the intermediate number of > rows in union branches. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
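For context on the rewrite being proposed, a hedged HiveQL sketch of how a limit can be pushed through UNION ALL (table names and the limit value are illustrative, not taken from the patch):
{code}
-- Before: both branches are fully materialized before the limit is applied.
SELECT key FROM src_a
UNION ALL
SELECT key FROM src_b
LIMIT 10;

-- After (conceptually): the outer query keeps at most 10 rows, so each branch
-- only needs to produce at most 10 rows; the intermediate result shrinks from
-- |src_a| + |src_b| rows to at most 20.
SELECT * FROM (
  SELECT key FROM (SELECT key FROM src_a LIMIT 10) a
  UNION ALL
  SELECT key FROM (SELECT key FROM src_b LIMIT 10) b
) u
LIMIT 10;
{code}
The equivalence holds because a LIMIT without ORDER BY may return any subset of the requested size; the CBO rule performs the same transformation on the Calcite plan rather than on the SQL text.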
[jira] [Updated] (HIVE-12055) Create row-by-row shims for the write path
[ https://issues.apache.org/jira/browse/HIVE-12055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HIVE-12055: - Attachment: HIVE-12055.patch Reloaded for jenkins. > Create row-by-row shims for the write path > --- > > Key: HIVE-12055 > URL: https://issues.apache.org/jira/browse/HIVE-12055 > Project: Hive > Issue Type: Sub-task > Components: ORC, Shims >Reporter: Owen O'Malley >Assignee: Owen O'Malley > Attachments: HIVE-12055.patch, HIVE-12055.patch, HIVE-12055.patch, > HIVE-12055.patch, HIVE-12055.patch, HIVE-12055.patch, HIVE-12055.patch > > > As part of removing the row-by-row writer, we'll need to shim out the higher > level API (OrcSerde and OrcOutputFormat) so that we maintain backwards > compatibility. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12661) StatsSetupConst.COLUMN_STATS_ACCURATE is not used correctly
[ https://issues.apache.org/jira/browse/HIVE-12661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15054732#comment-15054732 ] Hive QA commented on HIVE-12661: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12777298/HIVE-12661.01.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 1322 failed/errored test(s), 9881 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file TestMiniTezCliDriver-update_orig_table.q-mapreduce2.q-load_dyn_part3.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alias_casted_column org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_allcolref_in_udf org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_file_format org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_merge_stats_orc org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_numbuckets_partitioned_table2_h23 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_numbuckets_partitioned_table_h23 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_clusterby_sortby org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_coltype org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_skewed_table org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_table_not_sorted org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_table_serde2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ambiguitycheck org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ambiguous_col org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_analyze_table_null_partition org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_deep_filters org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_filter org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_groupby org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_groupby2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_join org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_join_pkfk org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_limit org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_select org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_table org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_union org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ansi_sql_arithmetic org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join0 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join10 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join11 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join12 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join13 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join14 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join15 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join16 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join17 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join18 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join18_multi_distinct org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join19 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join20 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join21 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join22 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join23 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join26 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join27 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join28 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join29 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join30 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join31 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join33 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join6 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join7 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join8 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join9 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_reordering_values org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_stats
[jira] [Commented] (HIVE-12590) Repeated UDAFs with literals can produce incorrect result
[ https://issues.apache.org/jira/browse/HIVE-12590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15054663#comment-15054663 ] Hive QA commented on HIVE-12590: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12777291/HIVE-12590.2.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 41 failed/errored test(s), 9895 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join42 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union9 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_mapjoin_reduce org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_dynamic_partition_pruning org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_dynamic_partition_pruning_2 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_hybridgrace_hashjoin_1 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_self_join org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vectorized_dynamic_partition_pruning org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_constprog_partitioner org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynamic_partition_pruning org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynamic_partition_pruning_2 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_1 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_hybridgrace_hashjoin_1 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_join_nullsafe org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_ptf_matchpath org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_self_join org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_coalesce org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_date_1 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_decimal_2 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_decimal_round_2 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_interval_1 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_mapjoin_reduce org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_dynamic_partition_pruning org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_constprog_partitioner org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vector_mapjoin_reduce org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testFetchingPartitionsWithDifferentSchemas org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping org.apache.hive.jdbc.TestSSL.testSSLVersion org.apache.hive.jdbc.miniHS2.TestHs2Metrics.testMetrics org.apache.hive.spark.client.TestSparkClient.testAddJarsAndFiles org.apache.hive.spark.client.TestSparkClient.testCounters org.apache.hive.spark.client.TestSparkClient.testErrorJob org.apache.hive.spark.client.TestSparkClient.testJobSubmission org.apache.hive.spark.client.TestSparkClient.testMetricsCollection 
org.apache.hive.spark.client.TestSparkClient.testRemoteClient org.apache.hive.spark.client.TestSparkClient.testSimpleSparkJob org.apache.hive.spark.client.TestSparkClient.testSyncRpc {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6334/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6334/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6334/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 41 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12777291 - PreCommit-HIVE-TRUNK-Build > Repeated UDAFs with literals can produce incorrect result > - > > Key: HIVE-12590 > URL: https://issues.apache.org/jira/browse/HIVE-12590 > Project: Hive > Issue Type: Bug > Components: Query Planning >Affects Versions: 1.0.1, 1.1.1, 1.2.1, 2.0.0 >Reporter: Laljo John Pullokkaran >Assignee: Ashutosh Chauhan >Priority: Critical
[jira] [Updated] (HIVE-12656) Turn hive.compute.query.using.stats on by default
[ https://issues.apache.org/jira/browse/HIVE-12656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-12656: --- Attachment: HIVE-12656.01.patch > Turn hive.compute.query.using.stats on by default > - > > Key: HIVE-12656 > URL: https://issues.apache.org/jira/browse/HIVE-12656 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-12656.01.patch > > > We now have hive.compute.query.using.stats=false by default. We plan to turn > it on by default so that we can have better performance. We can also set it > to false in some test cases to maintain the original purpose of those tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
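For context, a hedged sketch of what the flag enables (illustrative table name; this describes the documented stats shortcut, not anything specific to the patch):
{code}
set hive.compute.query.using.stats=true;

-- With accurate basic/column statistics, simple aggregates such as COUNT(*),
-- MIN and MAX can be answered directly from metastore statistics, without
-- launching a job:
select count(*) from src;

-- Tests that need to exercise the execution path rather than the stats
-- shortcut can opt out explicitly:
set hive.compute.query.using.stats=false;
{code}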
[jira] [Commented] (HIVE-12662) StackOverflowError in HiveSortJoinReduceRule when limit=0
[ https://issues.apache.org/jira/browse/HIVE-12662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15054616#comment-15054616 ] Hive QA commented on HIVE-12662: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12777282/HIVE-12662.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 17 failed/errored test(s), 9895 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union9 org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testFetchingPartitionsWithDifferentSchemas org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping org.apache.hive.jdbc.TestSSL.testSSLVersion org.apache.hive.jdbc.miniHS2.TestHs2Metrics.testMetrics org.apache.hive.spark.client.TestSparkClient.testAddJarsAndFiles org.apache.hive.spark.client.TestSparkClient.testCounters org.apache.hive.spark.client.TestSparkClient.testErrorJob org.apache.hive.spark.client.TestSparkClient.testJobSubmission org.apache.hive.spark.client.TestSparkClient.testMetricsCollection org.apache.hive.spark.client.TestSparkClient.testRemoteClient org.apache.hive.spark.client.TestSparkClient.testSimpleSparkJob org.apache.hive.spark.client.TestSparkClient.testSyncRpc {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6333/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6333/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6333/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 17 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12777282 - PreCommit-HIVE-TRUNK-Build > StackOverflowError in HiveSortJoinReduceRule when limit=0 > - > > Key: HIVE-12662 > URL: https://issues.apache.org/jira/browse/HIVE-12662 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0, 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-12662.patch > > > L96 of HiveSortJoinReduceRule, you will see > {noformat} > // Finally, if we do not reduce the input size, we bail out > if (RexLiteral.intValue(sortLimit.fetch) > >= RelMetadataQuery.getRowCount(reducedInput)) { > return false; > } > {noformat} > It is using “ RelMetadataQuery.getRowCount” which is always at least 1. This > is the problem that we resolved in CALCITE-987. 
> To confirm this, I just run the q file : > {noformat} > set hive.mapred.mode=nonstrict; > set hive.optimize.limitjointranspose=true; > set hive.optimize.limitjointranspose.reductionpercentage=1f; > set hive.optimize.limitjointranspose.reductiontuples=0; > explain > select * > from src src1 right outer join ( > select * > from src src2 left outer join src src3 > on src2.value = src3.value) src2 > on src1.key = src2.key > limit 0; > {noformat} > And I got > {noformat} > 2015-12-11T10:21:04,435 ERROR [c1efb099-f900-46dc-9f74-97af0944a99d main[]]: > parse.CalcitePlanner (CalcitePlanner.java:genOPTree(301)) - CBO failed, > skipping CBO. > java.lang.RuntimeException: java.lang.StackOverflowError > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner.rethrowCalciteException(CalcitePlanner.java:749) > ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner.getOptimizedAST(CalcitePlanner.java:645) > ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:264) > [hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10076) > [hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:223) > [hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:237) > [hive-exec-2
[jira] [Commented] (HIVE-12478) Improve Hive/Calcite Trasitive Predicate inference
[ https://issues.apache.org/jira/browse/HIVE-12478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15054532#comment-15054532 ] Hive QA commented on HIVE-12478: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12777279/HIVE-12478.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 139 failed/errored test(s), 9880 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_join org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_join_pkfk org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join12 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join16 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_reordering_values org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_without_localtask org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_6 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket_map_join_spark4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_lineage2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer10 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_rdd_cache org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_filter_cond_pushdown org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_position org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables_compact org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_self_join org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join12 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join16 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join19 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join32 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join32_lessSize org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join33 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join42 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_alt_syntax org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_unqual2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_unqual4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_hive_626 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_parse org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_star org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_vc org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_lineage2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_lineage3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_louter_join_ppr org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mapjoin_mapjoin org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mergejoin org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mergejoins org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_multiMapJoin1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_multi_join_union 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_join2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_join3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_outer_join4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_outer_join5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_router_join_ppr org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_skewjoin org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_skewjoin_mapjoin8 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_skewjoin_mapjoin9 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_skewjoinopt13 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_skewjoinopt14 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_in org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_views org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_table_access_keys_stats org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union9 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_mapjoin_reduce org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_mr_diff_schema_alias org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_context org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_nested_mapjoin org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver
[jira] [Updated] (HIVE-12661) StatsSetupConst.COLUMN_STATS_ACCURATE is not used correctly
[ https://issues.apache.org/jira/browse/HIVE-12661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-12661: --- Attachment: HIVE-12661.01.patch > StatsSetupConst.COLUMN_STATS_ACCURATE is not used correctly > --- > > Key: HIVE-12661 > URL: https://issues.apache.org/jira/browse/HIVE-12661 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-12661.01.patch > > > PROBLEM: > Hive stats are autogathered properly till an 'analyze table [tablename] > compute statistics for columns' is run. Then it does not auto-update the > stats till the command is run again. repo: > {code} > set hive.stats.autogather=true; > set hive.stats.atomic=false ; > set hive.stats.collect.rawdatasize=true ; > set hive.stats.collect.scancols=false ; > set hive.stats.collect.tablekeys=false ; > set hive.stats.fetch.column.stats=true; > set hive.stats.fetch.partition.stats=true ; > set hive.stats.reliable=false ; > set hive.compute.query.using.stats=true; > CREATE TABLE `default`.`calendar` (`year` int) ROW FORMAT SERDE > 'org.apache.hadoop.hive.ql.io.orc.OrcSerde' STORED AS INPUTFORMAT > 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' TBLPROPERTIES ( > 'orc.compress'='NONE') ; > insert into calendar values (2010), (2011), (2012); > select * from calendar; > ++--+ > | calendar.year | > ++--+ > | 2010 | > | 2011 | > | 2012 | > ++--+ > select max(year) from calendar; > | 2012 | > insert into calendar values (2013); > select * from calendar; > ++--+ > | calendar.year | > ++--+ > | 2010 | > | 2011 | > | 2012 | > | 2013 | > ++--+ > select max(year) from calendar; > | 2013 | > insert into calendar values (2014); > select max(year) from calendar; > | 2014 | > analyze table calendar compute statistics for columns; > insert into calendar values (2015); > select max(year) from calendar; > | 2014 | > insert into calendar values (2016), (2017), (2018); > select max(year) from calendar; > | 2014 | > analyze table calendar compute statistics for columns; > select max(year) from calendar; > | 2018 | > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12590) Repeated UDAFs with literals can produce incorrect result
[ https://issues.apache.org/jira/browse/HIVE-12590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-12590: Attachment: HIVE-12590.2.patch > Repeated UDAFs with literals can produce incorrect result > - > > Key: HIVE-12590 > URL: https://issues.apache.org/jira/browse/HIVE-12590 > Project: Hive > Issue Type: Bug > Components: Query Planning >Affects Versions: 1.0.1, 1.1.1, 1.2.1, 2.0.0 >Reporter: Laljo John Pullokkaran >Assignee: Ashutosh Chauhan >Priority: Critical > Attachments: HIVE-12590.2.patch, HIVE-12590.patch > > > Repeated UDAF with literals could produce wrong result. > This is not a common use case, nevertheless a bug. > hive> select max('pants'), max('pANTS') from t1 group by key; > Total MapReduce CPU Time Spent: 0 msec > OK > pANTS pANTS > pANTS pANTS > pANTS pANTS > pANTS pANTS > pANTS pANTS > Time taken: 296.252 seconds, Fetched: 5 row(s) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
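For clarity, the expected result for the query quoted above (my reading of the bug, not stated in the issue) is that the two distinct literals stay distinct; in the reported output both aggregates return the same value:
{noformat}
expected (per group):  pants   pANTS
observed (per group):  pANTS   pANTS
{noformat}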
[jira] [Commented] (HIVE-12643) For self describing InputFormat don't replicate schema information in partitions
[ https://issues.apache.org/jira/browse/HIVE-12643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15054367#comment-15054367 ] Hive QA commented on HIVE-12643: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12777237/HIVE-12643.2.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 17 failed/errored test(s), 9896 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union9 org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic org.apache.hadoop.hive.cli.TestMiniTezCliDriver.org.apache.hadoop.hive.cli.TestMiniTezCliDriver org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_dynamic_partition_pruning org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping org.apache.hive.jdbc.TestSSL.testSSLVersion org.apache.hive.spark.client.TestSparkClient.testAddJarsAndFiles org.apache.hive.spark.client.TestSparkClient.testCounters org.apache.hive.spark.client.TestSparkClient.testErrorJob org.apache.hive.spark.client.TestSparkClient.testJobSubmission org.apache.hive.spark.client.TestSparkClient.testMetricsCollection org.apache.hive.spark.client.TestSparkClient.testRemoteClient org.apache.hive.spark.client.TestSparkClient.testSimpleSparkJob org.apache.hive.spark.client.TestSparkClient.testSyncRpc {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6331/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6331/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6331/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 17 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12777237 - PreCommit-HIVE-TRUNK-Build > For self describing InputFormat don't replicate schema information in > partitions > > > Key: HIVE-12643 > URL: https://issues.apache.org/jira/browse/HIVE-12643 > Project: Hive > Issue Type: Bug > Components: Query Planning >Affects Versions: 2.0.0 >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Attachments: HIVE-12643.1.patch, HIVE-12643.2.patch, HIVE-12643.patch > > > Since self describing Input Formats don't use individual partition schemas > for schema resolution, there is no need to send that info to tasks. > Doing this should cut down plan size. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12662) StackOverflowError in HiveSortJoinReduceRule when limit=0
[ https://issues.apache.org/jira/browse/HIVE-12662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15054331#comment-15054331 ] Pengcheng Xiong commented on HIVE-12662: LGTM +1. I think we do not have any better solution so far for Hive until next release of Calcite. > StackOverflowError in HiveSortJoinReduceRule when limit=0 > - > > Key: HIVE-12662 > URL: https://issues.apache.org/jira/browse/HIVE-12662 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0, 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-12662.patch > > > L96 of HiveSortJoinReduceRule, you will see > {noformat} > // Finally, if we do not reduce the input size, we bail out > if (RexLiteral.intValue(sortLimit.fetch) > >= RelMetadataQuery.getRowCount(reducedInput)) { > return false; > } > {noformat} > It is using “ RelMetadataQuery.getRowCount” which is always at least 1. This > is the problem that we resolved in CALCITE-987. > To confirm this, I just run the q file : > {noformat} > set hive.mapred.mode=nonstrict; > set hive.optimize.limitjointranspose=true; > set hive.optimize.limitjointranspose.reductionpercentage=1f; > set hive.optimize.limitjointranspose.reductiontuples=0; > explain > select * > from src src1 right outer join ( > select * > from src src2 left outer join src src3 > on src2.value = src3.value) src2 > on src1.key = src2.key > limit 0; > {noformat} > And I got > {noformat} > 2015-12-11T10:21:04,435 ERROR [c1efb099-f900-46dc-9f74-97af0944a99d main[]]: > parse.CalcitePlanner (CalcitePlanner.java:genOPTree(301)) - CBO failed, > skipping CBO. > java.lang.RuntimeException: java.lang.StackOverflowError > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner.rethrowCalciteException(CalcitePlanner.java:749) > ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner.getOptimizedAST(CalcitePlanner.java:645) > ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:264) > [hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10076) > [hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:223) > [hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:237) > [hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74) > [hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:237) > [hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT] > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:456) > [hive-exec-2.1.0-SNAPSHOT.jar:?] > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:310) > [hive-exec-2.1.0-SNAPSHOT.jar:?] > at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1138) > [hive-exec-2.1.0-SNAPSHOT.jar:?] > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1187) > [hive-exec-2.1.0-SNAPSHOT.jar:?] > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1063) > [hive-exec-2.1.0-SNAPSHOT.jar:?] > {noformat} > via [~pxiong] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
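Spelling out why the quoted check never trips when the query ends in limit 0 (my reading of the description above):
{noformat}
sortLimit.fetch                             = 0
RelMetadataQuery.getRowCount(reducedInput) >= 1   (never below 1, the CALCITE-987 issue)
0 >= 1 is false, so the rule never bails out: it keeps pushing a limit-0 sort
that never reduces the estimated row count, firing on its own output until
planning overflows the stack.
{noformat}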
[jira] [Commented] (HIVE-12541) SymbolicTextInputFormat should supports the path with regex
[ https://issues.apache.org/jira/browse/HIVE-12541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15054319#comment-15054319 ] Hive QA commented on HIVE-12541: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12777227/HIVE-12541.2.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 18 failed/errored test(s), 9895 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_symlink_text_input_format org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union9 org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping org.apache.hive.jdbc.TestSSL.testSSLVersion org.apache.hive.jdbc.miniHS2.TestHs2Metrics.testMetrics org.apache.hive.spark.client.TestSparkClient.testAddJarsAndFiles org.apache.hive.spark.client.TestSparkClient.testCounters org.apache.hive.spark.client.TestSparkClient.testErrorJob org.apache.hive.spark.client.TestSparkClient.testJobSubmission org.apache.hive.spark.client.TestSparkClient.testMetricsCollection org.apache.hive.spark.client.TestSparkClient.testRemoteClient org.apache.hive.spark.client.TestSparkClient.testSimpleSparkJob org.apache.hive.spark.client.TestSparkClient.testSyncRpc {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6330/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6330/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6330/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 18 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12777227 - PreCommit-HIVE-TRUNK-Build > SymbolicTextInputFormat should supports the path with regex > --- > > Key: HIVE-12541 > URL: https://issues.apache.org/jira/browse/HIVE-12541 > Project: Hive > Issue Type: Improvement >Affects Versions: 0.14.0, 1.2.0, 1.2.1 >Reporter: Xiaowei Wang >Assignee: Xiaowei Wang > Attachments: HIVE-12541.1.patch, HIVE-12541.2.patch > > > 1, In fact,SybolicTextInputFormat supports the path with regex .I add some > test sql . > 2, But ,when using CombineHiveInputFormat to combine input files , It cannot > resolve the path with regex ,so it will get a wrong result.I give a example > ,and fix the problem. 
> Table desc : > {noformat} > CREATE External TABLE `symlink_text_input_format`( > `key` string, > `value` string) > ROW FORMAT SERDE > 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' > STORED AS INPUTFORMAT > 'org.apache.hadoop.hive.ql.io.SymlinkTextInputFormat' > OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' > LOCATION > 'viewfs://nsX/user/hive/warehouse/symlink_text_input_format' > {noformat} > There is a link file in the dir > '/user/hive/warehouse/symlink_text_input_format' , the content of the link > file is > {noformat} > viewfs://nsx/tmp/symlink* > {noformat} > it contains one path ,and the path contains a regex! > Execute the sql : > {noformat} > set hive.rework.mapredwork = true ; > set hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat; > set mapred.min.split.size.per.rack= 0 ; > set mapred.min.split.size.per.node= 0 ; > set mapred.max.split.size= 0 ; > select count(*) from symlink_text_input_format ; > {noformat} > It will get a wrong result :0 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12662) StackOverflowError in HiveSortJoinReduceRule when limit=0
[ https://issues.apache.org/jira/browse/HIVE-12662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-12662: --- Attachment: HIVE-12662.patch [~pxiong], could you take a look? Thanks > StackOverflowError in HiveSortJoinReduceRule when limit=0 > - > > Key: HIVE-12662 > URL: https://issues.apache.org/jira/browse/HIVE-12662 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0, 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-12662.patch > > > L96 of HiveSortJoinReduceRule, you will see > {noformat} > // Finally, if we do not reduce the input size, we bail out > if (RexLiteral.intValue(sortLimit.fetch) > >= RelMetadataQuery.getRowCount(reducedInput)) { > return false; > } > {noformat} > It is using “ RelMetadataQuery.getRowCount” which is always at least 1. This > is the problem that we resolved in CALCITE-987. > To confirm this, I just run the q file : > {noformat} > set hive.mapred.mode=nonstrict; > set hive.optimize.limitjointranspose=true; > set hive.optimize.limitjointranspose.reductionpercentage=1f; > set hive.optimize.limitjointranspose.reductiontuples=0; > explain > select * > from src src1 right outer join ( > select * > from src src2 left outer join src src3 > on src2.value = src3.value) src2 > on src1.key = src2.key > limit 0; > {noformat} > And I got > {noformat} > 2015-12-11T10:21:04,435 ERROR [c1efb099-f900-46dc-9f74-97af0944a99d main[]]: > parse.CalcitePlanner (CalcitePlanner.java:genOPTree(301)) - CBO failed, > skipping CBO. > java.lang.RuntimeException: java.lang.StackOverflowError > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner.rethrowCalciteException(CalcitePlanner.java:749) > ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner.getOptimizedAST(CalcitePlanner.java:645) > ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:264) > [hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10076) > [hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:223) > [hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:237) > [hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74) > [hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:237) > [hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT] > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:456) > [hive-exec-2.1.0-SNAPSHOT.jar:?] > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:310) > [hive-exec-2.1.0-SNAPSHOT.jar:?] > at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1138) > [hive-exec-2.1.0-SNAPSHOT.jar:?] > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1187) > [hive-exec-2.1.0-SNAPSHOT.jar:?] > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1063) > [hive-exec-2.1.0-SNAPSHOT.jar:?] > {noformat} > via [~pxiong] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12478) Improve Hive/Calcite Trasitive Predicate inference
[ https://issues.apache.org/jira/browse/HIVE-12478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-12478: --- Attachment: HIVE-12478.patch > Improve Hive/Calcite Trasitive Predicate inference > -- > > Key: HIVE-12478 > URL: https://issues.apache.org/jira/browse/HIVE-12478 > Project: Hive > Issue Type: Bug >Affects Versions: 1.2.1 >Reporter: Laljo John Pullokkaran >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-12478.patch > > > HiveJoinPushTransitivePredicatesRule does not pull up predicates for > transitive inference if they contain more than one column. > EXPLAIN select * from srcpart join (select ds as ds, ds as `date` from > srcpart where (ds = '2008-04-08' and value=1)) s on (srcpart.ds = s.ds); -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12478) Improve Hive/Calcite Trasitive Predicate inference
[ https://issues.apache.org/jira/browse/HIVE-12478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-12478: --- Attachment: (was: HIVE-12478.patch) > Improve Hive/Calcite Trasitive Predicate inference > -- > > Key: HIVE-12478 > URL: https://issues.apache.org/jira/browse/HIVE-12478 > Project: Hive > Issue Type: Bug >Affects Versions: 1.2.1 >Reporter: Laljo John Pullokkaran >Assignee: Jesus Camacho Rodriguez > > HiveJoinPushTransitivePredicatesRule does not pull up predicates for > transitive inference if they contain more than one column. > EXPLAIN select * from srcpart join (select ds as ds, ds as `date` from > srcpart where (ds = '2008-04-08' and value=1)) s on (srcpart.ds = s.ds); -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12478) Improve Hive/Calcite Trasitive Predicate inference
[ https://issues.apache.org/jira/browse/HIVE-12478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15054278#comment-15054278 ] Jesus Camacho Rodriguez commented on HIVE-12478: Initial patch groups PPD and JoinTransitive inference in one single invocation. I have been running some local tests, and we do not fall in any loop. In addition, HiveRuleRegistry has been enhanced in case we want to register predicates that have been pushed. However, the enhancement is not used yet. Further, I think we will not need to use it, as being able to register the operators that we have already visited in a given rule will be sufficient. Another discussion that we had is whether HiveRuleRegistry should register the operators or the digest of the plan. We still register the operators objects. The reason is that if we had registered the digest of a given subplan in the tree, and the same subplan appears somewhere else in the tree, we would not trigger the optimization (and we should trigger it). [~pxiong], could you check if after applying this patch, ReduceExpressions could be plugged in the same invocation as PPD and join transitive inference (step 3 in pre-join optimizations)? > Improve Hive/Calcite Trasitive Predicate inference > -- > > Key: HIVE-12478 > URL: https://issues.apache.org/jira/browse/HIVE-12478 > Project: Hive > Issue Type: Bug >Affects Versions: 1.2.1 >Reporter: Laljo John Pullokkaran >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-12478.patch > > > HiveJoinPushTransitivePredicatesRule does not pull up predicates for > transitive inference if they contain more than one column. > EXPLAIN select * from srcpart join (select ds as ds, ds as `date` from > srcpart where (ds = '2008-04-08' and value=1)) s on (srcpart.ds = s.ds); -- This message was sent by Atlassian JIRA (v6.3.4#6332)
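As a concrete reading of the query in the description, a hedged HiveQL sketch of the inference the improved rule should perform (the rewritten form is illustrative, not the plan produced by the patch):
{code}
-- Join condition:          srcpart.ds = s.ds
-- Filter inside the view:  ds = '2008-04-08' AND value = 1
-- Only the conjunct on the join key can be transferred, but it should be:
-- transitively srcpart.ds = '2008-04-08', which enables partition pruning
-- on the other join input.
EXPLAIN
SELECT *
FROM (SELECT * FROM srcpart WHERE ds = '2008-04-08') sp        -- inferred filter
JOIN (SELECT ds AS ds, ds AS `date`
      FROM srcpart
      WHERE ds = '2008-04-08' AND value = 1) s
  ON sp.ds = s.ds;
{code}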
[jira] [Updated] (HIVE-12478) Improve Hive/Calcite Trasitive Predicate inference
[ https://issues.apache.org/jira/browse/HIVE-12478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-12478: --- Attachment: HIVE-12478.patch > Improve Hive/Calcite Trasitive Predicate inference > -- > > Key: HIVE-12478 > URL: https://issues.apache.org/jira/browse/HIVE-12478 > Project: Hive > Issue Type: Bug >Affects Versions: 1.2.1 >Reporter: Laljo John Pullokkaran >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-12478.patch > > > HiveJoinPushTransitivePredicatesRule does not pull up predicates for > transitive inference if they contain more than one column. > EXPLAIN select * from srcpart join (select ds as ds, ds as `date` from > srcpart where (ds = '2008-04-08' and value=1)) s on (srcpart.ds = s.ds); -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12590) Repeated UDAFs with literals can produce incorrect result
[ https://issues.apache.org/jira/browse/HIVE-12590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15054261#comment-15054261 ] Hive QA commented on HIVE-12590: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12777223/HIVE-12590.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 104 failed/errored test(s), 9895 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_select org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_udf_max org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cluster org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constantPropagateForSubQuery org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constprog2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cp_sel org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynpart_sort_optimization_acid org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_filter_cond_pushdown org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_identity_project_remove_skip org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input23 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input25 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input26 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part10 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_into5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join38 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_unqual1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_unqual3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_nullsafe org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_view org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_pcr org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_pcs org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_pointlookup2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_pointlookup3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_clusterby org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_join4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_outer_join5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_repeated_alias org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_udf_case org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_udf_col org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_union_view org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ptf_matchpath org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_regex_col org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_semijoin2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_10 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_25 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_multiinsert org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_unqual_corr_expr org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_folder_constants 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union27 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union9 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_25 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_top_level org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_view org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_coalesce org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_date_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_round_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_interval_1 org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_dynamic_partition_pruning org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_hybridgrace_hashjoin_1 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_self_join org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vectorized_dynamic_partition_pruning org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_constprog_partitioner org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynamic_partition_pruning org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_hybridgrace_hashjoin_1 org.apache.hadoop.hive.cli.TestM
[jira] [Commented] (HIVE-12644) Support for offset in HiveSortMergeRule
[ https://issues.apache.org/jira/browse/HIVE-12644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15054221#comment-15054221 ] Jesus Camacho Rodriguez commented on HIVE-12644: We need to check specifically for {{topOffset < (bottomOffset + bottomLimit)}}. Observe this setting: topOffset=1, topLimit=3, bottomOffset=1, bottomLimit=2. In this case, given the change that you propose, we would create a limit of 0. However, that is not correct. We need to create a single SortLimit operator containing offset=2, limit=1. > Support for offset in HiveSortMergeRule > --- > > Key: HIVE-12644 > URL: https://issues.apache.org/jira/browse/HIVE-12644 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-12644.patch > > > After HIVE-11531 goes in, HiveSortMergeRule needs to be extended to support > offset properly when it merges operators that contain Limit. Otherwise, limit > pushdown through outer join optimization (introduced in HIVE-11684) will not > work properly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
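A minimal sketch of the merge arithmetic discussed in the comment above, assuming each SortLimit first skips {{offset}} rows and then emits at most {{limit}} rows (plain Java for illustration; the class and method names are hypothetical and this is not the HiveSortMergeRule code):
{code}
// Hypothetical sketch of merging top(offset,limit) over bottom(offset,limit)
// into a single SortLimit window; not the actual HiveSortMergeRule code.
public class SortLimitMergeSketch {

  /** Returns {mergedOffset, mergedLimit}. */
  static int[] merge(int topOffset, int topLimit, int bottomOffset, int bottomLimit) {
    // Rows of the bottom window that survive the top offset.
    int remaining = Math.max(bottomLimit - topOffset, 0);
    int mergedOffset = bottomOffset + topOffset;
    int mergedLimit = Math.min(topLimit, remaining);
    return new int[] { mergedOffset, mergedLimit };
  }

  public static void main(String[] args) {
    // Example from the comment: topOffset=1, topLimit=3, bottomOffset=1, bottomLimit=2.
    int[] m = merge(1, 3, 1, 2);
    System.out.println("offset=" + m[0] + ", limit=" + m[1]); // offset=2, limit=1
  }
}
{code}
For the setting above this reproduces offset=2, limit=1 rather than a limit of 0, which is why the merge has to be guarded on the relationship between the top offset and the bottom window.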
[jira] [Commented] (HIVE-12633) LLAP: package included serde jars
[ https://issues.apache.org/jira/browse/HIVE-12633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15054205#comment-15054205 ] Hive QA commented on HIVE-12633: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12777219/HIVE-12633.02.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 21 failed/errored test(s), 9880 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_udf_max org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union9 org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_custom_key2 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_dynamic_partition_pruning org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vectorized_dynamic_partition_pruning org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_mergejoin org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping org.apache.hive.jdbc.TestSSL.testSSLVersion org.apache.hive.spark.client.TestSparkClient.testAddJarsAndFiles org.apache.hive.spark.client.TestSparkClient.testCounters org.apache.hive.spark.client.TestSparkClient.testErrorJob org.apache.hive.spark.client.TestSparkClient.testJobSubmission org.apache.hive.spark.client.TestSparkClient.testMetricsCollection org.apache.hive.spark.client.TestSparkClient.testRemoteClient org.apache.hive.spark.client.TestSparkClient.testSimpleSparkJob org.apache.hive.spark.client.TestSparkClient.testSyncRpc {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6328/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6328/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6328/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 21 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12777219 - PreCommit-HIVE-TRUNK-Build > LLAP: package included serde jars > - > > Key: HIVE-12633 > URL: https://issues.apache.org/jira/browse/HIVE-12633 > Project: Hive > Issue Type: Bug >Reporter: Takahiko Saito >Assignee: Sergey Shelukhin > Attachments: HIVE-12633.01.patch, HIVE-12633.02.patch, > HIVE-12633.patch > > > Some SerDes like JSONSerde are not packaged with LLAP. One cannot localize > jars on the daemon (due to security consideration if nothing else), so we > should package them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12473) DPP: UDFs on the partition column side does not evaluate correctly
[ https://issues.apache.org/jira/browse/HIVE-12473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15054188#comment-15054188 ] Gopal V commented on HIVE-12473: bq. The converter should work off of the final type; it needs to convert the string partition value to the output of the evaluator. What it was doing before the fix was {{cast(dt as int)}}, which is wrong - it should do {{cast(dt as date)}}, even if the final UDF type is an int due to {{year(dt)}}. > DPP: UDFs on the partition column side does not evaluate correctly > -- > > Key: HIVE-12473 > URL: https://issues.apache.org/jira/browse/HIVE-12473 > Project: Hive > Issue Type: Bug > Components: Tez >Affects Versions: 1.3.0, 1.2.1, 2.0.0 >Reporter: Gopal V >Assignee: Sergey Shelukhin >Priority: Blocker > Fix For: 1.3.0, 2.0.0 > > Attachments: HIVE-12473.patch > > > Related to HIVE-12462 > {code} > select count(1) from accounts a, transactions t where year(a.dt) = year(t.dt) > and account_id = 22; > $hdt$_0:$hdt$_1:a > TableScan (TS_2) > alias: a > filterExpr: (((account_id = 22) and year(dt) is not null) and (year(dt)) > IN (RS[6])) (type: boolean) > {code} > Ends up being evaluated as {{year(cast(dt as int))}} because the pruner only > checks for final type, not the column type. > {code} > ObjectInspector oi = > > PrimitiveObjectInspectorFactory.getPrimitiveWritableObjectInspector(TypeInfoFactory > .getPrimitiveTypeInfo(si.fieldInspector.getTypeName())); > Converter converter = > ObjectInspectorConverters.getConverter( > PrimitiveObjectInspectorFactory.javaStringObjectInspector, oi); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
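To make the type distinction concrete, here is a small, hypothetical illustration built on the same ObjectInspector APIs shown in the snippet above (it is not the pruner code from the patch, and it omits the subsequent {{year()}} evaluation):
{code}
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters.Converter;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
import org.apache.hadoop.hive.serde2.typeinfo.TypeInfoFactory;

// Hypothetical illustration (not the HIVE-12473 patch): the partition value is
// a string and must first be converted to the partition COLUMN type (date);
// only then should the UDF (year) be evaluated to reach the final int type.
public class PartitionValueConversionSketch {
  public static void main(String[] args) {
    String partitionValue = "2015-12-12"; // raw partition value, always a string

    // Correct direction: convert keyed on the column type -> cast(dt as date).
    ObjectInspector dateOI = PrimitiveObjectInspectorFactory
        .getPrimitiveWritableObjectInspector(TypeInfoFactory.getPrimitiveTypeInfo("date"));
    Converter toDate = ObjectInspectorConverters.getConverter(
        PrimitiveObjectInspectorFactory.javaStringObjectInspector, dateOI);
    System.out.println("as date: " + toDate.convert(partitionValue)); // a usable date writable

    // Pre-fix behaviour: convert keyed on the final expression type -> cast(dt as int),
    // so the date string never reaches year() as a usable value.
    ObjectInspector intOI = PrimitiveObjectInspectorFactory
        .getPrimitiveWritableObjectInspector(TypeInfoFactory.getPrimitiveTypeInfo("int"));
    Converter toInt = ObjectInspectorConverters.getConverter(
        PrimitiveObjectInspectorFactory.javaStringObjectInspector, intOI);
    System.out.println("as int: " + toInt.convert(partitionValue)); // not a usable value
  }
}
{code}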
[jira] [Commented] (HIVE-11634) Support partition pruning for IN(STRUCT(partcol, nonpartcol..)...)
[ https://issues.apache.org/jira/browse/HIVE-11634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15054165#comment-15054165 ] Gunther Hagleitner commented on HIVE-11634: --- FYI it's these changes that mean dpp is no longer working: {noformat} -Select Operator - expressions: UDFToDouble(UDFToInteger((hr / 2))) (type: double) - outputColumnNames: _col0 - Statistics: Num rows: 1 Data size: 7 Basic stats: COMPLETE Column stats: NONE - Group By Operator -keys: _col0 (type: double) -mode: hash -outputColumnNames: _col0 -Statistics: Num rows: 1 Data size: 7 Basic stats: COMPLETE Column stats: NONE -Dynamic Partitioning Event Operator - Target Input: srcpart - Partition key expr: UDFToDouble(hr) - Statistics: Num rows: 1 Data size: 7 Basic stats: COMPLETE Column stats: NONE - Target column: hr - Target Vertex: Map 1 {noformat} > Support partition pruning for IN(STRUCT(partcol, nonpartcol..)...) > -- > > Key: HIVE-11634 > URL: https://issues.apache.org/jira/browse/HIVE-11634 > Project: Hive > Issue Type: Bug > Components: CBO >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Labels: TODOC1.3 > Fix For: 1.3.0 > > Attachments: HIVE-11634.1.patch, HIVE-11634.2.patch, > HIVE-11634.3.patch, HIVE-11634.4.patch, HIVE-11634.5.patch, > HIVE-11634.6.patch, HIVE-11634.7.patch, HIVE-11634.8.patch, > HIVE-11634.9.patch, HIVE-11634.91.patch, HIVE-11634.92.patch, > HIVE-11634.93.patch, HIVE-11634.94.patch, HIVE-11634.95.patch, > HIVE-11634.96.patch, HIVE-11634.97.patch, HIVE-11634.98.patch, > HIVE-11634.99.patch, HIVE-11634.990.patch, HIVE-11634.991.patch, > HIVE-11634.992.patch, HIVE-11634.993.patch, HIVE-11634.994.patch, > HIVE-11634.995.patch, HIVE-11634.patch > > > Currently, we do not support partition pruning for the following scenario > {code} > create table pcr_t1 (key int, value string) partitioned by (ds string); > insert overwrite table pcr_t1 partition (ds='2000-04-08') select * from src > where key < 20 order by key; > insert overwrite table pcr_t1 partition (ds='2000-04-09') select * from src > where key < 20 order by key; > insert overwrite table pcr_t1 partition (ds='2000-04-10') select * from src > where key < 20 order by key; > explain extended select ds from pcr_t1 where struct(ds, key) in > (struct('2000-04-08',1), struct('2000-04-09',2)); > {code} > If we run the above query, we see that all the partitions of table pcr_t1 are > present in the filter predicate where as we can prune partition > (ds='2000-04-10'). > The optimization is to rewrite the above query into the following. > {code} > explain extended select ds from pcr_t1 where (struct(ds)) IN > (struct('2000-04-08'), struct('2000-04-09')) and struct(ds, key) in > (struct('2000-04-08',1), struct('2000-04-09',2)); > {code} > The predicate (struct(ds)) IN (struct('2000-04-08'), struct('2000-04-09')) > is used by partition pruner to prune the columns which otherwise will not be > pruned. > This is an extension of the idea presented in HIVE-11573. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
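As a rough, self-contained sketch of the rewrite idea in the issue description (a hypothetical helper, not the Hive optimizer code): projecting the partition columns out of every struct in the IN list yields the partition-only predicate that the pruner can evaluate.
{code}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of the struct-IN rewrite described above (not the
// actual Hive optimizer code): struct(ds, key) IN (('2000-04-08', 1), ('2000-04-09', 2))
// additionally yields struct(ds) IN ('2000-04-08', '2000-04-09') for partition pruning.
public class StructInPruningSketch {

  static Set<List<Object>> partitionOnlyInList(List<List<Object>> structRows,
                                               List<Integer> partColPositions) {
    Set<List<Object>> result = new LinkedHashSet<>();
    for (List<Object> row : structRows) {
      List<Object> projected = new ArrayList<>();
      for (int pos : partColPositions) {
        projected.add(row.get(pos)); // keep only partition-column values
      }
      result.add(projected);
    }
    return result;
  }

  public static void main(String[] args) {
    List<List<Object>> inList = Arrays.asList(
        Arrays.asList((Object) "2000-04-08", 1),
        Arrays.asList((Object) "2000-04-09", 2));
    // ds is at position 0 of the struct; key is not a partition column.
    System.out.println(partitionOnlyInList(inList, Arrays.asList(0)));
    // -> [[2000-04-08], [2000-04-09]] : partition ds='2000-04-10' can be pruned
  }
}
{code}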
[jira] [Commented] (HIVE-12573) some DPP tests are broken
[ https://issues.apache.org/jira/browse/HIVE-12573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15054161#comment-15054161 ] Gunther Hagleitner commented on HIVE-12573: --- I don't believe the effect is only cosmetic. Having an un-executable synthetic predicate left over in the TS operator breaks anyone trying to implement filter pushdown. However, this was introduced by HIVE-12462, which I don't think is the right fix in the first place. It makes sense to find out what's happening there before proceeding with this. > some DPP tests are broken > - > > Key: HIVE-12573 > URL: https://issues.apache.org/jira/browse/HIVE-12573 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-12573.patch > > > -It looks like LLAP out files were not updated in some DPP JIRA because the > test was entirely broken in HiveQA at the time- actually looks like out files > have explain output with a glitch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12462) DPP: DPP optimizers need to run on the TS predicate not FIL
[ https://issues.apache.org/jira/browse/HIVE-12462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-12462: -- Priority: Blocker (was: Critical) > DPP: DPP optimizers need to run on the TS predicate not FIL > > > Key: HIVE-12462 > URL: https://issues.apache.org/jira/browse/HIVE-12462 > Project: Hive > Issue Type: Bug > Components: Tez >Affects Versions: 2.0.0 >Reporter: Gopal V >Assignee: Gopal V >Priority: Blocker > Fix For: 2.0.0 > > Attachments: HIVE-12462.02.patch, HIVE-12462.1.patch > > > HIVE-11398 + HIVE-11791, the partition-condition-remover became more > effective. > This removes predicates from the FilterExpression which involve partition > columns, causing a miss for dynamic-partition pruning if the DPP relies on > FilterDesc. > The TS desc will have the correct predicate in that condition. > {code} > $hdt$_0:$hdt$_1:a > TableScan (TS_2) > alias: a > filterExpr: (((account_id = 22) and year(dt) is not null) and (year(dt)) > IN (RS[6])) (type: boolean) > Filter Operator (FIL_20) > predicate: ((account_id = 22) and year(dt) is not null) (type: boolean) > Select Operator (SEL_4) > expressions: dt (type: date) > outputColumnNames: _col1 > Reduce Output Operator (RS_8) > key expressions: year(_col1) (type: int) > sort order: + > Map-reduce partition columns: year(_col1) (type: int) > Join Operator (JOIN_9) > condition map: > Inner Join 0 to 1 > keys: > 0 year(_col1) (type: int) > 1 year(_col1) (type: int) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
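As a rough illustration of the direction this issue describes (a hypothetical helper, not the actual HIVE-12462 patch), a DPP optimizer that keys off the pushed-down TableScan predicate instead of the already-pruned Filter predicate would select the expression roughly as follows, assuming the standard {{TableScanDesc.getFilterExpr()}} and {{FilterDesc.getPredicate()}} accessors:
{code}
import org.apache.hadoop.hive.ql.exec.FilterOperator;
import org.apache.hadoop.hive.ql.exec.TableScanOperator;
import org.apache.hadoop.hive.ql.plan.ExprNodeDesc;

// Hypothetical helper (not the HIVE-12462 patch): prefer the TableScan's
// filterExpr, which still carries the partition-column conditions that the
// partition-condition-remover has stripped from the FilterOperator predicate.
public class DppPredicateSourceSketch {
  static ExprNodeDesc predicateForDpp(TableScanOperator ts, FilterOperator fil) {
    ExprNodeDesc tsExpr = ts.getConf().getFilterExpr();
    if (tsExpr != null) {
      // e.g. ((account_id = 22) and year(dt) is not null and (year(dt)) IN (RS[6]))
      return tsExpr;
    }
    // Fallback: the FIL predicate, which may no longer mention partition columns.
    return fil.getConf().getPredicate();
  }
}
{code}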
[jira] [Commented] (HIVE-11634) Support partition pruning for IN(STRUCT(partcol, nonpartcol..)...)
[ https://issues.apache.org/jira/browse/HIVE-11634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15054154#comment-15054154 ] Gunther Hagleitner commented on HIVE-11634: --- [~hsubramaniyan]/[~jpullokkaran] this breaks dpp (see changed golden file in this patch or HIVE-12462). Not sure what the best way to fix this is. > Support partition pruning for IN(STRUCT(partcol, nonpartcol..)...) > -- > > Key: HIVE-11634 > URL: https://issues.apache.org/jira/browse/HIVE-11634 > Project: Hive > Issue Type: Bug > Components: CBO >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Labels: TODOC1.3 > Fix For: 1.3.0 > > Attachments: HIVE-11634.1.patch, HIVE-11634.2.patch, > HIVE-11634.3.patch, HIVE-11634.4.patch, HIVE-11634.5.patch, > HIVE-11634.6.patch, HIVE-11634.7.patch, HIVE-11634.8.patch, > HIVE-11634.9.patch, HIVE-11634.91.patch, HIVE-11634.92.patch, > HIVE-11634.93.patch, HIVE-11634.94.patch, HIVE-11634.95.patch, > HIVE-11634.96.patch, HIVE-11634.97.patch, HIVE-11634.98.patch, > HIVE-11634.99.patch, HIVE-11634.990.patch, HIVE-11634.991.patch, > HIVE-11634.992.patch, HIVE-11634.993.patch, HIVE-11634.994.patch, > HIVE-11634.995.patch, HIVE-11634.patch > > > Currently, we do not support partition pruning for the following scenario > {code} > create table pcr_t1 (key int, value string) partitioned by (ds string); > insert overwrite table pcr_t1 partition (ds='2000-04-08') select * from src > where key < 20 order by key; > insert overwrite table pcr_t1 partition (ds='2000-04-09') select * from src > where key < 20 order by key; > insert overwrite table pcr_t1 partition (ds='2000-04-10') select * from src > where key < 20 order by key; > explain extended select ds from pcr_t1 where struct(ds, key) in > (struct('2000-04-08',1), struct('2000-04-09',2)); > {code} > If we run the above query, we see that all the partitions of table pcr_t1 are > present in the filter predicate where as we can prune partition > (ds='2000-04-10'). > The optimization is to rewrite the above query into the following. > {code} > explain extended select ds from pcr_t1 where (struct(ds)) IN > (struct('2000-04-08'), struct('2000-04-09')) and struct(ds, key) in > (struct('2000-04-08',1), struct('2000-04-09',2)); > {code} > The predicate (struct(ds)) IN (struct('2000-04-08'), struct('2000-04-09')) > is used by partition pruner to prune the columns which otherwise will not be > pruned. > This is an extension of the idea presented in HIVE-11573. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12462) DPP: DPP optimizers need to run on the TS predicate not FIL
[ https://issues.apache.org/jira/browse/HIVE-12462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15054151#comment-15054151 ] Gunther Hagleitner commented on HIVE-12462: --- I've looked into this some more. This should have already worked. There are tests in dynamic_partition_pruner.q that check for this type of join condition (with udf). However, these tests no longer work. HIVE-11634 broke them. [~hsubramaniyan]/[~jpullokkaran] can you please take a look at this? I don't think the patch proposed here is the right fix; it should probably be reverted. HIVE-11634 changes the golden file of dynamic_partition_pruner.q - it effectively disables the optimization, and I'm not sure why. The synthetic predicate in dpp is of the form (col IN (reducesink operator)), which for some reason gets lost in HIVE-11634. HIVE-11634 also seems to leave you with different expressions in the table scan and the filter, and I think this is wrong as well (i.e., the fix in this patch shouldn't work either). > DPP: DPP optimizers need to run on the TS predicate not FIL > > > Key: HIVE-12462 > URL: https://issues.apache.org/jira/browse/HIVE-12462 > Project: Hive > Issue Type: Bug > Components: Tez >Affects Versions: 2.0.0 >Reporter: Gopal V >Assignee: Gopal V >Priority: Critical > Fix For: 2.0.0 > > Attachments: HIVE-12462.02.patch, HIVE-12462.1.patch > > > HIVE-11398 + HIVE-11791, the partition-condition-remover became more > effective. > This removes predicates from the FilterExpression which involve partition > columns, causing a miss for dynamic-partition pruning if the DPP relies on > FilterDesc. > The TS desc will have the correct predicate in that condition. > {code} > $hdt$_0:$hdt$_1:a > TableScan (TS_2) > alias: a > filterExpr: (((account_id = 22) and year(dt) is not null) and (year(dt)) > IN (RS[6])) (type: boolean) > Filter Operator (FIL_20) > predicate: ((account_id = 22) and year(dt) is not null) (type: boolean) > Select Operator (SEL_4) > expressions: dt (type: date) > outputColumnNames: _col1 > Reduce Output Operator (RS_8) > key expressions: year(_col1) (type: int) > sort order: + > Map-reduce partition columns: year(_col1) (type: int) > Join Operator (JOIN_9) > condition map: > Inner Join 0 to 1 > keys: > 0 year(_col1) (type: int) > 1 year(_col1) (type: int) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12473) DPP: UDFs on the partition column side does not evaluate correctly
[ https://issues.apache.org/jira/browse/HIVE-12473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15054147#comment-15054147 ] Gunther Hagleitner commented on HIVE-12473: --- I can't figure out why this fix would be doing the right thing. The converter should work off of the final type; it needs to convert the string partition value to the output of the evaluator. I think this should be reverted. > DPP: UDFs on the partition column side does not evaluate correctly > -- > > Key: HIVE-12473 > URL: https://issues.apache.org/jira/browse/HIVE-12473 > Project: Hive > Issue Type: Bug > Components: Tez >Affects Versions: 1.3.0, 1.2.1, 2.0.0 >Reporter: Gopal V >Assignee: Sergey Shelukhin >Priority: Blocker > Fix For: 1.3.0, 2.0.0 > > Attachments: HIVE-12473.patch > > > Related to HIVE-12462 > {code} > select count(1) from accounts a, transactions t where year(a.dt) = year(t.dt) > and account_id = 22; > $hdt$_0:$hdt$_1:a > TableScan (TS_2) > alias: a > filterExpr: (((account_id = 22) and year(dt) is not null) and (year(dt)) > IN (RS[6])) (type: boolean) > {code} > Ends up being evaluated as {{year(cast(dt as int))}} because the pruner only > checks for final type, not the column type. > {code} > ObjectInspector oi = > > PrimitiveObjectInspectorFactory.getPrimitiveWritableObjectInspector(TypeInfoFactory > .getPrimitiveTypeInfo(si.fieldInspector.getTypeName())); > Converter converter = > ObjectInspectorConverters.getConverter( > PrimitiveObjectInspectorFactory.javaStringObjectInspector, oi); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (HIVE-12462) DPP: DPP optimizers need to run on the TS predicate not FIL
[ https://issues.apache.org/jira/browse/HIVE-12462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner reopened HIVE-12462: --- > DPP: DPP optimizers need to run on the TS predicate not FIL > > > Key: HIVE-12462 > URL: https://issues.apache.org/jira/browse/HIVE-12462 > Project: Hive > Issue Type: Bug > Components: Tez >Affects Versions: 2.0.0 >Reporter: Gopal V >Assignee: Gopal V >Priority: Critical > Fix For: 2.0.0 > > Attachments: HIVE-12462.02.patch, HIVE-12462.1.patch > > > HIVE-11398 + HIVE-11791, the partition-condition-remover became more > effective. > This removes predicates from the FilterExpression which involve partition > columns, causing a miss for dynamic-partition pruning if the DPP relies on > FilterDesc. > The TS desc will have the correct predicate in that condition. > {code} > $hdt$_0:$hdt$_1:a > TableScan (TS_2) > alias: a > filterExpr: (((account_id = 22) and year(dt) is not null) and (year(dt)) > IN (RS[6])) (type: boolean) > Filter Operator (FIL_20) > predicate: ((account_id = 22) and year(dt) is not null) (type: boolean) > Select Operator (SEL_4) > expressions: dt (type: date) > outputColumnNames: _col1 > Reduce Output Operator (RS_8) > key expressions: year(_col1) (type: int) > sort order: + > Map-reduce partition columns: year(_col1) (type: int) > Join Operator (JOIN_9) > condition map: > Inner Join 0 to 1 > keys: > 0 year(_col1) (type: int) > 1 year(_col1) (type: int) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)