[jira] [Updated] (HIVE-11394) Enhance EXPLAIN display for vectorization
[ https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-11394: Attachment: (was: HIVE-11394.09.patch) > Enhance EXPLAIN display for vectorization > - > > Key: HIVE-11394 > URL: https://issues.apache.org/jira/browse/HIVE-11394 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-11394.01.patch, HIVE-11394.02.patch, > HIVE-11394.03.patch, HIVE-11394.04.patch, HIVE-11394.05.patch, > HIVE-11394.06.patch, HIVE-11394.07.patch, HIVE-11394.08.patch > > > Add detail to the EXPLAIN output showing why Map and Reduce work is not > vectorized. > New syntax is: EXPLAIN VECTORIZATION \[ONLY\] > \[SUMMARY|OPERATOR|EXPRESSION|DETAIL\] > The ONLY option suppresses most non-vectorization elements. > SUMMARY shows vectorization information for the PLAN (is vectorization > enabled) and a summary of Map and Reduce work. > OPERATOR shows vectorization information for operators. E.g. Filter > Vectorization. It includes all information of SUMMARY, too. > EXPRESSION shows vectorization information for expressions. E.g. > predicateExpression. It includes all information of SUMMARY and OPERATOR, > too. > DETAIL shows very detailed vectorization information. > It includes all information of SUMMARY, OPERATOR, and EXPRESSION, too. > By default, ONLY is not specified and the level is SUMMARY. > Here are some examples: > EXPLAIN VECTORIZATION example: > (Note the PLAN VECTORIZATION, Map Vectorization, Reduce Vectorization > sections) > Since SUMMARY is the default, it is the output of EXPLAIN VECTORIZATION > SUMMARY. > {code} > coming soon… > {code} > EXPLAIN VECTORIZATION OPERATOR > Notice the added Select Vectorization, Group By Vectorization, Reduce Sink > Vectorization sections in this example. > {code} > coming soon… > {code} > EXPLAIN VECTORIZATION EXPRESSION > Notice the a in this example. 
> {code} > coming soon… > {code} > EXPLAIN VECTORIZATION DETAIL > Notice the a in this example. > {code} > coming soon… > {code} > EXPLAIN VECTORIZATION ONLY example: > {code} > coming soon… > {code} > EXPLAIN VECTORIZATION ONLY OPERATOR example: > {code} > coming soon… > {code} > EXPLAIN VECTORIZATION ONLY EXPRESSION example: > {code} > {code} > EXPLAIN VECTORIZATION ONLY DETAIL example: > {code} > coming soon… > {code} > The standard @Explain Annotation Type is used. A new 'vectorization' > annotation marks each new class and method. > Works for FORMATTED, like other non-vectorization EXPLAIN variations. > EXPLAIN VECTORIZATION FORMATTED example: > {code} > coming soon… > {code} > or pretty printed: > {code} > coming soon… > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11394) Enhance EXPLAIN display for vectorization
[ https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-11394: Attachment: HIVE-11394.09.patch > Enhance EXPLAIN display for vectorization > - > > Key: HIVE-11394 > URL: https://issues.apache.org/jira/browse/HIVE-11394 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-11394.01.patch, HIVE-11394.02.patch, > HIVE-11394.03.patch, HIVE-11394.04.patch, HIVE-11394.05.patch, > HIVE-11394.06.patch, HIVE-11394.07.patch, HIVE-11394.08.patch, > HIVE-11394.09.patch > > > Add detail to the EXPLAIN output showing why Map and Reduce work is not > vectorized. > New syntax is: EXPLAIN VECTORIZATION \[ONLY\] > \[SUMMARY|OPERATOR|EXPRESSION|DETAIL\] > The ONLY option suppresses most non-vectorization elements. > SUMMARY shows vectorization information for the PLAN (is vectorization > enabled) and a summary of Map and Reduce work. > OPERATOR shows vectorization information for operators. E.g. Filter > Vectorization. It includes all information of SUMMARY, too. > EXPRESSION shows vectorization information for expressions. E.g. > predicateExpression. It includes all information of SUMMARY and OPERATOR, > too. > DETAIL shows very detailed vectorization information. > It includes all information of SUMMARY, OPERATOR, and EXPRESSION, too. > By default, ONLY is not specified and the level is SUMMARY. > Here are some examples: > EXPLAIN VECTORIZATION example: > (Note the PLAN VECTORIZATION, Map Vectorization, Reduce Vectorization > sections) > Since SUMMARY is the default, it is the output of EXPLAIN VECTORIZATION > SUMMARY. > {code} > coming soon… > {code} > EXPLAIN VECTORIZATION OPERATOR > Notice the added Select Vectorization, Group By Vectorization, Reduce Sink > Vectorization sections in this example. > {code} > coming soon… > {code} > EXPLAIN VECTORIZATION EXPRESSION > Notice the a in this example. 
> {code} > coming soon… > {code} > EXPLAIN VECTORIZATION DETAIL > Notice the a in this example. > {code} > coming soon… > {code} > EXPLAIN VECTORIZATION ONLY example: > {code} > coming soon… > {code} > EXPLAIN VECTORIZATION ONLY OPERATOR example: > {code} > coming soon… > {code} > EXPLAIN VECTORIZATION ONLY EXPRESSION example: > {code} > {code} > EXPLAIN VECTORIZATION ONLY DETAIL example: > {code} > coming soon… > {code} > The standard @Explain Annotation Type is used. A new 'vectorization' > annotation marks each new class and method. > Works for FORMATTED, like other non-vectorization EXPLAIN variations. > EXPLAIN VECTORIZATION FORMATTED example: > {code} > coming soon… > {code} > or pretty printed: > {code} > coming soon… > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13539) HiveHFileOutputFormat searching the wrong directory for HFiles
[ https://issues.apache.org/jira/browse/HIVE-13539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15567897#comment-15567897 ] Matt McCline commented on HIVE-13539: - In other words, I can't get the non-patched code to fail. Without a test case, the code patch cannot be reviewed and committed. > HiveHFileOutputFormat searching the wrong directory for HFiles > -- > > Key: HIVE-13539 > URL: https://issues.apache.org/jira/browse/HIVE-13539 > Project: Hive > Issue Type: Bug > Components: HBase Handler >Affects Versions: 1.1.0 > Environment: Built into CDH 5.4.7 >Reporter: Tim Robertson >Assignee: Matt McCline >Priority: Blocker > Attachments: hive_hfile_output_format.q, > hive_hfile_output_format.q.out > > > When creating HFiles for a bulkload in HBase I believe it is looking in the > wrong directory to find the HFiles, resulting in the following exception: > {code} > Error: java.lang.RuntimeException: Hive Runtime Error while closing > operators: java.io.IOException: Multiple family directories found in > hdfs://c1n1.gbif.org:8020/user/hive/warehouse/tim.db/coords_hbase/_temporary/2/_temporary > at > org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:295) > at > org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:453) > at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392) > at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: > java.io.IOException: Multiple family directories found in > hdfs://c1n1.gbif.org:8020/user/hive/warehouse/tim.db/coords_hbase/_temporary/2/_temporary > at > 
org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:188) > at > org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:958) > at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:598) > at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610) > at > org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:287) > ... 7 more > Caused by: java.io.IOException: Multiple family directories found in > hdfs://c1n1.gbif.org:8020/user/hive/warehouse/tim.db/coords_hbase/_temporary/2/_temporary > at > org.apache.hadoop.hive.hbase.HiveHFileOutputFormat$1.close(HiveHFileOutputFormat.java:158) > at > org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:185) > ... 11 more > {code} > The issue is that it looks for the HFiles in > {{hdfs://c1n1.gbif.org:8020/user/hive/warehouse/tim.db/coords_hbase/_temporary/2/_temporary}} > when I believe it should be looking in the task attempt subfolder, such as > {{hdfs://c1n1.gbif.org:8020/user/hive/warehouse/tim.db/coords_hbase/_temporary/2/_temporary/attempt_1461004169450_0002_r_00_1000}}. > This can be reproduced in any HFile creation such as: > {code:sql} > CREATE TABLE coords_hbase(id INT, x DOUBLE, y DOUBLE) > STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' > WITH SERDEPROPERTIES ( > 'hbase.columns.mapping' = ':key,o:x,o:y', > 'hbase.table.default.storage.type' = 'binary'); > SET hfile.family.path=/tmp/coords_hfiles/o; > SET hive.hbase.generatehfiles=true; > INSERT OVERWRITE TABLE coords_hbase > SELECT id, decimalLongitude, decimalLatitude > FROM source > CLUSTER BY id; > {code} > Any advice greatly appreciated -- This message was sent by Atlassian JIRA (v6.3.4#6332)
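The path problem described in the report can be sketched outside Hive with plain `java.nio.file`. This is an illustration, not Hive code: the class name, the temp base directory, and the second attempt directory are assumptions; only the `_temporary/2/_temporary/attempt_*/o` layout comes from the report. Listing the shared `_temporary` level sees one entry per task attempt, while the task-attempt subfolder itself contains exactly the one family directory.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class HFileDirSketch {
    // Builds the layout from the report and returns {directories seen at the
    // shared _temporary level, directories seen inside one task-attempt subfolder}.
    static int[] layoutCounts() throws IOException {
        Path base = Files.createTempDirectory("coords_hbase");
        Path temporary = base.resolve("_temporary").resolve("2").resolve("_temporary");
        Path attempt = temporary.resolve("attempt_1461004169450_0002_r_00_1000");
        Files.createDirectories(attempt.resolve("o")); // "o" is the single family dir
        // a second (hypothetical) task attempt leaves a sibling directory behind
        Files.createDirectories(temporary.resolve("attempt_1461004169450_0002_r_00_1001"));

        File[] atTemporary = temporary.toFile().listFiles(File::isDirectory);
        File[] atAttempt = attempt.toFile().listFiles(File::isDirectory);
        return new int[] { atTemporary.length, atAttempt.length };
    }

    public static void main(String[] args) throws IOException {
        int[] counts = layoutCounts();
        // scanning the shared level sees both attempt dirs, which is what
        // trips a "Multiple family directories found" style check
        System.out.println("at _temporary level: " + counts[0] + " dirs");
        System.out.println("inside task attempt: " + counts[1] + " family dir");
    }
}
```

This matches the reporter's suggestion that the close path should be scoped to the task attempt's own subfolder rather than the shared `_temporary` parent.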
[jira] [Updated] (HIVE-13539) HiveHFileOutputFormat searching the wrong directory for HFiles
[ https://issues.apache.org/jira/browse/HIVE-13539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-13539: Resolution: Cannot Reproduce Status: Resolved (was: Patch Available) > HiveHFileOutputFormat searching the wrong directory for HFiles > -- > > Key: HIVE-13539 > URL: https://issues.apache.org/jira/browse/HIVE-13539 > Project: Hive > Issue Type: Bug > Components: HBase Handler >Affects Versions: 1.1.0 > Environment: Built into CDH 5.4.7 >Reporter: Tim Robertson >Assignee: Matt McCline >Priority: Blocker > Attachments: hive_hfile_output_format.q, > hive_hfile_output_format.q.out > > > When creating HFiles for a bulkload in HBase I believe it is looking in the > wrong directory to find the HFiles, resulting in the following exception: > {code} > Error: java.lang.RuntimeException: Hive Runtime Error while closing > operators: java.io.IOException: Multiple family directories found in > hdfs://c1n1.gbif.org:8020/user/hive/warehouse/tim.db/coords_hbase/_temporary/2/_temporary > at > org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:295) > at > org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:453) > at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392) > at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: > java.io.IOException: Multiple family directories found in > hdfs://c1n1.gbif.org:8020/user/hive/warehouse/tim.db/coords_hbase/_temporary/2/_temporary > at > org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:188) > at > 
org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:958) > at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:598) > at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610) > at > org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:287) > ... 7 more > Caused by: java.io.IOException: Multiple family directories found in > hdfs://c1n1.gbif.org:8020/user/hive/warehouse/tim.db/coords_hbase/_temporary/2/_temporary > at > org.apache.hadoop.hive.hbase.HiveHFileOutputFormat$1.close(HiveHFileOutputFormat.java:158) > at > org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:185) > ... 11 more > {code} > The issue is that it looks for the HFiles in > {{hdfs://c1n1.gbif.org:8020/user/hive/warehouse/tim.db/coords_hbase/_temporary/2/_temporary}} > when I believe it should be looking in the task attempt subfolder, such as > {{hdfs://c1n1.gbif.org:8020/user/hive/warehouse/tim.db/coords_hbase/_temporary/2/_temporary/attempt_1461004169450_0002_r_00_1000}}. > This can be reproduced in any HFile creation such as: > {code:sql} > CREATE TABLE coords_hbase(id INT, x DOUBLE, y DOUBLE) > STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' > WITH SERDEPROPERTIES ( > 'hbase.columns.mapping' = ':key,o:x,o:y', > 'hbase.table.default.storage.type' = 'binary'); > SET hfile.family.path=/tmp/coords_hfiles/o; > SET hive.hbase.generatehfiles=true; > INSERT OVERWRITE TABLE coords_hbase > SELECT id, decimalLongitude, decimalLatitude > FROM source > CLUSTER BY id; > {code} > Any advice greatly appreciated -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14877) Move slow CliDriver tests to MiniLlap
[ https://issues.apache.org/jira/browse/HIVE-14877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-14877: - Resolution: Fixed Fix Version/s: 2.2.0 Status: Resolved (was: Patch Available) Committed to master > Move slow CliDriver tests to MiniLlap > - > > Key: HIVE-14877 > URL: https://issues.apache.org/jira/browse/HIVE-14877 > Project: Hive > Issue Type: Sub-task > Components: Tests >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Fix For: 2.2.0 > > Attachments: HIVE-14877.1.patch, HIVE-14877.2.patch, > HIVE-14877.3.patch, HIVE-14877.4.patch, HIVE-14877.5.patch, > HIVE-14877.5.patch, HIVE-14877.6.patch > > > When analyzing the test runtimes, there are many CliDriver tests that show > up as stragglers and are slow. Most of these tests are not really testing the > execution engine. For example, special_character_in_tabnames_1.q is the > slowest test case that takes 419s in CliDriver but only 62s in MiniLlap. > Similarly there are many test cases that can benefit from fast runtimes. We > should consider moving the tests that are not testing the execution engine to > MiniLlap (assuming it provides significant performance benefit). 
> Here is the list of top 100 slow tests based on build #1055 > ||QFiles||TestCliDriver elapsed time|| > |special_character_in_tabnames_1.q|419.229| > |unionDistinct_1.q|278.583| > |vector_leftsemi_mapjoin.q|232.313| > |join_filters.q|172.436| > |escape2.q|167.503| > |archive_excludeHadoop20.q|163.522| > |escape1.q|130.217| > |lineage3.q|110.935| > |insert_into_with_schema.q|107.345| > |auto_join_filters.q|104.331| > |windowing.q|99.622| > |index_compact_binary_search.q|97.637| > |cbo_rp_windowing_2.q|95.108| > |vectorized_ptf.q|93.397| > |dynpart_sort_optimization_acid.q|91.831| > |partition_multilevels.q|90.392| > |ptf.q|89.115| > |sample_islocalmode_hook.q|88.293| > |udaf_collect_set_2.q|84.725| > |skewjoin.q|84.588| > |lineage2.q|84.187| > |correlationoptimizer1.q|80.367| > |dynpart_sort_optimization.q|77.07| > |orc_ppd_decimal.q|75.523| > |orc_ppd_schema_evol_3a.q|75.352| > |groupby_sort_skew_1_23.q|75.342| > |cbo_rp_lineage2.q|75.283| > |parquet_ppd_decimal.q|74.063| > |sample_islocalmode_hook_use_metadata.q|73.988| > |orc_analyze.q|73.803| > |join_nulls.q|72.417| > |semijoin.q|70.403| > |correlationoptimizer6.q|69.151| > |table_access_keys_stats.q|68.699| > |autoColumnStats_2.q|68.632| > |cbo_join.q|68.325| > |cbo_rp_join.q|68.317| > |sample10.q|64.513| > |mergejoin.q|63.647| > |multi_insert_move_tasks_share_dependencies.q|62.079| > |union_view.q|61.772| > |autoColumnStats_1.q|61.246| > |groupby_sort_1_23.q|61.129| > |pcr.q|59.546| > |vectorization_short_regress.q|58.775| > |auto_sortmerge_join_9.q|58.3| > |correlationoptimizer2.q|56.591| > |alter_merge_stats_orc.q|55.202| > |vector_join30.q|54.85| > |selectDistinctStar.q|53.981| > |vector_decimal_udf.q|53.879| > |auto_join30.q|53.762| > |subquery_notin.q|52.879| > |cbo_rp_subq_not_in.q|52.609| > |cbo_rp_gby.q|51.866| > |cbo_subq_not_in.q|51.672| > |cbo_gby.q|50.361| > |infer_bucket_sort.q|49.158| > |ptf_streaming.q|48.484| > |join_1to1.q|48.268| > |load_dyn_part5.q|47.796| > |limit_join_transpose.q|47.517| 
> |ppd_windowing2.q|47.318| > |dynpart_sort_opt_vectorization.q|47.208| > |vector_number_compare_projection.q|47.024| > |correlationoptimizer4.q|45.472| > |orc_ppd_date.q|45.19| > |global_limit.q|44.438| > |union_top_level.q|44.229| > |llap_partitioned.q|44.139| > |orc_ppd_timestamp.q|43.617| > |parquet_ppd_date.q|43.539| > |multiMapJoin2.q|43.036| > |parquet_ppd_timestamp.q|42.665| > |vector_partitioned_date_time.q|42.511| > |auto_sortmerge_join_8.q|42.377| > |create_view.q|42.23| > |windowing_windowspec2.q|42.202| > |multiMapJoin1.q|41.176| > |vector_decimal_2.q|41.026| > |bucket_groupby.q|40.565| > |rcfile_merge2.q|39.782| > |index_compact_2.q|39.765| > |join_nullsafe.q|39.698| > |vector_join_filters.q|39.343| > |cbo_rp_auto_join1.q|39.308| > |vector_auto_smb_mapjoin_14.q|39.17| > |vector_udf1.q|38.988| > |rcfile_createas1.q|38.932| > |cbo_rp_semijoin.q|38.675| > |auto_join_nulls.q|38.519| > |cbo_rp_unionDistinct_2.q|37.815| > |union_remove_26.q|37.672| > |rcfile_merge3.q|37.373| > |rcfile_merge4.q|37.194| > |bucketsortoptimize_insert_2.q|37.187| > |cbo_limit.q|37.038| > |auto_sortmerge_join_6.q|36.663| > |join43.q|36.656| -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14921) Move slow CliDriver tests to MiniLlap - part 2
[ https://issues.apache.org/jira/browse/HIVE-14921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-14921: - Attachment: HIVE-14921.1.patch same patch after rebase > Move slow CliDriver tests to MiniLlap - part 2 > -- > > Key: HIVE-14921 > URL: https://issues.apache.org/jira/browse/HIVE-14921 > Project: Hive > Issue Type: Sub-task > Components: Tests >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-14921.1.patch, HIVE-14921.1.patch > > > Continuation to HIVE-14877 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14921) Move slow CliDriver tests to MiniLlap - part 2
[ https://issues.apache.org/jira/browse/HIVE-14921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-14921: - Status: Patch Available (was: Open) > Move slow CliDriver tests to MiniLlap - part 2 > -- > > Key: HIVE-14921 > URL: https://issues.apache.org/jira/browse/HIVE-14921 > Project: Hive > Issue Type: Sub-task > Components: Tests >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-14921.1.patch, HIVE-14921.1.patch > > > Continuation to HIVE-14877 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14803) S3: Stats gathering for insert queries can be expensive for partitioned dataset
[ https://issues.apache.org/jira/browse/HIVE-14803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15567924#comment-15567924 ] Hive QA commented on HIVE-14803: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12829480/HIVE-14803.1.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 10666 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[join32_lessSize] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[list_bucket_dml_4] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd2] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_udf_case] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby_map_ppr_multi_distinct] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby_ppr] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[join26] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[join_merge_multi_expressions] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[load_dyn_part8] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[transform_ppr1] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[transform_ppr2] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_ppr] {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1493/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1493/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1493/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 12 tests failed {noformat} 
This message is automatically generated. ATTACHMENT ID: 12829480 - PreCommit-HIVE-Build > S3: Stats gathering for insert queries can be expensive for partitioned > dataset > --- > > Key: HIVE-14803 > URL: https://issues.apache.org/jira/browse/HIVE-14803 > Project: Hive > Issue Type: Improvement > Components: Metastore >Affects Versions: 2.1.0 >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan >Priority: Minor > Attachments: HIVE-14803.1.patch > > > StatsTask's aggregateStats populates stats details for all partitions by > checking the file sizes which turns out to be expensive when larger number of > partitions are inserted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14916) Reduce the memory requirements for Spark tests
[ https://issues.apache.org/jira/browse/HIVE-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15567971#comment-15567971 ] Siddharth Seth commented on HIVE-14916: --- It is necessary. The values should be 2048, 512, 2048 instead of 4096, 1024, 4096 > Reduce the memory requirements for Spark tests > -- > > Key: HIVE-14916 > URL: https://issues.apache.org/jira/browse/HIVE-14916 > Project: Hive > Issue Type: Sub-task >Reporter: Ferdinand Xu >Assignee: Dapeng Sun > Attachments: HIVE-14916.001.patch, HIVE-14916.002.patch, > HIVE-14916.003.patch > > > As HIVE-14887, we need to reduce the memory requirements for Spark tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14761) Remove TestJdbcWithMiniMr after merging tests with TestJdbcWithMiniHS2
[ https://issues.apache.org/jira/browse/HIVE-14761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15567978#comment-15567978 ] Siddharth Seth commented on HIVE-14761: --- +1. > Remove TestJdbcWithMiniMr after merging tests with TestJdbcWithMiniHS2 > -- > > Key: HIVE-14761 > URL: https://issues.apache.org/jira/browse/HIVE-14761 > Project: Hive > Issue Type: Sub-task >Reporter: Vaibhav Gumashta >Assignee: Vaibhav Gumashta > Attachments: HIVE-14761.1.patch > > > Currently 2 min 30 sec -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14761) Remove TestJdbcWithMiniMr after merging tests with TestJdbcWithMiniHS2
[ https://issues.apache.org/jira/browse/HIVE-14761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15567983#comment-15567983 ] Siddharth Seth commented on HIVE-14761: --- Created HIVE-14936 to track the flaky minillap test. > Remove TestJdbcWithMiniMr after merging tests with TestJdbcWithMiniHS2 > -- > > Key: HIVE-14761 > URL: https://issues.apache.org/jira/browse/HIVE-14761 > Project: Hive > Issue Type: Sub-task >Reporter: Vaibhav Gumashta >Assignee: Vaibhav Gumashta > Attachments: HIVE-14761.1.patch > > > Currently 2 min 30 sec -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13316) Upgrade to Calcite 1.10
[ https://issues.apache.org/jira/browse/HIVE-13316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-13316: --- Summary: Upgrade to Calcite 1.10 (was: Upgrade to Calcite 1.9) > Upgrade to Calcite 1.10 > --- > > Key: HIVE-13316 > URL: https://issues.apache.org/jira/browse/HIVE-13316 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.2.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-13316.01.patch, HIVE-13316.02.patch, > HIVE-13316.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14887) Reduce the memory requirements for tests
[ https://issues.apache.org/jira/browse/HIVE-14887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated HIVE-14887: -- Attachment: HIVE-14887.02.patch Updated patch, with some more runtime parameters set. > Reduce the memory requirements for tests > > > Key: HIVE-14887 > URL: https://issues.apache.org/jira/browse/HIVE-14887 > Project: Hive > Issue Type: Sub-task >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-14887.01.patch, HIVE-14887.02.patch > > > The clusters that we spin up end up requiring 16GB at times. Also the maven > arguments seem a little heavy weight. > Reducing this will allow for additional ptest drones per box, which should > bring down the runtime. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13316) Upgrade to Calcite 1.10
[ https://issues.apache.org/jira/browse/HIVE-13316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-13316: --- Attachment: HIVE-13316.03.patch > Upgrade to Calcite 1.10 > --- > > Key: HIVE-13316 > URL: https://issues.apache.org/jira/browse/HIVE-13316 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.2.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-13316.01.patch, HIVE-13316.02.patch, > HIVE-13316.03.patch, HIVE-13316.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13316) Upgrade to Calcite 1.10
[ https://issues.apache.org/jira/browse/HIVE-13316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-13316: --- Attachment: (was: HIVE-13316.03.patch) > Upgrade to Calcite 1.10 > --- > > Key: HIVE-13316 > URL: https://issues.apache.org/jira/browse/HIVE-13316 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.2.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-13316.01.patch, HIVE-13316.02.patch, > HIVE-13316.04.patch, HIVE-13316.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13316) Upgrade to Calcite 1.10
[ https://issues.apache.org/jira/browse/HIVE-13316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-13316: --- Attachment: HIVE-13316.04.patch > Upgrade to Calcite 1.10 > --- > > Key: HIVE-13316 > URL: https://issues.apache.org/jira/browse/HIVE-13316 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.2.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-13316.01.patch, HIVE-13316.02.patch, > HIVE-13316.04.patch, HIVE-13316.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14916) Reduce the memory requirements for Spark tests
[ https://issues.apache.org/jira/browse/HIVE-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15568015#comment-15568015 ] Dapeng Sun commented on HIVE-14916: --- Thanks [~sseth], I will try to update the patch. > Reduce the memory requirements for Spark tests > -- > > Key: HIVE-14916 > URL: https://issues.apache.org/jira/browse/HIVE-14916 > Project: Hive > Issue Type: Sub-task >Reporter: Ferdinand Xu >Assignee: Dapeng Sun > Attachments: HIVE-14916.001.patch, HIVE-14916.002.patch, > HIVE-14916.003.patch > > > As HIVE-14887, we need to reduce the memory requirements for Spark tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13316) Upgrade to Calcite 1.10
[ https://issues.apache.org/jira/browse/HIVE-13316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-13316: --- Attachment: HIVE-13316.05.patch > Upgrade to Calcite 1.10 > --- > > Key: HIVE-13316 > URL: https://issues.apache.org/jira/browse/HIVE-13316 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.2.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-13316.01.patch, HIVE-13316.02.patch, > HIVE-13316.05.patch, HIVE-13316.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13316) Upgrade to Calcite 1.10
[ https://issues.apache.org/jira/browse/HIVE-13316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-13316: --- Attachment: (was: HIVE-13316.04.patch) > Upgrade to Calcite 1.10 > --- > > Key: HIVE-13316 > URL: https://issues.apache.org/jira/browse/HIVE-13316 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.2.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-13316.01.patch, HIVE-13316.02.patch, > HIVE-13316.05.patch, HIVE-13316.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11394) Enhance EXPLAIN display for vectorization
[ https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15568073#comment-15568073 ] Hive QA commented on HIVE-11394: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12832834/HIVE-11394.09.patch {color:green}SUCCESS:{color} +1 due to 162 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 25 failed/errored test(s), 10606 tests executed *Failed tests:* {noformat} TestMiniLlapLocalCliDriver-orc_llap.q-delete_where_non_partitioned.q-vector_groupby_mapjoin.q-and-27-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_between_in] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_orc_vec_part] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_orc_vec_part_all_primitive] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_orc_vec_table] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_part] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_table] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vecrow_part] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vecrow_part_all_primitive] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vecrow_table] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_udf] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_number_compare_projection] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_udf1] org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[vectorization_limit] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_between_in] 
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_string_concat] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_13] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_14] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_15] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_16] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_9] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_short_regress] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorized_mapjoin] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorized_shufflejoin] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorized_timestamp_funcs] {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1494/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1494/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1494/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 25 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12832834 - PreCommit-HIVE-Build > Enhance EXPLAIN display for vectorization > - > > Key: HIVE-11394 > URL: https://issues.apache.org/jira/browse/HIVE-11394 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-11394.01.patch, HIVE-11394.02.patch, > HIVE-11394.03.patch, HIVE-11394.04.patch, HIVE-11394.05.patch, > HIVE-11394.06.patch, HIVE-11394.07.patch, HIVE-11394.08.patch, > HIVE-11394.09.patch > > > Add detail to the EXPLAIN output showing why a Map and Reduce work is not > vectorized. > New syntax is: EXPLAIN VECTORIZATION \[ONLY\] > \[SUMMARY|OPERATOR|EXPRESSION|DETAIL\] > The ONLY option suppresses most non-vectorization elements. > SUMMARY shows vectorization information for the PLAN (is vectorization > enabled) and a summary of Map and Reduce work. > OPERATOR shows vectorization information for operators. E.g. Filter > Vectorization. It includes all information of SUMMARY, too. > EXPRESSION shows vectorization information for expressions. E.g. > predicateExpression. It includes all information of SUMMARY and OPERATOR, > too. > DETAIL shows very vectorization information. > It includes all information of SUMMARY, OPERATOR, and EXPRESSION too. > The optional clause defaults are not ONLY and SUMMARY. > Here are some examples: > EXPLAIN VECTORIZATION exampl
[jira] [Commented] (HIVE-14797) reducer number estimating may lead to data skew
[ https://issues.apache.org/jira/browse/HIVE-14797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15568152#comment-15568152 ] Rui Li commented on HIVE-14797: --- It seems that for MR we need to get the number of reducers from hconf, but for Spark/Tez we need to get it from ReduceSinkDesc::getNumReducers. Therefore we have to check both of them to determine whether the reducer count is the same as our hash seed. > reducer number estimating may lead to data skew > --- > > Key: HIVE-14797 > URL: https://issues.apache.org/jira/browse/HIVE-14797 > Project: Hive > Issue Type: Improvement > Components: Query Processor >Reporter: roncenzhao >Assignee: roncenzhao > Attachments: HIVE-14797.2.patch, HIVE-14797.3.patch, HIVE-14797.patch > > > HiveKey's hash code is generated by multiplying by 31, key by key, as > implemented in the method `ObjectInspectorUtils.getBucketHashCode()`: > for (int i = 0; i < bucketFields.length; i++) { > int fieldHash = ObjectInspectorUtils.hashCode(bucketFields[i], > bucketFieldInspectors[i]); > hashCode = 31 * hashCode + fieldHash; > } > The following example will lead to data skew: > I have two tables, tbl1 and tbl2, with the same columns: a int, b > string. The values of column 'a' in both tables are not skewed, but the values > of column 'b' in both tables are skewed. > When my SQL is "select * from tbl1 join tbl2 on tbl1.a=tbl2.a and > tbl1.b=tbl2.b" and the estimated reducer number is 31, it will lead to data > skew. > As we know, the HiveKey's hash code is generated by `hash(a)*31 + hash(b)`. > When the reducer number is 31, the reducer number of each row is `hash(b)%31`. As a > result, the job will be skewed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
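The cancellation described above can be seen in a small standalone sketch (a hypothetical `SkewDemo` class, not Hive code; `bucketHashCode` mirrors the two-key case of the `getBucketHashCode()` loop quoted in the description). With `hashCode = 31*hash(a) + hash(b)` and exactly 31 reducers, the `31*hash(a)` term vanishes modulo 31, so the reducer is picked by `hash(b)` alone and skew in column b becomes reducer skew:

```java
// Hypothetical demo (not part of Hive): why 31 reducers defeat the
// multiply-by-31 bucket hash.
public class SkewDemo {

    // Mirrors ObjectInspectorUtils.getBucketHashCode() for two key fields:
    // result == 31 * hashA + hashB.
    static int bucketHashCode(int hashA, int hashB) {
        int hashCode = 0;
        hashCode = 31 * hashCode + hashA;
        hashCode = 31 * hashCode + hashB;
        return hashCode;
    }

    public static void main(String[] args) {
        int numReducers = 31;
        // Whatever hash(a) is, (31*hashA + hashB) % 31 == hashB % 31,
        // so every row with the same b lands on the same reducer.
        for (int hashA = 0; hashA < 5; hashA++) {
            int reducer = Math.floorMod(bucketHashCode(hashA, 7), numReducers);
            System.out.println("hashA=" + hashA + " -> reducer " + reducer);
        }
    }
}
```

Any reducer count that divides the hash multiplier exhibits the same collapse, which is why checking the estimated reducer count against the multiplier matters.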
[jira] [Commented] (HIVE-11957) SHOW TRANSACTIONS should show queryID/agent id of the creator
[ https://issues.apache.org/jira/browse/HIVE-11957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15568216#comment-15568216 ] Hive QA commented on HIVE-11957: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12832769/HIVE-11957.5.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:green}SUCCESS:{color} +1 due to 10636 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1495/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1495/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1495/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12832769 - PreCommit-HIVE-Build > SHOW TRANSACTIONS should show queryID/agent id of the creator > - > > Key: HIVE-11957 > URL: https://issues.apache.org/jira/browse/HIVE-11957 > Project: Hive > Issue Type: Improvement > Components: Transactions >Affects Versions: 1.0.0 >Reporter: Eugene Koifman >Assignee: Wei Zheng > Attachments: HIVE-11957.1.patch, HIVE-11957.2.patch, > HIVE-11957.3.patch, HIVE-11957.4.patch, HIVE-11957.5.patch > > > this would be very useful for debugging > should also include heartbeat/create timestamps > would be nice to support some filtering/sorting options, like sort by create > time, agent id. filter by table, database, etc -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14922) Add perf logging for post job completion steps
[ https://issues.apache.org/jira/browse/HIVE-14922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15568410#comment-15568410 ] Hive QA commented on HIVE-14922: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12832766/HIVE-14922.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 10636 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver[hbase_bulk] {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1496/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1496/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1496/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12832766 - PreCommit-HIVE-Build > Add perf logging for post job completion steps > --- > > Key: HIVE-14922 > URL: https://issues.apache.org/jira/browse/HIVE-14922 > Project: Hive > Issue Type: Task > Components: Logging >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Attachments: HIVE-14922.patch > > > Mostly FS related operations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14753) Track the number of open/closed/abandoned sessions in HS2
[ https://issues.apache.org/jira/browse/HIVE-14753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15568478#comment-15568478 ] Barna Zsombor Klara commented on HIVE-14753: Hi [~szehon], if you have the time, could you review my patch? I think you are very familiar with the metrics API, so the changes should be straightforward to follow. Thanks! > Track the number of open/closed/abandoned sessions in HS2 > - > > Key: HIVE-14753 > URL: https://issues.apache.org/jira/browse/HIVE-14753 > Project: Hive > Issue Type: Sub-task > Components: Hive, HiveServer2 >Reporter: Barna Zsombor Klara >Assignee: Barna Zsombor Klara > Fix For: 2.2.0 > > Attachments: HIVE-14753.patch > > > We should be able to track the number of sessions since the startup of the HS2 > instance as well as the average lifetime of a session. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13280) Error when more than 1 mapper for HBase storage handler
[ https://issues.apache.org/jira/browse/HIVE-13280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15568492#comment-15568492 ] Damien Carol commented on HIVE-13280: - Yes, this fixes the problem. > Error when more than 1 mapper for HBase storage handler > --- > > Key: HIVE-13280 > URL: https://issues.apache.org/jira/browse/HIVE-13280 > Project: Hive > Issue Type: Bug > Components: HBase Handler >Affects Versions: 2.0.0 >Reporter: Damien Carol > > With a simple query (select from an ORC table and insert into an HBase external > table): > {code:sql} > insert into table register.register select * from aa_temp > {code} > The aa_temp table has 45 ORC files. It generates 45 mappers. > Some mappers fail with this error: > {noformat} > Caused by: java.lang.IllegalArgumentException: Must specify table name > at > org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:188) > at > org.apache.hive.common.util.ReflectionUtil.setConf(ReflectionUtil.java:101) > at > org.apache.hive.common.util.ReflectionUtil.newInstance(ReflectionUtil.java:87) > at > org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:300) > at > org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:290) > at > org.apache.hadoop.hive.ql.exec.FileSinkOperator.createHiveOutputFormat(FileSinkOperator.java:1126) > ... 25 more > ]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 > killedTasks:35, Vertex vertex_1457964631631_0015_3_00 [Map 1] killed/failed > due to:OWN_TASK_FAILURE]DAG did not succeed due to VERTEX_FAILURE. > failedVertices:1 killedVertices:0 (state=08S01,code=2) > {noformat} > If I do an ALTER CONCATENATE on aa_temp and redo the query, everything is > fine because there is only one mapper. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-13280) Error when more than 1 mapper for HBase storage handler
[ https://issues.apache.org/jira/browse/HIVE-13280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol resolved HIVE-13280. - Resolution: Invalid Assignee: Damien Carol > Error when more than 1 mapper for HBase storage handler > --- > > Key: HIVE-13280 > URL: https://issues.apache.org/jira/browse/HIVE-13280 > Project: Hive > Issue Type: Bug > Components: HBase Handler >Affects Versions: 2.0.0 >Reporter: Damien Carol >Assignee: Damien Carol > > With a simple query (select from orc table and insert into HBase external > table): > {code:sql} > insert into table register.register select * from aa_temp > {code} > The aa_temp table have 45 orc files. It generate 45 mappers. > Some mappers fail with this error: > {noformat} > Caused by: java.lang.IllegalArgumentException: Must specify table name > at > org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:188) > at > org.apache.hive.common.util.ReflectionUtil.setConf(ReflectionUtil.java:101) > at > org.apache.hive.common.util.ReflectionUtil.newInstance(ReflectionUtil.java:87) > at > org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:300) > at > org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:290) > at > org.apache.hadoop.hive.ql.exec.FileSinkOperator.createHiveOutputFormat(FileSinkOperator.java:1126) > ... 25 more > ]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 > killedTasks:35, Vertex vertex_1457964631631_0015_3_00 [Map 1] killed/failed > due to:OWN_TASK_FAILURE]DAG did not succeed due to VERTEX_FAILURE. > failedVertices:1 killedVertices:0 (state=08S01,code=2) > {noformat} > If I do an ALTER CONCATENATE for aa_temp. And redo the query. Everything is > fine because there are only one mapper. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14872) Remove the configuration HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS
[ https://issues.apache.org/jira/browse/HIVE-14872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15568529#comment-15568529 ] Hive QA commented on HIVE-14872: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12832776/HIVE-14872.02.patch {color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 10484 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[udaf_collect_set_2] {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1497/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1497/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1497/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12832776 - PreCommit-HIVE-Build > Remove the configuration HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS > - > > Key: HIVE-14872 > URL: https://issues.apache.org/jira/browse/HIVE-14872 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-14872.01.patch, HIVE-14872.02.patch > > > The main purpose of the configuration > HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS is backward compatibility, because many > reserved keywords have been used as identifiers in previous > releases. We have already had several releases with this configuration. Now, > when I tried to add new set operators to the parser, ANTLR kept > complaining "code too large". 
I think it is time to remove this > configuration. (1) It will simplify the parser logic and largely reduce the > size of the generated parser code; (2) it leaves space for new features, > especially those which require parser changes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11394) Enhance EXPLAIN display for vectorization
[ https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-11394: Description: Add detail to the EXPLAIN output showing why Map and Reduce work is not vectorized. New syntax is: EXPLAIN VECTORIZATION \[ONLY\] \[SUMMARY|OPERATOR|EXPRESSION|DETAIL\] The ONLY option suppresses most non-vectorization elements. SUMMARY shows vectorization information for the PLAN (is vectorization enabled) and a summary of Map and Reduce work. OPERATOR shows vectorization information for operators. E.g. Filter Vectorization. It includes all information of SUMMARY, too. EXPRESSION shows vectorization information for expressions. E.g. predicateExpression. It includes all information of SUMMARY and OPERATOR, too. DETAIL shows very detailed vectorization information. It includes all information of SUMMARY, OPERATOR, and EXPRESSION, too. The defaults for the optional clauses are not-ONLY and SUMMARY. --- Here are some examples: EXPLAIN VECTORIZATION example: (Note the PLAN VECTORIZATION, Map Vectorization, Reduce Vectorization sections) Since SUMMARY is the default, it is the output of EXPLAIN VECTORIZATION SUMMARY. Under Reducer 3’s "Reduce Vectorization:" you’ll see notVectorizedReason: Aggregation Function UDF avg parameter expression for GROUPBY operator: Data type struct of Column\[VALUE._col2\] not supported For Reducer 2’s "Reduce Vectorization:" you’ll see "groupByVectorOutput:": "false", which says a node has a GROUP BY with an AVG or some other aggregator that outputs a non-PRIMITIVE type (e.g. STRUCT) and all downstream operators are row-mode, i.e. not vector output. If "usesVectorUDFAdaptor:": "false" were true, it would mean at least one vectorized expression is using VectorUDFAdaptor. And "allNative:" will be "true" when all operators are native. Today, GROUP BY and FILE SINK are not native. MAP JOIN and REDUCE SINK are conditionally native. FILTER and SELECT are native. 
{code} PLAN VECTORIZATION: enabled: true enabledConditionsMet: [hive.vectorized.execution.enabled IS true] STAGE DEPENDENCIES: Stage-1 is a root stage Stage-0 depends on stages: Stage-1 STAGE PLANS: Stage: Stage-1 Tez ... Edges: Reducer 2 <- Map 1 (SIMPLE_EDGE) Reducer 3 <- Reducer 2 (SIMPLE_EDGE) ... Vertices: Map 1 Map Operator Tree: TableScan alias: alltypesorc Statistics: Num rows: 12288 Data size: 36696 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: cint (type: int) outputColumnNames: cint Statistics: Num rows: 12288 Data size: 36696 Basic stats: COMPLETE Column stats: COMPLETE Group By Operator keys: cint (type: int) mode: hash outputColumnNames: _col0 Statistics: Num rows: 5775 Data size: 17248 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: _col0 (type: int) sort order: + Map-reduce partition columns: _col0 (type: int) Statistics: Num rows: 5775 Data size: 17248 Basic stats: COMPLETE Column stats: COMPLETE Execution mode: vectorized, llap LLAP IO: all inputs Map Vectorization: enabled: true enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS true groupByVectorOutput: true inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat allNative: false usesVectorUDFAdaptor: false vectorized: true Reducer 2 Execution mode: vectorized, llap Reduce Vectorization: enabled: true enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true groupByVectorOutput: false allNative: false usesVectorUDFAdaptor: false vectorized: true Reduce Operator Tree: Group By Operator keys: KEY._col0 (type: int) mode: mergepartial outputColumnNames: _col0 Statistics: Num rows: 5775 Data size: 17248 Basic stats: COMPLETE Column stats: COMPLETE Group By Operator aggregations: sum(_col0), count(_col0), avg(_col0), std(_col0) mode: hash outputColumnNames: _col0, _col1, _col2, _col3 Statistics: Num rows: 1
[jira] [Commented] (HIVE-12765) Support Intersect (distinct/all) Except (distinct/all) Minus (distinct/all)
[ https://issues.apache.org/jira/browse/HIVE-12765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15568670#comment-15568670 ] Hive QA commented on HIVE-12765: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12832788/HIVE-12765.02.patch {color:green}SUCCESS:{color} +1 due to 5 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 185 failed/errored test(s), 10641 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_3] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join8] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_input26] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[input25] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[input26] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[join8] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[keyword_2] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[limit0] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[list_bucket_query_oneskew_2] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[load_dyn_part14] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_field_garbage] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[smb_mapjoin_25] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[timestamp] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[union_remove_25] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_null_projection] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[autoColumnStats_1] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[autoColumnStats_2] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[explainuser_1] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[udaf_collect_set_2] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_top_level] 
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_null_projection] org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[authorization_cannot_create_all_role] org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[authorization_cannot_create_none_role] org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[cte_with_in_subquery] org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[lateral_view_join] org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[subq_insert] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[auto_join8] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[join8] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[load_dyn_part14] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[smb_mapjoin_25] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_25] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_top_level] org.apache.hadoop.hive.ql.parse.TestParseNegativeDriver.testCliDriver[missing_overwrite] org.apache.hadoop.hive.ql.parse.TestSQL11ReservedKeyWordsNegative.testSQL11ReservedKeyWords_ALL org.apache.hadoop.hive.ql.parse.TestSQL11ReservedKeyWordsNegative.testSQL11ReservedKeyWords_ALTER org.apache.hadoop.hive.ql.parse.TestSQL11ReservedKeyWordsNegative.testSQL11ReservedKeyWords_ARRAY org.apache.hadoop.hive.ql.parse.TestSQL11ReservedKeyWordsNegative.testSQL11ReservedKeyWords_AS org.apache.hadoop.hive.ql.parse.TestSQL11ReservedKeyWordsNegative.testSQL11ReservedKeyWords_AUTHORIZATION org.apache.hadoop.hive.ql.parse.TestSQL11ReservedKeyWordsNegative.testSQL11ReservedKeyWords_BETWEEN org.apache.hadoop.hive.ql.parse.TestSQL11ReservedKeyWordsNegative.testSQL11ReservedKeyWords_BIGINT org.apache.hadoop.hive.ql.parse.TestSQL11ReservedKeyWordsNegative.testSQL11ReservedKeyWords_BINARY org.apache.hadoop.hive.ql.parse.TestSQL11ReservedKeyWordsNegative.testSQL11ReservedKeyWords_BOOLEAN 
org.apache.hadoop.hive.ql.parse.TestSQL11ReservedKeyWordsNegative.testSQL11ReservedKeyWords_BOTH org.apache.hadoop.hive.ql.parse.TestSQL11ReservedKeyWordsNegative.testSQL11ReservedKeyWords_BY org.apache.hadoop.hive.ql.parse.TestSQL11ReservedKeyWordsNegative.testSQL11ReservedKeyWords_CREATE org.apache.hadoop.hive.ql.parse.TestSQL11ReservedKeyWordsNegative.testSQL11ReservedKeyWords_CUBE org.apache.hadoop.hive.ql.parse.TestSQL11ReservedKeyWordsNegative.testSQL11ReservedKeyWords_CURRENT_DATE org.apache.hadoop.hive.ql.parse.TestSQL11ReservedKeyWordsNegative.testSQL11ReservedKeyWords_CURRENT_TIMESTAMP org.apache.hadoop.hive.ql.parse.TestSQL11ReservedKeyWordsNegative.testSQL11ReservedKeyWords_CURSOR org.apache.hadoop.hive.ql.parse.TestSQL11ReservedKeyWordsNegative.testSQL11ReservedKeyWords_DATE org.apache.hadoop.hive.ql.parse.TestSQL11ReservedKeyWordsNegative.testSQL11ReservedKeyWords_DECIMAL org.apache.hadoop.hive.ql.parse.TestSQL11ReservedKeyWordsNegative.testSQL11ReservedKeyWords_DEL
[jira] [Updated] (HIVE-11394) Enhance EXPLAIN display for vectorization
[ https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-11394: Status: Patch Available (was: In Progress) > Enhance EXPLAIN display for vectorization > - > > Key: HIVE-11394 > URL: https://issues.apache.org/jira/browse/HIVE-11394 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-11394.01.patch, HIVE-11394.02.patch, > HIVE-11394.03.patch, HIVE-11394.04.patch, HIVE-11394.05.patch, > HIVE-11394.06.patch, HIVE-11394.07.patch, HIVE-11394.08.patch, > HIVE-11394.09.patch, HIVE-11394.091.patch > > > Add detail to the EXPLAIN output showing why a Map and Reduce work is not > vectorized. > New syntax is: EXPLAIN VECTORIZATION \[ONLY\] > \[SUMMARY|OPERATOR|EXPRESSION|DETAIL\] > The ONLY option suppresses most non-vectorization elements. > SUMMARY shows vectorization information for the PLAN (is vectorization > enabled) and a summary of Map and Reduce work. > OPERATOR shows vectorization information for operators. E.g. Filter > Vectorization. It includes all information of SUMMARY, too. > EXPRESSION shows vectorization information for expressions. E.g. > predicateExpression. It includes all information of SUMMARY and OPERATOR, > too. > DETAIL shows very vectorization information. > It includes all information of SUMMARY, OPERATOR, and EXPRESSION too. > The optional clause defaults are not ONLY and SUMMARY. > --- > Here are some examples: > EXPLAIN VECTORIZATION example: > (Note the PLAN VECTORIZATION, Map Vectorization, Reduce Vectorization > sections) > Since SUMMARY is the default, it is the output of EXPLAIN VECTORIZATION > SUMMARY. 
> Under Reducer 3’s "Reduce Vectorization:" you’ll see > notVectorizedReason: Aggregation Function UDF avg parameter expression for > GROUPBY operator: Data type struct of > Column\[VALUE._col2\] not supported > For Reducer 2’s "Reduce Vectorization:" you’ll see "groupByVectorOutput:": > "false" which says a node has a GROUP BY with an AVG or some other aggregator > that outputs a non-PRIMITIVE type (e.g. STRUCT) and all downstream operators > are row-mode. I.e. not vector output. > If "usesVectorUDFAdaptor:": "false" were true, it would say there was at > least one vectorized expression is using VectorUDFAdaptor. > And, "allNative:": "false" will be true when all operators are native. > Today, GROUP BY and FILE SINK are not native. MAP JOIN and REDUCE SINK are > conditionally native. FILTER and SELECT are native. > {code} > PLAN VECTORIZATION: > enabled: true > enabledConditionsMet: [hive.vectorized.execution.enabled IS true] > STAGE DEPENDENCIES: > Stage-1 is a root stage > Stage-0 depends on stages: Stage-1 > STAGE PLANS: > Stage: Stage-1 > Tez > ... > Edges: > Reducer 2 <- Map 1 (SIMPLE_EDGE) > Reducer 3 <- Reducer 2 (SIMPLE_EDGE) > ... 
> Vertices: > Map 1 > Map Operator Tree: > TableScan > alias: alltypesorc > Statistics: Num rows: 12288 Data size: 36696 Basic stats: > COMPLETE Column stats: COMPLETE > Select Operator > expressions: cint (type: int) > outputColumnNames: cint > Statistics: Num rows: 12288 Data size: 36696 Basic stats: > COMPLETE Column stats: COMPLETE > Group By Operator > keys: cint (type: int) > mode: hash > outputColumnNames: _col0 > Statistics: Num rows: 5775 Data size: 17248 Basic > stats: COMPLETE Column stats: COMPLETE > Reduce Output Operator > key expressions: _col0 (type: int) > sort order: + > Map-reduce partition columns: _col0 (type: int) > Statistics: Num rows: 5775 Data size: 17248 Basic > stats: COMPLETE Column stats: COMPLETE > Execution mode: vectorized, llap > LLAP IO: all inputs > Map Vectorization: > enabled: true > enabledConditionsMet: > hive.vectorized.use.vectorized.input.format IS true > groupByVectorOutput: true > inputFileFormats: > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat > allNative: false > usesVectorUDFAdaptor: false > vectorized: true > Reducer 2 > Execution mode: vectorized, llap > Reduce Vectorization: > enabled: true >
[jira] [Updated] (HIVE-11394) Enhance EXPLAIN display for vectorization
[ https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-11394: Attachment: HIVE-11394.091.patch > Enhance EXPLAIN display for vectorization > - > > Key: HIVE-11394 > URL: https://issues.apache.org/jira/browse/HIVE-11394 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-11394.01.patch, HIVE-11394.02.patch, > HIVE-11394.03.patch, HIVE-11394.04.patch, HIVE-11394.05.patch, > HIVE-11394.06.patch, HIVE-11394.07.patch, HIVE-11394.08.patch, > HIVE-11394.09.patch, HIVE-11394.091.patch > > > Add detail to the EXPLAIN output showing why a Map and Reduce work is not > vectorized. > New syntax is: EXPLAIN VECTORIZATION \[ONLY\] > \[SUMMARY|OPERATOR|EXPRESSION|DETAIL\] > The ONLY option suppresses most non-vectorization elements. > SUMMARY shows vectorization information for the PLAN (is vectorization > enabled) and a summary of Map and Reduce work. > OPERATOR shows vectorization information for operators. E.g. Filter > Vectorization. It includes all information of SUMMARY, too. > EXPRESSION shows vectorization information for expressions. E.g. > predicateExpression. It includes all information of SUMMARY and OPERATOR, > too. > DETAIL shows very vectorization information. > It includes all information of SUMMARY, OPERATOR, and EXPRESSION too. > The optional clause defaults are not ONLY and SUMMARY. > --- > Here are some examples: > EXPLAIN VECTORIZATION example: > (Note the PLAN VECTORIZATION, Map Vectorization, Reduce Vectorization > sections) > Since SUMMARY is the default, it is the output of EXPLAIN VECTORIZATION > SUMMARY. 
> Under Reducer 3’s "Reduce Vectorization:" you’ll see > notVectorizedReason: Aggregation Function UDF avg parameter expression for > GROUPBY operator: Data type struct of > Column\[VALUE._col2\] not supported > For Reducer 2’s "Reduce Vectorization:" you’ll see "groupByVectorOutput:": > "false" which says a node has a GROUP BY with an AVG or some other aggregator > that outputs a non-PRIMITIVE type (e.g. STRUCT) and all downstream operators > are row-mode. I.e. not vector output. > If "usesVectorUDFAdaptor:": "false" were true, it would say there was at > least one vectorized expression is using VectorUDFAdaptor. > And, "allNative:": "false" will be true when all operators are native. > Today, GROUP BY and FILE SINK are not native. MAP JOIN and REDUCE SINK are > conditionally native. FILTER and SELECT are native. > {code} > PLAN VECTORIZATION: > enabled: true > enabledConditionsMet: [hive.vectorized.execution.enabled IS true] > STAGE DEPENDENCIES: > Stage-1 is a root stage > Stage-0 depends on stages: Stage-1 > STAGE PLANS: > Stage: Stage-1 > Tez > ... > Edges: > Reducer 2 <- Map 1 (SIMPLE_EDGE) > Reducer 3 <- Reducer 2 (SIMPLE_EDGE) > ... 
> Vertices: > Map 1 > Map Operator Tree: > TableScan > alias: alltypesorc > Statistics: Num rows: 12288 Data size: 36696 Basic stats: > COMPLETE Column stats: COMPLETE > Select Operator > expressions: cint (type: int) > outputColumnNames: cint > Statistics: Num rows: 12288 Data size: 36696 Basic stats: > COMPLETE Column stats: COMPLETE > Group By Operator > keys: cint (type: int) > mode: hash > outputColumnNames: _col0 > Statistics: Num rows: 5775 Data size: 17248 Basic > stats: COMPLETE Column stats: COMPLETE > Reduce Output Operator > key expressions: _col0 (type: int) > sort order: + > Map-reduce partition columns: _col0 (type: int) > Statistics: Num rows: 5775 Data size: 17248 Basic > stats: COMPLETE Column stats: COMPLETE > Execution mode: vectorized, llap > LLAP IO: all inputs > Map Vectorization: > enabled: true > enabledConditionsMet: > hive.vectorized.use.vectorized.input.format IS true > groupByVectorOutput: true > inputFileFormats: > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat > allNative: false > usesVectorUDFAdaptor: false > vectorized: true > Reducer 2 > Execution mode: vectorized, llap > Reduce Vectorization: > enabled: true > en
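A hypothetical invocation sketch (the exact query behind the plan above is not shown in this message; the query below is illustrative only, matching the column and table names that appear in the plan):

{code:sql}
-- SUMMARY is the default level, so these two are equivalent:
EXPLAIN VECTORIZATION SELECT cint FROM alltypesorc GROUP BY cint;
EXPLAIN VECTORIZATION SUMMARY SELECT cint FROM alltypesorc GROUP BY cint;

-- Most verbose form, suppressing most non-vectorization plan elements:
EXPLAIN VECTORIZATION ONLY DETAIL SELECT cint FROM alltypesorc GROUP BY cint;
{code}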
[jira] [Commented] (HIVE-11394) Enhance EXPLAIN display for vectorization
[ https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15568740#comment-15568740 ] Matt McCline commented on HIVE-11394: - Patch #91 is Hive QA #1512?
[jira] [Updated] (HIVE-11394) Enhance EXPLAIN display for vectorization
[ https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-11394: Status: In Progress (was: Patch Available)
[jira] [Commented] (HIVE-13046) DependencyResolver should not lowercase the dependency URI's authority
[ https://issues.apache.org/jira/browse/HIVE-13046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15568823#comment-15568823 ] Hive QA commented on HIVE-13046: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12832794/HIVE-13046.2.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 10636 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver[hbase_bulk] {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1499/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1499/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1499/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12832794 - PreCommit-HIVE-Build > DependencyResolver should not lowercase the dependency URI's authority > -- > > Key: HIVE-13046 > URL: https://issues.apache.org/jira/browse/HIVE-13046 > Project: Hive > Issue Type: Bug >Reporter: Anthony Hsu >Assignee: Anthony Hsu > Attachments: HIVE-13046.1.patch, HIVE-13046.2.patch > > > When using {{ADD JAR ivy://...}} to add a jar version {{1.2.3-SNAPSHOT}}, > Hive will lowercase it to {{1.2.3-snapshot}} due to: > {code:title=DependencyResolver.java#84} > String[] authorityTokens = authority.toLowerCase().split(":"); > {code} > We should not {{.lowerCase()}}. > RB: https://reviews.apache.org/r/43513 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
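A hedged sketch of the failure mode described in this issue (the ivy coordinates below are made up for illustration):

{code:sql}
-- The version segment of the URI authority is case-sensitive:
ADD JAR ivy://org.example:mylib:1.2.3-SNAPSHOT;
-- Before this fix, DependencyResolver lowercased the whole authority, so Hive
-- tried to resolve org.example:mylib:1.2.3-snapshot, which does not exist.
{code}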
[jira] [Commented] (HIVE-14640) handle hive.merge.*files in select queries
[ https://issues.apache.org/jira/browse/HIVE-14640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15568827#comment-15568827 ] Hive QA commented on HIVE-14640: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12832789/HIVE-14640.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1500/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1500/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1500/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N' 2016-10-12 14:08:13.340 + [[ -n /usr/java/jdk1.8.0_25 ]] + export JAVA_HOME=/usr/java/jdk1.8.0_25 + JAVA_HOME=/usr/java/jdk1.8.0_25 + export PATH=/usr/java/jdk1.8.0_25/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + PATH=/usr/java/jdk1.8.0_25/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-Build-1500/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! 
-d apache-github-source-source ]] + date '+%Y-%m-%d %T.%3N' 2016-10-12 14:08:13.343 + cd apache-github-source-source + git fetch origin + git reset --hard HEAD HEAD is now at 1f258e9 HIVE-14877: Move slow CliDriver tests to MiniLlap (Prasanth Jayachandran reviewed by Siddharth Seth) + git clean -f -d + git checkout master Already on 'master' Your branch is up-to-date with 'origin/master'. + git reset --hard origin/master HEAD is now at 1f258e9 HIVE-14877: Move slow CliDriver tests to MiniLlap (Prasanth Jayachandran reviewed by Siddharth Seth) + git merge --ff-only origin/master Already up-to-date. + date '+%Y-%m-%d %T.%3N' 2016-10-12 14:08:14.396 + patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch error: patch failed: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java:3123 error: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java: patch does not apply error: patch failed: ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java:253 error: ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java: patch does not apply error: patch failed: ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java:98 error: ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java: patch does not apply error: patch failed: ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java:315 error: ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java: patch does not apply error: patch failed: ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java:1420 error: ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java: patch does not apply error: patch failed: ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java:1256 
error: ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java: patch does not apply error: patch failed: ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java:1676 error: ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java: patch does not apply error: patch failed: ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java:304 error: ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java: patch does not apply error: patch failed: ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:6575 error: ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java: patch does not apply error: patch failed: ql/src/java/org/apache/hadoop/hive/ql/plan/FileSinkDesc.java:255 error: ql/src/java/org/apache/hadoop/hive/ql/plan/FileSinkDesc.java: patch does not apply error: ql/src/test/queries/clientpositive/mm_all.q: No such file or directory error: ql/src/test/queries/clientpositive/mm_current.q: No such file or directory error: ql/src/test/results/clientpositive/llap/mm_all.q.out: No such file or
[jira] [Commented] (HIVE-13280) Error when more than 1 mapper for HBase storage handler
[ https://issues.apache.org/jira/browse/HIVE-13280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15568866#comment-15568866 ] Vsevolod Ostapenko commented on HIVE-13280: --- Hive-Hbase integration documentation (https://cwiki.apache.org/confluence/display/Hive/HBaseIntegration) claims that hbase.mapred.output.outputtable property is optional, and provides no good explanation under what circumstances one would want or need to define it. In all the provided samples values of hbase.mapred.output.outputtable and hbase.table.name are the same, so samples are not helpful and not self-explanatory. If TEZ does require hbase.mapred.output.outputtable property to be explicitly set, documentation needs to be updated to indicate that fact. Also, it would be helpful to provide some background why this property exists in the first place. > Error when more than 1 mapper for HBase storage handler > --- > > Key: HIVE-13280 > URL: https://issues.apache.org/jira/browse/HIVE-13280 > Project: Hive > Issue Type: Bug > Components: HBase Handler >Affects Versions: 2.0.0 >Reporter: Damien Carol >Assignee: Damien Carol > > With a simple query (select from orc table and insert into HBase external > table): > {code:sql} > insert into table register.register select * from aa_temp > {code} > The aa_temp table has 45 ORC files. It generates 45 mappers. 
> Some mappers fail with this error: > {noformat} > Caused by: java.lang.IllegalArgumentException: Must specify table name > at > org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:188) > at > org.apache.hive.common.util.ReflectionUtil.setConf(ReflectionUtil.java:101) > at > org.apache.hive.common.util.ReflectionUtil.newInstance(ReflectionUtil.java:87) > at > org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:300) > at > org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:290) > at > org.apache.hadoop.hive.ql.exec.FileSinkOperator.createHiveOutputFormat(FileSinkOperator.java:1126) > ... 25 more > ]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 > killedTasks:35, Vertex vertex_1457964631631_0015_3_00 [Map 1] killed/failed > due to:OWN_TASK_FAILURE]DAG did not succeed due to VERTEX_FAILURE. > failedVertices:1 killedVertices:0 (state=08S01,code=2) > {noformat} > If I do an ALTER CONCATENATE for aa_temp and redo the query, everything is > fine because there is only one mapper. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
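As a hedged illustration of the workaround discussed in this issue (the column schema and mapping below are hypothetical, not taken from the report), setting hbase.mapred.output.outputtable explicitly when declaring the HBase-backed table avoids the "Must specify table name" failure:

{code:sql}
CREATE EXTERNAL TABLE register.register (key string, value string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,cf:value')
TBLPROPERTIES (
  'hbase.table.name' = 'register',
  'hbase.mapred.output.outputtable' = 'register'
);
{code}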
[jira] [Updated] (HIVE-13798) Fix the unit test failure org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload
[ https://issues.apache.org/jira/browse/HIVE-13798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-13798: Fix Version/s: 2.1.1 > Fix the unit test failure > org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload > > > Key: HIVE-13798 > URL: https://issues.apache.org/jira/browse/HIVE-13798 > Project: Hive > Issue Type: Sub-task >Reporter: Aihua Xu >Assignee: Aihua Xu > Fix For: 2.2.0, 2.1.1 > > Attachments: HIVE-13798.2.patch, HIVE-13798.3.patch, > HIVE-13798.4.patch, HIVE-13798.addendum.patch, HIVE-13798.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13798) Fix the unit test failure org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload
[ https://issues.apache.org/jira/browse/HIVE-13798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15568925#comment-15568925 ] Aihua Xu commented on HIVE-13798: - Thanks [~leftylev] for catching it. Yeah. The addendum is needed. > Fix the unit test failure > org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload > > > Key: HIVE-13798 > URL: https://issues.apache.org/jira/browse/HIVE-13798 > Project: Hive > Issue Type: Sub-task >Reporter: Aihua Xu >Assignee: Aihua Xu > Fix For: 2.2.0, 2.1.1 > > Attachments: HIVE-13798.2.patch, HIVE-13798.3.patch, > HIVE-13798.4.patch, HIVE-13798.addendum.patch, HIVE-13798.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14928) Analyze table no scan mess up schema
[ https://issues.apache.org/jira/browse/HIVE-14928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HIVE-14928: --- Status: Open (was: Patch Available) > Analyze table no scan mess up schema > > > Key: HIVE-14928 > URL: https://issues.apache.org/jira/browse/HIVE-14928 > Project: Hive > Issue Type: Bug >Reporter: Jimmy Xiang >Assignee: Jimmy Xiang > Attachments: HIVE-14928.1.patch, HIVE-14928.2.patch > > > StatsNoJobTask uses static variables partUpdates and table to track stats > changes. If multiple analyze no scan tasks run at the same time, then > table/partition schema could mess up. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14928) Analyze table no scan mess up schema
[ https://issues.apache.org/jira/browse/HIVE-14928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HIVE-14928: --- Attachment: HIVE-14928.2.patch Rebased the patch to the latest master. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14928) Analyze table no scan mess up schema
[ https://issues.apache.org/jira/browse/HIVE-14928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HIVE-14928: --- Status: Patch Available (was: Open) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14930) RuntimeException was seen in explainanalyze_3.q test log
[ https://issues.apache.org/jira/browse/HIVE-14930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-14930: --- Resolution: Fixed Fix Version/s: 2.2.0 Status: Resolved (was: Patch Available) Committed to 2.2.0, thanks [~pxiong] for review. > RuntimeException was seen in explainanalyze_3.q test log > > > Key: HIVE-14930 > URL: https://issues.apache.org/jira/browse/HIVE-14930 > Project: Hive > Issue Type: Bug >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang >Priority: Minor > Fix For: 2.2.0 > > Attachments: HIVE-14930.patch > > > When working on HIVE-14799, I noticed there were some RuntimeExceptions when > running explainanalyze_3.q and explainanalyze_5.q, though these tests showed > as successful. > {code} > 2016-10-10T19:02:48,455 ERROR [aa5c6743-b5de-40fc-82da-5dde0e6b387f main] > ql.Driver: FAILED: Hive Internal Error: java.lang.RuntimeException(Cannot > overwrite read-only table: src) > java.lang.RuntimeException: Cannot overwrite read-only table: src > at > org.apache.hadoop.hive.ql.hooks.EnforceReadOnlyTables.run(EnforceReadOnlyTables.java:74) > at > org.apache.hadoop.hive.ql.hooks.EnforceReadOnlyTables.run(EnforceReadOnlyTables.java:56) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1736) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1505) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1218) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1208) > at > org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:106) > at > org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:251) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:504) > at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1298) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1436) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1218) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1208) > at > 
org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184) > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:400) > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:336) > at > org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1319) > at > org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:1293) > at > org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:173) > at > org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:104) > at > org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver(TestMiniTezCliDriver.java:59) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.apache.hadoop.hive.cli.control.CliAdapter$2$1.evaluate(CliAdapter.java:92) > at org.junit.rules.RunRules.evaluate(RunRules.java:20) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236) > at 
org.junit.runners.ParentRunner.access$000(ParentRunner.java:53) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229) > at org.junit.runners.ParentRunner.run(ParentRunner.java:309) > at org.junit.runners.Suite.runChild(Suite.java:127) > at org.junit.runners.Suite.runChild(Suite.java:26) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.
[jira] [Commented] (HIVE-13046) DependencyResolver should not lowercase the dependency URI's authority
[ https://issues.apache.org/jira/browse/HIVE-13046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15568973#comment-15568973 ] Hive QA commented on HIVE-13046: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12832794/HIVE-13046.2.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:green}SUCCESS:{color} +1 due to 10636 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1501/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1501/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1501/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12832794 - PreCommit-HIVE-Build -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14476) Fix logging issue for branch-1
[ https://issues.apache.org/jira/browse/HIVE-14476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15568982#comment-15568982 ] Hive QA commented on HIVE-14476: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12832801/HIVE-14476.1-branch-1.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1502/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1502/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1502/ Messages: {noformat} This message was trimmed, see log for full details [INFO] [INFO] [INFO] Building Hive Query Language 1.3.0-SNAPSHOT [INFO] [INFO] [INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hive-exec --- [INFO] Deleting /data/hive-ptest/working/apache-github-source-source/ql/target [INFO] Deleting /data/hive-ptest/working/apache-github-source-source/ql (includes = [datanucleus.log, derby.log], excludes = []) [INFO] [INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-no-snapshots) @ hive-exec --- [INFO] [INFO] --- maven-antrun-plugin:1.7:run (generate-sources) @ hive-exec --- [INFO] Executing tasks main: [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/ql/target/generated-sources/java/org/apache/hadoop/hive/ql/exec/vector/expressions/gen [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/ql/target/generated-sources/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/ql/target/generated-test-sources/java/org/apache/hadoop/hive/ql/exec/vector/expressions/gen Generating vector expression code Generating vector expression test code [INFO] Executed tasks [INFO] [INFO] --- build-helper-maven-plugin:1.8:add-source (add-source) @ hive-exec --- [INFO] Source 
directory: /data/hive-ptest/working/apache-github-source-source/ql/src/gen/protobuf/gen-java added. [INFO] Source directory: /data/hive-ptest/working/apache-github-source-source/ql/src/gen/thrift/gen-javabean added. [INFO] Source directory: /data/hive-ptest/working/apache-github-source-source/ql/target/generated-sources/java added. [INFO] [INFO] --- antlr3-maven-plugin:3.4:antlr (default) @ hive-exec --- [INFO] ANTLR: Processing source directory /data/hive-ptest/working/apache-github-source-source/ql/src/java ANTLR Parser Generator Version 3.4 org/apache/hadoop/hive/ql/parse/HiveLexer.g org/apache/hadoop/hive/ql/parse/HiveParser.g warning(200): IdentifiersParser.g:455:5: Decision can match input such as "{KW_REGEXP, KW_RLIKE} KW_ORDER KW_BY" using multiple alternatives: 2, 9 As a result, alternative(s) 9 were disabled for that input warning(200): IdentifiersParser.g:455:5: Decision can match input such as "{KW_REGEXP, KW_RLIKE} KW_UNION KW_FROM" using multiple alternatives: 2, 9 As a result, alternative(s) 9 were disabled for that input warning(200): IdentifiersParser.g:455:5: Decision can match input such as "{KW_REGEXP, KW_RLIKE} KW_UNION KW_ALL" using multiple alternatives: 2, 9 As a result, alternative(s) 9 were disabled for that input warning(200): IdentifiersParser.g:455:5: Decision can match input such as "{KW_REGEXP, KW_RLIKE} KW_SORT KW_BY" using multiple alternatives: 2, 9 As a result, alternative(s) 9 were disabled for that input warning(200): IdentifiersParser.g:455:5: Decision can match input such as "{KW_REGEXP, KW_RLIKE} KW_INSERT KW_INTO" using multiple alternatives: 2, 9 As a result, alternative(s) 9 were disabled for that input warning(200): IdentifiersParser.g:455:5: Decision can match input such as "{KW_REGEXP, KW_RLIKE} KW_UNION KW_SELECT" using multiple alternatives: 2, 9 As a result, alternative(s) 9 were disabled for that input warning(200): IdentifiersParser.g:455:5: Decision can match input such as "{KW_REGEXP, KW_RLIKE} KW_INSERT 
KW_OVERWRITE" using multiple alternatives: 2, 9 As a result, alternative(s) 9 were disabled for that input warning(200): IdentifiersParser.g:455:5: Decision can match input such as "{KW_REGEXP, KW_RLIKE} KW_UNION KW_MAP" using multiple alternatives: 2, 9 As a result, alternative(s) 9 were disabled for that input warning(200): IdentifiersParser.g:455:5: Decision can match input such as "{KW_REGEXP, KW_RLIKE} KW_MAP LPAREN" using multiple alternatives: 2, 9 As a result, alternative(s) 9 were disabled for that input warning(200): IdentifiersParser.g:455:5: Decision can match input such as "{KW_REGEXP, KW_RLIKE} KW_GROUP KW_BY" using multiple alternatives: 2, 9 As a re
[jira] [Comment Edited] (HIVE-13280) Error when more than 1 mapper for HBase storage handler
[ https://issues.apache.org/jira/browse/HIVE-13280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15568866#comment-15568866 ] Vsevolod Ostapenko edited comment on HIVE-13280 at 10/12/16 3:08 PM: - The Hive-HBase integration documentation (https://cwiki.apache.org/confluence/display/Hive/HBaseIntegration) states that the hbase.mapred.output.outputtable property is optional and needed only when one wants to insert into a table. The latter statement is obviously incorrect, as prior to Feb 26, 2016, this property wasn't even documented and inserts into HBase-backed tables were working just fine with the MR engine. If Tez does require the hbase.mapred.output.outputtable property to be explicitly set, the documentation needs to be updated to indicate that fact. One more thing: all the existing samples have hbase.mapred.output.outputtable and hbase.table.name set to the same value. If there is no use case where they differ, why is the former even needed? was (Author: seva_ostapenko): The Hive-HBase integration documentation (https://cwiki.apache.org/confluence/display/Hive/HBaseIntegration) claims that the hbase.mapred.output.outputtable property is optional, and provides no good explanation of the circumstances under which one would want or need to define it. In all the provided samples the values of hbase.mapred.output.outputtable and hbase.table.name are the same, so the samples are not helpful and not self-explanatory. If Tez does require the hbase.mapred.output.outputtable property to be explicitly set, the documentation needs to be updated to indicate that fact. Also, it would be helpful to provide some background on why this property exists in the first place. 
> Error when more than 1 mapper for HBase storage handler > --- > > Key: HIVE-13280 > URL: https://issues.apache.org/jira/browse/HIVE-13280 > Project: Hive > Issue Type: Bug > Components: HBase Handler >Affects Versions: 2.0.0 >Reporter: Damien Carol >Assignee: Damien Carol > > With a simple query (select from an ORC table and insert into an HBase external > table): > {code:sql} > insert into table register.register select * from aa_temp > {code} > The aa_temp table has 45 ORC files. It generates 45 mappers. > Some mappers fail with this error: > {noformat} > Caused by: java.lang.IllegalArgumentException: Must specify table name > at > org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:188) > at > org.apache.hive.common.util.ReflectionUtil.setConf(ReflectionUtil.java:101) > at > org.apache.hive.common.util.ReflectionUtil.newInstance(ReflectionUtil.java:87) > at > org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:300) > at > org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:290) > at > org.apache.hadoop.hive.ql.exec.FileSinkOperator.createHiveOutputFormat(FileSinkOperator.java:1126) > ... 25 more > ]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 > killedTasks:35, Vertex vertex_1457964631631_0015_3_00 [Map 1] killed/failed > due to:OWN_TASK_FAILURE]DAG did not succeed due to VERTEX_FAILURE. > failedVertices:1 killedVertices:0 (state=08S01,code=2) > {noformat} > If I do an ALTER CONCATENATE on aa_temp and redo the query, everything is > fine because there is only one mapper. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
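For reference, a table definition along the lines discussed in this issue would set both properties explicitly; the schema and column mapping below are illustrative, not taken from the reporter's environment:

```sql
CREATE EXTERNAL TABLE register.register (key STRING, val STRING)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,cf:val')
TBLPROPERTIES (
  'hbase.table.name' = 'register',               -- HBase table Hive reads from
  'hbase.mapred.output.outputtable' = 'register' -- HBase table writes go to; needed for inserts
);
```

With hbase.mapred.output.outputtable unset, TableOutputFormat.setConf has no table name to configure, which matches the "Must specify table name" failure above.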
[jira] [Updated] (HIVE-14922) Add perf logging for post job completion steps
[ https://issues.apache.org/jira/browse/HIVE-14922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-14922: Resolution: Fixed Fix Version/s: 2.2.0 Status: Resolved (was: Patch Available) Pushed to master. Failed test passed when re-run locally with patch. > Add perf logging for post job completion steps > --- > > Key: HIVE-14922 > URL: https://issues.apache.org/jira/browse/HIVE-14922 > Project: Hive > Issue Type: Task > Components: Logging >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Fix For: 2.2.0 > > Attachments: HIVE-14922.patch > > > Mostly FS related operations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14916) Reduce the memory requirements for Spark tests
[ https://issues.apache.org/jira/browse/HIVE-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569113#comment-15569113 ] Hive QA commented on HIVE-14916: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12832814/HIVE-14916.003.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:green}SUCCESS:{color} +1 due to 10636 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1503/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1503/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1503/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12832814 - PreCommit-HIVE-Build > Reduce the memory requirements for Spark tests > -- > > Key: HIVE-14916 > URL: https://issues.apache.org/jira/browse/HIVE-14916 > Project: Hive > Issue Type: Sub-task >Reporter: Ferdinand Xu >Assignee: Dapeng Sun > Attachments: HIVE-14916.001.patch, HIVE-14916.002.patch, > HIVE-14916.003.patch > > > As HIVE-14887, we need to reduce the memory requirements for Spark tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14913) Add new unit tests
[ https://issues.apache.org/jira/browse/HIVE-14913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569135#comment-15569135 ] Ashutosh Chauhan commented on HIVE-14913: - [~vgarg] Can you add RB link for this? > Add new unit tests > -- > > Key: HIVE-14913 > URL: https://issues.apache.org/jira/browse/HIVE-14913 > Project: Hive > Issue Type: Task > Components: Tests >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-14913.1.patch, HIVE-14913.2.patch, > HIVE-14913.3.patch, HIVE-14913.4.patch > > > Moving bunch of tests from system test to hive unit tests to reduce testing > overhead -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11957) SHOW TRANSACTIONS should show queryID/agent id of the creator
[ https://issues.apache.org/jira/browse/HIVE-11957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569142#comment-15569142 ] Wei Zheng commented on HIVE-11957: -- [~ekoifman] Can you take a look? SHOW TRANSACTIONS now outputs like this: {code}
hive> show transactions;
OK
Transaction ID  Transaction State  Started Time                  Last Heartbeat Time           User    Hostname
16              OPEN               Mon Oct 10 11:26:14 PDT 2016  Mon Oct 10 11:26:14 PDT 2016  wzheng  weimac.local
Time taken: 0.028 seconds, Fetched: 2 row(s)
{code} > SHOW TRANSACTIONS should show queryID/agent id of the creator > - > > Key: HIVE-11957 > URL: https://issues.apache.org/jira/browse/HIVE-11957 > Project: Hive > Issue Type: Improvement > Components: Transactions >Affects Versions: 1.0.0 >Reporter: Eugene Koifman >Assignee: Wei Zheng > Attachments: HIVE-11957.1.patch, HIVE-11957.2.patch, > HIVE-11957.3.patch, HIVE-11957.4.patch, HIVE-11957.5.patch > > > this would be very useful for debugging > should also include heartbeat/create timestamps > would be nice to support some filtering/sorting options, like sort by create > time, agent id. filter by table, database, etc -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-14835) Improve ptest2 build time
[ https://issues.apache.org/jira/browse/HIVE-14835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569154#comment-15569154 ] Siddharth Seth edited comment on HIVE-14835 at 10/12/16 4:20 PM: - [~prasanth_j] - did this go in again? Can the jira be closed? was (Author: sseth): [~prasanth_j] - did this go in again? > Improve ptest2 build time > - > > Key: HIVE-14835 > URL: https://issues.apache.org/jira/browse/HIVE-14835 > Project: Hive > Issue Type: Sub-task > Components: Testing Infrastructure >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Fix For: 2.2.0 > > Attachments: HIVE-14835.1.patch > > > NO PRECOMMIT TESTS > 2 things can be improved > 1) ptest2 always downloads jars for compiling its own directory, which takes > about 1m30s but should take only 5s with cached jars. The reason for that is > maven.repo.local is pointing to a path under WORKSPACE which will be cleaned > by jenkins for every run. > 2) For the hive build we can make use of a parallel build and quiet the output of > the build, which should shave off another 15-30s. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14835) Improve ptest2 build time
[ https://issues.apache.org/jira/browse/HIVE-14835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569154#comment-15569154 ] Siddharth Seth commented on HIVE-14835: --- [~prasanth_j] - did this go in again? > Improve ptest2 build time > - > > Key: HIVE-14835 > URL: https://issues.apache.org/jira/browse/HIVE-14835 > Project: Hive > Issue Type: Sub-task > Components: Testing Infrastructure >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Fix For: 2.2.0 > > Attachments: HIVE-14835.1.patch > > > NO PRECOMMIT TESTS > 2 things can be improved > 1) ptest2 always downloads jars for compiling its own directory, which takes > about 1m30s but should take only 5s with cached jars. The reason for that is > maven.repo.local is pointing to a path under WORKSPACE which will be cleaned > by jenkins for every run. > 2) For the hive build we can make use of a parallel build and quiet the output of > the build, which should shave off another 15-30s. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-14539) Run additional tests from the module directory
[ https://issues.apache.org/jira/browse/HIVE-14539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth resolved HIVE-14539. --- Resolution: Done As part of HIVE-14540 > Run additional tests from the module directory > -- > > Key: HIVE-14539 > URL: https://issues.apache.org/jira/browse/HIVE-14539 > Project: Hive > Issue Type: Sub-task >Reporter: Siddharth Seth >Assignee: Siddharth Seth > > There's still close to 400 tests which run from the wrong directory (and end > up checking for file changes on more modules than required) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-14827) Micro benchmark for Parquet vectorized reader
[ https://issues.apache.org/jira/browse/HIVE-14827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sahil Takiar reassigned HIVE-14827: --- Assignee: Sahil Takiar > Micro benchmark for Parquet vectorized reader > - > > Key: HIVE-14827 > URL: https://issues.apache.org/jira/browse/HIVE-14827 > Project: Hive > Issue Type: Sub-task >Reporter: Ferdinand Xu >Assignee: Sahil Takiar > > We need a microbenchmark to evaluate the throughput and execution time for > Parquet vectorized reader. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14938) Add deployed ptest properties file to repo, update to remove isolated tests
[ https://issues.apache.org/jira/browse/HIVE-14938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated HIVE-14938: -- Attachment: HIVE-14938.part1.patch Initial config file - the existing one being used. > Add deployed ptest properties file to repo, update to remove isolated tests > --- > > Key: HIVE-14938 > URL: https://issues.apache.org/jira/browse/HIVE-14938 > Project: Hive > Issue Type: Task >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-14938.part1.patch > > > The intent is to checkin the original file, and then modify it to remove > isolated tests (and move relevant ones to the skipBatching list), which > normally lead to stragglers, and sub-optimal resource utilization. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14938) Add deployed ptest properties file to repo, update to remove isolated tests
[ https://issues.apache.org/jira/browse/HIVE-14938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated HIVE-14938: -- Attachment: HIVE-14938.part2.patch Revision on top of the first patch with changes to remove isolation, add batching for spark tests and encryptedhdfs tests, skipBatching for others. This includes changes made by [~prasanth_j] and me for internal runs, to improve the runtimes. [~prasanth_j], [~spena] - could you please take a look for sanity, before I commit these changes, and update the deployed ptest instance. > Add deployed ptest properties file to repo, update to remove isolated tests > --- > > Key: HIVE-14938 > URL: https://issues.apache.org/jira/browse/HIVE-14938 > Project: Hive > Issue Type: Task >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-14938.part1.patch, HIVE-14938.part2.patch > > > The intent is to checkin the original file, and then modify it to remove > isolated tests (and move relevant ones to the skipBatching list), which > normally lead to stragglers, and sub-optimal resource utilization. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14799) Query operation are not thread safe during its cancellation
[ https://issues.apache.org/jira/browse/HIVE-14799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569254#comment-15569254 ] Hive QA commented on HIVE-14799: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12832818/HIVE-14799.6.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 10636 tests executed *Failed tests:* {noformat} org.apache.hive.service.cli.TestEmbeddedThriftBinaryCLIService.testTaskStatus org.apache.hive.spark.client.TestSparkClient.testJobSubmission {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1504/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1504/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1504/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12832818 - PreCommit-HIVE-Build > Query operation are not thread safe during its cancellation > --- > > Key: HIVE-14799 > URL: https://issues.apache.org/jira/browse/HIVE-14799 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-14799.1.patch, HIVE-14799.2.patch, > HIVE-14799.3.patch, HIVE-14799.4.patch, HIVE-14799.5.patch, > HIVE-14799.5.patch, HIVE-14799.6.patch, HIVE-14799.patch > > > When a query is cancelled either via Beeline (Ctrl-C) or API call > TCLIService.Client.CancelOperation, SQLOperation.cancel is invoked in a > different thread from that running the query to close/destroy its > encapsulated Driver object. 
Neither SQLOperation nor Driver is thread-safe, > which can sometimes result in runtime exceptions like NPEs. The errors from > the running query are not handled properly, therefore probably causing some > resources (files, locks, etc.) not to be cleaned up after the query terminates. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
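The race described above can be sketched as follows. This is a minimal illustration of serializing cancellation against execution, not the actual SQLOperation/Driver code; class and field names are made up:

```java
// Minimal sketch: guard the non-thread-safe driver state with one lock so
// cancel() from another thread never tears it down mid-use.
import java.util.concurrent.locks.ReentrantLock;

class GuardedOperation {
    private final ReentrantLock lock = new ReentrantLock();
    private boolean cancelled = false;
    private Object driver;          // stands in for the non-thread-safe Driver

    public void start() {
        lock.lock();
        try {
            if (cancelled) return;  // cancelled before the query ever started
            driver = new Object();  // create/destroy driver only under the lock
        } finally {
            lock.unlock();
        }
    }

    public void cancel() {
        lock.lock();
        try {
            cancelled = true;
            driver = null;          // safe: no other thread holds the lock
        } finally {
            lock.unlock();
        }
    }

    public boolean isCancelled() {
        lock.lock();
        try { return cancelled; } finally { lock.unlock(); }
    }
}
```

Long-running work would still execute outside the lock, periodically checking the cancelled flag, so cancellation stays responsive without exposing the driver to concurrent access.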
[jira] [Commented] (HIVE-14721) Fix TestJdbcWithMiniHS2 runtime
[ https://issues.apache.org/jira/browse/HIVE-14721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569263#comment-15569263 ] Hive QA commented on HIVE-14721: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12832823/HIVE-14721.7.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1505/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1505/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1505/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N' 2016-10-12 17:07:09.998 + [[ -n /usr/java/jdk1.8.0_25 ]] + export JAVA_HOME=/usr/java/jdk1.8.0_25 + JAVA_HOME=/usr/java/jdk1.8.0_25 + export PATH=/usr/java/jdk1.8.0_25/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + PATH=/usr/java/jdk1.8.0_25/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-Build-1505/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! 
-d apache-github-source-source ]] + date '+%Y-%m-%d %T.%3N' 2016-10-12 17:07:10.000 + cd apache-github-source-source + git fetch origin + git reset --hard HEAD HEAD is now at 04b303b HIVE-14922 : Add perf logging for post job completion steps (Ashutosh Chauhan via Pengcheng Xiong) + git clean -f -d + git checkout master Already on 'master' Your branch is up-to-date with 'origin/master'. + git reset --hard origin/master HEAD is now at 04b303b HIVE-14922 : Add perf logging for post job completion steps (Ashutosh Chauhan via Pengcheng Xiong) + git merge --ff-only origin/master Already up-to-date. + date '+%Y-%m-%d %T.%3N' 2016-10-12 17:07:11.083 + patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch error: a/itests/hive-minikdc/src/test/java/org/apache/hive/minikdc/JdbcWithMiniKdcSQLAuthTest.java: No such file or directory error: a/itests/hive-unit/src/main/java/org/apache/hive/jdbc/miniHS2/MiniHS2.java: No such file or directory error: a/itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithMiniHS2.java: No such file or directory The patch does not appear to apply with p0, p1, or p2 + exit 1 ' {noformat} This message is automatically generated. 
ATTACHMENT ID: 12832823 - PreCommit-HIVE-Build > Fix TestJdbcWithMiniHS2 runtime > --- > > Key: HIVE-14721 > URL: https://issues.apache.org/jira/browse/HIVE-14721 > Project: Hive > Issue Type: Sub-task >Affects Versions: 2.1.0 >Reporter: Vaibhav Gumashta >Assignee: Vaibhav Gumashta > Fix For: 2.2.0 > > Attachments: HIVE-14721.1.patch, HIVE-14721.2.patch, > HIVE-14721.3.patch, HIVE-14721.3.patch, HIVE-14721.3.patch, > HIVE-14721.4.patch, HIVE-14721.4.patch, HIVE-14721.5.patch, > HIVE-14721.6.patch, HIVE-14721.6.patch, HIVE-14721.6.patch, HIVE-14721.7.patch > > > Currently 450s -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14761) Remove TestJdbcWithMiniMr after merging tests with TestJdbcWithMiniHS2
[ https://issues.apache.org/jira/browse/HIVE-14761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-14761: Committed to master. Thanks [~sseth]. > Remove TestJdbcWithMiniMr after merging tests with TestJdbcWithMiniHS2 > -- > > Key: HIVE-14761 > URL: https://issues.apache.org/jira/browse/HIVE-14761 > Project: Hive > Issue Type: Sub-task >Reporter: Vaibhav Gumashta >Assignee: Vaibhav Gumashta > Fix For: 2.2.0 > > Attachments: HIVE-14761.1.patch > > > Currently 2 min 30 sec -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14761) Remove TestJdbcWithMiniMr after merging tests with TestJdbcWithMiniHS2
[ https://issues.apache.org/jira/browse/HIVE-14761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-14761: Affects Version/s: 2.1.0 > Remove TestJdbcWithMiniMr after merging tests with TestJdbcWithMiniHS2 > -- > > Key: HIVE-14761 > URL: https://issues.apache.org/jira/browse/HIVE-14761 > Project: Hive > Issue Type: Sub-task >Affects Versions: 2.1.0 >Reporter: Vaibhav Gumashta >Assignee: Vaibhav Gumashta > Fix For: 2.2.0 > > Attachments: HIVE-14761.1.patch > > > Currently 2 min 30 sec -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12458) remove identity_udf.jar from source
[ https://issues.apache.org/jira/browse/HIVE-12458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-12458: Attachment: HIVE-12458.1.patch > remove identity_udf.jar from source > --- > > Key: HIVE-12458 > URL: https://issues.apache.org/jira/browse/HIVE-12458 > Project: Hive > Issue Type: Bug > Components: Test >Affects Versions: 2.1.0 >Reporter: Thejas M Nair >Assignee: Vaibhav Gumashta > Attachments: HIVE-12458.1.patch > > > We should not be checking in jars into the source repo. > We could use hive-contrib jar like its used in > ./ql/src/test/queries/clientpositive/add_jar_pfile.q > add jar > pfile://${system:test.tmp.dir}/hive-contrib-${system:hive.version}.jar; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12458) remove identity_udf.jar from source
[ https://issues.apache.org/jira/browse/HIVE-12458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569295#comment-15569295 ] Vaibhav Gumashta commented on HIVE-12458: - [~thejas] I've removed the code that used this jar (in tests) as part of the work on improving test cases. Can you review this? > remove identity_udf.jar from source > --- > > Key: HIVE-12458 > URL: https://issues.apache.org/jira/browse/HIVE-12458 > Project: Hive > Issue Type: Bug > Components: Test >Affects Versions: 2.1.0 >Reporter: Thejas M Nair >Assignee: Vaibhav Gumashta > Attachments: HIVE-12458.1.patch > > > We should not be checking in jars into the source repo. > We could use hive-contrib jar like its used in > ./ql/src/test/queries/clientpositive/add_jar_pfile.q > add jar > pfile://${system:test.tmp.dir}/hive-contrib-${system:hive.version}.jar; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12458) remove identity_udf.jar from source
[ https://issues.apache.org/jira/browse/HIVE-12458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569312#comment-15569312 ] Thejas M Nair commented on HIVE-12458: -- +1 > remove identity_udf.jar from source > --- > > Key: HIVE-12458 > URL: https://issues.apache.org/jira/browse/HIVE-12458 > Project: Hive > Issue Type: Bug > Components: Test >Affects Versions: 2.1.0 >Reporter: Thejas M Nair >Assignee: Vaibhav Gumashta > Attachments: HIVE-12458.1.patch > > > We should not be checking in jars into the source repo. > We could use hive-contrib jar like its used in > ./ql/src/test/queries/clientpositive/add_jar_pfile.q > add jar > pfile://${system:test.tmp.dir}/hive-contrib-${system:hive.version}.jar; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14913) Add new unit tests
[ https://issues.apache.org/jira/browse/HIVE-14913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569322#comment-15569322 ] Vineet Garg commented on HIVE-14913: RB Link: https://reviews.apache.org/r/52708/ > Add new unit tests > -- > > Key: HIVE-14913 > URL: https://issues.apache.org/jira/browse/HIVE-14913 > Project: Hive > Issue Type: Task > Components: Tests >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-14913.1.patch, HIVE-14913.2.patch, > HIVE-14913.3.patch, HIVE-14913.4.patch > > > Moving bunch of tests from system test to hive unit tests to reduce testing > overhead -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14933) include argparse with LLAP scripts to support antique Python versions
[ https://issues.apache.org/jira/browse/HIVE-14933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-14933: Status: Patch Available (was: Open) > include argparse with LLAP scripts to support antique Python versions > - > > Key: HIVE-14933 > URL: https://issues.apache.org/jira/browse/HIVE-14933 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-14933.patch > > > The module is a standalone file, and it's under Python license that is > compatible with Apache. In the long term we should probably just move > LlapServiceDriver code entirely to Java, as right now it's a combination of > part-py, part-java. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14803) S3: Stats gathering for insert queries can be expensive for partitioned dataset
[ https://issues.apache.org/jira/browse/HIVE-14803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569355#comment-15569355 ] Pengcheng Xiong commented on HIVE-14803: Thanks [~sseth] for digging this out. [~rajesh.balamohan], it seems that we really have a problem in this patch. It looks like the stats are missing. In the explain plan, if the row count of the src table is 29 rather than 500, that usually means stats are missing. Could you take another look and upload a new patch? And, there is also a problem with the thread pool. People may set mv.files.thread=0; in that case, the thread pool will be null. Thanks. > S3: Stats gathering for insert queries can be expensive for partitioned > dataset > --- > > Key: HIVE-14803 > URL: https://issues.apache.org/jira/browse/HIVE-14803 > Project: Hive > Issue Type: Improvement > Components: Metastore >Affects Versions: 2.1.0 >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan >Priority: Minor > Attachments: HIVE-14803.1.patch > > > StatsTask's aggregateStats populates stats details for all partitions by > checking the file sizes which turns out to be expensive when larger number of > partitions are inserted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
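The null-thread-pool hazard mentioned above can be guarded as sketched below. The names here are illustrative, not Hive's actual code: when the configured thread count is 0 there is no pool, so callers must fall back to running the work inline instead of dereferencing null:

```java
// Illustrative guard for a configurable pool size (e.g. mv.files.thread=0):
// a size of 0 means "no pool", and the caller runs tasks sequentially.
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class StatsGather {
    static ExecutorService poolFor(int threads) {
        return threads > 0 ? Executors.newFixedThreadPool(threads) : null;
    }

    static void runAll(List<Runnable> tasks, ExecutorService pool) {
        if (pool == null) {                      // mv.files.thread=0: run inline
            for (Runnable t : tasks) t.run();
            return;
        }
        for (Runnable t : tasks) pool.submit(t); // parallel path
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```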
[jira] [Resolved] (HIVE-12458) remove identity_udf.jar from source
[ https://issues.apache.org/jira/browse/HIVE-12458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta resolved HIVE-12458. - Resolution: Fixed Committed. Thanks [~thejas] > remove identity_udf.jar from source > --- > > Key: HIVE-12458 > URL: https://issues.apache.org/jira/browse/HIVE-12458 > Project: Hive > Issue Type: Bug > Components: Test >Affects Versions: 2.1.0 >Reporter: Thejas M Nair >Assignee: Vaibhav Gumashta > Attachments: HIVE-12458.1.patch > > > We should not be checking in jars into the source repo. > We could use hive-contrib jar like its used in > ./ql/src/test/queries/clientpositive/add_jar_pfile.q > add jar > pfile://${system:test.tmp.dir}/hive-contrib-${system:hive.version}.jar; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12458) remove identity_udf.jar from source
[ https://issues.apache.org/jira/browse/HIVE-12458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-12458: Fix Version/s: 2.2.0 > remove identity_udf.jar from source > --- > > Key: HIVE-12458 > URL: https://issues.apache.org/jira/browse/HIVE-12458 > Project: Hive > Issue Type: Bug > Components: Test >Affects Versions: 2.1.0 >Reporter: Thejas M Nair >Assignee: Vaibhav Gumashta > Fix For: 2.2.0 > > Attachments: HIVE-12458.1.patch > > > We should not be checking in jars into the source repo. > We could use hive-contrib jar like its used in > ./ql/src/test/queries/clientpositive/add_jar_pfile.q > add jar > pfile://${system:test.tmp.dir}/hive-contrib-${system:hive.version}.jar; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14373) Add integration tests for hive on S3
[ https://issues.apache.org/jira/browse/HIVE-14373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569399#comment-15569399 ] Thomas Poepping commented on HIVE-14373: [~spena] I responded to your comments on RB. I would like to open a separate JIRA after the submission of this one that will change the qtests to run on Tez by default, rather than running on MR. What do you think? > Add integration tests for hive on S3 > > > Key: HIVE-14373 > URL: https://issues.apache.org/jira/browse/HIVE-14373 > Project: Hive > Issue Type: Sub-task >Reporter: Sergio Peña >Assignee: Thomas Poepping > Attachments: HIVE-14373.02.patch, HIVE-14373.03.patch, > HIVE-14373.04.patch, HIVE-14373.05.patch, HIVE-14373.patch > > > With Hive doing improvements to run on S3, it would be ideal to have better > integration testing on S3. > These S3 tests won't be able to be executed by HiveQA because it will need > Amazon credentials. We need to write suite based on ideas from the Hadoop > project where: > - an xml file is provided with S3 credentials > - a committer must run these tests manually to verify it works > - the xml file should not be part of the commit, and hiveqa should not run > these tests. > https://wiki.apache.org/hadoop/HowToContribute#Submitting_patches_against_object_stores_such_as_Amazon_S3.2C_OpenStack_Swift_and_Microsoft_Azure -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14373) Add integration tests for hive on S3
[ https://issues.apache.org/jira/browse/HIVE-14373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Poepping updated HIVE-14373: --- Status: Open (was: Patch Available) > Add integration tests for hive on S3 > > > Key: HIVE-14373 > URL: https://issues.apache.org/jira/browse/HIVE-14373 > Project: Hive > Issue Type: Sub-task >Reporter: Sergio Peña >Assignee: Thomas Poepping > Attachments: HIVE-14373.02.patch, HIVE-14373.03.patch, > HIVE-14373.04.patch, HIVE-14373.05.patch, HIVE-14373.06.patch, > HIVE-14373.patch > > > With Hive doing improvements to run on S3, it would be ideal to have better > integration testing on S3. > These S3 tests won't be able to be executed by HiveQA because it will need > Amazon credentials. We need to write suite based on ideas from the Hadoop > project where: > - an xml file is provided with S3 credentials > - a committer must run these tests manually to verify it works > - the xml file should not be part of the commit, and hiveqa should not run > these tests. > https://wiki.apache.org/hadoop/HowToContribute#Submitting_patches_against_object_stores_such_as_Amazon_S3.2C_OpenStack_Swift_and_Microsoft_Azure -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14373) Add integration tests for hive on S3
[ https://issues.apache.org/jira/browse/HIVE-14373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Poepping updated HIVE-14373: --- Status: Patch Available (was: Open) > Add integration tests for hive on S3 > > > Key: HIVE-14373 > URL: https://issues.apache.org/jira/browse/HIVE-14373 > Project: Hive > Issue Type: Sub-task >Reporter: Sergio Peña >Assignee: Thomas Poepping > Attachments: HIVE-14373.02.patch, HIVE-14373.03.patch, > HIVE-14373.04.patch, HIVE-14373.05.patch, HIVE-14373.06.patch, > HIVE-14373.patch > > > With Hive doing improvements to run on S3, it would be ideal to have better > integration testing on S3. > These S3 tests won't be able to be executed by HiveQA because it will need > Amazon credentials. We need to write suite based on ideas from the Hadoop > project where: > - an xml file is provided with S3 credentials > - a committer must run these tests manually to verify it works > - the xml file should not be part of the commit, and hiveqa should not run > these tests. > https://wiki.apache.org/hadoop/HowToContribute#Submitting_patches_against_object_stores_such_as_Amazon_S3.2C_OpenStack_Swift_and_Microsoft_Azure -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14373) Add integration tests for hive on S3
[ https://issues.apache.org/jira/browse/HIVE-14373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Poepping updated HIVE-14373: --- Attachment: HIVE-14373.06.patch Attach new patch, addressed comments from RB > Add integration tests for hive on S3 > > > Key: HIVE-14373 > URL: https://issues.apache.org/jira/browse/HIVE-14373 > Project: Hive > Issue Type: Sub-task >Reporter: Sergio Peña >Assignee: Thomas Poepping > Attachments: HIVE-14373.02.patch, HIVE-14373.03.patch, > HIVE-14373.04.patch, HIVE-14373.05.patch, HIVE-14373.06.patch, > HIVE-14373.patch > > > With Hive doing improvements to run on S3, it would be ideal to have better > integration testing on S3. > These S3 tests won't be able to be executed by HiveQA because it will need > Amazon credentials. We need to write suite based on ideas from the Hadoop > project where: > - an xml file is provided with S3 credentials > - a committer must run these tests manually to verify it works > - the xml file should not be part of the commit, and hiveqa should not run > these tests. > https://wiki.apache.org/hadoop/HowToContribute#Submitting_patches_against_object_stores_such_as_Amazon_S3.2C_OpenStack_Swift_and_Microsoft_Azure -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14835) Improve ptest2 build time
[ https://issues.apache.org/jira/browse/HIVE-14835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569418#comment-15569418 ] Prasanth Jayachandran commented on HIVE-14835: -- No. This patch is breaking ptest. Will apply it again when the queue is close to empty and will debug it further. > Improve ptest2 build time > - > > Key: HIVE-14835 > URL: https://issues.apache.org/jira/browse/HIVE-14835 > Project: Hive > Issue Type: Sub-task > Components: Testing Infrastructure >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Fix For: 2.2.0 > > Attachments: HIVE-14835.1.patch > > > NO PRECOMMIT TESTS > Two things can be improved: > 1) ptest2 always downloads jars for compiling its own directory, which takes > about 1m30s but should take only ~5s with cached jars. The reason for that is > maven.repo.local is pointing to a path under WORKSPACE, which will be cleaned > by Jenkins for every run. > 2) For the Hive build we can make use of a parallel build and quiet the build > output, which should shave off another 15-30s. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
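Both improvements are plain Maven build configuration. A sketch of the kind of invocation involved (the repository path is illustrative; the flags are standard Maven options):

```shell
# 1) Keep the local Maven repo outside the Jenkins-cleaned WORKSPACE so the
#    cached jars survive between runs (path is illustrative).
# 2) Build in parallel (-T 1C = one thread per CPU core) and quietly (-q).
mvn -q -T 1C clean install -DskipTests \
    -Dmaven.repo.local="$HOME/.m2/repository"
```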
[jira] [Commented] (HIVE-14929) Adding JDBC test for query cancellation scenario
[ https://issues.apache.org/jira/browse/HIVE-14929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569426#comment-15569426 ] Hive QA commented on HIVE-14929: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12832746/HIVE-14929.2.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:green}SUCCESS:{color} +1 due to 10640 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1506/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1506/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1506/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12832746 - PreCommit-HIVE-Build > Adding JDBC test for query cancellation scenario > > > Key: HIVE-14929 > URL: https://issues.apache.org/jira/browse/HIVE-14929 > Project: Hive > Issue Type: Test >Reporter: Deepak Jaiswal >Assignee: Deepak Jaiswal > Attachments: HIVE-14929.1.patch, HIVE-14929.2.patch > > > There is some functional testing for query cancellation using JDBC which is > missing in unit tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14929) Adding JDBC test for query cancellation scenario
[ https://issues.apache.org/jira/browse/HIVE-14929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569432#comment-15569432 ] Vaibhav Gumashta commented on HIVE-14929: - [~djaiswal] Can you submit again for QA run? There were some changes that went in {{TestJdbcDriver2}} yesterday, which brought down the running time to ~60-70s. Want to be sure the new tests don't affect that in a major way. I'll also take a look at the patch shortly. > Adding JDBC test for query cancellation scenario > > > Key: HIVE-14929 > URL: https://issues.apache.org/jira/browse/HIVE-14929 > Project: Hive > Issue Type: Test >Reporter: Deepak Jaiswal >Assignee: Deepak Jaiswal > Attachments: HIVE-14929.1.patch, HIVE-14929.2.patch > > > There is some functional testing for query cancellation using JDBC which is > missing in unit tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14929) Adding JDBC test for query cancellation scenario
[ https://issues.apache.org/jira/browse/HIVE-14929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569439#comment-15569439 ] Vaibhav Gumashta commented on HIVE-14929: - [~djaiswal] Nevermind, looks like the patch just had a fresh QA run. Please ignore my comment about rerunning. I'll take a look at the patch shortly. > Adding JDBC test for query cancellation scenario > > > Key: HIVE-14929 > URL: https://issues.apache.org/jira/browse/HIVE-14929 > Project: Hive > Issue Type: Test >Reporter: Deepak Jaiswal >Assignee: Deepak Jaiswal > Attachments: HIVE-14929.1.patch, HIVE-14929.2.patch > > > There is some functional testing for query cancellation using JDBC which is > missing in unit tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14929) Adding JDBC test for query cancellation scenario
[ https://issues.apache.org/jira/browse/HIVE-14929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569441#comment-15569441 ] Deepak Jaiswal commented on HIVE-14929: --- Sure. I will refresh my code and try that. > Adding JDBC test for query cancellation scenario > > > Key: HIVE-14929 > URL: https://issues.apache.org/jira/browse/HIVE-14929 > Project: Hive > Issue Type: Test >Reporter: Deepak Jaiswal >Assignee: Deepak Jaiswal > Attachments: HIVE-14929.1.patch, HIVE-14929.2.patch > > > There is some functional testing for query cancellation using JDBC which is > missing in unit tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14913) Add new unit tests
[ https://issues.apache.org/jira/browse/HIVE-14913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-14913: --- Status: Open (was: Patch Available) > Add new unit tests > -- > > Key: HIVE-14913 > URL: https://issues.apache.org/jira/browse/HIVE-14913 > Project: Hive > Issue Type: Task > Components: Tests >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-14913.1.patch, HIVE-14913.2.patch, > HIVE-14913.3.patch, HIVE-14913.4.patch > > > Moving bunch of tests from system test to hive unit tests to reduce testing > overhead -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14373) Add integration tests for hive on S3
[ https://issues.apache.org/jira/browse/HIVE-14373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569450#comment-15569450 ] Thomas Poepping commented on HIVE-14373: Have two +1s on RB. Awaiting precommit tests, then patch should be good to go > Add integration tests for hive on S3 > > > Key: HIVE-14373 > URL: https://issues.apache.org/jira/browse/HIVE-14373 > Project: Hive > Issue Type: Sub-task >Reporter: Sergio Peña >Assignee: Thomas Poepping > Attachments: HIVE-14373.02.patch, HIVE-14373.03.patch, > HIVE-14373.04.patch, HIVE-14373.05.patch, HIVE-14373.06.patch, > HIVE-14373.patch > > > With Hive doing improvements to run on S3, it would be ideal to have better > integration testing on S3. > These S3 tests won't be able to be executed by HiveQA because it will need > Amazon credentials. We need to write suite based on ideas from the Hadoop > project where: > - an xml file is provided with S3 credentials > - a committer must run these tests manually to verify it works > - the xml file should not be part of the commit, and hiveqa should not run > these tests. > https://wiki.apache.org/hadoop/HowToContribute#Submitting_patches_against_object_stores_such_as_Amazon_S3.2C_OpenStack_Swift_and_Microsoft_Azure -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14913) Add new unit tests
[ https://issues.apache.org/jira/browse/HIVE-14913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-14913: --- Status: Patch Available (was: Open) > Add new unit tests > -- > > Key: HIVE-14913 > URL: https://issues.apache.org/jira/browse/HIVE-14913 > Project: Hive > Issue Type: Task > Components: Tests >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-14913.1.patch, HIVE-14913.2.patch, > HIVE-14913.3.patch, HIVE-14913.4.patch > > > Moving bunch of tests from system test to hive unit tests to reduce testing > overhead -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14929) Adding JDBC test for query cancellation scenario
[ https://issues.apache.org/jira/browse/HIVE-14929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569473#comment-15569473 ] Vaibhav Gumashta commented on HIVE-14929: - Patch looks good. I just saw the latest test report and it doesn't add any overhead. +1 from my side. > Adding JDBC test for query cancellation scenario > > > Key: HIVE-14929 > URL: https://issues.apache.org/jira/browse/HIVE-14929 > Project: Hive > Issue Type: Test >Reporter: Deepak Jaiswal >Assignee: Deepak Jaiswal > Attachments: HIVE-14929.1.patch, HIVE-14929.2.patch > > > There is some functional testing for query cancellation using JDBC which is > missing in unit tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14929) Adding JDBC test for query cancellation scenario
[ https://issues.apache.org/jira/browse/HIVE-14929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569478#comment-15569478 ] Deepak Jaiswal commented on HIVE-14929: --- Thanks Vaibhav. > Adding JDBC test for query cancellation scenario > > > Key: HIVE-14929 > URL: https://issues.apache.org/jira/browse/HIVE-14929 > Project: Hive > Issue Type: Test >Reporter: Deepak Jaiswal >Assignee: Deepak Jaiswal > Attachments: HIVE-14929.1.patch, HIVE-14929.2.patch > > > There is some functional testing for query cancellation using JDBC which is > missing in unit tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14906) HMS should support an API to get consistent atomic snapshot associated with a Notification ID.
[ https://issues.apache.org/jira/browse/HIVE-14906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569504#comment-15569504 ] Sravya Tirukkovalur commented on HIVE-14906: Seems like if we do the following, we should be able to support an atomic getSnapshot() API: - Set the transaction level to "repeatable-read", so that all reads within a transaction would be from a single generation point. In other words, concurrent writes would not affect the state of the read. - Make all the reads of the snapshot-building function part of the same transaction. > HMS should support an API to get consistent atomic snapshot associated with a > Notification ID. > -- > > Key: HIVE-14906 > URL: https://issues.apache.org/jira/browse/HIVE-14906 > Project: Hive > Issue Type: Improvement >Reporter: Sravya Tirukkovalur > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
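The property being relied on here is that, once a transaction is reading from a single generation point, a concurrent writer cannot change what the reader sees. A minimal, self-contained illustration of that behavior, using SQLite in WAL mode as a stand-in for the metastore database (WAL read transactions get a stable snapshot, which is even stronger than repeatable read; table names are illustrative):

```python
import os
import sqlite3
import tempfile

# Stand-in for the metastore DB: SQLite in WAL mode pins a snapshot at the
# first read of a transaction, so a concurrent writer cannot change what the
# snapshot-building reads observe.
path = os.path.join(tempfile.mkdtemp(), "metastore.db")

setup = sqlite3.connect(path)
setup.execute("PRAGMA journal_mode=WAL")
setup.execute("CREATE TABLE notification_log (event_id INTEGER)")
setup.execute("INSERT INTO notification_log VALUES (1)")
setup.commit()

# Reader opens a transaction; its first SELECT pins the snapshot.
reader = sqlite3.connect(path)
reader.execute("BEGIN")
before = reader.execute("SELECT COUNT(*) FROM notification_log").fetchone()[0]

# A concurrent writer commits a new notification meanwhile.
writer = sqlite3.connect(path)
writer.execute("INSERT INTO notification_log VALUES (2)")
writer.commit()

# Inside the same read transaction the writer's commit is invisible...
during = reader.execute("SELECT COUNT(*) FROM notification_log").fetchone()[0]
reader.commit()

# ...but a fresh read sees it.
after = reader.execute("SELECT COUNT(*) FROM notification_log").fetchone()[0]
print(before, during, after)  # 1 1 2
```

All reads issued between the `BEGIN` and the final `commit()` see the same generation point, which is exactly what the proposed getSnapshot() API needs.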
[jira] [Commented] (HIVE-13966) DbNotificationListener: can loose DDL operation notifications
[ https://issues.apache.org/jira/browse/HIVE-13966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569597#comment-15569597 ] Alan Gates commented on HIVE-13966: --- Assigned back to Rahul as I didn't intend to take over the JIRA, I just had to assign it to myself to upload a patch. > DbNotificationListener: can loose DDL operation notifications > - > > Key: HIVE-13966 > URL: https://issues.apache.org/jira/browse/HIVE-13966 > Project: Hive > Issue Type: Bug > Components: HCatalog >Reporter: Nachiket Vaidya >Assignee: Rahul Sharma >Priority: Critical > Attachments: HIVE-13966.1.patch, HIVE-13966.2.patch, > HIVE-13966.3.patch, HIVE-13966.pdf > > > The code for each API in HiveMetaStore.java is like this: > 1. openTransaction() > 2. -- operation-- > 3. commit() or rollback() based on result of the operation. > 4. add entry to notification log (unconditionally) > If the operation failed (in step 2), we still add an entry to the notification > log. Found this issue in testing. > It is still OK, as this is the false-positive case. > If the operation is successful but adding to the notification log failed, the > user will get a MetaException. It will not roll back the operation, as it is > already committed. We need to handle this case so that we will not have false > negatives. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
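The fix pattern implied by the four steps above is to move the notification-log write inside the same transaction as the operation, so the two commit or roll back together. A minimal sketch of that pattern (SQLite as a stand-in for the metastore database; table names are illustrative, not the real metastore schema):

```python
import sqlite3

# Toy metastore: the DDL target table and the notification log live in the
# same database, so one transaction can cover both writes.
conn = sqlite3.connect(":memory:")
conn.isolation_level = None  # autocommit mode; we issue BEGIN/COMMIT ourselves
conn.execute("CREATE TABLE tbls (name TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE notification_log (event TEXT)")

def create_table_with_notification(name):
    cur = conn.cursor()
    cur.execute("BEGIN")
    try:
        cur.execute("INSERT INTO tbls VALUES (?)", (name,))      # step 2: the operation
        cur.execute("INSERT INTO notification_log VALUES (?)",   # step 4, now inside
                    ("CREATE_TABLE " + name,))                   # the same transaction
        cur.execute("COMMIT")
    except Exception:
        cur.execute("ROLLBACK")  # operation and notification vanish together
        raise

create_table_with_notification("t1")
try:
    create_table_with_notification("t1")  # duplicate name: whole txn rolls back
except sqlite3.IntegrityError:
    pass
events = conn.execute("SELECT COUNT(*) FROM notification_log").fetchone()[0]
tables = conn.execute("SELECT COUNT(*) FROM tbls").fetchone()[0]
```

Because the notification insert shares the operation's transaction, a failed operation can no longer leave a stray log entry (no false positives), and a failed log write rolls the operation back too (no false negatives).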
[jira] [Updated] (HIVE-13966) DbNotificationListener: can loose DDL operation notifications
[ https://issues.apache.org/jira/browse/HIVE-13966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-13966: -- Assignee: Rahul Sharma (was: Alan Gates) > DbNotificationListener: can loose DDL operation notifications > - > > Key: HIVE-13966 > URL: https://issues.apache.org/jira/browse/HIVE-13966 > Project: Hive > Issue Type: Bug > Components: HCatalog >Reporter: Nachiket Vaidya >Assignee: Rahul Sharma >Priority: Critical > Attachments: HIVE-13966.1.patch, HIVE-13966.2.patch, > HIVE-13966.3.patch, HIVE-13966.pdf > > > The code for each API in HiveMetaStore.java is like this: > 1. openTransaction() > 2. -- operation-- > 3. commit() or rollback() based on result of the operation. > 4. add entry to notification log (unconditionally) > If the operation is failed (in step 2), we still add entry to notification > log. Found this issue in testing. > It is still ok as this is the case of false positive. > If the operation is successful and adding to notification log failed, the > user will get an MetaException. It will not rollback the operation, as it is > already committed. We need to handle this case so that we will not have false > negatives. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14822) Add support for credential provider for jobs launched from Hiveserver2
[ https://issues.apache.org/jira/browse/HIVE-14822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vihang Karajgaonkar updated HIVE-14822: --- Attachment: HIVE-14822.05.patch Updating the patch with the changes suggested. > Add support for credential provider for jobs launched from Hiveserver2 > -- > > Key: HIVE-14822 > URL: https://issues.apache.org/jira/browse/HIVE-14822 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar > Attachments: HIVE-14822.01.patch, HIVE-14822.02.patch, > HIVE-14822.03.patch, HIVE-14822.05.patch > > > When using encrypted passwords via the Hadoop Credential Provider, > HiveServer2 currently does not correctly forward enough information to the > job configuration for jobs to read those secrets. If your job needs to access > any secrets, like S3 credentials, then there's no convenient and secure way > to configure this today. > You could specify the decryption key in files like mapred-site.xml that > HiveServer2 uses, but this would place the encryption password on local disk > in plaintext, which can be a security concern. > To solve this problem, HiveServer2 should modify job configuration to include > the environment variable settings needed to decrypt the passwords. > Specifically, it will need to modify: > * For MR2 jobs: > ** yarn.app.mapreduce.am.admin.user.env > ** mapreduce.admin.user.env > * For Spark jobs: > ** spark.yarn.appMasterEnv.HADOOP_CREDSTORE_PASSWORD > ** spark.executorEnv.HADOOP_CREDSTORE_PASSWORD > HiveServer2 can get the decryption password from its own environment, the > same way it does for its own credential provider store today. > Additionally, it can be desirable for HiveServer2 to have a separate > encrypted password file than what is used by the job. HiveServer2 may have > secrets that the job should not have, such as the metastore database password > or the password to decrypt its private SSL certificate. 
It is also a best > practice to keep separate passwords in separate files. To facilitate this, > Hive will also accept: > * A configuration for a path to a credential store to use for jobs. This > should already be uploaded in HDFS. (hive.server2.job.keystore.location or a > better name) If this is not specified, then HS2 will simply use the value of > hadoop.security.credential.provider.path. > * An environment variable for the password to decrypt the credential store > (HIVE_JOB_KEYSTORE_PASSWORD or better). If this is not specified, then HS2 > will simply use the standard environment variable for decrypting the Hadoop > Credential Provider. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
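Put together, the configuration entries named in the description would land in the submitted job's configuration roughly as follows (a sketch, not the actual patch: the placeholder value stands for a password HiveServer2 reads from its own environment at submit time, so nothing is written to local disk in plaintext):

```xml
<!-- Illustrative job-configuration entries injected by HiveServer2. -->
<property>
  <name>yarn.app.mapreduce.am.admin.user.env</name>
  <value>HADOOP_CREDSTORE_PASSWORD=PASSWORD_FROM_HS2_ENV</value>
</property>
<property>
  <name>mapreduce.admin.user.env</name>
  <value>HADOOP_CREDSTORE_PASSWORD=PASSWORD_FROM_HS2_ENV</value>
</property>
<property>
  <name>spark.yarn.appMasterEnv.HADOOP_CREDSTORE_PASSWORD</name>
  <value>PASSWORD_FROM_HS2_ENV</value>
</property>
<property>
  <name>spark.executorEnv.HADOOP_CREDSTORE_PASSWORD</name>
  <value>PASSWORD_FROM_HS2_ENV</value>
</property>
```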
[jira] [Commented] (HIVE-11394) Enhance EXPLAIN display for vectorization
[ https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569628#comment-15569628 ] Hive QA commented on HIVE-11394: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12832883/HIVE-11394.091.patch {color:green}SUCCESS:{color} +1 due to 162 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 81 failed/errored test(s), 10601 tests executed *Failed tests:* {noformat} TestMiniLlapLocalCliDriver-orc_llap.q-delete_where_non_partitioned.q-vector_groupby_mapjoin.q-and-27-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[reloadJar] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_adaptor_usage_mode] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_aggregate_9] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_between_in] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_binary_join_groupby] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_cast_constant] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_char_2] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_coalesce_2] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_complex_all] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_count] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_count_distinct] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_aggregate] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_precision] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_distinct_2] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_empty_where] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_groupby_3] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_groupby_reduce] 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_include_no_sel] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_orderby_5] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_partition_diff_num_cols] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_reduce_groupby_decimal] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_string_concat] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_when_case_null] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_0] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_13] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_limit] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_date_funcs] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_mapjoin2] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_mapjoin] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_timestamp] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_timestamp_funcs] org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver[hbase_bulk] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_aggregate_9] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_auto_smb_mapjoin_14] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_binary_join_groupby] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_coalesce_2] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_complex_all] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_count] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_count_distinct] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_aggregate] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_precision] 
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_udf] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_distinct_2] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_groupby4] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_groupby6] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_groupby_3] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_groupby_reduce] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_grouping_sets] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_include_no_sel] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_mapjoin_reduce] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_number_compare_projection] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_orderby_5] org.apache.hadoop.hive.cli.TestMiniLlapLocalC
[jira] [Updated] (HIVE-14925) MSCK repair table hang while running with multi threading enabled
[ https://issues.apache.org/jira/browse/HIVE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ratheesh Kamoor updated HIVE-14925: --- Fix Version/s: 2.2.0 Release Note: Issue: MSCK is failing in multithreaded execution. Solution: Moved the path-processing logic to an external class, which avoids code duplication and is used in both multi-threaded and single-threaded execution. Status: Patch Available (was: Open) > MSCK repair table hang while running with multi threading enabled > - > > Key: HIVE-14925 > URL: https://issues.apache.org/jira/browse/HIVE-14925 > Project: Hive > Issue Type: Bug > Components: CLI >Affects Versions: 2.2.0 >Reporter: Ratheesh Kamoor >Assignee: Pengcheng Xiong >Priority: Critical > Fix For: 2.2.0 > > Attachments: HIVE-14925.patch > > > MSCK REPAIR TABLE hangs while running with multi-threading enabled > (the default). I think it is because of a major design flaw in how the thread pool is > implemented in the HiveMetaStoreChecker class / checkPartitionDirs method. This > method has a thread pool which registers a Callable, but the Callable makes a > recursive call to the checkPartitionDirs method again. This code will hang when the > number of directories is greater than the thread pool size. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14925) MSCK repair table hang while running with multi threading enabled
[ https://issues.apache.org/jira/browse/HIVE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ratheesh Kamoor updated HIVE-14925: --- Attachment: HIVE-14925.patch > MSCK repair table hang while running with multi threading enabled > - > > Key: HIVE-14925 > URL: https://issues.apache.org/jira/browse/HIVE-14925 > Project: Hive > Issue Type: Bug > Components: CLI >Affects Versions: 2.2.0 >Reporter: Ratheesh Kamoor >Assignee: Pengcheng Xiong >Priority: Critical > Fix For: 2.2.0 > > Attachments: HIVE-14925.patch > > > MSCK REPAIR TABLE hangs while running with multi-threading enabled > (the default). I think it is because of a major design flaw in how the thread pool is > implemented in the HiveMetaStoreChecker class / checkPartitionDirs method. This > method has a thread pool which registers a Callable, but the Callable makes a > recursive call to the checkPartitionDirs method again. This code will hang when the > number of directories is greater than the thread pool size. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14925) MSCK repair table hang while running with multi threading enabled
[ https://issues.apache.org/jira/browse/HIVE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569746#comment-15569746 ] Ratheesh Kamoor commented on HIVE-14925: [~pxiong] I moved the logic from the inline Callable to an external class so that the code can be reused in both the multi-threaded and the single-threaded scenario. It also fixes the deadlock issue. Could you please review? Tested with the very large partition counts we have (5K+) and it worked fine. > MSCK repair table hang while running with multi threading enabled > - > > Key: HIVE-14925 > URL: https://issues.apache.org/jira/browse/HIVE-14925 > Project: Hive > Issue Type: Bug > Components: CLI >Affects Versions: 2.2.0 >Reporter: Ratheesh Kamoor >Assignee: Pengcheng Xiong >Priority: Critical > Fix For: 2.2.0 > > Attachments: HIVE-14925.patch > > > MSCK REPAIR TABLE hangs while running with multi-threading enabled > (the default). I think it is because of a major design flaw in how the thread pool is > implemented in the HiveMetaStoreChecker class / checkPartitionDirs method. This > method has a thread pool which registers a Callable, but the Callable makes a > recursive call to the checkPartitionDirs method again. This code will hang when the > number of directories is greater than the thread pool size. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
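The deadlock described above is the classic pitfall of a task submitting a subtask to the same fixed-size pool and blocking on it: once every worker is waiting, no worker is free to run the subtasks. A small language-agnostic sketch of the deadlock-free alternative (this is an illustration of the pattern, not Hive's actual HiveMetaStoreChecker code; `tree` is a toy stand-in for filesystem listing):

```python
from concurrent.futures import ThreadPoolExecutor

def scan_partition_dirs(tree, root, pool_size=2):
    """Level-by-level scan: each task lists exactly one directory and never
    submits or waits on other tasks, so a fixed-size pool always makes
    progress regardless of how many directories there are."""
    found = []
    frontier = [root]
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        while frontier:
            # Workers do the (potentially slow) listing; the caller, not the
            # workers, drives the recursion -- no nested submissions.
            listings = pool.map(lambda d: (d, tree.get(d, [])), frontier)
            frontier = []
            for d, children in listings:
                found.append(d)
                frontier.extend(children)
    return found

# Toy directory tree with more directories than pool workers.
tree = {
    "warehouse": ["p=1", "p=2", "p=3"],
    "p=1": ["p=1/q=1", "p=1/q=2"],
    "p=2": ["p=2/q=1"],
}
dirs = scan_partition_dirs(tree, "warehouse")
```

With the recursive-submission version, a pool of 2 workers would hang as soon as 2 directories each tried to wait on their children; here the same pool of 2 finishes all 7 directories.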
[jira] [Commented] (HIVE-14921) Move slow CliDriver tests to MiniLlap - part 2
[ https://issues.apache.org/jira/browse/HIVE-14921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569754#comment-15569754 ] Hive QA commented on HIVE-14921: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12832840/HIVE-14921.1.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10601 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[reloadJar] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[alter_table_invalidate_column_stats] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnstats_part_coltype] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[newline] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[orc_merge10] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats] org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver.org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1508/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1508/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1508/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 7 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12832840 - PreCommit-HIVE-Build > Move slow CliDriver tests to MiniLlap - part 2 > -- > > Key: HIVE-14921 > URL: https://issues.apache.org/jira/browse/HIVE-14921 > Project: Hive > Issue Type: Sub-task > Components: Tests >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-14921.1.patch, HIVE-14921.1.patch > > > Continuation to HIVE-14877 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13316) Upgrade to Calcite 1.10
[ https://issues.apache.org/jira/browse/HIVE-13316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569767#comment-15569767 ] Hive QA commented on HIVE-13316: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12832847/HIVE-13316.05.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1509/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1509/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1509/ Messages: {noformat} This message was trimmed, see log for full details main: [INFO] Executed tasks [INFO] [INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ spark-client --- [INFO] Compiling 28 source files to /data/hive-ptest/working/apache-github-source-source/spark-client/target/classes [WARNING] /data/hive-ptest/working/apache-github-source-source/spark-client/src/main/java/org/apache/hive/spark/counter/SparkCounter.java: Some input files use or override a deprecated API. [WARNING] /data/hive-ptest/working/apache-github-source-source/spark-client/src/main/java/org/apache/hive/spark/counter/SparkCounter.java: Recompile with -Xlint:deprecation for details. [WARNING] /data/hive-ptest/working/apache-github-source-source/spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcDispatcher.java: Some input files use unchecked or unsafe operations. [WARNING] /data/hive-ptest/working/apache-github-source-source/spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcDispatcher.java: Recompile with -Xlint:unchecked for details. [INFO] [INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ spark-client --- [INFO] Using 'UTF-8' encoding to copy filtered resources. 
[INFO] Copying 1 resource [INFO] Copying 3 resources [INFO] [INFO] --- maven-antrun-plugin:1.7:run (setup-test-dirs) @ spark-client --- [INFO] Executing tasks main: [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/spark-client/target/tmp [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/spark-client/target/warehouse [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/spark-client/target/tmp/conf [copy] Copying 15 files to /data/hive-ptest/working/apache-github-source-source/spark-client/target/tmp/conf [INFO] Executed tasks [INFO] [INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ spark-client --- [INFO] Compiling 5 source files to /data/hive-ptest/working/apache-github-source-source/spark-client/target/test-classes [INFO] [INFO] --- maven-dependency-plugin:2.8:copy (copy-guava-14) @ spark-client --- [INFO] Configured Artifact: com.google.guava:guava:14.0.1:jar [INFO] Copying guava-14.0.1.jar to /data/hive-ptest/working/apache-github-source-source/spark-client/target/dependency/guava-14.0.1.jar [INFO] [INFO] --- maven-surefire-plugin:2.19.1:test (default-test) @ spark-client --- [INFO] Tests are skipped. 
[INFO] [INFO] --- maven-jar-plugin:2.4:jar (default-jar) @ spark-client --- [INFO] Building jar: /data/hive-ptest/working/apache-github-source-source/spark-client/target/spark-client-2.2.0-SNAPSHOT.jar [INFO] [INFO] --- maven-site-plugin:3.3:attach-descriptor (attach-descriptor) @ spark-client --- [INFO] [INFO] --- maven-install-plugin:2.4:install (default-install) @ spark-client --- [INFO] Installing /data/hive-ptest/working/apache-github-source-source/spark-client/target/spark-client-2.2.0-SNAPSHOT.jar to /data/hive-ptest/working/maven/org/apache/hive/spark-client/2.2.0-SNAPSHOT/spark-client-2.2.0-SNAPSHOT.jar [INFO] Installing /data/hive-ptest/working/apache-github-source-source/spark-client/pom.xml to /data/hive-ptest/working/maven/org/apache/hive/spark-client/2.2.0-SNAPSHOT/spark-client-2.2.0-SNAPSHOT.pom [INFO] [INFO] [INFO] Building Hive Query Language 2.2.0-SNAPSHOT [INFO] Downloading: http://www.datanucleus.org/downloads/maven2/org/apache/calcite/calcite-core/1.10.0/calcite-core-1.10.0.pom Downloading: http://repo.maven.apache.org/maven2/org/apache/calcite/calcite-core/1.10.0/calcite-core-1.10.0.pom Downloaded: http://repo.maven.apache.org/maven2/org/apache/calcite/calcite-core/1.10.0/calcite-core-1.10.0.pom (16 KB at 76.7 KB/sec) Downloading: http://www.datanucleus.org/downloads/maven2/org/apache/calcite/calcite/1.10.0/calcite-1.10.0.pom Downloading: http://repo.maven.apache.org/maven2/org/apache/calcite/calcite/1.10.0/calcite-1.10.0.pom Downloaded: http://repo.maven.apache.org/maven2/org/apache/calcite/calcite/1.10.0/calcite-
[jira] [Updated] (HIVE-11394) Enhance EXPLAIN display for vectorization
[ https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-11394: Attachment: (was: HIVE-11394.091.patch) > Enhance EXPLAIN display for vectorization > - > > Key: HIVE-11394 > URL: https://issues.apache.org/jira/browse/HIVE-11394 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-11394.01.patch, HIVE-11394.02.patch, > HIVE-11394.03.patch, HIVE-11394.04.patch, HIVE-11394.05.patch, > HIVE-11394.06.patch, HIVE-11394.07.patch, HIVE-11394.08.patch, > HIVE-11394.09.patch > > > Add detail to the EXPLAIN output showing why Map and Reduce work is not > vectorized. > New syntax is: EXPLAIN VECTORIZATION \[ONLY\] > \[SUMMARY|OPERATOR|EXPRESSION|DETAIL\] > The ONLY option suppresses most non-vectorization elements. > SUMMARY shows vectorization information for the PLAN (is vectorization > enabled) and a summary of Map and Reduce work. > OPERATOR shows vectorization information for operators. E.g. Filter > Vectorization. It includes all information of SUMMARY, too. > EXPRESSION shows vectorization information for expressions. E.g. > predicateExpression. It includes all information of SUMMARY and OPERATOR, > too. > DETAIL shows very detailed vectorization information. > It includes all information of SUMMARY, OPERATOR, and EXPRESSION, too. > The optional clauses default to not ONLY and SUMMARY. > --- > Here are some examples: > EXPLAIN VECTORIZATION example: > (Note the PLAN VECTORIZATION, Map Vectorization, Reduce Vectorization > sections) > Since SUMMARY is the default, this is also the output of EXPLAIN VECTORIZATION > SUMMARY. 
> Under Reducer 3’s "Reduce Vectorization:" you’ll see > notVectorizedReason: Aggregation Function UDF avg parameter expression for > GROUPBY operator: Data type struct of > Column\[VALUE._col2\] not supported > For Reducer 2’s "Reduce Vectorization:" you’ll see "groupByVectorOutput:": > "false", which indicates the node has a GROUP BY with AVG or some other aggregator > that outputs a non-PRIMITIVE type (e.g. STRUCT), so all downstream operators > run in row mode, i.e. without vector output. > If "usesVectorUDFAdaptor:": "false" were true, it would indicate that at > least one vectorized expression uses VectorUDFAdaptor. > And "allNative:" (here "false") will be true when all operators are native. > Today, GROUP BY and FILE SINK are not native. MAP JOIN and REDUCE SINK are > conditionally native. FILTER and SELECT are native. > {code} > PLAN VECTORIZATION: > enabled: true > enabledConditionsMet: [hive.vectorized.execution.enabled IS true] > STAGE DEPENDENCIES: > Stage-1 is a root stage > Stage-0 depends on stages: Stage-1 > STAGE PLANS: > Stage: Stage-1 > Tez > ... > Edges: > Reducer 2 <- Map 1 (SIMPLE_EDGE) > Reducer 3 <- Reducer 2 (SIMPLE_EDGE) > ... 
> Vertices: > Map 1 > Map Operator Tree: > TableScan > alias: alltypesorc > Statistics: Num rows: 12288 Data size: 36696 Basic stats: > COMPLETE Column stats: COMPLETE > Select Operator > expressions: cint (type: int) > outputColumnNames: cint > Statistics: Num rows: 12288 Data size: 36696 Basic stats: > COMPLETE Column stats: COMPLETE > Group By Operator > keys: cint (type: int) > mode: hash > outputColumnNames: _col0 > Statistics: Num rows: 5775 Data size: 17248 Basic > stats: COMPLETE Column stats: COMPLETE > Reduce Output Operator > key expressions: _col0 (type: int) > sort order: + > Map-reduce partition columns: _col0 (type: int) > Statistics: Num rows: 5775 Data size: 17248 Basic > stats: COMPLETE Column stats: COMPLETE > Execution mode: vectorized, llap > LLAP IO: all inputs > Map Vectorization: > enabled: true > enabledConditionsMet: > hive.vectorized.use.vectorized.input.format IS true > groupByVectorOutput: true > inputFileFormats: > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat > allNative: false > usesVectorUDFAdaptor: false > vectorized: true > Reducer 2 > Execution mode: vectorized, llap > Reduce Vectorization: > enabled: true > enableConditi
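As a concrete invocation of the syntax described above (the query here is illustrative and only loosely matches the alltypesorc plan shown; it is not necessarily the exact query that produced it):

{code}
-- Default level: same output as EXPLAIN VECTORIZATION SUMMARY
EXPLAIN VECTORIZATION
SELECT cint FROM alltypesorc GROUP BY cint;

-- Suppress most non-vectorization elements and show per-expression detail
EXPLAIN VECTORIZATION ONLY EXPRESSION
SELECT cint FROM alltypesorc GROUP BY cint;
{code}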
[jira] [Updated] (HIVE-11394) Enhance EXPLAIN display for vectorization
[ https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-11394: Attachment: HIVE-11394.091.patch > Enhance EXPLAIN display for vectorization > - > > Key: HIVE-11394 > URL: https://issues.apache.org/jira/browse/HIVE-11394 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-11394.01.patch, HIVE-11394.02.patch, > HIVE-11394.03.patch, HIVE-11394.04.patch, HIVE-11394.05.patch, > HIVE-11394.06.patch, HIVE-11394.07.patch, HIVE-11394.08.patch, > HIVE-11394.09.patch, HIVE-11394.091.patch > > > Add detail to the EXPLAIN output showing why Map and Reduce work is not > vectorized. > New syntax is: EXPLAIN VECTORIZATION \[ONLY\] > \[SUMMARY|OPERATOR|EXPRESSION|DETAIL\] > The ONLY option suppresses most non-vectorization elements. > SUMMARY shows vectorization information for the PLAN (is vectorization > enabled) and a summary of Map and Reduce work. > OPERATOR shows vectorization information for operators. E.g. Filter > Vectorization. It includes all information of SUMMARY, too. > EXPRESSION shows vectorization information for expressions. E.g. > predicateExpression. It includes all information of SUMMARY and OPERATOR, > too. > DETAIL shows very detailed vectorization information. > It includes all information of SUMMARY, OPERATOR, and EXPRESSION, too. > The optional clauses default to not ONLY and SUMMARY. > --- > Here are some examples: > EXPLAIN VECTORIZATION example: > (Note the PLAN VECTORIZATION, Map Vectorization, Reduce Vectorization > sections) > Since SUMMARY is the default, this is also the output of EXPLAIN VECTORIZATION > SUMMARY. 
> Under Reducer 3’s "Reduce Vectorization:" you’ll see > notVectorizedReason: Aggregation Function UDF avg parameter expression for > GROUPBY operator: Data type struct of > Column\[VALUE._col2\] not supported > For Reducer 2’s "Reduce Vectorization:" you’ll see "groupByVectorOutput:": > "false", which indicates the node has a GROUP BY with AVG or some other aggregator > that outputs a non-PRIMITIVE type (e.g. STRUCT), so all downstream operators > run in row mode, i.e. without vector output. > If "usesVectorUDFAdaptor:": "false" were true, it would indicate that at > least one vectorized expression uses VectorUDFAdaptor. > And "allNative:" (here "false") will be true when all operators are native. > Today, GROUP BY and FILE SINK are not native. MAP JOIN and REDUCE SINK are > conditionally native. FILTER and SELECT are native. > {code} > PLAN VECTORIZATION: > enabled: true > enabledConditionsMet: [hive.vectorized.execution.enabled IS true] > STAGE DEPENDENCIES: > Stage-1 is a root stage > Stage-0 depends on stages: Stage-1 > STAGE PLANS: > Stage: Stage-1 > Tez > ... > Edges: > Reducer 2 <- Map 1 (SIMPLE_EDGE) > Reducer 3 <- Reducer 2 (SIMPLE_EDGE) > ... 
> Vertices: > Map 1 > Map Operator Tree: > TableScan > alias: alltypesorc > Statistics: Num rows: 12288 Data size: 36696 Basic stats: > COMPLETE Column stats: COMPLETE > Select Operator > expressions: cint (type: int) > outputColumnNames: cint > Statistics: Num rows: 12288 Data size: 36696 Basic stats: > COMPLETE Column stats: COMPLETE > Group By Operator > keys: cint (type: int) > mode: hash > outputColumnNames: _col0 > Statistics: Num rows: 5775 Data size: 17248 Basic > stats: COMPLETE Column stats: COMPLETE > Reduce Output Operator > key expressions: _col0 (type: int) > sort order: + > Map-reduce partition columns: _col0 (type: int) > Statistics: Num rows: 5775 Data size: 17248 Basic > stats: COMPLETE Column stats: COMPLETE > Execution mode: vectorized, llap > LLAP IO: all inputs > Map Vectorization: > enabled: true > enabledConditionsMet: > hive.vectorized.use.vectorized.input.format IS true > groupByVectorOutput: true > inputFileFormats: > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat > allNative: false > usesVectorUDFAdaptor: false > vectorized: true > Reducer 2 > Execution mode: vectorized, llap > Reduce Vectorization: > enabled: true > en
[jira] [Updated] (HIVE-14822) Add support for credential provider for jobs launched from Hiveserver2
[ https://issues.apache.org/jira/browse/HIVE-14822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vihang Karajgaonkar updated HIVE-14822: --- Attachment: HIVE-14822.06.patch > Add support for credential provider for jobs launched from Hiveserver2 > -- > > Key: HIVE-14822 > URL: https://issues.apache.org/jira/browse/HIVE-14822 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar > Attachments: HIVE-14822.01.patch, HIVE-14822.02.patch, > HIVE-14822.03.patch, HIVE-14822.05.patch, HIVE-14822.06.patch > > > When using encrypted passwords via the Hadoop Credential Provider, > HiveServer2 currently does not correctly forward enough information to the > job configuration for jobs to read those secrets. If your job needs to access > any secrets, like S3 credentials, then there's no convenient and secure way > to configure this today. > You could specify the decryption key in files like mapred-site.xml that > HiveServer2 uses, but this would place the encryption password on local disk > in plaintext, which can be a security concern. > To solve this problem, HiveServer2 should modify job configuration to include > the environment variable settings needed to decrypt the passwords. > Specifically, it will need to modify: > * For MR2 jobs: > ** yarn.app.mapreduce.am.admin.user.env > ** mapreduce.admin.user.env > * For Spark jobs: > ** spark.yarn.appMasterEnv.HADOOP_CREDSTORE_PASSWORD > ** spark.executorEnv.HADOOP_CREDSTORE_PASSWORD > HiveServer2 can get the decryption password from its own environment, the > same way it does for its own credential provider store today. > Additionally, it can be desirable for HiveServer2 to have a separate > encrypted password file than what is used by the job. HiveServer2 may have > secrets that the job should not have, such as the metastore database password > or the password to decrypt its private SSL certificate. 
It is also a best > practice to keep separate passwords in separate files. To facilitate this, > Hive will also accept: > * A configuration for a path to a credential store to use for jobs. This > should already be uploaded to HDFS. (hive.server2.job.keystore.location or a > better name) If this is not specified, then HS2 will simply use the value of > hadoop.security.credential.provider.path. > * An environment variable for the password to decrypt the credential store > (HIVE_JOB_KEYSTORE_PASSWORD or better). If this is not specified, then HS2 > will simply use the standard environment variable for decrypting the Hadoop > Credential Provider. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
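As a sketch of the wiring this proposal describes (the credential store path is illustrative, and the environment-variable values shown here would be filled in by HS2 from its own environment, not hard-coded by the user), the job configuration would carry something like:

{code}
<!-- Path to the job-side credential store, already uploaded to HDFS (path illustrative) -->
<property>
  <name>hadoop.security.credential.provider.path</name>
  <value>jceks://hdfs/user/hive/job-creds.jceks</value>
</property>
<!-- MR2: propagate the decryption password into the AM and task environments -->
<property>
  <name>yarn.app.mapreduce.am.admin.user.env</name>
  <value>HADOOP_CREDSTORE_PASSWORD=...</value>
</property>
<property>
  <name>mapreduce.admin.user.env</name>
  <value>HADOOP_CREDSTORE_PASSWORD=...</value>
</property>
{code}

For Spark jobs, the analogous settings would be spark.yarn.appMasterEnv.HADOOP_CREDSTORE_PASSWORD and spark.executorEnv.HADOOP_CREDSTORE_PASSWORD, as listed above.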
[jira] [Commented] (HIVE-14872) Remove the configuration HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS
[ https://issues.apache.org/jira/browse/HIVE-14872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569871#comment-15569871 ] Pengcheng Xiong commented on HIVE-14872: Updated the golden file, double-checked that it passed, and pushed to master. Thanks [~ashutoshc] for the review. > Remove the configuration HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS > - > > Key: HIVE-14872 > URL: https://issues.apache.org/jira/browse/HIVE-14872 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-14872.01.patch, HIVE-14872.02.patch > > > The main purpose of the configuration > HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS is backward compatibility, because a > lot of reserved keywords have been used as identifiers in previous > releases. We have already had several releases with this configuration. Now, > when I tried to add new set operators to the parser, ANTLR kept > complaining "code too large". I think it is time to remove this > configuration. (1) It will simplify the parser logic and greatly reduce the > size of the generated parser code; (2) it leaves room for new features, > especially those that require parser changes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
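To illustrate what removing the compatibility switch means for users (table and column names here are illustrative): SQL:2011 reserved words can no longer be used as bare identifiers and must be quoted with backticks:

{code}
-- Fails once the switch is gone: TIMESTAMP is a reserved keyword
-- CREATE TABLE t (timestamp STRING);

-- Works: quote reserved words with backticks
CREATE TABLE t (`timestamp` STRING);
SELECT `timestamp` FROM t;
{code}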
[jira] [Updated] (HIVE-14926) Keep Schema in a consistent state whether schemaTool fails or succeeds.
[ https://issues.apache.org/jira/browse/HIVE-14926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-14926: Attachment: HIVE-14926.1.patch > Keep Schema in a consistent state whether schemaTool fails or succeeds > - > > Key: HIVE-14926 > URL: https://issues.apache.org/jira/browse/HIVE-14926 > Project: Hive > Issue Type: Improvement > Components: Database/Schema >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-14926.1.patch > > > SchemaTool currently uses autocommit when executing the upgrade or init > scripts. We should instead use a database transaction that commits or rolls > back as a unit, to keep the schema consistent. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
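The intended behavior can be sketched in plain SQL (the schema statements are illustrative, not taken from an actual Hive upgrade script): instead of letting each statement commit on its own, the tool would wrap the whole script in one transaction so a mid-script failure rolls everything back:

{code}
-- Autocommit (current behavior): each statement commits independently,
-- so a failure partway through leaves a half-upgraded schema behind.

-- Transactional (proposed):
BEGIN;
ALTER TABLE TBLS ADD COLUMN NEW_COL VARCHAR(128);   -- illustrative upgrade step 1
UPDATE VERSION SET SCHEMA_VERSION = '2.2.0';        -- illustrative upgrade step 2
COMMIT;   -- or ROLLBACK on any error, leaving the schema untouched
{code}

Note this only fully helps on databases with transactional DDL (e.g. PostgreSQL); on MySQL, DDL statements commit implicitly.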
[jira] [Updated] (HIVE-14872) Remove the configuration HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS
[ https://issues.apache.org/jira/browse/HIVE-14872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-14872: --- Component/s: Parser > Remove the configuration HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS > - > > Key: HIVE-14872 > URL: https://issues.apache.org/jira/browse/HIVE-14872 > Project: Hive > Issue Type: Sub-task > Components: Parser >Affects Versions: 2.1.0 >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Fix For: 2.2.0 > > Attachments: HIVE-14872.01.patch, HIVE-14872.02.patch > > > The main purpose of the configuration > HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS is backward compatibility, because a > lot of reserved keywords have been used as identifiers in previous > releases. We have already had several releases with this configuration. Now, > when I tried to add new set operators to the parser, ANTLR kept > complaining "code too large". I think it is time to remove this > configuration. (1) It will simplify the parser logic and greatly reduce the > size of the generated parser code; (2) it leaves room for new features, > especially those that require parser changes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14872) Remove the configuration HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS
[ https://issues.apache.org/jira/browse/HIVE-14872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-14872: --- Resolution: Fixed Status: Resolved (was: Patch Available) > Remove the configuration HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS > - > > Key: HIVE-14872 > URL: https://issues.apache.org/jira/browse/HIVE-14872 > Project: Hive > Issue Type: Sub-task > Components: Parser >Affects Versions: 2.1.0 >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-14872.01.patch, HIVE-14872.02.patch > > > The main purpose of the configuration > HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS is backward compatibility, because a > lot of reserved keywords have been used as identifiers in previous > releases. We have already had several releases with this configuration. Now, > when I tried to add new set operators to the parser, ANTLR kept > complaining "code too large". I think it is time to remove this > configuration. (1) It will simplify the parser logic and greatly reduce the > size of the generated parser code; (2) it leaves room for new features, > especially those that require parser changes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14872) Remove the configuration HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS
[ https://issues.apache.org/jira/browse/HIVE-14872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-14872: --- Fix Version/s: 2.2.0 > Remove the configuration HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS > - > > Key: HIVE-14872 > URL: https://issues.apache.org/jira/browse/HIVE-14872 > Project: Hive > Issue Type: Sub-task > Components: Parser >Affects Versions: 2.1.0 >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Fix For: 2.2.0 > > Attachments: HIVE-14872.01.patch, HIVE-14872.02.patch > > > The main purpose of the configuration > HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS is backward compatibility, because a > lot of reserved keywords have been used as identifiers in previous > releases. We have already had several releases with this configuration. Now, > when I tried to add new set operators to the parser, ANTLR kept > complaining "code too large". I think it is time to remove this > configuration. (1) It will simplify the parser logic and greatly reduce the > size of the generated parser code; (2) it leaves room for new features, > especially those that require parser changes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)