[jira] [Updated] (HIVE-11394) Enhance EXPLAIN display for vectorization
[ https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-11394: Attachment: (was: HIVE-11394.09.patch) > Enhance EXPLAIN display for vectorization > - > > Key: HIVE-11394 > URL: https://issues.apache.org/jira/browse/HIVE-11394 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-11394.01.patch, HIVE-11394.02.patch, > HIVE-11394.03.patch, HIVE-11394.04.patch, HIVE-11394.05.patch, > HIVE-11394.06.patch, HIVE-11394.07.patch, HIVE-11394.08.patch > > > Add detail to the EXPLAIN output showing why Map and Reduce work is not > vectorized. > New syntax is: EXPLAIN VECTORIZATION \[ONLY\] > \[SUMMARY|OPERATOR|EXPRESSION|DETAIL\] > The ONLY option suppresses most non-vectorization elements. > SUMMARY shows vectorization information for the PLAN (is vectorization > enabled) and a summary of Map and Reduce work. > OPERATOR shows vectorization information for operators. E.g. Filter > Vectorization. It includes all information of SUMMARY, too. > EXPRESSION shows vectorization information for expressions. E.g. > predicateExpression. It includes all information of SUMMARY and OPERATOR, > too. > DETAIL shows very detailed vectorization information. > It includes all information of SUMMARY, OPERATOR, and EXPRESSION, too. > By default, ONLY is not specified and the level is SUMMARY. > Here are some examples: > EXPLAIN VECTORIZATION example: > (Note the PLAN VECTORIZATION, Map Vectorization, Reduce Vectorization > sections) > Since SUMMARY is the default, it is the output of EXPLAIN VECTORIZATION > SUMMARY. > {code} > coming soon… > {code} > EXPLAIN VECTORIZATION OPERATOR > Notice the added Select Vectorization, Group By Vectorization, Reduce Sink > Vectorization sections in this example. > {code} > coming soon… > {code} > EXPLAIN VECTORIZATION EXPRESSION > Notice the a in this example. 
> {code} > coming soon… > {code} > EXPLAIN VECTORIZATION DETAIL > Notice the a in this example. > {code} > coming soon… > {code} > EXPLAIN VECTORIZATION ONLY example: > {code} > coming soon… > {code} > EXPLAIN VECTORIZATION ONLY OPERATOR example: > {code} > coming soon… > {code} > EXPLAIN VECTORIZATION ONLY EXPRESSION example: > {code} > {code} > EXPLAIN VECTORIZATION ONLY DETAIL example: > {code} > coming soon… > {code} > The standard @Explain Annotation Type is used. A new 'vectorization' > annotation marks each new class and method. > Works for FORMATTED, like other non-vectorization EXPLAIN variations. > EXPLAIN VECTORIZATION FORMATTED example: > {code} > coming soon… > {code} > or pretty printed: > {code} > coming soon… > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11394) Enhance EXPLAIN display for vectorization
[ https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-11394: Attachment: HIVE-11394.09.patch > Enhance EXPLAIN display for vectorization > - > > Key: HIVE-11394 > URL: https://issues.apache.org/jira/browse/HIVE-11394 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-11394.01.patch, HIVE-11394.02.patch, > HIVE-11394.03.patch, HIVE-11394.04.patch, HIVE-11394.05.patch, > HIVE-11394.06.patch, HIVE-11394.07.patch, HIVE-11394.08.patch, > HIVE-11394.09.patch > > > Add detail to the EXPLAIN output showing why Map and Reduce work is not > vectorized. > New syntax is: EXPLAIN VECTORIZATION \[ONLY\] > \[SUMMARY|OPERATOR|EXPRESSION|DETAIL\] > The ONLY option suppresses most non-vectorization elements. > SUMMARY shows vectorization information for the PLAN (is vectorization > enabled) and a summary of Map and Reduce work. > OPERATOR shows vectorization information for operators. E.g. Filter > Vectorization. It includes all information of SUMMARY, too. > EXPRESSION shows vectorization information for expressions. E.g. > predicateExpression. It includes all information of SUMMARY and OPERATOR, > too. > DETAIL shows very detailed vectorization information. > It includes all information of SUMMARY, OPERATOR, and EXPRESSION, too. > By default, ONLY is not specified and the level is SUMMARY. > Here are some examples: > EXPLAIN VECTORIZATION example: > (Note the PLAN VECTORIZATION, Map Vectorization, Reduce Vectorization > sections) > Since SUMMARY is the default, it is the output of EXPLAIN VECTORIZATION > SUMMARY. > {code} > coming soon… > {code} > EXPLAIN VECTORIZATION OPERATOR > Notice the added Select Vectorization, Group By Vectorization, Reduce Sink > Vectorization sections in this example. > {code} > coming soon… > {code} > EXPLAIN VECTORIZATION EXPRESSION > Notice the a in this example. 
> {code} > coming soon… > {code} > EXPLAIN VECTORIZATION DETAIL > Notice the a in this example. > {code} > coming soon… > {code} > EXPLAIN VECTORIZATION ONLY example: > {code} > coming soon… > {code} > EXPLAIN VECTORIZATION ONLY OPERATOR example: > {code} > coming soon… > {code} > EXPLAIN VECTORIZATION ONLY EXPRESSION example: > {code} > {code} > EXPLAIN VECTORIZATION ONLY DETAIL example: > {code} > coming soon… > {code} > The standard @Explain Annotation Type is used. A new 'vectorization' > annotation marks each new class and method. > Works for FORMATTED, like other non-vectorization EXPLAIN variations. > EXPLAIN VECTORIZATION FORMATTED example: > {code} > coming soon… > {code} > or pretty printed: > {code} > coming soon… > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13539) HiveHFileOutputFormat searching the wrong directory for HFiles
[ https://issues.apache.org/jira/browse/HIVE-13539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15567897#comment-15567897 ] Matt McCline commented on HIVE-13539: - In other words, I can't get the non-patched code to fail. Without a test case, the code patch cannot be reviewed and committed. > HiveHFileOutputFormat searching the wrong directory for HFiles > -- > > Key: HIVE-13539 > URL: https://issues.apache.org/jira/browse/HIVE-13539 > Project: Hive > Issue Type: Bug > Components: HBase Handler >Affects Versions: 1.1.0 > Environment: Built into CDH 5.4.7 >Reporter: Tim Robertson >Assignee: Matt McCline >Priority: Blocker > Attachments: hive_hfile_output_format.q, > hive_hfile_output_format.q.out > > > When creating HFiles for a bulkload in HBase I believe it is looking in the > wrong directory to find the HFiles, resulting in the following exception: > {code} > Error: java.lang.RuntimeException: Hive Runtime Error while closing > operators: java.io.IOException: Multiple family directories found in > hdfs://c1n1.gbif.org:8020/user/hive/warehouse/tim.db/coords_hbase/_temporary/2/_temporary > at > org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:295) > at > org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:453) > at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392) > at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: > java.io.IOException: Multiple family directories found in > hdfs://c1n1.gbif.org:8020/user/hive/warehouse/tim.db/coords_hbase/_temporary/2/_temporary > at > 
org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:188) > at > org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:958) > at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:598) > at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610) > at > org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:287) > ... 7 more > Caused by: java.io.IOException: Multiple family directories found in > hdfs://c1n1.gbif.org:8020/user/hive/warehouse/tim.db/coords_hbase/_temporary/2/_temporary > at > org.apache.hadoop.hive.hbase.HiveHFileOutputFormat$1.close(HiveHFileOutputFormat.java:158) > at > org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:185) > ... 11 more > {code} > The issue is that it looks for the HFiles in > {{hdfs://c1n1.gbif.org:8020/user/hive/warehouse/tim.db/coords_hbase/_temporary/2/_temporary}} > when I believe it should be looking in the task attempt subfolder, such as > {{hdfs://c1n1.gbif.org:8020/user/hive/warehouse/tim.db/coords_hbase/_temporary/2/_temporary/attempt_1461004169450_0002_r_00_1000}}. > This can be reproduced in any HFile creation such as: > {code:sql} > CREATE TABLE coords_hbase(id INT, x DOUBLE, y DOUBLE) > STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' > WITH SERDEPROPERTIES ( > 'hbase.columns.mapping' = ':key,o:x,o:y', > 'hbase.table.default.storage.type' = 'binary'); > SET hfile.family.path=/tmp/coords_hfiles/o; > SET hive.hbase.generatehfiles=true; > INSERT OVERWRITE TABLE coords_hbase > SELECT id, decimalLongitude, decimalLatitude > FROM source > CLUSTER BY id; > {code} > Any advice greatly appreciated -- This message was sent by Atlassian JIRA (v6.3.4#6332)
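The path problem described in the report can be sketched outside Hive with plain `java.nio.file`. This is an illustration, not Hive code: the class name, the temp base directory, and the second attempt directory are assumptions; only the `_temporary/2/_temporary/attempt_*/o` layout comes from the report. Listing the shared `_temporary` level sees one entry per task attempt, while the task-attempt subfolder itself contains exactly the one family directory.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class HFileDirSketch {
    // Builds the layout from the report and returns {directories seen at the
    // shared _temporary level, directories seen inside one task-attempt subfolder}.
    static int[] layoutCounts() throws IOException {
        Path base = Files.createTempDirectory("coords_hbase");
        Path temporary = base.resolve("_temporary").resolve("2").resolve("_temporary");
        Path attempt = temporary.resolve("attempt_1461004169450_0002_r_00_1000");
        Files.createDirectories(attempt.resolve("o")); // "o" is the single family dir
        // a second (hypothetical) task attempt leaves a sibling directory behind
        Files.createDirectories(temporary.resolve("attempt_1461004169450_0002_r_00_1001"));

        File[] atTemporary = temporary.toFile().listFiles(File::isDirectory);
        File[] atAttempt = attempt.toFile().listFiles(File::isDirectory);
        return new int[] { atTemporary.length, atAttempt.length };
    }

    public static void main(String[] args) throws IOException {
        int[] counts = layoutCounts();
        // scanning the shared level sees both attempt dirs, which is what
        // trips a "Multiple family directories found" style check
        System.out.println("at _temporary level: " + counts[0] + " dirs");
        System.out.println("inside task attempt: " + counts[1] + " family dir");
    }
}
```

This matches the reporter's suggestion that the close path should be scoped to the task attempt's own subfolder rather than the shared `_temporary` parent.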
[jira] [Updated] (HIVE-13539) HiveHFileOutputFormat searching the wrong directory for HFiles
[ https://issues.apache.org/jira/browse/HIVE-13539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-13539: Resolution: Cannot Reproduce Status: Resolved (was: Patch Available) > HiveHFileOutputFormat searching the wrong directory for HFiles > -- > > Key: HIVE-13539 > URL: https://issues.apache.org/jira/browse/HIVE-13539 > Project: Hive > Issue Type: Bug > Components: HBase Handler >Affects Versions: 1.1.0 > Environment: Built into CDH 5.4.7 >Reporter: Tim Robertson >Assignee: Matt McCline >Priority: Blocker > Attachments: hive_hfile_output_format.q, > hive_hfile_output_format.q.out > > > When creating HFiles for a bulkload in HBase I believe it is looking in the > wrong directory to find the HFiles, resulting in the following exception: > {code} > Error: java.lang.RuntimeException: Hive Runtime Error while closing > operators: java.io.IOException: Multiple family directories found in > hdfs://c1n1.gbif.org:8020/user/hive/warehouse/tim.db/coords_hbase/_temporary/2/_temporary > at > org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:295) > at > org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:453) > at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392) > at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: > java.io.IOException: Multiple family directories found in > hdfs://c1n1.gbif.org:8020/user/hive/warehouse/tim.db/coords_hbase/_temporary/2/_temporary > at > org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:188) > at > 
org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:958) > at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:598) > at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610) > at > org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:287) > ... 7 more > Caused by: java.io.IOException: Multiple family directories found in > hdfs://c1n1.gbif.org:8020/user/hive/warehouse/tim.db/coords_hbase/_temporary/2/_temporary > at > org.apache.hadoop.hive.hbase.HiveHFileOutputFormat$1.close(HiveHFileOutputFormat.java:158) > at > org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:185) > ... 11 more > {code} > The issue is that it looks for the HFiles in > {{hdfs://c1n1.gbif.org:8020/user/hive/warehouse/tim.db/coords_hbase/_temporary/2/_temporary}} > when I believe it should be looking in the task attempt subfolder, such as > {{hdfs://c1n1.gbif.org:8020/user/hive/warehouse/tim.db/coords_hbase/_temporary/2/_temporary/attempt_1461004169450_0002_r_00_1000}}. > This can be reproduced in any HFile creation such as: > {code:sql} > CREATE TABLE coords_hbase(id INT, x DOUBLE, y DOUBLE) > STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' > WITH SERDEPROPERTIES ( > 'hbase.columns.mapping' = ':key,o:x,o:y', > 'hbase.table.default.storage.type' = 'binary'); > SET hfile.family.path=/tmp/coords_hfiles/o; > SET hive.hbase.generatehfiles=true; > INSERT OVERWRITE TABLE coords_hbase > SELECT id, decimalLongitude, decimalLatitude > FROM source > CLUSTER BY id; > {code} > Any advice greatly appreciated -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14877) Move slow CliDriver tests to MiniLlap
[ https://issues.apache.org/jira/browse/HIVE-14877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-14877: - Resolution: Fixed Fix Version/s: 2.2.0 Status: Resolved (was: Patch Available) Committed to master > Move slow CliDriver tests to MiniLlap > - > > Key: HIVE-14877 > URL: https://issues.apache.org/jira/browse/HIVE-14877 > Project: Hive > Issue Type: Sub-task > Components: Tests >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Fix For: 2.2.0 > > Attachments: HIVE-14877.1.patch, HIVE-14877.2.patch, > HIVE-14877.3.patch, HIVE-14877.4.patch, HIVE-14877.5.patch, > HIVE-14877.5.patch, HIVE-14877.6.patch > > > When analyzing the test runtimes, there are many CliDriver tests that show > up as stragglers and are slow. Most of these tests are not really testing the > execution engine. For example, special_character_in_tabnames_1.q is the > slowest test case that takes 419s in CliDriver but only 62s in MiniLlap. > Similarly there are many test cases that can benefit from fast runtimes. We > should consider moving the tests that are not testing the execution engine to > MiniLlap (assuming it provides significant performance benefit). 
> Here is the list of top 100 slow tests based on build #1055 > ||QFiles||TestCliDriver elapsed time|| > |special_character_in_tabnames_1.q|419.229| > |unionDistinct_1.q|278.583| > |vector_leftsemi_mapjoin.q|232.313| > |join_filters.q|172.436| > |escape2.q|167.503| > |archive_excludeHadoop20.q|163.522| > |escape1.q|130.217| > |lineage3.q|110.935| > |insert_into_with_schema.q|107.345| > |auto_join_filters.q|104.331| > |windowing.q|99.622| > |index_compact_binary_search.q|97.637| > |cbo_rp_windowing_2.q|95.108| > |vectorized_ptf.q|93.397| > |dynpart_sort_optimization_acid.q|91.831| > |partition_multilevels.q|90.392| > |ptf.q|89.115| > |sample_islocalmode_hook.q|88.293| > |udaf_collect_set_2.q|84.725| > |skewjoin.q|84.588| > |lineage2.q|84.187| > |correlationoptimizer1.q|80.367| > |dynpart_sort_optimization.q|77.07| > |orc_ppd_decimal.q|75.523| > |orc_ppd_schema_evol_3a.q|75.352| > |groupby_sort_skew_1_23.q|75.342| > |cbo_rp_lineage2.q|75.283| > |parquet_ppd_decimal.q|74.063| > |sample_islocalmode_hook_use_metadata.q|73.988| > |orc_analyze.q|73.803| > |join_nulls.q|72.417| > |semijoin.q|70.403| > |correlationoptimizer6.q|69.151| > |table_access_keys_stats.q|68.699| > |autoColumnStats_2.q|68.632| > |cbo_join.q|68.325| > |cbo_rp_join.q|68.317| > |sample10.q|64.513| > |mergejoin.q|63.647| > |multi_insert_move_tasks_share_dependencies.q|62.079| > |union_view.q|61.772| > |autoColumnStats_1.q|61.246| > |groupby_sort_1_23.q|61.129| > |pcr.q|59.546| > |vectorization_short_regress.q|58.775| > |auto_sortmerge_join_9.q|58.3| > |correlationoptimizer2.q|56.591| > |alter_merge_stats_orc.q|55.202| > |vector_join30.q|54.85| > |selectDistinctStar.q|53.981| > |vector_decimal_udf.q|53.879| > |auto_join30.q|53.762| > |subquery_notin.q|52.879| > |cbo_rp_subq_not_in.q|52.609| > |cbo_rp_gby.q|51.866| > |cbo_subq_not_in.q|51.672| > |cbo_gby.q|50.361| > |infer_bucket_sort.q|49.158| > |ptf_streaming.q|48.484| > |join_1to1.q|48.268| > |load_dyn_part5.q|47.796| > |limit_join_transpose.q|47.517| 
> |ppd_windowing2.q|47.318| > |dynpart_sort_opt_vectorization.q|47.208| > |vector_number_compare_projection.q|47.024| > |correlationoptimizer4.q|45.472| > |orc_ppd_date.q|45.19| > |global_limit.q|44.438| > |union_top_level.q|44.229| > |llap_partitioned.q|44.139| > |orc_ppd_timestamp.q|43.617| > |parquet_ppd_date.q|43.539| > |multiMapJoin2.q|43.036| > |parquet_ppd_timestamp.q|42.665| > |vector_partitioned_date_time.q|42.511| > |auto_sortmerge_join_8.q|42.377| > |create_view.q|42.23| > |windowing_windowspec2.q|42.202| > |multiMapJoin1.q|41.176| > |vector_decimal_2.q|41.026| > |bucket_groupby.q|40.565| > |rcfile_merge2.q|39.782| > |index_compact_2.q|39.765| > |join_nullsafe.q|39.698| > |vector_join_filters.q|39.343| > |cbo_rp_auto_join1.q|39.308| > |vector_auto_smb_mapjoin_14.q|39.17| > |vector_udf1.q|38.988| > |rcfile_createas1.q|38.932| > |cbo_rp_semijoin.q|38.675| > |auto_join_nulls.q|38.519| > |cbo_rp_unionDistinct_2.q|37.815| > |union_remove_26.q|37.672| > |rcfile_merge3.q|37.373| > |rcfile_merge4.q|37.194| > |bucketsortoptimize_insert_2.q|37.187| > |cbo_limit.q|37.038| > |auto_sortmerge_join_6.q|36.663| > |join43.q|36.656| -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14921) Move slow CliDriver tests to MiniLlap - part 2
[ https://issues.apache.org/jira/browse/HIVE-14921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-14921: - Attachment: HIVE-14921.1.patch same patch after rebase > Move slow CliDriver tests to MiniLlap - part 2 > -- > > Key: HIVE-14921 > URL: https://issues.apache.org/jira/browse/HIVE-14921 > Project: Hive > Issue Type: Sub-task > Components: Tests >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-14921.1.patch, HIVE-14921.1.patch > > > Continuation to HIVE-14877 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14921) Move slow CliDriver tests to MiniLlap - part 2
[ https://issues.apache.org/jira/browse/HIVE-14921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-14921: - Status: Patch Available (was: Open) > Move slow CliDriver tests to MiniLlap - part 2 > -- > > Key: HIVE-14921 > URL: https://issues.apache.org/jira/browse/HIVE-14921 > Project: Hive > Issue Type: Sub-task > Components: Tests >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-14921.1.patch, HIVE-14921.1.patch > > > Continuation to HIVE-14877 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14803) S3: Stats gathering for insert queries can be expensive for partitioned dataset
[ https://issues.apache.org/jira/browse/HIVE-14803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15567924#comment-15567924 ] Hive QA commented on HIVE-14803: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12829480/HIVE-14803.1.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 10666 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[join32_lessSize] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[list_bucket_dml_4] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd2] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_udf_case] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby_map_ppr_multi_distinct] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby_ppr] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[join26] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[join_merge_multi_expressions] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[load_dyn_part8] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[transform_ppr1] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[transform_ppr2] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_ppr] {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1493/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1493/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1493/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 12 tests failed {noformat} 
This message is automatically generated. ATTACHMENT ID: 12829480 - PreCommit-HIVE-Build > S3: Stats gathering for insert queries can be expensive for partitioned > dataset > --- > > Key: HIVE-14803 > URL: https://issues.apache.org/jira/browse/HIVE-14803 > Project: Hive > Issue Type: Improvement > Components: Metastore >Affects Versions: 2.1.0 >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan >Priority: Minor > Attachments: HIVE-14803.1.patch > > > StatsTask's aggregateStats populates stats details for all partitions by > checking the file sizes which turns out to be expensive when larger number of > partitions are inserted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14916) Reduce the memory requirements for Spark tests
[ https://issues.apache.org/jira/browse/HIVE-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15567971#comment-15567971 ] Siddharth Seth commented on HIVE-14916: --- It is necessary. The values should be 2048, 512, 2048 instead of 4096, 1024, 4096 > Reduce the memory requirements for Spark tests > -- > > Key: HIVE-14916 > URL: https://issues.apache.org/jira/browse/HIVE-14916 > Project: Hive > Issue Type: Sub-task >Reporter: Ferdinand Xu >Assignee: Dapeng Sun > Attachments: HIVE-14916.001.patch, HIVE-14916.002.patch, > HIVE-14916.003.patch > > > As HIVE-14887, we need to reduce the memory requirements for Spark tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14761) Remove TestJdbcWithMiniMr after merging tests with TestJdbcWithMiniHS2
[ https://issues.apache.org/jira/browse/HIVE-14761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15567978#comment-15567978 ] Siddharth Seth commented on HIVE-14761: --- +1. > Remove TestJdbcWithMiniMr after merging tests with TestJdbcWithMiniHS2 > -- > > Key: HIVE-14761 > URL: https://issues.apache.org/jira/browse/HIVE-14761 > Project: Hive > Issue Type: Sub-task >Reporter: Vaibhav Gumashta >Assignee: Vaibhav Gumashta > Attachments: HIVE-14761.1.patch > > > Currently 2 min 30 sec -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14761) Remove TestJdbcWithMiniMr after merging tests with TestJdbcWithMiniHS2
[ https://issues.apache.org/jira/browse/HIVE-14761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15567983#comment-15567983 ] Siddharth Seth commented on HIVE-14761: --- Created HIVE-14936 to track the flaky minillap test. > Remove TestJdbcWithMiniMr after merging tests with TestJdbcWithMiniHS2 > -- > > Key: HIVE-14761 > URL: https://issues.apache.org/jira/browse/HIVE-14761 > Project: Hive > Issue Type: Sub-task >Reporter: Vaibhav Gumashta >Assignee: Vaibhav Gumashta > Attachments: HIVE-14761.1.patch > > > Currently 2 min 30 sec -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13316) Upgrade to Calcite 1.10
[ https://issues.apache.org/jira/browse/HIVE-13316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-13316: --- Summary: Upgrade to Calcite 1.10 (was: Upgrade to Calcite 1.9) > Upgrade to Calcite 1.10 > --- > > Key: HIVE-13316 > URL: https://issues.apache.org/jira/browse/HIVE-13316 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.2.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-13316.01.patch, HIVE-13316.02.patch, > HIVE-13316.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14887) Reduce the memory requirements for tests
[ https://issues.apache.org/jira/browse/HIVE-14887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated HIVE-14887: -- Attachment: HIVE-14887.02.patch Updated patch, with some more runtime parameters set. > Reduce the memory requirements for tests > > > Key: HIVE-14887 > URL: https://issues.apache.org/jira/browse/HIVE-14887 > Project: Hive > Issue Type: Sub-task >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-14887.01.patch, HIVE-14887.02.patch > > > The clusters that we spin up end up requiring 16GB at times. Also the maven > arguments seem a little heavy weight. > Reducing this will allow for additional ptest drones per box, which should > bring down the runtime. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13316) Upgrade to Calcite 1.10
[ https://issues.apache.org/jira/browse/HIVE-13316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-13316: --- Attachment: HIVE-13316.03.patch > Upgrade to Calcite 1.10 > --- > > Key: HIVE-13316 > URL: https://issues.apache.org/jira/browse/HIVE-13316 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.2.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-13316.01.patch, HIVE-13316.02.patch, > HIVE-13316.03.patch, HIVE-13316.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13316) Upgrade to Calcite 1.10
[ https://issues.apache.org/jira/browse/HIVE-13316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-13316: --- Attachment: (was: HIVE-13316.03.patch) > Upgrade to Calcite 1.10 > --- > > Key: HIVE-13316 > URL: https://issues.apache.org/jira/browse/HIVE-13316 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.2.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-13316.01.patch, HIVE-13316.02.patch, > HIVE-13316.04.patch, HIVE-13316.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13316) Upgrade to Calcite 1.10
[ https://issues.apache.org/jira/browse/HIVE-13316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-13316: --- Attachment: HIVE-13316.04.patch > Upgrade to Calcite 1.10 > --- > > Key: HIVE-13316 > URL: https://issues.apache.org/jira/browse/HIVE-13316 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.2.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-13316.01.patch, HIVE-13316.02.patch, > HIVE-13316.04.patch, HIVE-13316.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14916) Reduce the memory requirements for Spark tests
[ https://issues.apache.org/jira/browse/HIVE-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15568015#comment-15568015 ] Dapeng Sun commented on HIVE-14916: --- Thanks [~sseth], I will try to update the patch. > Reduce the memory requirements for Spark tests > -- > > Key: HIVE-14916 > URL: https://issues.apache.org/jira/browse/HIVE-14916 > Project: Hive > Issue Type: Sub-task >Reporter: Ferdinand Xu >Assignee: Dapeng Sun > Attachments: HIVE-14916.001.patch, HIVE-14916.002.patch, > HIVE-14916.003.patch > > > As HIVE-14887, we need to reduce the memory requirements for Spark tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13316) Upgrade to Calcite 1.10
[ https://issues.apache.org/jira/browse/HIVE-13316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-13316: --- Attachment: HIVE-13316.05.patch > Upgrade to Calcite 1.10 > --- > > Key: HIVE-13316 > URL: https://issues.apache.org/jira/browse/HIVE-13316 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.2.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-13316.01.patch, HIVE-13316.02.patch, > HIVE-13316.05.patch, HIVE-13316.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13316) Upgrade to Calcite 1.10
[ https://issues.apache.org/jira/browse/HIVE-13316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-13316: --- Attachment: (was: HIVE-13316.04.patch) > Upgrade to Calcite 1.10 > --- > > Key: HIVE-13316 > URL: https://issues.apache.org/jira/browse/HIVE-13316 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.2.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-13316.01.patch, HIVE-13316.02.patch, > HIVE-13316.05.patch, HIVE-13316.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11394) Enhance EXPLAIN display for vectorization
[ https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15568073#comment-15568073 ] Hive QA commented on HIVE-11394: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12832834/HIVE-11394.09.patch {color:green}SUCCESS:{color} +1 due to 162 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 25 failed/errored test(s), 10606 tests executed *Failed tests:* {noformat} TestMiniLlapLocalCliDriver-orc_llap.q-delete_where_non_partitioned.q-vector_groupby_mapjoin.q-and-27-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_between_in] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_orc_vec_part] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_orc_vec_part_all_primitive] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_orc_vec_table] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_part] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_table] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vecrow_part] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vecrow_part_all_primitive] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vecrow_table] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_udf] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_number_compare_projection] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_udf1] org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[vectorization_limit] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_between_in] 
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_string_concat] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_13] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_14] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_15] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_16] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_9] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_short_regress] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorized_mapjoin] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorized_shufflejoin] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorized_timestamp_funcs] {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1494/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1494/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1494/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 25 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12832834 - PreCommit-HIVE-Build > Enhance EXPLAIN display for vectorization > - > > Key: HIVE-11394 > URL: https://issues.apache.org/jira/browse/HIVE-11394 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-11394.01.patch, HIVE-11394.02.patch, > HIVE-11394.03.patch, HIVE-11394.04.patch, HIVE-11394.05.patch, > HIVE-11394.06.patch, HIVE-11394.07.patch, HIVE-11394.08.patch, > HIVE-11394.09.patch > > > Add detail to the EXPLAIN output showing why a Map and Reduce work is not > vectorized. > New syntax is: EXPLAIN VECTORIZATION \[ONLY\] > \[SUMMARY|OPERATOR|EXPRESSION|DETAIL\] > The ONLY option suppresses most non-vectorization elements. > SUMMARY shows vectorization information for the PLAN (is vectorization > enabled) and a summary of Map and Reduce work. > OPERATOR shows vectorization information for operators. E.g. Filter > Vectorization. It includes all information of SUMMARY, too. > EXPRESSION shows vectorization information for expressions. E.g. > predicateExpression. It includes all information of SUMMARY and OPERATOR, > too. > DETAIL shows very vectorization information. > It includes all information of SUMMARY, OPERATOR, and EXPRESSION too. > The optional clause defaults are not ONLY and SUMMARY. > Here are some examples: > EXPLAIN VECTORIZATION exampl
[jira] [Commented] (HIVE-14797) reducer number estimating may lead to data skew
[ https://issues.apache.org/jira/browse/HIVE-14797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15568152#comment-15568152 ] Rui Li commented on HIVE-14797: --- It seems that for MR we need to get the number of reducers from hconf, but for Spark/Tez we need to get it from ReduceSinkDesc::getNumReducers. Therefore we have to check both of them to determine whether the reducer count is the same as our hash seed. > reducer number estimating may lead to data skew > --- > > Key: HIVE-14797 > URL: https://issues.apache.org/jira/browse/HIVE-14797 > Project: Hive > Issue Type: Improvement > Components: Query Processor >Reporter: roncenzhao >Assignee: roncenzhao > Attachments: HIVE-14797.2.patch, HIVE-14797.3.patch, HIVE-14797.patch > > > HiveKey's hash code is generated by multiplying by 31, key by key, as > implemented in the method `ObjectInspectorUtils.getBucketHashCode()`: > for (int i = 0; i < bucketFields.length; i++) { > int fieldHash = ObjectInspectorUtils.hashCode(bucketFields[i], > bucketFieldInspectors[i]); > hashCode = 31 * hashCode + fieldHash; > } > The following example will lead to data skew: > I have two tables, tbl1 and tbl2, with the same columns: a int, b > string. The values of column 'a' in both tables are not skewed, but the values > of column 'b' in both tables are skewed. > When my SQL is "select * from tbl1 join tbl2 on tbl1.a=tbl2.a and > tbl1.b=tbl2.b" and the estimated reducer number is 31, it will lead to data > skew. > As we know, the HiveKey's hash code is generated by `hash(a)*31 + hash(b)`. > When the reducer number is 31, the reducer number of each row is `hash(b)%31`. As a > result, the job will be skewed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
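The cancellation described above can be seen in a small standalone sketch (a hypothetical `SkewDemo` class, not Hive code; `bucketHashCode` mirrors the two-key case of the `getBucketHashCode()` loop quoted in the description). With `hashCode = 31*hash(a) + hash(b)` and exactly 31 reducers, the `31*hash(a)` term vanishes modulo 31, so the reducer is picked by `hash(b)` alone and skew in column b becomes reducer skew:

```java
// Hypothetical demo (not part of Hive): why 31 reducers defeat the
// multiply-by-31 bucket hash.
public class SkewDemo {

    // Mirrors ObjectInspectorUtils.getBucketHashCode() for two key fields:
    // result == 31 * hashA + hashB.
    static int bucketHashCode(int hashA, int hashB) {
        int hashCode = 0;
        hashCode = 31 * hashCode + hashA;
        hashCode = 31 * hashCode + hashB;
        return hashCode;
    }

    public static void main(String[] args) {
        int numReducers = 31;
        // Whatever hash(a) is, (31*hashA + hashB) % 31 == hashB % 31,
        // so every row with the same b lands on the same reducer.
        for (int hashA = 0; hashA < 5; hashA++) {
            int reducer = Math.floorMod(bucketHashCode(hashA, 7), numReducers);
            System.out.println("hashA=" + hashA + " -> reducer " + reducer);
        }
    }
}
```

Any reducer count that divides the hash multiplier exhibits the same collapse, which is why checking the estimated reducer count against the multiplier matters.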
[jira] [Commented] (HIVE-11957) SHOW TRANSACTIONS should show queryID/agent id of the creator
[ https://issues.apache.org/jira/browse/HIVE-11957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15568216#comment-15568216 ] Hive QA commented on HIVE-11957: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12832769/HIVE-11957.5.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:green}SUCCESS:{color} +1 due to 10636 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1495/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1495/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1495/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12832769 - PreCommit-HIVE-Build > SHOW TRANSACTIONS should show queryID/agent id of the creator > - > > Key: HIVE-11957 > URL: https://issues.apache.org/jira/browse/HIVE-11957 > Project: Hive > Issue Type: Improvement > Components: Transactions >Affects Versions: 1.0.0 >Reporter: Eugene Koifman >Assignee: Wei Zheng > Attachments: HIVE-11957.1.patch, HIVE-11957.2.patch, > HIVE-11957.3.patch, HIVE-11957.4.patch, HIVE-11957.5.patch > > > this would be very useful for debugging > should also include heartbeat/create timestamps > would be nice to support some filtering/sorting options, like sort by create > time, agent id. filter by table, database, etc -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14922) Add perf logging for post job completion steps
[ https://issues.apache.org/jira/browse/HIVE-14922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15568410#comment-15568410 ] Hive QA commented on HIVE-14922: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12832766/HIVE-14922.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 10636 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver[hbase_bulk] {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1496/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1496/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1496/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12832766 - PreCommit-HIVE-Build > Add perf logging for post job completion steps > --- > > Key: HIVE-14922 > URL: https://issues.apache.org/jira/browse/HIVE-14922 > Project: Hive > Issue Type: Task > Components: Logging >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Attachments: HIVE-14922.patch > > > Mostly FS related operations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14753) Track the number of open/closed/abandoned sessions in HS2
[ https://issues.apache.org/jira/browse/HIVE-14753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15568478#comment-15568478 ] Barna Zsombor Klara commented on HIVE-14753: Hi [~szehon], if you have the time, could you review my patch? I think you are very familiar with the metrics API, so the changes should be straightforward to follow. Thanks! > Track the number of open/closed/abandoned sessions in HS2 > - > > Key: HIVE-14753 > URL: https://issues.apache.org/jira/browse/HIVE-14753 > Project: Hive > Issue Type: Sub-task > Components: Hive, HiveServer2 >Reporter: Barna Zsombor Klara >Assignee: Barna Zsombor Klara > Fix For: 2.2.0 > > Attachments: HIVE-14753.patch > > > We should be able to track the number of sessions since the startup of the HS2 > instance as well as the average lifetime of a session. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13280) Error when more than 1 mapper for HBase storage handler
[ https://issues.apache.org/jira/browse/HIVE-13280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15568492#comment-15568492 ] Damien Carol commented on HIVE-13280: - Yes, this fixes the problem. > Error when more than 1 mapper for HBase storage handler > --- > > Key: HIVE-13280 > URL: https://issues.apache.org/jira/browse/HIVE-13280 > Project: Hive > Issue Type: Bug > Components: HBase Handler >Affects Versions: 2.0.0 >Reporter: Damien Carol > > With a simple query (select from an ORC table and insert into an HBase external > table): > {code:sql} > insert into table register.register select * from aa_temp > {code} > The aa_temp table has 45 ORC files. It generates 45 mappers. > Some mappers fail with this error: > {noformat} > Caused by: java.lang.IllegalArgumentException: Must specify table name > at > org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:188) > at > org.apache.hive.common.util.ReflectionUtil.setConf(ReflectionUtil.java:101) > at > org.apache.hive.common.util.ReflectionUtil.newInstance(ReflectionUtil.java:87) > at > org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:300) > at > org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:290) > at > org.apache.hadoop.hive.ql.exec.FileSinkOperator.createHiveOutputFormat(FileSinkOperator.java:1126) > ... 25 more > ]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 > killedTasks:35, Vertex vertex_1457964631631_0015_3_00 [Map 1] killed/failed > due to:OWN_TASK_FAILURE]DAG did not succeed due to VERTEX_FAILURE. > failedVertices:1 killedVertices:0 (state=08S01,code=2) > {noformat} > If I do an ALTER CONCATENATE on aa_temp and redo the query, everything is > fine because there is only one mapper. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-13280) Error when more than 1 mapper for HBase storage handler
[ https://issues.apache.org/jira/browse/HIVE-13280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol resolved HIVE-13280. - Resolution: Invalid Assignee: Damien Carol > Error when more than 1 mapper for HBase storage handler > --- > > Key: HIVE-13280 > URL: https://issues.apache.org/jira/browse/HIVE-13280 > Project: Hive > Issue Type: Bug > Components: HBase Handler >Affects Versions: 2.0.0 >Reporter: Damien Carol >Assignee: Damien Carol > > With a simple query (select from orc table and insert into HBase external > table): > {code:sql} > insert into table register.register select * from aa_temp > {code} > The aa_temp table have 45 orc files. It generate 45 mappers. > Some mappers fail with this error: > {noformat} > Caused by: java.lang.IllegalArgumentException: Must specify table name > at > org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:188) > at > org.apache.hive.common.util.ReflectionUtil.setConf(ReflectionUtil.java:101) > at > org.apache.hive.common.util.ReflectionUtil.newInstance(ReflectionUtil.java:87) > at > org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:300) > at > org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:290) > at > org.apache.hadoop.hive.ql.exec.FileSinkOperator.createHiveOutputFormat(FileSinkOperator.java:1126) > ... 25 more > ]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 > killedTasks:35, Vertex vertex_1457964631631_0015_3_00 [Map 1] killed/failed > due to:OWN_TASK_FAILURE]DAG did not succeed due to VERTEX_FAILURE. > failedVertices:1 killedVertices:0 (state=08S01,code=2) > {noformat} > If I do an ALTER CONCATENATE for aa_temp. And redo the query. Everything is > fine because there are only one mapper. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14872) Remove the configuration HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS
[ https://issues.apache.org/jira/browse/HIVE-14872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15568529#comment-15568529 ] Hive QA commented on HIVE-14872: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12832776/HIVE-14872.02.patch {color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 10484 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[udaf_collect_set_2] {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1497/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1497/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1497/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12832776 - PreCommit-HIVE-Build > Remove the configuration HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS > - > > Key: HIVE-14872 > URL: https://issues.apache.org/jira/browse/HIVE-14872 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-14872.01.patch, HIVE-14872.02.patch > > > The main purpose of the configuration > HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS is backward compatibility, because many > reserved keywords have been used as identifiers in previous > releases. We have already had several releases with this configuration. Now, > when I tried to add new set operators to the parser, ANTLR kept > complaining "code too large". 
I think it is time to remove this > configuration. (1) It will simplify the parser logic and largely reduce the > size of the generated parser code; (2) it leaves space for new features, > especially those which require parser changes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11394) Enhance EXPLAIN display for vectorization
[ https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-11394: Description: Add detail to the EXPLAIN output showing why Map and Reduce work is not vectorized. New syntax is: EXPLAIN VECTORIZATION \[ONLY\] \[SUMMARY|OPERATOR|EXPRESSION|DETAIL\] The ONLY option suppresses most non-vectorization elements. SUMMARY shows vectorization information for the PLAN (is vectorization enabled) and a summary of Map and Reduce work. OPERATOR shows vectorization information for operators. E.g. Filter Vectorization. It includes all information of SUMMARY, too. EXPRESSION shows vectorization information for expressions. E.g. predicateExpression. It includes all information of SUMMARY and OPERATOR, too. DETAIL shows very detailed vectorization information. It includes all information of SUMMARY, OPERATOR, and EXPRESSION, too. The defaults for the optional clauses are not-ONLY and SUMMARY. --- Here are some examples: EXPLAIN VECTORIZATION example: (Note the PLAN VECTORIZATION, Map Vectorization, Reduce Vectorization sections) Since SUMMARY is the default, it is the output of EXPLAIN VECTORIZATION SUMMARY. Under Reducer 3’s "Reduce Vectorization:" you’ll see notVectorizedReason: Aggregation Function UDF avg parameter expression for GROUPBY operator: Data type struct of Column\[VALUE._col2\] not supported For Reducer 2’s "Reduce Vectorization:" you’ll see "groupByVectorOutput:": "false", which says a node has a GROUP BY with an AVG or some other aggregator that outputs a non-PRIMITIVE type (e.g. STRUCT) and all downstream operators are row-mode, i.e. not vector output. If "usesVectorUDFAdaptor:": "false" were true, it would mean at least one vectorized expression is using VectorUDFAdaptor. And "allNative:" will be "true" when all operators are native. Today, GROUP BY and FILE SINK are not native. MAP JOIN and REDUCE SINK are conditionally native. FILTER and SELECT are native. 
{code} PLAN VECTORIZATION: enabled: true enabledConditionsMet: [hive.vectorized.execution.enabled IS true] STAGE DEPENDENCIES: Stage-1 is a root stage Stage-0 depends on stages: Stage-1 STAGE PLANS: Stage: Stage-1 Tez ... Edges: Reducer 2 <- Map 1 (SIMPLE_EDGE) Reducer 3 <- Reducer 2 (SIMPLE_EDGE) ... Vertices: Map 1 Map Operator Tree: TableScan alias: alltypesorc Statistics: Num rows: 12288 Data size: 36696 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: cint (type: int) outputColumnNames: cint Statistics: Num rows: 12288 Data size: 36696 Basic stats: COMPLETE Column stats: COMPLETE Group By Operator keys: cint (type: int) mode: hash outputColumnNames: _col0 Statistics: Num rows: 5775 Data size: 17248 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: _col0 (type: int) sort order: + Map-reduce partition columns: _col0 (type: int) Statistics: Num rows: 5775 Data size: 17248 Basic stats: COMPLETE Column stats: COMPLETE Execution mode: vectorized, llap LLAP IO: all inputs Map Vectorization: enabled: true enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS true groupByVectorOutput: true inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat allNative: false usesVectorUDFAdaptor: false vectorized: true Reducer 2 Execution mode: vectorized, llap Reduce Vectorization: enabled: true enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true groupByVectorOutput: false allNative: false usesVectorUDFAdaptor: false vectorized: true Reduce Operator Tree: Group By Operator keys: KEY._col0 (type: int) mode: mergepartial outputColumnNames: _col0 Statistics: Num rows: 5775 Data size: 17248 Basic stats: COMPLETE Column stats: COMPLETE Group By Operator aggregations: sum(_col0), count(_col0), avg(_col0), std(_col0) mode: hash outputColumnNames: _col0, _col1, _col2, _col3 Statistics: Num rows: 1
[jira] [Commented] (HIVE-12765) Support Intersect (distinct/all) Except (distinct/all) Minus (distinct/all)
[ https://issues.apache.org/jira/browse/HIVE-12765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15568670#comment-15568670 ] Hive QA commented on HIVE-12765: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12832788/HIVE-12765.02.patch {color:green}SUCCESS:{color} +1 due to 5 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 185 failed/errored test(s), 10641 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_3] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join8] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_input26] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[input25] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[input26] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[join8] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[keyword_2] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[limit0] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[list_bucket_query_oneskew_2] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[load_dyn_part14] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_field_garbage] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[smb_mapjoin_25] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[timestamp] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[union_remove_25] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_null_projection] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[autoColumnStats_1] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[autoColumnStats_2] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[explainuser_1] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[udaf_collect_set_2] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_top_level] 
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_null_projection] org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[authorization_cannot_create_all_role] org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[authorization_cannot_create_none_role] org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[cte_with_in_subquery] org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[lateral_view_join] org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[subq_insert] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[auto_join8] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[join8] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[load_dyn_part14] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[smb_mapjoin_25] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_25] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_top_level] org.apache.hadoop.hive.ql.parse.TestParseNegativeDriver.testCliDriver[missing_overwrite] org.apache.hadoop.hive.ql.parse.TestSQL11ReservedKeyWordsNegative.testSQL11ReservedKeyWords_ALL org.apache.hadoop.hive.ql.parse.TestSQL11ReservedKeyWordsNegative.testSQL11ReservedKeyWords_ALTER org.apache.hadoop.hive.ql.parse.TestSQL11ReservedKeyWordsNegative.testSQL11ReservedKeyWords_ARRAY org.apache.hadoop.hive.ql.parse.TestSQL11ReservedKeyWordsNegative.testSQL11ReservedKeyWords_AS org.apache.hadoop.hive.ql.parse.TestSQL11ReservedKeyWordsNegative.testSQL11ReservedKeyWords_AUTHORIZATION org.apache.hadoop.hive.ql.parse.TestSQL11ReservedKeyWordsNegative.testSQL11ReservedKeyWords_BETWEEN org.apache.hadoop.hive.ql.parse.TestSQL11ReservedKeyWordsNegative.testSQL11ReservedKeyWords_BIGINT org.apache.hadoop.hive.ql.parse.TestSQL11ReservedKeyWordsNegative.testSQL11ReservedKeyWords_BINARY org.apache.hadoop.hive.ql.parse.TestSQL11ReservedKeyWordsNegative.testSQL11ReservedKeyWords_BOOLEAN 
org.apache.hadoop.hive.ql.parse.TestSQL11ReservedKeyWordsNegative.testSQL11ReservedKeyWords_BOTH org.apache.hadoop.hive.ql.parse.TestSQL11ReservedKeyWordsNegative.testSQL11ReservedKeyWords_BY org.apache.hadoop.hive.ql.parse.TestSQL11ReservedKeyWordsNegative.testSQL11ReservedKeyWords_CREATE org.apache.hadoop.hive.ql.parse.TestSQL11ReservedKeyWordsNegative.testSQL11ReservedKeyWords_CUBE org.apache.hadoop.hive.ql.parse.TestSQL11ReservedKeyWordsNegative.testSQL11ReservedKeyWords_CURRENT_DATE org.apache.hadoop.hive.ql.parse.TestSQL11ReservedKeyWordsNegative.testSQL11ReservedKeyWords_CURRENT_TIMESTAMP org.apache.hadoop.hive.ql.parse.TestSQL11ReservedKeyWordsNegative.testSQL11ReservedKeyWords_CURSOR org.apache.hadoop.hive.ql.parse.TestSQL11ReservedKeyWordsNegative.testSQL11ReservedKeyWords_DATE org.apache.hadoop.hive.ql.parse.TestSQL11ReservedKeyWordsNegative.testSQL11ReservedKeyWords_DECIMAL org.apache.hadoop.hive.ql.parse.TestSQL11ReservedKeyWordsNegative.testSQL11ReservedKeyWords_DEL
[jira] [Updated] (HIVE-11394) Enhance EXPLAIN display for vectorization
[ https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-11394: Status: Patch Available (was: In Progress) > Enhance EXPLAIN display for vectorization > - > > Key: HIVE-11394 > URL: https://issues.apache.org/jira/browse/HIVE-11394 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-11394.01.patch, HIVE-11394.02.patch, > HIVE-11394.03.patch, HIVE-11394.04.patch, HIVE-11394.05.patch, > HIVE-11394.06.patch, HIVE-11394.07.patch, HIVE-11394.08.patch, > HIVE-11394.09.patch, HIVE-11394.091.patch > > > Add detail to the EXPLAIN output showing why a Map and Reduce work is not > vectorized. > New syntax is: EXPLAIN VECTORIZATION \[ONLY\] > \[SUMMARY|OPERATOR|EXPRESSION|DETAIL\] > The ONLY option suppresses most non-vectorization elements. > SUMMARY shows vectorization information for the PLAN (is vectorization > enabled) and a summary of Map and Reduce work. > OPERATOR shows vectorization information for operators. E.g. Filter > Vectorization. It includes all information of SUMMARY, too. > EXPRESSION shows vectorization information for expressions. E.g. > predicateExpression. It includes all information of SUMMARY and OPERATOR, > too. > DETAIL shows very vectorization information. > It includes all information of SUMMARY, OPERATOR, and EXPRESSION too. > The optional clause defaults are not ONLY and SUMMARY. > --- > Here are some examples: > EXPLAIN VECTORIZATION example: > (Note the PLAN VECTORIZATION, Map Vectorization, Reduce Vectorization > sections) > Since SUMMARY is the default, it is the output of EXPLAIN VECTORIZATION > SUMMARY. 
> Under Reducer 3’s "Reduce Vectorization:" you’ll see > notVectorizedReason: Aggregation Function UDF avg parameter expression for > GROUPBY operator: Data type struct of > Column\[VALUE._col2\] not supported > For Reducer 2’s "Reduce Vectorization:" you’ll see "groupByVectorOutput:": > "false" which says a node has a GROUP BY with an AVG or some other aggregator > that outputs a non-PRIMITIVE type (e.g. STRUCT) and all downstream operators > are row-mode. I.e. not vector output. > If "usesVectorUDFAdaptor:": "false" were true, it would say there was at > least one vectorized expression is using VectorUDFAdaptor. > And, "allNative:": "false" will be true when all operators are native. > Today, GROUP BY and FILE SINK are not native. MAP JOIN and REDUCE SINK are > conditionally native. FILTER and SELECT are native. > {code} > PLAN VECTORIZATION: > enabled: true > enabledConditionsMet: [hive.vectorized.execution.enabled IS true] > STAGE DEPENDENCIES: > Stage-1 is a root stage > Stage-0 depends on stages: Stage-1 > STAGE PLANS: > Stage: Stage-1 > Tez > ... > Edges: > Reducer 2 <- Map 1 (SIMPLE_EDGE) > Reducer 3 <- Reducer 2 (SIMPLE_EDGE) > ... 
> Vertices: > Map 1 > Map Operator Tree: > TableScan > alias: alltypesorc > Statistics: Num rows: 12288 Data size: 36696 Basic stats: > COMPLETE Column stats: COMPLETE > Select Operator > expressions: cint (type: int) > outputColumnNames: cint > Statistics: Num rows: 12288 Data size: 36696 Basic stats: > COMPLETE Column stats: COMPLETE > Group By Operator > keys: cint (type: int) > mode: hash > outputColumnNames: _col0 > Statistics: Num rows: 5775 Data size: 17248 Basic > stats: COMPLETE Column stats: COMPLETE > Reduce Output Operator > key expressions: _col0 (type: int) > sort order: + > Map-reduce partition columns: _col0 (type: int) > Statistics: Num rows: 5775 Data size: 17248 Basic > stats: COMPLETE Column stats: COMPLETE > Execution mode: vectorized, llap > LLAP IO: all inputs > Map Vectorization: > enabled: true > enabledConditionsMet: > hive.vectorized.use.vectorized.input.format IS true > groupByVectorOutput: true > inputFileFormats: > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat > allNative: false > usesVectorUDFAdaptor: false > vectorized: true > Reducer 2 > Execution mode: vectorized, llap > Reduce Vectorization: > enabled: true >
[jira] [Updated] (HIVE-11394) Enhance EXPLAIN display for vectorization
[ https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-11394: Attachment: HIVE-11394.091.patch > Enhance EXPLAIN display for vectorization > - > > Key: HIVE-11394 > URL: https://issues.apache.org/jira/browse/HIVE-11394 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-11394.01.patch, HIVE-11394.02.patch, > HIVE-11394.03.patch, HIVE-11394.04.patch, HIVE-11394.05.patch, > HIVE-11394.06.patch, HIVE-11394.07.patch, HIVE-11394.08.patch, > HIVE-11394.09.patch, HIVE-11394.091.patch > > > Add detail to the EXPLAIN output showing why a Map and Reduce work is not > vectorized. > New syntax is: EXPLAIN VECTORIZATION \[ONLY\] > \[SUMMARY|OPERATOR|EXPRESSION|DETAIL\] > The ONLY option suppresses most non-vectorization elements. > SUMMARY shows vectorization information for the PLAN (is vectorization > enabled) and a summary of Map and Reduce work. > OPERATOR shows vectorization information for operators. E.g. Filter > Vectorization. It includes all information of SUMMARY, too. > EXPRESSION shows vectorization information for expressions. E.g. > predicateExpression. It includes all information of SUMMARY and OPERATOR, > too. > DETAIL shows very vectorization information. > It includes all information of SUMMARY, OPERATOR, and EXPRESSION too. > The optional clause defaults are not ONLY and SUMMARY. > --- > Here are some examples: > EXPLAIN VECTORIZATION example: > (Note the PLAN VECTORIZATION, Map Vectorization, Reduce Vectorization > sections) > Since SUMMARY is the default, it is the output of EXPLAIN VECTORIZATION > SUMMARY. 
> Under Reducer 3’s "Reduce Vectorization:" you’ll see > notVectorizedReason: Aggregation Function UDF avg parameter expression for > GROUPBY operator: Data type struct of > Column\[VALUE._col2\] not supported > For Reducer 2’s "Reduce Vectorization:" you’ll see "groupByVectorOutput:": > "false" which says a node has a GROUP BY with an AVG or some other aggregator > that outputs a non-PRIMITIVE type (e.g. STRUCT) and all downstream operators > are row-mode. I.e. not vector output. > If "usesVectorUDFAdaptor:": "false" were true, it would say there was at > least one vectorized expression is using VectorUDFAdaptor. > And, "allNative:": "false" will be true when all operators are native. > Today, GROUP BY and FILE SINK are not native. MAP JOIN and REDUCE SINK are > conditionally native. FILTER and SELECT are native. > {code} > PLAN VECTORIZATION: > enabled: true > enabledConditionsMet: [hive.vectorized.execution.enabled IS true] > STAGE DEPENDENCIES: > Stage-1 is a root stage > Stage-0 depends on stages: Stage-1 > STAGE PLANS: > Stage: Stage-1 > Tez > ... > Edges: > Reducer 2 <- Map 1 (SIMPLE_EDGE) > Reducer 3 <- Reducer 2 (SIMPLE_EDGE) > ... 
> Vertices: > Map 1 > Map Operator Tree: > TableScan > alias: alltypesorc > Statistics: Num rows: 12288 Data size: 36696 Basic stats: > COMPLETE Column stats: COMPLETE > Select Operator > expressions: cint (type: int) > outputColumnNames: cint > Statistics: Num rows: 12288 Data size: 36696 Basic stats: > COMPLETE Column stats: COMPLETE > Group By Operator > keys: cint (type: int) > mode: hash > outputColumnNames: _col0 > Statistics: Num rows: 5775 Data size: 17248 Basic > stats: COMPLETE Column stats: COMPLETE > Reduce Output Operator > key expressions: _col0 (type: int) > sort order: + > Map-reduce partition columns: _col0 (type: int) > Statistics: Num rows: 5775 Data size: 17248 Basic > stats: COMPLETE Column stats: COMPLETE > Execution mode: vectorized, llap > LLAP IO: all inputs > Map Vectorization: > enabled: true > enabledConditionsMet: > hive.vectorized.use.vectorized.input.format IS true > groupByVectorOutput: true > inputFileFormats: > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat > allNative: false > usesVectorUDFAdaptor: false > vectorized: true > Reducer 2 > Execution mode: vectorized, llap > Reduce Vectorization: > enabled: true > en
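A hypothetical invocation sketch (the exact query behind the plan above is not shown in this message; the query below is illustrative only, matching the column and table names that appear in the plan):

{code:sql}
-- SUMMARY is the default level, so these two are equivalent:
EXPLAIN VECTORIZATION SELECT cint FROM alltypesorc GROUP BY cint;
EXPLAIN VECTORIZATION SUMMARY SELECT cint FROM alltypesorc GROUP BY cint;

-- Most verbose form, suppressing most non-vectorization plan elements:
EXPLAIN VECTORIZATION ONLY DETAIL SELECT cint FROM alltypesorc GROUP BY cint;
{code}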
[jira] [Commented] (HIVE-11394) Enhance EXPLAIN display for vectorization
[ https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15568740#comment-15568740 ] Matt McCline commented on HIVE-11394: - Patch #91 is Hive QA #1512?
[jira] [Updated] (HIVE-11394) Enhance EXPLAIN display for vectorization
[ https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-11394: Status: In Progress (was: Patch Available)
[jira] [Commented] (HIVE-13046) DependencyResolver should not lowercase the dependency URI's authority
[ https://issues.apache.org/jira/browse/HIVE-13046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15568823#comment-15568823 ] Hive QA commented on HIVE-13046: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12832794/HIVE-13046.2.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 10636 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver[hbase_bulk] {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1499/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1499/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1499/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12832794 - PreCommit-HIVE-Build > DependencyResolver should not lowercase the dependency URI's authority > -- > > Key: HIVE-13046 > URL: https://issues.apache.org/jira/browse/HIVE-13046 > Project: Hive > Issue Type: Bug >Reporter: Anthony Hsu >Assignee: Anthony Hsu > Attachments: HIVE-13046.1.patch, HIVE-13046.2.patch > > > When using {{ADD JAR ivy://...}} to add a jar version {{1.2.3-SNAPSHOT}}, > Hive will lowercase it to {{1.2.3-snapshot}} due to: > {code:title=DependencyResolver.java#84} > String[] authorityTokens = authority.toLowerCase().split(":"); > {code} > We should not {{.lowerCase()}}. > RB: https://reviews.apache.org/r/43513 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
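A hedged sketch of the failure mode described in this issue (the ivy coordinates below are made up for illustration):

{code:sql}
-- The version segment of the URI authority is case-sensitive:
ADD JAR ivy://org.example:mylib:1.2.3-SNAPSHOT;
-- Before this fix, DependencyResolver lowercased the whole authority, so Hive
-- tried to resolve org.example:mylib:1.2.3-snapshot, which does not exist.
{code}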
[jira] [Commented] (HIVE-14640) handle hive.merge.*files in select queries
[ https://issues.apache.org/jira/browse/HIVE-14640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15568827#comment-15568827 ] Hive QA commented on HIVE-14640: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12832789/HIVE-14640.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1500/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1500/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1500/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N' 2016-10-12 14:08:13.340 + [[ -n /usr/java/jdk1.8.0_25 ]] + export JAVA_HOME=/usr/java/jdk1.8.0_25 + JAVA_HOME=/usr/java/jdk1.8.0_25 + export PATH=/usr/java/jdk1.8.0_25/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + PATH=/usr/java/jdk1.8.0_25/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-Build-1500/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! 
-d apache-github-source-source ]] + date '+%Y-%m-%d %T.%3N' 2016-10-12 14:08:13.343 + cd apache-github-source-source + git fetch origin + git reset --hard HEAD HEAD is now at 1f258e9 HIVE-14877: Move slow CliDriver tests to MiniLlap (Prasanth Jayachandran reviewed by Siddharth Seth) + git clean -f -d + git checkout master Already on 'master' Your branch is up-to-date with 'origin/master'. + git reset --hard origin/master HEAD is now at 1f258e9 HIVE-14877: Move slow CliDriver tests to MiniLlap (Prasanth Jayachandran reviewed by Siddharth Seth) + git merge --ff-only origin/master Already up-to-date. + date '+%Y-%m-%d %T.%3N' 2016-10-12 14:08:14.396 + patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch error: patch failed: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java:3123 error: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java: patch does not apply error: patch failed: ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java:253 error: ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java: patch does not apply error: patch failed: ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java:98 error: ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java: patch does not apply error: patch failed: ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java:315 error: ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java: patch does not apply error: patch failed: ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java:1420 error: ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java: patch does not apply error: patch failed: ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java:1256 
error: ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java: patch does not apply error: patch failed: ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java:1676 error: ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java: patch does not apply error: patch failed: ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java:304 error: ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java: patch does not apply error: patch failed: ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:6575 error: ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java: patch does not apply error: patch failed: ql/src/java/org/apache/hadoop/hive/ql/plan/FileSinkDesc.java:255 error: ql/src/java/org/apache/hadoop/hive/ql/plan/FileSinkDesc.java: patch does not apply error: ql/src/test/queries/clientpositive/mm_all.q: No such file or directory error: ql/src/test/queries/clientpositive/mm_current.q: No such file or directory error: ql/src/test/results/clientpositive/llap/mm_all.q.out: No such file or
[jira] [Commented] (HIVE-13280) Error when more than 1 mapper for HBase storage handler
[ https://issues.apache.org/jira/browse/HIVE-13280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15568866#comment-15568866 ] Vsevolod Ostapenko commented on HIVE-13280: --- Hive-Hbase integration documentation (https://cwiki.apache.org/confluence/display/Hive/HBaseIntegration) claims that hbase.mapred.output.outputtable property is optional, and provides no good explanation under what circumstances one would want or need to define it. In all the provided samples values of hbase.mapred.output.outputtable and hbase.table.name are the same, so samples are not helpful and not self-explanatory. If TEZ does require hbase.mapred.output.outputtable property to be explicitly set, documentation needs to be updated to indicate that fact. Also, it would be helpful to provide some background why this property exists in the first place. > Error when more than 1 mapper for HBase storage handler > --- > > Key: HIVE-13280 > URL: https://issues.apache.org/jira/browse/HIVE-13280 > Project: Hive > Issue Type: Bug > Components: HBase Handler >Affects Versions: 2.0.0 >Reporter: Damien Carol >Assignee: Damien Carol > > With a simple query (select from orc table and insert into HBase external > table): > {code:sql} > insert into table register.register select * from aa_temp > {code} > The aa_temp table has 45 ORC files. It generates 45 mappers. 
> Some mappers fail with this error: > {noformat} > Caused by: java.lang.IllegalArgumentException: Must specify table name > at > org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:188) > at > org.apache.hive.common.util.ReflectionUtil.setConf(ReflectionUtil.java:101) > at > org.apache.hive.common.util.ReflectionUtil.newInstance(ReflectionUtil.java:87) > at > org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:300) > at > org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:290) > at > org.apache.hadoop.hive.ql.exec.FileSinkOperator.createHiveOutputFormat(FileSinkOperator.java:1126) > ... 25 more > ]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 > killedTasks:35, Vertex vertex_1457964631631_0015_3_00 [Map 1] killed/failed > due to:OWN_TASK_FAILURE]DAG did not succeed due to VERTEX_FAILURE. > failedVertices:1 killedVertices:0 (state=08S01,code=2) > {noformat} > If I do an ALTER CONCATENATE for aa_temp and redo the query, everything is > fine because there is only one mapper. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
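As a hedged illustration of the workaround discussed in this issue (the column schema and mapping below are hypothetical, not taken from the report), setting hbase.mapred.output.outputtable explicitly when declaring the HBase-backed table avoids the "Must specify table name" failure:

{code:sql}
CREATE EXTERNAL TABLE register.register (key string, value string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,cf:value')
TBLPROPERTIES (
  'hbase.table.name' = 'register',
  'hbase.mapred.output.outputtable' = 'register'
);
{code}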
[jira] [Updated] (HIVE-13798) Fix the unit test failure org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload
[ https://issues.apache.org/jira/browse/HIVE-13798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-13798: Fix Version/s: 2.1.1 > Fix the unit test failure > org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload > > > Key: HIVE-13798 > URL: https://issues.apache.org/jira/browse/HIVE-13798 > Project: Hive > Issue Type: Sub-task >Reporter: Aihua Xu >Assignee: Aihua Xu > Fix For: 2.2.0, 2.1.1 > > Attachments: HIVE-13798.2.patch, HIVE-13798.3.patch, > HIVE-13798.4.patch, HIVE-13798.addendum.patch, HIVE-13798.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13798) Fix the unit test failure org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload
[ https://issues.apache.org/jira/browse/HIVE-13798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15568925#comment-15568925 ] Aihua Xu commented on HIVE-13798: - Thanks [~leftylev] for catching it. Yeah. The addendum is needed. > Fix the unit test failure > org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload > > > Key: HIVE-13798 > URL: https://issues.apache.org/jira/browse/HIVE-13798 > Project: Hive > Issue Type: Sub-task >Reporter: Aihua Xu >Assignee: Aihua Xu > Fix For: 2.2.0, 2.1.1 > > Attachments: HIVE-13798.2.patch, HIVE-13798.3.patch, > HIVE-13798.4.patch, HIVE-13798.addendum.patch, HIVE-13798.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14928) Analyze table no scan mess up schema
[ https://issues.apache.org/jira/browse/HIVE-14928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HIVE-14928: --- Status: Open (was: Patch Available) > Analyze table no scan mess up schema > > > Key: HIVE-14928 > URL: https://issues.apache.org/jira/browse/HIVE-14928 > Project: Hive > Issue Type: Bug >Reporter: Jimmy Xiang >Assignee: Jimmy Xiang > Attachments: HIVE-14928.1.patch, HIVE-14928.2.patch > > > StatsNoJobTask uses static variables partUpdates and table to track stats > changes. If multiple analyze no scan tasks run at the same time, then > table/partition schema could mess up. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14928) Analyze table no scan mess up schema
[ https://issues.apache.org/jira/browse/HIVE-14928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HIVE-14928: --- Attachment: HIVE-14928.2.patch Rebased the patch to the latest master. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14928) Analyze table no scan mess up schema
[ https://issues.apache.org/jira/browse/HIVE-14928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HIVE-14928: --- Status: Patch Available (was: Open) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14930) RuntimeException was seen in explainanalyze_3.q test log
[ https://issues.apache.org/jira/browse/HIVE-14930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-14930: --- Resolution: Fixed Fix Version/s: 2.2.0 Status: Resolved (was: Patch Available) Committed to 2.2.0, thanks [~pxiong] for review. > RuntimeException was seen in explainanalyze_3.q test log > > > Key: HIVE-14930 > URL: https://issues.apache.org/jira/browse/HIVE-14930 > Project: Hive > Issue Type: Bug >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang >Priority: Minor > Fix For: 2.2.0 > > Attachments: HIVE-14930.patch > > > When working on HIVE-14799, I noticed there were some RuntimeExceptions when > running explainanalyze_3.q and explainanalyze_5.q, though these tests showed > as successful. > {code} > 2016-10-10T19:02:48,455 ERROR [aa5c6743-b5de-40fc-82da-5dde0e6b387f main] > ql.Driver: FAILED: Hive Internal Error: java.lang.RuntimeException(Cannot > overwrite read-only table: src) > java.lang.RuntimeException: Cannot overwrite read-only table: src > at > org.apache.hadoop.hive.ql.hooks.EnforceReadOnlyTables.run(EnforceReadOnlyTables.java:74) > at > org.apache.hadoop.hive.ql.hooks.EnforceReadOnlyTables.run(EnforceReadOnlyTables.java:56) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1736) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1505) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1218) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1208) > at > org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:106) > at > org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:251) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:504) > at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1298) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1436) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1218) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1208) > at > 
org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184) > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:400) > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:336) > at > org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1319) > at > org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:1293) > at > org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:173) > at > org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:104) > at > org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver(TestMiniTezCliDriver.java:59) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.apache.hadoop.hive.cli.control.CliAdapter$2$1.evaluate(CliAdapter.java:92) > at org.junit.rules.RunRules.evaluate(RunRules.java:20) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236) > at 
org.junit.runners.ParentRunner.access$000(ParentRunner.java:53) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229) > at org.junit.runners.ParentRunner.run(ParentRunner.java:309) > at org.junit.runners.Suite.runChild(Suite.java:127) > at org.junit.runners.Suite.runChild(Suite.java:26) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.
[jira] [Commented] (HIVE-13046) DependencyResolver should not lowercase the dependency URI's authority
[ https://issues.apache.org/jira/browse/HIVE-13046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15568973#comment-15568973 ] Hive QA commented on HIVE-13046: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12832794/HIVE-13046.2.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:green}SUCCESS:{color} +1 due to 10636 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1501/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1501/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1501/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12832794 - PreCommit-HIVE-Build -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14476) Fix logging issue for branch-1
[ https://issues.apache.org/jira/browse/HIVE-14476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15568982#comment-15568982 ] Hive QA commented on HIVE-14476: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12832801/HIVE-14476.1-branch-1.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1502/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1502/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1502/ Messages: {noformat} This message was trimmed, see log for full details [INFO] [INFO] [INFO] Building Hive Query Language 1.3.0-SNAPSHOT [INFO] [INFO] [INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hive-exec --- [INFO] Deleting /data/hive-ptest/working/apache-github-source-source/ql/target [INFO] Deleting /data/hive-ptest/working/apache-github-source-source/ql (includes = [datanucleus.log, derby.log], excludes = []) [INFO] [INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-no-snapshots) @ hive-exec --- [INFO] [INFO] --- maven-antrun-plugin:1.7:run (generate-sources) @ hive-exec --- [INFO] Executing tasks main: [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/ql/target/generated-sources/java/org/apache/hadoop/hive/ql/exec/vector/expressions/gen [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/ql/target/generated-sources/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/ql/target/generated-test-sources/java/org/apache/hadoop/hive/ql/exec/vector/expressions/gen Generating vector expression code Generating vector expression test code [INFO] Executed tasks [INFO] [INFO] --- build-helper-maven-plugin:1.8:add-source (add-source) @ hive-exec --- [INFO] Source 
directory: /data/hive-ptest/working/apache-github-source-source/ql/src/gen/protobuf/gen-java added. [INFO] Source directory: /data/hive-ptest/working/apache-github-source-source/ql/src/gen/thrift/gen-javabean added. [INFO] Source directory: /data/hive-ptest/working/apache-github-source-source/ql/target/generated-sources/java added. [INFO] [INFO] --- antlr3-maven-plugin:3.4:antlr (default) @ hive-exec --- [INFO] ANTLR: Processing source directory /data/hive-ptest/working/apache-github-source-source/ql/src/java ANTLR Parser Generator Version 3.4 org/apache/hadoop/hive/ql/parse/HiveLexer.g org/apache/hadoop/hive/ql/parse/HiveParser.g warning(200): IdentifiersParser.g:455:5: Decision can match input such as "{KW_REGEXP, KW_RLIKE} KW_ORDER KW_BY" using multiple alternatives: 2, 9 As a result, alternative(s) 9 were disabled for that input warning(200): IdentifiersParser.g:455:5: Decision can match input such as "{KW_REGEXP, KW_RLIKE} KW_UNION KW_FROM" using multiple alternatives: 2, 9 As a result, alternative(s) 9 were disabled for that input warning(200): IdentifiersParser.g:455:5: Decision can match input such as "{KW_REGEXP, KW_RLIKE} KW_UNION KW_ALL" using multiple alternatives: 2, 9 As a result, alternative(s) 9 were disabled for that input warning(200): IdentifiersParser.g:455:5: Decision can match input such as "{KW_REGEXP, KW_RLIKE} KW_SORT KW_BY" using multiple alternatives: 2, 9 As a result, alternative(s) 9 were disabled for that input warning(200): IdentifiersParser.g:455:5: Decision can match input such as "{KW_REGEXP, KW_RLIKE} KW_INSERT KW_INTO" using multiple alternatives: 2, 9 As a result, alternative(s) 9 were disabled for that input warning(200): IdentifiersParser.g:455:5: Decision can match input such as "{KW_REGEXP, KW_RLIKE} KW_UNION KW_SELECT" using multiple alternatives: 2, 9 As a result, alternative(s) 9 were disabled for that input warning(200): IdentifiersParser.g:455:5: Decision can match input such as "{KW_REGEXP, KW_RLIKE} KW_INSERT 
KW_OVERWRITE" using multiple alternatives: 2, 9 As a result, alternative(s) 9 were disabled for that input warning(200): IdentifiersParser.g:455:5: Decision can match input such as "{KW_REGEXP, KW_RLIKE} KW_UNION KW_MAP" using multiple alternatives: 2, 9 As a result, alternative(s) 9 were disabled for that input warning(200): IdentifiersParser.g:455:5: Decision can match input such as "{KW_REGEXP, KW_RLIKE} KW_MAP LPAREN" using multiple alternatives: 2, 9 As a result, alternative(s) 9 were disabled for that input warning(200): IdentifiersParser.g:455:5: Decision can match input such as "{KW_REGEXP, KW_RLIKE} KW_GROUP KW_BY" using multiple alternatives: 2, 9 As a re
[jira] [Comment Edited] (HIVE-13280) Error when more than 1 mapper for HBase storage handler
[ https://issues.apache.org/jira/browse/HIVE-13280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15568866#comment-15568866 ] Vsevolod Ostapenko edited comment on HIVE-13280 at 10/12/16 3:08 PM: - The Hive-HBase integration documentation (https://cwiki.apache.org/confluence/display/Hive/HBaseIntegration) states that the hbase.mapred.output.outputtable property is optional and needed only when one wants to insert into a table. The latter statement is obviously incorrect, as prior to Feb 26, 2016, this property wasn't even documented and inserts into HBase-backed tables were working just fine with the MR engine. If Tez does require the hbase.mapred.output.outputtable property to be explicitly set, the documentation needs to be updated to indicate that fact. One more thing: all the existing samples have hbase.mapred.output.outputtable and hbase.table.name set to the same value. If there is no use case where they differ, why is the former even needed? was (Author: seva_ostapenko): The Hive-HBase integration documentation (https://cwiki.apache.org/confluence/display/Hive/HBaseIntegration) claims that the hbase.mapred.output.outputtable property is optional, and provides no good explanation of the circumstances under which one would want or need to define it. In all the provided samples the values of hbase.mapred.output.outputtable and hbase.table.name are the same, so the samples are not helpful and not self-explanatory. If Tez does require the hbase.mapred.output.outputtable property to be explicitly set, the documentation needs to be updated to indicate that fact. Also, it would be helpful to provide some background on why this property exists in the first place. 
> Error when more than 1 mapper for HBase storage handler > --- > > Key: HIVE-13280 > URL: https://issues.apache.org/jira/browse/HIVE-13280 > Project: Hive > Issue Type: Bug > Components: HBase Handler >Affects Versions: 2.0.0 >Reporter: Damien Carol >Assignee: Damien Carol > > With a simple query (select from an ORC table and insert into an HBase external > table): > {code:sql} > insert into table register.register select * from aa_temp > {code} > The aa_temp table has 45 ORC files. It generates 45 mappers. > Some mappers fail with this error: > {noformat} > Caused by: java.lang.IllegalArgumentException: Must specify table name > at > org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:188) > at > org.apache.hive.common.util.ReflectionUtil.setConf(ReflectionUtil.java:101) > at > org.apache.hive.common.util.ReflectionUtil.newInstance(ReflectionUtil.java:87) > at > org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:300) > at > org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:290) > at > org.apache.hadoop.hive.ql.exec.FileSinkOperator.createHiveOutputFormat(FileSinkOperator.java:1126) > ... 25 more > ]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 > killedTasks:35, Vertex vertex_1457964631631_0015_3_00 [Map 1] killed/failed > due to:OWN_TASK_FAILURE]DAG did not succeed due to VERTEX_FAILURE. > failedVertices:1 killedVertices:0 (state=08S01,code=2) > {noformat} > If I do an ALTER CONCATENATE on aa_temp and redo the query, everything is > fine because there is only one mapper. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
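For reference, a table definition along the lines discussed in this issue would set both properties explicitly; the schema and column mapping below are illustrative, not taken from the reporter's environment:

```sql
CREATE EXTERNAL TABLE register.register (key STRING, val STRING)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,cf:val')
TBLPROPERTIES (
  'hbase.table.name' = 'register',               -- HBase table Hive reads from
  'hbase.mapred.output.outputtable' = 'register' -- HBase table writes go to; needed for inserts
);
```

With hbase.mapred.output.outputtable unset, TableOutputFormat.setConf has no table name to configure, which matches the "Must specify table name" failure above.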
[jira] [Updated] (HIVE-14922) Add perf logging for post job completion steps
[ https://issues.apache.org/jira/browse/HIVE-14922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-14922: Resolution: Fixed Fix Version/s: 2.2.0 Status: Resolved (was: Patch Available) Pushed to master. Failed test passed when re-run locally with patch. > Add perf logging for post job completion steps > --- > > Key: HIVE-14922 > URL: https://issues.apache.org/jira/browse/HIVE-14922 > Project: Hive > Issue Type: Task > Components: Logging >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Fix For: 2.2.0 > > Attachments: HIVE-14922.patch > > > Mostly FS related operations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14916) Reduce the memory requirements for Spark tests
[ https://issues.apache.org/jira/browse/HIVE-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569113#comment-15569113 ] Hive QA commented on HIVE-14916: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12832814/HIVE-14916.003.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:green}SUCCESS:{color} +1 due to 10636 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1503/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1503/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1503/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12832814 - PreCommit-HIVE-Build > Reduce the memory requirements for Spark tests > -- > > Key: HIVE-14916 > URL: https://issues.apache.org/jira/browse/HIVE-14916 > Project: Hive > Issue Type: Sub-task >Reporter: Ferdinand Xu >Assignee: Dapeng Sun > Attachments: HIVE-14916.001.patch, HIVE-14916.002.patch, > HIVE-14916.003.patch > > > As HIVE-14887, we need to reduce the memory requirements for Spark tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14913) Add new unit tests
[ https://issues.apache.org/jira/browse/HIVE-14913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569135#comment-15569135 ] Ashutosh Chauhan commented on HIVE-14913: - [~vgarg] Can you add RB link for this? > Add new unit tests > -- > > Key: HIVE-14913 > URL: https://issues.apache.org/jira/browse/HIVE-14913 > Project: Hive > Issue Type: Task > Components: Tests >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-14913.1.patch, HIVE-14913.2.patch, > HIVE-14913.3.patch, HIVE-14913.4.patch > > > Moving bunch of tests from system test to hive unit tests to reduce testing > overhead -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11957) SHOW TRANSACTIONS should show queryID/agent id of the creator
[ https://issues.apache.org/jira/browse/HIVE-11957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569142#comment-15569142 ] Wei Zheng commented on HIVE-11957: -- [~ekoifman] Can you take a look? SHOW TRANSACTIONS now outputs like this: {code}
hive> show transactions;
OK
Transaction ID  Transaction State  Started Time                  Last Heartbeat Time           User    Hostname
16              OPEN               Mon Oct 10 11:26:14 PDT 2016  Mon Oct 10 11:26:14 PDT 2016  wzheng  weimac.local
Time taken: 0.028 seconds, Fetched: 2 row(s)
{code} > SHOW TRANSACTIONS should show queryID/agent id of the creator > - > > Key: HIVE-11957 > URL: https://issues.apache.org/jira/browse/HIVE-11957 > Project: Hive > Issue Type: Improvement > Components: Transactions >Affects Versions: 1.0.0 >Reporter: Eugene Koifman >Assignee: Wei Zheng > Attachments: HIVE-11957.1.patch, HIVE-11957.2.patch, > HIVE-11957.3.patch, HIVE-11957.4.patch, HIVE-11957.5.patch > > > this would be very useful for debugging > should also include heartbeat/create timestamps > would be nice to support some filtering/sorting options, like sort by create > time, agent id. filter by table, database, etc -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-14835) Improve ptest2 build time
[ https://issues.apache.org/jira/browse/HIVE-14835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569154#comment-15569154 ] Siddharth Seth edited comment on HIVE-14835 at 10/12/16 4:20 PM: - [~prasanth_j] - did this go in again? Can the jira be closed? was (Author: sseth): [~prasanth_j] - did this go in again? > Improve ptest2 build time > - > > Key: HIVE-14835 > URL: https://issues.apache.org/jira/browse/HIVE-14835 > Project: Hive > Issue Type: Sub-task > Components: Testing Infrastructure >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Fix For: 2.2.0 > > Attachments: HIVE-14835.1.patch > > > NO PRECOMMIT TESTS > 2 things can be improved > 1) ptest2 always downloads jars for compiling its own directory, which takes > about 1m30s but should take only 5s with cached jars. The reason for that is > maven.repo.local is pointing to a path under WORKSPACE which will be cleaned > by jenkins for every run. > 2) For the hive build we can make use of a parallel build and quiet the output of > the build, which should shave off another 15-30s. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14835) Improve ptest2 build time
[ https://issues.apache.org/jira/browse/HIVE-14835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569154#comment-15569154 ] Siddharth Seth commented on HIVE-14835: --- [~prasanth_j] - did this go in again? > Improve ptest2 build time > - > > Key: HIVE-14835 > URL: https://issues.apache.org/jira/browse/HIVE-14835 > Project: Hive > Issue Type: Sub-task > Components: Testing Infrastructure >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Fix For: 2.2.0 > > Attachments: HIVE-14835.1.patch > > > NO PRECOMMIT TESTS > 2 things can be improved > 1) ptest2 always downloads jars for compiling its own directory, which takes > about 1m30s but should take only 5s with cached jars. The reason for that is > maven.repo.local is pointing to a path under WORKSPACE which will be cleaned > by jenkins for every run. > 2) For the hive build we can make use of a parallel build and quiet the output of > the build, which should shave off another 15-30s. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-14539) Run additional tests from the module directory
[ https://issues.apache.org/jira/browse/HIVE-14539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth resolved HIVE-14539. --- Resolution: Done As part of HIVE-14540 > Run additional tests from the module directory > -- > > Key: HIVE-14539 > URL: https://issues.apache.org/jira/browse/HIVE-14539 > Project: Hive > Issue Type: Sub-task >Reporter: Siddharth Seth >Assignee: Siddharth Seth > > There's still close to 400 tests which run from the wrong directory (and end > up checking for file changes on more modules than required) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-14827) Micro benchmark for Parquet vectorized reader
[ https://issues.apache.org/jira/browse/HIVE-14827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sahil Takiar reassigned HIVE-14827: --- Assignee: Sahil Takiar > Micro benchmark for Parquet vectorized reader > - > > Key: HIVE-14827 > URL: https://issues.apache.org/jira/browse/HIVE-14827 > Project: Hive > Issue Type: Sub-task >Reporter: Ferdinand Xu >Assignee: Sahil Takiar > > We need a microbenchmark to evaluate the throughput and execution time for > Parquet vectorized reader. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14938) Add deployed ptest properties file to repo, update to remove isolated tests
[ https://issues.apache.org/jira/browse/HIVE-14938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated HIVE-14938: -- Attachment: HIVE-14938.part1.patch Initial config file - the existing one being used. > Add deployed ptest properties file to repo, update to remove isolated tests > --- > > Key: HIVE-14938 > URL: https://issues.apache.org/jira/browse/HIVE-14938 > Project: Hive > Issue Type: Task >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-14938.part1.patch > > > The intent is to checkin the original file, and then modify it to remove > isolated tests (and move relevant ones to the skipBatching list), which > normally lead to stragglers, and sub-optimal resource utilization. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14938) Add deployed ptest properties file to repo, update to remove isolated tests
[ https://issues.apache.org/jira/browse/HIVE-14938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated HIVE-14938: -- Attachment: HIVE-14938.part2.patch Revision on top of the first patch with changes to remove isolation, add batching for spark tests and encryptedhdfs tests, skipBatching for others. This includes changes made by [~prasanth_j] and me for internal runs, to improve the runtimes. [~prasanth_j], [~spena] - could you please take a look for sanity, before I commit these changes, and update the deployed ptest instance. > Add deployed ptest properties file to repo, update to remove isolated tests > --- > > Key: HIVE-14938 > URL: https://issues.apache.org/jira/browse/HIVE-14938 > Project: Hive > Issue Type: Task >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-14938.part1.patch, HIVE-14938.part2.patch > > > The intent is to checkin the original file, and then modify it to remove > isolated tests (and move relevant ones to the skipBatching list), which > normally lead to stragglers, and sub-optimal resource utilization. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14799) Query operation are not thread safe during its cancellation
[ https://issues.apache.org/jira/browse/HIVE-14799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569254#comment-15569254 ] Hive QA commented on HIVE-14799: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12832818/HIVE-14799.6.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 10636 tests executed *Failed tests:* {noformat} org.apache.hive.service.cli.TestEmbeddedThriftBinaryCLIService.testTaskStatus org.apache.hive.spark.client.TestSparkClient.testJobSubmission {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1504/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1504/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1504/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12832818 - PreCommit-HIVE-Build > Query operation are not thread safe during its cancellation > --- > > Key: HIVE-14799 > URL: https://issues.apache.org/jira/browse/HIVE-14799 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-14799.1.patch, HIVE-14799.2.patch, > HIVE-14799.3.patch, HIVE-14799.4.patch, HIVE-14799.5.patch, > HIVE-14799.5.patch, HIVE-14799.6.patch, HIVE-14799.patch > > > When a query is cancelled either via Beeline (Ctrl-C) or API call > TCLIService.Client.CancelOperation, SQLOperation.cancel is invoked in a > different thread from that running the query to close/destroy its > encapsulated Driver object. 
Neither SQLOperation nor Driver is thread-safe, > which can sometimes result in runtime exceptions like NPEs. The errors from > the running query are not handled properly, therefore probably causing some > resources (files, locks, etc.) not to be cleaned up after the query terminates. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
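The race described above can be sketched as follows. This is a minimal illustration of serializing cancellation against execution, not the actual SQLOperation/Driver code; class and field names are made up:

```java
// Minimal sketch: guard the non-thread-safe driver state with one lock so
// cancel() from another thread never tears it down mid-use.
import java.util.concurrent.locks.ReentrantLock;

class GuardedOperation {
    private final ReentrantLock lock = new ReentrantLock();
    private boolean cancelled = false;
    private Object driver;          // stands in for the non-thread-safe Driver

    public void start() {
        lock.lock();
        try {
            if (cancelled) return;  // cancelled before the query ever started
            driver = new Object();  // create/destroy driver only under the lock
        } finally {
            lock.unlock();
        }
    }

    public void cancel() {
        lock.lock();
        try {
            cancelled = true;
            driver = null;          // safe: no other thread holds the lock
        } finally {
            lock.unlock();
        }
    }

    public boolean isCancelled() {
        lock.lock();
        try { return cancelled; } finally { lock.unlock(); }
    }
}
```

Long-running work would still execute outside the lock, periodically checking the cancelled flag, so cancellation stays responsive without exposing the driver to concurrent access.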
[jira] [Commented] (HIVE-14721) Fix TestJdbcWithMiniHS2 runtime
[ https://issues.apache.org/jira/browse/HIVE-14721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569263#comment-15569263 ] Hive QA commented on HIVE-14721: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12832823/HIVE-14721.7.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1505/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1505/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1505/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N' 2016-10-12 17:07:09.998 + [[ -n /usr/java/jdk1.8.0_25 ]] + export JAVA_HOME=/usr/java/jdk1.8.0_25 + JAVA_HOME=/usr/java/jdk1.8.0_25 + export PATH=/usr/java/jdk1.8.0_25/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + PATH=/usr/java/jdk1.8.0_25/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-Build-1505/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! 
-d apache-github-source-source ]] + date '+%Y-%m-%d %T.%3N' 2016-10-12 17:07:10.000 + cd apache-github-source-source + git fetch origin + git reset --hard HEAD HEAD is now at 04b303b HIVE-14922 : Add perf logging for post job completion steps (Ashutosh Chauhan via Pengcheng Xiong) + git clean -f -d + git checkout master Already on 'master' Your branch is up-to-date with 'origin/master'. + git reset --hard origin/master HEAD is now at 04b303b HIVE-14922 : Add perf logging for post job completion steps (Ashutosh Chauhan via Pengcheng Xiong) + git merge --ff-only origin/master Already up-to-date. + date '+%Y-%m-%d %T.%3N' 2016-10-12 17:07:11.083 + patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch error: a/itests/hive-minikdc/src/test/java/org/apache/hive/minikdc/JdbcWithMiniKdcSQLAuthTest.java: No such file or directory error: a/itests/hive-unit/src/main/java/org/apache/hive/jdbc/miniHS2/MiniHS2.java: No such file or directory error: a/itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithMiniHS2.java: No such file or directory The patch does not appear to apply with p0, p1, or p2 + exit 1 ' {noformat} This message is automatically generated. 
ATTACHMENT ID: 12832823 - PreCommit-HIVE-Build > Fix TestJdbcWithMiniHS2 runtime > --- > > Key: HIVE-14721 > URL: https://issues.apache.org/jira/browse/HIVE-14721 > Project: Hive > Issue Type: Sub-task >Affects Versions: 2.1.0 >Reporter: Vaibhav Gumashta >Assignee: Vaibhav Gumashta > Fix For: 2.2.0 > > Attachments: HIVE-14721.1.patch, HIVE-14721.2.patch, > HIVE-14721.3.patch, HIVE-14721.3.patch, HIVE-14721.3.patch, > HIVE-14721.4.patch, HIVE-14721.4.patch, HIVE-14721.5.patch, > HIVE-14721.6.patch, HIVE-14721.6.patch, HIVE-14721.6.patch, HIVE-14721.7.patch > > > Currently 450s -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14761) Remove TestJdbcWithMiniMr after merging tests with TestJdbcWithMiniHS2
[ https://issues.apache.org/jira/browse/HIVE-14761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-14761: Committed to master. Thanks [~sseth]. > Remove TestJdbcWithMiniMr after merging tests with TestJdbcWithMiniHS2 > -- > > Key: HIVE-14761 > URL: https://issues.apache.org/jira/browse/HIVE-14761 > Project: Hive > Issue Type: Sub-task >Reporter: Vaibhav Gumashta >Assignee: Vaibhav Gumashta > Fix For: 2.2.0 > > Attachments: HIVE-14761.1.patch > > > Currently 2 min 30 sec -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14761) Remove TestJdbcWithMiniMr after merging tests with TestJdbcWithMiniHS2
[ https://issues.apache.org/jira/browse/HIVE-14761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-14761: Affects Version/s: 2.1.0 > Remove TestJdbcWithMiniMr after merging tests with TestJdbcWithMiniHS2 > -- > > Key: HIVE-14761 > URL: https://issues.apache.org/jira/browse/HIVE-14761 > Project: Hive > Issue Type: Sub-task >Affects Versions: 2.1.0 >Reporter: Vaibhav Gumashta >Assignee: Vaibhav Gumashta > Fix For: 2.2.0 > > Attachments: HIVE-14761.1.patch > > > Currently 2 min 30 sec -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12458) remove identity_udf.jar from source
[ https://issues.apache.org/jira/browse/HIVE-12458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-12458: Attachment: HIVE-12458.1.patch > remove identity_udf.jar from source > --- > > Key: HIVE-12458 > URL: https://issues.apache.org/jira/browse/HIVE-12458 > Project: Hive > Issue Type: Bug > Components: Test >Affects Versions: 2.1.0 >Reporter: Thejas M Nair >Assignee: Vaibhav Gumashta > Attachments: HIVE-12458.1.patch > > > We should not be checking in jars into the source repo. > We could use hive-contrib jar like its used in > ./ql/src/test/queries/clientpositive/add_jar_pfile.q > add jar > pfile://${system:test.tmp.dir}/hive-contrib-${system:hive.version}.jar; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12458) remove identity_udf.jar from source
[ https://issues.apache.org/jira/browse/HIVE-12458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569295#comment-15569295 ] Vaibhav Gumashta commented on HIVE-12458: - [~thejas] I've removed the code that used this jar (in tests) as part of the work on improving test cases. Can you review this? > remove identity_udf.jar from source > --- > > Key: HIVE-12458 > URL: https://issues.apache.org/jira/browse/HIVE-12458 > Project: Hive > Issue Type: Bug > Components: Test >Affects Versions: 2.1.0 >Reporter: Thejas M Nair >Assignee: Vaibhav Gumashta > Attachments: HIVE-12458.1.patch > > > We should not be checking in jars into the source repo. > We could use hive-contrib jar like its used in > ./ql/src/test/queries/clientpositive/add_jar_pfile.q > add jar > pfile://${system:test.tmp.dir}/hive-contrib-${system:hive.version}.jar; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12458) remove identity_udf.jar from source
[ https://issues.apache.org/jira/browse/HIVE-12458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569312#comment-15569312 ] Thejas M Nair commented on HIVE-12458: -- +1 > remove identity_udf.jar from source > --- > > Key: HIVE-12458 > URL: https://issues.apache.org/jira/browse/HIVE-12458 > Project: Hive > Issue Type: Bug > Components: Test >Affects Versions: 2.1.0 >Reporter: Thejas M Nair >Assignee: Vaibhav Gumashta > Attachments: HIVE-12458.1.patch > > > We should not be checking in jars into the source repo. > We could use hive-contrib jar like its used in > ./ql/src/test/queries/clientpositive/add_jar_pfile.q > add jar > pfile://${system:test.tmp.dir}/hive-contrib-${system:hive.version}.jar; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14913) Add new unit tests
[ https://issues.apache.org/jira/browse/HIVE-14913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569322#comment-15569322 ] Vineet Garg commented on HIVE-14913: RB Link: https://reviews.apache.org/r/52708/ > Add new unit tests > -- > > Key: HIVE-14913 > URL: https://issues.apache.org/jira/browse/HIVE-14913 > Project: Hive > Issue Type: Task > Components: Tests >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-14913.1.patch, HIVE-14913.2.patch, > HIVE-14913.3.patch, HIVE-14913.4.patch > > > Moving bunch of tests from system test to hive unit tests to reduce testing > overhead -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14933) include argparse with LLAP scripts to support antique Python versions
[ https://issues.apache.org/jira/browse/HIVE-14933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-14933: Status: Patch Available (was: Open) > include argparse with LLAP scripts to support antique Python versions > - > > Key: HIVE-14933 > URL: https://issues.apache.org/jira/browse/HIVE-14933 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-14933.patch > > > The module is a standalone file, and it's under Python license that is > compatible with Apache. In the long term we should probably just move > LlapServiceDriver code entirely to Java, as right now it's a combination of > part-py, part-java. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14803) S3: Stats gathering for insert queries can be expensive for partitioned dataset
[ https://issues.apache.org/jira/browse/HIVE-14803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569355#comment-15569355 ] Pengcheng Xiong commented on HIVE-14803: Thanks [~sseth] for digging this out. [~rajesh.balamohan], it seems that we really have a problem in this patch. It looks like the stats are missing. In the explain plan, if the row count of the src table is 29 rather than 500, that usually means stats are missing. Could you take another look and upload a new patch? And, there is also a problem with the thread pool. People may set mv.files.thread=0; in that case, the thread pool will be null. Thanks. > S3: Stats gathering for insert queries can be expensive for partitioned > dataset > --- > > Key: HIVE-14803 > URL: https://issues.apache.org/jira/browse/HIVE-14803 > Project: Hive > Issue Type: Improvement > Components: Metastore >Affects Versions: 2.1.0 >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan >Priority: Minor > Attachments: HIVE-14803.1.patch > > > StatsTask's aggregateStats populates stats details for all partitions by > checking the file sizes which turns out to be expensive when larger number of > partitions are inserted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
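The null-thread-pool hazard mentioned above can be guarded as sketched below. The names here are illustrative, not Hive's actual code: when the configured thread count is 0 there is no pool, so callers must fall back to running the work inline instead of dereferencing null:

```java
// Illustrative guard for a configurable pool size (e.g. mv.files.thread=0):
// a size of 0 means "no pool", and the caller runs tasks sequentially.
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class StatsGather {
    static ExecutorService poolFor(int threads) {
        return threads > 0 ? Executors.newFixedThreadPool(threads) : null;
    }

    static void runAll(List<Runnable> tasks, ExecutorService pool) {
        if (pool == null) {                      // mv.files.thread=0: run inline
            for (Runnable t : tasks) t.run();
            return;
        }
        for (Runnable t : tasks) pool.submit(t); // parallel path
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```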
[jira] [Resolved] (HIVE-12458) remove identity_udf.jar from source
[ https://issues.apache.org/jira/browse/HIVE-12458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta resolved HIVE-12458. - Resolution: Fixed Committed. Thanks [~thejas] > remove identity_udf.jar from source > --- > > Key: HIVE-12458 > URL: https://issues.apache.org/jira/browse/HIVE-12458 > Project: Hive > Issue Type: Bug > Components: Test >Affects Versions: 2.1.0 >Reporter: Thejas M Nair >Assignee: Vaibhav Gumashta > Attachments: HIVE-12458.1.patch > > > We should not be checking in jars into the source repo. > We could use hive-contrib jar like its used in > ./ql/src/test/queries/clientpositive/add_jar_pfile.q > add jar > pfile://${system:test.tmp.dir}/hive-contrib-${system:hive.version}.jar; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12458) remove identity_udf.jar from source
[ https://issues.apache.org/jira/browse/HIVE-12458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-12458: Fix Version/s: 2.2.0 > remove identity_udf.jar from source > --- > > Key: HIVE-12458 > URL: https://issues.apache.org/jira/browse/HIVE-12458 > Project: Hive > Issue Type: Bug > Components: Test >Affects Versions: 2.1.0 >Reporter: Thejas M Nair >Assignee: Vaibhav Gumashta > Fix For: 2.2.0 > > Attachments: HIVE-12458.1.patch > > > We should not be checking in jars into the source repo. > We could use hive-contrib jar like its used in > ./ql/src/test/queries/clientpositive/add_jar_pfile.q > add jar > pfile://${system:test.tmp.dir}/hive-contrib-${system:hive.version}.jar; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14373) Add integration tests for hive on S3
[ https://issues.apache.org/jira/browse/HIVE-14373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569399#comment-15569399 ] Thomas Poepping commented on HIVE-14373: [~spena] I responded to your comments on RB. I would like to open a separate JIRA after the submission of this one that will change the qtests to run on Tez by default, rather than running on MR. What do you think? > Add integration tests for hive on S3 > > > Key: HIVE-14373 > URL: https://issues.apache.org/jira/browse/HIVE-14373 > Project: Hive > Issue Type: Sub-task >Reporter: Sergio Peña >Assignee: Thomas Poepping > Attachments: HIVE-14373.02.patch, HIVE-14373.03.patch, > HIVE-14373.04.patch, HIVE-14373.05.patch, HIVE-14373.patch > > > With Hive doing improvements to run on S3, it would be ideal to have better > integration testing on S3. > These S3 tests won't be able to be executed by HiveQA because it will need > Amazon credentials. We need to write suite based on ideas from the Hadoop > project where: > - an xml file is provided with S3 credentials > - a committer must run these tests manually to verify it works > - the xml file should not be part of the commit, and hiveqa should not run > these tests. > https://wiki.apache.org/hadoop/HowToContribute#Submitting_patches_against_object_stores_such_as_Amazon_S3.2C_OpenStack_Swift_and_Microsoft_Azure -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14373) Add integration tests for hive on S3
[ https://issues.apache.org/jira/browse/HIVE-14373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Poepping updated HIVE-14373: --- Status: Open (was: Patch Available) > Add integration tests for hive on S3 > > > Key: HIVE-14373 > URL: https://issues.apache.org/jira/browse/HIVE-14373 > Project: Hive > Issue Type: Sub-task >Reporter: Sergio Peña >Assignee: Thomas Poepping > Attachments: HIVE-14373.02.patch, HIVE-14373.03.patch, > HIVE-14373.04.patch, HIVE-14373.05.patch, HIVE-14373.06.patch, > HIVE-14373.patch > > > With Hive doing improvements to run on S3, it would be ideal to have better > integration testing on S3. > These S3 tests won't be able to be executed by HiveQA because it will need > Amazon credentials. We need to write suite based on ideas from the Hadoop > project where: > - an xml file is provided with S3 credentials > - a committer must run these tests manually to verify it works > - the xml file should not be part of the commit, and hiveqa should not run > these tests. > https://wiki.apache.org/hadoop/HowToContribute#Submitting_patches_against_object_stores_such_as_Amazon_S3.2C_OpenStack_Swift_and_Microsoft_Azure -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14373) Add integration tests for hive on S3
[ https://issues.apache.org/jira/browse/HIVE-14373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Poepping updated HIVE-14373: --- Status: Patch Available (was: Open) > Add integration tests for hive on S3 > > > Key: HIVE-14373 > URL: https://issues.apache.org/jira/browse/HIVE-14373 > Project: Hive > Issue Type: Sub-task >Reporter: Sergio Peña >Assignee: Thomas Poepping > Attachments: HIVE-14373.02.patch, HIVE-14373.03.patch, > HIVE-14373.04.patch, HIVE-14373.05.patch, HIVE-14373.06.patch, > HIVE-14373.patch > > > With Hive doing improvements to run on S3, it would be ideal to have better > integration testing on S3. > These S3 tests won't be able to be executed by HiveQA because it will need > Amazon credentials. We need to write suite based on ideas from the Hadoop > project where: > - an xml file is provided with S3 credentials > - a committer must run these tests manually to verify it works > - the xml file should not be part of the commit, and hiveqa should not run > these tests. > https://wiki.apache.org/hadoop/HowToContribute#Submitting_patches_against_object_stores_such_as_Amazon_S3.2C_OpenStack_Swift_and_Microsoft_Azure -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14373) Add integration tests for hive on S3
[ https://issues.apache.org/jira/browse/HIVE-14373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Poepping updated HIVE-14373: --- Attachment: HIVE-14373.06.patch Attach new patch, addressed comments from RB > Add integration tests for hive on S3 > > > Key: HIVE-14373 > URL: https://issues.apache.org/jira/browse/HIVE-14373 > Project: Hive > Issue Type: Sub-task >Reporter: Sergio Peña >Assignee: Thomas Poepping > Attachments: HIVE-14373.02.patch, HIVE-14373.03.patch, > HIVE-14373.04.patch, HIVE-14373.05.patch, HIVE-14373.06.patch, > HIVE-14373.patch > > > With Hive doing improvements to run on S3, it would be ideal to have better > integration testing on S3. > These S3 tests won't be able to be executed by HiveQA because it will need > Amazon credentials. We need to write suite based on ideas from the Hadoop > project where: > - an xml file is provided with S3 credentials > - a committer must run these tests manually to verify it works > - the xml file should not be part of the commit, and hiveqa should not run > these tests. > https://wiki.apache.org/hadoop/HowToContribute#Submitting_patches_against_object_stores_such_as_Amazon_S3.2C_OpenStack_Swift_and_Microsoft_Azure -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14835) Improve ptest2 build time
[ https://issues.apache.org/jira/browse/HIVE-14835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569418#comment-15569418 ] Prasanth Jayachandran commented on HIVE-14835: -- No. This patch is breaking ptest. Will apply it again when the queue is close to empty and will debug it further. > Improve ptest2 build time > - > > Key: HIVE-14835 > URL: https://issues.apache.org/jira/browse/HIVE-14835 > Project: Hive > Issue Type: Sub-task > Components: Testing Infrastructure >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Fix For: 2.2.0 > > Attachments: HIVE-14835.1.patch > > > NO PRECOMMIT TESTS > Two things can be improved: > 1) ptest2 always downloads jars for compiling its own directory, which takes > about 1m30s but should take only ~5s with cached jars. The reason for that is > maven.repo.local is pointing to a path under WORKSPACE, which will be cleaned > by Jenkins for every run. > 2) For the Hive build we can make use of a parallel build and quiet the build > output, which should shave off another 15-30s. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
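Both improvements are plain Maven build configuration. A sketch of the kind of invocation involved (the repository path is illustrative; the flags are standard Maven options):

```shell
# 1) Keep the local Maven repo outside the Jenkins-cleaned WORKSPACE so the
#    cached jars survive between runs (path is illustrative).
# 2) Build in parallel (-T 1C = one thread per CPU core) and quietly (-q).
mvn -q -T 1C clean install -DskipTests \
    -Dmaven.repo.local="$HOME/.m2/repository"
```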
[jira] [Commented] (HIVE-14929) Adding JDBC test for query cancellation scenario
[ https://issues.apache.org/jira/browse/HIVE-14929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569426#comment-15569426 ] Hive QA commented on HIVE-14929: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12832746/HIVE-14929.2.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:green}SUCCESS:{color} +1 due to 10640 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1506/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1506/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1506/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12832746 - PreCommit-HIVE-Build > Adding JDBC test for query cancellation scenario > > > Key: HIVE-14929 > URL: https://issues.apache.org/jira/browse/HIVE-14929 > Project: Hive > Issue Type: Test >Reporter: Deepak Jaiswal >Assignee: Deepak Jaiswal > Attachments: HIVE-14929.1.patch, HIVE-14929.2.patch > > > There is some functional testing for query cancellation using JDBC which is > missing in unit tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14929) Adding JDBC test for query cancellation scenario
[ https://issues.apache.org/jira/browse/HIVE-14929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569432#comment-15569432 ] Vaibhav Gumashta commented on HIVE-14929: - [~djaiswal] Can you submit again for QA run? There were some changes that went in {{TestJdbcDriver2}} yesterday, which brought down the running time to ~60-70s. Want to be sure the new tests don't affect that in a major way. I'll also take a look at the patch shortly. > Adding JDBC test for query cancellation scenario > > > Key: HIVE-14929 > URL: https://issues.apache.org/jira/browse/HIVE-14929 > Project: Hive > Issue Type: Test >Reporter: Deepak Jaiswal >Assignee: Deepak Jaiswal > Attachments: HIVE-14929.1.patch, HIVE-14929.2.patch > > > There is some functional testing for query cancellation using JDBC which is > missing in unit tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14929) Adding JDBC test for query cancellation scenario
[ https://issues.apache.org/jira/browse/HIVE-14929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569439#comment-15569439 ] Vaibhav Gumashta commented on HIVE-14929: - [~djaiswal] Nevermind, looks like the patch just had a fresh QA run. Please ignore my comment about rerunning. I'll take a look at the patch shortly. > Adding JDBC test for query cancellation scenario > > > Key: HIVE-14929 > URL: https://issues.apache.org/jira/browse/HIVE-14929 > Project: Hive > Issue Type: Test >Reporter: Deepak Jaiswal >Assignee: Deepak Jaiswal > Attachments: HIVE-14929.1.patch, HIVE-14929.2.patch > > > There is some functional testing for query cancellation using JDBC which is > missing in unit tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14929) Adding JDBC test for query cancellation scenario
[ https://issues.apache.org/jira/browse/HIVE-14929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569441#comment-15569441 ] Deepak Jaiswal commented on HIVE-14929: --- Sure. I will refresh my code and try that. > Adding JDBC test for query cancellation scenario > > > Key: HIVE-14929 > URL: https://issues.apache.org/jira/browse/HIVE-14929 > Project: Hive > Issue Type: Test >Reporter: Deepak Jaiswal >Assignee: Deepak Jaiswal > Attachments: HIVE-14929.1.patch, HIVE-14929.2.patch > > > There is some functional testing for query cancellation using JDBC which is > missing in unit tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14913) Add new unit tests
[ https://issues.apache.org/jira/browse/HIVE-14913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-14913: --- Status: Open (was: Patch Available) > Add new unit tests > -- > > Key: HIVE-14913 > URL: https://issues.apache.org/jira/browse/HIVE-14913 > Project: Hive > Issue Type: Task > Components: Tests >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-14913.1.patch, HIVE-14913.2.patch, > HIVE-14913.3.patch, HIVE-14913.4.patch > > > Moving bunch of tests from system test to hive unit tests to reduce testing > overhead -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14373) Add integration tests for hive on S3
[ https://issues.apache.org/jira/browse/HIVE-14373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569450#comment-15569450 ] Thomas Poepping commented on HIVE-14373: Have two +1s on RB. Awaiting precommit tests, then patch should be good to go > Add integration tests for hive on S3 > > > Key: HIVE-14373 > URL: https://issues.apache.org/jira/browse/HIVE-14373 > Project: Hive > Issue Type: Sub-task >Reporter: Sergio Peña >Assignee: Thomas Poepping > Attachments: HIVE-14373.02.patch, HIVE-14373.03.patch, > HIVE-14373.04.patch, HIVE-14373.05.patch, HIVE-14373.06.patch, > HIVE-14373.patch > > > With Hive doing improvements to run on S3, it would be ideal to have better > integration testing on S3. > These S3 tests won't be able to be executed by HiveQA because it will need > Amazon credentials. We need to write suite based on ideas from the Hadoop > project where: > - an xml file is provided with S3 credentials > - a committer must run these tests manually to verify it works > - the xml file should not be part of the commit, and hiveqa should not run > these tests. > https://wiki.apache.org/hadoop/HowToContribute#Submitting_patches_against_object_stores_such_as_Amazon_S3.2C_OpenStack_Swift_and_Microsoft_Azure -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14913) Add new unit tests
[ https://issues.apache.org/jira/browse/HIVE-14913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-14913: --- Status: Patch Available (was: Open) > Add new unit tests > -- > > Key: HIVE-14913 > URL: https://issues.apache.org/jira/browse/HIVE-14913 > Project: Hive > Issue Type: Task > Components: Tests >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-14913.1.patch, HIVE-14913.2.patch, > HIVE-14913.3.patch, HIVE-14913.4.patch > > > Moving bunch of tests from system test to hive unit tests to reduce testing > overhead -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14929) Adding JDBC test for query cancellation scenario
[ https://issues.apache.org/jira/browse/HIVE-14929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569473#comment-15569473 ] Vaibhav Gumashta commented on HIVE-14929: - Patch looks good. I just saw the latest test report and it doesn't add any overhead. +1 from my side. > Adding JDBC test for query cancellation scenario > > > Key: HIVE-14929 > URL: https://issues.apache.org/jira/browse/HIVE-14929 > Project: Hive > Issue Type: Test >Reporter: Deepak Jaiswal >Assignee: Deepak Jaiswal > Attachments: HIVE-14929.1.patch, HIVE-14929.2.patch > > > There is some functional testing for query cancellation using JDBC which is > missing in unit tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14929) Adding JDBC test for query cancellation scenario
[ https://issues.apache.org/jira/browse/HIVE-14929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569478#comment-15569478 ] Deepak Jaiswal commented on HIVE-14929: --- Thanks Vaibhav. > Adding JDBC test for query cancellation scenario > > > Key: HIVE-14929 > URL: https://issues.apache.org/jira/browse/HIVE-14929 > Project: Hive > Issue Type: Test >Reporter: Deepak Jaiswal >Assignee: Deepak Jaiswal > Attachments: HIVE-14929.1.patch, HIVE-14929.2.patch > > > There is some functional testing for query cancellation using JDBC which is > missing in unit tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14906) HMS should support an API to get consistent atomic snapshot associated with a Notification ID.
[ https://issues.apache.org/jira/browse/HIVE-14906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569504#comment-15569504 ] Sravya Tirukkovalur commented on HIVE-14906: Seems like if we do the following, we should be able to support an atomic getSnapshot() API: - Set the transaction level to "repeatable-read", so that all reads within a transaction would be from a single generation point. In other words, concurrent writes would not affect the state of the read. - Make all the reads of the snapshot-building function part of the same transaction. > HMS should support an API to get consistent atomic snapshot associated with a > Notification ID. > -- > > Key: HIVE-14906 > URL: https://issues.apache.org/jira/browse/HIVE-14906 > Project: Hive > Issue Type: Improvement >Reporter: Sravya Tirukkovalur > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
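The property being relied on here is that, once a transaction is reading from a single generation point, a concurrent writer cannot change what the reader sees. A minimal, self-contained illustration of that behavior, using SQLite in WAL mode as a stand-in for the metastore database (WAL read transactions get a stable snapshot, which is even stronger than repeatable read; table names are illustrative):

```python
import os
import sqlite3
import tempfile

# Stand-in for the metastore DB: SQLite in WAL mode pins a snapshot at the
# first read of a transaction, so a concurrent writer cannot change what the
# snapshot-building reads observe.
path = os.path.join(tempfile.mkdtemp(), "metastore.db")

setup = sqlite3.connect(path)
setup.execute("PRAGMA journal_mode=WAL")
setup.execute("CREATE TABLE notification_log (event_id INTEGER)")
setup.execute("INSERT INTO notification_log VALUES (1)")
setup.commit()

# Reader opens a transaction; its first SELECT pins the snapshot.
reader = sqlite3.connect(path)
reader.execute("BEGIN")
before = reader.execute("SELECT COUNT(*) FROM notification_log").fetchone()[0]

# A concurrent writer commits a new notification meanwhile.
writer = sqlite3.connect(path)
writer.execute("INSERT INTO notification_log VALUES (2)")
writer.commit()

# Inside the same read transaction the writer's commit is invisible...
during = reader.execute("SELECT COUNT(*) FROM notification_log").fetchone()[0]
reader.commit()

# ...but a fresh read sees it.
after = reader.execute("SELECT COUNT(*) FROM notification_log").fetchone()[0]
print(before, during, after)  # 1 1 2
```

All reads issued between the `BEGIN` and the final `commit()` see the same generation point, which is exactly what the proposed getSnapshot() API needs.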
[jira] [Commented] (HIVE-13966) DbNotificationListener: can loose DDL operation notifications
[ https://issues.apache.org/jira/browse/HIVE-13966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569597#comment-15569597 ] Alan Gates commented on HIVE-13966: --- Assigned back to Rahul as I didn't intend to take over the JIRA, I just had to assign it to myself to upload a patch. > DbNotificationListener: can loose DDL operation notifications > - > > Key: HIVE-13966 > URL: https://issues.apache.org/jira/browse/HIVE-13966 > Project: Hive > Issue Type: Bug > Components: HCatalog >Reporter: Nachiket Vaidya >Assignee: Rahul Sharma >Priority: Critical > Attachments: HIVE-13966.1.patch, HIVE-13966.2.patch, > HIVE-13966.3.patch, HIVE-13966.pdf > > > The code for each API in HiveMetaStore.java is like this: > 1. openTransaction() > 2. -- operation-- > 3. commit() or rollback() based on result of the operation. > 4. add entry to notification log (unconditionally) > If the operation failed (in step 2), we still add an entry to the notification > log. Found this issue in testing. > It is still OK, as this is the false-positive case. > If the operation is successful but adding to the notification log failed, the > user will get a MetaException. It will not roll back the operation, as it is > already committed. We need to handle this case so that we will not have false > negatives. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
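The fix pattern implied by the four steps above is to move the notification-log write inside the same transaction as the operation, so the two commit or roll back together. A minimal sketch of that pattern (SQLite as a stand-in for the metastore database; table names are illustrative, not the real metastore schema):

```python
import sqlite3

# Toy metastore: the DDL target table and the notification log live in the
# same database, so one transaction can cover both writes.
conn = sqlite3.connect(":memory:")
conn.isolation_level = None  # autocommit mode; we issue BEGIN/COMMIT ourselves
conn.execute("CREATE TABLE tbls (name TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE notification_log (event TEXT)")

def create_table_with_notification(name):
    cur = conn.cursor()
    cur.execute("BEGIN")
    try:
        cur.execute("INSERT INTO tbls VALUES (?)", (name,))      # step 2: the operation
        cur.execute("INSERT INTO notification_log VALUES (?)",   # step 4, now inside
                    ("CREATE_TABLE " + name,))                   # the same transaction
        cur.execute("COMMIT")
    except Exception:
        cur.execute("ROLLBACK")  # operation and notification vanish together
        raise

create_table_with_notification("t1")
try:
    create_table_with_notification("t1")  # duplicate name: whole txn rolls back
except sqlite3.IntegrityError:
    pass
events = conn.execute("SELECT COUNT(*) FROM notification_log").fetchone()[0]
tables = conn.execute("SELECT COUNT(*) FROM tbls").fetchone()[0]
```

Because the notification insert shares the operation's transaction, a failed operation can no longer leave a stray log entry (no false positives), and a failed log write rolls the operation back too (no false negatives).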
[jira] [Updated] (HIVE-13966) DbNotificationListener: can loose DDL operation notifications
[ https://issues.apache.org/jira/browse/HIVE-13966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-13966: -- Assignee: Rahul Sharma (was: Alan Gates) > DbNotificationListener: can loose DDL operation notifications > - > > Key: HIVE-13966 > URL: https://issues.apache.org/jira/browse/HIVE-13966 > Project: Hive > Issue Type: Bug > Components: HCatalog >Reporter: Nachiket Vaidya >Assignee: Rahul Sharma >Priority: Critical > Attachments: HIVE-13966.1.patch, HIVE-13966.2.patch, > HIVE-13966.3.patch, HIVE-13966.pdf > > > The code for each API in HiveMetaStore.java is like this: > 1. openTransaction() > 2. -- operation-- > 3. commit() or rollback() based on result of the operation. > 4. add entry to notification log (unconditionally) > If the operation is failed (in step 2), we still add entry to notification > log. Found this issue in testing. > It is still ok as this is the case of false positive. > If the operation is successful and adding to notification log failed, the > user will get an MetaException. It will not rollback the operation, as it is > already committed. We need to handle this case so that we will not have false > negatives. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14822) Add support for credential provider for jobs launched from Hiveserver2
[ https://issues.apache.org/jira/browse/HIVE-14822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vihang Karajgaonkar updated HIVE-14822: --- Attachment: HIVE-14822.05.patch Updating the patch with the changes suggested. > Add support for credential provider for jobs launched from Hiveserver2 > -- > > Key: HIVE-14822 > URL: https://issues.apache.org/jira/browse/HIVE-14822 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar > Attachments: HIVE-14822.01.patch, HIVE-14822.02.patch, > HIVE-14822.03.patch, HIVE-14822.05.patch > > > When using encrypted passwords via the Hadoop Credential Provider, > HiveServer2 currently does not correctly forward enough information to the > job configuration for jobs to read those secrets. If your job needs to access > any secrets, like S3 credentials, then there's no convenient and secure way > to configure this today. > You could specify the decryption key in files like mapred-site.xml that > HiveServer2 uses, but this would place the encryption password on local disk > in plaintext, which can be a security concern. > To solve this problem, HiveServer2 should modify job configuration to include > the environment variable settings needed to decrypt the passwords. > Specifically, it will need to modify: > * For MR2 jobs: > ** yarn.app.mapreduce.am.admin.user.env > ** mapreduce.admin.user.env > * For Spark jobs: > ** spark.yarn.appMasterEnv.HADOOP_CREDSTORE_PASSWORD > ** spark.executorEnv.HADOOP_CREDSTORE_PASSWORD > HiveServer2 can get the decryption password from its own environment, the > same way it does for its own credential provider store today. > Additionally, it can be desirable for HiveServer2 to have a separate > encrypted password file than what is used by the job. HiveServer2 may have > secrets that the job should not have, such as the metastore database password > or the password to decrypt its private SSL certificate. 
It is also a best > practice to keep separate passwords in separate files. To facilitate this, > Hive will also accept: > * A configuration for a path to a credential store to use for jobs. This > should already be uploaded in HDFS. (hive.server2.job.keystore.location or a > better name) If this is not specified, then HS2 will simply use the value of > hadoop.security.credential.provider.path. > * An environment variable for the password to decrypt the credential store > (HIVE_JOB_KEYSTORE_PASSWORD or better). If this is not specified, then HS2 > will simply use the standard environment variable for decrypting the Hadoop > Credential Provider. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
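Put together, the configuration entries named in the description would land in the submitted job's configuration roughly as follows (a sketch, not the actual patch: the placeholder value stands for a password HiveServer2 reads from its own environment at submit time, so nothing is written to local disk in plaintext):

```xml
<!-- Illustrative job-configuration entries injected by HiveServer2. -->
<property>
  <name>yarn.app.mapreduce.am.admin.user.env</name>
  <value>HADOOP_CREDSTORE_PASSWORD=PASSWORD_FROM_HS2_ENV</value>
</property>
<property>
  <name>mapreduce.admin.user.env</name>
  <value>HADOOP_CREDSTORE_PASSWORD=PASSWORD_FROM_HS2_ENV</value>
</property>
<property>
  <name>spark.yarn.appMasterEnv.HADOOP_CREDSTORE_PASSWORD</name>
  <value>PASSWORD_FROM_HS2_ENV</value>
</property>
<property>
  <name>spark.executorEnv.HADOOP_CREDSTORE_PASSWORD</name>
  <value>PASSWORD_FROM_HS2_ENV</value>
</property>
```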
[jira] [Commented] (HIVE-11394) Enhance EXPLAIN display for vectorization
[ https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569628#comment-15569628 ] Hive QA commented on HIVE-11394: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12832883/HIVE-11394.091.patch {color:green}SUCCESS:{color} +1 due to 162 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 81 failed/errored test(s), 10601 tests executed *Failed tests:* {noformat} TestMiniLlapLocalCliDriver-orc_llap.q-delete_where_non_partitioned.q-vector_groupby_mapjoin.q-and-27-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[reloadJar] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_adaptor_usage_mode] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_aggregate_9] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_between_in] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_binary_join_groupby] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_cast_constant] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_char_2] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_coalesce_2] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_complex_all] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_count] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_count_distinct] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_aggregate] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_precision] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_distinct_2] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_empty_where] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_groupby_3] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_groupby_reduce] 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_include_no_sel] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_orderby_5] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_partition_diff_num_cols] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_reduce_groupby_decimal] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_string_concat] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_when_case_null] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_0] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_13] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_limit] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_date_funcs] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_mapjoin2] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_mapjoin] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_timestamp] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_timestamp_funcs] org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver[hbase_bulk] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_aggregate_9] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_auto_smb_mapjoin_14] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_binary_join_groupby] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_coalesce_2] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_complex_all] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_count] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_count_distinct] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_aggregate] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_precision] 
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_udf] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_distinct_2] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_groupby4] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_groupby6] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_groupby_3] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_groupby_reduce] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_grouping_sets] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_include_no_sel] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_mapjoin_reduce] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_number_compare_projection] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_orderby_5] org.apache.hadoop.hive.cli.TestMiniLlapLocalC
[jira] [Updated] (HIVE-14925) MSCK repair table hang while running with multi threading enabled
[ https://issues.apache.org/jira/browse/HIVE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ratheesh Kamoor updated HIVE-14925: --- Fix Version/s: 2.2.0 Release Note: Issue: MSCK is failing in multithreaded execution. Solution: Moved the path-processing logic to an external class, which avoids code duplication and is used in both multi-threaded and single-threaded execution. Status: Patch Available (was: Open) > MSCK repair table hang while running with multi threading enabled > - > > Key: HIVE-14925 > URL: https://issues.apache.org/jira/browse/HIVE-14925 > Project: Hive > Issue Type: Bug > Components: CLI >Affects Versions: 2.2.0 >Reporter: Ratheesh Kamoor >Assignee: Pengcheng Xiong >Priority: Critical > Fix For: 2.2.0 > > Attachments: HIVE-14925.patch > > > MSCK REPAIR TABLE hangs while running with multi-threading enabled > (the default). I think it is because of a major design flaw in how the thread pool is > implemented in the HiveMetaStoreChecker class / checkPartitionDirs method. This > method has a thread pool which registers a Callable, but the Callable makes a > recursive call to the checkPartitionDirs method again. This code will hang when the > number of directories is greater than the thread pool size. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14925) MSCK repair table hang while running with multi threading enabled
[ https://issues.apache.org/jira/browse/HIVE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ratheesh Kamoor updated HIVE-14925: --- Attachment: HIVE-14925.patch > MSCK repair table hang while running with multi threading enabled > - > > Key: HIVE-14925 > URL: https://issues.apache.org/jira/browse/HIVE-14925 > Project: Hive > Issue Type: Bug > Components: CLI >Affects Versions: 2.2.0 >Reporter: Ratheesh Kamoor >Assignee: Pengcheng Xiong >Priority: Critical > Fix For: 2.2.0 > > Attachments: HIVE-14925.patch > > > MSCK REPAIR TABLE hangs while running with multi-threading enabled > (the default). I think it is because of a major design flaw in how the thread pool is > implemented in the HiveMetaStoreChecker class / checkPartitionDirs method. This > method has a thread pool which registers a Callable, but the Callable makes a > recursive call to the checkPartitionDirs method again. This code will hang when the > number of directories is greater than the thread pool size. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14925) MSCK repair table hang while running with multi threading enabled
[ https://issues.apache.org/jira/browse/HIVE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569746#comment-15569746 ] Ratheesh Kamoor commented on HIVE-14925: [~pxiong] I moved the logic from the inline Callable to an external class so that the code can be reused in both the multi-threaded and the single-threaded scenario. It also fixes the deadlock issue. Could you please review? Tested with the very large partition counts we have (5K+) and it worked fine. > MSCK repair table hang while running with multi threading enabled > - > > Key: HIVE-14925 > URL: https://issues.apache.org/jira/browse/HIVE-14925 > Project: Hive > Issue Type: Bug > Components: CLI >Affects Versions: 2.2.0 >Reporter: Ratheesh Kamoor >Assignee: Pengcheng Xiong >Priority: Critical > Fix For: 2.2.0 > > Attachments: HIVE-14925.patch > > > MSCK REPAIR TABLE hangs while running with multi-threading enabled > (the default). I think it is because of a major design flaw in how the thread pool is > implemented in the HiveMetaStoreChecker class / checkPartitionDirs method. This > method has a thread pool which registers a Callable, but the Callable makes a > recursive call to the checkPartitionDirs method again. This code will hang when the > number of directories is greater than the thread pool size. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
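The deadlock described above is the classic pitfall of a task submitting a subtask to the same fixed-size pool and blocking on it: once every worker is waiting, no worker is free to run the subtasks. A small language-agnostic sketch of the deadlock-free alternative (this is an illustration of the pattern, not Hive's actual HiveMetaStoreChecker code; `tree` is a toy stand-in for filesystem listing):

```python
from concurrent.futures import ThreadPoolExecutor

def scan_partition_dirs(tree, root, pool_size=2):
    """Level-by-level scan: each task lists exactly one directory and never
    submits or waits on other tasks, so a fixed-size pool always makes
    progress regardless of how many directories there are."""
    found = []
    frontier = [root]
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        while frontier:
            # Workers do the (potentially slow) listing; the caller, not the
            # workers, drives the recursion -- no nested submissions.
            listings = pool.map(lambda d: (d, tree.get(d, [])), frontier)
            frontier = []
            for d, children in listings:
                found.append(d)
                frontier.extend(children)
    return found

# Toy directory tree with more directories than pool workers.
tree = {
    "warehouse": ["p=1", "p=2", "p=3"],
    "p=1": ["p=1/q=1", "p=1/q=2"],
    "p=2": ["p=2/q=1"],
}
dirs = scan_partition_dirs(tree, "warehouse")
```

With the recursive-submission version, a pool of 2 workers would hang as soon as 2 directories each tried to wait on their children; here the same pool of 2 finishes all 7 directories.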
[jira] [Commented] (HIVE-14921) Move slow CliDriver tests to MiniLlap - part 2
[ https://issues.apache.org/jira/browse/HIVE-14921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569754#comment-15569754 ] Hive QA commented on HIVE-14921: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12832840/HIVE-14921.1.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10601 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[reloadJar] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[alter_table_invalidate_column_stats] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnstats_part_coltype] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[newline] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[orc_merge10] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats] org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver.org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1508/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1508/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1508/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 7 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12832840 - PreCommit-HIVE-Build > Move slow CliDriver tests to MiniLlap - part 2 > -- > > Key: HIVE-14921 > URL: https://issues.apache.org/jira/browse/HIVE-14921 > Project: Hive > Issue Type: Sub-task > Components: Tests >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-14921.1.patch, HIVE-14921.1.patch > > > Continuation to HIVE-14877 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13316) Upgrade to Calcite 1.10
[ https://issues.apache.org/jira/browse/HIVE-13316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569767#comment-15569767 ] Hive QA commented on HIVE-13316: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12832847/HIVE-13316.05.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1509/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1509/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1509/ Messages: {noformat} This message was trimmed, see log for full details main: [INFO] Executed tasks [INFO] [INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ spark-client --- [INFO] Compiling 28 source files to /data/hive-ptest/working/apache-github-source-source/spark-client/target/classes [WARNING] /data/hive-ptest/working/apache-github-source-source/spark-client/src/main/java/org/apache/hive/spark/counter/SparkCounter.java: Some input files use or override a deprecated API. [WARNING] /data/hive-ptest/working/apache-github-source-source/spark-client/src/main/java/org/apache/hive/spark/counter/SparkCounter.java: Recompile with -Xlint:deprecation for details. [WARNING] /data/hive-ptest/working/apache-github-source-source/spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcDispatcher.java: Some input files use unchecked or unsafe operations. [WARNING] /data/hive-ptest/working/apache-github-source-source/spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcDispatcher.java: Recompile with -Xlint:unchecked for details. [INFO] [INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ spark-client --- [INFO] Using 'UTF-8' encoding to copy filtered resources. 
[INFO] Copying 1 resource [INFO] Copying 3 resources [INFO] [INFO] --- maven-antrun-plugin:1.7:run (setup-test-dirs) @ spark-client --- [INFO] Executing tasks main: [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/spark-client/target/tmp [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/spark-client/target/warehouse [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/spark-client/target/tmp/conf [copy] Copying 15 files to /data/hive-ptest/working/apache-github-source-source/spark-client/target/tmp/conf [INFO] Executed tasks [INFO] [INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ spark-client --- [INFO] Compiling 5 source files to /data/hive-ptest/working/apache-github-source-source/spark-client/target/test-classes [INFO] [INFO] --- maven-dependency-plugin:2.8:copy (copy-guava-14) @ spark-client --- [INFO] Configured Artifact: com.google.guava:guava:14.0.1:jar [INFO] Copying guava-14.0.1.jar to /data/hive-ptest/working/apache-github-source-source/spark-client/target/dependency/guava-14.0.1.jar [INFO] [INFO] --- maven-surefire-plugin:2.19.1:test (default-test) @ spark-client --- [INFO] Tests are skipped. 
[INFO] [INFO] --- maven-jar-plugin:2.4:jar (default-jar) @ spark-client --- [INFO] Building jar: /data/hive-ptest/working/apache-github-source-source/spark-client/target/spark-client-2.2.0-SNAPSHOT.jar [INFO] [INFO] --- maven-site-plugin:3.3:attach-descriptor (attach-descriptor) @ spark-client --- [INFO] [INFO] --- maven-install-plugin:2.4:install (default-install) @ spark-client --- [INFO] Installing /data/hive-ptest/working/apache-github-source-source/spark-client/target/spark-client-2.2.0-SNAPSHOT.jar to /data/hive-ptest/working/maven/org/apache/hive/spark-client/2.2.0-SNAPSHOT/spark-client-2.2.0-SNAPSHOT.jar [INFO] Installing /data/hive-ptest/working/apache-github-source-source/spark-client/pom.xml to /data/hive-ptest/working/maven/org/apache/hive/spark-client/2.2.0-SNAPSHOT/spark-client-2.2.0-SNAPSHOT.pom [INFO] [INFO] [INFO] Building Hive Query Language 2.2.0-SNAPSHOT [INFO] Downloading: http://www.datanucleus.org/downloads/maven2/org/apache/calcite/calcite-core/1.10.0/calcite-core-1.10.0.pom Downloading: http://repo.maven.apache.org/maven2/org/apache/calcite/calcite-core/1.10.0/calcite-core-1.10.0.pom Downloaded: http://repo.maven.apache.org/maven2/org/apache/calcite/calcite-core/1.10.0/calcite-core-1.10.0.pom (16 KB at 76.7 KB/sec) Downloading: http://www.datanucleus.org/downloads/maven2/org/apache/calcite/calcite/1.10.0/calcite-1.10.0.pom Downloading: http://repo.maven.apache.org/maven2/org/apache/calcite/calcite/1.10.0/calcite-1.10.0.pom Downloaded: http://repo.maven.apache.org/maven2/org/apache/calcite/calcite/1.10.0/calcite-
[jira] [Updated] (HIVE-11394) Enhance EXPLAIN display for vectorization
[ https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-11394: Attachment: (was: HIVE-11394.091.patch) > Enhance EXPLAIN display for vectorization > - > > Key: HIVE-11394 > URL: https://issues.apache.org/jira/browse/HIVE-11394 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-11394.01.patch, HIVE-11394.02.patch, > HIVE-11394.03.patch, HIVE-11394.04.patch, HIVE-11394.05.patch, > HIVE-11394.06.patch, HIVE-11394.07.patch, HIVE-11394.08.patch, > HIVE-11394.09.patch > > > Add detail to the EXPLAIN output showing why Map and Reduce work is not > vectorized. > New syntax is: EXPLAIN VECTORIZATION \[ONLY\] > \[SUMMARY|OPERATOR|EXPRESSION|DETAIL\] > The ONLY option suppresses most non-vectorization elements. > SUMMARY shows vectorization information for the PLAN (is vectorization > enabled) and a summary of Map and Reduce work. > OPERATOR shows vectorization information for operators. E.g. Filter > Vectorization. It includes all information of SUMMARY, too. > EXPRESSION shows vectorization information for expressions. E.g. > predicateExpression. It includes all information of SUMMARY and OPERATOR, > too. > DETAIL shows very detailed vectorization information. > It includes all information of SUMMARY, OPERATOR, and EXPRESSION, too. > The optional clauses default to not ONLY and SUMMARY. > --- > Here are some examples: > EXPLAIN VECTORIZATION example: > (Note the PLAN VECTORIZATION, Map Vectorization, Reduce Vectorization > sections) > Since SUMMARY is the default, this is also the output of EXPLAIN VECTORIZATION > SUMMARY. 
> Under Reducer 3’s "Reduce Vectorization:" you’ll see > notVectorizedReason: Aggregation Function UDF avg parameter expression for > GROUPBY operator: Data type struct of > Column\[VALUE._col2\] not supported > For Reducer 2’s "Reduce Vectorization:" you’ll see "groupByVectorOutput:": > "false", which indicates the node has a GROUP BY with AVG or some other aggregator > that outputs a non-PRIMITIVE type (e.g. STRUCT), so all downstream operators > run in row mode, i.e. without vector output. > If "usesVectorUDFAdaptor:": "false" were true, it would indicate that at > least one vectorized expression uses VectorUDFAdaptor. > And "allNative:" (here "false") will be true when all operators are native. > Today, GROUP BY and FILE SINK are not native. MAP JOIN and REDUCE SINK are > conditionally native. FILTER and SELECT are native. > {code} > PLAN VECTORIZATION: > enabled: true > enabledConditionsMet: [hive.vectorized.execution.enabled IS true] > STAGE DEPENDENCIES: > Stage-1 is a root stage > Stage-0 depends on stages: Stage-1 > STAGE PLANS: > Stage: Stage-1 > Tez > ... > Edges: > Reducer 2 <- Map 1 (SIMPLE_EDGE) > Reducer 3 <- Reducer 2 (SIMPLE_EDGE) > ... 
> Vertices: > Map 1 > Map Operator Tree: > TableScan > alias: alltypesorc > Statistics: Num rows: 12288 Data size: 36696 Basic stats: > COMPLETE Column stats: COMPLETE > Select Operator > expressions: cint (type: int) > outputColumnNames: cint > Statistics: Num rows: 12288 Data size: 36696 Basic stats: > COMPLETE Column stats: COMPLETE > Group By Operator > keys: cint (type: int) > mode: hash > outputColumnNames: _col0 > Statistics: Num rows: 5775 Data size: 17248 Basic > stats: COMPLETE Column stats: COMPLETE > Reduce Output Operator > key expressions: _col0 (type: int) > sort order: + > Map-reduce partition columns: _col0 (type: int) > Statistics: Num rows: 5775 Data size: 17248 Basic > stats: COMPLETE Column stats: COMPLETE > Execution mode: vectorized, llap > LLAP IO: all inputs > Map Vectorization: > enabled: true > enabledConditionsMet: > hive.vectorized.use.vectorized.input.format IS true > groupByVectorOutput: true > inputFileFormats: > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat > allNative: false > usesVectorUDFAdaptor: false > vectorized: true > Reducer 2 > Execution mode: vectorized, llap > Reduce Vectorization: > enabled: true > enableConditi
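As a concrete invocation of the syntax described above (the query here is illustrative and only loosely matches the alltypesorc plan shown; it is not necessarily the exact query that produced it):

{code}
-- Default level: same output as EXPLAIN VECTORIZATION SUMMARY
EXPLAIN VECTORIZATION
SELECT cint FROM alltypesorc GROUP BY cint;

-- Suppress most non-vectorization elements and show per-expression detail
EXPLAIN VECTORIZATION ONLY EXPRESSION
SELECT cint FROM alltypesorc GROUP BY cint;
{code}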
[jira] [Updated] (HIVE-11394) Enhance EXPLAIN display for vectorization
[ https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-11394: Attachment: HIVE-11394.091.patch > Enhance EXPLAIN display for vectorization > - > > Key: HIVE-11394 > URL: https://issues.apache.org/jira/browse/HIVE-11394 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-11394.01.patch, HIVE-11394.02.patch, > HIVE-11394.03.patch, HIVE-11394.04.patch, HIVE-11394.05.patch, > HIVE-11394.06.patch, HIVE-11394.07.patch, HIVE-11394.08.patch, > HIVE-11394.09.patch, HIVE-11394.091.patch > > > Add detail to the EXPLAIN output showing why Map and Reduce work is not > vectorized. > New syntax is: EXPLAIN VECTORIZATION \[ONLY\] > \[SUMMARY|OPERATOR|EXPRESSION|DETAIL\] > The ONLY option suppresses most non-vectorization elements. > SUMMARY shows vectorization information for the PLAN (is vectorization > enabled) and a summary of Map and Reduce work. > OPERATOR shows vectorization information for operators. E.g. Filter > Vectorization. It includes all information of SUMMARY, too. > EXPRESSION shows vectorization information for expressions. E.g. > predicateExpression. It includes all information of SUMMARY and OPERATOR, > too. > DETAIL shows very detailed vectorization information. > It includes all information of SUMMARY, OPERATOR, and EXPRESSION, too. > The optional clauses default to not ONLY and SUMMARY. > --- > Here are some examples: > EXPLAIN VECTORIZATION example: > (Note the PLAN VECTORIZATION, Map Vectorization, Reduce Vectorization > sections) > Since SUMMARY is the default, this is also the output of EXPLAIN VECTORIZATION > SUMMARY. 
> Under Reducer 3’s "Reduce Vectorization:" you’ll see > notVectorizedReason: Aggregation Function UDF avg parameter expression for > GROUPBY operator: Data type struct of > Column\[VALUE._col2\] not supported > For Reducer 2’s "Reduce Vectorization:" you’ll see "groupByVectorOutput:": > "false", which indicates the node has a GROUP BY with AVG or some other aggregator > that outputs a non-PRIMITIVE type (e.g. STRUCT), so all downstream operators > run in row mode, i.e. without vector output. > If "usesVectorUDFAdaptor:": "false" were true, it would indicate that at > least one vectorized expression uses VectorUDFAdaptor. > And "allNative:" (here "false") will be true when all operators are native. > Today, GROUP BY and FILE SINK are not native. MAP JOIN and REDUCE SINK are > conditionally native. FILTER and SELECT are native. > {code} > PLAN VECTORIZATION: > enabled: true > enabledConditionsMet: [hive.vectorized.execution.enabled IS true] > STAGE DEPENDENCIES: > Stage-1 is a root stage > Stage-0 depends on stages: Stage-1 > STAGE PLANS: > Stage: Stage-1 > Tez > ... > Edges: > Reducer 2 <- Map 1 (SIMPLE_EDGE) > Reducer 3 <- Reducer 2 (SIMPLE_EDGE) > ... 
> Vertices: > Map 1 > Map Operator Tree: > TableScan > alias: alltypesorc > Statistics: Num rows: 12288 Data size: 36696 Basic stats: > COMPLETE Column stats: COMPLETE > Select Operator > expressions: cint (type: int) > outputColumnNames: cint > Statistics: Num rows: 12288 Data size: 36696 Basic stats: > COMPLETE Column stats: COMPLETE > Group By Operator > keys: cint (type: int) > mode: hash > outputColumnNames: _col0 > Statistics: Num rows: 5775 Data size: 17248 Basic > stats: COMPLETE Column stats: COMPLETE > Reduce Output Operator > key expressions: _col0 (type: int) > sort order: + > Map-reduce partition columns: _col0 (type: int) > Statistics: Num rows: 5775 Data size: 17248 Basic > stats: COMPLETE Column stats: COMPLETE > Execution mode: vectorized, llap > LLAP IO: all inputs > Map Vectorization: > enabled: true > enabledConditionsMet: > hive.vectorized.use.vectorized.input.format IS true > groupByVectorOutput: true > inputFileFormats: > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat > allNative: false > usesVectorUDFAdaptor: false > vectorized: true > Reducer 2 > Execution mode: vectorized, llap > Reduce Vectorization: > enabled: true > en
[jira] [Updated] (HIVE-14822) Add support for credential provider for jobs launched from Hiveserver2
[ https://issues.apache.org/jira/browse/HIVE-14822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vihang Karajgaonkar updated HIVE-14822: --- Attachment: HIVE-14822.06.patch > Add support for credential provider for jobs launched from Hiveserver2 > -- > > Key: HIVE-14822 > URL: https://issues.apache.org/jira/browse/HIVE-14822 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar > Attachments: HIVE-14822.01.patch, HIVE-14822.02.patch, > HIVE-14822.03.patch, HIVE-14822.05.patch, HIVE-14822.06.patch > > > When using encrypted passwords via the Hadoop Credential Provider, > HiveServer2 currently does not correctly forward enough information to the > job configuration for jobs to read those secrets. If your job needs to access > any secrets, like S3 credentials, then there's no convenient and secure way > to configure this today. > You could specify the decryption key in files like mapred-site.xml that > HiveServer2 uses, but this would place the encryption password on local disk > in plaintext, which can be a security concern. > To solve this problem, HiveServer2 should modify job configuration to include > the environment variable settings needed to decrypt the passwords. > Specifically, it will need to modify: > * For MR2 jobs: > ** yarn.app.mapreduce.am.admin.user.env > ** mapreduce.admin.user.env > * For Spark jobs: > ** spark.yarn.appMasterEnv.HADOOP_CREDSTORE_PASSWORD > ** spark.executorEnv.HADOOP_CREDSTORE_PASSWORD > HiveServer2 can get the decryption password from its own environment, the > same way it does for its own credential provider store today. > Additionally, it can be desirable for HiveServer2 to have a separate > encrypted password file than what is used by the job. HiveServer2 may have > secrets that the job should not have, such as the metastore database password > or the password to decrypt its private SSL certificate. 
It is also a best > practice to keep separate passwords in separate files. To facilitate this, > Hive will also accept: > * A configuration for a path to a credential store to use for jobs. This > should already be uploaded to HDFS. (hive.server2.job.keystore.location or a > better name) If this is not specified, then HS2 will simply use the value of > hadoop.security.credential.provider.path. > * An environment variable for the password to decrypt the credential store > (HIVE_JOB_KEYSTORE_PASSWORD or better). If this is not specified, then HS2 > will simply use the standard environment variable for decrypting the Hadoop > Credential Provider. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
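As a sketch of the wiring this proposal describes (the credential store path is illustrative, and the environment-variable values shown here would be filled in by HS2 from its own environment, not hard-coded by the user), the job configuration would carry something like:

{code}
<!-- Path to the job-side credential store, already uploaded to HDFS (path illustrative) -->
<property>
  <name>hadoop.security.credential.provider.path</name>
  <value>jceks://hdfs/user/hive/job-creds.jceks</value>
</property>
<!-- MR2: propagate the decryption password into the AM and task environments -->
<property>
  <name>yarn.app.mapreduce.am.admin.user.env</name>
  <value>HADOOP_CREDSTORE_PASSWORD=...</value>
</property>
<property>
  <name>mapreduce.admin.user.env</name>
  <value>HADOOP_CREDSTORE_PASSWORD=...</value>
</property>
{code}

For Spark jobs, the analogous settings would be spark.yarn.appMasterEnv.HADOOP_CREDSTORE_PASSWORD and spark.executorEnv.HADOOP_CREDSTORE_PASSWORD, as listed above.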
[jira] [Commented] (HIVE-14872) Remove the configuration HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS
[ https://issues.apache.org/jira/browse/HIVE-14872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569871#comment-15569871 ] Pengcheng Xiong commented on HIVE-14872: Updated the golden file, double-checked that it passed, and pushed to master. Thanks [~ashutoshc] for the review. > Remove the configuration HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS > - > > Key: HIVE-14872 > URL: https://issues.apache.org/jira/browse/HIVE-14872 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-14872.01.patch, HIVE-14872.02.patch > > > The main purpose of the configuration > HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS is backward compatibility, because a > lot of reserved keywords have been used as identifiers in previous > releases. We have already had several releases with this configuration. Now, > when I tried to add new set operators to the parser, ANTLR kept > complaining "code too large". I think it is time to remove this > configuration. (1) It will simplify the parser logic and greatly reduce the > size of the generated parser code; (2) it leaves room for new features, > especially those that require parser changes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
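To illustrate what removing the compatibility switch means for users (table and column names here are illustrative): SQL:2011 reserved words can no longer be used as bare identifiers and must be quoted with backticks:

{code}
-- Fails once the switch is gone: TIMESTAMP is a reserved keyword
-- CREATE TABLE t (timestamp STRING);

-- Works: quote reserved words with backticks
CREATE TABLE t (`timestamp` STRING);
SELECT `timestamp` FROM t;
{code}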
[jira] [Updated] (HIVE-14926) Keep Schema in a consistent state whether schemaTool fails or succeeds.
[ https://issues.apache.org/jira/browse/HIVE-14926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-14926: Attachment: HIVE-14926.1.patch > Keep Schema in a consistent state whether schemaTool fails or succeeds > - > > Key: HIVE-14926 > URL: https://issues.apache.org/jira/browse/HIVE-14926 > Project: Hive > Issue Type: Improvement > Components: Database/Schema >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-14926.1.patch > > > SchemaTool currently uses autocommit when executing the upgrade or init > scripts. We should instead use a database transaction that commits or rolls > back as a unit, to keep the schema consistent. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
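The intended behavior can be sketched in plain SQL (the schema statements are illustrative, not taken from an actual Hive upgrade script): instead of letting each statement commit on its own, the tool would wrap the whole script in one transaction so a mid-script failure rolls everything back:

{code}
-- Autocommit (current behavior): each statement commits independently,
-- so a failure partway through leaves a half-upgraded schema behind.

-- Transactional (proposed):
BEGIN;
ALTER TABLE TBLS ADD COLUMN NEW_COL VARCHAR(128);   -- illustrative upgrade step 1
UPDATE VERSION SET SCHEMA_VERSION = '2.2.0';        -- illustrative upgrade step 2
COMMIT;   -- or ROLLBACK on any error, leaving the schema untouched
{code}

Note this only fully helps on databases with transactional DDL (e.g. PostgreSQL); on MySQL, DDL statements commit implicitly.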
[jira] [Updated] (HIVE-14872) Remove the configuration HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS
[ https://issues.apache.org/jira/browse/HIVE-14872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-14872: --- Component/s: Parser > Remove the configuration HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS > - > > Key: HIVE-14872 > URL: https://issues.apache.org/jira/browse/HIVE-14872 > Project: Hive > Issue Type: Sub-task > Components: Parser >Affects Versions: 2.1.0 >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Fix For: 2.2.0 > > Attachments: HIVE-14872.01.patch, HIVE-14872.02.patch > > > The main purpose of the configuration > HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS is backward compatibility, because a > lot of reserved keywords have been used as identifiers in previous > releases. We have already had several releases with this configuration. Now, > when I tried to add new set operators to the parser, ANTLR kept > complaining "code too large". I think it is time to remove this > configuration. (1) It will simplify the parser logic and greatly reduce the > size of the generated parser code; (2) it leaves room for new features, > especially those that require parser changes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14872) Remove the configuration HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS
[ https://issues.apache.org/jira/browse/HIVE-14872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-14872: --- Resolution: Fixed Status: Resolved (was: Patch Available) > Remove the configuration HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS > - > > Key: HIVE-14872 > URL: https://issues.apache.org/jira/browse/HIVE-14872 > Project: Hive > Issue Type: Sub-task > Components: Parser >Affects Versions: 2.1.0 >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-14872.01.patch, HIVE-14872.02.patch > > > The main purpose of the configuration > HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS is backward compatibility, because a > lot of reserved keywords have been used as identifiers in previous > releases. We have already had several releases with this configuration. Now, > when I tried to add new set operators to the parser, ANTLR kept > complaining "code too large". I think it is time to remove this > configuration. (1) It will simplify the parser logic and greatly reduce the > size of the generated parser code; (2) it leaves room for new features, > especially those that require parser changes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14872) Remove the configuration HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS
[ https://issues.apache.org/jira/browse/HIVE-14872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-14872: --- Fix Version/s: 2.2.0 > Remove the configuration HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS > - > > Key: HIVE-14872 > URL: https://issues.apache.org/jira/browse/HIVE-14872 > Project: Hive > Issue Type: Sub-task > Components: Parser >Affects Versions: 2.1.0 >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Fix For: 2.2.0 > > Attachments: HIVE-14872.01.patch, HIVE-14872.02.patch > > > The main purpose of the configuration > HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS is backward compatibility, because a > lot of reserved keywords have been used as identifiers in previous > releases. We have already had several releases with this configuration. Now, > when I tried to add new set operators to the parser, ANTLR kept > complaining "code too large". I think it is time to remove this > configuration. (1) It will simplify the parser logic and greatly reduce the > size of the generated parser code; (2) it leaves room for new features, > especially those that require parser changes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)