[jira] [Commented] (HIVE-18723) CompactorOutputCommitter.commitJob() - check rename() ret val

2018-03-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16392524#comment-16392524
 ] 

Hive QA commented on HIVE-18723:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
1s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
38s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
57s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
36s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
50s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
34s{color} | {color:red} ql: The patch generated 3 new + 46 unchanged - 0 fixed 
= 49 total (was 46) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
12s{color} | {color:red} The patch generated 49 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 13m  6s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-9566/dev-support/hive-personality.sh
 |
| git revision | master / 73ccc44 |
| Default Java | 1.8.0_111 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9566/yetus/diff-checkstyle-ql.txt
 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9566/yetus/patch-asflicense-problems.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9566/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> CompactorOutputCommitter.commitJob() - check rename() ret val
> -
>
> Key: HIVE-18723
> URL: https://issues.apache.org/jira/browse/HIVE-18723
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Kryvenko Igor
>Priority: Major
> Attachments: HIVE-18723.1.patch, HIVE-18723.2.patch, 
> HIVE-18723.3.patch, HIVE-18723.patch
>
>
> Right now the return value is ignored: {{fs.rename(fileStatus.getPath(), 
> newPath);}}
> Should this use {{FileUtils.rename(FileSystem fs, Path sourcePath, Path 
> destPath, Configuration conf)}} instead?
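
A minimal sketch of the check being suggested, assuming {{fs}}, {{fileStatus}}, 
and {{newPath}} are in scope as in CompactorOutputCommitter.commitJob():

{code:java}
// Fail the job instead of silently ignoring an unsuccessful rename.
if (!fs.rename(fileStatus.getPath(), newPath)) {
  throw new IOException("Unable to rename " + fileStatus.getPath()
      + " to " + newPath);
}
{code}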



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18883) Add findbugs to yetus pre-commit checks

2018-03-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16392509#comment-16392509
 ] 

Hive QA commented on HIVE-18883:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12913661/HIVE-18883.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 20 failed/errored test(s), 12954 tests 
executed
*Failed tests:*
{noformat}
TestNegativeCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=94)

[nopart_insert.q,insert_into_with_schema.q,input41.q,having1.q,create_table_failure3.q,default_constraint_invalid_default_value.q,database_drop_not_empty_restrict.q,windowing_after_orderby.q,orderbysortby.q,subquery_select_distinct2.q,authorization_uri_alterpart_loc.q,udf_last_day_error_1.q,constraint_duplicate_name.q,create_table_failure4.q,alter_tableprops_external_with_notnull_constraint.q,semijoin5.q,udf_format_number_wrong4.q,deletejar.q,exim_11_nonpart_noncompat_sorting.q,show_tables_bad_db2.q,drop_func_nonexistent.q,nopart_load.q,alter_table_non_partitioned_table_cascade.q,load_wrong_fileformat.q,lockneg_try_db_lock_conflict.q,udf_field_wrong_args_len.q,create_table_failure2.q,create_with_fk_constraints_enforced.q,groupby2_map_skew_multi_distinct.q,udf_min.q,authorization_update_noupdatepriv.q,show_columns2.q,authorization_insert_noselectpriv.q,orc_replace_columns3_acid.q,compare_double_bigint.q,authorization_set_nonexistent_conf.q,alter_rename_partition_failure3.q,split_sample_wrong_format2.q,create_with_fk_pk_same_tab.q,compare_double_bigint_2.q,authorization_show_roles_no_admin.q,materialized_view_authorization_rebuild_no_grant.q,unionLimit.q,authorization_revoke_table_fail2.q,authorization_insert_noinspriv.q,duplicate_insert3.q,authorization_desc_table_nosel.q,stats_noscan_non_native.q,orc_change_serde_acid.q,create_or_replace_view7.q,exim_07_nonpart_noncompat_ifof.q,create_with_unique_constraints_enforced.q,udf_concat_ws_wrong2.q,fileformat_bad_class.q,merge_negative_2.q,exim_15_part_nonpart.q,authorization_not_owner_drop_view.q,external1.q,authorization_uri_insert.q,create_with_fk_wrong_ref.q,columnstats_tbllvl_incorrect_column.q,authorization_show_parts_nosel.q,authorization_not_owner_drop_tab.q,external2.q,authorization_deletejar.q,temp_table_create_like_partitions.q,udf_greatest_error_1.q,ptf_negative_AggrFuncsWithNoGBYNoPartDef.q,alter_view_as_select_not_exist.q,touch1.q,groupby3_map_skew_multi_distinct.q,insert_into_notnull_constraint.q,exchange_partition_neg_partition_missing.q,groupby_cube_multi_gby.q,columnstats_tbllvl.q,drop_invalid_constraint2.q,alter_table_add_partition.q,update_not_acid.q,archive5.q,alter_table_constraint_invalid_pk_col.q,ivyDownload.q,udf_instr_wrong_type.q,bad_sample_clause.q,authorization_not_owner_drop_tab2.q,authorization_alter_db_owner.q,show_columns1.q,orc_type_promotion3.q,create_view_failure8.q,strict_join.q,udf_add_months_error_1.q,groupby_cube2.q,groupby_cube1.q,groupby_rollup1.q,genericFileFormat.q,invalid_cast_from_binary_4.q,drop_invalid_constraint1.q,serde_regex.q,show_partitions1.q,invalid_cast_from_binary_6.q,create_with_multi_pk_constraint.q,udf_field_wrong_type.q,groupby_grouping_sets4.q,groupby_grouping_sets3.q,insertsel_fail.q,udf_locate_wrong_type.q,orc_type_promotion1_acid.q,set_table_property.q,create_or_replace_view2.q,groupby_grouping_sets2.q,alter_view_failure.q,distinct_windowing_failure1.q,invalid_t_alter2.q,alter_table_constraint_invalid_fk_col1.q,invalid_varchar_length_2.q,authorization_show_grant_otheruser_alltabs.q,subquery_windowing_corr.q,compact_non_acid_table.q,authorization_view_4.q,authorization_disallow_transform.q,materialized_view_authorization_rebuild_other.q,authorization_fail_4.q,dbtxnmgr_nodblock.q,set_hiveconf_internal_variable1.q,input_part0_neg.q,udf_printf_wrong3.q,load_orc_negative2.q,druid_buckets.q,archive2.q,authorization_addjar.q,invalid_sum_syntax.q,insert_into_with_schema1.q,udf_add_months_error_2.q,dyn_part_max_per_node.q,authorization_revoke_table_fail1.q,udf_printf_wrong2.q,archive_multi3.q
,udf_printf_wrong1.q,subquery_subquery_chain.q,authorization_view_disable_cbo_4.q,no_matching_udf.q,create_view_failure7.q,drop_native_udf.q,truncate_column_list_bucketing.q,authorization_uri_add_partition.q,authorization_view_disable_cbo_3.q,bad_exec_hooks.q,authorization_view_disable_cbo_2.q,fetchtask_ioexception.q,char_pad_convert_fail2.q,authorization_set_role_neg1.q,serde_regex3.q,authorization_delete_nodeletepriv.q,materialized_view_delete.q,create_or_replace_view6.q,bucket_mapjoin_wrong_table_metadata_2.q,msck_repair_3.q,udf_sort_array_by_wrong2.q,local_mapred_error_cache.q,alter_external_acid.q,mm_concatenate.q,authorization_fail_3.q,set_hiveconf_internal_variable0.q,udf_last_day_error_2.q,alter_table_constraint_invalid_ref.q,create_table_wrong_regex.q,describe_xpat

[jira] [Commented] (HIVE-18832) Support change management for trashing data files from ACID tables.

2018-03-08 Thread anishek (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16392492#comment-16392492
 ] 

anishek commented on HIVE-18832:


Not sure why the PR is not getting linked here: 
https://github.com/apache/hive/pull/318

> Support change management for trashing data files from ACID tables.
> ---
>
> Key: HIVE-18832
> URL: https://issues.apache.org/jira/browse/HIVE-18832
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: anishek
>Assignee: anishek
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-18832.0.patch, HIVE-18832.1.patch
>
>
> Currently, the cleaner process and DDL drop operations delete the data 
> files. The scope for supporting change management in the source warehouse 
> for ACID table operations is given below.
> 1. The cleaner process deletes older files after compaction, aborted files, 
> etc. These need to be archived to the cmroot path.
> 2. DDL operations such as dropping a table or partition already archive the 
> deleted files. This needs to be extended to ACID tables too.
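
A minimal sketch of the archive-before-delete pattern described above, using 
the Hadoop FileSystem API ({{cmRootDir}}, {{file}}, {{fs}}, and {{conf}} are 
illustrative names, not the actual patch):

{code:java}
// Copy the file to the change-management root before deleting it, so a
// replication consumer can still retrieve it.
Path cmPath = new Path(cmRootDir, file.getName());
if (!FileUtil.copy(fs, file, fs, cmPath, false, conf)) {
  throw new IOException("Failed to archive " + file + " to " + cmPath);
}
fs.delete(file, false);
{code}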



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18832) Support change management for trashing data files from ACID tables.

2018-03-08 Thread anishek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

anishek updated HIVE-18832:
---
Attachment: (was: HIVE-18832.1.patch)

> Support change management for trashing data files from ACID tables.
> ---
>
> Key: HIVE-18832
> URL: https://issues.apache.org/jira/browse/HIVE-18832
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: anishek
>Assignee: anishek
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-18832.0.patch, HIVE-18832.1.patch
>
>
> Currently, the cleaner process and DDL drop operations delete the data 
> files. The scope for supporting change management in the source warehouse 
> for ACID table operations is given below.
> 1. The cleaner process deletes older files after compaction, aborted files, 
> etc. These need to be archived to the cmroot path.
> 2. DDL operations such as dropping a table or partition already archive the 
> deleted files. This needs to be extended to ACID tables too.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18832) Support change management for trashing data files from ACID tables.

2018-03-08 Thread anishek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

anishek updated HIVE-18832:
---
Attachment: HIVE-18832.1.patch

> Support change management for trashing data files from ACID tables.
> ---
>
> Key: HIVE-18832
> URL: https://issues.apache.org/jira/browse/HIVE-18832
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: anishek
>Assignee: anishek
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-18832.0.patch, HIVE-18832.1.patch
>
>
> Currently, the cleaner process and DDL drop operations delete the data 
> files. The scope for supporting change management in the source warehouse 
> for ACID table operations is given below.
> 1. The cleaner process deletes older files after compaction, aborted files, 
> etc. These need to be archived to the cmroot path.
> 2. DDL operations such as dropping a table or partition already archive the 
> deleted files. This needs to be extended to ACID tables too.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18832) Support change management for trashing data files from ACID tables.

2018-03-08 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-18832:

Issue Type: Sub-task  (was: Bug)
Parent: HIVE-18320

> Support change management for trashing data files from ACID tables.
> ---
>
> Key: HIVE-18832
> URL: https://issues.apache.org/jira/browse/HIVE-18832
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: anishek
>Assignee: anishek
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-18832.0.patch, HIVE-18832.1.patch
>
>
> Currently, the cleaner process and DDL drop operations delete the data 
> files. The scope for supporting change management in the source warehouse 
> for ACID table operations is given below.
> 1. The cleaner process deletes older files after compaction, aborted files, 
> etc. These need to be archived to the cmroot path.
> 2. DDL operations such as dropping a table or partition already archive the 
> deleted files. This needs to be extended to ACID tables too.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18883) Add findbugs to yetus pre-commit checks

2018-03-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16392478#comment-16392478
 ] 

Hive QA commented on HIVE-18883:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
10s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
33s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
 5s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  4m 
51s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  5m 
45s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
6s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
 5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  4m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  4m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  5m 
37s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
12s{color} | {color:red} The patch generated 50 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 34m 29s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  findbugs  xml  javac  javadoc  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-9564/dev-support/hive-personality.sh
 |
| git revision | master / 73ccc44 |
| Default Java | 1.8.0_111 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9564/yetus/patch-asflicense-problems.txt
 |
| modules | C: . standalone-metastore U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9564/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Add findbugs to yetus pre-commit checks
> ---
>
> Key: HIVE-18883
> URL: https://issues.apache.org/jira/browse/HIVE-18883
> Project: Hive
>  Issue Type: Sub-task
>  Components: Testing Infrastructure
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-18883.1.patch, HIVE-18883.2.patch
>
>
> We should enable FindBugs for our YETUS pre-commit checks. This will help 
> overall code quality and should decrease the overall number of bugs in Hive.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18751) ACID table scan through get_splits UDF doesn't receive ValidWriteIdList configuration.

2018-03-08 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-18751:

Status: Patch Available  (was: Open)

> ACID table scan through get_splits UDF doesn't receive ValidWriteIdList 
> configuration.
> --
>
> Key: HIVE-18751
> URL: https://issues.apache.org/jira/browse/HIVE-18751
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: ACID, UDF, pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-18751.01.patch
>
>
> Per-table write IDs (HIVE-18192) have replaced the global transaction ID 
> with a write ID for versioning data files in ACID/MM tables.
> To ensure snapshot isolation, a ValidWriteIdList needs to be generated for 
> the given txn/table and used when scanning the ACID/MM tables.
> The get_splits UDF, which runs an ACID table scan query, doesn't receive it 
> properly through configuration (hive.txn.tables.valid.writeids) and hence 
> throws an exception. 
> TestAcidOnTez.testGetSplitsLocks fails because of this. Need to fix it.
>  
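
A minimal sketch of the consumer side described above, assuming the 
configuration key quoted in the description; in reality that key may carry 
write-ID entries for several tables, and the per-table selection is elided 
here:

{code:java}
// Fail fast if the write-ID snapshot was never propagated to the scan.
String writeIdStr = conf.get("hive.txn.tables.valid.writeids");
if (writeIdStr == null) {
  throw new IllegalStateException(
      "ValidWriteIdList is not set for an ACID table scan");
}
ValidWriteIdList validWriteIds = new ValidReaderWriteIdList(writeIdStr);
{code}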



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18751) ACID table scan through get_splits UDF doesn't receive ValidWriteIdList configuration.

2018-03-08 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-18751:

Attachment: HIVE-18751.01.patch

> ACID table scan through get_splits UDF doesn't receive ValidWriteIdList 
> configuration.
> --
>
> Key: HIVE-18751
> URL: https://issues.apache.org/jira/browse/HIVE-18751
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: ACID, UDF, pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-18751.01.patch
>
>
> Per-table write IDs (HIVE-18192) have replaced the global transaction ID 
> with a write ID for versioning data files in ACID/MM tables.
> To ensure snapshot isolation, a ValidWriteIdList needs to be generated for 
> the given txn/table and used when scanning the ACID/MM tables.
> The get_splits UDF, which runs an ACID table scan query, doesn't receive it 
> properly through configuration (hive.txn.tables.valid.writeids) and hence 
> throws an exception. 
> TestAcidOnTez.testGetSplitsLocks fails because of this. Need to fix it.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18751) ACID table scan through get_splits UDF doesn't receive ValidWriteIdList configuration.

2018-03-08 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-18751:

Attachment: (was: HIVE-18751.01.patch)

> ACID table scan through get_splits UDF doesn't receive ValidWriteIdList 
> configuration.
> --
>
> Key: HIVE-18751
> URL: https://issues.apache.org/jira/browse/HIVE-18751
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: ACID, UDF, pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-18751.01.patch
>
>
> Per-table write IDs (HIVE-18192) have replaced the global transaction ID 
> with a write ID for versioning data files in ACID/MM tables.
> To ensure snapshot isolation, a ValidWriteIdList needs to be generated for 
> the given txn/table and used when scanning the ACID/MM tables.
> The get_splits UDF, which runs an ACID table scan query, doesn't receive it 
> properly through configuration (hive.txn.tables.valid.writeids) and hence 
> throws an exception. 
> TestAcidOnTez.testGetSplitsLocks fails because of this. Need to fix it.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18864) ValidWriteIdList snapshot seems incorrect if obtained after allocating writeId by current transaction.

2018-03-08 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-18864:

Status: Open  (was: Patch Available)

> ValidWriteIdList snapshot seems incorrect if obtained after allocating 
> writeId by current transaction.
> --
>
> Key: HIVE-18864
> URL: https://issues.apache.org/jira/browse/HIVE-18864
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: ACID, pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-18864.01.patch, HIVE-18864.02.patch
>
>
> For multi-statement txns, it is possible that a write on a table happens 
> after a read. Consider the scenario below.
>  # Committed txn=9 writes on table T1 with writeId=5.
>  # Open txn=10. ValidTxnList(open:null, txn_HWM=10).
>  # Read table T1 from txn=10. ValidWriteIdList(open:null, write_HWM=5).
>  # Open txn=11, which writes on table T1 with writeId=6.
>  # Read table T1 from txn=10. ValidWriteIdList(open:null, write_HWM=5).
>  # Write table T1 from txn=10 with writeId=7.
>  # Read table T1 from txn=10. {color:#d04437}*ValidWriteIdList(open:null, 
> write_HWM=7)*. – This read will be able to see rows added by txn=11, which 
> is still open.{color}
> {color:#d04437}So, the open/aborted list of ValidWriteIdList needs to be 
> rebuilt based on txn_HWM. Any writeId allocated by a txnId > txn_HWM should 
> be marked as open. In this example, *ValidWriteIdList(open:6, 
> write_HWM=7)* should be generated.{color}
> cc [~ekoifman], [~thejas]
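
A minimal sketch of the rebuild rule stated above; the type and variable 
names are illustrative, not Hive's actual implementation:

{code:java}
// Any writeId allocated by a txnId above the reader's txn_HWM must be
// treated as open, even if that txn committed after the reader's
// snapshot was taken.
for (TxnToWriteId tw : allocatedWriteIds) {  // hypothetical input list
  if (tw.getTxnId() > txnHighWaterMark) {
    openWriteIds.add(tw.getWriteId());
  }
}
{code}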



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18864) ValidWriteIdList snapshot seems incorrect if obtained after allocating writeId by current transaction.

2018-03-08 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-18864:

Attachment: HIVE-18864.02.patch

> ValidWriteIdList snapshot seems incorrect if obtained after allocating 
> writeId by current transaction.
> --
>
> Key: HIVE-18864
> URL: https://issues.apache.org/jira/browse/HIVE-18864
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: ACID, pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-18864.01.patch, HIVE-18864.02.patch
>
>
> For multi-statement txns, it is possible that a write on a table happens 
> after a read. Consider the scenario below.
>  # Committed txn=9 writes on table T1 with writeId=5.
>  # Open txn=10. ValidTxnList(open:null, txn_HWM=10).
>  # Read table T1 from txn=10. ValidWriteIdList(open:null, write_HWM=5).
>  # Open txn=11, which writes on table T1 with writeId=6.
>  # Read table T1 from txn=10. ValidWriteIdList(open:null, write_HWM=5).
>  # Write table T1 from txn=10 with writeId=7.
>  # Read table T1 from txn=10. {color:#d04437}*ValidWriteIdList(open:null, 
> write_HWM=7)*. – This read will be able to see rows added by txn=11, which 
> is still open.{color}
> {color:#d04437}So, the open/aborted list of ValidWriteIdList needs to be 
> rebuilt based on txn_HWM. Any writeId allocated by a txnId > txn_HWM should 
> be marked as open. In this example, *ValidWriteIdList(open:6, 
> write_HWM=7)* should be generated.{color}
> cc [~ekoifman], [~thejas]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18864) ValidWriteIdList snapshot seems incorrect if obtained after allocating writeId by current transaction.

2018-03-08 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-18864:

Status: Patch Available  (was: Open)

> ValidWriteIdList snapshot seems incorrect if obtained after allocating 
> writeId by current transaction.
> --
>
> Key: HIVE-18864
> URL: https://issues.apache.org/jira/browse/HIVE-18864
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: ACID, pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-18864.01.patch, HIVE-18864.02.patch
>
>
> For multi-statement txns, it is possible that a write on a table happens 
> after a read. Consider the scenario below.
>  # Committed txn=9 writes on table T1 with writeId=5.
>  # Open txn=10. ValidTxnList(open:null, txn_HWM=10).
>  # Read table T1 from txn=10. ValidWriteIdList(open:null, write_HWM=5).
>  # Open txn=11, which writes on table T1 with writeId=6.
>  # Read table T1 from txn=10. ValidWriteIdList(open:null, write_HWM=5).
>  # Write table T1 from txn=10 with writeId=7.
>  # Read table T1 from txn=10. {color:#d04437}*ValidWriteIdList(open:null, 
> write_HWM=7)*. – This read will be able to see rows added by txn=11, which 
> is still open.{color}
> {color:#d04437}So, the open/aborted list of ValidWriteIdList needs to be 
> rebuilt based on txn_HWM. Any writeId allocated by a txnId > txn_HWM should 
> be marked as open. In this example, *ValidWriteIdList(open:6, 
> write_HWM=7)* should be generated.{color}
> cc [~ekoifman], [~thejas]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18751) ACID table scan through get_splits UDF doesn't receive ValidWriteIdList configuration.

2018-03-08 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-18751:

Status: Open  (was: Patch Available)

> ACID table scan through get_splits UDF doesn't receive ValidWriteIdList 
> configuration.
> --
>
> Key: HIVE-18751
> URL: https://issues.apache.org/jira/browse/HIVE-18751
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: ACID, UDF, pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-18751.01.patch
>
>
> Per-table write IDs (HIVE-18192) have replaced the global transaction ID 
> with a write ID for versioning data files in ACID/MM tables.
> To ensure snapshot isolation, a ValidWriteIdList needs to be generated for 
> the given txn/table and used when scanning the ACID/MM tables.
> The get_splits UDF, which runs an ACID table scan query, doesn't receive it 
> properly through configuration (hive.txn.tables.valid.writeids) and hence 
> throws an exception. 
> TestAcidOnTez.testGetSplitsLocks fails because of this. Need to fix it.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18864) ValidWriteIdList snapshot seems incorrect if obtained after allocating writeId by current transaction.

2018-03-08 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-18864:

Attachment: (was: HIVE-18864.02.patch)

> ValidWriteIdList snapshot seems incorrect if obtained after allocating 
> writeId by current transaction.
> --
>
> Key: HIVE-18864
> URL: https://issues.apache.org/jira/browse/HIVE-18864
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: ACID, pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-18864.01.patch, HIVE-18864.02.patch
>
>
> For multi-statement txns, it is possible that a write on a table happens 
> after a read. Consider the scenario below.
>  # Committed txn=9 writes on table T1 with writeId=5.
>  # Open txn=10. ValidTxnList(open:null, txn_HWM=10).
>  # Read table T1 from txn=10. ValidWriteIdList(open:null, write_HWM=5).
>  # Open txn=11, which writes on table T1 with writeId=6.
>  # Read table T1 from txn=10. ValidWriteIdList(open:null, write_HWM=5).
>  # Write table T1 from txn=10 with writeId=7.
>  # Read table T1 from txn=10. {color:#d04437}*ValidWriteIdList(open:null, 
> write_HWM=7)*. – This read will be able to see rows added by txn=11, which 
> is still open.{color}
> {color:#d04437}So, the open/aborted list of ValidWriteIdList needs to be 
> rebuilt based on txn_HWM. Any writeId allocated by a txnId > txn_HWM should 
> be marked as open. In this example, *ValidWriteIdList(open:6, 
> write_HWM=7)* should be generated.{color}
> cc [~ekoifman], [~thejas]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18739) Add support for Export from unpartitioned Acid table

2018-03-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16392450#comment-16392450
 ] 

Hive QA commented on HIVE-18739:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12913507/HIVE-18739.04.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 102 failed/errored test(s), 13358 tests 
executed
*Failed tests:*
{noformat}
TestNegativeCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=94)

[nopart_insert.q,insert_into_with_schema.q,input41.q,having1.q,create_table_failure3.q,default_constraint_invalid_default_value.q,database_drop_not_empty_restrict.q,windowing_after_orderby.q,orderbysortby.q,subquery_select_distinct2.q,authorization_uri_alterpart_loc.q,udf_last_day_error_1.q,constraint_duplicate_name.q,create_table_failure4.q,alter_tableprops_external_with_notnull_constraint.q,semijoin5.q,udf_format_number_wrong4.q,deletejar.q,exim_11_nonpart_noncompat_sorting.q,show_tables_bad_db2.q,drop_func_nonexistent.q,nopart_load.q,alter_table_non_partitioned_table_cascade.q,load_wrong_fileformat.q,lockneg_try_db_lock_conflict.q,udf_field_wrong_args_len.q,create_table_failure2.q,create_with_fk_constraints_enforced.q,groupby2_map_skew_multi_distinct.q,udf_min.q,authorization_update_noupdatepriv.q,show_columns2.q,authorization_insert_noselectpriv.q,orc_replace_columns3_acid.q,compare_double_bigint.q,authorization_set_nonexistent_conf.q,alter_rename_partition_failure3.q,split_sample_wrong_format2.q,create_with_fk_pk_same_tab.q,compare_double_bigint_2.q,authorization_show_roles_no_admin.q,materialized_view_authorization_rebuild_no_grant.q,unionLimit.q,authorization_revoke_table_fail2.q,authorization_insert_noinspriv.q,duplicate_insert3.q,authorization_desc_table_nosel.q,stats_noscan_non_native.q,orc_change_serde_acid.q,create_or_replace_view7.q,exim_07_nonpart_noncompat_ifof.q,create_with_unique_constraints_enforced.q,udf_concat_ws_wrong2.q,fileformat_bad_class.q,merge_negative_2.q,exim_15_part_nonpart.q,authorization_not_owner_drop_view.q,external1.q,authorization_uri_insert.q,create_with_fk_wrong_ref.q,columnstats_tbllvl_incorrect_column.q,authorization_show_parts_nosel.q,authorization_not_owner_drop_tab.q,external2.q,authorization_deletejar.q,temp_table_create_like_partitions.q,udf_greatest_error_1.q,ptf_negative_AggrFuncsWithNoGBYNoPartDef.q,alter_view_as_select_not_exist.q,touch1.q,groupby3_map_skew_multi_distinct.q,insert_into_notnull_constraint.q,exchange_partition_neg_partition_missing.q,groupby_cube_multi_gby.q,columnstats_tbllvl.q,drop_invalid_constraint2.q,alter_table_add_partition.q,update_not_acid.q,archive5.q,alter_table_constraint_invalid_pk_col.q,ivyDownload.q,udf_instr_wrong_type.q,bad_sample_clause.q,authorization_not_owner_drop_tab2.q,authorization_alter_db_owner.q,show_columns1.q,orc_type_promotion3.q,create_view_failure8.q,strict_join.q,udf_add_months_error_1.q,groupby_cube2.q,groupby_cube1.q,groupby_rollup1.q,genericFileFormat.q,invalid_cast_from_binary_4.q,drop_invalid_constraint1.q,serde_regex.q,show_partitions1.q,invalid_cast_from_binary_6.q,create_with_multi_pk_constraint.q,udf_field_wrong_type.q,groupby_grouping_sets4.q,groupby_grouping_sets3.q,insertsel_fail.q,udf_locate_wrong_type.q,orc_type_promotion1_acid.q,set_table_property.q,create_or_replace_view2.q,groupby_grouping_sets2.q,alter_view_failure.q,distinct_windowing_failure1.q,invalid_t_alter2.q,alter_table_constraint_invalid_fk_col1.q,invalid_varchar_length_2.q,authorization_show_grant_otheruser_alltabs.q,subquery_windowing_corr.q,compact_non_acid_table.q,authorization_view_4.q,authorization_disallow_transform.q,materialized_view_authorization_rebuild_other.q,authorization_fail_4.q,dbtxnmgr_nodblock.q,set_hiveconf_internal_variable1.q,input_part0_neg.q,udf_printf_wrong3.q,load_orc_negative2.q,druid_buckets.q,archive2.q,authorization_addjar.q,invalid_sum_syntax.q,insert_into_with_schema1.q,udf_add_months_error_2.q,dyn_part_max_per_node.q,authorization_revoke_table_fail1.q,udf_printf_wrong2.q,archive_multi3.q
,udf_printf_wrong1.q,subquery_subquery_chain.q,authorization_view_disable_cbo_4.q,no_matching_udf.q,create_view_failure7.q,drop_native_udf.q,truncate_column_list_bucketing.q,authorization_uri_add_partition.q,authorization_view_disable_cbo_3.q,bad_exec_hooks.q,authorization_view_disable_cbo_2.q,fetchtask_ioexception.q,char_pad_convert_fail2.q,authorization_set_role_neg1.q,serde_regex3.q,authorization_delete_nodeletepriv.q,materialized_view_delete.q,create_or_replace_view6.q,bucket_mapjoin_wrong_table_metadata_2.q,msck_repair_3.q,udf_sort_array_by_wrong2.q,local_mapred_error_cache.q,alter_external_acid.q,mm_concatenate.q,authorization_fail_3.q,set_hiveconf_internal_variable0.q,udf_last_day_error_2.q,alter_table_constraint_invalid_ref.q,create_table_wrong_regex.q,describe

[jira] [Commented] (HIVE-15393) Update Guava version

2018-03-08 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16392441#comment-16392441
 ] 

Thejas M Nair commented on HIVE-15393:
--

bq.  Hadoop 3 is compatible with Guava 21
See HADOOP-14386, they went back to 11.0!


> Update Guava version
> 
>
> Key: HIVE-15393
> URL: https://issues.apache.org/jira/browse/HIVE-15393
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: Ashutosh Chauhan
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HIVE-15393.2.patch, HIVE-15393.3.patch, 
> HIVE-15393.5.patch, HIVE-15393.6.patch, HIVE-15393.7.patch, 
> HIVE-15393.8.patch, HIVE-15393.9.patch, HIVE-15393.patch
>
>
> Druid's base code uses a newer version of Guava (16.0.1) that is not 
> compatible with the current version used by Hive.
> FYI, the Hadoop project is moving to Guava 18; not sure if it is better to 
> move to Guava 18 or even 19.
> https://issues.apache.org/jira/browse/HADOOP-10101



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18739) Add support for Export from unpartitioned Acid table

2018-03-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16392400#comment-16392400
 ] 

Hive QA commented on HIVE-18739:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
1s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
39s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
58s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
39s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
49s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
41s{color} | {color:red} ql: The patch generated 94 new + 677 unchanged - 4 
fixed = 771 total (was 681) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
12s{color} | {color:red} The patch generated 50 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 13m 13s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-9563/dev-support/hive-personality.sh
 |
| git revision | master / 73ccc44 |
| Default Java | 1.8.0_111 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9563/yetus/diff-checkstyle-ql.txt
 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9563/yetus/patch-asflicense-problems.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9563/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Add support for Export from unpartitioned Acid table
> 
>
> Key: HIVE-18739
> URL: https://issues.apache.org/jira/browse/HIVE-18739
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Attachments: HIVE-18739.01.patch, HIVE-18739.04.patch, 
> HIVE-18739.04.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18899) Separate FetchWork required for each query that uses the results cache

2018-03-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16392388#comment-16392388
 ] 

Hive QA commented on HIVE-18899:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12913646/HIVE-18899.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 25 failed/errored test(s), 13350 tests 
executed
*Failed tests:*
{noformat}
TestNegativeCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=95)

[udf_invalid.q,authorization_uri_export.q,default_constraint_complex_default_value.q,druid_datasource2.q,view_update.q,default_partition_name.q,authorization_public_create.q,load_wrong_fileformat_rc_seq.q,default_constraint_invalid_type.q,altern1.q,describe_xpath1.q,drop_view_failure2.q,temp_table_rename.q,invalid_select_column_with_subquery.q,udf_trunc_error1.q,insert_view_failure.q,dbtxnmgr_nodbunlock.q,authorization_show_columns.q,cte_recursion.q,merge_constraint_notnull.q,load_part_nospec.q,clusterbyorderby.q,orc_type_promotion2.q,ctas_noperm_loc.q,udf_instr_wrong_args_len.q,invalid_create_tbl2.q,part_col_complex_type.q,authorization_drop_db_empty.q,smb_mapjoin_14.q,subquery_scalar_multi_rows.q,alter_partition_coltype_2columns.q,subquery_corr_in_agg.q,insert_overwrite_notnull_constraint.q,authorization_show_grant_otheruser_wtab.q,regex_col_groupby.q,udaf_collect_set_unsupported.q,ptf_negative_DuplicateWindowAlias.q,exim_22_export_authfail.q,udf_likeany_wrong1.q,groupby_key.q,ambiguous_col.q,groupby3_multi_distinct.q,authorization_alter_drop_ptn.q,invalid_cast_from_binary_5.q,show_create_table_does_not_exist.q,invalid_select_column.q,exim_20_managed_location_over_existing.q,interval_3.q,authorization_compile.q,join35.q,merge_negative_3.q,udf_concat_ws_wrong3.q,create_or_replace_view8.q,create_external_with_notnull_constraint.q,split_sample_out_of_range.q,materialized_view_no_transactional_rewrite.q,authorization_show_grant_otherrole.q,create_with_constraints_duplicate_name.q,invalid_stddev_samp_syntax.q,authorization_view_disable_cbo_7.q,autolocal1.q,avro_non_nullable_union.q,load_orc_negative_part.q,drop_view_failure1.q,columnstats_partlvl_invalid_values_autogather.q,exim_13_nonnative_import.q,alter_table_wrong_regex.q,add_partition_with_whitelist.q,udf_next_day_error_2.q,authorization_select.q,udf_trunc_error2.q,authorization_view_7.q,udf_format_number_wrong5.q,touch2.q,exim_03_nonpart_noncompat_colschema.q,orc_type_promotion1.q,lateral_view_alias.q,show_tables_bad_db1.q,unset_table_property.q,alter_non_native.q,nvl_mismatch_type.q,load_orc_negative3.q,authorization_create_role_no_admin.q,invalid_distinct1.q,authorization_grant_server.q,orc_type_promotion3_acid.q,show_tables_bad1.q,macro_unused_parameter.q,drop_invalid_constraint3.q,drop_partition_filter_failure.q,char_pad_convert_fail3.q,exim_23_import_exist_authfail.q,drop_invalid_constraint4.q,authorization_create_macro1.q,archive1.q,subquery_multiple_cols_in_select.q,change_hive_hdfs_session_path.q,udf_trunc_error3.q,invalid_variance_syntax.q,authorization_truncate_2.q,invalid_avg_syntax.q,invalid_select_column_with_tablename.q,mm_truncate_cols.q,groupby_grouping_sets1.q,druid_location.q,groupby2_multi_distinct.q,authorization_sba_drop_table.q,dynamic_partitions_with_whitelist.q,compare_string_bigint_2.q,udf_greatest_error_2.q,authorization_view_6.q,show_tablestatus.q,duplicate_alias_in_transform_schema.q,create_with_fk_uk_same_tab.q,udtf_not_supported3.q,alter_table_constraint_invalid_fk_col2.q,udtf_not_supported1.q,dbtxnmgr_notableunlock.q,ptf_negative_InvalidValueBoundary.q,alter_table_constraint_duplicate_pk.q,udf_printf_wrong4.q,create_view_failure9.q,udf_elt_wrong_type.q,selectDistinctStarNeg_1.q,invalid_mapjoin1.q,load_stored_as_dirs.q,input1.q,udf_sort_array_wrong1.q,invalid_distinct2.q,invalid_select_fn.q,authorization_role_grant_otherrole.q,archive4.q,load_nonpart_authfail.q,recursive_view.q,authorization_view_disable_cbo_1.q,desc_failure4.q,create_not_acid.q,udf_sort_array_wrong3.q,char_pad_convert_fail0.q,udf_map_values
_arg_type.q,alter_view_failure6_2.q,alter_partition_change_col_nonexist.q,update_non_acid_table.q,authorization_view_disable_cbo_5.q,ct_noperm_loc.q,interval_1.q,authorization_show_grant_otheruser_all.q,authorization_view_2.q,show_tables_bad2.q,groupby_rollup2.q,truncate_column_seqfile.q,create_view_failure5.q,authorization_create_view.q,ptf_window_boundaries.q,ctasnullcol.q,input_part0_neg_2.q,create_or_replace_view1.q,udf_max.q,exim_01_nonpart_over_loaded.q,msck_repair_1.q,orc_change_fileformat_acid.q,udf_nonexistent_resource.q,exim_19_external_over_existing.q,serde_regex2.q,msck_repair_2.q,exim_06_nonpart_noncompat_storage.q,illegal_partition_type4.q,udf_sort_array_by_wrong1.q,create_or_replace_view5.q,windowing_leadlag_in_udaf.q,avro_decimal.q,materialized_view_update.q

[jira] [Commented] (HIVE-18912) Stats task should set accurate flag to false on failure

2018-03-08 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16392354#comment-16392354
 ] 

Sergey Shelukhin commented on HIVE-18912:
-

I added a bug to StatsTask (some exception in the method that sets the stats 
up, starting with the above line).
When running a q file with this issue, StatsTask threw, but the insert results 
were still propagated to the table, and stats accurate was still true from the 
previous query, iirc.

> Stats task should set accurate flag to false on failure
> ---
>
> Key: HIVE-18912
> URL: https://issues.apache.org/jira/browse/HIVE-18912
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Priority: Major
>
> In my testing, I noticed that sometimes when the stats task fails, the 
> accurate=true flag is not cleared and the stats become misleading.
> See the code where it presets it, e.g. for a txn table.
> {noformat}
> StatsSetupConst.setBasicStatsState(parameters, StatsSetupConst.FALSE);
> {noformat}
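
A minimal sketch of the requested behavior; the placement and the helper name 
are hypothetical, only {{StatsSetupConst.setBasicStatsState}} is from the 
source:

{code:java}
try {
  gatherAndPersistStats(table);  // hypothetical stats-gathering step
} catch (Exception e) {
  // Clear the accurate flag so stale stats from a previous query are
  // no longer trusted.
  StatsSetupConst.setBasicStatsState(table.getParameters(),
      StatsSetupConst.FALSE);
  throw e;
}
{code}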



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18912) Stats task should set accurate flag to false on failure

2018-03-08 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16392337#comment-16392337
 ] 

Ashutosh Chauhan commented on HIVE-18912:
-

Do you have sequence of steps which resulted in this?

> Stats task should set accurate flag to false on failure
> ---
>
> Key: HIVE-18912
> URL: https://issues.apache.org/jira/browse/HIVE-18912
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Priority: Major
>
> In my testing, I noticed that sometimes when the stats task fails, the 
> accurate=true flag is not cleared and the stats become misleading.
> See the code where it presets it, e.g. for a txn table.
> {noformat}
> StatsSetupConst.setBasicStatsState(parameters, StatsSetupConst.FALSE);
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18899) Separate FetchWork required for each query that uses the results cache

2018-03-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16392336#comment-16392336
 ] 

Hive QA commented on HIVE-18899:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
55s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
59s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
34s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
52s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
12s{color} | {color:red} The patch generated 49 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 13m 22s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-9562/dev-support/hive-personality.sh
 |
| git revision | master / 73ccc44 |
| Default Java | 1.8.0_111 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9562/yetus/patch-asflicense-problems.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9562/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Separate FetchWork required for each query that uses the results cache
> --
>
> Key: HIVE-18899
> URL: https://issues.apache.org/jira/browse/HIVE-18899
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Jason Dere
>Assignee: Jason Dere
>Priority: Major
>  Labels: performance
> Attachments: HIVE-18899.1.patch, HIVE-18899.2.patch
>
>
> [~gopalv] found issues when running lots of concurrent queries against HS2 
> with the query cache. Looks like the FetchWork held by the results cache 
> cannot be shared between multiple queries because it contains a 
> ListSinkOperator that is used to hold the results of a fetch.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17178) Spark Partition Pruning Sink Operator can't target multiple Works

2018-03-08 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16392328#comment-16392328
 ] 

Sahil Takiar commented on HIVE-17178:
-

+1

> Spark Partition Pruning Sink Operator can't target multiple Works
> -
>
> Key: HIVE-17178
> URL: https://issues.apache.org/jira/browse/HIVE-17178
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Rui Li
>Priority: Major
> Attachments: HIVE-17178.1.patch, HIVE-17178.2.patch, 
> HIVE-17178.3.patch, HIVE-17178.4.patch, HIVE-17178.5.patch, HIVE-17178.6.patch
>
>
> A Spark Partition Pruning Sink Operator cannot be used to target multiple Map 
> Work objects. The entire DPP subtree (SEL-GBY-SPARKPRUNINGSINK) is duplicated 
> if a single table needs to be used to target multiple Map Works.
> The following query shows the issue:
> {code}
> set hive.spark.dynamic.partition.pruning=true;
> set hive.auto.convert.join=true;
> create table part_table_1 (col int) partitioned by (part_col int);
> create table part_table_2 (col int) partitioned by (part_col int);
> create table regular_table (col int);
> insert into table regular_table values (1);
> alter table part_table_1 add partition (part_col=1);
> insert into table part_table_1 partition (part_col=1) values (1), (2), (3), 
> (4);
> alter table part_table_1 add partition (part_col=2);
> insert into table part_table_1 partition (part_col=2) values (1), (2), (3), 
> (4);
> alter table part_table_2 add partition (part_col=1);
> insert into table part_table_2 partition (part_col=1) values (1), (2), (3), 
> (4);
> alter table part_table_2 add partition (part_col=2);
> insert into table part_table_2 partition (part_col=2) values (1), (2), (3), 
> (4);
> explain select * from regular_table, part_table_1, part_table_2 where 
> regular_table.col = part_table_1.part_col and regular_table.col = 
> part_table_2.part_col;
> {code}
> The explain plan is
> {code}
> STAGE DEPENDENCIES:
>   Stage-2 is a root stage
>   Stage-1 depends on stages: Stage-2
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-2
> Spark
>  A masked pattern was here 
>   Vertices:
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: regular_table
>   Statistics: Num rows: 1 Data size: 1 Basic stats: COMPLETE 
> Column stats: NONE
>   Filter Operator
> predicate: col is not null (type: boolean)
> Statistics: Num rows: 1 Data size: 1 Basic stats: 
> COMPLETE Column stats: NONE
> Select Operator
>   expressions: col (type: int)
>   outputColumnNames: _col0
>   Statistics: Num rows: 1 Data size: 1 Basic stats: 
> COMPLETE Column stats: NONE
>   Spark HashTable Sink Operator
> keys:
>   0 _col0 (type: int)
>   1 _col1 (type: int)
>   2 _col1 (type: int)
>   Select Operator
> expressions: _col0 (type: int)
> outputColumnNames: _col0
> Statistics: Num rows: 1 Data size: 1 Basic stats: 
> COMPLETE Column stats: NONE
> Group By Operator
>   keys: _col0 (type: int)
>   mode: hash
>   outputColumnNames: _col0
>   Statistics: Num rows: 1 Data size: 1 Basic stats: 
> COMPLETE Column stats: NONE
>   Spark Partition Pruning Sink Operator
> partition key expr: part_col
> Statistics: Num rows: 1 Data size: 1 Basic stats: 
> COMPLETE Column stats: NONE
> target column name: part_col
> target work: Map 2
>   Select Operator
> expressions: _col0 (type: int)
> outputColumnNames: _col0
> Statistics: Num rows: 1 Data size: 1 Basic stats: 
> COMPLETE Column stats: NONE
> Group By Operator
>   keys: _col0 (type: int)
>   mode: hash
>   outputColumnNames: _col0
>   Statistics: Num rows: 1 Data size: 1 Basic stats: 
> COMPLETE Column stats: NONE
>   Spark Partition Pruning Sink Operator
> partition key expr: part_col
> Statistics: Num rows: 1 Data size: 1 Basic stats: 
> COMPLETE Column stats: NO

[jira] [Commented] (HIVE-16855) org.apache.hadoop.hive.ql.exec.mr.HashTableLoader Improvements

2018-03-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16392325#comment-16392325
 ] 

Hive QA commented on HIVE-16855:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12913655/HIVE-16855.3.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 103 failed/errored test(s), 13348 tests 
executed
*Failed tests:*
{noformat}
TestJdbcWithMiniHA - did not produce a TEST-*.xml file (likely timed out) 
(batchId=239)
TestNegativeCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=94)

[nopart_insert.q,insert_into_with_schema.q,input41.q,having1.q,create_table_failure3.q,default_constraint_invalid_default_value.q,database_drop_not_empty_restrict.q,windowing_after_orderby.q,orderbysortby.q,subquery_select_distinct2.q,authorization_uri_alterpart_loc.q,udf_last_day_error_1.q,constraint_duplicate_name.q,create_table_failure4.q,alter_tableprops_external_with_notnull_constraint.q,semijoin5.q,udf_format_number_wrong4.q,deletejar.q,exim_11_nonpart_noncompat_sorting.q,show_tables_bad_db2.q,drop_func_nonexistent.q,nopart_load.q,alter_table_non_partitioned_table_cascade.q,load_wrong_fileformat.q,lockneg_try_db_lock_conflict.q,udf_field_wrong_args_len.q,create_table_failure2.q,create_with_fk_constraints_enforced.q,groupby2_map_skew_multi_distinct.q,udf_min.q,authorization_update_noupdatepriv.q,show_columns2.q,authorization_insert_noselectpriv.q,orc_replace_columns3_acid.q,compare_double_bigint.q,authorization_set_nonexistent_conf.q,alter_rename_partition_failure3.q,split_sample_wrong_format2.q,create_with_fk_pk_same_tab.q,compare_double_bigint_2.q,authorization_show_roles_no_admin.q,materialized_view_authorization_rebuild_no_grant.q,unionLimit.q,authorization_revoke_table_fail2.q,authorization_insert_noinspriv.q,duplicate_insert3.q,authorization_desc_table_nosel.q,stats_noscan_non_native.q,orc_change_serde_acid.q,create_or_replace_view7.q,exim_07_nonpart_noncompat_ifof.q,create_with_unique_constraints_enforced.q,udf_concat_ws_wrong2.q,fileformat_bad_class.q,merge_negative_2.q,exim_15_part_nonpart.q,authorization_not_owner_drop_view.q,external1.q,authorization_uri_insert.q,create_with_fk_wrong_ref.q,columnstats_tbllvl_incorrect_column.q,authorization_show_parts_nosel.q,authorization_not_owner_drop_tab.q,external2.q,authorization_deletejar.q,temp_table_create_like_partitions.q,udf_greatest_error_1.q,ptf_negative_AggrFuncsWithNoGBYNoPartDef.q,alter_view_as_select_not_exist.q,touch1.q,groupby3_map_skew_multi_distinct.q,insert_into_notnull_constraint.q,exchange_partition_neg_partition_missing.q,groupby_cube_multi_gby.q,columnstats_tbllvl.q,drop_invalid_constraint2.q,alter_table_add_partition.q,update_not_acid.q,archive5.q,alter_table_constraint_invalid_pk_col.q,ivyDownload.q,udf_instr_wrong_type.q,bad_sample_clause.q,authorization_not_owner_drop_tab2.q,authorization_alter_db_owner.q,show_columns1.q,orc_type_promotion3.q,create_view_failure8.q,strict_join.q,udf_add_months_error_1.q,groupby_cube2.q,groupby_cube1.q,groupby_rollup1.q,genericFileFormat.q,invalid_cast_from_binary_4.q,drop_invalid_constraint1.q,serde_regex.q,show_partitions1.q,invalid_cast_from_binary_6.q,create_with_multi_pk_constraint.q,udf_field_wrong_type.q,groupby_grouping_sets4.q,groupby_grouping_sets3.q,insertsel_fail.q,udf_locate_wrong_type.q,orc_type_promotion1_acid.q,set_table_property.q,create_or_replace_view2.q,groupby_grouping_sets2.q,alter_view_failure.q,distinct_windowing_failure1.q,invalid_t_alter2.q,alter_table_constraint_invalid_fk_col1.q,invalid_varchar_length_2.q,authorization_show_grant_otheruser_alltabs.q,subquery_windowing_corr.q,compact_non_acid_table.q,authorization_view_4.q,authorization_disallow_transform.q,materialized_view_authorization_rebuild_other.q,authorization_fail_4.q,dbtxnmgr_nodblock.q,set_hiveconf_internal_variable1.q,input_part0_neg.q,udf_printf_wrong3.q,load_orc_negative2.q,druid_buckets.q,archive2.q,authorization_addjar.q,invalid_sum_syntax.q,insert_into_with_schema1.q,udf_add_months_error_2.q,dyn_part_max_per_node.q,authorization_revoke_table_fail1.q,udf_printf_wrong2.q,archive_multi3.q
,udf_printf_wrong1.q,subquery_subquery_chain.q,authorization_view_disable_cbo_4.q,no_matching_udf.q,create_view_failure7.q,drop_native_udf.q,truncate_column_list_bucketing.q,authorization_uri_add_partition.q,authorization_view_disable_cbo_3.q,bad_exec_hooks.q,authorization_view_disable_cbo_2.q,fetchtask_ioexception.q,char_pad_convert_fail2.q,authorization_set_role_neg1.q,serde_regex3.q,authorization_delete_nodeletepriv.q,materialized_view_delete.q,create_or_replace_view6.q,bucket_mapjoin_wrong_table_metadata_2.q,msck_repair_3.q,udf_sort_array_by_wrong2.q,local_mapred_error_cache.q,alter_external_acid.q,mm_concatenate.q,authorization_fail_3.q,set_hiveconf_internal_variable0.q,udf_last_d

[jira] [Commented] (HIVE-18436) Upgrade to Spark 2.3.0

2018-03-08 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16392316#comment-16392316
 ] 

Rui Li commented on HIVE-18436:
---

Got it. Thanks for explaining. +1

> Upgrade to Spark 2.3.0
> --
>
> Key: HIVE-18436
> URL: https://issues.apache.org/jira/browse/HIVE-18436
> Project: Hive
>  Issue Type: Task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-18436.1.patch, HIVE-18436.2.patch, 
> HIVE-18436.3.patch
>
>
> Branching has been completed. Release candidates should be published soon. 
> Might be a while before the actual release, but at least we get to identify 
> any issues early.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18093) Improve logging when HoS application is killed

2018-03-08 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-18093:

Attachment: HIVE-18093.2.patch

> Improve logging when HoS application is killed
> --
>
> Key: HIVE-18093
> URL: https://issues.apache.org/jira/browse/HIVE-18093
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-18093.1.patch, HIVE-18093.2.patch
>
>
> When a HoS job is explicitly killed by a user (via a yarn command), the 
> logs just say "RPC channel closed"



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18915) Better client logging when a HoS session can't be opened

2018-03-08 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-18915:

Component/s: Spark

> Better client logging when a HoS session can't be opened
> 
>
> Key: HIVE-18915
> URL: https://issues.apache.org/jira/browse/HIVE-18915
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Priority: Major
>
> Users just get a {{FAILED: Execution Error, return code 30041 from 
> org.apache.hadoop.hive.ql.exec.spark.SparkTask. Failed to create Spark client 
> for Spark session [id]}} when a HoS session can't be opened; it would be better 
> to include some detail about why the session could not be opened.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18916) SparkClientImpl doesn't error out if spark-submit fails

2018-03-08 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-18916:

Component/s: Spark

> SparkClientImpl doesn't error out if spark-submit fails
> ---
>
> Key: HIVE-18916
> URL: https://issues.apache.org/jira/browse/HIVE-18916
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Priority: Major
>
> If {{spark-submit}} returns a non-zero exit code, {{SparkClientImpl}} will 
> simply log the exit code, but won't throw an error. Eventually, the 
> connection timeout will get triggered and an exception like {{Timed out 
> waiting for client connection}} will be logged, which is pretty misleading.
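
A minimal sketch of the fail-fast behavior the report asks for, assuming a plain {{java.lang.Process}} handle for the child {{spark-submit}} process; the class and method names are illustrative, not Hive's actual SparkClientImpl API:

{code}
import java.io.IOException;

public class SparkSubmitWatcher {
  // Illustrative helper: surface a non-zero spark-submit exit code as an
  // error instead of only logging it and waiting for the connection timeout.
  public static void failFast(Process sparkSubmit) throws IOException, InterruptedException {
    int code = sparkSubmit.waitFor(); // block until the child process exits
    if (code != 0) {
      throw new IOException("spark-submit exited with code " + code);
    }
  }
}
{code}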



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-15218) Kryo Exception on subsequent run of a query in LLAP mode

2018-03-08 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16392297#comment-16392297
 ] 

Sergey Shelukhin commented on HIVE-15218:
-

Is there a reason this was never committed?

> Kryo Exception on subsequent run of a query in LLAP mode
> 
>
> Key: HIVE-15218
> URL: https://issues.apache.org/jira/browse/HIVE-15218
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Nita Dembla
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-15218.1.patch, HIVE-15218.2.patch, 
> HIVE-15218.branch-1.patch
>
>
> Following exception is observed when running TPCDS query19 during concurrency 
> test
> {code}
> Vertex failed, vertexName=Map 3, vertexId=vertex_1477340478603_0610_9_05, 
> diagnostics=[Task failed, taskId=task_1477340478603_0610_9_05_06, 
> diagnostics=[TaskAttempt 0 killed, TaskAttempt 1 failed, info=[Error: Error 
> while running task ( failure ) : 
> attempt_1477340478603_0610_9_05_06_1:java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: 
> Failed to load plan: 
> hdfs://cn105-10.l42scl.hortonworks.com:8020/tmp/hive/ndembla/0559ce24-663e-482a-a0ea-106d220b53be/hi...
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:211)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:168)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:370)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at 
> org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.RuntimeException: Failed to load plan: 
> hdfs://cn105-10.l42scl.hortonworks.com:8020/tmp/hive/ndembla/0559ce24-663e-482a-a0ea-106d220b53be/hi...
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ObjectCache.retrieve(ObjectCache.java:57)
>   at 
> org.apache.hadoop.hive.ql.exec.ObjectCacheWrapper.retrieve(ObjectCacheWrapper.java:40)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:129)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:184)
>   ... 15 more
> Caused by: java.lang.RuntimeException: Failed to load plan: 
> hdfs://cn105-10.l42scl.hortonworks.com:8020/tmp/hive/ndembla/0559ce24-663e-482a-a0ea-106d220b53be/hi...
>   at org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:469)
>   at org.apache.hadoop.hive.ql.exec.Utilities.getMapWork(Utilities.java:305)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor$1.call(MapRecordProcessor.java:132)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ObjectCache.retrieve(ObjectCache.java:55)
>   ... 18 more
> Caused by: org.apache.hive.com.esotericsoftware.kryo.KryoException: 
> Encountered unregistered class ID: 63
> Serialization trace:
> statistics (org.apache.hadoop.hive.ql.plan.TableScanDesc)
> conf (org.apache.hadoop.hive.ql.exec.TableScanOperator)
> aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:137)
>   at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:670)
>   at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readClass(SerializationUtilities.java:182)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:118)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)
>   at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:708)
>   at 
> org.apache.hadoop.hive.ql.exec.SerializationUtili

[jira] [Commented] (HIVE-16855) org.apache.hadoop.hive.ql.exec.mr.HashTableLoader Improvements

2018-03-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16392263#comment-16392263
 ] 

Hive QA commented on HIVE-16855:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
 9s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
0s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
36s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
53s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
13s{color} | {color:red} The patch generated 49 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 13m 52s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-9561/dev-support/hive-personality.sh
 |
| git revision | master / 73ccc44 |
| Default Java | 1.8.0_111 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9561/yetus/patch-asflicense-problems.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9561/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> org.apache.hadoop.hive.ql.exec.mr.HashTableLoader Improvements
> --
>
> Key: HIVE-16855
> URL: https://issues.apache.org/jira/browse/HIVE-16855
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.1.1, 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-16855.1.patch, HIVE-16855.2.patch, 
> HIVE-16855.3.patch
>
>
> # Improve (Simplify) Logging
> # Remove the custom buffer size for {{BufferedInputStream}} and instead rely on 
> the JVM default, which is often larger these days (8192 bytes); see the sketch 
> below
> # Simplify looping logic
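
As a concrete illustration of item 2 above, a minimal sketch contrasting an explicit buffer size with the JDK default (8192 bytes in current JDKs); the input path is just a placeholder:

{code}
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;

public class BufferDefaultDemo {
  public static void main(String[] args) throws IOException {
    String path = args[0]; // placeholder input file
    // Before: a hand-picked buffer size, possibly smaller than the JDK default.
    try (BufferedInputStream in = new BufferedInputStream(new FileInputStream(path), 4096)) {
      in.read();
    }
    // After: no size argument, so the JDK default (8192 bytes) applies.
    try (BufferedInputStream in = new BufferedInputStream(new FileInputStream(path))) {
      in.read();
    }
  }
}
{code}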



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-18920) CBO: Initialize the Janino providers ahead of 1st query

2018-03-08 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V reassigned HIVE-18920:
--

Assignee: Jesus Camacho Rodriguez

> CBO: Initialize the Janino providers ahead of 1st query
> ---
>
> Key: HIVE-18920
> URL: https://issues.apache.org/jira/browse/HIVE-18920
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>
> Hive's Calcite metadata providers are compiled when the 1st query comes in.
> If a second query arrives before the 1st one has built a metadata provider, 
> it will also try to build one, because the cache is not populated yet.
> With 1024 concurrent users, it takes 6 minutes for the 1st query to finish, 
> fighting all the other queries that are trying to load that cache.
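
The title proposes eager initialization; a related remedy for this kind of cache stampede is to memoize the expensive build behind a single future so that concurrent callers wait on it instead of recompiling. A hedged sketch of the latter with illustrative names (not the actual Hive/Calcite code):

{code}
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

public class OnceCache<T> {
  private final AtomicReference<CompletableFuture<T>> slot = new AtomicReference<>();

  // Exactly one caller runs the expensive build; every other caller blocks
  // on the same future instead of redoing the compilation.
  public T get(Supplier<T> expensiveBuild) throws ExecutionException, InterruptedException {
    CompletableFuture<T> f = slot.get();
    if (f == null) {
      CompletableFuture<T> mine = new CompletableFuture<>();
      if (slot.compareAndSet(null, mine)) {
        try {
          mine.complete(expensiveBuild.get()); // the single build
        } catch (RuntimeException e) {
          mine.completeExceptionally(e);
        }
        f = mine;
      } else {
        f = slot.get(); // lost the race; reuse the winner's future
      }
    }
    return f.get();
  }
}
{code}

Calling {{get}} once at service startup, as the title suggests, would also remove the first-query penalty.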



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18907) Create utility to fix acid key index issue from HIVE-18817

2018-03-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16392237#comment-16392237
 ] 

Hive QA commented on HIVE-18907:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12913674/HIVE-18907.3.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 21 failed/errored test(s), 12947 tests 
executed
*Failed tests:*
{noformat}
TestMinimrCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=92)

[infer_bucket_sort_num_buckets.q,infer_bucket_sort_reducers_power_two.q,parallel_orderby.q,bucket_num_reducers_acid.q,infer_bucket_sort_map_operators.q,infer_bucket_sort_merge.q,root_dir_external_table.q,infer_bucket_sort_dyn_part.q,udf_using.q,bucket_num_reducers_acid2.q]
TestNegativeCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=94)

[nopart_insert.q,insert_into_with_schema.q,input41.q,having1.q,create_table_failure3.q,default_constraint_invalid_default_value.q,database_drop_not_empty_restrict.q,windowing_after_orderby.q,orderbysortby.q,subquery_select_distinct2.q,authorization_uri_alterpart_loc.q,udf_last_day_error_1.q,constraint_duplicate_name.q,create_table_failure4.q,alter_tableprops_external_with_notnull_constraint.q,semijoin5.q,udf_format_number_wrong4.q,deletejar.q,exim_11_nonpart_noncompat_sorting.q,show_tables_bad_db2.q,drop_func_nonexistent.q,nopart_load.q,alter_table_non_partitioned_table_cascade.q,load_wrong_fileformat.q,lockneg_try_db_lock_conflict.q,udf_field_wrong_args_len.q,create_table_failure2.q,create_with_fk_constraints_enforced.q,groupby2_map_skew_multi_distinct.q,udf_min.q,authorization_update_noupdatepriv.q,show_columns2.q,authorization_insert_noselectpriv.q,orc_replace_columns3_acid.q,compare_double_bigint.q,authorization_set_nonexistent_conf.q,alter_rename_partition_failure3.q,split_sample_wrong_format2.q,create_with_fk_pk_same_tab.q,compare_double_bigint_2.q,authorization_show_roles_no_admin.q,materialized_view_authorization_rebuild_no_grant.q,unionLimit.q,authorization_revoke_table_fail2.q,authorization_insert_noinspriv.q,duplicate_insert3.q,authorization_desc_table_nosel.q,stats_noscan_non_native.q,orc_change_serde_acid.q,create_or_replace_view7.q,exim_07_nonpart_noncompat_ifof.q,create_with_unique_constraints_enforced.q,udf_concat_ws_wrong2.q,fileformat_bad_class.q,merge_negative_2.q,exim_15_part_nonpart.q,authorization_not_owner_drop_view.q,external1.q,authorization_uri_insert.q,create_with_fk_wrong_ref.q,columnstats_tbllvl_incorrect_column.q,authorization_show_parts_nosel.q,authorization_not_owner_drop_tab.q,external2.q,authorization_deletejar.q,temp_table_create_like_partitions.q,udf_greatest_error_1.q,ptf_negative_AggrFuncsWithNoGBYNoPartDef.q,alter_view_as_select_not_exist.q,touch1.q,groupby3_map_skew_multi_distinct.q,insert_into_notnull_constraint.q,exchange_partition_neg_partition_missing.q,groupby_cube_multi_gby.q,columnstats_tbllvl.q,drop_invalid_constraint2.q,alter_table_add_partition.q,update_not_acid.q,archive5.q,alter_table_constraint_invalid_pk_col.q,ivyDownload.q,udf_instr_wrong_type.q,bad_sample_clause.q,authorization_not_owner_drop_tab2.q,authorization_alter_db_owner.q,show_columns1.q,orc_type_promotion3.q,create_view_failure8.q,strict_join.q,udf_add_months_error_1.q,groupby_cube2.q,groupby_cube1.q,groupby_rollup1.q,genericFileFormat.q,invalid_cast_from_binary_4.q,drop_invalid_constraint1.q,serde_regex.q,show_partitions1.q,invalid_cast_from_binary_6.q,create_with_multi_pk_constraint.q,udf_field_wrong_type.q,groupby_grouping_sets4.q,groupby_grouping_sets3.q,insertsel_fail.q,udf_locate_wrong_type.q,orc_type_promotion1_acid.q,set_table_property.q,create_or_replace_view2.q,groupby_grouping_sets2.q,alter_view_failure.q,distinct_windowing_failure1.q,invalid_t_alter2.q,alter_table_constraint_invalid_fk_col1.q,invalid_varchar_length_2.q,authorization_show_grant_otheruser_alltabs.q,subquery_windowing_corr.q,compact_non_acid_table.q,authorization_view_4.q,authorization_disallow_transform.q,materialized_view_authorization_rebuild_other.q,authorization_fail_4.q,dbtxnmgr_nodblock.q,set_hiveconf_internal_variable1.q,input_part0_neg.q,udf_printf_wrong3.q,load_orc_negative2.q,druid_buckets.q,archive2.q,authorization_addjar.q,invalid_sum_syntax.q,insert_into_with_schema1.q,udf_add_months_error_2.q,dyn_part_max_per_node.q,authorization_revoke_table_fail1.q,udf_printf_wrong2.q,archive_multi3.q
,udf_printf_wrong1.q,subquery_subquery_chain.q,authorization_view_disable_cbo_4.q,no_matching_udf.q,create_view_failure7.q,drop_native_udf.q,truncate_column_list_bucketing.q,authorization_uri_add_partition.q,authorization_view_disable_cbo_3.q,bad_exec_hooks.q,authorization_view_disable_cbo_2.q,fetchtask_ioexception.q,char_pad_convert_fail2.q,authorization_set_role_neg1.q,serde_regex3.q,authorization_delete

[jira] [Updated] (HIVE-18571) stats issues for MM tables; ACID doesn't check state for CTAS

2018-03-08 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-18571:
--
Component/s: Transactions

> stats issues for MM tables; ACID doesn't check state for CTAS
> -
>
> Key: HIVE-18571
> URL: https://issues.apache.org/jira/browse/HIVE-18571
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-18571.01.patch, HIVE-18571.02.patch, 
> HIVE-18571.03.patch, HIVE-18571.04.patch, HIVE-18571.05.patch, 
> HIVE-18571.06.patch, HIVE-18571.07.patch, HIVE-18571.patch
>
>
> There are multiple stats aggregation issues with MM tables.
> Some simple stats are double-counted, and some stats (simple stats) are 
> invalid for ACID table dirs altogether. 
> I have a patch almost ready; need to fix some more stuff and clean up.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18911) LOAD.. code for MM has some suspect/dead code

2018-03-08 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-18911:
--
Component/s: Transactions

> LOAD.. code for MM has some suspect/dead code
> -
>
> Key: HIVE-18911
> URL: https://issues.apache.org/jira/browse/HIVE-18911
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Sergey Shelukhin
>Priority: Major
>
> Discovered in HIVE-18571 and added TODOs that need to be addressed.
> E.g. {noformat}
> if (isMmTableWrite) {
>// We will load into MM directory, and delete from the parent if 
> needed.
>   // TODO: this looks invalid after ACID integration. What about base 
> dirs?
>destPath = new Path(destPath, AcidUtils.deltaSubdir(writeId, 
> writeId, stmtId));
> ...
>  // TODO: loadFileType for MM table will no longer be REPLACE_ALL
>filter = (loadFileType == LoadFileType.REPLACE_ALL)
> {noformat}
> There are 2 places like that.
> Also, replaceFiles has an isMmTableWrite flag that should no longer be needed 
> (since for a transactional table we should never replace files). Either 
> there's some invalid code path that relies on it (load table?), or it is just 
> unused and needs to be removed.
> The flag is also used in 2 places with the comment "TODO: this should never run 
> for MM tables anymore. Remove the flag, and maybe the filter?"



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18918) Bad error message in CompactorMR.launchCompactionJob()

2018-03-08 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16392224#comment-16392224
 ] 

Jason Dere commented on HIVE-18918:
---

+1

> Bad error message in CompactorMR.launchCompactionJob()
> --
>
> Key: HIVE-18918
> URL: https://issues.apache.org/jira/browse/HIVE-18918
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.2.2
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Attachments: HIVE-18918.01.patch
>
>
> {noformat}
>   rj.waitForCompletion();
>   if (!rj.isSuccessful()) {
> throw new IOException(compactionType == CompactionType.MAJOR ? 
> "Major" : "Minor" +
>" compactor job failed for " + jobName + "! Hadoop JobId: " + 
> rj.getID());
>   }
> {noformat}
> produces no useful info in case of Major compaction
> {noformat}
> 2018-02-28 00:59:16,416 ERROR [gdpr1-61]: compactor.Worker 
> (Worker.java:run(191)) - Caught exception while trying to compact 
> id:38602,dbname:audit,tableName:COMP_ENTRY_AF_A,partN\
> ame:partition_dt=2017-04-11,state:^@,type:MAJOR,properties:null,runAs:null,tooManyAborts:false,highestTxnId:0.
>   Marking failed to avoid repeated failures, java.io.IOException: Ma\
> jor
> at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.launchCompactionJob(CompactorMR.java:314)
> at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.run(CompactorMR.java:269)
> at 
> org.apache.hadoop.hive.ql.txn.compactor.Worker$1.run(Worker.java:175)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
> at org.apache.hadoop.hive.ql.txn.compactor.Worker.run(Worker.java:172)
> {noformat}
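
The root cause is Java operator precedence: {{+}} binds tighter than {{? :}}, so the whole message tail is concatenated onto "Minor" inside the else branch, and a MAJOR compaction throws a bare "Major" (as in the log above, where the message wraps as "Ma\jor"). A self-contained sketch of the behavior and the likely shape of the fix (the actual patch is not shown here):

{code}
public class TernaryPrecedenceDemo {
  public static void main(String[] args) {
    boolean major = true;
    // Parses as: major ? "Major" : ("Minor" + " compactor job failed!"),
    // so a major compaction yields just "Major".
    String broken = major ? "Major" : "Minor" + " compactor job failed!";
    // Parenthesizing the ternary restores the intended message.
    String fixed = (major ? "Major" : "Minor") + " compactor job failed!";
    System.out.println(broken); // Major
    System.out.println(fixed);  // Major compactor job failed!
  }
}
{code}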



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18918) Bad error message in CompactorMR.launchCompactionJob()

2018-03-08 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16392219#comment-16392219
 ] 

Eugene Koifman commented on HIVE-18918:
---

[~jdere] could you review, please?

> Bad error message in CompactorMR.launchCompactionJob()
> --
>
> Key: HIVE-18918
> URL: https://issues.apache.org/jira/browse/HIVE-18918
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.2.2
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Attachments: HIVE-18918.01.patch
>
>
> {noformat}
>   rj.waitForCompletion();
>   if (!rj.isSuccessful()) {
> throw new IOException(compactionType == CompactionType.MAJOR ? 
> "Major" : "Minor" +
>" compactor job failed for " + jobName + "! Hadoop JobId: " + 
> rj.getID());
>   }
> {noformat}
> produces no useful info in case of Major compaction
> {noformat}
> 2018-02-28 00:59:16,416 ERROR [gdpr1-61]: compactor.Worker 
> (Worker.java:run(191)) - Caught exception while trying to compact 
> id:38602,dbname:audit,tableName:COMP_ENTRY_AF_A,partN\
> ame:partition_dt=2017-04-11,state:^@,type:MAJOR,properties:null,runAs:null,tooManyAborts:false,highestTxnId:0.
>   Marking failed to avoid repeated failures, java.io.IOException: Ma\
> jor
> at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.launchCompactionJob(CompactorMR.java:314)
> at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.run(CompactorMR.java:269)
> at 
> org.apache.hadoop.hive.ql.txn.compactor.Worker$1.run(Worker.java:175)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
> at org.apache.hadoop.hive.ql.txn.compactor.Worker.run(Worker.java:172)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18918) Bad error message in CompactorMR.launchCompactionJob()

2018-03-08 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-18918:
--
Description: 
{noformat}
  rj.waitForCompletion();
  if (!rj.isSuccessful()) {
throw new IOException(compactionType == CompactionType.MAJOR ? "Major" 
: "Minor" +
   " compactor job failed for " + jobName + "! Hadoop JobId: " + 
rj.getID());
  }
{noformat}

produces no useful info in case of Major compaction

{noformat}
2018-02-28 00:59:16,416 ERROR [gdpr1-61]: compactor.Worker 
(Worker.java:run(191)) - Caught exception while trying to compact 
id:38602,dbname:audit,tableName:COMP_ENTRY_AF_A,partN\
ame:partition_dt=2017-04-11,state:^@,type:MAJOR,properties:null,runAs:null,tooManyAborts:false,highestTxnId:0.
  Marking failed to avoid repeated failures, java.io.IOException: Ma\
jor
at 
org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.launchCompactionJob(CompactorMR.java:314)
at 
org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.run(CompactorMR.java:269)
at org.apache.hadoop.hive.ql.txn.compactor.Worker$1.run(Worker.java:175)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
at org.apache.hadoop.hive.ql.txn.compactor.Worker.run(Worker.java:172)

{noformat}


  was:
{noformat}
  rj.waitForCompletion();
  if (!rj.isSuccessful()) {
throw new IOException(compactionType == CompactionType.MAJOR ? "Major" 
: "Minor" +
   " compactor job failed for " + jobName + "! Hadoop JobId: " + 
rj.getID());
  }
{noformat}

produces no useful info in case of Major compaction


> Bad error message in CompactorMR.launchCompactionJob()
> --
>
> Key: HIVE-18918
> URL: https://issues.apache.org/jira/browse/HIVE-18918
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.2.2
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Attachments: HIVE-18918.01.patch
>
>
> {noformat}
>   rj.waitForCompletion();
>   if (!rj.isSuccessful()) {
> throw new IOException(compactionType == CompactionType.MAJOR ? 
> "Major" : "Minor" +
>" compactor job failed for " + jobName + "! Hadoop JobId: " + 
> rj.getID());
>   }
> {noformat}
> produces no useful info in case of Major compaction
> {noformat}
> 2018-02-28 00:59:16,416 ERROR [gdpr1-61]: compactor.Worker 
> (Worker.java:run(191)) - Caught exception while trying to compact 
> id:38602,dbname:audit,tableName:COMP_ENTRY_AF_A,partN\
> ame:partition_dt=2017-04-11,state:^@,type:MAJOR,properties:null,runAs:null,tooManyAborts:false,highestTxnId:0.
>   Marking failed to avoid repeated failures, java.io.IOException: Ma\
> jor
> at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.launchCompactionJob(CompactorMR.java:314)
> at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.run(CompactorMR.java:269)
> at 
> org.apache.hadoop.hive.ql.txn.compactor.Worker$1.run(Worker.java:175)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
> at org.apache.hadoop.hive.ql.txn.compactor.Worker.run(Worker.java:172)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18918) Bad error message in CompactorMR.launchCompactionJob()

2018-03-08 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-18918:
--
Status: Patch Available  (was: Open)

> Bad error message in CompactorMR.launchCompactionJob()
> --
>
> Key: HIVE-18918
> URL: https://issues.apache.org/jira/browse/HIVE-18918
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.2.2
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Attachments: HIVE-18918.01.patch
>
>
> {noformat}
>   rj.waitForCompletion();
>   if (!rj.isSuccessful()) {
> throw new IOException(compactionType == CompactionType.MAJOR ? 
> "Major" : "Minor" +
>" compactor job failed for " + jobName + "! Hadoop JobId: " + 
> rj.getID());
>   }
> {noformat}
> produces no useful info in case of Major compaction



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18918) Bad error message in CompactorMR.launchCompactionJob()

2018-03-08 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-18918:
--
Attachment: HIVE-18918.01.patch

> Bad error message in CompactorMR.launchCompactionJob()
> --
>
> Key: HIVE-18918
> URL: https://issues.apache.org/jira/browse/HIVE-18918
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.2.2
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Attachments: HIVE-18918.01.patch
>
>
> {noformat}
>   rj.waitForCompletion();
>   if (!rj.isSuccessful()) {
> throw new IOException(compactionType == CompactionType.MAJOR ? 
> "Major" : "Minor" +
>" compactor job failed for " + jobName + "! Hadoop JobId: " + 
> rj.getID());
>   }
> {noformat}
> produces no useful info in case of Major compaction



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18907) Create utility to fix acid key index issue from HIVE-18817

2018-03-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16392204#comment-16392204
 ] 

Hive QA commented on HIVE-18907:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
1s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
1s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
48s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
28s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
46s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
17s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  5m 
55s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
6s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  5m 
36s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
46s{color} | {color:red} root: The patch generated 28 new + 0 unchanged - 0 
fixed = 28 total (was 0) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
38s{color} | {color:red} ql: The patch generated 28 new + 0 unchanged - 0 fixed 
= 28 total (was 0) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  6m 
27s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
13s{color} | {color:red} The patch generated 49 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 43m 11s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-9560/dev-support/hive-personality.sh
 |
| git revision | master / 73ccc44 |
| Default Java | 1.8.0_111 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9560/yetus/diff-checkstyle-root.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9560/yetus/diff-checkstyle-ql.txt
 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9560/yetus/patch-asflicense-problems.txt
 |
| modules | C: . ql U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9560/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Create utility to fix acid key index issue from HIVE-18817
> --
>
> Key: HIVE-18907
> URL: https://issues.apache.org/jira/browse/HIVE-18907
> Project: Hive
>  Issue Type: Bug
>Reporter: Jason Dere
>Assignee: Jason Dere
>Priority: Major
> Attachments: HIVE-18907.1.patch, HIVE-18907.2.patch, 
> HIVE-18907.3.patch
>
>
> While HIVE-18817 will create new ORC Acid files from hitting the 
> ArrayIndexOutOfBounds issue, existing files created before HIVE-18817 will 
> still cause this issue. If there are delta directories then one way to 
> generate new files is to perform a major compaction. But this does not work 
> if there are no delta directories for the table/partition.
> Add a tool to fix the Acid ORC files directly in the case that a compaction 
> cannot be performed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-18918) Bad error message in CompactorMR.launchCompactionJob()

2018-03-08 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman reassigned HIVE-18918:
-


> Bad error message in CompactorMR.launchCompactionJob()
> --
>
> Key: HIVE-18918
> URL: https://issues.apache.org/jira/browse/HIVE-18918
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.2.2
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
>
> {noformat}
>   rj.waitForCompletion();
>   if (!rj.isSuccessful()) {
> throw new IOException(compactionType == CompactionType.MAJOR ? 
> "Major" : "Minor" +
>" compactor job failed for " + jobName + "! Hadoop JobId: " + 
> rj.getID());
>   }
> {noformat}
> produces no useful info in case of Major compaction



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18831) Differentiate errors that are thrown by Spark tasks

2018-03-08 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-18831:

Attachment: HIVE-18831.3.patch

> Differentiate errors that are thrown by Spark tasks
> ---
>
> Key: HIVE-18831
> URL: https://issues.apache.org/jira/browse/HIVE-18831
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-18831.1.patch, HIVE-18831.2.patch, 
> HIVE-18831.3.patch
>
>
> We propagate exceptions from Spark task failures to the client well, but we 
> don't differentiate between errors from HS2 / RSC vs. errors thrown by 
> individual tasks.
> The main motivation is that when the client sees a propagated Spark exception it's 
> difficult to know what part of the execution threw the exception.
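
One way to carry that distinction across the boundary is to tag propagated exceptions with their origin; a hedged sketch with illustrative names (not the actual Hive classes):

{code}
// Illustrative wrapper: records where in the execution a failure occurred so
// the client can tell a driver-side (HS2/RSC) error from a Spark task error.
public class SourcedSparkException extends RuntimeException {
  public enum Source { HIVE_SERVER2, REMOTE_SPARK_CONTEXT, SPARK_TASK }

  private final Source source;

  public SourcedSparkException(Source source, Throwable cause) {
    super("[" + source + "] " + cause.getMessage(), cause);
    this.source = source;
  }

  public Source getSource() {
    return source;
  }
}
{code}

A task runner would then wrap failures as, e.g., {{new SourcedSparkException(Source.SPARK_TASK, t)}} before propagating them.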



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18831) Differentiate errors that are thrown by Spark tasks

2018-03-08 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16392163#comment-16392163
 ] 

Sahil Takiar commented on HIVE-18831:
-

Fixing test failure.

> Differentiate errors that are thrown by Spark tasks
> ---
>
> Key: HIVE-18831
> URL: https://issues.apache.org/jira/browse/HIVE-18831
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-18831.1.patch, HIVE-18831.2.patch, 
> HIVE-18831.3.patch
>
>
> We propagate exceptions from Spark task failures to the client well, but we 
> don't differentiate between errors from HS2 / RSC vs. errors thrown by 
> individual tasks.
> The main motivation is that when the client sees a propagated Spark exception it's 
> difficult to know what part of the execution threw the exception.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-14792) AvroSerde reads the remote schema-file at least once per mapper, per table reference.

2018-03-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16392156#comment-16392156
 ] 

Hive QA commented on HIVE-14792:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12913648/HIVE-14792.4.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 22 failed/errored test(s), 12951 tests 
executed
*Failed tests:*
{noformat}
TestNegativeCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=94)

[nopart_insert.q,insert_into_with_schema.q,input41.q,having1.q,create_table_failure3.q,default_constraint_invalid_default_value.q,database_drop_not_empty_restrict.q,windowing_after_orderby.q,orderbysortby.q,subquery_select_distinct2.q,authorization_uri_alterpart_loc.q,udf_last_day_error_1.q,constraint_duplicate_name.q,create_table_failure4.q,alter_tableprops_external_with_notnull_constraint.q,semijoin5.q,udf_format_number_wrong4.q,deletejar.q,exim_11_nonpart_noncompat_sorting.q,show_tables_bad_db2.q,drop_func_nonexistent.q,nopart_load.q,alter_table_non_partitioned_table_cascade.q,load_wrong_fileformat.q,lockneg_try_db_lock_conflict.q,udf_field_wrong_args_len.q,create_table_failure2.q,create_with_fk_constraints_enforced.q,groupby2_map_skew_multi_distinct.q,udf_min.q,authorization_update_noupdatepriv.q,show_columns2.q,authorization_insert_noselectpriv.q,orc_replace_columns3_acid.q,compare_double_bigint.q,authorization_set_nonexistent_conf.q,alter_rename_partition_failure3.q,split_sample_wrong_format2.q,create_with_fk_pk_same_tab.q,compare_double_bigint_2.q,authorization_show_roles_no_admin.q,materialized_view_authorization_rebuild_no_grant.q,unionLimit.q,authorization_revoke_table_fail2.q,authorization_insert_noinspriv.q,duplicate_insert3.q,authorization_desc_table_nosel.q,stats_noscan_non_native.q,orc_change_serde_acid.q,create_or_replace_view7.q,exim_07_nonpart_noncompat_ifof.q,create_with_unique_constraints_enforced.q,udf_concat_ws_wrong2.q,fileformat_bad_class.q,merge_negative_2.q,exim_15_part_nonpart.q,authorization_not_owner_drop_view.q,external1.q,authorization_uri_insert.q,create_with_fk_wrong_ref.q,columnstats_tbllvl_incorrect_column.q,authorization_show_parts_nosel.q,authorization_not_owner_drop_tab.q,external2.q,authorization_deletejar.q,temp_table_create_like_partitions.q,udf_greatest_error_1.q,ptf_negative_AggrFuncsWithNoGBYNoPartDef.q,alter_view_as_select_not_exist.q,touch1.q,groupby3_map_skew_multi_distinct.q,insert_into_notnull_constraint.q,exchange_partition_neg_partition_missing.q,groupby_cube_multi_gby.q,columnstats_tbllvl.q,drop_invalid_constraint2.q,alter_table_add_partition.q,update_not_acid.q,archive5.q,alter_table_constraint_invalid_pk_col.q,ivyDownload.q,udf_instr_wrong_type.q,bad_sample_clause.q,authorization_not_owner_drop_tab2.q,authorization_alter_db_owner.q,show_columns1.q,orc_type_promotion3.q,create_view_failure8.q,strict_join.q,udf_add_months_error_1.q,groupby_cube2.q,groupby_cube1.q,groupby_rollup1.q,genericFileFormat.q,invalid_cast_from_binary_4.q,drop_invalid_constraint1.q,serde_regex.q,show_partitions1.q,invalid_cast_from_binary_6.q,create_with_multi_pk_constraint.q,udf_field_wrong_type.q,groupby_grouping_sets4.q,groupby_grouping_sets3.q,insertsel_fail.q,udf_locate_wrong_type.q,orc_type_promotion1_acid.q,set_table_property.q,create_or_replace_view2.q,groupby_grouping_sets2.q,alter_view_failure.q,distinct_windowing_failure1.q,invalid_t_alter2.q,alter_table_constraint_invalid_fk_col1.q,invalid_varchar_length_2.q,authorization_show_grant_otheruser_alltabs.q,subquery_windowing_corr.q,compact_non_acid_table.q,authorization_view_4.q,authorization_disallow_transform.q,materialized_view_authorization_rebuild_other.q,authorization_fail_4.q,dbtxnmgr_nodblock.q,set_hiveconf_internal_variable1.q,input_part0_neg.q,udf_printf_wrong3.q,load_orc_negative2.q,druid_buckets.q,archive2.q,authorization_addjar.q,invalid_sum_syntax.q,insert_into_with_schema1.q,udf_add_months_error_2.q,dyn_part_max_per_node.q,authorization_revoke_table_fail1.q,udf_printf_wrong2.q,archive_multi3.q
,udf_printf_wrong1.q,subquery_subquery_chain.q,authorization_view_disable_cbo_4.q,no_matching_udf.q,create_view_failure7.q,drop_native_udf.q,truncate_column_list_bucketing.q,authorization_uri_add_partition.q,authorization_view_disable_cbo_3.q,bad_exec_hooks.q,authorization_view_disable_cbo_2.q,fetchtask_ioexception.q,char_pad_convert_fail2.q,authorization_set_role_neg1.q,serde_regex3.q,authorization_delete_nodeletepriv.q,materialized_view_delete.q,create_or_replace_view6.q,bucket_mapjoin_wrong_table_metadata_2.q,msck_repair_3.q,udf_sort_array_by_wrong2.q,local_mapred_error_cache.q,alter_external_acid.q,mm_concatenate.q,authorization_fail_3.q,set_hiveconf_internal_variable0.q,udf_last_day_error_2.q,alter_table_constraint_invalid_ref.q,create_table_wrong_regex.q,describe_xpat

[jira] [Updated] (HIVE-18907) Create utility to fix acid key index issue from HIVE-18817

2018-03-08 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-18907:
--
Attachment: HIVE-18907.3.patch

> Create utility to fix acid key index issue from HIVE-18817
> --
>
> Key: HIVE-18907
> URL: https://issues.apache.org/jira/browse/HIVE-18907
> Project: Hive
>  Issue Type: Bug
>Reporter: Jason Dere
>Assignee: Jason Dere
>Priority: Major
> Attachments: HIVE-18907.1.patch, HIVE-18907.2.patch, 
> HIVE-18907.3.patch
>
>
> While HIVE-18817 will create new ORC Acid files from hitting the 
> ArrayIndexOutOfBounds issue, existing files created before HIVE-18817 will 
> still cause this issue. If there are delta directories then one way to 
> generate new files is to perform a major compaction. But this does not work 
> if there are no delta directories for the table/partition.
> Add a tool to fix the Acid ORC files directly in the case that a compaction 
> cannot be performed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18910) Migrate to Murmur hash for shuffle and bucketing

2018-03-08 Thread Deepak Jaiswal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-18910:
--
Attachment: HIVE-18910.2.patch

> Migrate to Murmur hash for shuffle and bucketing
> 
>
> Key: HIVE-18910
> URL: https://issues.apache.org/jira/browse/HIVE-18910
> Project: Hive
>  Issue Type: Task
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-18910.1.patch, HIVE-18910.2.patch
>
>
> Hive uses the Java hash, which does not distribute values as well or as 
> efficiently as Murmur hash when bucketing a table.
> Migrate to Murmur hash but still keep backward compatibility for existing 
> users so that they don't have to reload their existing tables.
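
To illustrate the distribution point, a sketch contrasting bucket assignment via {{String.hashCode()}} with Murmur3, using Guava's {{Hashing.murmur3_32()}}; this is a demonstration only, not Hive's bucketing code or its compatibility mechanism:

{code}
import com.google.common.hash.Hashing;
import java.nio.charset.StandardCharsets;

public class BucketHashDemo {
  static int javaBucket(String key, int numBuckets) {
    // Legacy-style assignment: Java's String.hashCode(), masked non-negative.
    return (key.hashCode() & Integer.MAX_VALUE) % numBuckets;
  }

  static int murmurBucket(String key, int numBuckets) {
    // Murmur3 mixes input bits much more thoroughly than String.hashCode().
    int h = Hashing.murmur3_32().hashString(key, StandardCharsets.UTF_8).asInt();
    return (h & Integer.MAX_VALUE) % numBuckets;
  }

  public static void main(String[] args) {
    // "Aa" and "BB" share the same String.hashCode() (2112), so the Java
    // hash always puts them in the same bucket.
    for (String k : new String[] {"Aa", "BB", "AaAa", "BBBB"}) {
      System.out.printf("%-4s java=%d murmur=%d%n", k, javaBucket(k, 8), murmurBucket(k, 8));
    }
  }
}
{code}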



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18917) Add spark.home to hive.conf.restricted.list

2018-03-08 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-18917:

Status: Patch Available  (was: Open)

> Add spark.home to hive.conf.restricted.list
> ---
>
> Key: HIVE-18917
> URL: https://issues.apache.org/jira/browse/HIVE-18917
> Project: Hive
>  Issue Type: Task
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-18917.1.patch
>
>
> Add spark.home to hive.conf.restricted.list so it's not settable by users.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18917) Add spark.home to hive.conf.restricted.list

2018-03-08 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-18917:

Attachment: HIVE-18917.1.patch

> Add spark.home to hive.conf.restricted.list
> ---
>
> Key: HIVE-18917
> URL: https://issues.apache.org/jira/browse/HIVE-18917
> Project: Hive
>  Issue Type: Task
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-18917.1.patch
>
>
> Add spark.home to hive.conf.restricted.list so it's not settable by users.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18879) Disallow embedded element in UDFXPathUtil needs to work if xercesImpl.jar in classpath

2018-03-08 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-18879:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.3.3
   3.0.0
   Status: Resolved  (was: Patch Available)

Further pushed to branch-2, branch-2.3.

> Disallow embedded element in UDFXPathUtil needs to work if xercesImpl.jar in 
> classpath
> --
>
> Key: HIVE-18879
> URL: https://issues.apache.org/jira/browse/HIVE-18879
> Project: Hive
>  Issue Type: Bug
>Reporter: Daniel Dai
>Assignee: Daniel Dai
>Priority: Major
> Fix For: 3.0.0, 2.3.3
>
> Attachments: HIVE-18879.1-branch-2.3.patch, HIVE-18879.1.patch
>
>
> This is a follow up of HIVE-18789.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-14792) AvroSerde reads the remote schema-file at least once per mapper, per table reference.

2018-03-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16392093#comment-16392093
 ] 

Hive QA commented on HIVE-14792:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
1s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
 8s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
0s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
35s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
51s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
13s{color} | {color:red} The patch generated 49 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 13m 52s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-9559/dev-support/hive-personality.sh
 |
| git revision | master / 73ccc44 |
| Default Java | 1.8.0_111 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9559/yetus/patch-asflicense-problems.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9559/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> AvroSerde reads the remote schema-file at least once per mapper, per table 
> reference.
> -
>
> Key: HIVE-14792
> URL: https://issues.apache.org/jira/browse/HIVE-14792
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.1, 2.1.0
>Reporter: Mithun Radhakrishnan
>Assignee: Aihua Xu
>Priority: Major
>  Labels: TODOC2.2, TODOC2.4
> Fix For: 3.0.0, 2.4.0, 2.2.1
>
> Attachments: HIVE-14792.1.patch, HIVE-14792.3.patch, 
> HIVE-14792.4.patch
>
>
> Avro tables that use "external" schema files stored on HDFS can cause 
> excessive calls to {{FileSystem::open()}}, especially for queries that spawn 
> large numbers of mappers.
> This is because of the following code in {{AvroSerDe::initialize()}}:
> {code:title=AvroSerDe.java|borderStyle=solid}
> public void initialize(Configuration configuration, Properties properties) 
> throws SerDeException {
> // ...
> if (hasExternalSchema(properties)
> || columnNameProperty == null || columnNameProperty.isEmpty()
> || columnTypeProperty == null || columnTypeProperty.isEmpty()) {
>   schema = determineSchemaOrReturnErrorSchema(configuration, properties);
> } else {
>   // Get column names and sort order
>   columnNames = Arrays.asList(columnNameProperty.split(","));
>   columnTypes = 
> TypeInfoUtils.getTypeInfosFromTypeString(columnTypeProperty);
>   schema = getSchemaFromCols(properties, columnNames, columnTypes, 
> columnCommentProperty);
>  
> properties.setProperty(AvroSerdeUtils.AvroTableProperties.SCHEMA_LITERAL.getPropName(),
>  schema.toString());
>  

[jira] [Assigned] (HIVE-18917) Add spark.home to hive.conf.restricted.list

2018-03-08 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar reassigned HIVE-18917:
---


> Add spark.home to hive.conf.restricted.list
> ---
>
> Key: HIVE-18917
> URL: https://issues.apache.org/jira/browse/HIVE-18917
> Project: Hive
>  Issue Type: Task
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
>
> Add spark.home to hive.conf.restricted.list so it's not settable by users.
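>
> A minimal sketch of the intended effect (assuming {{HiveConf#addToRestrictList}} 
> and {{HiveConf#verifyAndSet}} as the relevant entry points; not the actual patch):
> {code:java|borderStyle=solid}
> import org.apache.hadoop.hive.conf.HiveConf;
>
> HiveConf conf = new HiveConf();
> // Once a key is on the restricted list, runtime SET commands must reject it.
> conf.addToRestrictList("spark.home");
> try {
>   conf.verifyAndSet("spark.home", "/opt/spark");  // what a user's SET would do
> } catch (IllegalArgumentException expected) {
>   // "Cannot modify spark.home at runtime ..." (approximate message)
> }
> {code}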



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18871) hive on tez execution error due to set hive.aux.jars.path to hdfs://

2018-03-08 Thread zhuwei (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuwei updated HIVE-18871:
--
Fix Version/s: 2.2.1
   Status: Patch Available  (was: Open)

> hive on tez execution error due to set hive.aux.jars.path to hdfs://
> 
>
> Key: HIVE-18871
> URL: https://issues.apache.org/jira/browse/HIVE-18871
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 2.2.1
> Environment: hadoop 2.6.5
> hive 2.2.1
> tez 0.8.4
>Reporter: zhuwei
>Assignee: zhuwei
>Priority: Major
> Fix For: 2.2.1
>
> Attachments: HIVE-18871.1.patch, HIVE-18871.2.patch
>
>
> When the properties 
> hive.aux.jars.path=hdfs://mycluster/apps/hive/lib/guava.jar
> and hive.execution.engine=tez are set, executing any query will fail with the 
> error log below:
> exec.Task: Failed to execute tez graph.
> java.lang.IllegalArgumentException: Wrong FS: 
> hdfs://mycluster/apps/hive/lib/guava.jar, expected: file:///
>  at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:645) 
> ~[hadoop-common-2.6.0.jar:?]
>  at 
> org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:80)
>  ~[hadoop-common-2.6.0.jar:?]
>  at 
> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:529)
>  ~[hadoop-common-2.6.0.jar:?]
>  at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:747)
>  ~[hadoop-common-2.6.0.jar:?]
>  at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:524)
>  ~[hadoop-common-2.6.0.jar:?]
>  at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:409)
>  ~[hadoop-common-2.6.0.jar:?]
>  at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:337) 
> ~[hadoop-common-2.6.0.jar:?]
>  at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1905) 
> ~[hadoop-common-2.6.0.jar:?]
>  at 
> org.apache.hadoop.hive.ql.exec.tez.DagUtils.localizeResource(DagUtils.java:1007)
>  ~[hive-exec-2.1.1.jar:2.1.1]
>  at 
> org.apache.hadoop.hive.ql.exec.tez.DagUtils.addTempResources(DagUtils.java:902)
>  ~[hive-exec-2.1.1.jar:2.1.1]
>  at 
> org.apache.hadoop.hive.ql.exec.tez.DagUtils.localizeTempFilesFromConf(DagUtils.java:845)
>  ~[hive-exec-2.1.1.jar:2.1.1]
>  at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionState.refreshLocalResourcesFromConf(TezSessionState.java:466)
>  ~[hive-exec-2.1.1.jar:2.1.1]
>  at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionState.openInternal(TezSessionState.java:252)
>  ~[hive-exec-2.1.1.jar:2.1.1]
>  at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager$TezSessionPoolSession.openInternal(TezSessionPoolManager.java:622)
>  ~[hive-exec-2.1.1.jar:2.1.1]
>  at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionState.open(TezSessionState.java:206)
>  ~[hive-exec-2.1.1.jar:2.1.1]
>  at 
> org.apache.hadoop.hive.ql.exec.tez.TezTask.updateSession(TezTask.java:283) 
> ~[hive-exec-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:155) 
> [hive-exec-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197) 
> [hive-exec-2.1.1.jar:2.1.1]
>  at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100) 
> [hive-exec-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2073) 
> [hive-exec-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1744) 
> [hive-exec-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1453) 
> [hive-exec-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1171) 
> [hive-exec-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1161) 
> [hive-exec-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232) 
> [hive-cli-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183) 
> [hive-cli-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399) 
> [hive-cli-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:335) 
> [hive-cli-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:429) 
> [hive-cli-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:445) 
> [hive-cli-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:151) 
> [hive-cli-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399) 
> [hive-cli-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776) 
> [hive-cli-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.jav
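>
> For context, a minimal sketch of the usual fix for this class of error 
> (illustrative only, not the attached patch): resolve the {{FileSystem}} from 
> the path itself instead of assuming the local one.
> {code:java|borderStyle=solid}
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.FileUtil;
> import org.apache.hadoop.fs.Path;
>
> Path src = new Path("hdfs://mycluster/apps/hive/lib/guava.jar");
> Path dest = new Path("file:///tmp/hive/localized/guava.jar");
> Configuration jobConf = new Configuration();
> // src.getFileSystem() returns the HDFS client here, so checkPath() is never
> // handed an hdfs:// path by the local file system (the "Wrong FS" error).
> FileSystem srcFs = src.getFileSystem(jobConf);
> FileSystem destFs = dest.getFileSystem(jobConf);
> FileUtil.copy(srcFs, src, destFs, dest, false, jobConf);
> {code}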

[jira] [Updated] (HIVE-18871) hive on tez execution error due to set hive.aux.jars.path to hdfs://

2018-03-08 Thread zhuwei (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuwei updated HIVE-18871:
--
Status: Open  (was: Patch Available)

> hive on tez execution error due to set hive.aux.jars.path to hdfs://
> 
>
> Key: HIVE-18871
> URL: https://issues.apache.org/jira/browse/HIVE-18871
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 2.2.1
> Environment: hadoop 2.6.5
> hive 2.2.1
> tez 0.8.4
>Reporter: zhuwei
>Assignee: zhuwei
>Priority: Major
> Attachments: HIVE-18871.1.patch, HIVE-18871.2.patch
>
>
> When the properties 
> hive.aux.jars.path=hdfs://mycluster/apps/hive/lib/guava.jar
> and hive.execution.engine=tez are set, executing any query will fail with the 
> error log below:
> exec.Task: Failed to execute tez graph.
> java.lang.IllegalArgumentException: Wrong FS: 
> hdfs://mycluster/apps/hive/lib/guava.jar, expected: file:///
>  at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:645) 
> ~[hadoop-common-2.6.0.jar:?]
>  at 
> org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:80)
>  ~[hadoop-common-2.6.0.jar:?]
>  at 
> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:529)
>  ~[hadoop-common-2.6.0.jar:?]
>  at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:747)
>  ~[hadoop-common-2.6.0.jar:?]
>  at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:524)
>  ~[hadoop-common-2.6.0.jar:?]
>  at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:409)
>  ~[hadoop-common-2.6.0.jar:?]
>  at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:337) 
> ~[hadoop-common-2.6.0.jar:?]
>  at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1905) 
> ~[hadoop-common-2.6.0.jar:?]
>  at 
> org.apache.hadoop.hive.ql.exec.tez.DagUtils.localizeResource(DagUtils.java:1007)
>  ~[hive-exec-2.1.1.jar:2.1.1]
>  at 
> org.apache.hadoop.hive.ql.exec.tez.DagUtils.addTempResources(DagUtils.java:902)
>  ~[hive-exec-2.1.1.jar:2.1.1]
>  at 
> org.apache.hadoop.hive.ql.exec.tez.DagUtils.localizeTempFilesFromConf(DagUtils.java:845)
>  ~[hive-exec-2.1.1.jar:2.1.1]
>  at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionState.refreshLocalResourcesFromConf(TezSessionState.java:466)
>  ~[hive-exec-2.1.1.jar:2.1.1]
>  at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionState.openInternal(TezSessionState.java:252)
>  ~[hive-exec-2.1.1.jar:2.1.1]
>  at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager$TezSessionPoolSession.openInternal(TezSessionPoolManager.java:622)
>  ~[hive-exec-2.1.1.jar:2.1.1]
>  at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionState.open(TezSessionState.java:206)
>  ~[hive-exec-2.1.1.jar:2.1.1]
>  at 
> org.apache.hadoop.hive.ql.exec.tez.TezTask.updateSession(TezTask.java:283) 
> ~[hive-exec-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:155) 
> [hive-exec-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197) 
> [hive-exec-2.1.1.jar:2.1.1]
>  at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100) 
> [hive-exec-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2073) 
> [hive-exec-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1744) 
> [hive-exec-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1453) 
> [hive-exec-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1171) 
> [hive-exec-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1161) 
> [hive-exec-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232) 
> [hive-cli-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183) 
> [hive-cli-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399) 
> [hive-cli-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:335) 
> [hive-cli-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:429) 
> [hive-cli-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:445) 
> [hive-cli-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:151) 
> [hive-cli-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399) 
> [hive-cli-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776) 
> [hive-cli-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714) 
> [hive-cli-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hi

[jira] [Updated] (HIVE-18871) hive on tez execution error due to set hive.aux.jars.path to hdfs://

2018-03-08 Thread zhuwei (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuwei updated HIVE-18871:
--
Status: Patch Available  (was: Open)

> hive on tez execution error due to set hive.aux.jars.path to hdfs://
> 
>
> Key: HIVE-18871
> URL: https://issues.apache.org/jira/browse/HIVE-18871
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 2.2.1
> Environment: hadoop 2.6.5
> hive 2.2.1
> tez 0.8.4
>Reporter: zhuwei
>Assignee: zhuwei
>Priority: Major
> Attachments: HIVE-18871.1.patch, HIVE-18871.2.patch
>
>
> When the properties 
> hive.aux.jars.path=hdfs://mycluster/apps/hive/lib/guava.jar
> and hive.execution.engine=tez are set, executing any query will fail with the 
> error log below:
> exec.Task: Failed to execute tez graph.
> java.lang.IllegalArgumentException: Wrong FS: 
> hdfs://mycluster/apps/hive/lib/guava.jar, expected: file:///
>  at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:645) 
> ~[hadoop-common-2.6.0.jar:?]
>  at 
> org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:80)
>  ~[hadoop-common-2.6.0.jar:?]
>  at 
> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:529)
>  ~[hadoop-common-2.6.0.jar:?]
>  at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:747)
>  ~[hadoop-common-2.6.0.jar:?]
>  at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:524)
>  ~[hadoop-common-2.6.0.jar:?]
>  at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:409)
>  ~[hadoop-common-2.6.0.jar:?]
>  at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:337) 
> ~[hadoop-common-2.6.0.jar:?]
>  at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1905) 
> ~[hadoop-common-2.6.0.jar:?]
>  at 
> org.apache.hadoop.hive.ql.exec.tez.DagUtils.localizeResource(DagUtils.java:1007)
>  ~[hive-exec-2.1.1.jar:2.1.1]
>  at 
> org.apache.hadoop.hive.ql.exec.tez.DagUtils.addTempResources(DagUtils.java:902)
>  ~[hive-exec-2.1.1.jar:2.1.1]
>  at 
> org.apache.hadoop.hive.ql.exec.tez.DagUtils.localizeTempFilesFromConf(DagUtils.java:845)
>  ~[hive-exec-2.1.1.jar:2.1.1]
>  at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionState.refreshLocalResourcesFromConf(TezSessionState.java:466)
>  ~[hive-exec-2.1.1.jar:2.1.1]
>  at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionState.openInternal(TezSessionState.java:252)
>  ~[hive-exec-2.1.1.jar:2.1.1]
>  at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager$TezSessionPoolSession.openInternal(TezSessionPoolManager.java:622)
>  ~[hive-exec-2.1.1.jar:2.1.1]
>  at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionState.open(TezSessionState.java:206)
>  ~[hive-exec-2.1.1.jar:2.1.1]
>  at 
> org.apache.hadoop.hive.ql.exec.tez.TezTask.updateSession(TezTask.java:283) 
> ~[hive-exec-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:155) 
> [hive-exec-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197) 
> [hive-exec-2.1.1.jar:2.1.1]
>  at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100) 
> [hive-exec-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2073) 
> [hive-exec-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1744) 
> [hive-exec-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1453) 
> [hive-exec-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1171) 
> [hive-exec-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1161) 
> [hive-exec-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232) 
> [hive-cli-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183) 
> [hive-cli-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399) 
> [hive-cli-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:335) 
> [hive-cli-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:429) 
> [hive-cli-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:445) 
> [hive-cli-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:151) 
> [hive-cli-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399) 
> [hive-cli-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776) 
> [hive-cli-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714) 
> [hive-cli-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hi

[jira] [Comment Edited] (HIVE-14792) AvroSerde reads the remote schema-file at least once per mapper, per table reference.

2018-03-08 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16392040#comment-16392040
 ] 

Aihua Xu edited comment on HIVE-14792 at 3/8/18 11:08 PM:
--

Looking at the patch: since we are calling getTableMetadata(), which puts 
"columns.types" and the other properties into the table properties, setting this 
to false means such metadata will not be in there. Do you think it's a good idea 
to just get the Avro schema from the SerDe info rather than everything? 


was (Author: aihuaxu):
Looking at the patch: since we are calling getTableMetadata(), which puts 
"columns.types" and the other properties into the table properties, setting this 
to false means such metadata will not be in there. Do you think it's a good idea 
to just get the SerDe lib info rather than everything? 

> AvroSerde reads the remote schema-file at least once per mapper, per table 
> reference.
> -
>
> Key: HIVE-14792
> URL: https://issues.apache.org/jira/browse/HIVE-14792
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.1, 2.1.0
>Reporter: Mithun Radhakrishnan
>Assignee: Aihua Xu
>Priority: Major
>  Labels: TODOC2.2, TODOC2.4
> Fix For: 3.0.0, 2.4.0, 2.2.1
>
> Attachments: HIVE-14792.1.patch, HIVE-14792.3.patch, 
> HIVE-14792.4.patch
>
>
> Avro tables that use "external" schema files stored on HDFS can cause 
> excessive calls to {{FileSystem::open()}}, especially for queries that spawn 
> large numbers of mappers.
> This is because of the following code in {{AvroSerDe::initialize()}}:
> {code:title=AvroSerDe.java|borderStyle=solid}
> public void initialize(Configuration configuration, Properties properties) 
> throws SerDeException {
> // ...
> if (hasExternalSchema(properties)
> || columnNameProperty == null || columnNameProperty.isEmpty()
> || columnTypeProperty == null || columnTypeProperty.isEmpty()) {
>   schema = determineSchemaOrReturnErrorSchema(configuration, properties);
> } else {
>   // Get column names and sort order
>   columnNames = Arrays.asList(columnNameProperty.split(","));
>   columnTypes = 
> TypeInfoUtils.getTypeInfosFromTypeString(columnTypeProperty);
>   schema = getSchemaFromCols(properties, columnNames, columnTypes, 
> columnCommentProperty);
>  
> properties.setProperty(AvroSerdeUtils.AvroTableProperties.SCHEMA_LITERAL.getPropName(),
>  schema.toString());
> }
> // ...
> }
> {code}
> For tables using {{avro.schema.url}}, every time the SerDe is initialized 
> (i.e. at least once per mapper), the schema file is read remotely. For 
> queries with thousands of mappers, this leads to a stampede to the handful 
> (3?) datanodes that host the schema-file. In the best case, this causes 
> slowdowns.
> It would be preferable to distribute the Avro-schema to all mappers as part 
> of the job-conf. The alternatives aren't exactly appealing:
> # One can't rely solely on the {{column.list.types}} stored in the Hive 
> metastore. (HIVE-14789).
> # {{avro.schema.literal}} might not always be usable, because of the 
> size-limit on table-parameters. The typical size of the Avro-schema file is 
> between 0.5-3MB, in my limited experience. Bumping the max table-parameter 
> size isn't a great solution.
> If the {{avro.schema.file}} were read during query-planning, and made 
> available as part of table-properties (but not serialized into the 
> metastore), the downstream logic will remain largely intact. I have a patch 
> that does this.
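>
> A minimal sketch of that planning-time idea ({{tableProps}} and {{conf}} are 
> assumed inputs; the real patch wires this through table-property handling):
> {code:java|borderStyle=solid}
> import java.io.InputStream;
> import java.nio.charset.StandardCharsets;
> import org.apache.commons.io.IOUtils;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.Path;
>
> // Read avro.schema.url once, at planning time, and inline the schema text so
> // mappers never re-open the remote file. The literal is only placed in the
> // job's copy of the table properties, never persisted to the metastore.
> String url = tableProps.getProperty("avro.schema.url");
> if (url != null && !url.isEmpty()) {
>   Path schemaPath = new Path(url);
>   FileSystem fs = schemaPath.getFileSystem(conf);
>   try (InputStream in = fs.open(schemaPath)) {
>     tableProps.setProperty("avro.schema.literal",
>         IOUtils.toString(in, StandardCharsets.UTF_8));
>   }
> }
> {code}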



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17580) Remove dependency of get_fields_with_environment_context API to serde

2018-03-08 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16392082#comment-16392082
 ] 

Alan Gates commented on HIVE-17580:
---

+1 for the latest PR. Thanks, Vihang; this was a lot of work.

> Remove dependency of get_fields_with_environment_context API to serde
> -
>
> Key: HIVE-17580
> URL: https://issues.apache.org/jira/browse/HIVE-17580
> Project: Hive
>  Issue Type: Sub-task
>  Components: Standalone Metastore
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-17580.003-standalone-metastore.patch, 
> HIVE-17580.04-standalone-metastore.patch, 
> HIVE-17580.05-standalone-metastore.patch, 
> HIVE-17580.06-standalone-metastore.patch, 
> HIVE-17580.07-standalone-metastore.patch, 
> HIVE-17580.08-standalone-metastore.patch, 
> HIVE-17580.09-standalone-metastore.patch, 
> HIVE-17580.092-standalone-metastore.patch
>
>
> {{get_fields_with_environment_context}} metastore API uses the {{Deserializer}} 
> class to access the fields metadata for the cases where it is stored along 
> with the data files (avro tables). The problem is that the Deserializer class 
> is defined in the hive-serde module, and in order to make the metastore 
> independent of Hive we will have to remove this dependency (at least we should 
> change it to a runtime dependency instead of a compile-time one).
> The other option is to investigate whether we can use SearchArgument to provide 
> this functionality.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18878) Lower MoveTask Lock Logging to Debug

2018-03-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16392075#comment-16392075
 ] 

Hive QA commented on HIVE-18878:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12913654/HIVE-18878.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 25 failed/errored test(s), 13350 tests 
executed
*Failed tests:*
{noformat}
TestNegativeCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=95)

[udf_invalid.q,authorization_uri_export.q,default_constraint_complex_default_value.q,druid_datasource2.q,view_update.q,default_partition_name.q,authorization_public_create.q,load_wrong_fileformat_rc_seq.q,default_constraint_invalid_type.q,altern1.q,describe_xpath1.q,drop_view_failure2.q,temp_table_rename.q,invalid_select_column_with_subquery.q,udf_trunc_error1.q,insert_view_failure.q,dbtxnmgr_nodbunlock.q,authorization_show_columns.q,cte_recursion.q,merge_constraint_notnull.q,load_part_nospec.q,clusterbyorderby.q,orc_type_promotion2.q,ctas_noperm_loc.q,udf_instr_wrong_args_len.q,invalid_create_tbl2.q,part_col_complex_type.q,authorization_drop_db_empty.q,smb_mapjoin_14.q,subquery_scalar_multi_rows.q,alter_partition_coltype_2columns.q,subquery_corr_in_agg.q,insert_overwrite_notnull_constraint.q,authorization_show_grant_otheruser_wtab.q,regex_col_groupby.q,udaf_collect_set_unsupported.q,ptf_negative_DuplicateWindowAlias.q,exim_22_export_authfail.q,udf_likeany_wrong1.q,groupby_key.q,ambiguous_col.q,groupby3_multi_distinct.q,authorization_alter_drop_ptn.q,invalid_cast_from_binary_5.q,show_create_table_does_not_exist.q,invalid_select_column.q,exim_20_managed_location_over_existing.q,interval_3.q,authorization_compile.q,join35.q,merge_negative_3.q,udf_concat_ws_wrong3.q,create_or_replace_view8.q,create_external_with_notnull_constraint.q,split_sample_out_of_range.q,materialized_view_no_transactional_rewrite.q,authorization_show_grant_otherrole.q,create_with_constraints_duplicate_name.q,invalid_stddev_samp_syntax.q,authorization_view_disable_cbo_7.q,autolocal1.q,avro_non_nullable_union.q,load_orc_negative_part.q,drop_view_failure1.q,columnstats_partlvl_invalid_values_autogather.q,exim_13_nonnative_import.q,alter_table_wrong_regex.q,add_partition_with_whitelist.q,udf_next_day_error_2.q,authorization_select.q,udf_trunc_error2.q,authorization_view_7.q,udf_format_number_wrong5.q,touch2.q,exim_03_nonpart_noncompat_colschema.q,orc_type_promotion1.q,lateral_view_alias.q,show_tables_bad_db1.q,unset_table_property.q,alter_non_native.q,nvl_mismatch_type.q,load_orc_negative3.q,authorization_create_role_no_admin.q,invalid_distinct1.q,authorization_grant_server.q,orc_type_promotion3_acid.q,show_tables_bad1.q,macro_unused_parameter.q,drop_invalid_constraint3.q,drop_partition_filter_failure.q,char_pad_convert_fail3.q,exim_23_import_exist_authfail.q,drop_invalid_constraint4.q,authorization_create_macro1.q,archive1.q,subquery_multiple_cols_in_select.q,change_hive_hdfs_session_path.q,udf_trunc_error3.q,invalid_variance_syntax.q,authorization_truncate_2.q,invalid_avg_syntax.q,invalid_select_column_with_tablename.q,mm_truncate_cols.q,groupby_grouping_sets1.q,druid_location.q,groupby2_multi_distinct.q,authorization_sba_drop_table.q,dynamic_partitions_with_whitelist.q,compare_string_bigint_2.q,udf_greatest_error_2.q,authorization_view_6.q,show_tablestatus.q,duplicate_alias_in_transform_schema.q,create_with_fk_uk_same_tab.q,udtf_not_supported3.q,alter_table_constraint_invalid_fk_col2.q,udtf_not_supported1.q,dbtxnmgr_notableunlock.q,ptf_negative_InvalidValueBoundary.q,alter_table_constraint_duplicate_pk.q,udf_printf_wrong4.q,create_view_failure9.q,udf_elt_wrong_type.q,selectDistinctStarNeg_1.q,invalid_mapjoin1.q,load_stored_as_dirs.q,input1.q,udf_sort_array_wrong1.q,invalid_distinct2.q,invalid_select_fn.q,authorization_role_grant_otherrole.q,archive4.q,load_nonpart_authfail.q,recursive_view.q,authorization_view_disable_cbo_1.q,desc_failure4.q,create_not_acid.q,udf_sort_array_wrong3.q,char_pad_convert_fail0.q,udf_map_values
_arg_type.q,alter_view_failure6_2.q,alter_partition_change_col_nonexist.q,update_non_acid_table.q,authorization_view_disable_cbo_5.q,ct_noperm_loc.q,interval_1.q,authorization_show_grant_otheruser_all.q,authorization_view_2.q,show_tables_bad2.q,groupby_rollup2.q,truncate_column_seqfile.q,create_view_failure5.q,authorization_create_view.q,ptf_window_boundaries.q,ctasnullcol.q,input_part0_neg_2.q,create_or_replace_view1.q,udf_max.q,exim_01_nonpart_over_loaded.q,msck_repair_1.q,orc_change_fileformat_acid.q,udf_nonexistent_resource.q,exim_19_external_over_existing.q,serde_regex2.q,msck_repair_2.q,exim_06_nonpart_noncompat_storage.q,illegal_partition_type4.q,udf_sort_array_by_wrong1.q,create_or_replace_view5.q,windowing_leadlag_in_udaf.q,avro_decimal.q,materialized_view_update.q

[jira] [Commented] (HIVE-18727) Update GenericUDFEnforceNotNullConstraint to throw an ERROR instead of Exception on failure

2018-03-08 Thread Vineet Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16392074#comment-16392074
 ] 

Vineet Garg commented on HIVE-18727:


bq. include information about the row/column which has the issue 
This would be an enhancement on the planning side; planning will need to pass 
that information to the GenericUDF. I'll open a separate JIRA to do this.

We certainly should rename the file. Also, is there any reason it is part of the 
org.apache.hadoop.hive.ql.metadata package? I am unable to comment on the review 
board itself.

> Update GenericUDFEnforceNotNullConstraint to throw an ERROR instead of 
> Exception on failure
> ---
>
> Key: HIVE-18727
> URL: https://issues.apache.org/jira/browse/HIVE-18727
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vineet Garg
>Assignee: Kryvenko Igor
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-18727.patch
>
>
> Throwing an exception makes TezProcessor stop retrying the task. Since this is 
> a NOT NULL constraint violation, we don't want TezProcessor to keep retrying 
> on failure.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18727) Update GenericUDFEnforceNotNullConstraint to throw an ERROR instead of Exception on failure

2018-03-08 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16392065#comment-16392065
 ] 

Gopal V commented on HIVE-18727:


The files need APL license headers. It is also important to rename HiveError to 
DataConstraintViolationError and to include information about the row/column 
that has the issue (the column is more important than the row; the row is easy 
to find with a "where col is null").

> Update GenericUDFEnforceNotNullConstraint to throw an ERROR instead of 
> Exception on failure
> ---
>
> Key: HIVE-18727
> URL: https://issues.apache.org/jira/browse/HIVE-18727
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vineet Garg
>Assignee: Kryvenko Igor
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-18727.patch
>
>
> Throwing an exception makes TezProcessor stop retrying the task. Since this is 
> a NOT NULL constraint violation, we don't want TezProcessor to keep retrying 
> on failure.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18727) Update GenericUDFEnforceNotNullConstraint to throw an ERROR instead of Exception on failure

2018-03-08 Thread Vineet Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16392058#comment-16392058
 ] 

Vineet Garg commented on HIVE-18727:


Thanks [~vbeshka]. [~gopalv], would you mind taking a look and helping me review 
this?

> Update GenericUDFEnforceNotNullConstraint to throw an ERROR instead of 
> Exception on failure
> ---
>
> Key: HIVE-18727
> URL: https://issues.apache.org/jira/browse/HIVE-18727
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vineet Garg
>Assignee: Kryvenko Igor
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-18727.patch
>
>
> Throwing an exception makes TezProcessor stop retrying the task. Since this is 
> a NOT NULL constraint violation, we don't want TezProcessor to keep retrying 
> on failure.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18804) StatsUtils.getColStatisticsFromExprMap may only provide info for a column once

2018-03-08 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16392047#comment-16392047
 ] 

Ashutosh Chauhan commented on HIVE-18804:
-

Can you please explain the problem and the proposed fix? I thought the issue was 
the same column appearing multiple times in the select list, but your patch 
doesn't indicate that.

> StatsUtils.getColStatisticsFromExprMap may only provide info for a column once
> --
>
> Key: HIVE-18804
> URL: https://issues.apache.org/jira/browse/HIVE-18804
> Project: Hive
>  Issue Type: Sub-task
>  Components: Statistics
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
> Attachments: HIVE-18804.01.patch, HIVE-18804.02.patch
>
>
> Currently {{StatsUtils.getColStatisticsFromExprMap}} may duplicate the data 
> size by passing the info about the same column more than once:
> https://github.com/apache/hive/blob/e8e5ab24616aa834f4966efe3a5f437f6bee4d1d/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java#L1529
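>
> A minimal sketch of the kind of de-duplication involved (illustrative only; 
> {{resolveColStats}} is a hypothetical helper, and the Hive types ColStatistics 
> and ColumnInfo are assumed; this is not the actual patch):
> {code:java|borderStyle=solid}
> import java.util.ArrayList;
> import java.util.HashSet;
> import java.util.List;
> import java.util.Set;
>
> // Emit statistics for each underlying column at most once, even when several
> // expressions in the select list resolve to the same column.
> Set<String> seen = new HashSet<>();
> List<ColStatistics> stats = new ArrayList<>();
> for (ColumnInfo ci : schema) {
>   ColStatistics cs = resolveColStats(ci);  // hypothetical helper
>   if (cs != null && seen.add(cs.getColumnName())) {
>     stats.add(cs);  // duplicates no longer inflate the data size
>   }
> }
> {code}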



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-14792) AvroSerde reads the remote schema-file at least once per mapper, per table reference.

2018-03-08 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16392040#comment-16392040
 ] 

Aihua Xu commented on HIVE-14792:
-

Looking at the patch: since we are calling getTableMetadata(), which puts 
"columns.types" and the other properties into the table properties, setting this 
to false means such metadata will not be in there. Do you think it's a good idea 
to just get the SerDe lib info rather than everything? 

> AvroSerde reads the remote schema-file at least once per mapper, per table 
> reference.
> -
>
> Key: HIVE-14792
> URL: https://issues.apache.org/jira/browse/HIVE-14792
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.1, 2.1.0
>Reporter: Mithun Radhakrishnan
>Assignee: Aihua Xu
>Priority: Major
>  Labels: TODOC2.2, TODOC2.4
> Fix For: 3.0.0, 2.4.0, 2.2.1
>
> Attachments: HIVE-14792.1.patch, HIVE-14792.3.patch, 
> HIVE-14792.4.patch
>
>
> Avro tables that use "external" schema files stored on HDFS can cause 
> excessive calls to {{FileSystem::open()}}, especially for queries that spawn 
> large numbers of mappers.
> This is because of the following code in {{AvroSerDe::initialize()}}:
> {code:title=AvroSerDe.java|borderStyle=solid}
> public void initialize(Configuration configuration, Properties properties) 
> throws SerDeException {
> // ...
> if (hasExternalSchema(properties)
> || columnNameProperty == null || columnNameProperty.isEmpty()
> || columnTypeProperty == null || columnTypeProperty.isEmpty()) {
>   schema = determineSchemaOrReturnErrorSchema(configuration, properties);
> } else {
>   // Get column names and sort order
>   columnNames = Arrays.asList(columnNameProperty.split(","));
>   columnTypes = 
> TypeInfoUtils.getTypeInfosFromTypeString(columnTypeProperty);
>   schema = getSchemaFromCols(properties, columnNames, columnTypes, 
> columnCommentProperty);
>  
> properties.setProperty(AvroSerdeUtils.AvroTableProperties.SCHEMA_LITERAL.getPropName(),
>  schema.toString());
> }
> // ...
> }
> {code}
> For tables using {{avro.schema.url}}, every time the SerDe is initialized 
> (i.e. at least once per mapper), the schema file is read remotely. For 
> queries with thousands of mappers, this leads to a stampede to the handful 
> (3?) datanodes that host the schema-file. In the best case, this causes 
> slowdowns.
> It would be preferable to distribute the Avro-schema to all mappers as part 
> of the job-conf. The alternatives aren't exactly appealing:
> # One can't rely solely on the {{column.list.types}} stored in the Hive 
> metastore. (HIVE-14789).
> # {{avro.schema.literal}} might not always be usable, because of the 
> size-limit on table-parameters. The typical size of the Avro-schema file is 
> between 0.5-3MB, in my limited experience. Bumping the max table-parameter 
> size isn't a great solution.
> If the {{avro.schema.file}} were read during query-planning, and made 
> available as part of table-properties (but not serialized into the 
> metastore), the downstream logic will remain largely intact. I have a patch 
> that does this.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18873) Skipping predicate pushdown for MR silently at HiveInputFormat can cause storage handlers to produce erroneous result

2018-03-08 Thread Josh Elser (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16392041#comment-16392041
 ] 

Josh Elser commented on HIVE-18873:
---

{quote}The original issue in HIVE-15680 is when there are 2 table aliases we 
push only one filter to job conf which results in incorrect result. How does 
phoenix or accumulo deal with multiple aliases?
{quote}
Honestly, I'm not sure. The distinction between the engines is a bit obtuse down 
in the StorageHandler API, IIRC. I know Ankit and Sergey are looking into this.
{quote} IIRC, storage handlers use hive.optimize.ppd.storage to push filters 
down to storage handlers. Maybe we can restrict this only when 
hive.optimize.index.filters is used so that storage handlers don't get impacted 
by HIVE-15680.
{quote}
I think that would make sense. In my view at least, PPD not happening for storage 
handlers outright removes what was a pretty nice feature of storage handlers. All 
of us being outsiders to the problems you're trying to solve in Hive makes it 
hard for us to give good suggestions :)

The quick response was appreciated (even though I then let it sit for two days).

> Skipping predicate pushdown for MR silently at HiveInputFormat can cause 
> storage handlers to produce erroneous result
> -
>
> Key: HIVE-18873
> URL: https://issues.apache.org/jira/browse/HIVE-18873
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Ankit Singhal
>Assignee: Ankit Singhal
>Priority: Major
> Fix For: 3.0.0
>
>
> {code:java}
> // disable filter pushdown for mapreduce when there are more than one table 
> aliases,
>     // since we don't clone jobConf per alias
>     if (mrwork != null && mrwork.getAliases() != null && 
> mrwork.getAliases().size() > 1 &&
>       jobConf.get(ConfVars.HIVE_EXECUTION_ENGINE.varname).equals("mr")) {
>       return;
>     }
> {code}
> I believe this needs to be handled at OpProcFactory so that hive doesn't 
> believe that predicate is handled by storage handler.
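>
> A sketch of the narrowing suggested in the comments above (assumes 
> {{ConfVars.HIVEOPTINDEXFILTER}}, i.e. hive.optimize.index.filter; illustrative 
> only, not the actual fix):
> {code:java|borderStyle=solid}
> // Only skip pushdown in the multi-alias MR case when Hive itself planned the
> // filter; storage-handler pushdown is left intact, so Hive's assumption that
> // the handler applied the predicate stays true.
> boolean multiAliasMr = mrwork != null && mrwork.getAliases() != null
>     && mrwork.getAliases().size() > 1
>     && jobConf.get(ConfVars.HIVE_EXECUTION_ENGINE.varname).equals("mr");
> if (multiAliasMr && HiveConf.getBoolVar(jobConf, ConfVars.HIVEOPTINDEXFILTER)) {
>   return;
> }
> {code}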



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18878) Lower MoveTask Lock Logging to Debug

2018-03-08 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16392025#comment-16392025
 ] 

Sahil Takiar commented on HIVE-18878:
-

Do you have an example of when this causes a lot of verbosity in the logs?

> Lower MoveTask Lock Logging to Debug
> 
>
> Key: HIVE-18878
> URL: https://issues.apache.org/jira/browse/HIVE-18878
> Project: Hive
>  Issue Type: Improvement
>  Components: Locking
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: Kryvenko Igor
>Priority: Minor
>  Labels: noob
> Attachments: HIVE-18878.1.patch, HIVE-18878.patch
>
>
> https://github.com/apache/hive/blob/cbb9233a3b39ab8489d777fc76f0758c49b69bef/ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java#L233-L234
> {code}
> LOG.info("about to release lock for output: {} lock: {}", output,
>   lock.getHiveLockObject().getName());
> try {
>   lockMgr.unlock(lock);
> } catch (LockException le) {
> {code}
> Change this to _debug_ level logging.  The lock manager's 'unlock' method 
> should have additional logging available.  Knowing that the unlock is about 
> to take place is only helpful for debugging, not for normal operation.
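>
> The change itself is a one-level switch (sketch):
> {code}
> // Parameterized logging: the message is only rendered when debug is enabled.
> LOG.debug("about to release lock for output: {} lock: {}", output,
>   lock.getHiveLockObject().getName());
> {code}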



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17580) Remove dependency of get_fields_with_environment_context API to serde

2018-03-08 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16392022#comment-16392022
 ] 

Vihang Karajgaonkar commented on HIVE-17580:


I updated the PR to include the above proposed change, in case anyone wants to 
take a look.

> Remove dependency of get_fields_with_environment_context API to serde
> -
>
> Key: HIVE-17580
> URL: https://issues.apache.org/jira/browse/HIVE-17580
> Project: Hive
>  Issue Type: Sub-task
>  Components: Standalone Metastore
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-17580.003-standalone-metastore.patch, 
> HIVE-17580.04-standalone-metastore.patch, 
> HIVE-17580.05-standalone-metastore.patch, 
> HIVE-17580.06-standalone-metastore.patch, 
> HIVE-17580.07-standalone-metastore.patch, 
> HIVE-17580.08-standalone-metastore.patch, 
> HIVE-17580.09-standalone-metastore.patch, 
> HIVE-17580.092-standalone-metastore.patch
>
>
> {{get_fields_with_environment_context}} metastore API uses the {{Deserializer}} 
> class to access the fields metadata for the cases where it is stored along 
> with the data files (avro tables). The problem is that the Deserializer class 
> is defined in the hive-serde module, and in order to make the metastore 
> independent of Hive we will have to remove this dependency (at least we should 
> change it to a runtime dependency instead of a compile-time one).
> The other option is to investigate whether we can use SearchArgument to provide 
> this functionality.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-14792) AvroSerde reads the remote schema-file at least once per mapper, per table reference.

2018-03-08 Thread Mithun Radhakrishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16392017#comment-16392017
 ] 

Mithun Radhakrishnan commented on HIVE-14792:
-

bq. since you have all the context.

For what it's worth, +1. Thank you for going above and beyond here, [~aihuaxu]!

> AvroSerde reads the remote schema-file at least once per mapper, per table 
> reference.
> -
>
> Key: HIVE-14792
> URL: https://issues.apache.org/jira/browse/HIVE-14792
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.1, 2.1.0
>Reporter: Mithun Radhakrishnan
>Assignee: Aihua Xu
>Priority: Major
>  Labels: TODOC2.2, TODOC2.4
> Fix For: 3.0.0, 2.4.0, 2.2.1
>
> Attachments: HIVE-14792.1.patch, HIVE-14792.3.patch, 
> HIVE-14792.4.patch
>
>
> Avro tables that use "external" schema files stored on HDFS can cause 
> excessive calls to {{FileSystem::open()}}, especially for queries that spawn 
> large numbers of mappers.
> This is because of the following code in {{AvroSerDe::initialize()}}:
> {code:title=AvroSerDe.java|borderStyle=solid}
> public void initialize(Configuration configuration, Properties properties) 
> throws SerDeException {
> // ...
> if (hasExternalSchema(properties)
> || columnNameProperty == null || columnNameProperty.isEmpty()
> || columnTypeProperty == null || columnTypeProperty.isEmpty()) {
>   schema = determineSchemaOrReturnErrorSchema(configuration, properties);
> } else {
>   // Get column names and sort order
>   columnNames = Arrays.asList(columnNameProperty.split(","));
>   columnTypes = 
> TypeInfoUtils.getTypeInfosFromTypeString(columnTypeProperty);
>   schema = getSchemaFromCols(properties, columnNames, columnTypes, 
> columnCommentProperty);
>  
> properties.setProperty(AvroSerdeUtils.AvroTableProperties.SCHEMA_LITERAL.getPropName(),
>  schema.toString());
> }
> // ...
> }
> {code}
> For tables using {{avro.schema.url}}, every time the SerDe is initialized 
> (i.e. at least once per mapper), the schema file is read remotely. For 
> queries with thousands of mappers, this leads to a stampede to the handful 
> (3?) datanodes that host the schema-file. In the best case, this causes 
> slowdowns.
> It would be preferable to distribute the Avro-schema to all mappers as part 
> of the job-conf. The alternatives aren't exactly appealing:
> # One can't rely solely on the {{column.list.types}} stored in the Hive 
> metastore. (HIVE-14789).
> # {{avro.schema.literal}} might not always be usable, because of the 
> size-limit on table-parameters. The typical size of the Avro-schema file is 
> between 0.5-3MB, in my limited experience. Bumping the max table-parameter 
> size isn't a great solution.
> If the {{avro.schema.file}} were read during query-planning, and made 
> available as part of table-properties (but not serialized into the 
> metastore), the downstream logic will remain largely intact. I have a patch 
> that does this.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-16855) org.apache.hadoop.hive.ql.exec.mr.HashTableLoader Improvements

2018-03-08 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16392014#comment-16392014
 ] 

Sahil Takiar commented on HIVE-16855:
-

+1 pending Hive QA

> org.apache.hadoop.hive.ql.exec.mr.HashTableLoader Improvements
> --
>
> Key: HIVE-16855
> URL: https://issues.apache.org/jira/browse/HIVE-16855
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.1.1, 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-16855.1.patch, HIVE-16855.2.patch, 
> HIVE-16855.3.patch
>
>
> # Improve (Simplify) Logging
> # Remove custom buffer size for {{BufferedInputStream}} and instead rely on 
> JVM default which is often larger these days (8192)
> # Simplify looping logic
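>
> For the buffer-size point, the shape of the change is simply (sketch; the 
> current custom size is elided here):
> {code:java}
> import java.io.BufferedInputStream;
> import java.io.FileInputStream;
> import java.io.InputStream;
>
> // Before (illustrative): new BufferedInputStream(new FileInputStream(path), CUSTOM_SIZE);
> // After: rely on the JVM default buffer size (8192 on current JDKs).
> InputStream in = new BufferedInputStream(new FileInputStream(path));
> {code}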



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-14792) AvroSerde reads the remote schema-file at least once per mapper, per table reference.

2018-03-08 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16392015#comment-16392015
 ] 

Aihua Xu commented on HIVE-14792:
-

The test is still running. BTW: I tried the test after changing the config back 
to false; let me test it out when it's true.

> AvroSerde reads the remote schema-file at least once per mapper, per table 
> reference.
> -
>
> Key: HIVE-14792
> URL: https://issues.apache.org/jira/browse/HIVE-14792
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.1, 2.1.0
>Reporter: Mithun Radhakrishnan
>Assignee: Aihua Xu
>Priority: Major
>  Labels: TODOC2.2, TODOC2.4
> Fix For: 3.0.0, 2.4.0, 2.2.1
>
> Attachments: HIVE-14792.1.patch, HIVE-14792.3.patch, 
> HIVE-14792.4.patch
>
>
> Avro tables that use "external" schema files stored on HDFS can cause 
> excessive calls to {{FileSystem::open()}}, especially for queries that spawn 
> large numbers of mappers.
> This is because of the following code in {{AvroSerDe::initialize()}}:
> {code:title=AvroSerDe.java|borderStyle=solid}
> public void initialize(Configuration configuration, Properties properties) 
> throws SerDeException {
> // ...
> if (hasExternalSchema(properties)
> || columnNameProperty == null || columnNameProperty.isEmpty()
> || columnTypeProperty == null || columnTypeProperty.isEmpty()) {
>   schema = determineSchemaOrReturnErrorSchema(configuration, properties);
> } else {
>   // Get column names and sort order
>   columnNames = Arrays.asList(columnNameProperty.split(","));
>   columnTypes = 
> TypeInfoUtils.getTypeInfosFromTypeString(columnTypeProperty);
>   schema = getSchemaFromCols(properties, columnNames, columnTypes, 
> columnCommentProperty);
>  
> properties.setProperty(AvroSerdeUtils.AvroTableProperties.SCHEMA_LITERAL.getPropName(),
>  schema.toString());
> }
> // ...
> }
> {code}
> For tables using {{avro.schema.url}}, every time the SerDe is initialized 
> (i.e. at least once per mapper), the schema file is read remotely. For 
> queries with thousands of mappers, this leads to a stampede to the handful 
> (3?) datanodes that host the schema-file. In the best case, this causes 
> slowdowns.
> It would be preferable to distribute the Avro-schema to all mappers as part 
> of the job-conf. The alternatives aren't exactly appealing:
> # One can't rely solely on the {{column.list.types}} stored in the Hive 
> metastore. (HIVE-14789).
> # {{avro.schema.literal}} might not always be usable, because of the 
> size-limit on table-parameters. The typical size of the Avro-schema file is 
> between 0.5-3MB, in my limited experience. Bumping the max table-parameter 
> size isn't a great solution.
> If the {{avro.schema.file}} were read during query-planning, and made 
> available as part of table-properties (but not serialized into the 
> metastore), the downstream logic will remain largely intact. I have a patch 
> that does this.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18901) Lower ResourceDownloader Logging to Debug

2018-03-08 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16392011#comment-16392011
 ] 

Sahil Takiar commented on HIVE-18901:
-

Can you give an example of when this is causing overly verbose logs? I assume 
we want something in the logs to print out which resources we are downloading.

> Lower ResourceDownloader Logging to Debug
> -
>
> Key: HIVE-18901
> URL: https://issues.apache.org/jira/browse/HIVE-18901
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0, 2.4.0
>Reporter: BELUGA BEHR
>Assignee: Kryvenko Igor
>Priority: Trivial
>  Labels: noob
> Attachments: HIVE-18901.patch
>
>
> https://github.com/apache/hive/blob/6d890faf22fd1ede3658a5eed097476eab3c67e9/ql/src/java/org/apache/hadoop/hive/ql/util/ResourceDownloader.java#L98
> {code}
>   LOG.info("converting to local {}", srcUri);
> {code}
> # Set to _debug_ level
> # Uppercase the "C" to "Converting" to match all other Hive logging
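>
> i.e. (sketch):
> {code}
>   LOG.debug("Converting to local {}", srcUri);
> {code}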



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-14792) AvroSerde reads the remote schema-file at least once per mapper, per table reference.

2018-03-08 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16392007#comment-16392007
 ] 

Aihua Xu commented on HIVE-14792:
-

You are probably the best person to review the change, since you have all the 
context. :) 

> AvroSerde reads the remote schema-file at least once per mapper, per table 
> reference.
> -
>
> Key: HIVE-14792
> URL: https://issues.apache.org/jira/browse/HIVE-14792
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.1, 2.1.0
>Reporter: Mithun Radhakrishnan
>Assignee: Aihua Xu
>Priority: Major
>  Labels: TODOC2.2, TODOC2.4
> Fix For: 3.0.0, 2.4.0, 2.2.1
>
> Attachments: HIVE-14792.1.patch, HIVE-14792.3.patch, 
> HIVE-14792.4.patch
>
>
> Avro tables that use "external" schema files stored on HDFS can cause 
> excessive calls to {{FileSystem::open()}}, especially for queries that spawn 
> large numbers of mappers.
> This is because of the following code in {{AvroSerDe::initialize()}}:
> {code:title=AvroSerDe.java|borderStyle=solid}
> public void initialize(Configuration configuration, Properties properties) 
> throws SerDeException {
> // ...
> if (hasExternalSchema(properties)
> || columnNameProperty == null || columnNameProperty.isEmpty()
> || columnTypeProperty == null || columnTypeProperty.isEmpty()) {
>   schema = determineSchemaOrReturnErrorSchema(configuration, properties);
> } else {
>   // Get column names and sort order
>   columnNames = Arrays.asList(columnNameProperty.split(","));
>   columnTypes = 
> TypeInfoUtils.getTypeInfosFromTypeString(columnTypeProperty);
>   schema = getSchemaFromCols(properties, columnNames, columnTypes, 
> columnCommentProperty);
>  
> properties.setProperty(AvroSerdeUtils.AvroTableProperties.SCHEMA_LITERAL.getPropName(),
>  schema.toString());
> }
> // ...
> }
> {code}
> For tables using {{avro.schema.url}}, every time the SerDe is initialized 
> (i.e. at least once per mapper), the schema file is read remotely. For 
> queries with thousands of mappers, this leads to a stampede to the handful 
> (3?) datanodes that host the schema-file. In the best case, this causes 
> slowdowns.
> It would be preferable to distribute the Avro-schema to all mappers as part 
> of the job-conf. The alternatives aren't exactly appealing:
> # One can't rely solely on the {{column.list.types}} stored in the Hive 
> metastore. (HIVE-14789).
> # {{avro.schema.literal}} might not always be usable, because of the 
> size-limit on table-parameters. The typical size of the Avro-schema file is 
> between 0.5-3MB, in my limited experience. Bumping the max table-parameter 
> size isn't a great solution.
> If the {{avro.schema.file}} were read during query-planning, and made 
> available as part of table-properties (but not serialized into the 
> metastore), the downstream logic will remain largely intact. I have a patch 
> that does this.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18723) CompactorOutputCommitter.commitJob() - check rename() ret val

2018-03-08 Thread Kryvenko Igor (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kryvenko Igor updated HIVE-18723:
-
Attachment: HIVE-18723.3.patch

> CompactorOutputCommitter.commitJob() - check rename() ret val
> -
>
> Key: HIVE-18723
> URL: https://issues.apache.org/jira/browse/HIVE-18723
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Kryvenko Igor
>Priority: Major
> Attachments: HIVE-18723.1.patch, HIVE-18723.2.patch, 
> HIVE-18723.3.patch, HIVE-18723.patch
>
>
> Right now the return value is ignored: {{fs.rename(fileStatus.getPath(), newPath);}}
> Should this use {{FileUtils.rename(FileSystem fs, Path sourcePath, Path 
> destPath, Configuration conf)}} instead?
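
A minimal sketch of the check, assuming the {{fs}}, {{fileStatus}}, and 
{{newPath}} variables from {{commitJob()}}; throwing on failure is one option, 
delegating to {{FileUtils.rename(fs, sourcePath, destPath, conf)}} another:

{code:java}
if (!fs.rename(fileStatus.getPath(), newPath)) {
  // Surface the failure instead of silently dropping compacted output.
  throw new IOException("Unable to rename " + fileStatus.getPath() + " to " + newPath);
}
{code}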



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18343) Remove LinkedList from ColumnStatsSemanticAnalyzer.java

2018-03-08 Thread BELUGA BEHR (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-18343:
---
Status: Patch Available  (was: Open)

Added patch to address a bunch of check-style issues as well

> Remove LinkedList from ColumnStatsSemanticAnalyzer.java
> ---
>
> Key: HIVE-18343
> URL: https://issues.apache.org/jira/browse/HIVE-18343
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Trivial
> Attachments: HIVE-18343.1.patch, HIVE-18343.2.patch
>
>
> Remove {{LinkedList}} in favor of {{ArrayList}} for class 
> {{org.apache.hadoop.hive.ql.parse.ColumnStatsSemanticAnalyzer}}.
> {quote}
> The size, isEmpty, get, set, iterator, and listIterator operations run in 
> constant time. The add operation runs in amortized constant time, that is, 
> adding n elements requires O\(n\) time. All of the other operations run in 
> linear time (roughly speaking). *The constant factor is low compared to that 
> for the LinkedList implementation.*
> {quote}
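
An illustrative sketch of the preferred pattern (names are hypothetical); 
pre-sizing the {{ArrayList}} avoids both the {{LinkedList}} constant factor and 
resize copies when the element count is known:

{code:java}
// Before: List<String> cols = new LinkedList<>();
// After: the size is known up front, so pre-size the ArrayList.
List<String> cols = new ArrayList<>(expectedSize);
{code}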



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-14792) AvroSerde reads the remote schema-file at least once per mapper, per table reference.

2018-03-08 Thread Mithun Radhakrishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16392004#comment-16392004
 ] 

Mithun Radhakrishnan commented on HIVE-14792:
-

Hey, [~aihuaxu]. Thanks for verifying that the test failures are not related to 
my changes, and for restoring the default.

Am I allowed to +1 this change? ;]

> AvroSerde reads the remote schema-file at least once per mapper, per table 
> reference.
> -
>
> Key: HIVE-14792
> URL: https://issues.apache.org/jira/browse/HIVE-14792
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.1, 2.1.0
>Reporter: Mithun Radhakrishnan
>Assignee: Aihua Xu
>Priority: Major
>  Labels: TODOC2.2, TODOC2.4
> Fix For: 3.0.0, 2.4.0, 2.2.1
>
> Attachments: HIVE-14792.1.patch, HIVE-14792.3.patch, 
> HIVE-14792.4.patch
>
>
> Avro tables that use "external" schema files stored on HDFS can cause 
> excessive calls to {{FileSystem::open()}}, especially for queries that spawn 
> large numbers of mappers.
> This is because of the following code in {{AvroSerDe::initialize()}}:
> {code:title=AvroSerDe.java|borderStyle=solid}
> public void initialize(Configuration configuration, Properties properties) throws SerDeException {
>   // ...
>   if (hasExternalSchema(properties)
>       || columnNameProperty == null || columnNameProperty.isEmpty()
>       || columnTypeProperty == null || columnTypeProperty.isEmpty()) {
>     schema = determineSchemaOrReturnErrorSchema(configuration, properties);
>   } else {
>     // Get column names and sort order
>     columnNames = Arrays.asList(columnNameProperty.split(","));
>     columnTypes = TypeInfoUtils.getTypeInfosFromTypeString(columnTypeProperty);
>     schema = getSchemaFromCols(properties, columnNames, columnTypes, columnCommentProperty);
>     properties.setProperty(AvroSerdeUtils.AvroTableProperties.SCHEMA_LITERAL.getPropName(), schema.toString());
>   }
>   // ...
> }
> {code}
> For tables using {{avro.schema.url}}, every time the SerDe is initialized 
> (i.e. at least once per mapper), the schema file is read remotely. For 
> queries with thousands of mappers, this leads to a stampede to the handful 
> (3?) datanodes that host the schema-file. In the best case, this causes 
> slowdowns.
> It would be preferable to distribute the Avro-schema to all mappers as part 
> of the job-conf. The alternatives aren't exactly appealing:
> # One can't rely solely on the {{column.list.types}} stored in the Hive 
> metastore. (HIVE-14789).
> # {{avro.schema.literal}} might not always be usable, because of the 
> size-limit on table-parameters. The typical size of the Avro-schema file is 
> between 0.5-3MB, in my limited experience. Bumping the max table-parameter 
> size isn't a great solution.
> If the {{avro.schema.file}} were read during query-planning, and made 
> available as part of table-properties (but not serialized into the 
> metastore), the downstream logic would remain largely intact. I have a patch 
> that does this.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18034) Improving logging with HoS executors spend lots of time in GC

2018-03-08 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-18034:

Attachment: HIVE-18034.6.patch

> Improving logging with HoS executors spend lots of time in GC
> -
>
> Key: HIVE-18034
> URL: https://issues.apache.org/jira/browse/HIVE-18034
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-18034.1.patch, HIVE-18034.2.patch, 
> HIVE-18034.3.patch, HIVE-18034.4.patch, HIVE-18034.6.patch
>
>
> There are times when Spark will spend lots of time doing GC. The Spark 
> History UI shows a bunch of red flags when too much time is spent in GC. It 
> would be nice if those warnings are propagated to Hive.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18343) Remove LinkedList from ColumnStatsSemanticAnalyzer.java

2018-03-08 Thread BELUGA BEHR (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-18343:
---
Attachment: HIVE-18343.2.patch

> Remove LinkedList from ColumnStatsSemanticAnalyzer.java
> ---
>
> Key: HIVE-18343
> URL: https://issues.apache.org/jira/browse/HIVE-18343
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Trivial
> Attachments: HIVE-18343.1.patch, HIVE-18343.2.patch
>
>
> Remove {{LinkedList}} in favor of {{ArrayList}} for class 
> {{org.apache.hadoop.hive.ql.parse.ColumnStatsSemanticAnalyzer}}.
> {quote}
> The size, isEmpty, get, set, iterator, and listIterator operations run in 
> constant time. The add operation runs in amortized constant time, that is, 
> adding n elements requires O\(n\) time. All of the other operations run in 
> linear time (roughly speaking). *The constant factor is low compared to that 
> for the LinkedList implementation.*
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18343) Remove LinkedList from ColumnStatsSemanticAnalyzer.java

2018-03-08 Thread BELUGA BEHR (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-18343:
---
Status: Open  (was: Patch Available)

> Remove LinkedList from ColumnStatsSemanticAnalyzer.java
> ---
>
> Key: HIVE-18343
> URL: https://issues.apache.org/jira/browse/HIVE-18343
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Trivial
> Attachments: HIVE-18343.1.patch, HIVE-18343.2.patch
>
>
> Remove {{LinkedList}} in favor of {{ArrayList}} for class 
> {{org.apache.hadoop.hive.ql.parse.ColumnStatsSemanticAnalyzer}}.
> {quote}
> The size, isEmpty, get, set, iterator, and listIterator operations run in 
> constant time. The add operation runs in amortized constant time, that is, 
> adding n elements requires O\(n\) time. All of the other operations run in 
> linear time (roughly speaking). *The constant factor is low compared to that 
> for the LinkedList implementation.*
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18883) Add findbugs to yetus pre-commit checks

2018-03-08 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-18883:

Attachment: HIVE-18883.2.patch

> Add findbugs to yetus pre-commit checks
> ---
>
> Key: HIVE-18883
> URL: https://issues.apache.org/jira/browse/HIVE-18883
> Project: Hive
>  Issue Type: Sub-task
>  Components: Testing Infrastructure
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-18883.1.patch, HIVE-18883.2.patch
>
>
> We should enable FindBugs for our YETUS pre-commit checks, this will help 
> overall code quality and should decrease the overall number of bugs in Hive.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18883) Add findbugs to yetus pre-commit checks

2018-03-08 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16391980#comment-16391980
 ] 

Sahil Takiar commented on HIVE-18883:
-

Hive QA hasn't run yet, but attaching updated patch:

* Addressed Peter's comment
* Got FindBugs working for the standalone-metastore

> Add findbugs to yetus pre-commit checks
> ---
>
> Key: HIVE-18883
> URL: https://issues.apache.org/jira/browse/HIVE-18883
> Project: Hive
>  Issue Type: Sub-task
>  Components: Testing Infrastructure
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-18883.1.patch, HIVE-18883.2.patch
>
>
> We should enable FindBugs for our YETUS pre-commit checks, this will help 
> overall code quality and should decrease the overall number of bugs in Hive.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18878) Lower MoveTask Lock Logging to Debug

2018-03-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16391971#comment-16391971
 ] 

Hive QA commented on HIVE-18878:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
20s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
2s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
35s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
53s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
13s{color} | {color:red} The patch generated 49 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 14m  1s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-9558/dev-support/hive-personality.sh
 |
| git revision | master / 9b36ffa |
| Default Java | 1.8.0_111 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9558/yetus/patch-asflicense-problems.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9558/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Lower MoveTask Lock Logging to Debug
> 
>
> Key: HIVE-18878
> URL: https://issues.apache.org/jira/browse/HIVE-18878
> Project: Hive
>  Issue Type: Improvement
>  Components: Locking
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: Kryvenko Igor
>Priority: Minor
>  Labels: noob
> Attachments: HIVE-18878.1.patch, HIVE-18878.patch
>
>
> https://github.com/apache/hive/blob/cbb9233a3b39ab8489d777fc76f0758c49b69bef/ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java#L233-L234
> {code}
> LOG.info("about to release lock for output: {} lock: {}", output,
>   lock.getHiveLockObject().getName());
> try {
>   lockMgr.unlock(lock);
> } catch (LockException le) {
> {code}
> Change this to _debug_ level logging.  The lock manager's 'unlock' method 
> should have additional logging available.  Knowing that the unlock is about 
> to take place is only helpful for debugging, not for normal operation.
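
A minimal sketch of the requested change, the same statement demoted to debug:

{code:java}
LOG.debug("about to release lock for output: {} lock: {}", output,
    lock.getHiveLockObject().getName());
{code}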



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18889) update all parts of Hive to use the same Guava version

2018-03-08 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-18889:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Committed to master. Thanks for the review!

> update all parts of Hive to use the same Guava version
> --
>
> Key: HIVE-18889
> URL: https://issues.apache.org/jira/browse/HIVE-18889
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-18889.01.patch, HIVE-18889.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18910) Migrate to Murmur hash for shuffle and bucketing

2018-03-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16391946#comment-16391946
 ] 

Hive QA commented on HIVE-18910:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12913657/HIVE-18910.1.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/9557/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/9557/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-9557/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2018-03-08 21:39:07.418
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-9557/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2018-03-08 21:39:07.421
+ cd apache-github-source-source
+ git fetch origin
From https://github.com/apache/hive
   7edb1d6..9b36ffa  master -> origin/master
+ git reset --hard HEAD
HEAD is now at 7edb1d6 HIVE-18861 : druid-hdfs-storage is pulling in 
hadoop-aws-2.7.x and aws SDK, creating classpath problems on hadoop 3.x (Steve 
Loughran via Ashutosh Chauhan)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is behind 'origin/master' by 1 commit, and can be fast-forwarded.
  (use "git pull" to update your local branch)
+ git reset --hard origin/master
HEAD is now at 9b36ffa HIVE-18571 : stats issues for MM tables; ACID doesn't 
check state for CTAS (Sergey Shelukhin, reviewed by Eugene Koifman)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2018-03-08 21:39:15.725
+ rm -rf ../yetus_PreCommit-HIVE-Build-9557
+ mkdir ../yetus_PreCommit-HIVE-Build-9557
+ git gc
+ cp -R . ../yetus_PreCommit-HIVE-Build-9557
+ mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-9557/yetus
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
error: 
a/hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/AbstractRecordWriter.java:
 does not exist in index
error: 
a/hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/mutate/worker/BucketIdResolver.java:
 does not exist in index
error: 
a/hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/mutate/worker/BucketIdResolverImpl.java:
 does not exist in index
error: 
a/hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/mutate/worker/MutatorCoordinator.java:
 does not exist in index
error: 
a/hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/mutate/worker/TestBucketIdResolverImpl.java:
 does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java: 
does not exist in index
error: 
a/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/keyseries/VectorKeySeriesSerializedImpl.java:
 does not exist in index
error: 
a/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/reducesink/VectorReduceSinkCommonOperator.java:
 does not exist in index
error: 
a/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/reducesink/VectorReduceSinkObjectHashOperator.java:
 does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java: does not 
exist in index
error: 
a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java: does 
not exist in index
error: 
a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/FixedBucketPruningOptimizer.java:
 does not exist in index
error: 
a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/PrunerOperatorFactory.java: 
does not exist in index
error: 
a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/metainfo/annotation/OpTraitsRulesProcFactory.java:
 does not exist in index
error: 
a

[jira] [Commented] (HIVE-18831) Differentiate errors that are thrown by Spark tasks

2018-03-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16391930#comment-16391930
 ] 

Hive QA commented on HIVE-18831:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12913484/HIVE-18831.2.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 26 failed/errored test(s), 13353 tests 
executed
*Failed tests:*
{noformat}
TestNegativeCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=95)

[udf_invalid.q,authorization_uri_export.q,default_constraint_complex_default_value.q,druid_datasource2.q,view_update.q,default_partition_name.q,authorization_public_create.q,load_wrong_fileformat_rc_seq.q,default_constraint_invalid_type.q,altern1.q,describe_xpath1.q,drop_view_failure2.q,temp_table_rename.q,invalid_select_column_with_subquery.q,udf_trunc_error1.q,insert_view_failure.q,dbtxnmgr_nodbunlock.q,authorization_show_columns.q,cte_recursion.q,merge_constraint_notnull.q,load_part_nospec.q,clusterbyorderby.q,orc_type_promotion2.q,ctas_noperm_loc.q,udf_instr_wrong_args_len.q,invalid_create_tbl2.q,part_col_complex_type.q,authorization_drop_db_empty.q,smb_mapjoin_14.q,subquery_scalar_multi_rows.q,alter_partition_coltype_2columns.q,subquery_corr_in_agg.q,insert_overwrite_notnull_constraint.q,authorization_show_grant_otheruser_wtab.q,regex_col_groupby.q,udaf_collect_set_unsupported.q,ptf_negative_DuplicateWindowAlias.q,exim_22_export_authfail.q,udf_likeany_wrong1.q,groupby_key.q,ambiguous_col.q,groupby3_multi_distinct.q,authorization_alter_drop_ptn.q,invalid_cast_from_binary_5.q,show_create_table_does_not_exist.q,invalid_select_column.q,exim_20_managed_location_over_existing.q,interval_3.q,authorization_compile.q,join35.q,merge_negative_3.q,udf_concat_ws_wrong3.q,create_or_replace_view8.q,create_external_with_notnull_constraint.q,split_sample_out_of_range.q,materialized_view_no_transactional_rewrite.q,authorization_show_grant_otherrole.q,create_with_constraints_duplicate_name.q,invalid_stddev_samp_syntax.q,authorization_view_disable_cbo_7.q,autolocal1.q,avro_non_nullable_union.q,load_orc_negative_part.q,drop_view_failure1.q,columnstats_partlvl_invalid_values_autogather.q,exim_13_nonnative_import.q,alter_table_wrong_regex.q,add_partition_with_whitelist.q,udf_next_day_error_2.q,authorization_select.q,udf_trunc_error2.q,authorization_view_7.q,udf_format_number_wrong5.q,touch2.q,exim_03_nonpart_noncompat_colschema.q,orc_type_promotion1.q,lateral_view_alias.q,show_tables_bad_db1.q,unset_table_property.q,alter_non_native.q,nvl_mismatch_type.q,load_orc_negative3.q,authorization_create_role_no_admin.q,invalid_distinct1.q,authorization_grant_server.q,orc_type_promotion3_acid.q,show_tables_bad1.q,macro_unused_parameter.q,drop_invalid_constraint3.q,drop_partition_filter_failure.q,char_pad_convert_fail3.q,exim_23_import_exist_authfail.q,drop_invalid_constraint4.q,authorization_create_macro1.q,archive1.q,subquery_multiple_cols_in_select.q,change_hive_hdfs_session_path.q,udf_trunc_error3.q,invalid_variance_syntax.q,authorization_truncate_2.q,invalid_avg_syntax.q,invalid_select_column_with_tablename.q,mm_truncate_cols.q,groupby_grouping_sets1.q,druid_location.q,groupby2_multi_distinct.q,authorization_sba_drop_table.q,dynamic_partitions_with_whitelist.q,compare_string_bigint_2.q,udf_greatest_error_2.q,authorization_view_6.q,show_tablestatus.q,duplicate_alias_in_transform_schema.q,create_with_fk_uk_same_tab.q,udtf_not_supported3.q,alter_table_constraint_invalid_fk_col2.q,udtf_not_supported1.q,dbtxnmgr_notableunlock.q,ptf_negative_InvalidValueBoundary.q,alter_table_constraint_duplicate_pk.q,udf_printf_wrong4.q,create_view_failure9.q,udf_elt_wrong_type.q,selectDistinctStarNeg_1.q,invalid_mapjoin1.q,load_stored_as_dirs.q,input1.q,udf_sort_array_wrong1.q,invalid_distinct2.q,invalid_select_fn.q,authorization_role_grant_otherrole.q,archive4.q,load_nonpart_authfail.q,recursive_view.q,authorization_view_disable_cbo_1.q,desc_failure4.q,create_not_acid.q,udf_sort_array_wrong3.q,char_pad_convert_fail0.q,udf_map_values
_arg_type.q,alter_view_failure6_2.q,alter_partition_change_col_nonexist.q,update_non_acid_table.q,authorization_view_disable_cbo_5.q,ct_noperm_loc.q,interval_1.q,authorization_show_grant_otheruser_all.q,authorization_view_2.q,show_tables_bad2.q,groupby_rollup2.q,truncate_column_seqfile.q,create_view_failure5.q,authorization_create_view.q,ptf_window_boundaries.q,ctasnullcol.q,input_part0_neg_2.q,create_or_replace_view1.q,udf_max.q,exim_01_nonpart_over_loaded.q,msck_repair_1.q,orc_change_fileformat_acid.q,udf_nonexistent_resource.q,exim_19_external_over_existing.q,serde_regex2.q,msck_repair_2.q,exim_06_nonpart_noncompat_storage.q,illegal_partition_type4.q,udf_sort_array_by_wrong1.q,create_or_replace_view5.q,windowing_leadlag_in_udaf.q,avro_decimal.q,materialized_view_updat

[jira] [Commented] (HIVE-18911) LOAD.. code for MM has some suspect/dead code

2018-03-08 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16391914#comment-16391914
 ] 

Sergey Shelukhin commented on HIVE-18911:
-

cc [~ekoifman] [~steveyeom2017]

> LOAD.. code for MM has some suspect/dead code
> -
>
> Key: HIVE-18911
> URL: https://issues.apache.org/jira/browse/HIVE-18911
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Priority: Major
>
> Discovered in HIVE-18571 and added TODO-s that need to be addressed.
> E.g. {noformat}
> if (isMmTableWrite) {
>   // We will load into MM directory, and delete from the parent if needed.
>   // TODO: this looks invalid after ACID integration. What about base dirs?
>   destPath = new Path(destPath, AcidUtils.deltaSubdir(writeId, writeId, stmtId));
> ...
>   // TODO: loadFileType for MM table will no longer be REPLACE_ALL
>   filter = (loadFileType == LoadFileType.REPLACE_ALL)
> {noformat}
> There are 2 places like that.
> Also, replaceFiles has an isMmTableWrite flag that should no longer be needed 
> (since for a transactional table we should never replace files). Either 
> there's some invalid code path that relies on it (load table?), or it is just 
> unused and needs to be removed.
> The flag is also used in 2 places with the note "TODO: this should never run 
> for MM tables anymore. Remove the flag, and maybe the filter?"



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18912) Stats task should set accurate flag to false on failure

2018-03-08 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-18912:

Description: 
In my testing, I noticed that sometimes when stats task fails, the 
accurate=true flag is not cleaned and stats become misleading.

See the code where it presets it e.g. for txn table.
{noformat}
StatsSetupConst.setBasicStatsState(parameters, StatsSetupConst.FALSE);
{noformat}

  was:
See the code where it presets it e.g. for txn table.
{noformat}
StatsSetupConst.setBasicStatsState(parameters, StatsSetupConst.FALSE);
{noformat}
In my testing, I noticed that sometimes when stats task fails, the 
accurate=true flag is not cleaned.


> Stats task should set accurate flag to false on failure
> ---
>
> Key: HIVE-18912
> URL: https://issues.apache.org/jira/browse/HIVE-18912
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Priority: Major
>
> In my testing, I noticed that sometimes when stats task fails, the 
> accurate=true flag is not cleaned and stats become misleading.
> See the code where it presets it e.g. for txn table.
> {noformat}
> StatsSetupConst.setBasicStatsState(parameters, StatsSetupConst.FALSE);
> {noformat}
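
A hedged sketch of the proposed failure handling; {{computeStats}} and {{tbl}} 
are hypothetical stand-ins for whatever the stats task actually runs:

{code:java}
try {
  computeStats(tbl); // hypothetical: the actual stats computation
} catch (Exception e) {
  // On failure, clear the accurate=true marker so stale stats aren't trusted.
  StatsSetupConst.setBasicStatsState(tbl.getParameters(), StatsSetupConst.FALSE);
  throw e;
}
{code}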



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18912) Stats task should set accurate flag to false on failure

2018-03-08 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16391912#comment-16391912
 ] 

Sergey Shelukhin commented on HIVE-18912:
-

[~ashutoshc] fyi

> Stats task should set accurate flag to false on failure
> ---
>
> Key: HIVE-18912
> URL: https://issues.apache.org/jira/browse/HIVE-18912
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Priority: Major
>
> See the code where it presets it e.g. for txn table.
> {noformat}
> StatsSetupConst.setBasicStatsState(parameters, StatsSetupConst.FALSE);
> {noformat}
> In my testing, I noticed that sometimes when stats task fails, the 
> accurate=true flag is not cleaned.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18344) Remove LinkedList from SharedWorkOptimizer.java

2018-03-08 Thread BELUGA BEHR (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16391911#comment-16391911
 ] 

BELUGA BEHR commented on HIVE-18344:


[~pvary] [~aihuaxu] Commit please :)

> Remove LinkedList from SharedWorkOptimizer.java
> ---
>
> Key: HIVE-18344
> URL: https://issues.apache.org/jira/browse/HIVE-18344
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Trivial
> Attachments: HIVE-18344.1.patch
>
>
> Prefer {{ArrayList}} over {{LinkedList}} especially in this class because the 
> initial size of the collection is known.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18910) Migrate to Murmur hash for shuffle and bucketing

2018-03-08 Thread Deepak Jaiswal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-18910:
--
Attachment: HIVE-18910.1.patch

> Migrate to Murmur hash for shuffle and bucketing
> 
>
> Key: HIVE-18910
> URL: https://issues.apache.org/jira/browse/HIVE-18910
> Project: Hive
>  Issue Type: Task
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-18910.1.patch
>
>
> Hive uses the Java hash, which is not as good as Murmur for distribution 
> and efficiency when bucketing a table.
> Migrate to Murmur hash, but keep backward compatibility for existing 
> users so that they don't have to reload their existing tables.
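
An illustrative sketch of Murmur-based bucket assignment (Guava's {{Hashing}}; 
not Hive's actual implementation, which must also stay compatible with the 
legacy Java hash):

{code:java}
import java.nio.charset.StandardCharsets;
import com.google.common.hash.Hashing;

// Illustrative only: stable bucket id from a Murmur3 hash of the key.
static int murmurBucket(String key, int numBuckets) {
  int h = Hashing.murmur3_32().hashString(key, StandardCharsets.UTF_8).asInt();
  return (h & Integer.MAX_VALUE) % numBuckets; // mask sign bit before modulo
}
{code}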



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-16855) org.apache.hadoop.hive.ql.exec.mr.HashTableLoader Improvements

2018-03-08 Thread BELUGA BEHR (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16391908#comment-16391908
 ] 

BELUGA BEHR commented on HIVE-16855:


[~stakiar] Posted new patch with the {{Arrays#fill}} added back in.  Thanks!

> org.apache.hadoop.hive.ql.exec.mr.HashTableLoader Improvements
> --
>
> Key: HIVE-16855
> URL: https://issues.apache.org/jira/browse/HIVE-16855
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.1.1, 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-16855.1.patch, HIVE-16855.2.patch, 
> HIVE-16855.3.patch
>
>
> # Improve (Simplify) Logging
> # Remove custom buffer size for {{BufferedInputStream}} and instead rely on 
> JVM default which is often larger these days (8192)
> # Simplify looping logic
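
For item 2, an illustrative before/after (the stream source is a placeholder, 
not the actual HashTableLoader code):

{code:java}
// Before: new BufferedInputStream(rawStream, SOME_CUSTOM_SIZE);
// After: rely on the JVM default buffer size (8192 bytes).
InputStream in = new BufferedInputStream(rawStream);
{code}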



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18912) Stats task should set accurate flag to false on failure

2018-03-08 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-18912:

Summary: Stats task should set accurate flag to false on failure  (was: 
Stats task should set inaccurate flag on failure)

> Stats task should set accurate flag to false on failure
> ---
>
> Key: HIVE-18912
> URL: https://issues.apache.org/jira/browse/HIVE-18912
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Priority: Major
>
> See the code where it presets it e.g. for txn table.
> {noformat}
> StatsSetupConst.setBasicStatsState(parameters, StatsSetupConst.FALSE);
> {noformat}
> In my testing, I noticed that sometimes when stats task fails, the 
> accurate=true flag is not cleaned.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HIVE-16855) org.apache.hadoop.hive.ql.exec.mr.HashTableLoader Improvements

2018-03-08 Thread BELUGA BEHR (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16391908#comment-16391908
 ] 

BELUGA BEHR edited comment on HIVE-16855 at 3/8/18 9:07 PM:


[~stakiar] Posted new patch with the {{Arrays#fill}} added back in.  Thanks!


was (Author: belugabehr):
[~stakiar] Posted new patch with the {Arrays#fill}} added back in.  Thanks!

> org.apache.hadoop.hive.ql.exec.mr.HashTableLoader Improvements
> --
>
> Key: HIVE-16855
> URL: https://issues.apache.org/jira/browse/HIVE-16855
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.1.1, 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-16855.1.patch, HIVE-16855.2.patch, 
> HIVE-16855.3.patch
>
>
> # Improve (Simplify) Logging
> # Remove custom buffer size for {{BufferedInputStream}} and instead rely on 
> JVM default which is often larger these days (8192)
> # Simplify looping logic



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-16855) org.apache.hadoop.hive.ql.exec.mr.HashTableLoader Improvements

2018-03-08 Thread BELUGA BEHR (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-16855:
---
Status: Patch Available  (was: Open)

> org.apache.hadoop.hive.ql.exec.mr.HashTableLoader Improvements
> --
>
> Key: HIVE-16855
> URL: https://issues.apache.org/jira/browse/HIVE-16855
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.1.1, 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-16855.1.patch, HIVE-16855.2.patch, 
> HIVE-16855.3.patch
>
>
> # Improve (Simplify) Logging
> # Remove custom buffer size for {{BufferedInputStream}} and instead rely on 
> JVM default which is often larger these days (8192)
> # Simplify looping logic



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work started] (HIVE-18910) Migrate to Murmur hash for shuffle and bucketing

2018-03-08 Thread Deepak Jaiswal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-18910 started by Deepak Jaiswal.
-
> Migrate to Murmur hash for shuffle and bucketing
> 
>
> Key: HIVE-18910
> URL: https://issues.apache.org/jira/browse/HIVE-18910
> Project: Hive
>  Issue Type: Task
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
>
> Hive uses the Java hash, which is not as good as Murmur for distribution 
> and efficiency when bucketing a table.
> Migrate to Murmur hash, but keep backward compatibility for existing 
> users so that they don't have to reload their existing tables.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18910) Migrate to Murmur hash for shuffle and bucketing

2018-03-08 Thread Deepak Jaiswal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-18910:
--
Status: Patch Available  (was: In Progress)

> Migrate to Murmur hash for shuffle and bucketing
> 
>
> Key: HIVE-18910
> URL: https://issues.apache.org/jira/browse/HIVE-18910
> Project: Hive
>  Issue Type: Task
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
>
> Hive uses the Java hash, which is not as good as Murmur for distribution 
> and efficiency when bucketing a table.
> Migrate to Murmur hash, but keep backward compatibility for existing 
> users so that they don't have to reload their existing tables.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18875) Enable SMB Join by default in Tez

2018-03-08 Thread Deepak Jaiswal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-18875:
--
Issue Type: Task  (was: Bug)

> Enable SMB Join by default in Tez
> -
>
> Key: HIVE-18875
> URL: https://issues.apache.org/jira/browse/HIVE-18875
> Project: Hive
>  Issue Type: Task
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-18875.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-16855) org.apache.hadoop.hive.ql.exec.mr.HashTableLoader Improvements

2018-03-08 Thread BELUGA BEHR (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-16855:
---
Status: Open  (was: Patch Available)

> org.apache.hadoop.hive.ql.exec.mr.HashTableLoader Improvements
> --
>
> Key: HIVE-16855
> URL: https://issues.apache.org/jira/browse/HIVE-16855
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.1.1, 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-16855.1.patch, HIVE-16855.2.patch, 
> HIVE-16855.3.patch
>
>
> # Improve (Simplify) Logging
> # Remove custom buffer size for {{BufferedInputStream}} and instead rely on 
> JVM default which is often larger these days (8192)
> # Simplify looping logic



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-16855) org.apache.hadoop.hive.ql.exec.mr.HashTableLoader Improvements

2018-03-08 Thread BELUGA BEHR (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-16855:
---
Attachment: HIVE-16855.3.patch

> org.apache.hadoop.hive.ql.exec.mr.HashTableLoader Improvements
> --
>
> Key: HIVE-16855
> URL: https://issues.apache.org/jira/browse/HIVE-16855
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.1.1, 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-16855.1.patch, HIVE-16855.2.patch, 
> HIVE-16855.3.patch
>
>
> # Improve (Simplify) Logging
> # Remove custom buffer size for {{BufferedInputStream}} and instead rely on 
> JVM default which is often larger these days (8192)
> # Simplify looping logic



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18910) Migrate to Murmur hash for shuffle and bucketing

2018-03-08 Thread Deepak Jaiswal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-18910:
--
Issue Type: Task  (was: Bug)

> Migrate to Murmur hash for shuffle and bucketing
> 
>
> Key: HIVE-18910
> URL: https://issues.apache.org/jira/browse/HIVE-18910
> Project: Hive
>  Issue Type: Task
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
>
> Hive uses the Java hash, which is not as good as Murmur for distribution 
> and efficiency when bucketing a table.
> Migrate to Murmur hash, but keep backward compatibility for existing 
> users so that they don't have to reload their existing tables.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-18910) Migrate to Murmur hash for shuffle and bucketing

2018-03-08 Thread Deepak Jaiswal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal reassigned HIVE-18910:
-


> Migrate to Murmur hash for shuffle and bucketing
> 
>
> Key: HIVE-18910
> URL: https://issues.apache.org/jira/browse/HIVE-18910
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
>
> Hive uses the Java hash, which is not as good as Murmur for distribution 
> and efficiency when bucketing a table.
> Migrate to Murmur hash, but keep backward compatibility for existing 
> users so that they don't have to reload their existing tables.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18878) Lower MoveTask Lock Logging to Debug

2018-03-08 Thread Kryvenko Igor (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16391904#comment-16391904
 ] 

Kryvenko Igor commented on HIVE-18878:
--

[~stakiar] Could you please review?

> Lower MoveTask Lock Logging to Debug
> 
>
> Key: HIVE-18878
> URL: https://issues.apache.org/jira/browse/HIVE-18878
> Project: Hive
>  Issue Type: Improvement
>  Components: Locking
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: Kryvenko Igor
>Priority: Minor
>  Labels: noob
> Attachments: HIVE-18878.1.patch, HIVE-18878.patch
>
>
> https://github.com/apache/hive/blob/cbb9233a3b39ab8489d777fc76f0758c49b69bef/ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java#L233-L234
> {code}
> LOG.info("about to release lock for output: {} lock: {}", output,
>   lock.getHiveLockObject().getName());
> try {
>   lockMgr.unlock(lock);
> } catch (LockException le) {
> {code}
> Change this to _debug_ level logging.  The lock manager's 'unlock' method 
> should have additional logging available.  Knowing that the unlock is about 
> to take place is only helpful for debugging, not for normal operation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

