[jira] [Commented] (HIVE-19014) utilize YARN-8028 (queue ACL check) in Hive Tez session pool

2018-04-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16426490#comment-16426490
 ] 

Hive QA commented on HIVE-19014:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12917432/HIVE-19014.05.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 230 failed/errored test(s), 13309 tests 
executed
*Failed tests:*
{noformat}
TestBeeLineDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=252)
TestCopyUtils - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestDbNotificationListener - did not produce a TEST-*.xml file (likely timed 
out) (batchId=246)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=252)
TestExportImport - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestHCatHiveCompatibility - did not produce a TEST-*.xml file (likely timed 
out) (batchId=246)
TestMiniDruidCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=252)
TestMiniDruidKafkaCliDriver - did not produce a TEST-*.xml file (likely timed 
out) (batchId=252)
TestMinimrCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=93)

[infer_bucket_sort_num_buckets.q,infer_bucket_sort_reducers_power_two.q,parallel_orderby.q,bucket_num_reducers_acid.q,infer_bucket_sort_map_operators.q,infer_bucket_sort_merge.q,root_dir_external_table.q,infer_bucket_sort_dyn_part.q,udf_using.q,bucket_num_reducers_acid2.q]
TestNegativeCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=95)


[jira] [Updated] (HIVE-18910) Migrate to Murmur hash for shuffle and bucketing

2018-04-04 Thread Deepak Jaiswal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-18910:
--
Attachment: HIVE-18910.23.patch

> Migrate to Murmur hash for shuffle and bucketing
> 
>
> Key: HIVE-18910
> URL: https://issues.apache.org/jira/browse/HIVE-18910
> Project: Hive
>  Issue Type: Task
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-18910.1.patch, HIVE-18910.10.patch, 
> HIVE-18910.11.patch, HIVE-18910.12.patch, HIVE-18910.13.patch, 
> HIVE-18910.14.patch, HIVE-18910.15.patch, HIVE-18910.16.patch, 
> HIVE-18910.17.patch, HIVE-18910.18.patch, HIVE-18910.19.patch, 
> HIVE-18910.2.patch, HIVE-18910.20.patch, HIVE-18910.21.patch, 
> HIVE-18910.22.patch, HIVE-18910.23.patch, HIVE-18910.3.patch, 
> HIVE-18910.4.patch, HIVE-18910.5.patch, HIVE-18910.6.patch, 
> HIVE-18910.7.patch, HIVE-18910.8.patch, HIVE-18910.9.patch
>
>
> Hive uses the Java hash, which does not distribute values as well or as 
> efficiently as Murmur hash when bucketing a table.
> Migrate to Murmur hash, but keep backward compatibility for existing 
> users so that they don't have to reload their existing tables.
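For context, Hive's conventional bucket assignment is `(hash(key) & Integer.MAX_VALUE) % numBuckets`, so swapping the hash function changes which bucket every row lands in. The sketch below is an illustrative reconstruction, not Hive's actual code: the Murmur3 x86 32-bit mix is the standard public-domain algorithm, and `bucketFor` is a hypothetical helper showing how the same bucketing formula can sit on top of either hash (which is why a compatibility flag is needed for old tables).

```java
import java.nio.charset.StandardCharsets;

public class BucketingSketch {
    // Standard MurmurHash3 x86 32-bit (public-domain algorithm);
    // a sketch, not Hive's actual implementation.
    static int murmur3_32(byte[] data, int seed) {
        final int c1 = 0xcc9e2d51, c2 = 0x1b873593;
        int h = seed, i = 0;
        for (; i + 4 <= data.length; i += 4) {
            int k = (data[i] & 0xff) | ((data[i + 1] & 0xff) << 8)
                  | ((data[i + 2] & 0xff) << 16) | ((data[i + 3] & 0xff) << 24);
            k *= c1; k = Integer.rotateLeft(k, 15); k *= c2;
            h ^= k; h = Integer.rotateLeft(h, 13); h = h * 5 + 0xe6546b64;
        }
        int k = 0;
        switch (data.length - i) {  // 1-3 tail bytes; fallthrough is intentional
            case 3: k ^= (data[i + 2] & 0xff) << 16;
            case 2: k ^= (data[i + 1] & 0xff) << 8;
            case 1: k ^= (data[i] & 0xff);
                    k *= c1; k = Integer.rotateLeft(k, 15); k *= c2;
                    h ^= k;
        }
        h ^= data.length;
        // Finalization mix to force avalanching of the last bits.
        h ^= h >>> 16; h *= 0x85ebca6b; h ^= h >>> 13; h *= 0xc2b2ae35; h ^= h >>> 16;
        return h;
    }

    // Hive-style bucket assignment: mask to non-negative, then modulo.
    static int bucketFor(String key, int numBuckets, boolean useMurmur) {
        int hash = useMurmur
                ? murmur3_32(key.getBytes(StandardCharsets.UTF_8), 0)
                : key.hashCode();  // legacy Java hash
        return (hash & Integer.MAX_VALUE) % numBuckets;
    }
}
```

Because a given key maps to different buckets under the two hashes, existing bucketed tables written with the Java hash must either be reloaded or read with the legacy hash, which is the backward-compatibility concern the issue describes.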



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19115) Merge: Semijoin hints are dropped by the merge

2018-04-04 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-19115:
---
Description: 
{code}
create table target stored as orc as select ss_ticket_number, ss_item_sk, 
current_timestamp as `ts` from tpcds_bin_partitioned_orc_1000.store_sales;

create table source stored as orc as select sr_ticket_number, sr_item_sk, 
d_date from tpcds_bin_partitioned_orc_1000.store_returns join 
tpcds_bin_partitioned_orc_1000.date_dim where d_date_sk = sr_returned_date_sk;


merge /* +semi(T, sr_ticket_number, S, 1) */ into target T using (select * 
from source where year(d_date) = 1998) S ON T.ss_ticket_number = 
S.sr_ticket_number and sr_item_sk = ss_item_sk 
when matched THEN UPDATE SET ts = current_timestamp
when not matched and sr_item_sk is not null and sr_ticket_number is not null 
THEN INSERT VALUES(S.sr_ticket_number, S.sr_item_sk, current_timestamp);
{code}

The semijoin hints are ignored and the code says 

{code}
 todo: do we care to preserve comments in original SQL?
{code}


https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/parse/UpdateDeleteSemanticAnalyzer.java#L624

in this case we do.
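One possible direction, sketched below with purely hypothetical names (this is not the UpdateDeleteSemanticAnalyzer API, and the regex is an illustration): extract the hint comment from the original MERGE text before the statement is rewritten, then splice it back into the generated SQL.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HintPreservingSketch {
    // Matches an optimizer hint written as a comment, e.g. /* +semi(...) */.
    // Illustrative only; Hive's real lexer handles comments differently.
    private static final Pattern HINT =
            Pattern.compile("/\\*\\s*(\\+[^*]+?)\\s*\\*/");

    // Pull the hint text out of the original statement, or null if none.
    static String extractHint(String sql) {
        Matcher m = HINT.matcher(sql);
        return m.find() ? m.group(1) : null;
    }

    // Hypothetical helper: carry the hint into the rewritten statement,
    // placing it after the first SELECT, where Hive hints conventionally go.
    static String reapplyHint(String rewrittenSql, String hint) {
        if (hint == null) return rewrittenSql;
        return rewrittenSql.replaceFirst("(?i)\\bselect\\b",
                "select /*" + hint + " */");
    }
}
```

The key point is only that the hint must survive the MERGE-to-multi-insert rewrite in some form; how Hive actually threads it through the parse tree is up to the fix.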


  was:
{code}
create table target stored as orc as select ss_ticket_number, ss_item_sk, 
current_timestamp as `ts` from tpcds_bin_partitioned_orc_1000.store_sales;

create table source stored as orc as select sr_ticket_number, sr_item_sk, 
d_date from tpcds_bin_partitioned_orc_1000.store_returns join 
tpcds_bin_partitioned_orc_1000.date_dim where d_date_sk = sr_returned_date_sk;


merge /* +semi(T, sr_ticket_number, S, 1) */ into target T using (select * 
from source where year(d_date) = 1998) S ON T.ss_ticket_number = 
S.sr_ticket_number and sr_item_sk = ss_item_sk 
when matched is null THEN UPDATE SET ts = current_timestamp
when not matched and sr_item_sk is not null and sr_ticket_number is not null 
THEN INSERT VALUES(S.sr_ticket_number, S.sr_item_sk, current_timestamp);
{code}

The semijoin hints are ignored and the code says 

{code}
 todo: do we care to preserve comments in original SQL?
{code}


https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/parse/UpdateDeleteSemanticAnalyzer.java#L624

in this case we do.



> Merge: Semijoin hints are dropped by the merge
> --
>
> Key: HIVE-19115
> URL: https://issues.apache.org/jira/browse/HIVE-19115
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Priority: Major
>
> {code}
> create table target stored as orc as select ss_ticket_number, ss_item_sk, 
> current_timestamp as `ts` from tpcds_bin_partitioned_orc_1000.store_sales;
> create table source stored as orc as select sr_ticket_number, sr_item_sk, 
> d_date from tpcds_bin_partitioned_orc_1000.store_returns join 
> tpcds_bin_partitioned_orc_1000.date_dim where d_date_sk = sr_returned_date_sk;
> merge /* +semi(T, sr_ticket_number, S, 1) */ into target T using (select 
> * from source where year(d_date) = 1998) S ON T.ss_ticket_number = 
> S.sr_ticket_number and sr_item_sk = ss_item_sk 
> when matched THEN UPDATE SET ts = current_timestamp
> when not matched and sr_item_sk is not null and sr_ticket_number is not null 
> THEN INSERT VALUES(S.sr_ticket_number, S.sr_item_sk, current_timestamp);
> {code}
> The semijoin hints are ignored and the code says 
> {code}
>  todo: do we care to preserve comments in original SQL?
> {code}
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/parse/UpdateDeleteSemanticAnalyzer.java#L624
> in this case we do.





[jira] [Updated] (HIVE-19115) Merge: Semijoin hints are dropped by the merge

2018-04-04 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-19115:
---
Description: 
{code}
create table target stored as orc as select ss_ticket_number, ss_item_sk, 
current_timestamp as `ts` from tpcds_bin_partitioned_orc_1000.store_sales;

create table source stored as orc as select sr_ticket_number, sr_item_sk, 
d_date from tpcds_bin_partitioned_orc_1000.store_returns join 
tpcds_bin_partitioned_orc_1000.date_dim where d_date_sk = sr_returned_date_sk;


merge /* +semi(T, sr_ticket_number, S, 1) */ into target T using (select * 
from source where year(d_date) = 1998) S ON T.ss_ticket_number = 
S.sr_ticket_number and sr_item_sk = ss_item_sk 
when matched is null THEN UPDATE SET ts = current_timestamp
when not matched and sr_item_sk is not null and sr_ticket_number is not null 
THEN INSERT VALUES(S.sr_ticket_number, S.sr_item_sk, current_timestamp);
{code}

The semijoin hints are ignored and the code says 

{code}
 todo: do we care to preserve comments in original SQL?
{code}


https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/parse/UpdateDeleteSemanticAnalyzer.java#L624

in this case we do.


  was:
{code}
create table target stored as orc as select ss_ticket_number, ss_item_sk, 
current_timestamp as `ts` from tpcds_bin_partitioned_orc_1000.store_sales;

create table source stored as orc as select sr_ticket_number, sr_item_sk, 
d_date from tpcds_bin_partitioned_orc_1000.store_returns join 
tpcds_bin_partitioned_orc_1000.date_dim where d_date_sk = sr_returned_date_sk;


explain
merge /* +semi(T, sr_ticket_number, S, 1) */ into target T using (select * 
from source where year(d_date) = 1998) S ON T.ss_ticket_number = 
S.sr_ticket_number and sr_item_sk = ss_item_sk 
when matched and ss_item_sk is null THEN UPDATE SET ts = current_timestamp
when not matched and ss_item_sk is null and sr_item_sk is not null and 
sr_ticket_number is not null THEN INSERT VALUES(S.sr_ticket_number, 
S.sr_item_sk, current_timestamp);
{code}

The semijoin hints are ignored and the code says 

{code}
 todo: do we care to preserve comments in original SQL?
{code}


https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/parse/UpdateDeleteSemanticAnalyzer.java#L624

in this case we do.



> Merge: Semijoin hints are dropped by the merge
> --
>
> Key: HIVE-19115
> URL: https://issues.apache.org/jira/browse/HIVE-19115
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Priority: Major
>
> {code}
> create table target stored as orc as select ss_ticket_number, ss_item_sk, 
> current_timestamp as `ts` from tpcds_bin_partitioned_orc_1000.store_sales;
> create table source stored as orc as select sr_ticket_number, sr_item_sk, 
> d_date from tpcds_bin_partitioned_orc_1000.store_returns join 
> tpcds_bin_partitioned_orc_1000.date_dim where d_date_sk = sr_returned_date_sk;
> merge /* +semi(T, sr_ticket_number, S, 1) */ into target T using (select 
> * from source where year(d_date) = 1998) S ON T.ss_ticket_number = 
> S.sr_ticket_number and sr_item_sk = ss_item_sk 
> when matched is null THEN UPDATE SET ts = current_timestamp
> when not matched and sr_item_sk is not null and sr_ticket_number is not null 
> THEN INSERT VALUES(S.sr_ticket_number, S.sr_item_sk, current_timestamp);
> {code}
> The semijoin hints are ignored and the code says 
> {code}
>  todo: do we care to preserve comments in original SQL?
> {code}
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/parse/UpdateDeleteSemanticAnalyzer.java#L624
> in this case we do.





[jira] [Commented] (HIVE-19014) utilize YARN-8028 (queue ACL check) in Hive Tez session pool

2018-04-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16426461#comment-16426461
 ] 

Hive QA commented on HIVE-19014:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
48s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
 6s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  
6s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
18s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
38s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
9s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m  
3s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
48s{color} | {color:red} ql: The patch generated 12 new + 428 unchanged - 0 
fixed = 440 total (was 428) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
39s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
16s{color} | {color:red} The patch generated 50 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 22m 20s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-10004/dev-support/hive-personality.sh
 |
| git revision | master / dc5a943 |
| Default Java | 1.8.0_111 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-10004/yetus/diff-checkstyle-ql.txt
 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-10004/yetus/patch-asflicense-problems.txt
 |
| modules | C: common ql service U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-10004/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> utilize YARN-8028 (queue ACL check) in Hive Tez session pool
> 
>
> Key: HIVE-19014
> URL: https://issues.apache.org/jira/browse/HIVE-19014
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19014.01.patch, HIVE-19014.02.patch, 
> HIVE-19014.03.patch, HIVE-19014.04.patch, HIVE-19014.05.patch, 
> HIVE-19014.patch
>
>






[jira] [Updated] (HIVE-19112) Support Analyze table for partitioned tables without partition spec

2018-04-04 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-19112:
---
Status: Patch Available  (was: Open)

> Support Analyze table for partitioned tables without partition spec
> ---
>
> Key: HIVE-19112
> URL: https://issues.apache.org/jira/browse/HIVE-19112
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-19112.1.patch
>
>
> Currently, to run analyze table compute statistics on a partitioned table, a 
> partition spec needs to be specified. We should make it optional.





[jira] [Updated] (HIVE-19112) Support Analyze table for partitioned tables without partition spec

2018-04-04 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-19112:
---
Attachment: HIVE-19112.1.patch

> Support Analyze table for partitioned tables without partition spec
> ---
>
> Key: HIVE-19112
> URL: https://issues.apache.org/jira/browse/HIVE-19112
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-19112.1.patch
>
>
> Currently, to run analyze table compute statistics on a partitioned table, a 
> partition spec needs to be specified. We should make it optional.





[jira] [Commented] (HIVE-19044) Duplicate field names within Druid Query Generated by Calcite plan

2018-04-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16426443#comment-16426443
 ] 

Hive QA commented on HIVE-19044:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12917428/HIVE-19044.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 100 failed/errored test(s), 13181 tests 
executed
*Failed tests:*
{noformat}
TestBeeLineDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=252)
TestCopyUtils - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestDbNotificationListener - did not produce a TEST-*.xml file (likely timed 
out) (batchId=246)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=252)
TestExportImport - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestHCatHiveCompatibility - did not produce a TEST-*.xml file (likely timed 
out) (batchId=246)
TestJdbcWithMiniKdcSQLAuthHttp - did not produce a TEST-*.xml file (likely 
timed out) (batchId=253)
TestMiniDruidCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=252)
TestMiniDruidKafkaCliDriver - did not produce a TEST-*.xml file (likely timed 
out) (batchId=252)
TestNegativeCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=95)


[jira] [Assigned] (HIVE-19114) MV rewriting not being triggered for last query in materialized_view_rewrite_4.q

2018-04-04 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez reassigned HIVE-19114:
--


> MV rewriting not being triggered for last query in 
> materialized_view_rewrite_4.q
> 
>
> Key: HIVE-19114
> URL: https://issues.apache.org/jira/browse/HIVE-19114
> Project: Hive
>  Issue Type: Bug
>  Components: Materialized views
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>
> {code:sql}
> create materialized view mv1 enable rewrite as
> select dependents.empid, emps.deptno, count(distinct salary) as s
> from emps
> join dependents on (emps.empid = dependents.empid)
> group by dependents.empid, emps.deptno;
> select emps.deptno, count(distinct salary) as s
> from emps
> join dependents on (emps.empid = dependents.empid)
> group by dependents.empid, emps.deptno;
> {code}





[jira] [Commented] (HIVE-18839) Implement incremental rebuild for materialized views (only insert operations in source tables)

2018-04-04 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16426416#comment-16426416
 ] 

Jesus Camacho Rodriguez commented on HIVE-18839:


[~ashutoshc], addressed comments and uploaded a new patch to RB and the issue. 
I left a couple of questions in RB.
In turn, after moving one of the old rewriting tests to MiniLlapLocalDriver, 
rewriting is not being triggered for one query (the last query in 
{{materialized_view_rewrite_4.q}}), which is weird because the logical plan should 
be the same (I am thinking a different configuration property value may be 
altering the plan). The issue does not seem related in any case, so I will 
track it in a follow-up if that is OK.

> Implement incremental rebuild for materialized views (only insert operations 
> in source tables)
> --
>
> Key: HIVE-18839
> URL: https://issues.apache.org/jira/browse/HIVE-18839
> Project: Hive
>  Issue Type: Improvement
>  Components: Materialized views
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: TODOC3.0
> Attachments: HIVE-18839.01.patch, HIVE-18839.02.patch, 
> HIVE-18839.patch
>
>
> Implementation will follow the current code path for full rebuild. 
> When the MV query plan is retrieved, if the MV contents are outdated because 
> there were insert operations in the source tables, we will introduce a filter 
> with a condition based on the stored value of ValidWriteIdLists. For instance, 
> {{WRITE_ID < high_txn_id AND WRITE_ID NOT IN (x, y, ...)}}. Then the 
> rewriting will do the rest of the work by creating a partial rewriting, where 
> the contents of the MV are read as well as the new contents from the source 
> tables.
> This mechanism will work not only for ALTER MV... REBUILD, but also for user 
> queries, which will be able to benefit from using outdated MVs to compute part 
> of the needed results.
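The filter condition described above can be sketched as a simple predicate over write ids. The field names below are assumptions for illustration (this is not Hive's ValidWriteIdList API); the predicate mirrors {{WRITE_ID < high_txn_id AND WRITE_ID NOT IN (x, y, ...)}}.

```java
import java.util.Set;

public class WriteIdFilterSketch {
    // Field names are illustrative, not Hive's ValidWriteIdList API.
    final long highWatermark;          // first write id not yet known-committed
    final Set<Long> invalidWriteIds;   // open/aborted ids below the watermark

    WriteIdFilterSketch(long highWatermark, Set<Long> invalidWriteIds) {
        this.highWatermark = highWatermark;
        this.invalidWriteIds = invalidWriteIds;
    }

    // True iff a row written by writeId was already visible when the MV was
    // last rebuilt, so the rewriting can read it from the MV; rows failing
    // this predicate are the "new contents" read from the source tables.
    boolean coveredByMaterializedView(long writeId) {
        return writeId < highWatermark && !invalidWriteIds.contains(writeId);
    }
}
```

With such a predicate, the partial rewriting unions the MV contents (rows satisfying it) with source-table rows that do not, which is the split the issue describes.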





[jira] [Updated] (HIVE-18839) Implement incremental rebuild for materialized views (only insert operations in source tables)

2018-04-04 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-18839:
---
Attachment: HIVE-18839.02.patch

> Implement incremental rebuild for materialized views (only insert operations 
> in source tables)
> --
>
> Key: HIVE-18839
> URL: https://issues.apache.org/jira/browse/HIVE-18839
> Project: Hive
>  Issue Type: Improvement
>  Components: Materialized views
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: TODOC3.0
> Attachments: HIVE-18839.01.patch, HIVE-18839.02.patch, 
> HIVE-18839.patch
>
>
> Implementation will follow the current code path for full rebuild. 
> When the MV query plan is retrieved, if the MV contents are outdated because 
> there were insert operations in the source tables, we will introduce a filter 
> with a condition based on the stored value of ValidWriteIdLists. For instance, 
> {{WRITE_ID < high_txn_id AND WRITE_ID NOT IN (x, y, ...)}}. Then the 
> rewriting will do the rest of the work by creating a partial rewriting, where 
> the contents of the MV are read as well as the new contents from the source 
> tables.
> This mechanism will work not only for ALTER MV... REBUILD, but also for user 
> queries, which will be able to benefit from using outdated MVs to compute part 
> of the needed results.





[jira] [Commented] (HIVE-19102) Vectorization: Suppress known Q file bugs

2018-04-04 Thread Teddy Choi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16426409#comment-16426409
 ] 

Teddy Choi commented on HIVE-19102:
---

+1 LGTM.

> Vectorization: Suppress known Q file bugs
> -
>
> Key: HIVE-19102
> URL: https://issues.apache.org/jira/browse/HIVE-19102
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-19102.01.patch
>
>
> There are known bugs, recently found and reported, that occur when 
> vectorization is turned on in Q files.  Until those bugs are fixed, add SET 
> statements to the top of the Q files that suppress vectorization.
> Change a few EXPLAIN VECTORIZATION statements to EXPLAIN where it isn't needed and 
> creates an unnecessary Q file difference during enable-by-default experiments.
>  * +TestCliDriver+
>  ** +Execution Failures+
>  *** *input_lazyserde.q*
>   HIVE-19088
>  *** *input_lazyserde2.q*
>   HIVE-19088, too.
>  *** *nested_column_pruning.q*
>   HIVE-19016
>  *** *parquet_map_of_arrays_of_ints.q*
>   HIVE-19015
>  *** *parquet_map_of_maps.q*
>   HIVE-19015, too.
>  *** *parquet_nested_complex.q*
>   HIVE-19016
>  ** +Wrong Results+
>  *** *delete_orig_table.q*
>   HIVE-19109
>  *** *offset_limit_global_optimizer.q*
>   Added ORDER BY clauses to fix different vectorization intermediate 
> results due to LIMIT clause.
>  *** *parquet_ppd_decimal.q*
>   HIVE-19108
>  *** *udf_context_aware.q*
>   Test detects vectorization and output changes.
>  Added "set hive.test.vectorized.execution.enabled.override=none;"
>  *** *vector_udf3.q*
>   Test detects vectorization and output changes.
>  Added "set hive.test.vectorized.execution.enabled.override=none;"
>  * +TestContribCliDriver+
> ** +Wrong Results+
>  *** *udf_example_arraymapstruct.q*
>   HIVE-19110





[jira] [Commented] (HIVE-19044) Duplicate field names within Druid Query Generated by Calcite plan

2018-04-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16426408#comment-16426408
 ] 

Hive QA commented on HIVE-19044:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  1m  
1s{color} | {color:red} The patch generated 50 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}  1m 41s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-10003/dev-support/hive-personality.sh
 |
| git revision | master / dc5a943 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-10003/yetus/patch-asflicense-problems.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-10003/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Duplicate field names within Druid Query Generated by Calcite plan
> --
>
> Key: HIVE-19044
> URL: https://issues.apache.org/jira/browse/HIVE-19044
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Major
> Attachments: HIVE-19044.patch
>
>
> A test case is attached in the JIRA patch.





[jira] [Commented] (HIVE-18839) Implement incremental rebuild for materialized views (only insert operations in source tables)

2018-04-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16426405#comment-16426405
 ] 

Hive QA commented on HIVE-18839:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12917425/HIVE-18839.01.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 211 failed/errored test(s), 13977 tests 
executed
*Failed tests:*
{noformat}
TestBeeLineDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=252)
TestCopyUtils - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestDbNotificationListener - did not produce a TEST-*.xml file (likely timed 
out) (batchId=246)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=252)
TestExportImport - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestHCatHiveCompatibility - did not produce a TEST-*.xml file (likely timed 
out) (batchId=246)
TestMiniDruidCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=252)
TestMiniDruidKafkaCliDriver - did not produce a TEST-*.xml file (likely timed 
out) (batchId=252)
TestNonCatCallsWithCatalog - did not produce a TEST-*.xml file (likely timed 
out) (batchId=216)
TestReplicationOnHDFSEncryptedZones - did not produce a TEST-*.xml file (likely 
timed out) (batchId=230)
TestReplicationScenarios - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestReplicationScenariosAcidTables - did not produce a TEST-*.xml file (likely 
timed out) (batchId=230)
TestReplicationScenariosAcrossInstances - did not produce a TEST-*.xml file 
(likely timed out) (batchId=230)
TestSequenceFileReadWrite - did not produce a TEST-*.xml file (likely timed 
out) (batchId=246)
TestTezPerfCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=252)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_table_stats] 
(batchId=54)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] 
(batchId=50)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[materialized_view_create_rewrite_time_window]
 (batchId=14)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[varchar_2] (batchId=67)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] (batchId=153)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez_empty] (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[groupby_groupingset_bug] (batchId=174)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=169)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[materialized_view_create_rewrite_3] (batchId=165)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[materialized_view_create_rewrite_rebuild_dummy] (batchId=160)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[mergejoin] (batchId=168)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=163)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_smb_main] (batchId=160)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[update_access_time_non_current_db] (batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorization_div0] (batchId=170)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorized_dynamic_semijoin_reduction] (batchId=155)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_3] (batchId=105)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_5] (batchId=105)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.org.apache.hadoop.hive.cli.TestNegativeCliDriver (batchId=95)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.org.apache.hadoop.hive.cli.TestNegativeCliDriver (batchId=96)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[alter_notnull_constraint_violation] (batchId=96)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[insert_into_acid_notnull] (batchId=95)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[insert_into_notnull_constraint] (batchId=95)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[insert_multi_into_notnull] (batchId=96)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[insert_overwrite_notnull_constraint] (batchId=96)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[nopart_insert] (batchId=95)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[smb_bucketmapjoin] (batchId=96)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[smb_mapjoin_14] (batchId=96)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[sortmerge_mapjoin_mismatch_1] (batchId=96)

[jira] [Commented] (HIVE-12192) Hive should carry out timestamp computations in UTC

2018-04-04 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426399#comment-16426399
 ] 

Jesus Camacho Rodriguez commented on HIVE-12192:


[~haozhun], checking again experiment 1, indeed for that specific case there 
will be a change for the user (though I would argue it is a fix). (Is there a 
mistake in the year in the {{expected LocalDateTime}} for 
{{current_timestamp() - interval '2880' hour}}?)

> Hive should carry out timestamp computations in UTC
> ---
>
> Key: HIVE-12192
> URL: https://issues.apache.org/jira/browse/HIVE-12192
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Ryan Blue
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: timestamp
> Attachments: HIVE-12192.patch
>
>
> Hive currently uses the "local" time of a java.sql.Timestamp to represent the 
> SQL data type TIMESTAMP WITHOUT TIME ZONE. The purpose is to be able to use 
> {{Timestamp#getYear()}} and similar methods to implement SQL functions like 
> {{year}}.
> When the SQL session's time zone is a DST zone, such as America/Los_Angeles 
> that alternates between PST and PDT, there are times that cannot be 
> represented because the effective zone skips them.
> {code}
> hive> select TIMESTAMP '2015-03-08 02:10:00.101';
> 2015-03-08 03:10:00.101
> {code}
> Using UTC instead of the SQL session time zone as the underlying zone for a 
> java.sql.Timestamp avoids this bug, while still returning correct values for 
> {{getYear}} etc. Using UTC as the convenience representation (timestamp 
> without time zone has no real zone) would make timestamp calculations more 
> consistent and avoid similar problems in the future.
> Notably, this would break the {{unix_timestamp}} UDF that specifies the 
> result is with respect to ["the default timezone and default 
> locale"|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions].
>  That function would need to be updated to use the 
> {{System.getProperty("user.timezone")}} zone.
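The DST gap described above can be reproduced outside Hive with java.time (a minimal sketch, not Hive code; the class name is arbitrary). java.time resolves a local time that falls in the gap by shifting it forward, mirroring the 03:10 result shown in the issue description, while resolving against UTC keeps the wall-clock value intact:

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

public class DstGapDemo {
    public static void main(String[] args) {
        // 2015-03-08 02:10 does not exist in America/Los_Angeles:
        // clocks jump from 02:00 PST straight to 03:00 PDT.
        LocalDateTime skipped = LocalDateTime.of(2015, 3, 8, 2, 10);
        ZonedDateTime inLa = skipped.atZone(ZoneId.of("America/Los_Angeles"));
        // A time in the DST gap is shifted forward by the gap length.
        System.out.println(inLa.toLocalDateTime()); // 2015-03-08T03:10
        // Against UTC (no DST), the wall-clock value survives unchanged.
        ZonedDateTime inUtc = skipped.atZone(ZoneOffset.UTC);
        System.out.println(inUtc.toLocalDateTime()); // 2015-03-08T02:10
    }
}
```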



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-10640) Vectorized query with NULL constant throws "Unsuported vector output type: void" error

2018-04-04 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426394#comment-16426394
 ] 

Matt McCline commented on HIVE-10640:
-

We now either handle SELECT NULL or issue a good error message saying Void type 
not supported.

> Vectorized query with NULL constant  throws "Unsuported vector output type: 
> void" error
> ---
>
> Key: HIVE-10640
> URL: https://issues.apache.org/jira/browse/HIVE-10640
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-10640.01.patch, HIVE-10640.02.patch
>
>
> This query from join_nullsafe.q when vectorized throws "Unsuported vector 
> output type: void" during execution...
> {noformat}
> select * from myinput1 a join myinput1 b on a.key<=>b.value AND a.key is NULL;
> {noformat}





[jira] [Comment Edited] (HIVE-10640) Vectorized query with NULL constant throws "Unsuported vector output type: void" error

2018-04-04 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426394#comment-16426394
 ] 

Matt McCline edited comment on HIVE-10640 at 4/5/18 1:13 AM:
-

Fixed along the way.  We now either handle SELECT NULL or issue a good error 
message saying Void type not supported.


was (Author: mmccline):
We now either handle SELECT NULL or issue a good error message saying Void type 
not supported.

> Vectorized query with NULL constant  throws "Unsuported vector output type: 
> void" error
> ---
>
> Key: HIVE-10640
> URL: https://issues.apache.org/jira/browse/HIVE-10640
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-10640.01.patch, HIVE-10640.02.patch
>
>
> This query from join_nullsafe.q when vectorized throws "Unsuported vector 
> output type: void" during execution...
> {noformat}
> select * from myinput1 a join myinput1 b on a.key<=>b.value AND a.key is NULL;
> {noformat}





[jira] [Updated] (HIVE-10640) Vectorized query with NULL constant throws "Unsuported vector output type: void" error

2018-04-04 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-10640:

   Resolution: Fixed
Fix Version/s: (was: 1.3.0)
   Status: Resolved  (was: Patch Available)

> Vectorized query with NULL constant  throws "Unsuported vector output type: 
> void" error
> ---
>
> Key: HIVE-10640
> URL: https://issues.apache.org/jira/browse/HIVE-10640
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-10640.01.patch, HIVE-10640.02.patch
>
>
> This query from join_nullsafe.q when vectorized throws "Unsuported vector 
> output type: void" during execution...
> {noformat}
> select * from myinput1 a join myinput1 b on a.key<=>b.value AND a.key is NULL;
> {noformat}





[jira] [Commented] (HIVE-11101) Vectorization decimal precision issue in vectorization_short_regress.q

2018-04-04 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426393#comment-16426393
 ] 

Matt McCline commented on HIVE-11101:
-

Fixed along the way.

> Vectorization decimal precision issue in vectorization_short_regress.q
> --
>
> Key: HIVE-11101
> URL: https://issues.apache.org/jira/browse/HIVE-11101
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Major
>
> Noticed one query result line in vectorization_short_regress.q is different 
> when that test is run without vectorization.
> Is it a decimal precision issue?
> {code}
> 1785c1797
> < 1969-12-31 16:00:04.063 04XP4DrTCblC788515601.0 79.553  
> -1452617198 15601   -407009.58195572987 -15858  -511684.9   
> -15601.0158740.1750002  -6432.15344526  -79.553 NULL  
>   -15601.0-2.43391201E8
> ---
> > 1969-12-31 16:00:04.063 04XP4DrTCblC788515601.0 79.553  
> > -1452617198 15601   -407009.58195572987 -15858  -511684.9   
> > -15601.0158740.1750002  -6432.0 -79.553 NULL-15601.0
> > -2.43391201E8
> 1886a1899
> {code}





[jira] [Resolved] (HIVE-12559) Vectorization on MR produces different results

2018-04-04 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline resolved HIVE-12559.
-
Resolution: Incomplete

> Vectorization on MR produces different results
> --
>
> Key: HIVE-12559
> URL: https://issues.apache.org/jira/browse/HIVE-12559
> Project: Hive
>  Issue Type: Bug
>Reporter: Laljo John Pullokkaran
>Assignee: Matt McCline
>Priority: Major
>
> Vectorization on MR produces different results for semantically equivalent 
> queries.
> SET hive.vectorized.execution.enabled=true;
> SET hive.auto.convert.join=true;
> SET hive.auto.convert.join.noconditionaltask=true;
> SET hive.auto.convert.join.noconditionaltask.size=10;
> SET hive.cbo.enable=false;
> select sum(v1.cdouble) from alltypesorc v3 join alltypesorc v1 on 
> v1.csmallint=v3.csmallint join alltypesorc v2 on v1.ctinyint=v2.ctinyint;
> -- Produces 6.065190932488167E11
> select sum(v1.cdouble) from alltypesorc v1 join alltypesorc v2 on 
> v1.ctinyint=v2.ctinyint join alltypesorc v3 on v1.csmallint=v3.csmallint;
> -- Produces 6.065190932486892E11
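The two differing sums are consistent with floating-point non-associativity (an inference; the issue does not name the cause): a different join order feeds the doubles to SUM in a different order, and reordered double addition can round differently. A minimal Java illustration, unrelated to the Hive tables above:

```java
public class FpSumOrder {
    public static void main(String[] args) {
        // Floating-point addition is not associative, so summing the same
        // values grouped differently can give different results.
        double leftFirst = (0.1 + 0.2) + 0.3;
        double rightFirst = 0.1 + (0.2 + 0.3);
        System.out.println(leftFirst);               // 0.6000000000000001
        System.out.println(rightFirst);              // 0.6
        System.out.println(leftFirst == rightFirst); // false
    }
}
```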





[jira] [Commented] (HIVE-13337) VectorHashKeyWrapperBatch methods assign*NullsRepeating seem to be missing isNull checks

2018-04-04 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426392#comment-16426392
 ] 

Matt McCline commented on HIVE-13337:
-

Fixed along the way.

> VectorHashKeyWrapperBatch methods assign*NullsRepeating seem to be missing 
> isNull checks
> 
>
> Key: HIVE-13337
> URL: https://issues.apache.org/jira/browse/HIVE-13337
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
>
> Jason Dere spotted a probable problem with all assignLongNullsRepeating, etc 
> methods in VectorHashKeyWrapperBatch class.
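The suspected bug pattern can be sketched as follows (assumptions: the stand-in class and values are hypothetical, with field names modeled on Hive's ColumnVector; this is not the actual VectorHashKeyWrapperBatch code):

```java
// Illustrative stand-in for a long column vector with Hive-style flags.
class LongColVectorSketch {
    boolean isRepeating;  // all rows logically share entry 0
    boolean noNulls;
    boolean[] isNull;
    long[] vector;
}

public class NullsRepeatingDemo {
    public static void main(String[] args) {
        LongColVectorSketch col = new LongColVectorSketch();
        col.isRepeating = true;
        col.noNulls = false;
        col.isNull = new boolean[]{true};
        col.vector = new long[]{0L};
        // Buggy pattern: trusting vector[0] for a repeating column
        // without consulting isNull[0].
        long buggy = col.vector[0];
        // Correct pattern: a repeating column can also repeat a NULL.
        Long correct = col.isNull[0] ? null : Long.valueOf(col.vector[0]);
        System.out.println(buggy);   // 0 - silently treats NULL as a value
        System.out.println(correct); // null
    }
}
```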





[jira] [Updated] (HIVE-18741) Add support for Import into Acid table

2018-04-04 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-18741:
--
Status: Patch Available  (was: Open)

> Add support for Import into Acid table
> --
>
> Key: HIVE-18741
> URL: https://issues.apache.org/jira/browse/HIVE-18741
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Attachments: HIVE-18741.01.patch
>
>
> This should follow the Load Data approach (or use Load Data directly).
> Note that Import supports a partition spec.
> Does Import support loading files not created by Export?  If so, similarly to 
> HIVE-19029, it should check for Acid meta columns and reject the files.





[jira] [Updated] (HIVE-18741) Add support for Import into Acid table

2018-04-04 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-18741:
--
Attachment: HIVE-18741.01.patch

> Add support for Import into Acid table
> --
>
> Key: HIVE-18741
> URL: https://issues.apache.org/jira/browse/HIVE-18741
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Attachments: HIVE-18741.01.patch
>
>
> This should follow the Load Data approach (or use Load Data directly).
> Note that Import supports a partition spec.
> Does Import support loading files not created by Export?  If so, similarly to 
> HIVE-19029, it should check for Acid meta columns and reject the files.





[jira] [Resolved] (HIVE-13337) VectorHashKeyWrapperBatch methods assign*NullsRepeating seem to be missing isNull checks

2018-04-04 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline resolved HIVE-13337.
-
Resolution: Fixed

> VectorHashKeyWrapperBatch methods assign*NullsRepeating seem to be missing 
> isNull checks
> 
>
> Key: HIVE-13337
> URL: https://issues.apache.org/jira/browse/HIVE-13337
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
>
> Jason Dere spotted a probable problem with all assignLongNullsRepeating, etc 
> methods in VectorHashKeyWrapperBatch class.





[jira] [Updated] (HIVE-14750) Vectorization: isNull and isRepeating bugs

2018-04-04 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-14750:

Resolution: Incomplete
Status: Resolved  (was: Patch Available)

> Vectorization: isNull and isRepeating bugs
> --
>
> Key: HIVE-14750
> URL: https://issues.apache.org/jira/browse/HIVE-14750
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-14750.01.patch
>
>
> Various bugs in VectorUDAF* templates.





[jira] [Resolved] (HIVE-16919) Vectorization: vectorization_short_regress.q has query result differences with non-vectorized run. Vectorized unary function broken?

2018-04-04 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline resolved HIVE-16919.
-
Resolution: Fixed

> Vectorization: vectorization_short_regress.q has query result differences 
> with non-vectorized run.  Vectorized unary function broken?
> -
>
> Key: HIVE-16919
> URL: https://issues.apache.org/jira/browse/HIVE-16919
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
>
> Jason spotted a difference in the query result for 
> vectorization_short_regress.q.out -- that is, when vectorization is turned off 
> and a base .q.out file is created, there are 2 differences.
> They both seem to be related to negation.  For example, in the first one 
> MAX(cint) and MAX(cint) appear earlier as columns and match non-vec and vec.  
> So, it doesn't appear that aggregation is failing.  It seems like the issue 
> is now that the Reducer is vectorizing, a bug is exposed.  So, even though 
> MAX and MIN are the same, the expression with negation returns different 
> results.
> 19th field of the query below: Vectorized 511 vs Non-Vectorized -58
> {noformat}
> SELECT MAX(cint),
>(MAX(cint) / -3728),
>(MAX(cint) * -3728),
>VAR_POP(cbigint),
>(-((MAX(cint) * -3728))),
>STDDEV_POP(csmallint),
>(-563 % (MAX(cint) * -3728)),
>(VAR_POP(cbigint) / STDDEV_POP(csmallint)),
>(-(STDDEV_POP(csmallint))),
>MAX(cdouble),
>AVG(ctinyint),
>(STDDEV_POP(csmallint) - 10.175),
>MIN(cint),
>((MAX(cint) * -3728) % (STDDEV_POP(csmallint) - 10.175)),
>(-(MAX(cdouble))),
>MIN(cdouble),
>(MAX(cdouble) % -26.28),
>STDDEV_SAMP(csmallint),
>(-((MAX(cint) / -3728))),
>((-((MAX(cint) * -3728))) % (-563 % (MAX(cint) * -3728))),
>((MAX(cint) / -3728) - AVG(ctinyint)),
>(-((MAX(cint) * -3728))),
>VAR_SAMP(cint)
> FROM   alltypesorc
> WHERE  (((cbigint <= 197)
>  AND (cint < cbigint))
> OR ((cdouble >= -26.28)
> AND (csmallint > cdouble))
> OR ((ctinyint > cfloat)
> AND (cstring1 RLIKE '.*ss.*'))
>OR ((cfloat > 79.553)
>AND (cstring2 LIKE '10%')))
> {noformat}
> Column expression is:  ((-((MAX(cint) * -3728))) % (-563 % (MAX(cint) * 
> -3728))),
> ---
> This is a previously existing issue and now filed as  HIVE-16919: 
> "Vectorization: vectorization_short_regress.q has query result differences 
> with non-vectorized run"
> 10th field of the query below: Non-Vectorized -6432.15344526 vs. 
> Vectorized -6432.0
> Column expression is (-(cdouble)) as c4,
> Query result for vectorization_short_regress.q.out -- that is when 
> vectorization is turned off and a base .q.out file created.
> ---
> 10th field of the query below: Non-Vectorized -6432.15344526 vs. 
> Vectorized -6432.0
> Column expression is (-(cdouble)) as c4,
> {noformat}
> SELECT   ctimestamp1,
>  cstring2,
>  cdouble,
>  cfloat,
>  cbigint,
>  csmallint,
>  (cbigint / 3569) as c1,
>  (-257 - csmallint) as c2,
>  (-6432 * cfloat) as c3,
>  (-(cdouble)) as c4,
>  (cdouble * 10.175) as c5,
>  ((-6432 * cfloat) / cfloat) as c6,
>  (-(cfloat)) as c7,
>  (cint % csmallint) as c8,
>  (-(cdouble)) as c9,
>  (cdouble * (-(cdouble))) as c10
> FROM alltypesorc
> WHERE(((-1.389 >= cint)
>AND ((csmallint < ctinyint)
> AND (-6432 > csmallint)))
>   OR ((cdouble >= cfloat)
>   AND (cstring2 <= 'a'))
>  OR ((cstring1 LIKE 'ss%')
>  AND (10.175 > cbigint)))
> {noformat}





[jira] [Commented] (HIVE-16919) Vectorization: vectorization_short_regress.q has query result differences with non-vectorized run. Vectorized unary function broken?

2018-04-04 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426391#comment-16426391
 ] 

Matt McCline commented on HIVE-16919:
-

This was fixed along the way.

> Vectorization: vectorization_short_regress.q has query result differences 
> with non-vectorized run.  Vectorized unary function broken?
> -
>
> Key: HIVE-16919
> URL: https://issues.apache.org/jira/browse/HIVE-16919
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
>
> Jason spotted a difference in the query result for 
> vectorization_short_regress.q.out -- that is, when vectorization is turned off 
> and a base .q.out file is created, there are 2 differences.
> They both seem to be related to negation.  For example, in the first one 
> MAX(cint) and MAX(cint) appear earlier as columns and match non-vec and vec.  
> So, it doesn't appear that aggregation is failing.  It seems like the issue 
> is now that the Reducer is vectorizing, a bug is exposed.  So, even though 
> MAX and MIN are the same, the expression with negation returns different 
> results.
> 19th field of the query below: Vectorized 511 vs Non-Vectorized -58
> {noformat}
> SELECT MAX(cint),
>(MAX(cint) / -3728),
>(MAX(cint) * -3728),
>VAR_POP(cbigint),
>(-((MAX(cint) * -3728))),
>STDDEV_POP(csmallint),
>(-563 % (MAX(cint) * -3728)),
>(VAR_POP(cbigint) / STDDEV_POP(csmallint)),
>(-(STDDEV_POP(csmallint))),
>MAX(cdouble),
>AVG(ctinyint),
>(STDDEV_POP(csmallint) - 10.175),
>MIN(cint),
>((MAX(cint) * -3728) % (STDDEV_POP(csmallint) - 10.175)),
>(-(MAX(cdouble))),
>MIN(cdouble),
>(MAX(cdouble) % -26.28),
>STDDEV_SAMP(csmallint),
>(-((MAX(cint) / -3728))),
>((-((MAX(cint) * -3728))) % (-563 % (MAX(cint) * -3728))),
>((MAX(cint) / -3728) - AVG(ctinyint)),
>(-((MAX(cint) * -3728))),
>VAR_SAMP(cint)
> FROM   alltypesorc
> WHERE  (((cbigint <= 197)
>  AND (cint < cbigint))
> OR ((cdouble >= -26.28)
> AND (csmallint > cdouble))
> OR ((ctinyint > cfloat)
> AND (cstring1 RLIKE '.*ss.*'))
>OR ((cfloat > 79.553)
>AND (cstring2 LIKE '10%')))
> {noformat}
> Column expression is:  ((-((MAX(cint) * -3728))) % (-563 % (MAX(cint) * 
> -3728))),
> ---
> This is a previously existing issue and now filed as  HIVE-16919: 
> "Vectorization: vectorization_short_regress.q has query result differences 
> with non-vectorized run"
> 10th field of the query below: Non-Vectorized -6432.15344526 vs. 
> Vectorized -6432.0
> Column expression is (-(cdouble)) as c4,
> Query result for vectorization_short_regress.q.out -- that is when 
> vectorization is turned off and a base .q.out file created.
> ---
> 10th field of the query below: Non-Vectorized -6432.15344526 vs. 
> Vectorized -6432.0
> Column expression is (-(cdouble)) as c4,
> {noformat}
> SELECT   ctimestamp1,
>  cstring2,
>  cdouble,
>  cfloat,
>  cbigint,
>  csmallint,
>  (cbigint / 3569) as c1,
>  (-257 - csmallint) as c2,
>  (-6432 * cfloat) as c3,
>  (-(cdouble)) as c4,
>  (cdouble * 10.175) as c5,
>  ((-6432 * cfloat) / cfloat) as c6,
>  (-(cfloat)) as c7,
>  (cint % csmallint) as c8,
>  (-(cdouble)) as c9,
>  (cdouble * (-(cdouble))) as c10
> FROM alltypesorc
> WHERE(((-1.389 >= cint)
>AND ((csmallint < ctinyint)
> AND (-6432 > csmallint)))
>   OR ((cdouble >= cfloat)
>   AND (cstring2 <= 'a'))
>  OR ((cstring1 LIKE 'ss%')
>  AND (10.175 > cbigint)))
> {noformat}





[jira] [Updated] (HIVE-18600) Vectorization: Top-Level Vector Expression Scratch Column Deallocation

2018-04-04 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-18600:

Resolution: Won't Fix
Status: Resolved  (was: Patch Available)

> Vectorization: Top-Level Vector Expression Scratch Column Deallocation
> --
>
> Key: HIVE-18600
> URL: https://issues.apache.org/jira/browse/HIVE-18600
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HIVE-18600.01.patch
>
>
> The operators create various vector expression *arrays* for predicates, 
> SELECT clauses, key expressions, etc.  We could mark those as special 
> "top level" vector expressions; then we could defer deallocation until 
> the top-level expression is complete.  This could be a simple solution that 
> avoids trying to fix our current eager deallocation, which tries to reuse scratch 
> columns as soon as possible.  It *isn't optimal*, but it *shouldn't be too 
> bad*. This solution is much better than not deallocating at all - especially 
> for queries that SELECT a large number of columns or have a lot of 
> expressions in the operator tree.





[jira] [Resolved] (HIVE-17892) Vectorization: Wrong results for vectorized_timestamp_funcs.q

2018-04-04 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline resolved HIVE-17892.
-
Resolution: Fixed

> Vectorization: Wrong results for vectorized_timestamp_funcs.q
> -
>
> Key: HIVE-17892
> URL: https://issues.apache.org/jira/browse/HIVE-17892
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
>
> Query #4:
> NonVec:
> NULL  NULLNULLNULLNULLNULL8   1   1
> NULL  NULLNULLNULLNULLNULLNULLNULLNULL
> -62169765561  2   11  30  30  48  4   40  39
> Vec:
> NULL  NULLNULLNULLNULLNULLNULLNULLNULL
> NULL  NULLNULLNULLNULLNULLNULLNULLNULL
> NULL  NULLNULLNULLNULLNULLNULLNULLNULL





[jira] [Commented] (HIVE-17892) Vectorization: Wrong results for vectorized_timestamp_funcs.q

2018-04-04 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426390#comment-16426390
 ] 

Matt McCline commented on HIVE-17892:
-

This was fixed along the way.

> Vectorization: Wrong results for vectorized_timestamp_funcs.q
> -
>
> Key: HIVE-17892
> URL: https://issues.apache.org/jira/browse/HIVE-17892
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
>
> Query #4:
> NonVec:
> NULL  NULLNULLNULLNULLNULL8   1   1
> NULL  NULLNULLNULLNULLNULLNULLNULLNULL
> -62169765561  2   11  30  30  48  4   40  39
> Vec:
> NULL  NULLNULLNULLNULLNULLNULLNULLNULL
> NULL  NULLNULLNULLNULLNULLNULLNULLNULL
> NULL  NULLNULLNULLNULLNULLNULLNULLNULL





[jira] [Updated] (HIVE-18831) Differentiate errors that are thrown by Spark tasks

2018-04-04 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-18831:

Attachment: HIVE-18831.90.patch

> Differentiate errors that are thrown by Spark tasks
> ---
>
> Key: HIVE-18831
> URL: https://issues.apache.org/jira/browse/HIVE-18831
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-18831.1.patch, HIVE-18831.2.patch, 
> HIVE-18831.3.patch, HIVE-18831.4.patch, HIVE-18831.6.patch, 
> HIVE-18831.7.patch, HIVE-18831.8.WIP.patch, HIVE-18831.9.patch, 
> HIVE-18831.90.patch
>
>
> We propagate exceptions from Spark task failures to the client well, but we 
> don't differentiate between errors from HS2 / RSC vs. errors thrown by 
> individual tasks.
> The main motivation is that when the client sees a propagated Spark exception, 
> it is difficult to know what part of the execution threw the exception.





[jira] [Updated] (HIVE-18831) Differentiate errors that are thrown by Spark tasks

2018-04-04 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-18831:

Attachment: (was: HIVE-18831.10.patch)

> Differentiate errors that are thrown by Spark tasks
> ---
>
> Key: HIVE-18831
> URL: https://issues.apache.org/jira/browse/HIVE-18831
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-18831.1.patch, HIVE-18831.2.patch, 
> HIVE-18831.3.patch, HIVE-18831.4.patch, HIVE-18831.6.patch, 
> HIVE-18831.7.patch, HIVE-18831.8.WIP.patch, HIVE-18831.9.patch, 
> HIVE-18831.90.patch
>
>
> We propagate exceptions from Spark task failures to the client well, but we 
> don't differentiate between errors from HS2 / RSC vs. errors thrown by 
> individual tasks.
> The main motivation is that when the client sees a propagated Spark exception, 
> it is difficult to know what part of the execution threw the exception.





[jira] [Updated] (HIVE-18831) Differentiate errors that are thrown by Spark tasks

2018-04-04 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-18831:

Attachment: HIVE-18831.10.patch

> Differentiate errors that are thrown by Spark tasks
> ---
>
> Key: HIVE-18831
> URL: https://issues.apache.org/jira/browse/HIVE-18831
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-18831.1.patch, HIVE-18831.10.patch, 
> HIVE-18831.2.patch, HIVE-18831.3.patch, HIVE-18831.4.patch, 
> HIVE-18831.6.patch, HIVE-18831.7.patch, HIVE-18831.8.WIP.patch, 
> HIVE-18831.9.patch
>
>
> We propagate exceptions from Spark task failures to the client well, but we 
> don't differentiate between errors from HS2 / RSC vs. errors thrown by 
> individual tasks.
> The main motivation is that when the client sees a propagated Spark exception, 
> it is difficult to know what part of the execution threw the exception.





[jira] [Commented] (HIVE-18839) Implement incremental rebuild for materialized views (only insert operations in source tables)

2018-04-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426382#comment-16426382
 ] 

Hive QA commented on HIVE-18839:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
47s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
48s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
23s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
44s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
54s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
9s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
58s{color} | {color:red} ql in the patch failed. {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
31s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
54s{color} | {color:red} ql: The patch generated 23 new + 1161 unchanged - 20 
fixed = 1184 total (was 1181) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
34s{color} | {color:red} standalone-metastore: The patch generated 10 new + 
1608 unchanged - 1 fixed = 1618 total (was 1609) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 24 line(s) that end in whitespace. Use 
git apply --whitespace=fix <>. Refer 
https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
42s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
15s{color} | {color:red} The patch generated 53 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 25m 57s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-10001/dev-support/hive-personality.sh
 |
| git revision | master / dc5a943 |
| Default Java | 1.8.0_111 |
| mvninstall | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-10001/yetus/patch-mvninstall-ql.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-10001/yetus/diff-checkstyle-ql.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-10001/yetus/diff-checkstyle-standalone-metastore.txt
 |
| whitespace | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-10001/yetus/whitespace-eol.txt
 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-10001/yetus/patch-asflicense-problems.txt
 |
| modules | C: common ql standalone-metastore U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-10001/yetus.txt |
| Powered by | Apache Yetus   http://yetus.apache.org |


This message was automatically generated.



> Implement incremental rebuild for materialized views (only insert operations 
> in source tables)
> --
>
> Key: HIVE-18839
> URL: https://issues.apache.org/jira/browse/HIVE-18839
> Project: Hive
>  Issue Type: Improvement
>  Components: Materialized views
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: TODOC3.0
> Attachments: HIVE-18839.01.patch, HIVE-18839.patch
>
>
> Implementation will follow 

[jira] [Commented] (HIVE-12192) Hive should carry out timestamp computations in UTC

2018-04-04 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426368#comment-16426368
 ] 

Jesus Camacho Rodriguez commented on HIVE-12192:


[~haozhun], thanks for the experiments. I think you added a new dimension to 
the problem :) I was referring to servers in multiple time zones; it would be 
great if you could validate whether my statements were correct in that case.

A client and server in different time zones probably cause other issues (which 
this patch would also address). I see why you mention that they behave as an 
instant, but it is rather confusing: the instant is created taking the client's 
time zone as a reference, yet when we read it back, it is shown relative to the 
server's time zone rather than the client's? (I am looking at experiment 2.)

The reason we added timestamp with local tz before working on this timestamp 
issue is that if a user wants instant semantics, they will still be able to 
obtain them via timestamp with local tz.
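The round-trip confusion described here can be sketched with java.time, using the offsets from Haozhun's experiment 2 (client at -07:00, server at +05:45); class and method names are hypothetical illustrations, not Hive code:

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;

public class ZoneSketch {
    /**
     * The client writes the wall-clock value 06:00 at offset -07:00; the
     * resulting instant, rendered in the server's +05:45 zone, reads 18:45.
     */
    static String atServer() {
        Instant written = LocalDateTime.of(2018, 4, 4, 6, 0)
                .atZone(ZoneId.of("-07:00")).toInstant(); // 2018-04-04T13:00:00Z
        return written.atZone(ZoneId.of("+05:45")).toLocalDateTime().toString();
    }

    public static void main(String[] args) {
        // The 06:00 insert comes back as 18:45, matching experiment 2's tables.
        System.out.println(atServer());
    }
}
```

This is exactly the "instant created against the client's zone, shown against the server's zone" behavior being questioned.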

> Hive should carry out timestamp computations in UTC
> ---
>
> Key: HIVE-12192
> URL: https://issues.apache.org/jira/browse/HIVE-12192
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Ryan Blue
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: timestamp
> Attachments: HIVE-12192.patch
>
>
> Hive currently uses the "local" time of a java.sql.Timestamp to represent the 
> SQL data type TIMESTAMP WITHOUT TIME ZONE. The purpose is to be able to use 
> {{Timestamp#getYear()}} and similar methods to implement SQL functions like 
> {{year}}.
> When the SQL session's time zone is a DST zone, such as America/Los_Angeles 
> that alternates between PST and PDT, there are times that cannot be 
> represented because the effective zone skips them.
> {code}
> hive> select TIMESTAMP '2015-03-08 02:10:00.101';
> 2015-03-08 03:10:00.101
> {code}
> Using UTC instead of the SQL session time zone as the underlying zone for a 
> java.sql.Timestamp avoids this bug, while still returning correct values for 
> {{getYear}} etc. Using UTC as the convenience representation (timestamp 
> without time zone has no real zone) would make timestamp calculations more 
> consistent and avoid similar problems in the future.
> Notably, this would break the {{unix_timestamp}} UDF that specifies the 
> result is with respect to ["the default timezone and default 
> locale"|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions].
>  That function would need to be updated to use the 
> {{System.getProperty("user.timezone")}} zone.
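The DST gap described in the issue can be reproduced directly with java.sql.Timestamp; a minimal sketch (class and helper names are hypothetical, the zone and literal are those from the report):

```java
import java.sql.Timestamp;
import java.util.TimeZone;

public class DstGapDemo {
    /** Parses a timestamp literal with the given zone as the JVM default zone. */
    static String render(String literal, String zoneId) {
        TimeZone prev = TimeZone.getDefault();
        try {
            TimeZone.setDefault(TimeZone.getTimeZone(zoneId));
            // valueOf() resolves the literal against the default zone; a local
            // time inside the spring-forward gap is silently pushed forward.
            return Timestamp.valueOf(literal).toString();
        } finally {
            TimeZone.setDefault(prev);
        }
    }

    public static void main(String[] args) {
        // 02:10 does not exist on 2015-03-08 in America/Los_Angeles,
        // so the value comes back shifted to 03:10, as in the bug report.
        System.out.println(render("2015-03-08 02:10:00.101", "America/Los_Angeles"));
        // With UTC as the underlying zone there is no gap to skip.
        System.out.println(render("2015-03-08 02:10:00.101", "UTC"));
    }
}
```

This mirrors the proposal: with UTC as the underlying zone the literal round-trips unchanged, while a DST zone silently alters it.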



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-16718) Provide a way to pass in user supplied maven build and test arguments to Ptest

2018-04-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426363#comment-16426363
 ] 

Hive QA commented on HIVE-16718:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12868980/HIVE-16718.01.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 100 failed/errored test(s), 13172 tests 
executed
*Failed tests:*
{noformat}
TestBeeLineDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=252)
TestCopyUtils - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestDbNotificationListener - did not produce a TEST-*.xml file (likely timed 
out) (batchId=246)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=252)
TestExportImport - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestHCatHiveCompatibility - did not produce a TEST-*.xml file (likely timed 
out) (batchId=246)
TestMiniDruidCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=252)
TestMiniDruidKafkaCliDriver - did not produce a TEST-*.xml file (likely timed 
out) (batchId=252)
TestMinimrCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=93)

[infer_bucket_sort_num_buckets.q,infer_bucket_sort_reducers_power_two.q,parallel_orderby.q,bucket_num_reducers_acid.q,infer_bucket_sort_map_operators.q,infer_bucket_sort_merge.q,root_dir_external_table.q,infer_bucket_sort_dyn_part.q,udf_using.q,bucket_num_reducers_acid2.q]
TestNegativeCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=95)


[jira] [Commented] (HIVE-19014) utilize YARN-8028 (queue ACL check) in Hive Tez session pool

2018-04-04 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426350#comment-16426350
 ] 

Jason Dere commented on HIVE-19014:
---

+1 pending tests

> utilize YARN-8028 (queue ACL check) in Hive Tez session pool
> 
>
> Key: HIVE-19014
> URL: https://issues.apache.org/jira/browse/HIVE-19014
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19014.01.patch, HIVE-19014.02.patch, 
> HIVE-19014.03.patch, HIVE-19014.04.patch, HIVE-19014.05.patch, 
> HIVE-19014.patch
>
>






[jira] [Updated] (HIVE-19113) Bucketing: Make CLUSTERED BY do CLUSTER BY if no explicit sorting is specified

2018-04-04 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-19113:
---
Issue Type: Improvement  (was: Bug)

> Bucketing: Make CLUSTERED BY do CLUSTER BY if no explicit sorting is specified
> --
>
> Key: HIVE-19113
> URL: https://issues.apache.org/jira/browse/HIVE-19113
> Project: Hive
>  Issue Type: Improvement
>  Components: Logical Optimizer
>Affects Versions: 3.0.0
>Reporter: Gopal V
>Priority: Major
>
> The user's expectation of 
> "create external table bucketed (key int) clustered by (key) into 4 buckets 
> stored as orc;"
> is that the table will cluster the key into 4 buckets, while the file layout 
> does not do any actual clustering of rows.
> In the absence of a "SORTED BY", this can automatically do a "SORTED BY 
> (key)" to cluster the keys within the file as expected.





[jira] [Updated] (HIVE-18395) Using stats for aggregate query on Acid/MM is off even with "hive.compute.query.using.stats" is true.

2018-04-04 Thread Steve Yeom (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Yeom updated HIVE-18395:
--
Priority: Major  (was: Minor)

> Using stats for aggregate query on Acid/MM is off even with 
> "hive.compute.query.using.stats" is true.
> -
>
> Key: HIVE-18395
> URL: https://issues.apache.org/jira/browse/HIVE-18395
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Steve Yeom
>Assignee: Steve Yeom
>Priority: Major
>






[jira] [Comment Edited] (HIVE-12192) Hive should carry out timestamp computations in UTC

2018-04-04 Thread Haozhun Jin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426302#comment-16426302
 ] 

Haozhun Jin edited comment on HIVE-12192 at 4/4/18 11:06 PM:
-

[~jcamachorodriguez], thank you for your patient answer. Please bear with us 
for a little bit more.

What do you think after reading the two experiments and the summary below?
h2. Experiment 1

I conducted this experiment once before, which is why I'm under the impression 
that the Hive Timestamp type means Instant. I redid the experiment today in 
both 1.2.1 and 2.6.3; the result is the same. The zone is America/Los_Angeles.

The table below summarizes the outcome: 
[raw|https://gist.github.com/haozhun/03cd09b3fa2456271f2e01759c9c1b8e]

 
||Query||Actual||Expected (LocalDateTime)||Expected (Instant)||
|current_timestamp()|2018-04-04 14:51|2018-04-04 14:51|2018-04-04 14:51|
|current_timestamp() - interval '2880' hour|2017-12-05 13:51|2017-12-05 14:51|2017-12-05 13:51|
h2. Experiment 2

After reading your comments about ORC, I conducted some experiments to see how 
other file formats behave. The result is not what I was expecting. I did the 
experiment in 2.6.3.

The table below summarizes the outcome: 
[raw|https://gist.github.com/haozhun/2e4b7c52bf3c56c03ed06e1ad895e198]

 
||(server +05:45)||ORC||RCBinary||RCText||Text||
|Insert (client -07:00)|06:00:00|06:00:00|06:00:00|06:00:00|
|Read 1 (client -07:00)|18:45:00|06:00:00|18:45:00|18:45:00|
|Read 2 (client -04:00)|18:45:00|09:00:00|18:45:00|18:45:00|
h2. Summary

These are my takeaways:
 * Experiment 1: the Hive timestamp type has Instant semantics. If its internal 
representation is changed from java.sql.Timestamp to java.time.LocalDateTime, 
it will be a user-visible behavior change.
 * Experiment 2: do not use Hive in a zone different from the server's (given 
that an insert and read do not round-trip in a single hive cli session). 
Hopefully, that is indeed how everyone uses Hive; in that case, it does not 
matter whether experiment 2 indicates Instant or LocalDateTime.
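Experiment 1's arithmetic can be sketched with java.time (class and method names are hypothetical): subtracting 2880 hours on the instant time-line crosses a DST transition, while wall-clock (LocalDateTime) arithmetic does not, which is why the two interpretations disagree by an hour. The observed Hive value matches the instant-style result.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;

public class IntervalSketch {
    // The current_timestamp() moment from experiment 1.
    static final ZonedDateTime START =
            ZonedDateTime.of(2018, 4, 4, 14, 51, 0, 0, ZoneId.of("America/Los_Angeles"));

    /** Instant semantics: 2880 elapsed hours, so the DST change is visible. */
    static String instantMinus2880h() {
        // ZonedDateTime.minusHours operates on the instant time-line.
        return START.minusHours(2880).toLocalDateTime().toString();
    }

    /** LocalDateTime semantics: pure wall-clock arithmetic, no DST effect. */
    static String localMinus2880h() {
        return START.toLocalDateTime().minusHours(2880).toString();
    }

    public static void main(String[] args) {
        System.out.println(instantMinus2880h()); // instant-style result (13:51)
        System.out.println(localMinus2880h());   // wall-clock result (14:51)
    }
}
```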

 


was (Author: haozhun):
[~jcamachorodriguez], thank you for your patient answer. Please bear with us 
for a little bit more.

What do you think after reading the two experiments below?
h2. Experiment 1

I conducted this experiment once before. And this is why I'm under the 
impression that Hive Timestamp type means Instant. I redid the experiment today 
in both 1.2.1 and 2.6.3. The result is the same. Zone is America/Los_Angeles.

The table below summarizes the outcome: 
[raw|https://gist.github.com/haozhun/03cd09b3fa2456271f2e01759c9c1b8e]

 
||Query||Actual||expected LocalDateTime||expect Instant||
|current_timestamp()|2018-04-04 14:51|2018-04-04 14:51|2018-04-04 14:51|
|current_timestamp()
  - interval '2880' hour|2017-12-05 13:51|2018-12-05 14:51|2017-12-05 13:51|
h2. Experiment 2

After reading your comments about ORC, I conducted some experiments to see how 
other file formats behave. The result is not what I was expecting. I did the 
experiment in 2.6.3.

The table below summarizes the outcome: 
[raw|https://gist.github.com/haozhun/2e4b7c52bf3c56c03ed06e1ad895e198]

 
||(server +05:45)||ORC||RCBinary||RCText||Text||
|Insert (client -07:00)|06:00:00|06:00:00|06:00:00|06:00:00|
|Read 1 (client -07:00)|18:45:00|06:00:00|18:45:00|18:45:00|
|Read 2 (client -04:00)|18:45:00|09:00:00|18:45:00|18:45:00|
h2. Summary

This is my take away
 * experiment 1: Hive timestamp type has Instant semantics. If it's internal 
representation is changed from java.sql.Timestamp to java.time.LocalDateTime, 
it will be a user-visible behavior change.
 * experiment 2: Do not use Hive in a zone different from the server's (given 
insert and read does not round trip in a single hive cli session). Hopefully, 
that's indeed how every one uses Hive. In that case, it does not matter whether 
experiment 2 indicates Instant or LocalDateTime.

 

> Hive should carry out timestamp computations in UTC
> ---
>
> Key: HIVE-12192
> URL: https://issues.apache.org/jira/browse/HIVE-12192
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Ryan Blue
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: timestamp
> Attachments: HIVE-12192.patch
>
>
> Hive currently uses the "local" time of a java.sql.Timestamp to represent the 
> SQL data type TIMESTAMP WITHOUT TIME ZONE. The purpose is to be able to use 
> {{Timestamp#getYear()}} and similar methods to implement SQL functions like 
> {{year}}.
> When the SQL session's time zone is a DST zone, such as America/Los_Angeles 
> that alternates between PST and PDT, there are times that cannot be 
> represented because the effective zone skips them.
> {code}
> hive> select TIMESTAMP '2015-03-08 02:10:00.101';
> 2015-03-08 

[jira] [Comment Edited] (HIVE-12192) Hive should carry out timestamp computations in UTC

2018-04-04 Thread Haozhun Jin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426302#comment-16426302
 ] 

Haozhun Jin edited comment on HIVE-12192 at 4/4/18 11:04 PM:
-

[~jcamachorodriguez], thank you for your patient answer. Please bear with us 
for a little bit more.

What do you think after reading the two experiments below?
h2. Experiment 1

I conducted this experiment once before. And this is why I'm under the 
impression that Hive Timestamp type means Instant. I redid the experiment today 
in both 1.2.1 and 2.6.3. The result is the same. Zone is America/Los_Angeles.

The table below summarizes the outcome: 
[raw|https://gist.github.com/haozhun/03cd09b3fa2456271f2e01759c9c1b8e]

 
||Query||Actual||expected LocalDateTime||expect Instant||
|current_timestamp()|2018-04-04 14:51|2018-04-04 14:51|2018-04-04 14:51|
|current_timestamp()
  - interval '2880' hour|2017-12-05 13:51|2018-12-05 14:51|2017-12-05 13:51|
h2. Experiment 2

After reading your comments about ORC, I conducted some experiments to see how 
other file formats behave. The result is not what I was expecting. I did the 
experiment in 2.6.3.

The table below summarizes the outcome: 
[raw|[https://gist.github.com/haozhun/2e4b7c52bf3c56c03ed06e1ad895e198]]

 
||(server +05:45)||ORC||RCBinary||RCText||Text||
|Insert (client -07:00)|06:00:00|06:00:00|06:00:00|06:00:00|
|Read 1 (client -07:00)|18:45:00|06:00:00|18:45:00|18:45:00|
|Read 2 (client -04:00)|18:45:00|09:00:00|18:45:00|18:45:00|
h2. Summary

This is my take away
 * experiment 1: Hive timestamp type has Instant semantics. If it's internal 
representation is changed from java.sql.Timestamp to java.time.LocalDateTime, 
it will be a user-visible behavior change.
 * experiment 2: Do not use Hive in a zone different from the server's (given 
insert and read does not round trip in a single hive cli session). Hopefully, 
that's indeed how every one uses Hive. In that case, it does not matter whether 
experiment 2 indicates Instant or LocalDateTime.

 


was (Author: haozhun):
[~jcamachorodriguez], thank you for your patient answer. Please bear with us 
for a little bit more.

What do you think after reading the two experiments below?
h2. Experiment 1

I conducted this experiment once before. And this is why I'm under the 
impression that Hive Timestamp type means Instant. I redid the experiment today 
in both 1.2.1 and 2.6.3. The result is the same. Zone is America/Los_Angeles.

The table below summarizes the outcome: 
[raw|https://gist.github.com/haozhun/03cd09b3fa2456271f2e01759c9c1b8e]

 
||Query||Actual||expected LocalDateTime||expect Instant||
|current_timestamp()|2018-04-04 14:51|2018-04-04 14:51|2018-04-04 14:51|
|current_timestamp()
 - interval '2880' hour|2017-12-05 13:51|2018-12-05 14:51|2017-12-05 13:51|
h2. Experiment 2

After reading your comments about ORC, I conducted some experiments to see how 
other file formats behave. The result is not what I was expecting. I did the 
experiment in 2.6.3.

The table below summarizes the outcome: 
[raw|[https://gist.github.com/haozhun/2e4b7c52bf3c56c03ed06e1ad895e198]|https://gist.github.com/haozhun/2e4b7c52bf3c56c03ed06e1ad895e198])]

 
||(server +05:45)||ORC||RCBinary||RCText||Text||
|Insert (client -07:00)|06:00:00|06:00:00|06:00:00|06:00:00|
|Read 1 (client -07:00)|18:45:00|06:00:00|18:45:00|18:45:00|
|Read 2 (client -04:00)|18:45:00|09:00:00|18:45:00|18:45:00|
h2. Summary

This is my take away
 * experiment 1: Hive timestamp type has Instant semantics. If it's internal 
representation is changed from java.sql.Timestamp to java.time.LocalDateTime, 
it will be a user-visible behavior change.
 * experiment 2: Do not use Hive in a zone different from the server's (given 
insert and read does not round trip in a single hive cli session). Hopefully, 
that's indeed how every one uses Hive. In that case, it does not matter whether 
experiment 2 indicates Instant or LocalDateTime.

 

> Hive should carry out timestamp computations in UTC
> ---
>
> Key: HIVE-12192
> URL: https://issues.apache.org/jira/browse/HIVE-12192
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Ryan Blue
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: timestamp
> Attachments: HIVE-12192.patch
>
>
> Hive currently uses the "local" time of a java.sql.Timestamp to represent the 
> SQL data type TIMESTAMP WITHOUT TIME ZONE. The purpose is to be able to use 
> {{Timestamp#getYear()}} and similar methods to implement SQL functions like 
> {{year}}.
> When the SQL session's time zone is a DST zone, such as America/Los_Angeles 
> that alternates between PST and PDT, there are times that cannot be 
> represented because the effective zone skips them.
> {code}
> hive> select 

[jira] [Comment Edited] (HIVE-12192) Hive should carry out timestamp computations in UTC

2018-04-04 Thread Haozhun Jin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426302#comment-16426302
 ] 

Haozhun Jin edited comment on HIVE-12192 at 4/4/18 11:04 PM:
-

[~jcamachorodriguez], thank you for your patient answer. Please bear with us 
for a little bit more.

What do you think after reading the two experiments below?
h2. Experiment 1

I conducted this experiment once before. And this is why I'm under the 
impression that Hive Timestamp type means Instant. I redid the experiment today 
in both 1.2.1 and 2.6.3. The result is the same. Zone is America/Los_Angeles.

The table below summarizes the outcome: 
[raw|https://gist.github.com/haozhun/03cd09b3fa2456271f2e01759c9c1b8e]

 
||Query||Actual||expected LocalDateTime||expect Instant||
|current_timestamp()|2018-04-04 14:51|2018-04-04 14:51|2018-04-04 14:51|
|current_timestamp()
  - interval '2880' hour|2017-12-05 13:51|2018-12-05 14:51|2017-12-05 13:51|
h2. Experiment 2

After reading your comments about ORC, I conducted some experiments to see how 
other file formats behave. The result is not what I was expecting. I did the 
experiment in 2.6.3.

The table below summarizes the outcome: 
[raw|https://gist.github.com/haozhun/2e4b7c52bf3c56c03ed06e1ad895e198]

 
||(server +05:45)||ORC||RCBinary||RCText||Text||
|Insert (client -07:00)|06:00:00|06:00:00|06:00:00|06:00:00|
|Read 1 (client -07:00)|18:45:00|06:00:00|18:45:00|18:45:00|
|Read 2 (client -04:00)|18:45:00|09:00:00|18:45:00|18:45:00|
h2. Summary

This is my take away
 * experiment 1: Hive timestamp type has Instant semantics. If it's internal 
representation is changed from java.sql.Timestamp to java.time.LocalDateTime, 
it will be a user-visible behavior change.
 * experiment 2: Do not use Hive in a zone different from the server's (given 
insert and read does not round trip in a single hive cli session). Hopefully, 
that's indeed how every one uses Hive. In that case, it does not matter whether 
experiment 2 indicates Instant or LocalDateTime.

 


was (Author: haozhun):
[~jcamachorodriguez], thank you for your patient answer. Please bear with us 
for a little bit more.

What do you think after reading the two experiments below?
h2. Experiment 1

I conducted this experiment once before. And this is why I'm under the 
impression that Hive Timestamp type means Instant. I redid the experiment today 
in both 1.2.1 and 2.6.3. The result is the same. Zone is America/Los_Angeles.

The table below summarizes the outcome: 
[raw|https://gist.github.com/haozhun/03cd09b3fa2456271f2e01759c9c1b8e]

 
||Query||Actual||expected LocalDateTime||expect Instant||
|current_timestamp()|2018-04-04 14:51|2018-04-04 14:51|2018-04-04 14:51|
|current_timestamp()
  - interval '2880' hour|2017-12-05 13:51|2018-12-05 14:51|2017-12-05 13:51|
h2. Experiment 2

After reading your comments about ORC, I conducted some experiments to see how 
other file formats behave. The result is not what I was expecting. I did the 
experiment in 2.6.3.

The table below summarizes the outcome: 
[raw|[https://gist.github.com/haozhun/2e4b7c52bf3c56c03ed06e1ad895e198]]

 
||(server +05:45)||ORC||RCBinary||RCText||Text||
|Insert (client -07:00)|06:00:00|06:00:00|06:00:00|06:00:00|
|Read 1 (client -07:00)|18:45:00|06:00:00|18:45:00|18:45:00|
|Read 2 (client -04:00)|18:45:00|09:00:00|18:45:00|18:45:00|
h2. Summary

This is my take away
 * experiment 1: Hive timestamp type has Instant semantics. If it's internal 
representation is changed from java.sql.Timestamp to java.time.LocalDateTime, 
it will be a user-visible behavior change.
 * experiment 2: Do not use Hive in a zone different from the server's (given 
insert and read does not round trip in a single hive cli session). Hopefully, 
that's indeed how every one uses Hive. In that case, it does not matter whether 
experiment 2 indicates Instant or LocalDateTime.

 

> Hive should carry out timestamp computations in UTC
> ---
>
> Key: HIVE-12192
> URL: https://issues.apache.org/jira/browse/HIVE-12192
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Ryan Blue
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: timestamp
> Attachments: HIVE-12192.patch
>
>
> Hive currently uses the "local" time of a java.sql.Timestamp to represent the 
> SQL data type TIMESTAMP WITHOUT TIME ZONE. The purpose is to be able to use 
> {{Timestamp#getYear()}} and similar methods to implement SQL functions like 
> {{year}}.
> When the SQL session's time zone is a DST zone, such as America/Los_Angeles 
> that alternates between PST and PDT, there are times that cannot be 
> represented because the effective zone skips them.
> {code}
> hive> select TIMESTAMP '2015-03-08 02:10:00.101';
> 2015-03-08 03:10:00.101
> 

[jira] [Comment Edited] (HIVE-12192) Hive should carry out timestamp computations in UTC

2018-04-04 Thread Haozhun Jin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426302#comment-16426302
 ] 

Haozhun Jin edited comment on HIVE-12192 at 4/4/18 11:03 PM:
-

[~jcamachorodriguez], thank you for your patient answer. Please bear with us 
for a little bit more.

What do you think after reading the two experiments below?
h2. Experiment 1

I conducted this experiment once before. And this is why I'm under the 
impression that Hive Timestamp type means Instant. I redid the experiment today 
in both 1.2.1 and 2.6.3. The result is the same. Zone is America/Los_Angeles.

The table below summarizes the outcome: 
[raw|https://gist.github.com/haozhun/03cd09b3fa2456271f2e01759c9c1b8e]

 
||Query||Actual||expected LocalDateTime||expect Instant||
|current_timestamp()|2018-04-04 14:51|2018-04-04 14:51|2018-04-04 14:51|
|current_timestamp()
 - interval '2880' hour|2017-12-05 13:51|2018-12-05 14:51|2017-12-05 13:51|
h2. Experiment 2

After reading your comments about ORC, I conducted some experiments to see how 
other file formats behave. The result is not what I was expecting. I did the 
experiment in 2.6.3.

The table below summarizes the outcome: 
[raw|[https://gist.github.com/haozhun/2e4b7c52bf3c56c03ed06e1ad895e198]|https://gist.github.com/haozhun/2e4b7c52bf3c56c03ed06e1ad895e198])]

 
||(server +05:45)||ORC||RCBinary||RCText||Text||
|Insert (client -07:00)|06:00:00|06:00:00|06:00:00|06:00:00|
|Read 1 (client -07:00)|18:45:00|06:00:00|18:45:00|18:45:00|
|Read 2 (client -04:00)|18:45:00|09:00:00|18:45:00|18:45:00|
h2. Summary

This is my take away
 * experiment 1: Hive timestamp type has Instant semantics. If it's internal 
representation is changed from java.sql.Timestamp to java.time.LocalDateTime, 
it will be a user-visible behavior change.
 * experiment 2: Do not use Hive in a zone different from the server's (given 
insert and read does not round trip in a single hive cli session). Hopefully, 
that's indeed how every one uses Hive. In that case, it does not matter whether 
experiment 2 indicates Instant or LocalDateTime.

 


was (Author: haozhun):
[~jcamachorodriguez], thank you for your patient answer. Please bear with us 
for a little bit more.

What do you think after reading the two experiments below?
h2. Experiment 1

I conducted this experiment once before. And this is why I'm under the 
impression that Hive Timestamp type means Instant. I redid the experiment today 
in both 1.2.1 and 2.6.3. The result is the same. Zone is America/Los_Angeles.

The table below summarizes the outcome: 
[raw|https://gist.github.com/haozhun/03cd09b3fa2456271f2e01759c9c1b8e]

 
||Query||Actual||expected LocalDateTime||expect Instant||
|current_timestamp()|2018-04-04 14:51|2018-04-04 14:51|2018-04-04 14:51|
|`current_timestamp() - interval '2880' hour`|2017-12-05 13:51|2018-12-05 
14:51|2017-12-05 13:51|
h2. Experiment 2

After reading your comments about ORC, I conducted some experiments to see how 
other file formats behave. The result is not what I was expecting. I did the 
experiment in 2.6.3.

The table below summarizes the outcome: 
[raw|[https://gist.github.com/haozhun/2e4b7c52bf3c56c03ed06e1ad895e198]|https://gist.github.com/haozhun/2e4b7c52bf3c56c03ed06e1ad895e198])]

 
||(server +05:45)||ORC||RCBinary||RCText||Text||
|Insert (client -07:00)|06:00:00|06:00:00|06:00:00|06:00:00|
|Read 1 (client -07:00)|18:45:00|06:00:00|18:45:00|18:45:00|
|Read 2 (client -04:00)|18:45:00|09:00:00|18:45:00|18:45:00|
h2. Summary

This is my take away
 * experiment 1: Hive timestamp type has Instant semantics. If it's internal 
representation is changed from java.sql.Timestamp to java.time.LocalDateTime, 
it will be a user-visible behavior change.
 * experiment 2: Do not use Hive in a zone different from the server's (given 
insert and read does not round trip in a single hive cli session). Hopefully, 
that's indeed how every one uses Hive. In that case, it does not matter whether 
experiment 2 indicates Instant or LocalDateTime.

 

> Hive should carry out timestamp computations in UTC
> ---
>
> Key: HIVE-12192
> URL: https://issues.apache.org/jira/browse/HIVE-12192
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Ryan Blue
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: timestamp
> Attachments: HIVE-12192.patch
>
>
> Hive currently uses the "local" time of a java.sql.Timestamp to represent the 
> SQL data type TIMESTAMP WITHOUT TIME ZONE. The purpose is to be able to use 
> {{Timestamp#getYear()}} and similar methods to implement SQL functions like 
> {{year}}.
> When the SQL session's time zone is a DST zone, such as America/Los_Angeles 
> that alternates between PST and PDT, there are times that cannot be 
> 

[jira] [Comment Edited] (HIVE-12192) Hive should carry out timestamp computations in UTC

2018-04-04 Thread Haozhun Jin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426302#comment-16426302
 ] 

Haozhun Jin edited comment on HIVE-12192 at 4/4/18 11:02 PM:
-

[~jcamachorodriguez], thank you for your patient answer. Please bear with us 
for a little bit more.

What do you think after reading the two experiments below?
h2. Experiment 1

I conducted this experiment once before. And this is why I'm under the 
impression that Hive Timestamp type means Instant. I redid the experiment today 
in both 1.2.1 and 2.6.3. The result is the same. Zone is America/Los_Angeles.

The table below summarizes the outcome: 
[raw|https://gist.github.com/haozhun/03cd09b3fa2456271f2e01759c9c1b8e]

 
||Query||Actual||expected LocalDateTime||expect Instant||
|current_timestamp()|2018-04-04 14:51|2018-04-04 14:51|2018-04-04 14:51|
|current_timestamp()
- interval '2880' hour|2017-12-05 13:51|2018-12-05 14:51|2017-12-05 13:51|
h2. Experiment 2

After reading your comments about ORC, I conducted some experiments to see how 
other file formats behave. The result is not what I was expecting. I did the 
experiment in 2.6.3.

The table below summarizes the outcome: 
[raw|[https://gist.github.com/haozhun/2e4b7c52bf3c56c03ed06e1ad895e198]|https://gist.github.com/haozhun/2e4b7c52bf3c56c03ed06e1ad895e198])]

 
||(server +05:45)||ORC||RCBinary||RCText||Text||
|Insert (client -07:00)|06:00:00|06:00:00|06:00:00|06:00:00|
|Read 1 (client -07:00)|18:45:00|06:00:00|18:45:00|18:45:00|
|Read 2 (client -04:00)|18:45:00|09:00:00|18:45:00|18:45:00|
h2. Summary

This is my take away
 * experiment 1: Hive timestamp type has Instant semantics. If it's internal 
representation is changed from java.sql.Timestamp to java.time.LocalDateTime, 
it will be a user-visible behavior change.
 * experiment 2: Do not use Hive in a zone different from the server's (given 
insert and read does not round trip in a single hive cli session). Hopefully, 
that's indeed how every one uses Hive. In that case, it does not matter whether 
experiment 2 indicates Instant or LocalDateTime.

 


was (Author: haozhun):
[~jcamachorodriguez], thank you for your patient answer. Please bear with us 
for a little bit more.

What do you think after reading the two experiments below?
h2. Experiment 1

I conducted this experiment once before. And this is why I'm under the 
impression that Hive Timestamp type means Instant. I redid the experiment today 
in both 1.2.1 and 2.6.3. The result is the same. Zone is America/Los_Angeles.

The table below summarizes the outcome: 
[raw|https://gist.github.com/haozhun/03cd09b3fa2456271f2e01759c9c1b8e]

 
||Query||Actual||expected LocalDateTime||expect Instant||
|current_timestamp()|2018-04-04 14:51|2018-04-04 14:51|2018-04-04 14:51|
|current_timestamp() - interval '2880' hour|2017-12-05 13:51|2018-12-05 
14:51|2017-12-05 13:51|
h2. Experiment 2

After reading your comments about ORC, I conducted some experiments to see how 
other file formats behave. The result is not what I was expecting. I did the 
experiment in 2.6.3.

The table below summarizes the outcome: 
[raw|[https://gist.github.com/haozhun/2e4b7c52bf3c56c03ed06e1ad895e198]|https://gist.github.com/haozhun/2e4b7c52bf3c56c03ed06e1ad895e198])]

 
||(server +05:45)||ORC||RCBinary||RCText||Text||
|Insert (client -07:00)|06:00:00|06:00:00|06:00:00|06:00:00|
|Read 1 (client -07:00)|18:45:00|06:00:00|18:45:00|18:45:00|
|Read 2 (client -04:00)|18:45:00|09:00:00|18:45:00|18:45:00|
h2. Summary

This is my take away
 * experiment 1: Hive timestamp type has Instant semantics. If it's internal 
representation is changed from java.sql.Timestamp to java.time.LocalDateTime, 
it will be a user-visible behavior change.
 * experiment 2: Do not use Hive in a zone different from the server's (given 
insert and read does not round trip in a single hive cli session). Hopefully, 
that's indeed how every one uses Hive. In that case, it does not matter whether 
experiment 2 indicates Instant or LocalDateTime.

 

> Hive should carry out timestamp computations in UTC
> ---
>
> Key: HIVE-12192
> URL: https://issues.apache.org/jira/browse/HIVE-12192
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Ryan Blue
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: timestamp
> Attachments: HIVE-12192.patch
>
>
> Hive currently uses the "local" time of a java.sql.Timestamp to represent the 
> SQL data type TIMESTAMP WITHOUT TIME ZONE. The purpose is to be able to use 
> {{Timestamp#getYear()}} and similar methods to implement SQL functions like 
> {{year}}.
> When the SQL session's time zone is a DST zone, such as America/Los_Angeles 
> that alternates between PST and PDT, there are times that cannot be 
> represented 

[jira] [Comment Edited] (HIVE-12192) Hive should carry out timestamp computations in UTC

2018-04-04 Thread Haozhun Jin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426302#comment-16426302
 ] 

Haozhun Jin edited comment on HIVE-12192 at 4/4/18 11:02 PM:
-

[~jcamachorodriguez], thank you for your patient answer. Please bear with us 
for a little bit more.

What do you think after reading the two experiments below?
h2. Experiment 1

I conducted this experiment once before. And this is why I'm under the 
impression that Hive Timestamp type means Instant. I redid the experiment today 
in both 1.2.1 and 2.6.3. The result is the same. Zone is America/Los_Angeles.

The table below summarizes the outcome: 
[raw|https://gist.github.com/haozhun/03cd09b3fa2456271f2e01759c9c1b8e]

 
||Query||Actual||Expected LocalDateTime||Expected Instant||
|current_timestamp()|2018-04-04 14:51|2018-04-04 14:51|2018-04-04 14:51|
|current_timestamp() - interval '2880' hour|2017-12-05 13:51|2017-12-05 14:51|2017-12-05 13:51|
h2. Experiment 2

After reading your comments about ORC, I conducted some experiments to see how 
other file formats behave. The result is not what I was expecting. I did the 
experiment in 2.6.3.

The table below summarizes the outcome: 
[raw|https://gist.github.com/haozhun/2e4b7c52bf3c56c03ed06e1ad895e198]

 
||(server +05:45)||ORC||RCBinary||RCText||Text||
|Insert (client -07:00)|06:00:00|06:00:00|06:00:00|06:00:00|
|Read 1 (client -07:00)|18:45:00|06:00:00|18:45:00|18:45:00|
|Read 2 (client -04:00)|18:45:00|09:00:00|18:45:00|18:45:00|
h2. Summary

These are my takeaways:
 * Experiment 1: the Hive timestamp type has Instant semantics. If its internal
representation is changed from java.sql.Timestamp to java.time.LocalDateTime,
it will be a user-visible behavior change.
 * Experiment 2: do not use Hive in a zone different from the server's (given
that insert and read do not round trip within a single Hive CLI session).
Hopefully, that's indeed how everyone uses Hive. In that case, it does not
matter whether experiment 2 indicates Instant or LocalDateTime.

 


was (Author: haozhun):
[~jcamachorodriguez], thank you for your patient answer. Please bear with us 
for a little bit more.

What do you think after reading the two experiments below?
h2. Experiment 1

I conducted this experiment once before. And this is why I'm under the 
impression that Hive Timestamp type means Instant. I redid the experiment today 
in both 1.2.1 and 2.6.3. The result is the same. Zone is America/Los_Angeles.

The table below summarizes the outcome: 
[raw|https://gist.github.com/haozhun/03cd09b3fa2456271f2e01759c9c1b8e]

 
||Query||Actual||Expected LocalDateTime||Expected Instant||
|current_timestamp()|2018-04-04 14:51|2018-04-04 14:51|2018-04-04 14:51|
|current_timestamp() - interval '2880' hour|2017-12-05 13:51|2017-12-05 14:51|2017-12-05 13:51|
h2. Experiment 2

After reading your comments about ORC, I conducted some experiments to see how 
other file formats behave. The result is not what I was expecting. I did the 
experiment in 2.6.3.

The table below summarizes the outcome: 
[raw|https://gist.github.com/haozhun/2e4b7c52bf3c56c03ed06e1ad895e198]

 
||(server +05:45)||ORC||RCBinary||RCText||Text||
|Insert (client -07:00)|06:00:00|06:00:00|06:00:00|06:00:00|
|Read 1 (client -07:00)|18:45:00|06:00:00|18:45:00|18:45:00|
|Read 2 (client -04:00)|18:45:00|09:00:00|18:45:00|18:45:00|
h2. Summary

These are my takeaways:
 * Experiment 1: the Hive timestamp type has Instant semantics. If its internal
representation is changed from java.sql.Timestamp to java.time.LocalDateTime,
it will be a user-visible behavior change.
 * Experiment 2: do not use Hive in a zone different from the server's (given
that insert and read do not round trip within a single Hive CLI session).
Hopefully, that's indeed how everyone uses Hive. In that case, it does not
matter whether experiment 2 indicates Instant or LocalDateTime.

 

> Hive should carry out timestamp computations in UTC
> ---
>
> Key: HIVE-12192
> URL: https://issues.apache.org/jira/browse/HIVE-12192
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Ryan Blue
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: timestamp
> Attachments: HIVE-12192.patch
>
>
> Hive currently uses the "local" time of a java.sql.Timestamp to represent the 
> SQL data type TIMESTAMP WITHOUT TIME ZONE. The purpose is to be able to use 
> {{Timestamp#getYear()}} and similar methods to implement SQL functions like 
> {{year}}.
> When the SQL session's time zone is a DST zone, such as America/Los_Angeles 
> that alternates between PST and PDT, there are times that cannot be 
> 

[jira] [Comment Edited] (HIVE-12192) Hive should carry out timestamp computations in UTC

2018-04-04 Thread Haozhun Jin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426302#comment-16426302
 ] 

Haozhun Jin edited comment on HIVE-12192 at 4/4/18 11:01 PM:
-

[~jcamachorodriguez], thank you for your patient answer. Please bear with us 
for a little bit more.

What do you think after reading the two experiments below?
h2. Experiment 1

I conducted this experiment once before. And this is why I'm under the 
impression that Hive Timestamp type means Instant. I redid the experiment today 
in both 1.2.1 and 2.6.3. The result is the same. Zone is America/Los_Angeles.

The table below summarizes the outcome: 
[raw|https://gist.github.com/haozhun/03cd09b3fa2456271f2e01759c9c1b8e]

 
||Query||Actual||Expected LocalDateTime||Expected Instant||
|current_timestamp()|2018-04-04 14:51|2018-04-04 14:51|2018-04-04 14:51|
|current_timestamp() - interval '2880' hour|2017-12-05 13:51|2017-12-05 14:51|2017-12-05 13:51|
h2. Experiment 2

After reading your comments about ORC, I conducted some experiments to see how 
other file formats behave. The result is not what I was expecting. I did the 
experiment in 2.6.3.

The table below summarizes the outcome: 
[raw|https://gist.github.com/haozhun/2e4b7c52bf3c56c03ed06e1ad895e198]

 
||(server +05:45)||ORC||RCBinary||RCText||Text||
|Insert (client -07:00)|06:00:00|06:00:00|06:00:00|06:00:00|
|Read 1 (client -07:00)|18:45:00|06:00:00|18:45:00|18:45:00|
|Read 2 (client -04:00)|18:45:00|09:00:00|18:45:00|18:45:00|
h2. Summary

These are my takeaways:
 * Experiment 1: the Hive timestamp type has Instant semantics. If its internal
representation is changed from java.sql.Timestamp to java.time.LocalDateTime,
it will be a user-visible behavior change.
 * Experiment 2: do not use Hive in a zone different from the server's (given
that insert and read do not round trip within a single Hive CLI session).
Hopefully, that's indeed how everyone uses Hive. In that case, it does not
matter whether experiment 2 indicates Instant or LocalDateTime.

 


was (Author: haozhun):
[~jcamachorodriguez], thank you for your patient answer. Please bear with us 
for a little bit more.

What do you think after reading the two experiments below?
h1. Experiment 1

I conducted this experiment once before. And this is why I'm under the 
impression that Hive Timestamp type means Instant. I redid the experiment today 
in both 1.2.1 and 2.6.3. The result is the same. Zone is America/Los_Angeles.

The table below summarizes the outcome: 
[raw|https://gist.github.com/haozhun/03cd09b3fa2456271f2e01759c9c1b8e]

 
||Query||Actual||Expected LocalDateTime||Expected Instant||
|current_timestamp()|2018-04-04 14:51|2018-04-04 14:51|2018-04-04 14:51|
|current_timestamp() - interval '2880' hour|2017-12-05 13:51|2017-12-05 14:51|2017-12-05 13:51|
h1. Experiment 2

After reading your comments about ORC, I conducted some experiments to see how 
other file formats behave. The result is not what I was expecting. I did the 
experiment in 2.6.3.

The table below summarizes the outcome: 
[raw|https://gist.github.com/haozhun/2e4b7c52bf3c56c03ed06e1ad895e198]

 
||(server +05:45)||ORC||RCBinary||RCText||Text||
|Insert (client -07:00)|06:00:00|06:00:00|06:00:00|06:00:00|
|Read 1 (client -07:00)|18:45:00|06:00:00|18:45:00|18:45:00|
|Read 2 (client -04:00)|18:45:00|09:00:00|18:45:00|18:45:00|
h1. Summary

These are my takeaways:
 * Experiment 1: the Hive timestamp type has Instant semantics. If its internal
representation is changed from java.sql.Timestamp to java.time.LocalDateTime,
it will be a user-visible behavior change.
 * Experiment 2: do not use Hive in a zone different from the server's (given
that insert and read do not round trip within a single Hive CLI session).
Hopefully, that's indeed how everyone uses Hive. In that case, it does not
matter whether experiment 2 indicates Instant or LocalDateTime.

 

> Hive should carry out timestamp computations in UTC
> ---
>
> Key: HIVE-12192
> URL: https://issues.apache.org/jira/browse/HIVE-12192
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Ryan Blue
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: timestamp
> Attachments: HIVE-12192.patch
>
>
> Hive currently uses the "local" time of a java.sql.Timestamp to represent the 
> SQL data type TIMESTAMP WITHOUT TIME ZONE. The purpose is to be able to use 
> {{Timestamp#getYear()}} and similar methods to implement SQL functions like 
> {{year}}.
> When the SQL session's time zone is a DST zone, such as America/Los_Angeles 
> that alternates between PST and PDT, there are times that cannot be 
> represented 

[jira] [Commented] (HIVE-12192) Hive should carry out timestamp computations in UTC

2018-04-04 Thread Haozhun Jin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426302#comment-16426302
 ] 

Haozhun Jin commented on HIVE-12192:


[~jcamachorodriguez], thank you for your patient answer. Please bear with us 
for a little bit more.

What do you think after reading the two experiments below?
h1. Experiment 1

I conducted this experiment once before. And this is why I'm under the 
impression that Hive Timestamp type means Instant. I redid the experiment today 
in both 1.2.1 and 2.6.3. The result is the same. Zone is America/Los_Angeles.

The table below summarizes the outcome: 
[raw|https://gist.github.com/haozhun/03cd09b3fa2456271f2e01759c9c1b8e]

 
||Query||Actual||Expected LocalDateTime||Expected Instant||
|current_timestamp()|2018-04-04 14:51|2018-04-04 14:51|2018-04-04 14:51|
|current_timestamp() - interval '2880' hour|2017-12-05 13:51|2017-12-05 14:51|2017-12-05 13:51|
h1. Experiment 2

After reading your comments about ORC, I conducted some experiments to see how 
other file formats behave. The result is not what I was expecting. I did the 
experiment in 2.6.3.

The table below summarizes the outcome: 
[raw|https://gist.github.com/haozhun/2e4b7c52bf3c56c03ed06e1ad895e198]

 
||(server +05:45)||ORC||RCBinary||RCText||Text||
|Insert (client -07:00)|06:00:00|06:00:00|06:00:00|06:00:00|
|Read 1 (client -07:00)|18:45:00|06:00:00|18:45:00|18:45:00|
|Read 2 (client -04:00)|18:45:00|09:00:00|18:45:00|18:45:00|
h1. Summary

These are my takeaways:
 * Experiment 1: the Hive timestamp type has Instant semantics. If its internal
representation is changed from java.sql.Timestamp to java.time.LocalDateTime,
it will be a user-visible behavior change.
 * Experiment 2: do not use Hive in a zone different from the server's (given
that insert and read do not round trip within a single Hive CLI session).
Hopefully, that's indeed how everyone uses Hive. In that case, it does not
matter whether experiment 2 indicates Instant or LocalDateTime.

 

> Hive should carry out timestamp computations in UTC
> ---
>
> Key: HIVE-12192
> URL: https://issues.apache.org/jira/browse/HIVE-12192
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Ryan Blue
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: timestamp
> Attachments: HIVE-12192.patch
>
>
> Hive currently uses the "local" time of a java.sql.Timestamp to represent the 
> SQL data type TIMESTAMP WITHOUT TIME ZONE. The purpose is to be able to use 
> {{Timestamp#getYear()}} and similar methods to implement SQL functions like 
> {{year}}.
> When the SQL session's time zone is a DST zone, such as America/Los_Angeles 
> that alternates between PST and PDT, there are times that cannot be 
> represented because the effective zone skips them.
> {code}
> hive> select TIMESTAMP '2015-03-08 02:10:00.101';
> 2015-03-08 03:10:00.101
> {code}
> Using UTC instead of the SQL session time zone as the underlying zone for a 
> java.sql.Timestamp avoids this bug, while still returning correct values for 
> {{getYear}} etc. Using UTC as the convenience representation (timestamp 
> without time zone has no real zone) would make timestamp calculations more 
> consistent and avoid similar problems in the future.
> Notably, this would break the {{unix_timestamp}} UDF that specifies the 
> result is with respect to ["the default timezone and default 
> locale"|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions].
>  That function would need to be updated to use the 
> {{System.getProperty("user.timezone")}} zone.
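The skipped-time behavior described in the issue can be demonstrated outside Hive with java.time (a sketch; the class name is illustrative):

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;

public class DstGap {
    public static void main(String[] args) {
        ZoneId la = ZoneId.of("America/Los_Angeles");

        // 02:10 on 2015-03-08 does not exist in America/Los_Angeles: clocks
        // jump from 02:00 straight to 03:00. Zone-based resolution silently
        // pushes the value later by the length of the gap, which is exactly
        // the 02:10 -> 03:10 shift the issue description reports.
        LocalDateTime skipped = LocalDateTime.of(2015, 3, 8, 2, 10);
        System.out.println(ZonedDateTime.of(skipped, la).toLocalDateTime()); // 2015-03-08T03:10

        // A zone-less (e.g. UTC-backed) representation keeps the literal
        // value intact, since its time-line has no gaps.
        System.out.println(skipped); // 2015-03-08T02:10
    }
}
```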



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HIVE-12192) Hive should carry out timestamp computations in UTC

2018-04-04 Thread Haozhun Jin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426302#comment-16426302
 ] 

Haozhun Jin edited comment on HIVE-12192 at 4/4/18 11:01 PM:
-

[~jcamachorodriguez], thank you for your patient answer. Please bear with us 
for a little bit more.

What do you think after reading the two experiments below?
h1. Experiment 1

I conducted this experiment once before. And this is why I'm under the 
impression that Hive Timestamp type means Instant. I redid the experiment today 
in both 1.2.1 and 2.6.3. The result is the same. Zone is America/Los_Angeles.

The table below summarizes the outcome: 
[raw|https://gist.github.com/haozhun/03cd09b3fa2456271f2e01759c9c1b8e]

 
||Query||Actual||Expected LocalDateTime||Expected Instant||
|current_timestamp()|2018-04-04 14:51|2018-04-04 14:51|2018-04-04 14:51|
|current_timestamp() - interval '2880' hour|2017-12-05 13:51|2017-12-05 14:51|2017-12-05 13:51|
h1. Experiment 2

After reading your comments about ORC, I conducted some experiments to see how 
other file formats behave. The result is not what I was expecting. I did the 
experiment in 2.6.3.

The table below summarizes the outcome: 
[raw|https://gist.github.com/haozhun/2e4b7c52bf3c56c03ed06e1ad895e198]

 
||(server +05:45)||ORC||RCBinary||RCText||Text||
|Insert (client -07:00)|06:00:00|06:00:00|06:00:00|06:00:00|
|Read 1 (client -07:00)|18:45:00|06:00:00|18:45:00|18:45:00|
|Read 2 (client -04:00)|18:45:00|09:00:00|18:45:00|18:45:00|
h1. Summary

These are my takeaways:
 * Experiment 1: the Hive timestamp type has Instant semantics. If its internal
representation is changed from java.sql.Timestamp to java.time.LocalDateTime,
it will be a user-visible behavior change.
 * Experiment 2: do not use Hive in a zone different from the server's (given
that insert and read do not round trip within a single Hive CLI session).
Hopefully, that's indeed how everyone uses Hive. In that case, it does not
matter whether experiment 2 indicates Instant or LocalDateTime.

 


was (Author: haozhun):
[~jcamachorodriguez], thank you for your patient answer. Please bear with us 
for a little bit more.

What do you think after reading the two experiments before?
h1. Experiment 1

I conducted this experiment once before. And this is why I'm under the 
impression that Hive Timestamp type means Instant. I redid the experiment today 
in both 1.2.1 and 2.6.3. The result is the same. Zone is America/Los_Angeles.

The table below summarizes the outcome: 
[raw|https://gist.github.com/haozhun/03cd09b3fa2456271f2e01759c9c1b8e]

 
||Query||Actual||Expected LocalDateTime||Expected Instant||
|current_timestamp()|2018-04-04 14:51|2018-04-04 14:51|2018-04-04 14:51|
|current_timestamp() - interval '2880' hour|2017-12-05 13:51|2017-12-05 14:51|2017-12-05 13:51|
h1. Experiment 2

After reading your comments about ORC, I conducted some experiments to see how 
other file formats behave. The result is not what I was expecting. I did the 
experiment in 2.6.3.

The table below summarizes the outcome: 
[raw|https://gist.github.com/haozhun/2e4b7c52bf3c56c03ed06e1ad895e198]

 
||(server +05:45)||ORC||RCBinary||RCText||Text||
|Insert (client -07:00)|06:00:00|06:00:00|06:00:00|06:00:00|
|Read 1 (client -07:00)|18:45:00|06:00:00|18:45:00|18:45:00|
|Read 2 (client -04:00)|18:45:00|09:00:00|18:45:00|18:45:00|
h1. Summary

These are my takeaways:
 * Experiment 1: the Hive timestamp type has Instant semantics. If its internal
representation is changed from java.sql.Timestamp to java.time.LocalDateTime,
it will be a user-visible behavior change.
 * Experiment 2: do not use Hive in a zone different from the server's (given
that insert and read do not round trip within a single Hive CLI session).
Hopefully, that's indeed how everyone uses Hive. In that case, it does not
matter whether experiment 2 indicates Instant or LocalDateTime.

 

> Hive should carry out timestamp computations in UTC
> ---
>
> Key: HIVE-12192
> URL: https://issues.apache.org/jira/browse/HIVE-12192
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Ryan Blue
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: timestamp
> Attachments: HIVE-12192.patch
>
>
> Hive currently uses the "local" time of a java.sql.Timestamp to represent the 
> SQL data type TIMESTAMP WITHOUT TIME ZONE. The purpose is to be able to use 
> {{Timestamp#getYear()}} and similar methods to implement SQL functions like 
> {{year}}.
> When the SQL session's time zone is a DST zone, such as America/Los_Angeles 
> that alternates between PST and PDT, there are times that cannot be 
> 

[jira] [Commented] (HIVE-16718) Provide a way to pass in user supplied maven build and test arguments to Ptest

2018-04-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426299#comment-16426299
 ] 

Hive QA commented on HIVE-16718:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
34s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
13s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
 7s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
14s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
 7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
15s{color} | {color:red} The patch generated 53 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 10m 27s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-1/dev-support/hive-personality.sh
 |
| git revision | master / dc5a943 |
| Default Java | 1.8.0_111 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-1/yetus/patch-asflicense-problems.txt
 |
| modules | C: testutils/ptest2 U: testutils/ptest2 |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-1/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Provide a way to pass in user supplied maven build and test arguments to Ptest
> --
>
> Key: HIVE-16718
> URL: https://issues.apache.org/jira/browse/HIVE-16718
> Project: Hive
>  Issue Type: New Feature
>Reporter: Barna Zsombor Klara
>Assignee: Barna Zsombor Klara
>Priority: Minor
> Attachments: HIVE-16718.01.patch
>
>
> Currently we can only pass in maven build and test arguments from the 
> properties file, so all of them need to be hardcoded.
> We should find a way to pass in arguments from the command line.





[jira] [Commented] (HIVE-16843) PrimaryToReplicaResourceFunctionTest.createDestinationPath fails with AssertionError

2018-04-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426278#comment-16426278
 ] 

Hive QA commented on HIVE-16843:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12871826/HIVE-16843.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build//testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build//console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2018-04-04 22:38:02.658
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2018-04-04 22:38:02.661
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at dc5a943 HIVE-19083: Make partition clause optional for 
INSERT(Vineet Garg,reviewed by Ashutosh Chauhan)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at dc5a943 HIVE-19083: Make partition clause optional for 
INSERT(Vineet Garg,reviewed by Ashutosh Chauhan)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2018-04-04 22:38:03.207
+ rm -rf ../yetus_PreCommit-HIVE-Build-
+ mkdir ../yetus_PreCommit-HIVE-Build-
+ git gc
+ cp -R . ../yetus_PreCommit-HIVE-Build-
+ mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-/yetus
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
error: 
a/ql/src/test/org/apache/hadoop/hive/ql/parse/repl/load/message/PrimaryToReplicaResourceFunctionTest.java:
 does not exist in index
error: 
ql/src/test/org/apache/hadoop/hive/ql/parse/repl/load/message/PrimaryToReplicaResourceFunctionTest.java:
 does not exist in index
error: 
src/test/org/apache/hadoop/hive/ql/parse/repl/load/message/PrimaryToReplicaResourceFunctionTest.java:
 does not exist in index
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12871826 - PreCommit-HIVE-Build

> PrimaryToReplicaResourceFunctionTest.createDestinationPath fails with 
> AssertionError
> 
>
> Key: HIVE-16843
> URL: https://issues.apache.org/jira/browse/HIVE-16843
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 3.0.0
> Environment: # cat /etc/lsb-release
> DISTRIB_ID=Ubuntu
> DISTRIB_RELEASE=14.04
> DISTRIB_CODENAME=trusty
> DISTRIB_DESCRIPTION="Ubuntu 14.04.5 LTS"
> # uname -a
> Linux 9efcdb4d8880 3.19.0-37-generic #42-Ubuntu SMP Fri Nov 20 18:22:05 UTC 
> 2015 x86_64 x86_64 x86_64 GNU/Linux
>Reporter: Yussuf Shaikh
>Assignee: Yussuf Shaikh
>Priority: Minor
> Attachments: HIVE-16843.patch
>
>
> Stacktrace:
> java.lang.AssertionError: 
> Expected: is 
> "hdfs://somehost:9000/someBasePath/withADir/replicaDbName/somefunctionname/9223372036854775807/ab.jar"
>  but: was 
> "hdfs://somehost:9000/someBasePath/withADir/replicadbname/somefunctionname/0/ab.jar"
>   at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
>   at org.junit.Assert.assertThat(Assert.java:865)
>   at org.junit.Assert.assertThat(Assert.java:832)
>   at 
> 

[jira] [Commented] (HIVE-16612) PerfLogger is configurable, but not extensible

2018-04-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426276#comment-16426276
 ] 

Hive QA commented on HIVE-16612:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12867355/HIVE-16612.02.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/9998/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/9998/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-9998/

Messages:
{noformat}
 This message was trimmed, see log for full details 
HEAD is now at 078b9c3 HIVE-19100 : investigate TestStreaming failures(Eugene 
Koifman, reviewed by Alan Gates)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is behind 'origin/master' by 1 commit, and can be fast-forwarded.
  (use "git pull" to update your local branch)
+ git reset --hard origin/master
HEAD is now at dc5a943 HIVE-19083: Make partition clause optional for 
INSERT(Vineet Garg,reviewed by Ashutosh Chauhan)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2018-04-04 22:35:29.206
+ rm -rf ../yetus_PreCommit-HIVE-Build-9998
+ mkdir ../yetus_PreCommit-HIVE-Build-9998
+ git gc
+ cp -R . ../yetus_PreCommit-HIVE-Build-9998
+ mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-9998/yetus
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch
error: a/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java: does not exist in index
error: a/common/src/java/org/apache/hadoop/hive/ql/log/PerfLogger.java: does not exist in index
error: a/metastore/src/java/org/apache/hadoop/hive/metastore/RetryingHMSHandler.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/Driver.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/SerializationUtilities.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/SparkHashTableSinkOperator.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkMapRecordHandler.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlan.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkRecordHandler.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkReduceRecordHandler.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkTask.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/spark/status/LocalSparkJobMonitor.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/spark/status/RemoteSparkJobMonitor.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/spark/status/SparkJobMonitor.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/MapRecordProcessor.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/MergeFileRecordProcessor.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/RecordProcessor.java: does not exist in

[jira] [Commented] (HIVE-18910) Migrate to Murmur hash for shuffle and bucketing

2018-04-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426268#comment-16426268
 ] 

Hive QA commented on HIVE-18910:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12917607/HIVE-18910.22.patch

{color:green}SUCCESS:{color} +1 due to 8 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 226 failed/errored test(s), 13975 tests 
executed
*Failed tests:*
{noformat}
TestBeeLineDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=252)
TestCopyUtils - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestDbNotificationListener - did not produce a TEST-*.xml file (likely timed 
out) (batchId=246)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=252)
TestExportImport - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestHCatHiveCompatibility - did not produce a TEST-*.xml file (likely timed 
out) (batchId=246)
TestMiniDruidCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=252)
TestMiniDruidKafkaCliDriver - did not produce a TEST-*.xml file (likely timed 
out) (batchId=252)
TestNonCatCallsWithCatalog - did not produce a TEST-*.xml file (likely timed 
out) (batchId=216)
TestReplicationOnHDFSEncryptedZones - did not produce a TEST-*.xml file (likely 
timed out) (batchId=230)
TestReplicationScenarios - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestReplicationScenariosAcidTables - did not produce a TEST-*.xml file (likely 
timed out) (batchId=230)
TestReplicationScenariosAcrossInstances - did not produce a TEST-*.xml file 
(likely timed out) (batchId=230)
TestSequenceFileReadWrite - did not produce a TEST-*.xml file (likely timed 
out) (batchId=246)
TestTezPerfCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=252)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_table_stats] 
(batchId=54)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] 
(batchId=49)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[druid_basic2] 
(batchId=11)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[nonmr_fetch] (batchId=20)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample8] (batchId=30)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys]
 (batchId=180)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[intersect_all] 
(batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[intersect_distinct]
 (batchId=153)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
(batchId=153)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez_empty]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[explainuser_4]
 (batchId=163)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[groupby_groupingset_bug]
 (batchId=174)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[mergejoin] 
(batchId=168)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=163)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_smb_main]
 (batchId=160)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_vector_dynpart_hashjoin_1]
 (batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]
 (batchId=166)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[update_access_time_non_current_db]
 (batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorization_div0]
 (batchId=170)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorized_dynamic_semijoin_reduction]
 (batchId=155)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_5] 
(batchId=105)
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver[bucket_num_reducers]
 (batchId=94)
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver[bucket_num_reducers_acid2]
 (batchId=93)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.org.apache.hadoop.hive.cli.TestNegativeCliDriver
 (batchId=95)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.org.apache.hadoop.hive.cli.TestNegativeCliDriver
 (batchId=96)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[alter_file_format]
 (batchId=96)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[alter_notnull_constraint_violation]
 (batchId=96)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[alter_view_as_select_with_partition]
 (batchId=95)

[jira] [Commented] (HIVE-17824) msck repair table should drop the missing partitions from metastore

2018-04-04 Thread Janaki Lahorani (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426253#comment-16426253
 ] 

Janaki Lahorani commented on HIVE-17824:


The test failures are not related.

> msck repair table should drop the missing partitions from metastore
> ---
>
> Key: HIVE-17824
> URL: https://issues.apache.org/jira/browse/HIVE-17824
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Janaki Lahorani
>Priority: Major
> Attachments: HIVE-17824.1.patch, HIVE-17824.2.patch
>
>
> {{msck repair table }} is often used in environments where the new 
> partitions are loaded as directories on HDFS or S3 and users want to create 
> the missing partitions in bulk. However, currently it only supports addition 
> of missing partitions. If there are any partitions which are present in 
> metastore but not on the FileSystem, it should also delete them so that it 
> truly repairs the table metadata.
> We should be careful not to break backwards compatibility so we should either 
> introduce a new config or keyword to add support to delete unnecessary 
> partitions from the metastore. This way users who want the old behavior can 
> easily turn it off. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-17824) msck repair table should drop the missing partitions from metastore

2018-04-04 Thread Janaki Lahorani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Janaki Lahorani updated HIVE-17824:
---
Attachment: HIVE-17824.2.patch

> msck repair table should drop the missing partitions from metastore
> ---
>
> Key: HIVE-17824
> URL: https://issues.apache.org/jira/browse/HIVE-17824
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Janaki Lahorani
>Priority: Major
> Attachments: HIVE-17824.1.patch, HIVE-17824.2.patch
>
>
> {{msck repair table }} is often used in environments where the new 
> partitions are loaded as directories on HDFS or S3 and users want to create 
> the missing partitions in bulk. However, currently it only supports addition 
> of missing partitions. If there are any partitions which are present in 
> metastore but not on the FileSystem, it should also delete them so that it 
> truly repairs the table metadata.
> We should be careful not to break backwards compatibility so we should either 
> introduce a new config or keyword to add support to delete unnecessary 
> partitions from the metastore. This way users who want the old behavior can 
> easily turn it off. 





[jira] [Updated] (HIVE-17824) msck repair table should drop the missing partitions from metastore

2018-04-04 Thread Janaki Lahorani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Janaki Lahorani updated HIVE-17824:
---
Attachment: (was: HIVE-17824.1.patch)

> msck repair table should drop the missing partitions from metastore
> ---
>
> Key: HIVE-17824
> URL: https://issues.apache.org/jira/browse/HIVE-17824
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Janaki Lahorani
>Priority: Major
> Attachments: HIVE-17824.1.patch, HIVE-17824.2.patch
>
>
> {{msck repair table }} is often used in environments where the new 
> partitions are loaded as directories on HDFS or S3 and users want to create 
> the missing partitions in bulk. However, currently it only supports addition 
> of missing partitions. If there are any partitions which are present in 
> metastore but not on the FileSystem, it should also delete them so that it 
> truly repairs the table metadata.
> We should be careful not to break backwards compatibility so we should either 
> introduce a new config or keyword to add support to delete unnecessary 
> partitions from the metastore. This way users who want the old behavior can 
> easily turn it off. 





[jira] [Updated] (HIVE-19083) Make partition clause optional for INSERT

2018-04-04 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-19083:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Pushed to master

> Make partition clause optional for INSERT
> -
>
> Key: HIVE-19083
> URL: https://issues.apache.org/jira/browse/HIVE-19083
> Project: Hive
>  Issue Type: Sub-task
>  Components: SQL
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-19083.1.patch, HIVE-19083.2.patch, 
> HIVE-19083.3.patch, HIVE-19083.4.patch
>
>
> Partition clause should be optional for
>  * INSERT INTO VALUES
>  * INSERT OVERWRITE
>  * INSERT SELECT





[jira] [Commented] (HIVE-18910) Migrate to Murmur hash for shuffle and bucketing

2018-04-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426213#comment-16426213
 ] 

Hive QA commented on HIVE-18910:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
42s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
19s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  4m  
1s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
43s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m 
38s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
21s{color} | {color:red} streaming in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
57s{color} | {color:red} ql in the patch failed. {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  4m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  4m  
0s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
14s{color} | {color:red} storage-api: The patch generated 3 new + 97 unchanged 
- 3 fixed = 100 total (was 100) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
22s{color} | {color:red} serde: The patch generated 150 new + 214 unchanged - 3 
fixed = 364 total (was 217) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
14s{color} | {color:red} hcatalog/streaming: The patch generated 1 new + 33 
unchanged - 0 fixed = 34 total (was 33) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m  
2s{color} | {color:red} ql: The patch generated 26 new + 1267 unchanged - 3 
fixed = 1293 total (was 1270) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch 248 line(s) with tabs. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m 
40s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
15s{color} | {color:red} The patch generated 51 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 37m 22s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-9997/dev-support/hive-personality.sh
 |
| git revision | master / 078b9c3 |
| Default Java | 1.8.0_111 |
| mvninstall | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9997/yetus/patch-mvninstall-hcatalog_streaming.txt
 |
| mvninstall | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9997/yetus/patch-mvninstall-ql.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9997/yetus/diff-checkstyle-storage-api.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9997/yetus/diff-checkstyle-serde.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9997/yetus/diff-checkstyle-hcatalog_streaming.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9997/yetus/diff-checkstyle-ql.txt
 |
| whitespace | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9997/yetus/whitespace-tabs.txt
 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9997/yetus/patch-asflicense-problems.txt
 |
| modules | C: storage-api serde hbase-handler hcatalog/streaming 
itests/hive-blobstore ql standalone-metastore U: . |
| Console output | 

[jira] [Commented] (HIVE-18775) HIVE-17983 missed deleting metastore/scripts/upgrade/derby/hive-schema-3.0.0.derby.sql

2018-04-04 Thread Vineet Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426209#comment-16426209
 ] 

Vineet Garg commented on HIVE-18775:


+1 pending tests.

> HIVE-17983 missed deleting 
> metastore/scripts/upgrade/derby/hive-schema-3.0.0.derby.sql
> --
>
> Key: HIVE-18775
> URL: https://issues.apache.org/jira/browse/HIVE-18775
> Project: Hive
>  Issue Type: Bug
>Reporter: Vineet Garg
>Assignee: Alan Gates
>Priority: Blocker
> Fix For: 3.0.0
>
> Attachments: HIVE-18775.1.patch, HIVE-18775.2.patch, 
> HIVE-18775.3.patch
>
>
> HIVE-17983 moved hive metastore schema sql files for all databases but derby 
> to standalone-metastore. As a result there are now two copies of 
> {{hive-schema-3.0.0.derby.sql}}.
> {{metastore/scripts/upgrade/derby/hive-schema-3.0.0.derby.sql}} needs to be 
> removed.





[jira] [Assigned] (HIVE-19112) Support Analyze table for partitioned tables without partition spec

2018-04-04 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg reassigned HIVE-19112:
--


> Support Analyze table for partitioned tables without partition spec
> ---
>
> Key: HIVE-19112
> URL: https://issues.apache.org/jira/browse/HIVE-19112
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>
> Currently, to run analyze table compute statistics on a partitioned table, a 
> partition spec needs to be specified. We should make it optional.





[jira] [Commented] (HIVE-19083) Make partition clause optional for INSERT

2018-04-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426156#comment-16426156
 ] 

Hive QA commented on HIVE-19083:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12917422/HIVE-19083.4.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 198 failed/errored test(s), 13308 tests 
executed
*Failed tests:*
{noformat}
TestCopyUtils - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestDbNotificationListener - did not produce a TEST-*.xml file (likely timed 
out) (batchId=246)
TestExportImport - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestHCatHiveCompatibility - did not produce a TEST-*.xml file (likely timed 
out) (batchId=246)
TestNegativeCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=95)


[jira] [Commented] (HIVE-12192) Hive should carry out timestamp computations in UTC

2018-04-04 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426142#comment-16426142
 ] 

Jesus Camacho Rodriguez commented on HIVE-12192:


bq. How similar? Is it really the same as a SQL (standard) timestamp, i.e. a 
"Record" of year/month/day/hour/minute/second[/fraction]? Did these semantics 
change over time?
They should be equal, and this has not changed for some time.
As I mentioned, for end users there should not be a visible difference with this 
patch, except for bugs such as the one mentioned in the description of the 
issue. Other complicated scenarios may be fixed with this patch too, e.g. query 
execution across multiple clusters with different timezones, but I am not sure 
this is something that is supported by Hive right now in any case.

bq. Did you also consider using a different representation, like 
java.time.LocalDateTime? (if this representation is indeed applicable)
This is precisely what the patch attached to this issue is doing; you can check 
it above.

bq. Do you happen to know how it is handled for other file types? Parquet, RC 
binary, RC text, textfile?
If I remember correctly, for text-based formats, the string representation is 
persisted, e.g. '1970-01-01 00:00:00'. I do not remember how other formats 
handle the mismatch, but if they use a long representation, I would expect that 
they transform the timestamp from system time zone to UTC when they write to 
disk from Hive, and then from UTC to current system time zone when they read 
from disk into Hive.
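The DST-gap behavior under discussion is easy to demonstrate with java.time: a LocalDateTime carries only wall-clock fields, and only resolving it through a zone with a gap shifts the value. A small self-contained sketch (not Hive code; java.time resolves gap times to the instant just after the transition):

```java
import java.time.LocalDateTime;
import java.time.ZoneId;

// Demonstrates the DST gap from the issue description: 2015-03-08 02:10
// does not exist in America/Los_Angeles (clocks jump 02:00 -> 03:00), so
// resolving through that zone shifts the value forward, while UTC (which
// has no transitions) preserves the wall-clock fields.
public class DstGapDemo {

    // Resolve a local timestamp through a zone and return the resulting
    // wall-clock value as a string.
    static String resolve(String localTimestamp, String zone) {
        return LocalDateTime.parse(localTimestamp)
                .atZone(ZoneId.of(zone))
                .toLocalDateTime()
                .toString();
    }

    public static void main(String[] args) {
        // Shifted to the far side of the gap, like the Hive bug shown below.
        System.out.println(resolve("2015-03-08T02:10:00.101", "America/Los_Angeles"));
        // Unchanged: UTC has no gaps.
        System.out.println(resolve("2015-03-08T02:10:00.101", "UTC"));
    }
}
```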


> Hive should carry out timestamp computations in UTC
> ---
>
> Key: HIVE-12192
> URL: https://issues.apache.org/jira/browse/HIVE-12192
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Ryan Blue
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: timestamp
> Attachments: HIVE-12192.patch
>
>
> Hive currently uses the "local" time of a java.sql.Timestamp to represent the 
> SQL data type TIMESTAMP WITHOUT TIME ZONE. The purpose is to be able to use 
> {{Timestamp#getYear()}} and similar methods to implement SQL functions like 
> {{year}}.
> When the SQL session's time zone is a DST zone, such as America/Los_Angeles 
> that alternates between PST and PDT, there are times that cannot be 
> represented because the effective zone skips them.
> {code}
> hive> select TIMESTAMP '2015-03-08 02:10:00.101';
> 2015-03-08 03:10:00.101
> {code}
> Using UTC instead of the SQL session time zone as the underlying zone for a 
> java.sql.Timestamp avoids this bug, while still returning correct values for 
> {{getYear}} etc. Using UTC as the convenience representation (timestamp 
> without time zone has no real zone) would make timestamp calculations more 
> consistent and avoid similar problems in the future.
> Notably, this would break the {{unix_timestamp}} UDF that specifies the 
> result is with respect to ["the default timezone and default 
> locale"|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions].
>  That function would need to be updated to use the 
> {{System.getProperty("user.timezone")}} zone.





[jira] [Updated] (HIVE-12369) Native Vector GroupBy

2018-04-04 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-12369:

Attachment: HIVE-12369.091.patch

> Native Vector GroupBy
> -
>
> Key: HIVE-12369
> URL: https://issues.apache.org/jira/browse/HIVE-12369
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-12369.01.patch, HIVE-12369.02.patch, 
> HIVE-12369.05.patch, HIVE-12369.06.patch, HIVE-12369.091.patch
>
>
> Implement Native Vector GroupBy using fast hash table technology developed 
> for Native Vector MapJoin, etc.
> Patch is currently limited to a single Long key with a single COUNT 
> aggregation, or a single Long key with no aggregation, also known as 
> duplicate reduction.
> 3 new classes are introduced that store the count in the slot table and don't 
> allocate hash elements:
> {noformat}
>   COUNT(column)  VectorGroupByHashLongKeyCountColumnOperator  
>   COUNT(key) VectorGroupByHashLongKeyCountKeyOperator
>   COUNT(*)   VectorGroupByHashLongKeyCountStarOperator   
> {noformat}
> And the duplicate reduction operator for a single Long key:
> {noformat}
>   VectorGroupByHashLongKeyDuplicateReductionOperator
> {noformat}
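The "count stored in the slot table" idea above can be sketched generically: an open-addressing hash keyed by a single long whose COUNT lives in a parallel array, so no per-entry objects are allocated. This is an illustrative toy (fixed power-of-two capacity, no resizing), not Hive's actual operator code:

```java
// Toy version of a slot-table count hash: counts are kept in an array
// parallel to the key slots, so aggregation never allocates hash entries.
// Capacity is a fixed power of two and must exceed the number of distinct
// keys -- a deliberate simplification of the real fast hash tables.
public class LongKeyCountTable {
    private final long[] keys;
    private final long[] counts;
    private final boolean[] used;
    private final int mask;

    public LongKeyCountTable(int capacityPow2) {
        keys = new long[capacityPow2];
        counts = new long[capacityPow2];
        used = new boolean[capacityPow2];
        mask = capacityPow2 - 1;
    }

    // Murmur-style 64-bit finalizer to spread keys over slots.
    private static int spread(long k) {
        k ^= k >>> 33;
        k *= 0xff51afd7ed558ccdL;
        k ^= k >>> 33;
        return (int) k;
    }

    // Increment the count for a key, linear probing on collision.
    public void add(long key) {
        int slot = spread(key) & mask;
        while (used[slot] && keys[slot] != key) {
            slot = (slot + 1) & mask;
        }
        used[slot] = true;
        keys[slot] = key;
        counts[slot]++;
    }

    // Return the accumulated count for a key, or 0 if never seen.
    public long count(long key) {
        int slot = spread(key) & mask;
        while (used[slot]) {
            if (keys[slot] == key) {
                return counts[slot];
            }
            slot = (slot + 1) & mask;
        }
        return 0;
    }
}
```

Duplicate reduction is the degenerate case: only the `used`/`keys` arrays matter and the counts can be dropped entirely.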





[jira] [Updated] (HIVE-12369) Native Vector GroupBy

2018-04-04 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-12369:

Description: 
Implement Native Vector GroupBy using fast hash table technology developed for 
Native Vector MapJoin, etc.

Patch is currently limited to a single Long key with a single COUNT 
aggregation, or a single Long key with no aggregation, also known as duplicate 
reduction.

3 new classes are introduced that store the count in the slot table and don't 
allocate hash elements:
{noformat}
  COUNT(column)  VectorGroupByHashLongKeyCountColumnOperator  
  COUNT(key) VectorGroupByHashLongKeyCountKeyOperator
  COUNT(*)   VectorGroupByHashLongKeyCountStarOperator   
{noformat}
And the duplicate reduction operator for a single Long key:
{noformat}
  VectorGroupByHashLongKeyDuplicateReductionOperator
{noformat}

  was:
Implement Native Vector GroupBy using fast hash table technology developed for 
Native Vector MapJoin, etc.

Patch is currently limited to a single Long key, aggregation on Long columns, 
no more than 31 columns.

3 new classes introduces that stored the count in the slot table and don't 
allocate hash elements:

{noformat}
  COUNT(column)  VectorGroupByHashOneLongKeyCountColumnOperator  
  COUNT(key) VectorGroupByHashOneLongKeyCountKeyOperator
  COUNT(*)   VectorGroupByHashOneLongKeyCountStarOperator   
{noformat}

And a new class that aggregates a single Long key:

{noformat}
  VectorGroupByHashOneLongKeyOperator
{noformat}


> Native Vector GroupBy
> -
>
> Key: HIVE-12369
> URL: https://issues.apache.org/jira/browse/HIVE-12369
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-12369.01.patch, HIVE-12369.02.patch, 
> HIVE-12369.05.patch, HIVE-12369.06.patch
>
>
> Implement Native Vector GroupBy using fast hash table technology developed 
> for Native Vector MapJoin, etc.
> Patch is currently limited to a single Long key with a single COUNT 
> aggregation, or a single Long key with no aggregation, also known as 
> duplicate reduction.
> 3 new classes are introduced that store the count in the slot table and don't 
> allocate hash elements:
> {noformat}
>   COUNT(column)  VectorGroupByHashLongKeyCountColumnOperator  
>   COUNT(key) VectorGroupByHashLongKeyCountKeyOperator
>   COUNT(*)   VectorGroupByHashLongKeyCountStarOperator   
> {noformat}
> And the duplicate reduction operator for a single Long key:
> {noformat}
>   VectorGroupByHashLongKeyDuplicateReductionOperator
> {noformat}





[jira] [Commented] (HIVE-12192) Hive should carry out timestamp computations in UTC

2018-04-04 Thread Piotr Findeisen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426109#comment-16426109
 ] 

Piotr Findeisen commented on HIVE-12192:


A few clarifying questions:

bq. The type is not really an instant, even before HIVE-12192. Semantics from 
querying perspective are similar to SQL timestamp. 

How similar? Is it really the same as a SQL (standard) timestamp, i.e. a 
"Record" of year/month/day/hour/minute/second[/fraction]?
Did these semantics change over time? ([~haozhun] and I work on Presto and we 
would like Presto's Hive connector to work appropriately with different 
versions of Hive)

bq. it uses java.sql.timestamp class to store the value

What about values that cannot be represented using {{java.sql.Timestamp}}? 
(i.e. whenever there's a forward offset change in the JVM's zone)
Did you also consider using a different representation, like 
{{java.time.LocalDateTime}}? (if this representation is indeed applicable)

bq. How does ORC fix this? \[...]

Do you happen to know how it is handled for other file types? Parquet, RC binary, 
RC text, textfile?

> Hive should carry out timestamp computations in UTC
> ---
>
> Key: HIVE-12192
> URL: https://issues.apache.org/jira/browse/HIVE-12192
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Ryan Blue
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: timestamp
> Attachments: HIVE-12192.patch
>
>
> Hive currently uses the "local" time of a java.sql.Timestamp to represent the 
> SQL data type TIMESTAMP WITHOUT TIME ZONE. The purpose is to be able to use 
> {{Timestamp#getYear()}} and similar methods to implement SQL functions like 
> {{year}}.
> When the SQL session's time zone is a DST zone, such as America/Los_Angeles 
> that alternates between PST and PDT, there are times that cannot be 
> represented because the effective zone skips them.
> {code}
> hive> select TIMESTAMP '2015-03-08 02:10:00.101';
> 2015-03-08 03:10:00.101
> {code}
> Using UTC instead of the SQL session time zone as the underlying zone for a 
> java.sql.Timestamp avoids this bug, while still returning correct values for 
> {{getYear}} etc. Using UTC as the convenience representation (timestamp 
> without time zone has no real zone) would make timestamp calculations more 
> consistent and avoid similar problems in the future.
> Notably, this would break the {{unix_timestamp}} UDF that specifies the 
> result is with respect to ["the default timezone and default 
> locale"|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions].
>  That function would need to be updated to use the 
> {{System.getProperty("user.timezone")}} zone.





[jira] [Updated] (HIVE-18910) Migrate to Murmur hash for shuffle and bucketing

2018-04-04 Thread Deepak Jaiswal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-18910:
--
Attachment: HIVE-18910.22.patch

> Migrate to Murmur hash for shuffle and bucketing
> 
>
> Key: HIVE-18910
> URL: https://issues.apache.org/jira/browse/HIVE-18910
> Project: Hive
>  Issue Type: Task
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-18910.1.patch, HIVE-18910.10.patch, 
> HIVE-18910.11.patch, HIVE-18910.12.patch, HIVE-18910.13.patch, 
> HIVE-18910.14.patch, HIVE-18910.15.patch, HIVE-18910.16.patch, 
> HIVE-18910.17.patch, HIVE-18910.18.patch, HIVE-18910.19.patch, 
> HIVE-18910.2.patch, HIVE-18910.20.patch, HIVE-18910.21.patch, 
> HIVE-18910.22.patch, HIVE-18910.3.patch, HIVE-18910.4.patch, 
> HIVE-18910.5.patch, HIVE-18910.6.patch, HIVE-18910.7.patch, 
> HIVE-18910.8.patch, HIVE-18910.9.patch
>
>
> Hive uses the Java hash, which is not as good as Murmur hash for 
> distribution and efficiency when bucketing a table.
> Migrate to Murmur hash but keep backward compatibility for existing users so 
> that they don't have to reload their existing tables.
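The distribution point can be illustrated with a toy comparison: bucket assignment via plain String.hashCode versus the same hash passed through a Murmur-style bit mixer (Murmur3's 32-bit finalizer is used here as a stand-in; this is not Hive's actual bucketing code, and the backward-compatibility path is omitted):

```java
// Illustrative sketch of bucket assignment, not Hive's code paths.
// Java's String.hashCode is weak for short or similar keys; Murmur-style
// mixing (here, Murmur3's 32-bit finalizer, fmix32) spreads bits better.
public class BucketDemo {

    // Murmur3 fmix32 finalizer, used as a stand-in "better" mixer.
    static int fmix32(int h) {
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    // Bucket index from the raw Java hash.
    static int javaBucket(String key, int numBuckets) {
        return Math.floorMod(key.hashCode(), numBuckets);
    }

    // Bucket index after Murmur-style mixing of the same hash.
    static int murmurBucket(String key, int numBuckets) {
        return Math.floorMod(fmix32(key.hashCode()), numBuckets);
    }

    public static void main(String[] args) {
        // Sequential keys land in sequential buckets under plain hashCode,
        // while the mixed hash scatters them across the bucket range.
        for (int i = 0; i < 4; i++) {
            String k = "part_" + i;
            System.out.println(k + " java=" + javaBucket(k, 8)
                    + " murmur=" + murmurBucket(k, 8));
        }
    }
}
```

The backward-compatibility requirement means both functions would have to coexist, with table metadata recording which one laid out the existing buckets.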





[jira] [Commented] (HIVE-19031) Mark duplicate configs in HiveConf as deprecated

2018-04-04 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426095#comment-16426095
 ] 

Thejas M Nair commented on HIVE-19031:
--

+1


> Mark duplicate configs in HiveConf as deprecated
> 
>
> Key: HIVE-19031
> URL: https://issues.apache.org/jira/browse/HIVE-19031
> Project: Hive
>  Issue Type: Sub-task
>  Components: Configuration, Standalone Metastore
>Affects Versions: 3.0.0
>Reporter: Alan Gates
>Assignee: Alan Gates
>Priority: Blocker
> Attachments: HIVE-19031.2.patch, HIVE-19031.patch
>
>
> There are a number of configuration values that were copied from HiveConf to 
> MetastoreConf.  They have been left in HiveConf for backwards compatibility.  
> But they need to be marked as deprecated so that users know to use the new 
> values in MetastoreConf.





[jira] [Updated] (HIVE-19100) investigate TestStreaming failures

2018-04-04 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-19100:
--
   Resolution: Fixed
Fix Version/s: 3.0.0
 Release Note: n/a
   Status: Resolved  (was: Patch Available)

> investigate TestStreaming failures
> --
>
> Key: HIVE-19100
> URL: https://issues.apache.org/jira/browse/HIVE-19100
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-19100.01.patch, HIVE-19100.02.patch, 
> HIVE-19100.03.patch
>
>
> {noformat}
> [ERROR] Failures: 
> [ERROR]   
> TestStreaming.testInterleavedTransactionBatchCommits:1218->checkDataWritten2:619
>  expected:<11> but was:<12>
> [ERROR]   
> TestStreaming.testMultipleTransactionBatchCommits:1157->checkDataWritten2:619 
> expected:<1> but was:<3>
> [ERROR]   
> TestStreaming.testTransactionBatchAbortAndCommit:1138->checkDataWritten:566 
> expected:<1> but was:<2>
> [ERROR]   
> TestStreaming.testTransactionBatchCommit_Delimited:861->testTransactionBatchCommit_Delimited:881->checkDataWritten:566
>  expected:<1> but was:<3>
> [ERROR]   
> TestStreaming.testTransactionBatchCommit_DelimitedUGI:865->testTransactionBatchCommit_Delimited:881->checkDataWritten:566
>  expected:<1> but was:<3>
> [ERROR]   
> TestStreaming.testTransactionBatchCommit_Json:1011->checkDataWritten:566 
> expected:<1> but was:<3>
> [ERROR]   
> TestStreaming.testTransactionBatchCommit_Regex:928->testTransactionBatchCommit_Regex:949->checkDataWritten:566
>  expected:<1> but was:<3>
> [ERROR]   
> TestStreaming.testTransactionBatchCommit_RegexUGI:932->testTransactionBatchCommit_Regex:949->checkDataWritten:566
>  expected:<1> but was:<3>
> [INFO] 
> [ERROR] Tests run: 26, Failures: 8, Errors: 0, Skipped: 0
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19100) investigate TestStreaming failures

2018-04-04 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426084#comment-16426084
 ] 

Eugene Koifman commented on HIVE-19100:
---

There are a bunch of TestTezPerfCliDriver failures with age = 1, but other runs
https://builds.apache.org/job/PreCommit-HIVE-Build/9992/testReport
https://builds.apache.org/job/PreCommit-HIVE-Build/9991/testReport
contain identical failures.

Thanks, Alan, for the review.
Committed to master.

> investigate TestStreaming failures
> --
>
> Key: HIVE-19100
> URL: https://issues.apache.org/jira/browse/HIVE-19100
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-19100.01.patch, HIVE-19100.02.patch, 
> HIVE-19100.03.patch
>
>
> {noformat}
> [ERROR] Failures: 
> [ERROR]   
> TestStreaming.testInterleavedTransactionBatchCommits:1218->checkDataWritten2:619
>  expected:<11> but was:<12>
> [ERROR]   
> TestStreaming.testMultipleTransactionBatchCommits:1157->checkDataWritten2:619 
> expected:<1> but was:<3>
> [ERROR]   
> TestStreaming.testTransactionBatchAbortAndCommit:1138->checkDataWritten:566 
> expected:<1> but was:<2>
> [ERROR]   
> TestStreaming.testTransactionBatchCommit_Delimited:861->testTransactionBatchCommit_Delimited:881->checkDataWritten:566
>  expected:<1> but was:<3>
> [ERROR]   
> TestStreaming.testTransactionBatchCommit_DelimitedUGI:865->testTransactionBatchCommit_Delimited:881->checkDataWritten:566
>  expected:<1> but was:<3>
> [ERROR]   
> TestStreaming.testTransactionBatchCommit_Json:1011->checkDataWritten:566 
> expected:<1> but was:<3>
> [ERROR]   
> TestStreaming.testTransactionBatchCommit_Regex:928->testTransactionBatchCommit_Regex:949->checkDataWritten:566
>  expected:<1> but was:<3>
> [ERROR]   
> TestStreaming.testTransactionBatchCommit_RegexUGI:932->testTransactionBatchCommit_Regex:949->checkDataWritten:566
>  expected:<1> but was:<3>
> [INFO] 
> [ERROR] Tests run: 26, Failures: 8, Errors: 0, Skipped: 0
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18972) beeline command suggestion to kill job deprecated

2018-04-04 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-18972:
---
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Patch merged into master. Thanks for your contribution, [~bharos92]!

> beeline command suggestion to kill job deprecated
> -
>
> Key: HIVE-18972
> URL: https://issues.apache.org/jira/browse/HIVE-18972
> Project: Hive
>  Issue Type: Bug
>Reporter: Bharathkrishna Guruvayoor Murali
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HIVE-18972.1.patch
>
>
> When I run a beeline command that uses YARN:
> {code}
> INFO : The url to track the job: 
> http://vd0514.halxg.cloudera.com:8088/proxy/application_1488996234407_0010/
> INFO : Starting Job = job_1488996234407_0010, Tracking URL = 
> http://vd0514.halxg.cloudera.com:8088/proxy/application_1488996234407_0010/
> INFO : Kill Command = 
> /opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.9/lib/hadoop/bin/hadoop job 
> -kill job_1488996234407_0010
> {code}
> If I then try to kill the job using that command:
> {code}
> [systest@vd0514 ~]$ 
> /opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.9/lib/hadoop/bin/hadoop job 
> -kill job_1488996234407_0010
> DEPRECATED: Use of this script to execute mapred command is deprecated.
> Instead use the mapred command for it.
> {code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19083) Make partition clause optional for INSERT

2018-04-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426078#comment-16426078
 ] 

Hive QA commented on HIVE-19083:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
26s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
17s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
45s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
6s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
9s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
15s{color} | {color:red} The patch generated 50 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 17m 11s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-9996/dev-support/hive-personality.sh
 |
| git revision | master / d283899 |
| Default Java | 1.8.0_111 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9996/yetus/patch-asflicense-problems.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9996/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Make partition clause optional for INSERT
> -
>
> Key: HIVE-19083
> URL: https://issues.apache.org/jira/browse/HIVE-19083
> Project: Hive
>  Issue Type: Sub-task
>  Components: SQL
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-19083.1.patch, HIVE-19083.2.patch, 
> HIVE-19083.3.patch, HIVE-19083.4.patch
>
>
> Partition clause should be optional for
>  * INSERT INTO VALUES
>  * INSERT OVERWRITE
>  * INSERT SELECT



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18976) Add ability to setup Druid Kafka Ingestion from Hive

2018-04-04 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-18976:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Nishant!

> Add ability to setup Druid Kafka Ingestion from Hive
> 
>
> Key: HIVE-18976
> URL: https://issues.apache.org/jira/browse/HIVE-18976
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-18976.03.patch, HIVE-18976.04.patch, 
> HIVE-18976.05.patch, HIVE-18976.patch
>
>
> Add the ability to set up Druid Kafka ingestion using a Hive CREATE TABLE statement.
> E.g., the query below can submit a Kafka supervisor spec to the Druid overlord, and 
> Druid can start ingesting events from Kafka. 
> {code:java}
>  
> CREATE TABLE druid_kafka_test(`__time` timestamp, page string, language 
> string, `user` string, added int, deleted int, delta int)
> STORED BY 
> 'org.apache.hadoop.hive.druid.DruidKafkaStreamingStorageHandler'
> TBLPROPERTIES (
> "druid.segment.granularity" = "HOUR",
> "druid.query.granularity" = "MINUTE",
> "kafka.bootstrap.servers" = "localhost:9092",
> "kafka.topic" = "test-topic",
> "druid.kafka.ingest.useEarliestOffset" = "true"
> );
> {code}
> Design - This can be done via a DruidKafkaStreamingStorageHandler that 
> extends the existing DruidStorageHandler and adds the additional functionality 
> for streaming. 
> Testing - Add a DruidKafkaMiniCluster consisting of a DruidMiniCluster 
> plus a single-node Kafka broker. The broker can be populated with a test topic 
> that has some predefined data. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19100) investigate TestStreaming failures

2018-04-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426035#comment-16426035
 ] 

Hive QA commented on HIVE-19100:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12917431/HIVE-19100.03.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 243 failed/errored test(s), 13635 tests 
executed
*Failed tests:*
{noformat}
TestCopyUtils - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestDbNotificationListener - did not produce a TEST-*.xml file (likely timed 
out) (batchId=246)
TestExportImport - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestHCatHiveCompatibility - did not produce a TEST-*.xml file (likely timed 
out) (batchId=246)
TestNegativeCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=95)


[jira] [Commented] (HIVE-19033) Provide an option to purge LLAP IO cache

2018-04-04 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426030#comment-16426030
 ] 

Prasanth Jayachandran commented on HIVE-19033:
--

Fixes the TestBeeLineDriver insert_overwrite_local_directory_1.q test failure, 
which does a dfs -cat on tab-delimited data. Updated ColumnBasedSet to split 
only when the schema has more than one column (which is the case only for LLAP 
commands; all other commands emit a single column).

> Provide an option to purge LLAP IO cache
> 
>
> Key: HIVE-19033
> URL: https://issues.apache.org/jira/browse/HIVE-19033
> Project: Hive
>  Issue Type: Improvement
>  Components: llap
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-19033.1.patch, HIVE-19033.2.patch, 
> HIVE-19033.3.patch, HIVE-19033.4.patch, HIVE-19033.5.patch, 
> HIVE-19033.6.patch, HIVE-19033.7.patch, HIVE-19033.8.patch
>
>
> Provide an API endpoint that will trigger purging of LLAP IO cache. Also CLI 
> tool to invoke the endpoint of all LLAP daemons. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19033) Provide an option to purge LLAP IO cache

2018-04-04 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-19033:
-
Attachment: HIVE-19033.8.patch

> Provide an option to purge LLAP IO cache
> 
>
> Key: HIVE-19033
> URL: https://issues.apache.org/jira/browse/HIVE-19033
> Project: Hive
>  Issue Type: Improvement
>  Components: llap
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-19033.1.patch, HIVE-19033.2.patch, 
> HIVE-19033.3.patch, HIVE-19033.4.patch, HIVE-19033.5.patch, 
> HIVE-19033.6.patch, HIVE-19033.7.patch, HIVE-19033.8.patch
>
>
> Provide an API endpoint that will trigger purging of LLAP IO cache. Also CLI 
> tool to invoke the endpoint of all LLAP daemons. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18841) Support authorization of UDF usage in hive

2018-04-04 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-18841:
-
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Committed to master.
Thanks for the review, [~daijy]!


> Support authorization of UDF usage in hive
> --
>
> Key: HIVE-18841
> URL: https://issues.apache.org/jira/browse/HIVE-18841
> Project: Hive
>  Issue Type: New Feature
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HIVE-18841.1.patch, HIVE-18841.1.patch, 
> HIVE-18841.2.patch
>
>
> It should be possible to create authorization policies on UDF usage. 
> i.e., it should be possible to control who can use certain UDFs in their queries.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18841) Support authorization of UDF usage in hive

2018-04-04 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-18841:
-
Attachment: HIVE-18841.2.patch

> Support authorization of UDF usage in hive
> --
>
> Key: HIVE-18841
> URL: https://issues.apache.org/jira/browse/HIVE-18841
> Project: Hive
>  Issue Type: New Feature
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
>Priority: Critical
> Attachments: HIVE-18841.1.patch, HIVE-18841.1.patch, 
> HIVE-18841.2.patch
>
>
> It should be possible to create authorization policies on UDF usage. 
> i.e., it should be possible to control who can use certain UDFs in their queries.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18841) Support authorization of UDF usage in hive

2018-04-04 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426013#comment-16426013
 ] 

Thejas M Nair commented on HIVE-18841:
--

2.patch - Updated q.out files, fixed checkstyle issues


> Support authorization of UDF usage in hive
> --
>
> Key: HIVE-18841
> URL: https://issues.apache.org/jira/browse/HIVE-18841
> Project: Hive
>  Issue Type: New Feature
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
>Priority: Critical
> Attachments: HIVE-18841.1.patch, HIVE-18841.1.patch, 
> HIVE-18841.2.patch
>
>
> It should be possible to create authorization policies on UDF usage. 
> i.e., it should be possible to control who can use certain UDFs in their queries.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17687) CompactorMR.run() should update compaction_queue table for MM

2018-04-04 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425999#comment-16425999
 ] 

Eugene Koifman commented on HIVE-17687:
---

Currently, the compactor for MM will delete aborted dirs in the Worker and put the 
queue entry in READY FOR CLEANING state via TxnHandler.markCompacted(), but it 
should probably call TxnHandler.markCleaned() instead, since the Cleaner has nothing 
to do for MM.

It'd be useful but not critical.

> CompactorMR.run() should update compaction_queue table for MM
> -
>
> Key: HIVE-17687
> URL: https://issues.apache.org/jira/browse/HIVE-17687
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Minor
>
> For MM it deletes aborted dirs and bails. It should probably update 
> compaction_queue so that it's clear why it doesn't have a HadoopJobId, etc.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17687) CompactorMR.run() should update compaction_queue table for MM

2018-04-04 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425983#comment-16425983
 ] 

Sergey Shelukhin commented on HIVE-17687:
-

[~ekoifman] is this important? :)

> CompactorMR.run() should update compaction_queue table for MM
> -
>
> Key: HIVE-17687
> URL: https://issues.apache.org/jira/browse/HIVE-17687
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Minor
>
> For MM it deletes aborted dirs and bails. It should probably update 
> compaction_queue so that it's clear why it doesn't have a HadoopJobId, etc.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18972) beeline command suggestion to kill job deprecated

2018-04-04 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425984#comment-16425984
 ] 

Vihang Karajgaonkar commented on HIVE-18972:


+1

> beeline command suggestion to kill job deprecated
> -
>
> Key: HIVE-18972
> URL: https://issues.apache.org/jira/browse/HIVE-18972
> Project: Hive
>  Issue Type: Bug
>Reporter: Bharathkrishna Guruvayoor Murali
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Minor
> Attachments: HIVE-18972.1.patch
>
>
> When I run a beeline command that uses YARN:
> {code}
> INFO : The url to track the job: 
> http://vd0514.halxg.cloudera.com:8088/proxy/application_1488996234407_0010/
> INFO : Starting Job = job_1488996234407_0010, Tracking URL = 
> http://vd0514.halxg.cloudera.com:8088/proxy/application_1488996234407_0010/
> INFO : Kill Command = 
> /opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.9/lib/hadoop/bin/hadoop job 
> -kill job_1488996234407_0010
> {code}
> If I then try to kill the job using that command:
> {code}
> [systest@vd0514 ~]$ 
> /opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.9/lib/hadoop/bin/hadoop job 
> -kill job_1488996234407_0010
> DEPRECATED: Use of this script to execute mapred command is deprecated.
> Instead use the mapred command for it.
> {code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18840) CachedStore: Prioritize loading of recently accessed tables during prewarm

2018-04-04 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-18840:

Attachment: HIVE-18840.2.patch

> CachedStore: Prioritize loading of recently accessed tables during prewarm
> --
>
> Key: HIVE-18840
> URL: https://issues.apache.org/jira/browse/HIVE-18840
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
>Priority: Major
> Attachments: HIVE-18840.1.patch, HIVE-18840.2.patch, 
> HIVE-18840.2.patch, HIVE-18840.2.patch
>
>
> On clusters with large metadata, prewarming the cache can take several hours. 
> Now that CachedStore does not block on prewarm anymore (after HIVE-18264), we 
> should prioritize loading of recently accessed tables during prewarm.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19102) Vectorization: Suppress known Q file bugs

2018-04-04 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-19102:

Attachment: HIVE-19102.01.patch

> Vectorization: Suppress known Q file bugs
> -
>
> Key: HIVE-19102
> URL: https://issues.apache.org/jira/browse/HIVE-19102
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-19102.01.patch
>
>
> There are known bugs, recently found and reported, that occur when 
> vectorization is turned on in Q files.  Until those bugs are fixed, add SET 
> statements to the top of the Q files that suppress vectorization.
> Also change a few EXPLAIN VECTORIZATION statements to EXPLAIN where the former 
> isn't needed and creates an unnecessary Q file difference during enable-by-default experiments.
>  * +TestCliDriver+
>  ** +Execution Failures+
>  *** *input_lazyserde.q*
>   HIVE-19088
>  *** *input_lazyserde2.q*
>   HIVE-19088, too.
>  *** *nested_column_pruning.q*
>   HIVE-19016
>  *** *parquet_map_of_arrays_of_ints.q*
>   HIVE-19015
>  *** *parquet_map_of_maps.q*
>   HIVE-19015, too.
>  *** *parquet_nested_complex.q*
>   HIVE-19016
>  ** +Wrong Results+
>  *** *delete_orig_table.q*
>   HIVE-19109
>  *** *offset_limit_global_optimizer.q*
>   Added ORDER BY clauses to fix different vectorization intermediate 
> results due to LIMIT clause.
>  *** *parquet_ppd_decimal.q*
>   HIVE-19108
>  *** *udf_context_aware.q*
>   Test detects vectorization and output changes.
>  Added "set hive.test.vectorized.execution.enabled.override=none;"
>  *** *vector_udf3.q*
>   Test detects vectorization and output changes.
>  Added "set hive.test.vectorized.execution.enabled.override=none;"
>  * +TestContribCliDriver+
> ** +Wrong Results+
>  *** *udf_example_arraymapstruct.q*
>   HIVE-19110



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19102) Vectorization: Suppress known Q file bugs

2018-04-04 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-19102:

Attachment: (was: HIVE-19102.01.patch)

> Vectorization: Suppress known Q file bugs
> -
>
> Key: HIVE-19102
> URL: https://issues.apache.org/jira/browse/HIVE-19102
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
>
> There are known bugs, recently found and reported, that occur when 
> vectorization is turned on in Q files.  Until those bugs are fixed, add SET 
> statements to the top of the Q files that suppress vectorization.
> Also change a few EXPLAIN VECTORIZATION statements to EXPLAIN where the former 
> isn't needed and creates an unnecessary Q file difference during enable-by-default experiments.
>  * +TestCliDriver+
>  ** +Execution Failures+
>  *** *input_lazyserde.q*
>   HIVE-19088
>  *** *input_lazyserde2.q*
>   HIVE-19088, too.
>  *** *nested_column_pruning.q*
>   HIVE-19016
>  *** *parquet_map_of_arrays_of_ints.q*
>   HIVE-19015
>  *** *parquet_map_of_maps.q*
>   HIVE-19015, too.
>  *** *parquet_nested_complex.q*
>   HIVE-19016
>  ** +Wrong Results+
>  *** *delete_orig_table.q*
>   HIVE-19109
>  *** *offset_limit_global_optimizer.q*
>   Added ORDER BY clauses to fix different vectorization intermediate 
> results due to LIMIT clause.
>  *** *parquet_ppd_decimal.q*
>   HIVE-19108
>  *** *udf_context_aware.q*
>   Test detects vectorization and output changes.
>  Added "set hive.test.vectorized.execution.enabled.override=none;"
>  *** *vector_udf3.q*
>   Test detects vectorization and output changes.
>  Added "set hive.test.vectorized.execution.enabled.override=none;"
>  * +TestContribCliDriver+
> ** +Wrong Results+
>  *** *udf_example_arraymapstruct.q*
>   HIVE-19110



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19100) investigate TestStreaming failures

2018-04-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425948#comment-16425948
 ] 

Hive QA commented on HIVE-19100:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
1s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
46s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
 0s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
43s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 4s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
20s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
8s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
 7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
43s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
15s{color} | {color:red} hcatalog/streaming: The patch generated 1 new + 223 
unchanged - 1 fixed = 224 total (was 224) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
48s{color} | {color:green} ql: The patch generated 0 new + 250 unchanged - 1 
fixed = 250 total (was 251) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
20s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
15s{color} | {color:red} The patch generated 50 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 19m 56s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-9995/dev-support/hive-personality.sh
 |
| git revision | master / ee3724c |
| Default Java | 1.8.0_111 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9995/yetus/diff-checkstyle-hcatalog_streaming.txt
 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9995/yetus/patch-asflicense-problems.txt
 |
| modules | C: hcatalog/streaming ql U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9995/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> investigate TestStreaming failures
> --
>
> Key: HIVE-19100
> URL: https://issues.apache.org/jira/browse/HIVE-19100
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Attachments: HIVE-19100.01.patch, HIVE-19100.02.patch, 
> HIVE-19100.03.patch
>
>
> {noformat}
> [ERROR] Failures: 
> [ERROR]   
> TestStreaming.testInterleavedTransactionBatchCommits:1218->checkDataWritten2:619
>  expected:<11> but was:<12>
> [ERROR]   
> TestStreaming.testMultipleTransactionBatchCommits:1157->checkDataWritten2:619 
> expected:<1> but was:<3>
> [ERROR]   
> TestStreaming.testTransactionBatchAbortAndCommit:1138->checkDataWritten:566 
> expected:<1> but was:<2>
> [ERROR]   
> TestStreaming.testTransactionBatchCommit_Delimited:861->testTransactionBatchCommit_Delimited:881->checkDataWritten:566
>  expected:<1> but was:<3>
> [ERROR]   
> 

[jira] [Updated] (HIVE-19111) ACID: PPD & Split-pruning for txn_id filters

2018-04-04 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-19111:
---
Issue Type: Improvement  (was: Bug)

> ACID: PPD & Split-pruning for txn_id filters
> 
>
> Key: HIVE-19111
> URL: https://issues.apache.org/jira/browse/HIVE-19111
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Gopal V
>Priority: Major
>
> HIVE-18839 uses transaction id filtering to do incremental scans of a table 
> (for a "from snapshot to snapshot" range).
> This filter can be pushed down into the Split-generation phase to skip entire 
> files and directories.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-12192) Hive should carry out timestamp computations in UTC

2018-04-04 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425936#comment-16425936
 ] 

Jesus Camacho Rodriguez commented on HIVE-12192:


[~haozhun], the table is nice, thanks.

For 'Timestamp w local tz' and 'Timestamp w tz', I think you got the 
progression right.

However, 'timestamp' is slightly different.
The type is not really an instant, even before HIVE-12192. From a querying 
perspective its semantics are similar to SQL timestamp (i.e., LocalDateTime). But 
internally, e.g., during optimization or execution, it is represented 
differently depending on the time zone of your system (it uses the 
java.sql.Timestamp class to store the value).
As an example, if you store '1970-01-01 00:00:00' in PST, you will get 
'1970-01-01 00:00:00' if you read it from IST. However, the internal 
representation will be different on the writer HS2 and the reader HS2.
How does ORC fix this? By recording the system time zone for the timestamp type 
when the data is written, and then using that time zone together with the 
reader's time zone to compute the difference between the two (and hence apply 
the displacement).

The goal of this issue is to make internal representation independent from 
system time zone. This would fix the issue described above in addition to other 
issues derived from this representation when Hive interacts with other 
projects, e.g., Calcite.
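The system-time-zone dependence described above can be reproduced directly with java.sql.Timestamp. A minimal sketch, where TimeZone.setDefault stands in for two HS2 instances running in different zones:

```java
import java.sql.Timestamp;
import java.util.TimeZone;

public class TimestampZoneDemo {
    public static void main(String[] args) {
        // A writer HS2 running in PST stores the wall-clock value...
        TimeZone.setDefault(TimeZone.getTimeZone("America/Los_Angeles"));
        long writerMillis = Timestamp.valueOf("1970-01-01 00:00:00").getTime();

        // ...a reader HS2 running in IST parses the very same text...
        TimeZone.setDefault(TimeZone.getTimeZone("Asia/Kolkata"));
        long readerMillis = Timestamp.valueOf("1970-01-01 00:00:00").getTime();

        // ...but the internal epoch offsets differ by the zone displacement.
        System.out.println(writerMillis); // 28800000  (00:00 PST == 08:00 UTC)
        System.out.println(readerMillis); // -19800000 (00:00 IST == 18:30 UTC the day before)
    }
}
```

Both sessions display '1970-01-01 00:00:00', which is why the divergence only surfaces when the internal representation leaks out, e.g., through file formats or other engines.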


> Hive should carry out timestamp computations in UTC
> ---
>
> Key: HIVE-12192
> URL: https://issues.apache.org/jira/browse/HIVE-12192
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Ryan Blue
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: timestamp
> Attachments: HIVE-12192.patch
>
>
> Hive currently uses the "local" time of a java.sql.Timestamp to represent the 
> SQL data type TIMESTAMP WITHOUT TIME ZONE. The purpose is to be able to use 
> {{Timestamp#getYear()}} and similar methods to implement SQL functions like 
> {{year}}.
> When the SQL session's time zone is a DST zone, such as America/Los_Angeles 
> that alternates between PST and PDT, there are times that cannot be 
> represented because the effective zone skips them.
> {code}
> hive> select TIMESTAMP '2015-03-08 02:10:00.101';
> 2015-03-08 03:10:00.101
> {code}
> Using UTC instead of the SQL session time zone as the underlying zone for a 
> java.sql.Timestamp avoids this bug, while still returning correct values for 
> {{getYear}} etc. Using UTC as the convenience representation (timestamp 
> without time zone has no real zone) would make timestamp calculations more 
> consistent and avoid similar problems in the future.
> Notably, this would break the {{unix_timestamp}} UDF that specifies the 
> result is with respect to ["the default timezone and default 
> locale"|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions].
>  That function would need to be updated to use the 
> {{System.getProperty("user.timezone")}} zone.
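The skipped-time behaviour in the quoted example can be observed with java.time (used here purely to illustrate the DST gap; it is not what Hive used internally at the time):

```java
import java.time.*;

public class DstGapDemo {
    public static void main(String[] args) {
        // 2015-03-08 02:10 does not exist in America/Los_Angeles: the
        // spring-forward transition skips local times 02:00-03:00.
        LocalDateTime skipped = LocalDateTime.of(2015, 3, 8, 2, 10, 0, 101_000_000);

        // Resolving it in the DST zone shifts the value forward by the gap...
        ZonedDateTime inLa = skipped.atZone(ZoneId.of("America/Los_Angeles"));
        System.out.println(inLa.toLocalTime()); // 03:10:00.101

        // ...while UTC has no gaps, so every local value is representable.
        ZonedDateTime inUtc = skipped.atZone(ZoneOffset.UTC);
        System.out.println(inUtc.toLocalTime()); // 02:10:00.101
    }
}
```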





[jira] [Commented] (HIVE-18999) Filter operator does not work for List

2018-04-04 Thread Steve Yeom (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425921#comment-16425921
 ] 

Steve Yeom commented on HIVE-18999:
---

Hi [~ashutoshc] [~jcamachorodriguez], could you please review the patch? 
Please let me know if you need a RB. 

Thanks, 
Steve. 

> Filter operator does not work for List
> --
>
> Key: HIVE-18999
> URL: https://issues.apache.org/jira/browse/HIVE-18999
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 3.0.0
>Reporter: Steve Yeom
>Assignee: Steve Yeom
>Priority: Major
> Attachments: HIVE-18999.01.patch, HIVE-18999.02.patch, 
> HIVE-18999.03.patch
>
>
> {code:sql}
> create table table1(col0 int, col1 bigint, col2 string, col3 bigint, col4 
> bigint);
> insert into table1 values (1, 1, 'ccl',2014, 11);
> insert into table1 values (1, 1, 'ccl',2015, 11);
> insert into table1 values (1, 1, 'ccl',2014, 11);
> insert into table1 values (1, 1, 'ccl',2013, 11);
> -- INCORRECT
> SELECT COUNT(t1.col0) from table1 t1 where struct(col3, col4) in 
> (struct(2014,11));
> -- CORRECT
> SELECT COUNT(t1.col0) from table1 t1 where struct(col3, col4) in 
> (struct('2014','11'));
> {code}





[jira] [Commented] (HIVE-18999) Filter operator does not work for List

2018-04-04 Thread Steve Yeom (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425916#comment-16425916
 ] 

Steve Yeom commented on HIVE-18999:
---

The 4 failed tests with age 1 from the above p-test run pass in my laptop 
test environment. 

> Filter operator does not work for List
> --
>
> Key: HIVE-18999
> URL: https://issues.apache.org/jira/browse/HIVE-18999
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 3.0.0
>Reporter: Steve Yeom
>Assignee: Steve Yeom
>Priority: Major
> Attachments: HIVE-18999.01.patch, HIVE-18999.02.patch, 
> HIVE-18999.03.patch
>
>
> {code:sql}
> create table table1(col0 int, col1 bigint, col2 string, col3 bigint, col4 
> bigint);
> insert into table1 values (1, 1, 'ccl',2014, 11);
> insert into table1 values (1, 1, 'ccl',2015, 11);
> insert into table1 values (1, 1, 'ccl',2014, 11);
> insert into table1 values (1, 1, 'ccl',2013, 11);
> -- INCORRECT
> SELECT COUNT(t1.col0) from table1 t1 where struct(col3, col4) in 
> (struct(2014,11));
> -- CORRECT
> SELECT COUNT(t1.col0) from table1 t1 where struct(col3, col4) in 
> (struct('2014','11'));
> {code}





[jira] [Updated] (HIVE-19102) Vectorization: Suppress known Q file bugs

2018-04-04 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-19102:

Description: 
There are known bugs recently found and reported that occur when vectorization 
is turned on in Q files.  Until those bugs are fixed, add SET statements to the 
top of the Q files that suppress vectorization.

Change a few EXPLAIN VECTORIZATION to EXPLAIN where it isn't needed and creates 
an unnecessary Q file difference during enable by default experiments.

 * +TestCliDriver+

 ** +Execution Failures+
 *** *input_lazyserde.q*
  HIVE-19088
 *** *input_lazyserde2.q*
  HIVE-19088, too.
 *** *nested_column_pruning.q*
  HIVE-19016
 *** *parquet_map_of_arrays_of_ints.q*
  HIVE-19015
 *** *parquet_map_of_maps.q*
  HIVE-19015, too.
 *** *parquet_nested_complex.q*
  HIVE-19016

 ** +Wrong Results+
 *** *delete_orig_table.q*
  HIVE-19109
 *** *offset_limit_global_optimizer.q*
  Added ORDER BY clauses to fix different vectorization intermediate 
results due to LIMIT clause.
 *** *parquet_ppd_decimal.q*
  HIVE-19108
 *** *udf_context_aware.q*
  Test detects vectorization and output changes.
 Added "set hive.test.vectorized.execution.enabled.override=none;"
 *** *vector_udf3.q*
  Test detects vectorization and output changes.
 Added "set hive.test.vectorized.execution.enabled.override=none;"

 * +TestContribCliDriver+

** +Wrong Results+
 *** *udf_example_arraymapstruct.q*
  HIVE-19110



  was:
There are known bugs recently found and reported that occur when vectorization 
is turned on in Q files.  Until those bugs are fixed, add SET statements to the 
top of the Q files that suppress vectorization.

Change a few EXPLAIN VECTORIZATION to EXPLAIN where it isn't needed and creates 
an unnecessary Q file difference during enable by default experiments.

 * +Execution Failures+
 ** *input_lazyserde*
 *** HIVE-19088
 ** *input_lazyserde2*
 *** HIVE-19088, too.
 ** *nested_column_pruning*
 *** HIVE-19016
 ** *parquet_map_of_arrays_of_ints*
 *** HIVE-19015
 ** *parquet_map_of_maps*
 *** HIVE-19015, too.
 ** *parquet_nested_complex*
 *** HIVE-19016

 * +Wrong Results+
 ** *udf_context_aware*
 *** Test detects vectorization and output changes.
 Added "set hive.test.vectorized.execution.enabled.override=none;"


> Vectorization: Suppress known Q file bugs
> -
>
> Key: HIVE-19102
> URL: https://issues.apache.org/jira/browse/HIVE-19102
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-19102.01.patch
>
>
> There are known bugs recently found and reported that occur when 
> vectorization is turned on in Q files.  Until those bugs are fixed, add SET 
> statements to the top of the Q files that suppress vectorization.
> Change a few EXPLAIN VECTORIZATION to EXPLAIN where it isn't needed and 
> creates an unnecessary Q file difference during enable by default experiments.
>  * +TestCliDriver+
>  ** +Execution Failures+
>  *** *input_lazyserde.q*
>   HIVE-19088
>  *** *input_lazyserde2.q*
>   HIVE-19088, too.
>  *** *nested_column_pruning.q*
>   HIVE-19016
>  *** *parquet_map_of_arrays_of_ints.q*
>   HIVE-19015
>  *** *parquet_map_of_maps.q*
>   HIVE-19015, too.
>  *** *parquet_nested_complex.q*
>   HIVE-19016
>  ** +Wrong Results+
>  *** *delete_orig_table.q*
>   HIVE-19109
>  *** *offset_limit_global_optimizer.q*
>   Added ORDER BY clauses to fix different vectorization intermediate 
> results due to LIMIT clause.
>  *** *parquet_ppd_decimal.q*
>   HIVE-19108
>  *** *udf_context_aware.q*
>   Test detects vectorization and output changes.
>  Added "set hive.test.vectorized.execution.enabled.override=none;"
>  *** *vector_udf3.q*
>   Test detects vectorization and output changes.
>  Added "set hive.test.vectorized.execution.enabled.override=none;"
>  * +TestContribCliDriver+
> ** +Wrong Results+
>  *** *udf_example_arraymapstruct.q*
>   HIVE-19110





[jira] [Commented] (HIVE-18976) Add ability to setup Druid Kafka Ingestion from Hive

2018-04-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425903#comment-16425903
 ] 

Hive QA commented on HIVE-18976:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12917408/HIVE-18976.04.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 108 failed/errored test(s), 13182 tests 
executed
*Failed tests:*
{noformat}
TestBeeLineDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=252)
TestCopyUtils - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestDbNotificationListener - did not produce a TEST-*.xml file (likely timed 
out) (batchId=246)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=252)
TestExportImport - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestHCatHiveCompatibility - did not produce a TEST-*.xml file (likely timed 
out) (batchId=246)
TestMiniDruidCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=252)
TestMiniDruidKafkaCliDriver - did not produce a TEST-*.xml file (likely timed 
out) (batchId=252)
TestNegativeCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=95)


[jira] [Commented] (HIVE-12192) Hive should carry out timestamp computations in UTC

2018-04-04 Thread Haozhun Jin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425899#comment-16425899
 ] 

Haozhun Jin commented on HIVE-12192:


[~jcamachorodriguez], I work with [~findepi]. We are trying to understand the 
current status and vision of the Hive project regarding its timestamp types. 
Thank you for helping us.

Below I summarize my understanding after reading HIVE-12192 and HIVE-16614. The 
table describes what each type means using the semantically equivalent type in 
java.time. (Instant here is a bit more general in that you are allowed to get 
year/month/day/hour/minute fields from it. But this doesn't change its 
fundamental meaning.)
||Hive Type||Legacy Hive||Before HIVE-16614||After HIVE-16614||After 
HIVE-12192||Eventually||
|Timestamp|Instant|Instant|Instant|LocalDateTime|LocalDateTime|
|Timestamp w local tz|(not present)|(not present)|Instant|Instant|Instant|
|Timestamp w tz|(not present)|Instant|(not present)|(not present)|ZonedDateTime|

Is this understanding correct?

If my understanding is correct, what is the difference between "Timestamp" and 
"Timestamp w local tz" before HIVE-12192 (except maybe the constructor)?
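The java.time analogy in the table above can be made concrete. A sketch illustrating the three semantics the table distinguishes (illustrative only, not Hive code):

```java
import java.time.*;

public class HiveTimestampSemanticsDemo {
    public static void main(String[] args) {
        // TIMESTAMP after HIVE-12192: a zone-less wall-clock value.
        LocalDateTime timestamp = LocalDateTime.of(1970, 1, 1, 0, 0);

        // TIMESTAMP WITH LOCAL TIME ZONE: a fixed instant on the UTC
        // timeline, rendered in each reader's session zone.
        Instant withLocalTz =
            timestamp.atZone(ZoneId.of("America/Los_Angeles")).toInstant();
        System.out.println(withLocalTz); // 1970-01-01T08:00:00Z

        // The same instant displays differently per session zone:
        System.out.println(
            withLocalTz.atZone(ZoneId.of("Asia/Kolkata")).toLocalDateTime());
        // 1970-01-01T13:30

        // TIMESTAMP WITH TIME ZONE (the "Eventually" column): an instant
        // paired with an explicit, stored zone.
        ZonedDateTime withTz = withLocalTz.atZone(ZoneId.of("Asia/Kolkata"));
        System.out.println(withTz.getZone()); // Asia/Kolkata
    }
}
```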

> Hive should carry out timestamp computations in UTC
> ---
>
> Key: HIVE-12192
> URL: https://issues.apache.org/jira/browse/HIVE-12192
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Ryan Blue
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: timestamp
> Attachments: HIVE-12192.patch
>
>
> Hive currently uses the "local" time of a java.sql.Timestamp to represent the 
> SQL data type TIMESTAMP WITHOUT TIME ZONE. The purpose is to be able to use 
> {{Timestamp#getYear()}} and similar methods to implement SQL functions like 
> {{year}}.
> When the SQL session's time zone is a DST zone, such as America/Los_Angeles 
> that alternates between PST and PDT, there are times that cannot be 
> represented because the effective zone skips them.
> {code}
> hive> select TIMESTAMP '2015-03-08 02:10:00.101';
> 2015-03-08 03:10:00.101
> {code}
> Using UTC instead of the SQL session time zone as the underlying zone for a 
> java.sql.Timestamp avoids this bug, while still returning correct values for 
> {{getYear}} etc. Using UTC as the convenience representation (timestamp 
> without time zone has no real zone) would make timestamp calculations more 
> consistent and avoid similar problems in the future.
> Notably, this would break the {{unix_timestamp}} UDF that specifies the 
> result is with respect to ["the default timezone and default 
> locale"|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions].
>  That function would need to be updated to use the 
> {{System.getProperty("user.timezone")}} zone.





[jira] [Updated] (HIVE-19109) Vectorization: Enabling vectorization causes TestCliDriver delete_orig_table.q to produce Wrong Results

2018-04-04 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-19109:

Summary: Vectorization: Enabling vectorization causes TestCliDriver 
delete_orig_table.q to produce Wrong Results  (was: Vectorization: Enabling 
vectorization causes delete_orig_table to produce Wrong Results)

> Vectorization: Enabling vectorization causes TestCliDriver 
> delete_orig_table.q to produce Wrong Results
> ---
>
> Key: HIVE-19109
> URL: https://issues.apache.org/jira/browse/HIVE-19109
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Matt McCline
>Priority: Critical
>
> Found in the vectorization enable-by-default experiment.





[jira] [Commented] (HIVE-18976) Add ability to setup Druid Kafka Ingestion from Hive

2018-04-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425875#comment-16425875
 ] 

Hive QA commented on HIVE-18976:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
45s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
 2s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
43s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  4m 
 5s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  9m  
7s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
8s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 11m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  8m 
45s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  2m  
0s{color} | {color:red} root: The patch generated 113 new + 676 unchanged - 45 
fixed = 789 total (was 721) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
12s{color} | {color:red} druid-handler: The patch generated 103 new + 123 
unchanged - 43 fixed = 226 total (was 166) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
15s{color} | {color:red} itests/qtest: The patch generated 3 new + 0 unchanged 
- 0 fixed = 3 total (was 0) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
10s{color} | {color:red} itests/qtest-druid: The patch generated 3 new + 4 
unchanged - 1 fixed = 7 total (was 5) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
16s{color} | {color:red} itests/util: The patch generated 4 new + 119 unchanged 
- 1 fixed = 123 total (was 120) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  9m  
3s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
15s{color} | {color:red} The patch generated 50 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 65m 13s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  
xml  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-9993/dev-support/hive-personality.sh
 |
| git revision | master / ee3724c |
| Default Java | 1.8.0_111 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9993/yetus/diff-checkstyle-root.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9993/yetus/diff-checkstyle-druid-handler.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9993/yetus/diff-checkstyle-itests_qtest.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9993/yetus/diff-checkstyle-itests_qtest-druid.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9993/yetus/diff-checkstyle-itests_util.txt
 |
| whitespace | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9993/yetus/whitespace-eol.txt 
|
| asflicense | 

[jira] [Commented] (HIVE-14044) Newlines in Avro maps cause external table to return corrupt values

2018-04-04 Thread Jelmer Kuperus (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425849#comment-16425849
 ] 

Jelmer Kuperus commented on HIVE-14044:
---

[~Sh4pe] I think that's only for LazySimpleSerDe

If I look at this code, it declares the SERIALIZATION_ESCAPE_CRLF property

[https://github.com/apache/hive/blob/6d890faf22fd1ede3658a5eed097476eab3c67e9/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazySimpleSerDe.java#L69]

But the avro one doesn't

[https://github.com/apache/hive/blob/6d890faf22fd1ede3658a5eed097476eab3c67e9/serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerDe.java#L46]

Specifying it on the table does absolutely nothing for me on CDH-5.9.2
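For reference, the LazySimpleSerDe-style behaviour that property enables amounts to escaping line terminators before a text row is written. An illustrative sketch (not the SerDe's actual code):

```java
public class CrlfEscapeSketch {
    // With serialization.escape.crlf enabled, LazySimpleSerDe writes \r and
    // \n inside field values as the two-character sequences "\\r" and "\\n",
    // so a row never spans multiple physical lines. AvroSerDe declares no
    // such property, which is why a newline inside the map value breaks row
    // alignment downstream.
    static String escapeCrlf(String value) {
        return value.replace("\\", "\\\\")
                    .replace("\r", "\\r")
                    .replace("\n", "\\n");
    }

    public static void main(String[] args) {
        // Prints the value on a single physical line: value1\nafter newline
        System.out.println(escapeCrlf("value1\nafter newline"));
    }
}
```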

> Newlines in Avro maps cause external table to return corrupt values
> ---
>
> Key: HIVE-14044
> URL: https://issues.apache.org/jira/browse/HIVE-14044
> Project: Hive
>  Issue Type: Bug
> Environment: Hive version: 1.1.0-cdh5.5.1 (bundled with cloudera 
> 5.5.1)
>Reporter: David Nies
>Assignee: Sahil Takiar
>Priority: Critical
> Attachments: test.json, test.schema
>
>
> When {{\n}} characters are contained in Avro files that are used as data 
> bases for an external table, the result of {{SELECT}} queries may be corrupt. 
> I encountered this error when querying hive both from {{beeline}} and from 
> JDBC.
> h3. Steps to reproduce (used files are attached to ticket)
> # Create an {{.avro}} file that contains newline characters in a value of a 
> map:
> {code}
> avro-tools fromjson --schema-file test.schema test.json > test.avro
> {code}
> # Copy {{.avro}} file to HDFS
> {code}
> hdfs dfs -copyFromLocal test.avro /some/location/
> {code}
> # Create an external table in beeline containing this {{.avro}}:
> {code}
> beeline> CREATE EXTERNAL TABLE broken_newline_map
> ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
> STORED AS
> INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
> OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
> LOCATION '/some/location/'
> TBLPROPERTIES ('avro.schema.literal'='
> {
>   "type" : "record",
>   "name" : "myEntry",
>   "namespace" : "myNamespace",
>   "fields" : [ {
> "name" : "foo",
> "type" : "long"
>   }, {
> "name" : "bar",
> "type" : {
>   "type" : "map",
>   "values" : "string"
> }
>   } ]
> }
> ');
> {code}
> # Now, selecting may return corrupt results:
> {code}
> jdbc:hive2://my-server:1/> select * from broken_newline_map;
> +-+---+--+
> | broken_newline_map.foo  |  broken_newline_map.bar   
> |
> +-+---+--+
> | 1   | {"key2":"value2","key1":"value1\nafter newline"}  
> |
> | 2   | {"key2":"new value2","key1":"new value"}  
> |
> +-+---+--+
> 2 rows selected (1.661 seconds)
> jdbc:hive2://my-server:1/> select foo, map_keys(bar), map_values(bar) 
> from broken_newline_map;
> +---+--+-+--+
> |  foo  |   _c1| _c2 |
> +---+--+-+--+
> | 1 | ["key2","key1"]  | ["value2","value1"] |
> | NULL  | NULL | NULL|
> | 2 | ["key2","key1"]  | ["new value2","new value"]  |
> +---+--+-+--+
> 3 rows selected (28.05 seconds)
> {code}
> Obviously, the last result set contains corrupt entries (line 2) and 
> incorrect entries (line 1). I also encountered this when doing this query 
> with JDBC. 





[jira] [Commented] (HIVE-19108) Vectorization and Parquet: Turning on vectorization in parquet_ppd_decimal.q causes Wrong Query Results

2018-04-04 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425840#comment-16425840
 ] 

Vihang Karajgaonkar commented on HIVE-19108:


I see .. let me take a look.

> Vectorization and Parquet: Turning on vectorization in parquet_ppd_decimal.q 
> causes Wrong Query Results
> ---
>
> Key: HIVE-19108
> URL: https://issues.apache.org/jira/browse/HIVE-19108
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Matt McCline
>Priority: Critical
>
> Found in the vectorization enable-by-default experiment.





[jira] [Commented] (HIVE-19108) Vectorization and Parquet: Turning on vectorization in parquet_ppd_decimal.q causes Wrong Query Results

2018-04-04 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425831#comment-16425831
 ] 

Matt McCline commented on HIVE-19108:
-

I think it is just a matter of adding "set hive.vectorized.execution.enabled=true;" to the top of 
the Q file and running TestCliDriver.

> Vectorization and Parquet: Turning on vectorization in parquet_ppd_decimal.q 
> causes Wrong Query Results
> ---
>
> Key: HIVE-19108
> URL: https://issues.apache.org/jira/browse/HIVE-19108
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Matt McCline
>Priority: Critical
>
> Found in the vectorization enable-by-default experiment.





[jira] [Commented] (HIVE-19108) Vectorization and Parquet: Turning on vectorization in parquet_ppd_decimal.q causes Wrong Query Results

2018-04-04 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425828#comment-16425828
 ] 

Vihang Karajgaonkar commented on HIVE-19108:


Hi [~mmccline] Thank you for reporting vectorization issues with Parquet. How 
do I reproduce these problems?

> Vectorization and Parquet: Turning on vectorization in parquet_ppd_decimal.q 
> causes Wrong Query Results
> ---
>
> Key: HIVE-19108
> URL: https://issues.apache.org/jira/browse/HIVE-19108
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Matt McCline
>Priority: Critical
>
> Found in the vectorization enable-by-default experiment.





[jira] [Updated] (HIVE-19102) Vectorization: Suppress known Q file bugs

2018-04-04 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-19102:

Description: 
There are known bugs recently found and reported that occur when vectorization 
is turned on in Q files.  Until those bugs are fixed, add SET statements to the 
top of the Q files that suppress vectorization.

Change a few EXPLAIN VECTORIZATION to EXPLAIN where it isn't needed and creates 
an unnecessary Q file difference during enable by default experiments.

 * +Execution Failures+
 ** *input_lazyserde*
 *** HIVE-19088
 ** *input_lazyserde2*
 *** HIVE-19088, too.
 ** *nested_column_pruning*
 *** HIVE-19016
 ** *parquet_map_of_arrays_of_ints*
 *** HIVE-19015
 ** *parquet_map_of_maps*
 *** HIVE-19015, too.
 ** *parquet_nested_complex*
 *** HIVE-19016

 * +Wrong Results+
 ** *udf_context_aware*
 *** Test detects vectorization and output changes.
 Added "set hive.test.vectorized.execution.enabled.override=none;"

  was:
There are known bugs recently found and reported that occur when vectorization 
is turned on in Q files.  Until those bugs are fixed, add SET statements to the 
top of the Q files that suppress vectorization.

Change a few EXPLAIN VECTORIZATION to EXPLAIN where it isn't needed and creates 
an unnecessary Q file difference during enable by default experiments.

 
 * +Execution Failures+
 ** *input_lazyserde*
 *** HIVE-19088
 ** *input_lazyserde2*
 *** HIVE-19088, too.
 ** *nested_column_pruning*
 *** HIVE-19016
 ** *parquet_map_of_arrays_of_ints*
 *** HIVE-19015
 ** *parquet_map_of_maps*
 *** HIVE-19015, too.
 ** *parquet_nested_complex*
 *** HIVE-19016


> Vectorization: Suppress known Q file bugs
> -
>
> Key: HIVE-19102
> URL: https://issues.apache.org/jira/browse/HIVE-19102
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-19102.01.patch
>
>
> There are known bugs recently found and reported that occur when 
> vectorization is turned on in Q files.  Until those bugs are fixed, add SET 
> statements to the top of the Q files that suppress vectorization.
> Change a few EXPLAIN VECTORIZATION to EXPLAIN where it isn't needed and 
> creates an unnecessary Q file difference during enable by default experiments.
>  * +Execution Failures+
>  ** *input_lazyserde*
>  *** HIVE-19088
>  ** *input_lazyserde2*
>  *** HIVE-19088, too.
>  ** *nested_column_pruning*
>  *** HIVE-19016
>  ** *parquet_map_of_arrays_of_ints*
>  *** HIVE-19015
>  ** *parquet_map_of_maps*
>  *** HIVE-19015, too.
>  ** *parquet_nested_complex*
>  *** HIVE-19016
>  * +Wrong Results+
>  ** *udf_context_aware*
>  *** Test detects vectorization and output changes.
>  Added "set hive.test.vectorized.execution.enabled.override=none;"





[jira] [Updated] (HIVE-18976) Add ability to setup Druid Kafka Ingestion from Hive

2018-04-04 Thread Nishant Bangarwa (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-18976:

Attachment: HIVE-18976.05.patch

> Add ability to setup Druid Kafka Ingestion from Hive
> 
>
> Key: HIVE-18976
> URL: https://issues.apache.org/jira/browse/HIVE-18976
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-18976.03.patch, HIVE-18976.04.patch, 
> HIVE-18976.05.patch, HIVE-18976.patch
>
>
> Add Ability to setup druid kafka Ingestion using Hive CREATE TABLE statement
> e.g., the below query can submit a Kafka supervisor spec to the Druid overlord so 
> that Druid can start ingesting events from Kafka. 
> {code:java}
>  
> CREATE TABLE druid_kafka_test(`__time` timestamp, page string, language 
> string, `user` string, added int, deleted int, delta int)
> STORED BY 
> 'org.apache.hadoop.hive.druid.DruidKafkaStreamingStorageHandler'
> TBLPROPERTIES (
> "druid.segment.granularity" = "HOUR",
> "druid.query.granularity" = "MINUTE",
> "kafka.bootstrap.servers" = "localhost:9092",
> "kafka.topic" = "test-topic",
> "druid.kafka.ingest.useEarliestOffset" = "true"
> );
> {code}
> Design - This can be done via a DruidKafkaStreamingStorageHandler that 
> extends existing DruidStorageHandler and adds the additional functionality 
> for Streaming. 
> Testing - Add a DruidKafkaMiniCluster which will consist of DruidMiniCluster 
> + Single Node Kafka Broker. The broker can be populated with a test topic 
> that has some predefined data. 





[jira] [Commented] (HIVE-18783) ALTER TABLE post-commit listener does not include the transactional listener responses

2018-04-04 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-18783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425777#comment-16425777
 ] 

Sergio Peña commented on HIVE-18783:


Thanks [~vihangk1]. I made the change and submitted another patch for testing 
and review. Btw, I did find an issue with the new catalog API; could you just 
double-check that getPartitions() is correctly called with the catName 
parameter?

> ALTER TABLE post-commit listener does not include the transactional listener 
> responses 
> ---
>
> Key: HIVE-18783
> URL: https://issues.apache.org/jira/browse/HIVE-18783
> Project: Hive
>  Issue Type: Bug
>Reporter: Na Li
>Assignee: Sergio Peña
>Priority: Major
> Attachments: HIVE-18783.1.patch, HIVE-18783.2.patch, 
> HIVE-18783.3.patch
>
>
> In HiveMetaStore, alter_table_core does NOT call the transactional listeners, and 
> the notification ID corresponding to the alter table event is NOT set in the 
> event parameters.
> {code}
> + alter_table_core
>   
>   try {
> Table oldt = this.get_table_core(dbname, name);
> this.firePreEvent(new PreAlterTableEvent(oldt, newTable, this));
> this.alterHandler.alterTable(this.getMS(), this.wh, dbname, name, 
> newTable, envContext, this);
> success = true;
> if (!this.listeners.isEmpty()) {
>   MetaStoreListenerNotifier.notifyEvent(this.listeners, 
> EventType.ALTER_TABLE, new AlterTableEvent(oldt, newTable, true, this), 
> envContext);
> }
>   } catch (NoSuchObjectException var12) {
> ex = var12;
> throw new InvalidOperationException(var12.getMessage());
>   } catch (Exception var13) {
> ex = var13;
> if (var13 instanceof MetaException) {
>   throw (MetaException)var13;
> }
> if (var13 instanceof InvalidOperationException) {
>   throw (InvalidOperationException)var13;
> }
> {code}
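A plausible shape for the fix is to notify the transactional listeners first and fold their responses (e.g., the notification event id) into the parameters handed to the post-commit listeners. The following is a self-contained toy sketch of that flow; the method names, the parameter key, and the value "42" are illustrative stand-ins, not Hive's actual API:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class ListenerNotifyDemo {
    // Stand-in for notifying the transactional listeners inside the metastore
    // transaction; a real DbNotificationListener would return the persisted
    // notification event id here.
    static Map<String, String> notifyTransactional() {
        Map<String, String> response = new HashMap<>();
        response.put("DB_NOTIFICATION_EVENT_ID_KEY_NAME", "42");
        return response;
    }

    // Merge the transactional-listener responses into the parameters passed
    // to the post-commit (non-transactional) listeners, so they can see the
    // notification id of the ALTER_TABLE event.
    static Map<String, String> buildPostCommitParams(Map<String, String> txnResponses,
                                                     Map<String, String> envContext) {
        Map<String, String> params = new HashMap<>(envContext);
        params.putAll(txnResponses);
        return params;
    }

    public static void main(String[] args) {
        Map<String, String> params =
            buildPostCommitParams(notifyTransactional(), Collections.emptyMap());
        System.out.println(params.get("DB_NOTIFICATION_EVENT_ID_KEY_NAME")); // 42
    }
}
```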




