[jira] [Commented] (HIVE-20817) Reading Timestamp datatype via HiveServer2 gives errors

2018-11-05 Thread Sankar Hariappan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16674793#comment-16674793
 ] 

Sankar Hariappan commented on HIVE-20817:
-

02.patch is committed to master!

Thanks [~maheshk114] and [~thejas]!
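
For anyone landing on this thread: the quoted failure below is a ClassCastException between Hive's internal org.apache.hadoop.hive.common.type.Timestamp and java.sql.Timestamp when HiveServer2 builds the result row set. The committed 02.patch is the authoritative fix; the snippet below is only a minimal, self-contained sketch of the conversion boundary, assuming a string round-trip (the helper name is made up for illustration).

{code:java}
// Illustration only, not the committed patch: the point is that the value has to
// become a java.sql.Timestamp before it crosses the JDBC/Thrift result-set layer.
import java.sql.Timestamp;

public class TimestampBoundarySketch {
    // Hypothetical helper: parse a timestamp rendered in the
    // "yyyy-MM-dd HH:mm:ss[.fffffffff]" form used by the sample data below.
    static Timestamp toSqlTimestamp(String rendered) {
        return Timestamp.valueOf(rendered);
    }

    public static void main(String[] args) {
        Timestamp ts = toSqlTimestamp("1980-12-17 17:07:29.234234");
        System.out.println(ts + " nanos=" + ts.getNanos());
    }
}
{code}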

> Reading Timestamp datatype via HiveServer2 gives errors
> ---
>
> Key: HIVE-20817
> URL: https://issues.apache.org/jira/browse/HIVE-20817
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20817.01.patch, HIVE-20817.02.patch
>
>
> CREATE TABLE JdbcBasicRead ( empno int, desg string,empname string,doj 
> timestamp,Salary float,mgrid smallint, deptno tinyint ) ROW FORMAT DELIMITED 
> FIELDS TERMINATED BY ',';
> LOAD DATA LOCAL INPATH '/tmp/art_jdbc/hive/input/input_7columns.txt' 
> OVERWRITE INTO TABLE JdbcBasicRead;
> Sample Data.
> —
> 7369,M,SMITH,1980-12-17 17:07:29.234234,5000.00,7902,20
> 7499,X,ALLEN,1981-02-20 17:07:29.234234,1250.00,7698,30
> 7521,X,WARD,1981-02-22 17:07:29.234234,01600.57,7698,40
> 7566,M,JONES,1981-04-02 17:07:29.234234,02975.65,7839,10
> 7654,X,MARTIN,1981-09-28 17:07:29.234234,01250.00,7698,20
> 7698,M,BLAKE,1981-05-01 17:07:29.234234,2850.98,7839,30
> 7782,M,CLARK,1981-06-09 17:07:29.234234,02450.00,7839,20
> —
> Select statement: SELECT empno, desg, empname, doj, salary, mgrid, deptno 
> FROM JdbcBasicWrite
> {code}
> 2018-09-25T07:11:03,222 WARN [HiveServer2-Handler-Pool: Thread-83]: 
> thrift.ThriftCLIService (:()) - Error fetching results:
> org.apache.hive.service.cli.HiveSQLException: java.lang.ClassCastException: 
> org.apache.hadoop.hive.common.type.Timestamp cannot be cast to 
> java.sql.Timestamp
> at 
> org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:469)
>  ~[hive-service-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
> at 
> org.apache.hive.service.cli.operation.OperationManager.getOperationNextRowSet(OperationManager.java:328)
>  ~[hive-service-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
> at 
> org.apache.hive.service.cli.session.HiveSessionImpl.fetchResults(HiveSessionImpl.java:910)
>  ~[hive-service-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
> at sun.reflect.GeneratedMethodAccessor50.invoke(Unknown Source) ~[?:?]
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_112]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_112]
> at 
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
>  ~[hive-service-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
> at 
> org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
>  ~[hive-service-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
> at 
> org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
>  ~[hive-service-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
> at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_112]
> at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_112]
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>  ~[hadoop-common-3.1.1.3.0.1.0-187.jar:?]
> at 
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
>  ~[hive-service-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
> at com.sun.proxy.$Proxy46.fetchResults(Unknown Source) ~[?:?]
> at org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:564) 
> ~[hive-service-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
> at 
> org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:786)
>  ~[hive-service-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
> at 
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1837)
>  ~[hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
> at 
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1822)
>  ~[hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) 
> ~[hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) 
> ~[hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
> at 
> org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
>  ~[hive-service-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
> at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>  ~[hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
> at 
> java.util.concurrent.ThreadPoolExe

[jira] [Updated] (HIVE-20817) Reading Timestamp datatype via HiveServer2 gives errors

2018-11-05 Thread Sankar Hariappan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-20817:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Reading Timestamp datatype via HiveServer2 gives errors
> ---
>
> Key: HIVE-20817
> URL: https://issues.apache.org/jira/browse/HIVE-20817
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20817.01.patch, HIVE-20817.02.patch
>
>
> CREATE TABLE JdbcBasicRead ( empno int, desg string,empname string,doj 
> timestamp,Salary float,mgrid smallint, deptno tinyint ) ROW FORMAT DELIMITED 
> FIELDS TERMINATED BY ',';
> LOAD DATA LOCAL INPATH '/tmp/art_jdbc/hive/input/input_7columns.txt' 
> OVERWRITE INTO TABLE JdbcBasicRead;
> Sample Data.
> —
> 7369,M,SMITH,1980-12-17 17:07:29.234234,5000.00,7902,20
> 7499,X,ALLEN,1981-02-20 17:07:29.234234,1250.00,7698,30
> 7521,X,WARD,1981-02-22 17:07:29.234234,01600.57,7698,40
> 7566,M,JONES,1981-04-02 17:07:29.234234,02975.65,7839,10
> 7654,X,MARTIN,1981-09-28 17:07:29.234234,01250.00,7698,20
> 7698,M,BLAKE,1981-05-01 17:07:29.234234,2850.98,7839,30
> 7782,M,CLARK,1981-06-09 17:07:29.234234,02450.00,7839,20
> —
> Select statement: SELECT empno, desg, empname, doj, salary, mgrid, deptno 
> FROM JdbcBasicWrite
> {code}
> 2018-09-25T07:11:03,222 WARN [HiveServer2-Handler-Pool: Thread-83]: 
> thrift.ThriftCLIService (:()) - Error fetching results:
> org.apache.hive.service.cli.HiveSQLException: java.lang.ClassCastException: 
> org.apache.hadoop.hive.common.type.Timestamp cannot be cast to 
> java.sql.Timestamp
> at 
> org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:469)
>  ~[hive-service-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
> at 
> org.apache.hive.service.cli.operation.OperationManager.getOperationNextRowSet(OperationManager.java:328)
>  ~[hive-service-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
> at 
> org.apache.hive.service.cli.session.HiveSessionImpl.fetchResults(HiveSessionImpl.java:910)
>  ~[hive-service-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
> at sun.reflect.GeneratedMethodAccessor50.invoke(Unknown Source) ~[?:?]
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_112]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_112]
> at 
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
>  ~[hive-service-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
> at 
> org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
>  ~[hive-service-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
> at 
> org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
>  ~[hive-service-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
> at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_112]
> at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_112]
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>  ~[hadoop-common-3.1.1.3.0.1.0-187.jar:?]
> at 
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
>  ~[hive-service-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
> at com.sun.proxy.$Proxy46.fetchResults(Unknown Source) ~[?:?]
> at org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:564) 
> ~[hive-service-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
> at 
> org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:786)
>  ~[hive-service-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
> at 
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1837)
>  ~[hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
> at 
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1822)
>  ~[hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) 
> ~[hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) 
> ~[hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
> at 
> org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
>  ~[hive-service-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
> at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>  ~[hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  ~[?:1.8.

[jira] [Updated] (HIVE-20805) Hive does not copy source data when importing as non-hive user

2018-11-05 Thread Sankar Hariappan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-20805:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Hive does not copy source data when importing as non-hive user 
> ---
>
> Key: HIVE-20805
> URL: https://issues.apache.org/jira/browse/HIVE-20805
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, HiveServer2
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20805.03.patch, HIVE-20805.1.patch, 
> HIVE-20805.2.patch
>
>
> While loading data into a managed table from a user-given path, Hive uses a move 
> operation to transfer the data from the user location to the table location. If the 
> move cannot be used due to a permission issue, a mismatched encryption zone, etc., 
> Hive copies the files and then deletes them from the source location to keep the 
> behavior the same. But if the user does not have write access to the source location, 
> the delete fails with a file permission exception and the load operation fails.
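
To make the description above concrete, here is a minimal, hypothetical sketch of the intended shape of the fix (plain java.nio rather than Hive's actual HDFS/FileUtils code): prefer a move, fall back to copy, and treat a failed source delete as a warning instead of failing the load.

{code:java}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class LoadDataFallbackSketch {
    static void moveOrCopy(Path src, Path dst) throws IOException {
        try {
            // Preferred: a cheap rename/move into the table location.
            Files.move(src, dst, StandardCopyOption.REPLACE_EXISTING);
            return;
        } catch (IOException moveFailed) {
            // e.g. permission issue or mismatched encryption zone: fall back to copy.
            Files.copy(src, dst, StandardCopyOption.REPLACE_EXISTING);
        }
        try {
            // Best-effort cleanup to mimic move semantics; if the user cannot
            // write to the source location this fails, and the load should
            // still succeed rather than throw.
            Files.deleteIfExists(src);
        } catch (IOException deleteFailed) {
            System.err.println("WARN: could not delete source " + src + ": " + deleteFailed);
        }
    }
}
{code}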



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20805) Hive does not copy source data when importing as non-hive user

2018-11-05 Thread Sankar Hariappan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16674798#comment-16674798
 ] 

Sankar Hariappan commented on HIVE-20805:
-

03.patch committed to master!

Thanks [~maheshk114] and [~thejas]!

> Hive does not copy source data when importing as non-hive user 
> ---
>
> Key: HIVE-20805
> URL: https://issues.apache.org/jira/browse/HIVE-20805
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, HiveServer2
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20805.03.patch, HIVE-20805.1.patch, 
> HIVE-20805.2.patch
>
>
> While loading data into a managed table from a user-given path, Hive uses a move 
> operation to transfer the data from the user location to the table location. If the 
> move cannot be used due to a permission issue, a mismatched encryption zone, etc., 
> Hive copies the files and then deletes them from the source location to keep the 
> behavior the same. But if the user does not have write access to the source location, 
> the delete fails with a file permission exception and the load operation fails.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-20818) Views created with a WHERE subquery will regard views referenced in the subquery as direct input

2018-11-05 Thread Karen Coppage (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karen Coppage reassigned HIVE-20818:


Assignee: Karen Coppage

> Views created with a WHERE subquery will regard views referenced in the 
> subquery as direct input
> 
>
> Key: HIVE-20818
> URL: https://issues.apache.org/jira/browse/HIVE-20818
> Project: Hive
>  Issue Type: Bug
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>
> If Hive is configured with an authorization hook like Sentry, and a view is 
> created with a WHERE clause referencing a different view (view') that the user 
> has no access to, the user cannot access the new view because view' is 
> considered a direct input.
> For example:
> {code:java}
> create database db1;
> create database db2;
> create database db3;
>  
> create table db1.table1 (cola string, colb string, colc string);
> insert into db1.table1 values ('a','b','c');
> insert into db1.table1 values ('x','y','z');
> CREATE VIEW db2.view1 AS SELECT cola, colb, colc FROM db1.table1 WHERE 
> cola="x"; 
> CREATE VIEW db2.view2 AS SELECT table1.cola, table1.colb, table1.colc FROM 
> db1.table1 WHERE table1.cola NOT IN (SELECT view1.cola FROM db2.view1); 
> create view db3.view3 as select * from db2.view2
> {code}
>  If test_user has read permission for only db3 (but not db1 or db2), their 
> query
> {code:java}
> select * from db3.view3;{code}
> will fail with :
> {code:java}
> Error while compiling statement: FAILED: SemanticException No valid 
> privileges User test_user does not have privileges for QUERY The required 
> privileges: Server=server1->Db=db2->Table=view1->action=select; {code}
> WHERE IN and WHERE EXISTS cause the same issue.
> Cascading views created with no WHERE clauses (i.e. with simple SELECTs and 
> FROM clauses) work fine.
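
A small, self-contained illustration of the "direct input" notion discussed above, using made-up types rather than Hive's real entity classes: an authorization hook should only require privileges on direct inputs, and the bug is that a view referenced only from another view's WHERE subquery ends up flagged as direct.

{code:java}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DirectInputSketch {
    // Hypothetical stand-in for a query input entity with a direct/transitive flag.
    static class Input {
        final String name;
        final boolean direct;
        Input(String name, boolean direct) { this.name = name; this.direct = direct; }
    }

    // Only direct inputs should be checked against the user's privileges.
    static List<String> namesRequiringPrivileges(List<Input> inputs) {
        List<String> out = new ArrayList<>();
        for (Input in : inputs) {
            if (in.direct) {
                out.add(in.name);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // For "select * from db3.view3" only db3.view3 should be direct;
        // db2.view1 is reached only through view2's WHERE subquery.
        List<Input> inputs = Arrays.asList(
                new Input("db3.view3", true),
                new Input("db2.view2", false),
                new Input("db2.view1", false));
        System.out.println(namesRequiringPrivileges(inputs)); // [db3.view3]
    }
}
{code}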



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20818) Views created with a WHERE subquery will regard views referenced in the subquery as direct input

2018-11-05 Thread Karen Coppage (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karen Coppage updated HIVE-20818:
-
Attachment: HIVE-20818.patch
Status: Patch Available  (was: Open)

> Views created with a WHERE subquery will regard views referenced in the 
> subquery as direct input
> 
>
> Key: HIVE-20818
> URL: https://issues.apache.org/jira/browse/HIVE-20818
> Project: Hive
>  Issue Type: Bug
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
> Attachments: HIVE-20818.patch
>
>
> If Hive is configured with an authorization hook like Sentry, and a view is 
> created with a WHERE clause referencing a different view (view') that the user 
> has no access to, the user cannot access the new view because view' is 
> considered a direct input.
> For example:
> {code:java}
> create database db1;
> create database db2;
> create database db3;
>  
> create table db1.table1 (cola string, colb string, colc string);
> insert into db1.table1 values ('a','b','c');
> insert into db1.table1 values ('x','y','z');
> CREATE VIEW db2.view1 AS SELECT cola, colb, colc FROM db1.table1 WHERE 
> cola="x"; 
> CREATE VIEW db2.view2 AS SELECT table1.cola, table1.colb, table1.colc FROM 
> db1.table1 WHERE table1.cola NOT IN (SELECT view1.cola FROM db2.view1); 
> create view db3.view3 as select * from db2.view2
> {code}
>  If test_user has read permission for only db3 (but not db1 or db2), their 
> query
> {code:java}
> select * from db3.view3;{code}
> will fail with :
> {code:java}
> Error while compiling statement: FAILED: SemanticException No valid 
> privileges User test_user does not have privileges for QUERY The required 
> privileges: Server=server1->Db=db2->Table=view1->action=select; {code}
> WHERE IN and WHERE EXISTS cause the same issue.
> Cascading views created with no WHERE clauses (i.e. with simple SELECTs and 
> FROM clauses) work fine.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20796) jdbc URL can contain sensitive information that should not be logged

2018-11-05 Thread Peter Vary (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-20796:
--
   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master.
Thanks for the patch [~lpinter], and [~asherman], [~dkuzmenko] for the review!

> jdbc URL can contain sensitive information that should not be logged
> 
>
> Key: HIVE-20796
> URL: https://issues.apache.org/jira/browse/HIVE-20796
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 4.0.0
>Reporter: Laszlo Pinter
>Assignee: Laszlo Pinter
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-20796.01.patch, HIVE-20796.02.patch, 
> HIVE-20796.03.patch, HIVE-20796.04.patch, HIVE-20796.05.patch
>
>
> It is possible to put passwords in the JDBC connection URL, and some JDBC 
> drivers (Derby, MySQL) will supposedly use them. This information is considered 
> sensitive and should be masked out when logging the connection URL.
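
As a rough sketch of what "masked out" can look like (illustrative only, not the code in the attached patches), one option is to blank the value of any password parameter before the URL reaches the log:

{code:java}
import java.util.regex.Pattern;

public class JdbcUrlMaskSketch {
    // Hypothetical rule: hide the value of any "password=..." parameter.
    private static final Pattern PASSWORD_PARAM =
            Pattern.compile("(?i)(password\\s*=\\s*)[^;&]*");

    static String maskForLogging(String jdbcUrl) {
        return PASSWORD_PARAM.matcher(jdbcUrl).replaceAll("$1***");
    }

    public static void main(String[] args) {
        String url = "jdbc:mysql://host:3306/metastore?user=hive&password=secret";
        // Prints: jdbc:mysql://host:3306/metastore?user=hive&password=***
        System.out.println(maskForLogging(url));
    }
}
{code}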



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-20079) Populate more accurate rawDataSize for parquet format

2018-11-05 Thread Sahil Takiar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar reassigned HIVE-20079:
---

Assignee: (was: Sahil Takiar)

> Populate more accurate rawDataSize for parquet format
> -
>
> Key: HIVE-20079
> URL: https://issues.apache.org/jira/browse/HIVE-20079
> Project: Hive
>  Issue Type: Improvement
>  Components: File Formats
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Priority: Major
> Attachments: HIVE-20079.1.patch, HIVE-20079.2.patch
>
>
> Run the following queries and you will see that the rawDataSize for the table is 
> incorrectly reported as 4 (that is just the number of fields). We need to populate 
> the correct data size so the data can be split properly.
> {noformat}
> SET hive.stats.autogather=true;
> CREATE TABLE parquet_stats (id int,str string) STORED AS PARQUET;
> INSERT INTO parquet_stats values(0, 'this is string 0'), (1, 'string 1');
> DESC FORMATTED parquet_stats;
> {noformat}
> {noformat}
> Table Parameters:
>   COLUMN_STATS_ACCURATE   true
>   numFiles1
>   numRows 2
>   rawDataSize 4
>   totalSize   373
>   transient_lastDdlTime   1530660523
> {noformat}
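
To spell out why rawDataSize = 4 is wrong for the table above: 4 is just 2 rows x 2 columns, i.e. a field count, not a byte estimate. The sketch below (hypothetical helper, made-up per-type sizes, not Hive's actual Parquet statistics code) contrasts the two ways of computing it.

{code:java}
import java.util.Arrays;
import java.util.List;

public class RawDataSizeSketch {
    // What the DESC FORMATTED output above effectively reports: rows * columns.
    static long fieldCount(List<Object[]> rows) {
        return rows.isEmpty() ? 0 : (long) rows.size() * rows.get(0).length;
    }

    // A size-based estimate: sum rough per-field byte widths instead.
    static long estimatedRawSize(List<Object[]> rows) {
        long bytes = 0;
        for (Object[] row : rows) {
            for (Object field : row) {
                if (field instanceof Integer) {
                    bytes += 4;                          // int payload
                } else if (field instanceof String) {
                    bytes += ((String) field).length();  // rough string payload
                }
            }
        }
        return bytes;
    }

    public static void main(String[] args) {
        List<Object[]> rows = Arrays.asList(
                new Object[]{0, "this is string 0"},
                new Object[]{1, "string 1"});
        System.out.println("fieldCount       = " + fieldCount(rows));       // 4
        System.out.println("estimatedRawSize = " + estimatedRawSize(rows)); // 32
    }
}
{code}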



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20804) Further improvements to group by optimization with constraints

2018-11-05 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-20804:
---
Attachment: HIVE-20804.4.patch

> Further improvements to group by optimization with constraints
> --
>
> Key: HIVE-20804
> URL: https://issues.apache.org/jira/browse/HIVE-20804
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20804.1.patch, HIVE-20804.2.patch, 
> HIVE-20804.3.patch, HIVE-20804.4.patch
>
>
> Continuation of HIVE-17043



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20804) Further improvements to group by optimization with constraints

2018-11-05 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-20804:
---
Status: Open  (was: Patch Available)

> Further improvements to group by optimization with constraints
> --
>
> Key: HIVE-20804
> URL: https://issues.apache.org/jira/browse/HIVE-20804
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20804.1.patch, HIVE-20804.2.patch, 
> HIVE-20804.3.patch, HIVE-20804.4.patch
>
>
> Continuation of HIVE-17043



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20804) Further improvements to group by optimization with constraints

2018-11-05 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-20804:
---
Attachment: (was: HIVE-20804.4.patch)

> Further improvements to group by optimization with constraints
> --
>
> Key: HIVE-20804
> URL: https://issues.apache.org/jira/browse/HIVE-20804
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20804.1.patch, HIVE-20804.2.patch, 
> HIVE-20804.3.patch
>
>
> Continuation of HIVE-17043



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20833) package.jdo needs to be updated to conform with HIVE-20221 changes

2018-11-05 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-20833:
---
Status: Patch Available  (was: Open)

> package.jdo needs to be updated to conform with HIVE-20221 changes
> --
>
> Key: HIVE-20833
> URL: https://issues.apache.org/jira/browse/HIVE-20833
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20833.1.patch, HIVE-20833.2.patch, 
> HIVE-20833.3.patch, HIVE-20833.4.patch, HIVE-20833.5.patch
>
>
> The following test, if run with TestMiniLlapLocalCliDriver, will fail:
> {code:sql}
> CREATE TABLE `alterPartTbl`(
>`po_header_id` bigint,
>`vendor_num` string,
>`requester_name` string,
>`approver_name` string,
>`buyer_name` string,
>`preparer_name` string,
>`po_requisition_number` string,
>`po_requisition_id` bigint,
>`po_requisition_desc` string,
>`rate_type` string,
>`rate_date` date,
>`rate` double,
>`blanket_total_amount` double,
>`authorization_status` string,
>`revision_num` bigint,
>`revised_date` date,
>`approved_flag` string,
>`approved_date` timestamp,
>`amount_limit` double,
>`note_to_authorizer` string,
>`note_to_vendor` string,
>`note_to_receiver` string,
>`vendor_order_num` string,
>`comments` string,
>`acceptance_required_flag` string,
>`acceptance_due_date` date,
>`closed_date` timestamp,
>`user_hold_flag` string,
>`approval_required_flag` string,
>`cancel_flag` string,
>`firm_status_lookup_code` string,
>`firm_date` date,
>`frozen_flag` string,
>`closed_code` string,
>`org_id` bigint,
>`reference_num` string,
>`wf_item_type` string,
>`wf_item_key` string,
>`submit_date` date,
>`sap_company_code` string,
>`sap_fiscal_year` bigint,
>`po_number` string,
>`sap_line_item` bigint,
>`closed_status_flag` string,
>`balancing_segment` string,
>`cost_center_segment` string,
>`base_amount_limit` double,
>`base_blanket_total_amount` double,
>`base_open_amount` double,
>`base_ordered_amount` double,
>`cancel_date` timestamp,
>`cbc_accounting_date` date,
>`change_requested_by` string,
>`change_summary` string,
>`confirming_order_flag` string,
>`document_creation_method` string,
>`edi_processed_flag` string,
>`edi_processed_status` string,
>`enabled_flag` string,
>`encumbrance_required_flag` string,
>`end_date` date,
>`end_date_active` date,
>`from_header_id` bigint,
>`from_type_lookup_code` string,
>`global_agreement_flag` string,
>`government_context` string,
>`interface_source_code` string,
>`ledger_currency_code` string,
>`open_amount` double,
>`ordered_amount` double,
>`pay_on_code` string,
>`payment_term_name` string,
>`pending_signature_flag` string,
>`po_revision_num` double,
>`preparer_id` bigint,
>`price_update_tolerance` double,
>`print_count` double,
>`printed_date` date,
>`reply_date` date,
>`reply_method_lookup_code` string,
>`rfq_close_date` date,
>`segment2` string,
>`segment3` string,
>`segment4` string,
>`segment5` string,
>`shipping_control` string,
>`start_date` date,
>`start_date_active` date,
>`summary_flag` string,
>`supply_agreement_flag` string,
>`usd_amount_limit` double,
>`usd_blanket_total_amount` double,
>`usd_exchange_rate` double,
>`usd_open_amount` double,
>`usd_order_amount` double,
>`ussgl_transaction_code` string,
>`xml_flag` string,
>`purchasing_organization_id` bigint,
>`purchasing_group_code` string,
>`last_updated_by_name` string,
>`created_by_name` string,
>`incoterms_1` string,
>`incoterms_2` string,
>`ame_approval_id` double,
>`ame_transaction_type` string,
>`auto_sourcing_flag` string,
>`cat_admin_auth_enabled_flag` string,
>`clm_document_number` string,
>`comm_rev_num` double,
>`consigned_consumption_flag` string,
>`consume_req_demand_flag` string,
>`conterms_articles_upd_date` timestamp,
>`conterms_deliv_upd_date` timestamp,
>`conterms_exist_flag` string,
>`cpa_reference` double,
>`created_language` string,
>`email_address` string,
>`enable_all_sites` string,
>`fax` string,
>`lock_owner_role` string,
>`lock_owner_user_id` double,
>`min_release_amount` double,
>`mrc_rate` string,
>`mrc_rate_date` string,
>`mrc_rate_type` string,
>`otm_recovery_flag` string,
>`otm_status_code` string,
>`pay_when_paid` string,
>`pcard_id` bigint,
>`program_update_date` timestamp,
>`quotation

[jira] [Updated] (HIVE-20833) package.jdo needs to be updated to conform with HIVE-20221 changes

2018-11-05 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-20833:
---
Status: Open  (was: Patch Available)

> package.jdo needs to be updated to conform with HIVE-20221 changes
> --
>
> Key: HIVE-20833
> URL: https://issues.apache.org/jira/browse/HIVE-20833
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20833.1.patch, HIVE-20833.2.patch, 
> HIVE-20833.3.patch, HIVE-20833.4.patch, HIVE-20833.5.patch
>
>
> The following test, if run with TestMiniLlapLocalCliDriver, will fail:
> {code:sql}
> CREATE TABLE `alterPartTbl`(
>`po_header_id` bigint,
>`vendor_num` string,
>`requester_name` string,
>`approver_name` string,
>`buyer_name` string,
>`preparer_name` string,
>`po_requisition_number` string,
>`po_requisition_id` bigint,
>`po_requisition_desc` string,
>`rate_type` string,
>`rate_date` date,
>`rate` double,
>`blanket_total_amount` double,
>`authorization_status` string,
>`revision_num` bigint,
>`revised_date` date,
>`approved_flag` string,
>`approved_date` timestamp,
>`amount_limit` double,
>`note_to_authorizer` string,
>`note_to_vendor` string,
>`note_to_receiver` string,
>`vendor_order_num` string,
>`comments` string,
>`acceptance_required_flag` string,
>`acceptance_due_date` date,
>`closed_date` timestamp,
>`user_hold_flag` string,
>`approval_required_flag` string,
>`cancel_flag` string,
>`firm_status_lookup_code` string,
>`firm_date` date,
>`frozen_flag` string,
>`closed_code` string,
>`org_id` bigint,
>`reference_num` string,
>`wf_item_type` string,
>`wf_item_key` string,
>`submit_date` date,
>`sap_company_code` string,
>`sap_fiscal_year` bigint,
>`po_number` string,
>`sap_line_item` bigint,
>`closed_status_flag` string,
>`balancing_segment` string,
>`cost_center_segment` string,
>`base_amount_limit` double,
>`base_blanket_total_amount` double,
>`base_open_amount` double,
>`base_ordered_amount` double,
>`cancel_date` timestamp,
>`cbc_accounting_date` date,
>`change_requested_by` string,
>`change_summary` string,
>`confirming_order_flag` string,
>`document_creation_method` string,
>`edi_processed_flag` string,
>`edi_processed_status` string,
>`enabled_flag` string,
>`encumbrance_required_flag` string,
>`end_date` date,
>`end_date_active` date,
>`from_header_id` bigint,
>`from_type_lookup_code` string,
>`global_agreement_flag` string,
>`government_context` string,
>`interface_source_code` string,
>`ledger_currency_code` string,
>`open_amount` double,
>`ordered_amount` double,
>`pay_on_code` string,
>`payment_term_name` string,
>`pending_signature_flag` string,
>`po_revision_num` double,
>`preparer_id` bigint,
>`price_update_tolerance` double,
>`print_count` double,
>`printed_date` date,
>`reply_date` date,
>`reply_method_lookup_code` string,
>`rfq_close_date` date,
>`segment2` string,
>`segment3` string,
>`segment4` string,
>`segment5` string,
>`shipping_control` string,
>`start_date` date,
>`start_date_active` date,
>`summary_flag` string,
>`supply_agreement_flag` string,
>`usd_amount_limit` double,
>`usd_blanket_total_amount` double,
>`usd_exchange_rate` double,
>`usd_open_amount` double,
>`usd_order_amount` double,
>`ussgl_transaction_code` string,
>`xml_flag` string,
>`purchasing_organization_id` bigint,
>`purchasing_group_code` string,
>`last_updated_by_name` string,
>`created_by_name` string,
>`incoterms_1` string,
>`incoterms_2` string,
>`ame_approval_id` double,
>`ame_transaction_type` string,
>`auto_sourcing_flag` string,
>`cat_admin_auth_enabled_flag` string,
>`clm_document_number` string,
>`comm_rev_num` double,
>`consigned_consumption_flag` string,
>`consume_req_demand_flag` string,
>`conterms_articles_upd_date` timestamp,
>`conterms_deliv_upd_date` timestamp,
>`conterms_exist_flag` string,
>`cpa_reference` double,
>`created_language` string,
>`email_address` string,
>`enable_all_sites` string,
>`fax` string,
>`lock_owner_role` string,
>`lock_owner_user_id` double,
>`min_release_amount` double,
>`mrc_rate` string,
>`mrc_rate_date` string,
>`mrc_rate_type` string,
>`otm_recovery_flag` string,
>`otm_status_code` string,
>`pay_when_paid` string,
>`pcard_id` bigint,
>`program_update_date` timestamp,
>`quotation

[jira] [Updated] (HIVE-20833) package.jdo needs to be updated to conform with HIVE-20221 changes

2018-11-05 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-20833:
---
Attachment: HIVE-20833.5.patch

> package.jdo needs to be updated to conform with HIVE-20221 changes
> --
>
> Key: HIVE-20833
> URL: https://issues.apache.org/jira/browse/HIVE-20833
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20833.1.patch, HIVE-20833.2.patch, 
> HIVE-20833.3.patch, HIVE-20833.4.patch, HIVE-20833.5.patch
>
>
> The following test, if run with TestMiniLlapLocalCliDriver, will fail:
> {code:sql}
> CREATE TABLE `alterPartTbl`(
>`po_header_id` bigint,
>`vendor_num` string,
>`requester_name` string,
>`approver_name` string,
>`buyer_name` string,
>`preparer_name` string,
>`po_requisition_number` string,
>`po_requisition_id` bigint,
>`po_requisition_desc` string,
>`rate_type` string,
>`rate_date` date,
>`rate` double,
>`blanket_total_amount` double,
>`authorization_status` string,
>`revision_num` bigint,
>`revised_date` date,
>`approved_flag` string,
>`approved_date` timestamp,
>`amount_limit` double,
>`note_to_authorizer` string,
>`note_to_vendor` string,
>`note_to_receiver` string,
>`vendor_order_num` string,
>`comments` string,
>`acceptance_required_flag` string,
>`acceptance_due_date` date,
>`closed_date` timestamp,
>`user_hold_flag` string,
>`approval_required_flag` string,
>`cancel_flag` string,
>`firm_status_lookup_code` string,
>`firm_date` date,
>`frozen_flag` string,
>`closed_code` string,
>`org_id` bigint,
>`reference_num` string,
>`wf_item_type` string,
>`wf_item_key` string,
>`submit_date` date,
>`sap_company_code` string,
>`sap_fiscal_year` bigint,
>`po_number` string,
>`sap_line_item` bigint,
>`closed_status_flag` string,
>`balancing_segment` string,
>`cost_center_segment` string,
>`base_amount_limit` double,
>`base_blanket_total_amount` double,
>`base_open_amount` double,
>`base_ordered_amount` double,
>`cancel_date` timestamp,
>`cbc_accounting_date` date,
>`change_requested_by` string,
>`change_summary` string,
>`confirming_order_flag` string,
>`document_creation_method` string,
>`edi_processed_flag` string,
>`edi_processed_status` string,
>`enabled_flag` string,
>`encumbrance_required_flag` string,
>`end_date` date,
>`end_date_active` date,
>`from_header_id` bigint,
>`from_type_lookup_code` string,
>`global_agreement_flag` string,
>`government_context` string,
>`interface_source_code` string,
>`ledger_currency_code` string,
>`open_amount` double,
>`ordered_amount` double,
>`pay_on_code` string,
>`payment_term_name` string,
>`pending_signature_flag` string,
>`po_revision_num` double,
>`preparer_id` bigint,
>`price_update_tolerance` double,
>`print_count` double,
>`printed_date` date,
>`reply_date` date,
>`reply_method_lookup_code` string,
>`rfq_close_date` date,
>`segment2` string,
>`segment3` string,
>`segment4` string,
>`segment5` string,
>`shipping_control` string,
>`start_date` date,
>`start_date_active` date,
>`summary_flag` string,
>`supply_agreement_flag` string,
>`usd_amount_limit` double,
>`usd_blanket_total_amount` double,
>`usd_exchange_rate` double,
>`usd_open_amount` double,
>`usd_order_amount` double,
>`ussgl_transaction_code` string,
>`xml_flag` string,
>`purchasing_organization_id` bigint,
>`purchasing_group_code` string,
>`last_updated_by_name` string,
>`created_by_name` string,
>`incoterms_1` string,
>`incoterms_2` string,
>`ame_approval_id` double,
>`ame_transaction_type` string,
>`auto_sourcing_flag` string,
>`cat_admin_auth_enabled_flag` string,
>`clm_document_number` string,
>`comm_rev_num` double,
>`consigned_consumption_flag` string,
>`consume_req_demand_flag` string,
>`conterms_articles_upd_date` timestamp,
>`conterms_deliv_upd_date` timestamp,
>`conterms_exist_flag` string,
>`cpa_reference` double,
>`created_language` string,
>`email_address` string,
>`enable_all_sites` string,
>`fax` string,
>`lock_owner_role` string,
>`lock_owner_user_id` double,
>`min_release_amount` double,
>`mrc_rate` string,
>`mrc_rate_date` string,
>`mrc_rate_type` string,
>`otm_recovery_flag` string,
>`otm_status_code` string,
>`pay_when_paid` string,
>`pcard_id` bigint,
>`program_update_date` timestamp,
>`quotation_class

[jira] [Commented] (HIVE-20839) "Cannot find field" error during dynamically partitioned hash join

2018-11-05 Thread Vineet Garg (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16675473#comment-16675473
 ] 

Vineet Garg commented on HIVE-20839:


+1

Can you open a jira to add tests once HIVE-20833 is in?

> "Cannot find field" error during dynamically partitioned hash join
> --
>
> Key: HIVE-20839
> URL: https://issues.apache.org/jira/browse/HIVE-20839
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Jason Dere
>Assignee: Jason Dere
>Priority: Major
> Attachments: HIVE-20839.1.patch, HIVE-20839.2.patch, 
> HIVE-20839.3.patch, HIVE-20839.4.patch
>
>
> Occurs in some cases with non-CBO-optimized queries, either when CBO is 
> disabled or when it has failed due to an error.
> {noformat}
> 2018-10-11T04:40:22,724 ERROR [TezTR-85144_8944_1085_28_996_2 
> (1539092085144_8944_1085_28_000996_2)] tez.ReduceRecordProcessor: Hit error 
> while closing operators - failing tree
> 2018-10-11T04:40:22,724 ERROR [TezTR-85144_8944_1085_28_996_2 
> (1539092085144_8944_1085_28_000996_2)] tez.TezProcessor: 
> java.lang.RuntimeException: cannot find field _col304 from [0:_col0, 1:_col1, 
> 2:_col2, 3:_col3, 4:_col4, 5:_col5, 6:_col6, 7:_col7, 8:_col8, 9:_col9, 
> 10:_col10, 11:_col11, 12:_col12, 13:_col13, 14:_col15, 15:_col16, 16:_col17, 
> 17:_col18, 18:_col19, 19:_col20, 20:_col21, 21:_col22, 22:_col23, 23:_col24, 
> 24:_col25, 25:_col26, 26:_col27, 27:_col28, 28:_col29, 29:_col30, 30:_col31, 
> 31:_col32, 32:_col33, 33:_col34, 34:_col35, 35:_col36, 36:_col37, 37:_col38, 
> 38:_col39, 39:_col40, 40:_col41, 41:_col42, 42:_col43, 43:_col44, 44:_col45, 
> 45:_col46, 46:_col47, 47:_col48, 48:_col49, 49:_col50, 50:_col51, 51:_col52, 
> 52:_col53, 53:_col54, 54:_col55, 55:_col56, 56:_col57, 57:_col58, 58:_col59, 
> 59:_col60, 60:_col61, 61:_col62, 62:_col63, 63:_col64, 64:_col65, 65:_col66, 
> 66:_col67, 67:_col68, 68:_col70, 69:_col72, 70:_col73, 71:_col74, 72:_col75, 
> 73:_col76, 74:_col77, 75:_col78, 76:_col79, 77:_col80, 78:_col81, 79:_col82, 
> 80:_col83, 81:_col84, 82:_col85, 83:_col86, 84:_col87, 85:_col88, 86:_col89, 
> 87:_col90, 88:_col91, 89:_col92, 90:_col93, 91:_col94, 92:_col95, 93:_col96, 
> 94:_col97, 95:_col98, 96:_col99, 97:_col100, 98:_col101, 99:_col102, 
> 100:_col103, 101:_col104, 102:_col105, 103:_col106, 104:_col107, 105:_col108, 
> 106:_col109, 107:_col110, 108:_col111, 109:_col112, 110:_col113, 111:_col114, 
> 112:_col115, 113:_col116, 114:_col117, 115:_col118, 116:_col119, 117:_col120, 
> 118:_col121, 119:_col122, 120:_col123, 121:_col124, 122:_col125, 123:_col126, 
> 124:_col127, 125:_col128, 126:_col129, 127:_col130, 128:_col131, 129:_col132, 
> 130:_col133, 131:_col134, 132:_col135, 133:_col136, 134:_col137, 135:_col138, 
> 136:_col139, 137:_col140, 138:_col141, 139:_col142, 140:_col143, 141:_col144, 
> 142:_col145, 143:_col146, 144:_col147, 145:_col148, 146:_col149, 147:_col150, 
> 148:_col151, 149:_col152, 150:_col153, 151:_col154, 152:_col155, 153:_col156, 
> 154:_col157, 155:_col158, 156:_col159, 157:_col160, 158:_col161, 159:_col162, 
> 160:_col163, 161:_col164, 162:_col165, 163:_col166, 164:_col167, 165:_col168, 
> 166:_col169, 167:_col170, 168:_col171, 169:_col318]
> at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:485)
> at 
> org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.getStructFieldRef(StandardStructObjectInspector.java:153)
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.initialize(ExprNodeColumnEvaluator.java:80)
> at 
> org.apache.hadoop.hive.ql.exec.JoinUtil.getObjectInspectorsFromEvaluators(JoinUtil.java:91)
> at 
> org.apache.hadoop.hive.ql.exec.AbstractMapJoinOperator.initializeOp(AbstractMapJoinOperator.java:74)
> at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.initializeOp(MapJoinOperator.java:144)
> at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:374)
> at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.init(ReduceRecordProcessor.java:195)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:188)
> at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:172)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:370)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
> {noformat}



--
This m

[jira] [Commented] (HIVE-20858) Serializer is not correctly initialized with configuration in Utilities.createEmptyBuckets()

2018-11-05 Thread Wei Zheng (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16675491#comment-16675491
 ] 

Wei Zheng commented on HIVE-20858:
--

[~daijy] Can you take a look please :)

> Serializer is not correctly initialized with configuration in 
> Utilities.createEmptyBuckets()
> 
>
> Key: HIVE-20858
> URL: https://issues.apache.org/jira/browse/HIVE-20858
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
>Priority: Major
> Attachments: HIVE-20858.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20842) Fix logic introduced in HIVE-20660 to estimate statistics for group by

2018-11-05 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-20842:
---
Status: Open  (was: Patch Available)

> Fix logic introduced in HIVE-20660 to estimate statistics for group by
> --
>
> Key: HIVE-20842
> URL: https://issues.apache.org/jira/browse/HIVE-20842
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20842.1.patch, HIVE-20842.2.patch, 
> HIVE-20842.3.patch, HIVE-20842.4.patch
>
>
> HIVE-20660 introduced better estimation for the group by operator, but the logic 
> did not account for partial and full group by separately.
> For partial group by, parallelism (i.e. the number of tasks) should be taken into 
> account.
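
As a quick illustration of the point above (the formulas are simplified assumptions, not the exact logic in the patch): a final group by can emit at most one row per distinct key overall, while a partial, map-side group by can emit up to that many rows per task, so the task count belongs in the estimate.

{code:java}
public class GroupByEstimateSketch {
    // Full (final) aggregation: bounded by the overall number of distinct keys.
    static long fullGroupByRows(long inputRows, long ndv) {
        return Math.min(inputRows, ndv);
    }

    // Partial (map-side) aggregation: each parallel task can emit up to ndv rows.
    static long partialGroupByRows(long inputRows, long ndv, long numTasks) {
        return Math.min(inputRows, ndv * numTasks);
    }

    public static void main(String[] args) {
        long rows = 1_000_000, ndv = 1_000, tasks = 32;
        System.out.println("full    = " + fullGroupByRows(rows, ndv));            // 1000
        System.out.println("partial = " + partialGroupByRows(rows, ndv, tasks));  // 32000
    }
}
{code}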



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20842) Fix logic introduced in HIVE-20660 to estimate statistics for group by

2018-11-05 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-20842:
---
Status: Patch Available  (was: Open)

> Fix logic introduced in HIVE-20660 to estimate statistics for group by
> --
>
> Key: HIVE-20842
> URL: https://issues.apache.org/jira/browse/HIVE-20842
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20842.1.patch, HIVE-20842.2.patch, 
> HIVE-20842.3.patch, HIVE-20842.4.patch
>
>
> HIVE-20660 introduced better estimation for the group by operator, but the logic 
> did not account for partial and full group by separately.
> For partial group by, parallelism (i.e. the number of tasks) should be taken into 
> account.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20842) Fix logic introduced in HIVE-20660 to estimate statistics for group by

2018-11-05 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-20842:
---
Attachment: HIVE-20842.4.patch

> Fix logic introduced in HIVE-20660 to estimate statistics for group by
> --
>
> Key: HIVE-20842
> URL: https://issues.apache.org/jira/browse/HIVE-20842
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20842.1.patch, HIVE-20842.2.patch, 
> HIVE-20842.3.patch, HIVE-20842.4.patch
>
>
> HIVE-20660 introduced better estimation for the group by operator, but the logic 
> did not account for partial and full group by separately.
> For partial group by, parallelism (i.e. the number of tasks) should be taken into 
> account.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-20867) Rewrite INTERSECT into LEFT SEMI JOIN instead of UNION + Group by

2018-11-05 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg reassigned HIVE-20867:
--


> Rewrite INTERSECT into LEFT SEMI JOIN instead of UNION + Group by
> -
>
> Key: HIVE-20867
> URL: https://issues.apache.org/jira/browse/HIVE-20867
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20818) Views created with a WHERE subquery will regard views referenced in the subquery as direct input

2018-11-05 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16675539#comment-16675539
 ] 

Hive QA commented on HIVE-20818:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
12s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
4s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
36s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
40s{color} | {color:blue} ql in master has 2315 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
38s{color} | {color:red} ql: The patch generated 1 new + 182 unchanged - 0 
fixed = 183 total (was 182) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 2 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 23m  5s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-14749/dev-support/hive-personality.sh
 |
| git revision | master / 39b4f94 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14749/yetus/diff-checkstyle-ql.txt
 |
| whitespace | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14749/yetus/whitespace-eol.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14749/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Views created with a WHERE subquery will regard views referenced in the 
> subquery as direct input
> 
>
> Key: HIVE-20818
> URL: https://issues.apache.org/jira/browse/HIVE-20818
> Project: Hive
>  Issue Type: Bug
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
> Attachments: HIVE-20818.patch
>
>
> If Hive is configured with an authorization hook like Sentry, and a view is 
> created with a WHERE clause referencing a different view (view') that the user 
> has no access to, the user cannot access the new view because view' is 
> considered a direct input.
> For example:
> {code:java}
> create database db1;
> create database db2;
> create database db3;
>  
> create table db1.table1 (cola string, colb string, colc string);
> insert into db1.table1 values ('a','b','c');
> insert into db1.table1 values ('x','y','z');
> CREATE VIEW db2.view1 AS SELECT cola, colb, colc FROM db1.table1 WHERE 
> cola="x"; 
> CREATE VIEW db2.view2 AS SELECT table1.cola, table1.colb, table1.colc FROM 
> db1.table1 WHERE table1.cola NOT IN (SELECT view1.cola FROM db2.view1); 
> create view db3.view3 as select * from db2.view2
> {code}
>  If test_user has read permission for only db3 (but not db1 or d

[jira] [Commented] (HIVE-20867) Rewrite INTERSECT into LEFT SEMI JOIN instead of UNION + Group by

2018-11-05 Thread Pengcheng Xiong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16675550#comment-16675550
 ] 

Pengcheng Xiong commented on HIVE-20867:


I have some questions about this jira. Could you share your design document on 
this? I assumed that we compared several candidates when we made the decision, 
and left semi join was one of them. We chose the union-based one because (a) a 
similar approach can be applied to EXCEPT (ALL) as well, so we get better code 
reuse, and (b) when intersect has more than two input branches, we assume that 
in the future those branches can be executed in parallel. With the left semi 
join approach, we would need to do the joins one by one.

> Rewrite INTERSECT into LEFT SEMI JOIN instead of UNION + Group by
> -
>
> Key: HIVE-20867
> URL: https://issues.apache.org/jira/browse/HIVE-20867
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20862) QueryId no longer shows up in the logs

2018-11-05 Thread Eugene Koifman (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16675580#comment-16675580
 ] 

Eugene Koifman commented on HIVE-20862:
---

no related failures

 

> QueryId no longer shows up in the logs
> --
>
> Key: HIVE-20862
> URL: https://issues.apache.org/jira/browse/HIVE-20862
> Project: Hive
>  Issue Type: Bug
>  Components: Logging
>Affects Versions: 4.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Attachments: HIVE-20862.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20863) remove dead code

2018-11-05 Thread Eugene Koifman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-20863:
--
Status: Open  (was: Patch Available)

> remove dead code
> 
>
> Key: HIVE-20863
> URL: https://issues.apache.org/jira/browse/HIVE-20863
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Minor
> Attachments: HIVE-20863.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20818) Views created with a WHERE subquery will regard views referenced in the subquery as direct input

2018-11-05 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16675624#comment-16675624
 ] 

Hive QA commented on HIVE-20818:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12946905/HIVE-20818.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 15526 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[masking_12] (batchId=1)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[explainuser_1]
 (batchId=167)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_explainuser_1]
 (batchId=190)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_views] 
(batchId=116)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14749/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14749/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14749/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12946905 - PreCommit-HIVE-Build

> Views created with a WHERE subquery will regard views referenced in the 
> subquery as direct input
> 
>
> Key: HIVE-20818
> URL: https://issues.apache.org/jira/browse/HIVE-20818
> Project: Hive
>  Issue Type: Bug
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
> Attachments: HIVE-20818.patch
>
>
> If Hive is configured with an authorization hook like Sentry, and a view is 
> created with a WHERE clause referencing a different view (view') that the user 
> has no access to, the user cannot access the new view because view' is 
> considered a direct input.
> For example:
> {code:java}
> create database db1;
> create database db2;
> create database db3;
>  
> create table db1.table1 (cola string, colb string, colc string);
> insert into db1.table1 values ('a','b','c');
> insert into db1.table1 values ('x','y','z');
> CREATE VIEW db2.view1 AS SELECT cola, colb, colc FROM db1.table1 WHERE 
> cola="x"; 
> CREATE VIEW db2.view2 AS SELECT table1.cola, table1.colb, table1.colc FROM 
> db1.table1 WHERE table1.cola NOT IN (SELECT view1.cola FROM db2.view1); 
> create view db3.view3 as select * from db2.view2
> {code}
>  If test_user has read permission for only db3 (but not db1 or db2), their 
> query
> {code:java}
> select * from db3.view3;{code}
> will fail with :
> {code:java}
> Error while compiling statement: FAILED: SemanticException No valid 
> privileges User test_user does not have privileges for QUERY The required 
> privileges: Server=server1->Db=db2->Table=view1->action=select; {code}
> WHERE IN and WHERE EXISTS cause the same issue.
> Cascading views created with no WHERE clauses (i.e. with simple SELECTs and 
> FROM clauses) work fine.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20486) Kafka: Use Row SerDe + vectorization

2018-11-05 Thread slim bouguerra (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-20486:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Kafka: Use Row SerDe + vectorization
> 
>
> Key: HIVE-20486
> URL: https://issues.apache.org/jira/browse/HIVE-20486
> Project: Hive
>  Issue Type: Improvement
>  Components: kafka integration
>Reporter: Gopal V
>Assignee: slim bouguerra
>Priority: Major
>  Labels: kafka, vectorization
> Fix For: 4.0.0
>
> Attachments: HIVE-20486.3.patch, HIVE-20486.3.patch, 
> HIVE-20486.4.patch, HIVE-20486.4.patch, HIVE-20486.patch
>
>
> KafkaHandler returns unvectorized rows which causes the operators downstream 
> to be slower and sub-optimal.
> Hive has a vectorization shim which allows Kafka streams without complex 
> projections to be wrapped into a vectorized reader via 
> {{hive.vectorized.use.row.serde.deserialize}}.
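
For anyone trying the shim mentioned above, a minimal sketch of turning the property on programmatically (assumptions for the sketch: a plain Hadoop Configuration is the right place in your setup, and overall vectorized execution is also enabled; in practice this is usually just a SET statement in the session).

{code:java}
import org.apache.hadoop.conf.Configuration;

public class RowSerDeVectorizationSketch {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Wrap row-SerDe deserialization in the vectorized reader shim.
        conf.setBoolean("hive.vectorized.use.row.serde.deserialize", true);
        // Assumption for the sketch: vectorized execution itself is enabled too.
        conf.setBoolean("hive.vectorized.execution.enabled", true);
        System.out.println(conf.get("hive.vectorized.use.row.serde.deserialize"));
    }
}
{code}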



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20486) Kafka: Use Row SerDe + vectorization

2018-11-05 Thread slim bouguerra (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16675622#comment-16675622
 ] 

slim bouguerra commented on HIVE-20486:
---

Pushed this to master:
https://github.com/apache/hive/commit/b789aebcfbf19d7b0fd6c2d6643adfd5a8de5f12


> Kafka: Use Row SerDe + vectorization
> 
>
> Key: HIVE-20486
> URL: https://issues.apache.org/jira/browse/HIVE-20486
> Project: Hive
>  Issue Type: Improvement
>  Components: kafka integration
>Reporter: Gopal V
>Assignee: slim bouguerra
>Priority: Major
>  Labels: kafka, vectorization
> Fix For: 4.0.0
>
> Attachments: HIVE-20486.3.patch, HIVE-20486.3.patch, 
> HIVE-20486.4.patch, HIVE-20486.4.patch, HIVE-20486.patch
>
>
> KafkaHandler returns unvectorized rows which causes the operators downstream 
> to be slower and sub-optimal.
> Hive has a vectorization shim which allows Kafka streams without complex 
> projections to be wrapped into a vectorized reader via 
> {{hive.vectorized.use.row.serde.deserialize}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20079) Populate more accurate rawDataSize for parquet format

2018-11-05 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16675625#comment-16675625
 ] 

Hive QA commented on HIVE-20079:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12931668/HIVE-20079.2.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14750/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14750/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14750/

Messages:
{noformat}
 This message was trimmed, see log for full details 
Applied patch to 
'ql/src/test/results/clientpositive/parquet_vectorization_limit.q.out' cleanly.
error: patch failed: 
ql/src/test/results/clientpositive/spark/parquet_vectorization_0.q.out:34
Falling back to three-way merge...
Applied patch to 
'ql/src/test/results/clientpositive/spark/parquet_vectorization_0.q.out' with 
conflicts.
error: patch failed: 
ql/src/test/results/clientpositive/spark/parquet_vectorization_1.q.out:60
Falling back to three-way merge...
Applied patch to 
'ql/src/test/results/clientpositive/spark/parquet_vectorization_1.q.out' 
cleanly.
error: patch failed: 
ql/src/test/results/clientpositive/spark/parquet_vectorization_10.q.out:64
Falling back to three-way merge...
Applied patch to 
'ql/src/test/results/clientpositive/spark/parquet_vectorization_10.q.out' with 
conflicts.
error: patch failed: 
ql/src/test/results/clientpositive/spark/parquet_vectorization_11.q.out:46
Falling back to three-way merge...
Applied patch to 
'ql/src/test/results/clientpositive/spark/parquet_vectorization_11.q.out' 
cleanly.
error: patch failed: 
ql/src/test/results/clientpositive/spark/parquet_vectorization_12.q.out:83
Falling back to three-way merge...
Applied patch to 
'ql/src/test/results/clientpositive/spark/parquet_vectorization_12.q.out' with 
conflicts.
error: patch failed: 
ql/src/test/results/clientpositive/spark/parquet_vectorization_13.q.out:85
Falling back to three-way merge...
Applied patch to 
'ql/src/test/results/clientpositive/spark/parquet_vectorization_13.q.out' with 
conflicts.
error: patch failed: 
ql/src/test/results/clientpositive/spark/parquet_vectorization_14.q.out:85
Falling back to three-way merge...
Applied patch to 
'ql/src/test/results/clientpositive/spark/parquet_vectorization_14.q.out' with 
conflicts.
error: patch failed: 
ql/src/test/results/clientpositive/spark/parquet_vectorization_15.q.out:81
Falling back to three-way merge...
Applied patch to 
'ql/src/test/results/clientpositive/spark/parquet_vectorization_15.q.out' with 
conflicts.
error: patch failed: 
ql/src/test/results/clientpositive/spark/parquet_vectorization_16.q.out:58
Falling back to three-way merge...
Applied patch to 
'ql/src/test/results/clientpositive/spark/parquet_vectorization_16.q.out' with 
conflicts.
error: patch failed: 
ql/src/test/results/clientpositive/spark/parquet_vectorization_17.q.out:66
Falling back to three-way merge...
Applied patch to 
'ql/src/test/results/clientpositive/spark/parquet_vectorization_17.q.out' with 
conflicts.
error: patch failed: 
ql/src/test/results/clientpositive/spark/parquet_vectorization_2.q.out:64
Falling back to three-way merge...
Applied patch to 
'ql/src/test/results/clientpositive/spark/parquet_vectorization_2.q.out' 
cleanly.
error: patch failed: 
ql/src/test/results/clientpositive/spark/parquet_vectorization_3.q.out:69
Falling back to three-way merge...
Applied patch to 
'ql/src/test/results/clientpositive/spark/parquet_vectorization_3.q.out' 
cleanly.
error: patch failed: 
ql/src/test/results/clientpositive/spark/parquet_vectorization_4.q.out:64
Falling back to three-way merge...
Applied patch to 
'ql/src/test/results/clientpositive/spark/parquet_vectorization_4.q.out' 
cleanly.
error: patch failed: 
ql/src/test/results/clientpositive/spark/parquet_vectorization_5.q.out:58
Falling back to three-way merge...
Applied patch to 
'ql/src/test/results/clientpositive/spark/parquet_vectorization_5.q.out' 
cleanly.
error: patch failed: 
ql/src/test/results/clientpositive/spark/parquet_vectorization_6.q.out:58
Falling back to three-way merge...
Applied patch to 
'ql/src/test/results/clientpositive/spark/parquet_vectorization_6.q.out' with 
conflicts.
error: patch failed: 
ql/src/test/results/clientpositive/spark/parquet_vectorization_7.q.out:72
Falling back to three-way merge...
Applied patch to 
'ql/src/test/results/clientpositive/spark/parquet_vectorization_7.q.out' with 
conflicts.
error: patch failed: 
ql/src/test/results/clientpositive/spark/parquet_vectorization_8.q.out:68
Falling back to three-way merge...
Applied patch to 
'ql/src/test/results/clientpositive/spark/parquet_vectorization_8.q.out' with 
conflicts.
error: patch failed: 
ql/src/test/results/clientpositive/spark/parquet_vectorization_9.q.out:58
Falling back

[jira] [Updated] (HIVE-20813) udf to_epoch_milli need to support timestamp without time zone as well

2018-11-05 Thread slim bouguerra (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-20813:
--
Attachment: HIVE-20813.patch

> udf to_epoch_milli need to support timestamp without time zone as well
> --
>
> Key: HIVE-20813
> URL: https://issues.apache.org/jira/browse/HIVE-20813
> Project: Hive
>  Issue Type: Bug
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Major
> Attachments: HIVE-20813.patch, HIVE-20813.patch
>
>
> Currently the following query will fail with a cast exception (it tries to cast 
> timestamp to timestamp with local time zone).
> {code}
>  select to_epoch_milli(current_timestamp)
> {code}
> As a simple fix, we need to add support for the timestamp object inspector.
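Until that support lands, one possible interim workaround (an untested sketch, assuming
the UDF already accepts TIMESTAMP WITH LOCAL TIME ZONE as the error message suggests)
is to cast explicitly:

{code}
-- Sketch of a possible interim workaround: cast the plain timestamp explicitly.
select to_epoch_milli(cast(current_timestamp as timestamp with local time zone));
{code}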



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-20782) Cleaning some unused code

2018-11-05 Thread slim bouguerra (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra reassigned HIVE-20782:
-

Assignee: slim bouguerra  (was: Teddy Choi)

> Cleaning some unused code
> -
>
> Key: HIVE-20782
> URL: https://issues.apache.org/jira/browse/HIVE-20782
> Project: Hive
>  Issue Type: Improvement
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Major
> Attachments: HIVE-20782.2.patch, HIVE-20782.2.patch, 
> HIVE-20782.3.patch, HIVE-20782.3.patch, HIVE-20782.patch
>
>
> I am making my way into the vectorization code and trying to understand the 
> APIs. I ran into this one, which appears to be unused.
> [~ashutoshc], maybe you can explain, as you are the main contributor to this file 
> {code} 
> a/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedSerde.java{code}
>  ?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20782) Cleaning some unused code

2018-11-05 Thread slim bouguerra (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-20782:
--
Attachment: HIVE-20782.3.patch

> Cleaning some unused code
> -
>
> Key: HIVE-20782
> URL: https://issues.apache.org/jira/browse/HIVE-20782
> Project: Hive
>  Issue Type: Improvement
>Reporter: slim bouguerra
>Assignee: Teddy Choi
>Priority: Major
> Attachments: HIVE-20782.2.patch, HIVE-20782.2.patch, 
> HIVE-20782.3.patch, HIVE-20782.3.patch, HIVE-20782.patch
>
>
> I am making my way into the vectorization code and trying to understand the 
> APIs. I ran into this one, which appears to be unused.
> [~ashutoshc], maybe you can explain, as you are the main contributor to this file 
> {code} 
> a/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedSerde.java{code}
>  ?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20835) Interaction between constraints and MV rewriting may create loop in Calcite planner

2018-11-05 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-20835:
---
Attachment: HIVE-20835.01.patch

> Interaction between constraints and MV rewriting may create loop in Calcite 
> planner
> ---
>
> Key: HIVE-20835
> URL: https://issues.apache.org/jira/browse/HIVE-20835
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-20835.01.patch, HIVE-20835.01.patch, 
> HIVE-20835.01.patch, HIVE-20835.01.patch, HIVE-20835.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20822) Improvements to push computation to JDBC from Calcite

2018-11-05 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-20822:
---
Attachment: HIVE-20822.02.patch

> Improvements to push computation to JDBC from Calcite
> -
>
> Key: HIVE-20822
> URL: https://issues.apache.org/jira/browse/HIVE-20822
> Project: Hive
>  Issue Type: Improvement
>  Components: StorageHandler
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-20822.01.patch, HIVE-20822.02.patch, 
> HIVE-20822.02.patch, HIVE-20822.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20822) Improvements to push computation to JDBC from Calcite

2018-11-05 Thread Jesus Camacho Rodriguez (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16675673#comment-16675673
 ] 

Jesus Camacho Rodriguez commented on HIVE-20822:


[~daijy], could you take a look? Thanks
https://reviews.apache.org/r/69256/

> Improvements to push computation to JDBC from Calcite
> -
>
> Key: HIVE-20822
> URL: https://issues.apache.org/jira/browse/HIVE-20822
> Project: Hive
>  Issue Type: Improvement
>  Components: StorageHandler
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-20822.01.patch, HIVE-20822.02.patch, 
> HIVE-20822.02.patch, HIVE-20822.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20867) Rewrite INTERSECT into LEFT SEMI JOIN instead of UNION + Group by

2018-11-05 Thread Gopal V (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16675678#comment-16675678
 ] 

Gopal V commented on HIVE-20867:


bq. Comparing with left-semi join one, we need to do the join one by one.

That is true at the logical level, but the physical runtime uses the bloom 
filter semi-join before data gets shuffled, and the join operations do not 
force a full sort (while the group-by does).

The current implementation's Achilles' heel is when the two branches are unequal 
in size - if you run something like

{code}
select ss_item_sk as k from store_sales
intersect
select i_item_sk as k from item where i_category = 'Sports'
{code}

vs

{code}
select ss_item_sk as k from store_sales where ss_item_sk IN (select i_item_sk 
as k from item where i_category = 'Sports')
{code}

you can see the vast difference between the two approaches.

However, I think this might need a null-safe semi-join, because within 
INTERSECT, null == null.
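For illustration only, the null-safe matching mentioned above can be written by hand with
Hive's null-safe equality operator {{<=>}} (a sketch, not the planner rewrite proposed in
this JIRA):

{code}
-- Sketch: emulate INTERSECT semantics (including NULL matching NULL) with a join
-- on Hive's null-safe equality operator.
select distinct s.ss_item_sk as k
from store_sales s
join (select distinct i_item_sk from item where i_category = 'Sports') i
  on s.ss_item_sk <=> i.i_item_sk;
{code}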

> Rewrite INTERSECT into LEFT SEMI JOIN instead of UNION + Group by
> -
>
> Key: HIVE-20867
> URL: https://issues.apache.org/jira/browse/HIVE-20867
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HIVE-20867) Rewrite INTERSECT into LEFT SEMI JOIN instead of UNION + Group by

2018-11-05 Thread Gopal V (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16675678#comment-16675678
 ] 

Gopal V edited comment on HIVE-20867 at 11/5/18 7:41 PM:
-

bq. Comparing with left-semi join one, we need to do the join one by one.

That is true at the logical level, but the physical runtime uses the bloom 
filter semi-join before data gets shuffled, and the join operations do not 
force a full sort (while the group-by does).

The current implementation's Achilles' heel is when the two branches are unequal 
in size - if you run something like

{code}
select ss_item_sk as k from store_sales
intersect
select i_item_sk as k from item where i_category = 'Sports'
{code}

vs

{code}
select distinct ss_item_sk as k from store_sales where ss_item_sk IN (select 
i_item_sk as k from item where i_category = 'Sports')
{code}

you can see the vast difference between the two approaches.

However, I think this might need a null-safe semi-join, because within 
INTERSECT, null == null.


was (Author: gopalv):
bq. Comparing with left-semi join one, we need to do the join one by one.

That is true at the logical level, but the physical runtime uses the bloom 
filter semi-join before data gets shuffled, and the join operations do not 
force a full sort (while the group-by does).

The current implementation's Achilles' heel is when the two branches are unequal 
in size - if you run something like

{code}
select ss_item_sk as k from store_sales
intersect
select i_item_sk as k from item where i_category = 'Sports'
{code}

vs

{code}
select ss_item_sk as k from store_sales where ss_item_sk IN (select i_item_sk 
as k from item where i_category = 'Sports')
{code}

you can see the vast difference between the two approaches.

However, I think this might need a null-safe semi-join, because within 
INTERSECT, null == null.

> Rewrite INTERSECT into LEFT SEMI JOIN instead of UNION + Group by
> -
>
> Key: HIVE-20867
> URL: https://issues.apache.org/jira/browse/HIVE-20867
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20833) package.jdo needs to be updated to conform with HIVE-20221 changes

2018-11-05 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16675689#comment-16675689
 ] 

Hive QA commented on HIVE-20833:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
34s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
22s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
25s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
44s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
45s{color} | {color:blue} ql in master has 2315 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  1m  
0s{color} | {color:blue} standalone-metastore/metastore-server in master has 
185 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
16s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
9s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
14s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 27m 35s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-14751/dev-support/hive-personality.sh
 |
| git revision | master / b789aeb |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: itests ql standalone-metastore/metastore-server U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14751/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> package.jdo needs to be updated to conform with HIVE-20221 changes
> --
>
> Key: HIVE-20833
> URL: https://issues.apache.org/jira/browse/HIVE-20833
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20833.1.patch, HIVE-20833.2.patch, 
> HIVE-20833.3.patch, HIVE-20833.4.patch, HIVE-20833.5.patch
>
>
> Following test if run with TestMiniLlapLocalCliDriver will fail:
> {code:sql}
> CREATE TABLE `alterPartTbl`(
>`po_header_id` bigint,
>`vendor_num` string,
>`requester_name` string,
>`approver_name` string,
>`buyer_name` string,
>`preparer_name` string,
>`po_requisition_number` string,
>`po_requisition_id` bigint,
>`po_requisition_desc` string,
>`rate_type` string,
>`rate_date` date,
>`rate` double,
>`blanket_total_amount` double,
>`authorization_status` string,
>`revision_num` bigint,
>`revised_date` date,
>`approved_flag` string,
>`approved

[jira] [Commented] (HIVE-20854) Sensible Defaults: Hive's Zookeeper heartbeat interval is 20 minutes, change to 2

2018-11-05 Thread Peter Vary (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16675722#comment-16675722
 ] 

Peter Vary commented on HIVE-20854:
---

[~gopalv]: You might be interested in the discussion here: HIVE-14979.
[~thejas] wrote at that time:
{quote}
Regarding the session timeout -
Looks like the original setting for the session timeout was 10 mins, and 
HIVE-9119 changed it to 20 mins. 
In case of zookeeper service discovery, it is not a major issue if the entry in 
zookeeper stays around for longer. Larger timeout can provide better resilience 
against temporary gc or network issues. 10 mins might be still OK for this 
purpose.
{quote}
I am not sure whether the points mentioned are still valid, but I hope this 
information helps.

Thanks,
Peter

> Sensible Defaults: Hive's Zookeeper heartbeat interval is 20 minutes, change 
> to 2
> -
>
> Key: HIVE-20854
> URL: https://issues.apache.org/jira/browse/HIVE-20854
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Major
> Attachments: HIVE-20854.1.patch
>
>
> {code}
> HIVE_ZOOKEEPER_SESSION_TIMEOUT("hive.zookeeper.session.timeout", 
> "120ms",
> new TimeValidator(TimeUnit.MILLISECONDS),
> "ZooKeeper client's session timeout (in milliseconds). The client is 
> disconnected, and as a result, all locks released, \n" +
> "if a heartbeat is not sent in the timeout."),
> {code}
> That is 1,200,000 ms, which is too long for all practical purposes - a 20-minute 
> outage when a node fails is too long for JDBC load balancing, LLAP failure 
> tolerance, and lock-manager expiry.
> Change the default to 2 minutes.
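For illustration, the proposed override would look roughly like the following (a sketch;
in practice this is usually configured in hive-site.xml and picked up when HiveServer2
restarts, rather than set per session):

{code}
-- Sketch: lower the ZooKeeper session timeout from the 20-minute default
-- to 2 minutes (120000 ms).
set hive.zookeeper.session.timeout=120000ms;
{code}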



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20833) package.jdo needs to be updated to conform with HIVE-20221 changes

2018-11-05 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16675716#comment-16675716
 ] 

Hive QA commented on HIVE-20833:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12946940/HIVE-20833.5.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 15525 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[timestamptz_2] 
(batchId=85)
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[kafka_storage_handler]
 (batchId=194)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14751/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14751/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14751/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12946940 - PreCommit-HIVE-Build

> package.jdo needs to be updated to conform with HIVE-20221 changes
> --
>
> Key: HIVE-20833
> URL: https://issues.apache.org/jira/browse/HIVE-20833
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20833.1.patch, HIVE-20833.2.patch, 
> HIVE-20833.3.patch, HIVE-20833.4.patch, HIVE-20833.5.patch
>
>
> Following test if run with TestMiniLlapLocalCliDriver will fail:
> {code:sql}
> CREATE TABLE `alterPartTbl`(
>`po_header_id` bigint,
>`vendor_num` string,
>`requester_name` string,
>`approver_name` string,
>`buyer_name` string,
>`preparer_name` string,
>`po_requisition_number` string,
>`po_requisition_id` bigint,
>`po_requisition_desc` string,
>`rate_type` string,
>`rate_date` date,
>`rate` double,
>`blanket_total_amount` double,
>`authorization_status` string,
>`revision_num` bigint,
>`revised_date` date,
>`approved_flag` string,
>`approved_date` timestamp,
>`amount_limit` double,
>`note_to_authorizer` string,
>`note_to_vendor` string,
>`note_to_receiver` string,
>`vendor_order_num` string,
>`comments` string,
>`acceptance_required_flag` string,
>`acceptance_due_date` date,
>`closed_date` timestamp,
>`user_hold_flag` string,
>`approval_required_flag` string,
>`cancel_flag` string,
>`firm_status_lookup_code` string,
>`firm_date` date,
>`frozen_flag` string,
>`closed_code` string,
>`org_id` bigint,
>`reference_num` string,
>`wf_item_type` string,
>`wf_item_key` string,
>`submit_date` date,
>`sap_company_code` string,
>`sap_fiscal_year` bigint,
>`po_number` string,
>`sap_line_item` bigint,
>`closed_status_flag` string,
>`balancing_segment` string,
>`cost_center_segment` string,
>`base_amount_limit` double,
>`base_blanket_total_amount` double,
>`base_open_amount` double,
>`base_ordered_amount` double,
>`cancel_date` timestamp,
>`cbc_accounting_date` date,
>`change_requested_by` string,
>`change_summary` string,
>`confirming_order_flag` string,
>`document_creation_method` string,
>`edi_processed_flag` string,
>`edi_processed_status` string,
>`enabled_flag` string,
>`encumbrance_required_flag` string,
>`end_date` date,
>`end_date_active` date,
>`from_header_id` bigint,
>`from_type_lookup_code` string,
>`global_agreement_flag` string,
>`government_context` string,
>`interface_source_code` string,
>`ledger_currency_code` string,
>`open_amount` double,
>`ordered_amount` double,
>`pay_on_code` string,
>`payment_term_name` string,
>`pending_signature_flag` string,
>`po_revision_num` double,
>`preparer_id` bigint,
>`price_update_tolerance` double,
>`print_count` double,
>`printed_date` date,
>`reply_date` date,
>`reply_method_lookup_code` string,
>`rfq_close_date` date,
>`segment2` string,
>`segment3` string,
>`segment4` string,
>`segment5` string,
>`shipping_control` string,
>`start_date` date,
>`start_date_active` date,
>`summary_flag` string,
>`supply_agreement_flag` string,
>`usd_amount_limit` double,
>`usd_blanket_total_amount` double,
>`usd_exchange_rate` double,
>`usd_open

[jira] [Assigned] (HIVE-20189) Separate metastore client code into its own module

2018-11-05 Thread Peter Vary (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary reassigned HIVE-20189:
-

Assignee: Alexander Kolbasov  (was: Peter Vary)

> Separate metastore client code into its own module
> --
>
> Key: HIVE-20189
> URL: https://issues.apache.org/jira/browse/HIVE-20189
> Project: Hive
>  Issue Type: Sub-task
>  Components: Standalone Metastore
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Alexander Kolbasov
>Assignee: Alexander Kolbasov
>Priority: Major
> Attachments: HIVE-20189.01.patch, HIVE-20189.03.patch, 
> HIVE-20189.04.patch, HIVE-20189.05.patch, HIVE-20189.06.patch
>
>
> The goal of this JIRA is to split HiveMetastoreClient code out of 
> metastore-common. This is a pom-only change that does not require any changes 
> in the code.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20189) Separate metastore client code into its own module

2018-11-05 Thread Peter Vary (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16675730#comment-16675730
 ] 

Peter Vary commented on HIVE-20189:
---

[~karthik.manamcheri]: you might be interested in this patch - it needs rebasing 
and pushing through pre-commit. I am happy to review and commit it, but I do not 
have time to do the work myself.

Thanks,

Peter

> Separate metastore client code into its own module
> --
>
> Key: HIVE-20189
> URL: https://issues.apache.org/jira/browse/HIVE-20189
> Project: Hive
>  Issue Type: Sub-task
>  Components: Standalone Metastore
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Alexander Kolbasov
>Assignee: Alexander Kolbasov
>Priority: Major
> Attachments: HIVE-20189.01.patch, HIVE-20189.03.patch, 
> HIVE-20189.04.patch, HIVE-20189.05.patch, HIVE-20189.06.patch
>
>
> The goal of this JIRA is to split HiveMetastoreClient code out of 
> metastore-common. This is a pom-only change that does not require any changes 
> in the code.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-20831) Add Session ID to Operation Logging

2018-11-05 Thread Roohi Syeda (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roohi Syeda reassigned HIVE-20831:
--

Assignee: Roohi Syeda

> Add Session ID to Operation Logging
> ---
>
> Key: HIVE-20831
> URL: https://issues.apache.org/jira/browse/HIVE-20831
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.1.0, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: Roohi Syeda
>Priority: Major
>  Labels: newbie, noob
>
> {code:java|title=OperationManager.java}
> LOG.info("Adding operation: " + operation.getHandle());
> {code}
> Please add additional logging to explicitly state which Hive session this 
> operation is being added to.
> https://github.com/apache/hive/blob/3963c729fabf90009cb67d277d40fe5913936358/service/src/java/org/apache/hive/service/cli/operation/OperationManager.java#L201



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20842) Fix logic introduced in HIVE-20660 to estimate statistics for group by

2018-11-05 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16675747#comment-16675747
 ] 

Hive QA commented on HIVE-20842:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
40s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
0s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
40s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
47s{color} | {color:blue} ql in master has 2315 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
56s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
38s{color} | {color:red} ql: The patch generated 3 new + 29 unchanged - 0 fixed 
= 32 total (was 29) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 23m  7s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-14752/dev-support/hive-personality.sh
 |
| git revision | master / b789aeb |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14752/yetus/diff-checkstyle-ql.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14752/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Fix logic introduced in HIVE-20660 to estimate statistics for group by
> --
>
> Key: HIVE-20842
> URL: https://issues.apache.org/jira/browse/HIVE-20842
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20842.1.patch, HIVE-20842.2.patch, 
> HIVE-20842.3.patch, HIVE-20842.4.patch
>
>
> HIVE-20660 introduced better estimation for the group by operator, but the logic 
> did not account for partial and full group by separately.
> For partial group by, parallelism (i.e. the number of tasks) should be taken into 
> account.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20633) Incorrect column lineage: each output column has input from *all columns* of the input table

2018-11-05 Thread Madhan Neethiraj (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Madhan Neethiraj updated HIVE-20633:

Issue Type: Bug  (was: New Feature)

> Incorrect column lineage: each output column has input from *all columns* of 
> the input table
> 
>
> Key: HIVE-20633
> URL: https://issues.apache.org/jira/browse/HIVE-20633
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.2.2
>Reporter: Madhan Neethiraj
>Priority: Critical
>
> Column lineage details made available to post hooks are incorrect for certain 
> queries, such as the following INSERT:
> {noformat}
> CREATE TABLE source_tbl(col_001 INT, col_002 INT, col_003 INT);
> CREATE TABLE target_tbl(col_001 INT, col_002 INT, col_003 INT);
> INSERT INTO target_tbl SELECT v1.col_001, v1.col_002, v1.col_003 FROM (SELECT 
> col_001, col_002, col_003, ROW_NUMBER() OVER() AS r_num FROM source_tbl) v1;
> {noformat}
> Below are the details of the lineage given to post hooks (like the Atlas hook) 
> via HookContext.getLinfo(). It contains 3 entries, one for each target table 
> column. Note that the dependency for each column lists all columns of the source 
> table.
> {noformat}
> DependencyKey=default.target_tbl:FieldSchema(name:col_001, type:int, 
> comment:null)
> Dependency=[SCRIPT]
>[default.source_tbl(src):FieldSchema(name:col_001, type:int, 
> comment:null),
> default.source_tbl(src):FieldSchema(name:col_002, type:int, 
> comment:null),
> default.source_tbl(src):FieldSchema(name:col_003, type:int, 
> comment:null),
> 
> default.source_tbl(src):FieldSchema(name:BLOCK__OFFSET__INSIDE__FILE, 
> type:bigint, comment:),
> default.source_tbl(src):FieldSchema(name:INPUT__FILE__NAME, 
> type:string, comment:),
> default.source_tbl(src):FieldSchema(name:ROW__ID, 
> type:struct, comment:)
>];
>  
> DependencyKey=default.target_tbl:FieldSchema(name:col_002, type:int, 
> comment:null)
> Dependency=[SCRIPT]
>[default.source_tbl(src):FieldSchema(name:col_001, type:int, 
> comment:null),
> default.source_tbl(src):FieldSchema(name:col_002, type:int, 
> comment:null),
> default.source_tbl(src):FieldSchema(name:col_003, type:int, 
> comment:null),
> 
> default.source_tbl(src):FieldSchema(name:BLOCK__OFFSET__INSIDE__FILE, 
> type:bigint, comment:),
> default.source_tbl(src):FieldSchema(name:INPUT__FILE__NAME, 
> type:string, comment:),
> default.source_tbl(src):FieldSchema(name:ROW__ID, 
> type:struct, comment:)
>];
>  
> DependencyKey=default.target_tbl:FieldSchema(name:col_003, type:int, 
> comment:null)
> Dependency=[SCRIPT]
>[default.source_tbl(src):FieldSchema(name:col_001, type:int, 
> comment:null),
> default.source_tbl(src):FieldSchema(name:col_002, type:int, 
> comment:null),
> default.source_tbl(src):FieldSchema(name:col_003, type:int, 
> comment:null),
> 
> default.source_tbl(src):FieldSchema(name:BLOCK__OFFSET__INSIDE__FILE, 
> type:bigint, comment:),
> default.source_tbl(src):FieldSchema(name:INPUT__FILE__NAME, 
> type:string, comment:),
> default.source_tbl(src):FieldSchema(name:ROW__ID, 
> type:struct, comment:)
>];
> {noformat}
> When the INSERT statement doesn't include "ROW_NUMBER() OVER() AS r_num", the 
> lineage details look correct.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20867) Rewrite INTERSECT into LEFT SEMI JOIN instead of UNION + Group by

2018-11-05 Thread Pengcheng Xiong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16675765#comment-16675765
 ] 

Pengcheng Xiong commented on HIVE-20867:


Thanks, Gopal, for the explanation. I can see the potential benefit of using a 
left semi join over the existing implementation in some scenarios. If it is 
decided case by case, it may be better to add some cost-based metrics or a Hive 
configuration on which the decision can be made. That is only my suggestion; 
you can decide what to do in the end. :)

> Rewrite INTERSECT into LEFT SEMI JOIN instead of UNION + Group by
> -
>
> Key: HIVE-20867
> URL: https://issues.apache.org/jira/browse/HIVE-20867
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20842) Fix logic introduced in HIVE-20660 to estimate statistics for group by

2018-11-05 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16675785#comment-16675785
 ] 

Hive QA commented on HIVE-20842:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12946945/HIVE-20842.4.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 42 failed/errored test(s), 15525 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[kafka_storage_handler]
 (batchId=194)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[except_distinct] 
(batchId=155)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[explainuser_2] 
(batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[intersect_all] 
(batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[intersect_distinct]
 (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[intersect_merge] 
(batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[parallel_colstats]
 (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] 
(batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[correlationoptimizer2]
 (batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[explainanalyze_2]
 (batchId=177)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[explainuser_1]
 (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_in_having]
 (batchId=174)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_union2] 
(batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_union_multiinsert]
 (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union7] 
(batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[unionDistinct_3]
 (batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_groupby_reduce]
 (batchId=173)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorization_short_regress]
 (batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorized_ptf]
 (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[windowing] 
(batchId=172)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_explainuser_1]
 (batchId=190)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] 
(batchId=272)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query38] 
(batchId=272)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query49] 
(batchId=272)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query58] 
(batchId=272)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query5] 
(batchId=272)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query64] 
(batchId=272)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query70] 
(batchId=272)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query83] 
(batchId=272)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query87] 
(batchId=272)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[query23]
 (batchId=272)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[query38]
 (batchId=272)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[query45]
 (batchId=272)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[query49]
 (batchId=272)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[query58]
 (batchId=272)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[query66]
 (batchId=272)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[query70]
 (batchId=272)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[query75]
 (batchId=272)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[query80]
 (batchId=272)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[query83]
 (batchId=272)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[query87]
 (batchId=272)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[query8]
 (batchId=272)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14752/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14752/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14752/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.Y

[jira] [Commented] (HIVE-20486) Kafka: Use Row SerDe + vectorization

2018-11-05 Thread Vineet Garg (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16675787#comment-16675787
 ] 

Vineet Garg commented on HIVE-20486:


[~bslim] {{ kafka_storage_handler}} has been failing consistently in the last few 
runs, most likely caused by the commit for this JIRA. Can you please take a 
look?

Ref: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14752/testReport/org.apache.hadoop.hive.cli/TestMiniDruidCliDriver/testCliDriver_kafka_storage_handler_/

> Kafka: Use Row SerDe + vectorization
> 
>
> Key: HIVE-20486
> URL: https://issues.apache.org/jira/browse/HIVE-20486
> Project: Hive
>  Issue Type: Improvement
>  Components: kafka integration
>Reporter: Gopal V
>Assignee: slim bouguerra
>Priority: Major
>  Labels: kafka, vectorization
> Fix For: 4.0.0
>
> Attachments: HIVE-20486.3.patch, HIVE-20486.3.patch, 
> HIVE-20486.4.patch, HIVE-20486.4.patch, HIVE-20486.patch
>
>
> KafkaHandler returns unvectorized rows which causes the operators downstream 
> to be slower and sub-optimal.
> Hive has a vectorization shim which allows Kafka streams without complex 
> projections to be wrapped into a vectorized reader via 
> {{hive.vectorized.use.row.serde.deserialize}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HIVE-20486) Kafka: Use Row SerDe + vectorization

2018-11-05 Thread Vineet Garg (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16675787#comment-16675787
 ] 

Vineet Garg edited comment on HIVE-20486 at 11/5/18 9:43 PM:
-

[~bslim] {{kafka_storage_handler}} has been failing consistently in the last few 
runs, most likely caused by the commit for this JIRA. Can you please take a 
look?

Ref: 
[https://builds.apache.org/job/PreCommit-HIVE-Build/14752/testReport/org.apache.hadoop.hive.cli/TestMiniDruidCliDriver/testCliDriver_kafka_storage_handler_/]


was (Author: vgarg):
[~bslim] {{ kafka_storage_handler}} has been failing consistently in last few 
runs, most likely caused by the commit for this jira. Can you please take a 
look?

Ref: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14752/testReport/org.apache.hadoop.hive.cli/TestMiniDruidCliDriver/testCliDriver_kafka_storage_handler_/

> Kafka: Use Row SerDe + vectorization
> 
>
> Key: HIVE-20486
> URL: https://issues.apache.org/jira/browse/HIVE-20486
> Project: Hive
>  Issue Type: Improvement
>  Components: kafka integration
>Reporter: Gopal V
>Assignee: slim bouguerra
>Priority: Major
>  Labels: kafka, vectorization
> Fix For: 4.0.0
>
> Attachments: HIVE-20486.3.patch, HIVE-20486.3.patch, 
> HIVE-20486.4.patch, HIVE-20486.4.patch, HIVE-20486.patch
>
>
> KafkaHandler returns unvectorized rows which causes the operators downstream 
> to be slower and sub-optimal.
> Hive has a vectorization shim which allows Kafka streams without complex 
> projections to be wrapped into a vectorized reader via 
> {{hive.vectorized.use.row.serde.deserialize}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20839) "Cannot find field" error during dynamically partitioned hash join

2018-11-05 Thread Jason Dere (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-20839:
--
   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Committed to master

> "Cannot find field" error during dynamically partitioned hash join
> --
>
> Key: HIVE-20839
> URL: https://issues.apache.org/jira/browse/HIVE-20839
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Jason Dere
>Assignee: Jason Dere
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-20839.1.patch, HIVE-20839.2.patch, 
> HIVE-20839.3.patch, HIVE-20839.4.patch
>
>
> Occurs in some cases in the non-CBO optimized queries, either if CBO is 
> disabled or has failed due to error.
> {noformat}
> 2018-10-11T04:40:22,724 ERROR [TezTR-85144_8944_1085_28_996_2 
> (1539092085144_8944_1085_28_000996_2)] tez.ReduceRecordProcessor: Hit error 
> while closing operators - failing tree
> 2018-10-11T04:40:22,724 ERROR [TezTR-85144_8944_1085_28_996_2 
> (1539092085144_8944_1085_28_000996_2)] tez.TezProcessor: 
> java.lang.RuntimeException: cannot find field _col304 from [0:_col0, 1:_col1, 
> 2:_col2, 3:_col3, 4:_col4, 5:_col5, 6:_col6, 7:_col7, 8:_col8, 9:_col9, 
> 10:_col10, 11:_col11, 12:_col12, 13:_col13, 14:_col15, 15:_col16, 16:_col17, 
> 17:_col18, 18:_col19, 19:_col20, 20:_col21, 21:_col22, 22:_col23, 23:_col24, 
> 24:_col25, 25:_col26, 26:_col27, 27:_col28, 28:_col29, 29:_col30, 30:_col31, 
> 31:_col32, 32:_col33, 33:_col34, 34:_col35, 35:_col36, 36:_col37, 37:_col38, 
> 38:_col39, 39:_col40, 40:_col41, 41:_col42, 42:_col43, 43:_col44, 44:_col45, 
> 45:_col46, 46:_col47, 47:_col48, 48:_col49, 49:_col50, 50:_col51, 51:_col52, 
> 52:_col53, 53:_col54, 54:_col55, 55:_col56, 56:_col57, 57:_col58, 58:_col59, 
> 59:_col60, 60:_col61, 61:_col62, 62:_col63, 63:_col64, 64:_col65, 65:_col66, 
> 66:_col67, 67:_col68, 68:_col70, 69:_col72, 70:_col73, 71:_col74, 72:_col75, 
> 73:_col76, 74:_col77, 75:_col78, 76:_col79, 77:_col80, 78:_col81, 79:_col82, 
> 80:_col83, 81:_col84, 82:_col85, 83:_col86, 84:_col87, 85:_col88, 86:_col89, 
> 87:_col90, 88:_col91, 89:_col92, 90:_col93, 91:_col94, 92:_col95, 93:_col96, 
> 94:_col97, 95:_col98, 96:_col99, 97:_col100, 98:_col101, 99:_col102, 
> 100:_col103, 101:_col104, 102:_col105, 103:_col106, 104:_col107, 105:_col108, 
> 106:_col109, 107:_col110, 108:_col111, 109:_col112, 110:_col113, 111:_col114, 
> 112:_col115, 113:_col116, 114:_col117, 115:_col118, 116:_col119, 117:_col120, 
> 118:_col121, 119:_col122, 120:_col123, 121:_col124, 122:_col125, 123:_col126, 
> 124:_col127, 125:_col128, 126:_col129, 127:_col130, 128:_col131, 129:_col132, 
> 130:_col133, 131:_col134, 132:_col135, 133:_col136, 134:_col137, 135:_col138, 
> 136:_col139, 137:_col140, 138:_col141, 139:_col142, 140:_col143, 141:_col144, 
> 142:_col145, 143:_col146, 144:_col147, 145:_col148, 146:_col149, 147:_col150, 
> 148:_col151, 149:_col152, 150:_col153, 151:_col154, 152:_col155, 153:_col156, 
> 154:_col157, 155:_col158, 156:_col159, 157:_col160, 158:_col161, 159:_col162, 
> 160:_col163, 161:_col164, 162:_col165, 163:_col166, 164:_col167, 165:_col168, 
> 166:_col169, 167:_col170, 168:_col171, 169:_col318]
> at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:485)
> at 
> org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.getStructFieldRef(StandardStructObjectInspector.java:153)
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.initialize(ExprNodeColumnEvaluator.java:80)
> at 
> org.apache.hadoop.hive.ql.exec.JoinUtil.getObjectInspectorsFromEvaluators(JoinUtil.java:91)
> at 
> org.apache.hadoop.hive.ql.exec.AbstractMapJoinOperator.initializeOp(AbstractMapJoinOperator.java:74)
> at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.initializeOp(MapJoinOperator.java:144)
> at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:374)
> at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.init(ReduceRecordProcessor.java:195)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:188)
> at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:172)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:370)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.ja

[jira] [Updated] (HIVE-20868) SMB Join fails intermittently when TezDummyOperator has child op in getFinalOp in MapRecordProcessor

2018-11-05 Thread Deepak Jaiswal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-20868:
--
Status: Patch Available  (was: Open)

> SMB Join fails intermittently when TezDummyOperator has child op in 
> getFinalOp in MapRecordProcessor
> 
>
> Key: HIVE-20868
> URL: https://issues.apache.org/jira/browse/HIVE-20868
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-20868) SMB Join fails intermittently when TezDummyOperator has child op in getFinalOp in MapRecordProcessor

2018-11-05 Thread Deepak Jaiswal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal reassigned HIVE-20868:
-


> SMB Join fails intermittently when TezDummyOperator has child op in 
> getFinalOp in MapRecordProcessor
> 
>
> Key: HIVE-20868
> URL: https://issues.apache.org/jira/browse/HIVE-20868
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20868) SMB Join fails intermittently when TezDummyOperator has child op in getFinalOp in MapRecordProcessor

2018-11-05 Thread Deepak Jaiswal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-20868:
--
Attachment: HIVE-20868.1.patch

> SMB Join fails intermittently when TezDummyOperator has child op in 
> getFinalOp in MapRecordProcessor
> 
>
> Key: HIVE-20868
> URL: https://issues.apache.org/jira/browse/HIVE-20868
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-20868.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20751) Upgrade arrow version to 0.10.0

2018-11-05 Thread Dongjoon Hyun (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16675806#comment-16675806
 ] 

Dongjoon Hyun commented on HIVE-20751:
--

Hi, [~teddy.choi].
Could you update the fix version?

> Upgrade arrow version to 0.10.0
> ---
>
> Key: HIVE-20751
> URL: https://issues.apache.org/jira/browse/HIVE-20751
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 3.1.0
>Reporter: Shubham Chaurasia
>Assignee: Shubham Chaurasia
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20751.1.patch, HIVE-20751.2.patch
>
>
> We need to upgrade the Arrow version, as Spark is moving to Arrow 0.10.0 in 
> its upcoming release 2.4.0.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20842) Fix logic introduced in HIVE-20660 to estimate statistics for group by

2018-11-05 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16675807#comment-16675807
 ] 

Ashutosh Chauhan commented on HIVE-20842:
-

[~vgarg] Can you please create RB for it?

> Fix logic introduced in HIVE-20660 to estimate statistics for group by
> --
>
> Key: HIVE-20842
> URL: https://issues.apache.org/jira/browse/HIVE-20842
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20842.1.patch, HIVE-20842.2.patch, 
> HIVE-20842.3.patch, HIVE-20842.4.patch
>
>
> HIVE-20660 introduced better estimation for the group by operator, but the logic 
> did not account for partial and full group by separately.
> For partial group by, parallelism (i.e. the number of tasks) should be taken into 
> account.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20813) udf to_epoch_milli need to support timestamp without time zone as well

2018-11-05 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16675809#comment-16675809
 ] 

Hive QA commented on HIVE-20813:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
54s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
59s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
36s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
49s{color} | {color:blue} ql in master has 2315 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
53s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 22m 45s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-14753/dev-support/hive-personality.sh
 |
| git revision | master / 12a6f93 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14753/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> udf to_epoch_milli need to support timestamp without time zone as well
> --
>
> Key: HIVE-20813
> URL: https://issues.apache.org/jira/browse/HIVE-20813
> Project: Hive
>  Issue Type: Bug
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Major
> Attachments: HIVE-20813.patch, HIVE-20813.patch
>
>
> Currently the following query will fail with a cast exception (tries to cast 
> timestamp to timestamp with local timezone).
> {code}
>  select to_epoch_milli(current_timestamp)
> {code}
> As a simple fix we need to add support for timestamp object inspector.
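
As a rough illustration of that fix direction (a sketch only, using a hypothetical helper class rather than the attached patch, which changes the UDF itself), the argument check would accept a plain timestamp object inspector in addition to the timestamp-with-local-time-zone one:

{code:java}
import org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector;

// Hypothetical helper (not the actual patch): accept both timestamp flavours
// instead of only timestamp with local time zone.
final class EpochMilliArgCheck {
  static void check(ObjectInspector oi) throws UDFArgumentTypeException {
    switch (((PrimitiveObjectInspector) oi).getPrimitiveCategory()) {
      case TIMESTAMP:          // the case this ticket adds
      case TIMESTAMPLOCALTZ:   // already supported
        return;
      default:
        throw new UDFArgumentTypeException(0,
            "to_epoch_milli expects timestamp or timestamp with local time zone");
    }
  }
}
{code}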



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20868) SMB Join fails intermittently when TezDummyOperator has child op in getFinalOp in MapRecordProcessor

2018-11-05 Thread Deepak Jaiswal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-20868:
--
Description: In MapRecordProcessor::getFinalOp(), due to an external cause (not 
known), the TezDummyStoreOperator may intermittently have a MergeJoin op as a 
child. Due to this, fetchDone remains set to true for the DummyOp, left over 
from the previous task; ideally, fetchDone should be reset for each task. This 
eventually leads to the join op skipping rows from that dummy op, resulting in 
wrong results.
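
A minimal sketch of the per-task reset the description calls for (assumptions: hive-exec is on the classpath and TezDummyStoreOperator exposes the fetchDone flag via a setter; the real change lives inside MapRecordProcessor):

{code:java}
import java.util.List;
import org.apache.hadoop.hive.ql.exec.tez.TezDummyStoreOperator;

// Sketch only: clear fetchDone so a new task does not inherit the flag
// left behind by the previous task on the same dummy operator.
final class DummyOpReset {
  static void resetFetchDone(List<TezDummyStoreOperator> dummyOps) {
    for (TezDummyStoreOperator dummyOp : dummyOps) {
      dummyOp.setFetchDone(false);
    }
  }
}
{code}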

> SMB Join fails intermittently when TezDummyOperator has child op in 
> getFinalOp in MapRecordProcessor
> 
>
> Key: HIVE-20868
> URL: https://issues.apache.org/jira/browse/HIVE-20868
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-20868.1.patch
>
>
> In MapRecordProcessor::getFinalOp(), due to an external cause (not known), the 
> TezDummyStoreOperator may intermittently have a MergeJoin op as a child. Due to 
> this, fetchDone remains set to true for the DummyOp, left over from the 
> previous task; ideally, fetchDone should be reset for each task. This 
> eventually leads to the join op skipping rows from that dummy op, resulting in 
> wrong results.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20819) Leaking Metastore connections when HADOOP_USER_NAME environmental variable is set

2018-11-05 Thread Roohi Syeda (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roohi Syeda updated HIVE-20819:
---
Attachment: HIVE-20819.1.patch

> Leaking Metastore connections when HADOOP_USER_NAME environmental variable is 
> set
> -
>
> Key: HIVE-20819
> URL: https://issues.apache.org/jira/browse/HIVE-20819
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Roohi Syeda
>Assignee: Roohi Syeda
>Priority: Minor
> Attachments: HIVE-20819.1.patch
>
>
> Leaking Metastore connections when HADOOP_USER_NAME environmental variable is 
> set.
> The connections created are in ESTABLISHED state and never closed



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20819) Leaking Metastore connections when HADOOP_USER_NAME environmental variable is set

2018-11-05 Thread Roohi Syeda (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roohi Syeda updated HIVE-20819:
---
Status: Patch Available  (was: Open)

Use the same UGI for both the handler thread (sessionUGI) and background thread 
for the same session
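
As a sketch of that approach (illustrative only; the class and method below are hypothetical, not the attached patch): run the background work under the session's UserGroupInformation so it reuses the session's cached metastore connection instead of opening, and leaking, a new one.

{code:java}
import java.security.PrivilegedExceptionAction;
import org.apache.hadoop.security.UserGroupInformation;

// Hypothetical helper, not the attached patch: share the session UGI with
// background work so the same cached metastore connection is reused.
final class SessionUgiRunner {
  static <T> T runAsSessionUser(UserGroupInformation sessionUgi,
      PrivilegedExceptionAction<T> work) throws Exception {
    return sessionUgi.doAs(work);
  }
}
{code}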

> Leaking Metastore connections when HADOOP_USER_NAME environmental variable is 
> set
> -
>
> Key: HIVE-20819
> URL: https://issues.apache.org/jira/browse/HIVE-20819
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Roohi Syeda
>Assignee: Roohi Syeda
>Priority: Minor
> Attachments: HIVE-20819.1.patch
>
>
> Leaking Metastore connections when HADOOP_USER_NAME environmental variable is 
> set.
> The connections created are in ESTABLISHED state and never closed



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20858) Serializer is not correctly initialized with configuration in Utilities.createEmptyBuckets()

2018-11-05 Thread Daniel Dai (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16675836#comment-16675836
 ] 

Daniel Dai commented on HIVE-20858:
---

[~wzheng], the patch looks fine, but can you describe which issue you saw? That 
helps evaluate the impact of the patch. Also, is it possible to add a unit test?

> Serializer is not correctly initialized with configuration in 
> Utilities.createEmptyBuckets()
> 
>
> Key: HIVE-20858
> URL: https://issues.apache.org/jira/browse/HIVE-20858
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
>Priority: Major
> Attachments: HIVE-20858.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20744) Use SQL constraints to improve join reordering algorithm

2018-11-05 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16675837#comment-16675837
 ] 

Ashutosh Chauhan commented on HIVE-20744:
-

[~jcamachorodriguez] can you add RB for this?

> Use SQL constraints to improve join reordering algorithm
> 
>
> Key: HIVE-20744
> URL: https://issues.apache.org/jira/browse/HIVE-20744
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-20744.01.patch, HIVE-20744.02.patch, 
> HIVE-20744.patch
>
>
> Until now, join reordering costing was based entirely on stats stored for the 
> base tables and their columns. Now the optimizer can also rely on constraints. 
> Hence, this patch makes the join reordering costing use constraints and, if it 
> does not find any, fall back to the old code path.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20813) udf to_epoch_milli need to support timestamp without time zone as well

2018-11-05 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16675858#comment-16675858
 ] 

Hive QA commented on HIVE-20813:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12946953/HIVE-20813.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 15525 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[kafka_storage_handler]
 (batchId=194)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14753/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14753/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14753/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12946953 - PreCommit-HIVE-Build

> udf to_epoch_milli need to support timestamp without time zone as well
> --
>
> Key: HIVE-20813
> URL: https://issues.apache.org/jira/browse/HIVE-20813
> Project: Hive
>  Issue Type: Bug
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Major
> Attachments: HIVE-20813.patch, HIVE-20813.patch
>
>
> Currently the following query will fail with a cast exception (tries to cast 
> timestamp to timestamp with local timezone).
> {code}
>  select to_epoch_milli(current_timestamp)
> {code}
> As a simple fix we need to add support for timestamp object inspector.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20861) Pass queryId as the client CallerContext to Spark

2018-11-05 Thread Aihua Xu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16675860#comment-16675860
 ] 

Aihua Xu commented on HIVE-20861:
-

It seems SPARK-16759 has the limitation that "When running on Spark Yarn cluster 
mode, the driver is unable to pass 'spark.log.callerContext' to Yarn client and 
AM since Yarn client and AM have already started before the driver performs 
.config("spark.log.callerContext", "infoSpecifiedByUpstreamApp")." 

[~Weiqingy] I'm wondering if we could implement such a callerContext as a 
JobGroup within the SparkContext to solve this issue? [~stakiar] do you have any 
idea on this?
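
For reference, the job-group API mentioned above is set per job on the driver, so it does not hit the startup-time limitation of spark.log.callerContext. A minimal standalone sketch (the query id string is made up):

{code:java}
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class QueryIdJobGroupSketch {
  public static void main(String[] args) {
    JavaSparkContext jsc = new JavaSparkContext(
        new SparkConf().setAppName("hive-on-spark-example").setMaster("local[1]"));
    String queryId = "hive_example_query_id";  // hypothetical query id
    // Tag all jobs submitted from this point on with the Hive query id.
    jsc.setJobGroup(queryId, "Hive on Spark work for " + queryId);
    // ... the Spark jobs for this Hive query would be submitted here ...
    jsc.close();
  }
}
{code}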

> Pass queryId as the client CallerContext to Spark 
> --
>
> Key: HIVE-20861
> URL: https://issues.apache.org/jira/browse/HIVE-20861
> Project: Hive
>  Issue Type: Improvement
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
>
> SPARK-16759 exposes a way for the client to pass the client CallerContext 
> such as QueryId. For better debug, hive should pass queryId to spark.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20804) Further improvements to group by optimization with constraints

2018-11-05 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-20804:
---
Attachment: HIVE-20804.4.patch

> Further improvements to group by optimization with constraints
> --
>
> Key: HIVE-20804
> URL: https://issues.apache.org/jira/browse/HIVE-20804
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20804.1.patch, HIVE-20804.2.patch, 
> HIVE-20804.3.patch, HIVE-20804.4.patch
>
>
> Continuation of HIVE-17043



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20804) Further improvements to group by optimization with constraints

2018-11-05 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-20804:
---
Status: Patch Available  (was: Open)

> Further improvements to group by optimization with constraints
> --
>
> Key: HIVE-20804
> URL: https://issues.apache.org/jira/browse/HIVE-20804
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20804.1.patch, HIVE-20804.2.patch, 
> HIVE-20804.3.patch, HIVE-20804.4.patch
>
>
> Continuation of HIVE-17043



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HIVE-20861) Pass queryId as the client CallerContext to Spark

2018-11-05 Thread Aihua Xu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16675860#comment-16675860
 ] 

Aihua Xu edited comment on HIVE-20861 at 11/5/18 10:56 PM:
---

It seems SPARK-16759 has the limitation that "When running on Spark Yarn cluster 
mode, the driver is unable to pass 'spark.log.callerContext' to Yarn client and 
AM since Yarn client and AM have already started before the driver performs 
.config("spark.log.callerContext", "infoSpecifiedByUpstreamApp")." 

[~WeiqingYang] I'm wondering if we could implement such a callerContext as a 
JobGroup within the SparkContext to solve this issue? [~stakiar] do you have any 
idea on this?


was (Author: aihuaxu):
Seems spark-16759 has the limitation that "When running on Spark Yarn cluster 
mode, the driver is unable to pass 'spark.log.callerContext' to Yarn client and 
AM since Yarn client and AM have already started before the driver performs 
.config("spark.log.callerContext", "infoSpecifiedByUpstreamApp")." 

[~Weiqingy] I'm wondering if we could implement such callerContext in 
SparkContext as JobGroup within sparkContext to solve such issue? [~stakiar] do 
you have any idea on this?

> Pass queryId as the client CallerContext to Spark 
> --
>
> Key: HIVE-20861
> URL: https://issues.apache.org/jira/browse/HIVE-20861
> Project: Hive
>  Issue Type: Improvement
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
>
> SPARK-16759 exposes a way for the client to pass the client CallerContext 
> such as QueryId. For better debug, hive should pass queryId to spark.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20782) Cleaning some unused code

2018-11-05 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16675874#comment-16675874
 ] 

Hive QA commented on HIVE-20782:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
44s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
1s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
35s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
45s{color} | {color:blue} ql in master has 2315 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
34s{color} | {color:green} ql: The patch generated 0 new + 0 unchanged - 6 
fixed = 0 total (was 6) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
55s{color} | {color:green} ql generated 0 new + 2314 unchanged - 1 fixed = 2314 
total (was 2315) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 22m 30s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-14754/dev-support/hive-personality.sh
 |
| git revision | master / 12a6f93 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14754/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Cleaning some unused code
> -
>
> Key: HIVE-20782
> URL: https://issues.apache.org/jira/browse/HIVE-20782
> Project: Hive
>  Issue Type: Improvement
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Major
> Attachments: HIVE-20782.2.patch, HIVE-20782.2.patch, 
> HIVE-20782.3.patch, HIVE-20782.3.patch, HIVE-20782.patch
>
>
> I am making my way into the vectorization code and trying to understand the 
> APIs. I ran into this unused one; I guess it is not used anymore.
> [~ashutoshc], maybe you can explain, as you are the main contributor to this 
> file: {code} 
> a/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedSerde.java{code}
>  ?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20854) Sensible Defaults: Hive's Zookeeper heartbeat interval is 20 minutes, change to 2

2018-11-05 Thread Gopal V (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16675879#comment-16675879
 ] 

Gopal V commented on HIVE-20854:


Sure, I will have a clearer reading of that soon - I was debugging a prod outage 
lasting from 6:05 AM to 6:25 AM which "magically" resolved itself.

The 20 minute timeout ended up being too long for a prod outage, and this was 
specifically a customer running ~800ms average queries at 60+ queries/sec, so a 
lot of queries went missing.

bq.  So I think we should still keep the session.timeout in order of minutes.

That I agree with, but I think 2 minutes is a sane default instead of 20, since 
SLAs have gone from hours to seconds since Hive 1.


> Sensible Defaults: Hive's Zookeeper heartbeat interval is 20 minutes, change 
> to 2
> -
>
> Key: HIVE-20854
> URL: https://issues.apache.org/jira/browse/HIVE-20854
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Major
> Attachments: HIVE-20854.1.patch
>
>
> {code}
> HIVE_ZOOKEEPER_SESSION_TIMEOUT("hive.zookeeper.session.timeout", 
> "120ms",
> new TimeValidator(TimeUnit.MILLISECONDS),
> "ZooKeeper client's session timeout (in milliseconds). The client is 
> disconnected, and as a result, all locks released, \n" +
> "if a heartbeat is not sent in the timeout."),
> {code}
> That's 1,200,000ms which is too long for all practical purposes - a 20 minute 
> outage in case a node has a failure is too long.
> That is too long for the JDBC load-balancing, LLAP failure tolerance and the 
> lock manager expiry.
> Change to 2 minutes, as a sensible default
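
For illustration, this is what the proposed 2-minute default amounts to if set programmatically (a sketch only; normally this would simply be hive.zookeeper.session.timeout in hive-site.xml):

{code:java}
import java.util.concurrent.TimeUnit;
import org.apache.hadoop.hive.conf.HiveConf;

public class ZkSessionTimeoutSketch {
  public static void main(String[] args) {
    HiveConf conf = new HiveConf();
    // 2 minutes instead of the current 20-minute default.
    conf.setTimeVar(HiveConf.ConfVars.HIVE_ZOOKEEPER_SESSION_TIMEOUT, 2, TimeUnit.MINUTES);
    System.out.println(conf.getTimeVar(
        HiveConf.ConfVars.HIVE_ZOOKEEPER_SESSION_TIMEOUT, TimeUnit.MILLISECONDS) + " ms");
  }
}
{code}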



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-16839) Unbalanced calls to openTransaction/commitTransaction when alter the same partition concurrently

2018-11-05 Thread Guang Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-16839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guang Yang updated HIVE-16839:
--
Attachment: HIVE-16839.02.patch

> Unbalanced calls to openTransaction/commitTransaction when alter the same 
> partition concurrently
> 
>
> Key: HIVE-16839
> URL: https://issues.apache.org/jira/browse/HIVE-16839
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.1, 1.1.0
>Reporter: Nemon Lou
>Assignee: Guang Yang
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-16839.01.patch, HIVE-16839.02.patch
>
>
> SQL to reproduce:
> prepare:
> {noformat}
>  hdfs dfs -mkdir -p 
> /hzsrc/external/writing_dc/ltgsm/16e7a9b2-21a1-3f4f-8061-bc3395281627
>  1,create external table tb_ltgsm_external (id int) PARTITIONED by (cp 
> string,ld string);
> {noformat}
> open one beeline session and run these two SQL statements many times: 
> {noformat} 2,ALTER TABLE tb_ltgsm_external ADD IF NOT EXISTS PARTITION 
> (cp=2017060513,ld=2017060610);
>  3,ALTER TABLE tb_ltgsm_external PARTITION (cp=2017060513,ld=2017060610) SET 
> LOCATION 
> 'hdfs://hacluster/hzsrc/external/writing_dc/ltgsm/16e7a9b2-21a1-3f4f-8061-bc3395281627';
> {noformat}
> open another beeline session and run this SQL statement many times, at the same time:
> {noformat}
>  4,ALTER TABLE tb_ltgsm_external DROP PARTITION (cp=2017060513,ld=2017060610);
> {noformat}
> MetaStore logs:
> {noformat}
> 2017-06-06 21:58:34,213 | ERROR | pool-6-thread-197 | Retrying HMSHandler 
> after 2000 ms (attempt 1 of 10) with error: 
> javax.jdo.JDOObjectNotFoundException: No such database row
> FailedObject:49[OID]org.apache.hadoop.hive.metastore.model.MStorageDescriptor
>   at 
> org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:475)
>   at 
> org.datanucleus.api.jdo.JDOAdapter.getApiExceptionForNucleusException(JDOAdapter.java:1158)
>   at 
> org.datanucleus.state.JDOStateManager.isLoaded(JDOStateManager.java:3231)
>   at 
> org.apache.hadoop.hive.metastore.model.MStorageDescriptor.jdoGetcd(MStorageDescriptor.java)
>   at 
> org.apache.hadoop.hive.metastore.model.MStorageDescriptor.getCD(MStorageDescriptor.java:184)
>   at 
> org.apache.hadoop.hive.metastore.ObjectStore.convertToStorageDescriptor(ObjectStore.java:1282)
>   at 
> org.apache.hadoop.hive.metastore.ObjectStore.convertToStorageDescriptor(ObjectStore.java:1299)
>   at 
> org.apache.hadoop.hive.metastore.ObjectStore.convertToPart(ObjectStore.java:1680)
>   at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPartition(ObjectStore.java:1586)
>   at sun.reflect.GeneratedMethodAccessor35.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at 
> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:98)
>   at com.sun.proxy.$Proxy0.getPartition(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.HiveAlterHandler.alterPartitions(HiveAlterHandler.java:538)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_partitions(HiveMetaStore.java:3317)
>   at sun.reflect.GeneratedMethodAccessor37.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:102)
>   at com.sun.proxy.$Proxy12.alter_partitions(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_partitions.getResult(ThriftHiveMetastore.java:9963)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_partitions.getResult(ThriftHiveMetastore.java:9947)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1673)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.

[jira] [Commented] (HIVE-20512) Improve record and memory usage logging in SparkRecordHandler

2018-11-05 Thread Sahil Takiar (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16675889#comment-16675889
 ] 

Sahil Takiar commented on HIVE-20512:
-

I'm not sure why {{awaitTermination}} would be causing the tests to time out. Do 
they hang locally? If not, it could have just been a temporary test infra 
issue. The problem with calling {{shutdownNow}} directly is that it cancels any 
in-progress tasks by interrupting their threads. This can lead to 
spurious errors in the task logs, which can be confusing. It's generally 
recommended to follow the shutdown pattern outlined in 
[https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ExecutorService.html]
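
The two-phase shutdown pattern referenced above looks roughly like this (adapted from the ExecutorService javadoc; the 60-second waits are arbitrary):

{code:java}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;

public final class ShutdownUtil {
  public static void shutdownAndAwait(ExecutorService pool) {
    pool.shutdown();                     // stop accepting new tasks
    try {
      if (!pool.awaitTermination(60, TimeUnit.SECONDS)) {
        pool.shutdownNow();              // only now interrupt in-progress tasks
        if (!pool.awaitTermination(60, TimeUnit.SECONDS)) {
          System.err.println("Pool did not terminate");
        }
      }
    } catch (InterruptedException ie) {
      pool.shutdownNow();                // (re-)cancel if the current thread was interrupted
      Thread.currentThread().interrupt();
    }
  }
}
{code}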

> Improve record and memory usage logging in SparkRecordHandler
> -
>
> Key: HIVE-20512
> URL: https://issues.apache.org/jira/browse/HIVE-20512
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-20512.1.patch, HIVE-20512.2.patch, 
> HIVE-20512.3.patch, HIVE-20512.4.patch, HIVE-20512.5.patch, HIVE-20512.6.patch
>
>
> We currently log memory usage and # of records processed in Spark tasks, but 
> we should improve the methodology for how frequently we log this info. 
> Currently we use the following code:
> {code:java}
> private long getNextLogThreshold(long currentThreshold) {
> // A very simple counter to keep track of number of rows processed by the
> // reducer. It dumps
> // every 1 million times, and quickly before that
> if (currentThreshold >= 1000000) {
>   return currentThreshold + 1000000;
> }
> return 10 * currentThreshold;
>   }
> {code}
> The issue is that after a while, the increase by 10x factor means that you 
> have to process a huge # of records before this gets triggered.
> A better approach would be to log this info at a given interval. This would 
> help in debugging tasks that are seemingly hung.
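
A minimal sketch of that interval-based idea (not the actual patch; the interval, message format, and simulated loop are made up): a scheduled task logs the running record count every few seconds, independent of how fast rows arrive.

{code:java}
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class IntervalLoggingSketch {
  public static void main(String[] args) throws InterruptedException {
    AtomicLong rowCount = new AtomicLong();
    ScheduledExecutorService logger = Executors.newSingleThreadScheduledExecutor();
    // Log the running count every 5 seconds, regardless of the row rate.
    logger.scheduleAtFixedRate(
        () -> System.out.println("processed " + rowCount.get() + " records"),
        5, 5, TimeUnit.SECONDS);
    // Simulated record-processing loop standing in for the Spark record handler.
    for (int i = 0; i < 1_000_000; i++) {
      rowCount.incrementAndGet();
      if (i % 100_000 == 0) {
        Thread.sleep(1000);
      }
    }
    logger.shutdown();
  }
}
{code}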



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20853) Expose ShuffleHandler.registerDag in the llap daemon API

2018-11-05 Thread Jaume M (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jaume M updated HIVE-20853:
---
Attachment: HIVE-20838.3.patch
Status: Patch Available  (was: Open)

> Expose ShuffleHandler.registerDag in the llap daemon API
> 
>
> Key: HIVE-20853
> URL: https://issues.apache.org/jira/browse/HIVE-20853
> Project: Hive
>  Issue Type: Improvement
>  Components: llap
>Affects Versions: 3.1.0
>Reporter: Jaume M
>Assignee: Jaume M
>Priority: Critical
> Attachments: HIVE-20853.1.patch, HIVE-20853.2.patch
>
>
> Currently, DAGs are only registered when submitWork is called for that DAG. 
> At this point the credentials are added to the ShuffleHandler and it can 
> start serving.
> However, Tez might (and will) schedule tasks to fetch from the ShuffleHandler 
> before any of this happens, and all these tasks will fail, which may 
> result in the query failing.
> This happens in the scenario in which a LlapDaemon has just come up and Tez 
> fetchers try to open a connection before a DAG has been registered.
> Adding this API will allow the DAG to be registered with the Daemon when the AM 
> notices that a new Daemon is up.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20853) Expose ShuffleHandler.registerDag in the llap daemon API

2018-11-05 Thread Jaume M (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jaume M updated HIVE-20853:
---
Status: Open  (was: Patch Available)

> Expose ShuffleHandler.registerDag in the llap daemon API
> 
>
> Key: HIVE-20853
> URL: https://issues.apache.org/jira/browse/HIVE-20853
> Project: Hive
>  Issue Type: Improvement
>  Components: llap
>Affects Versions: 3.1.0
>Reporter: Jaume M
>Assignee: Jaume M
>Priority: Critical
> Attachments: HIVE-20853.1.patch, HIVE-20853.2.patch
>
>
> Currently, DAGs are only registered when submitWork is called for that DAG. 
> At this point the credentials are added to the ShuffleHandler and it can 
> start serving.
> However, Tez might (and will) schedule tasks to fetch from the ShuffleHandler 
> before any of this happens, and all these tasks will fail, which may 
> result in the query failing.
> This happens in the scenario in which a LlapDaemon has just come up and Tez 
> fetchers try to open a connection before a DAG has been registered.
> Adding this API will allow the DAG to be registered with the Daemon when the AM 
> notices that a new Daemon is up.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20853) Expose ShuffleHandler.registerDag in the llap daemon API

2018-11-05 Thread Jaume M (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jaume M updated HIVE-20853:
---
Attachment: (was: HIVE-20838.3.patch)

> Expose ShuffleHandler.registerDag in the llap daemon API
> 
>
> Key: HIVE-20853
> URL: https://issues.apache.org/jira/browse/HIVE-20853
> Project: Hive
>  Issue Type: Improvement
>  Components: llap
>Affects Versions: 3.1.0
>Reporter: Jaume M
>Assignee: Jaume M
>Priority: Critical
> Attachments: HIVE-20853.1.patch, HIVE-20853.2.patch
>
>
> Currently, DAGs are only registered when submitWork is called for that DAG. 
> At this point the credentials are added to the ShuffleHandler and it can 
> start serving.
> However, Tez might (and will) schedule tasks to fetch from the ShuffleHandler 
> before any of this happens, and all these tasks will fail, which may 
> result in the query failing.
> This happens in the scenario in which a LlapDaemon has just come up and Tez 
> fetchers try to open a connection before a DAG has been registered.
> Adding this API will allow the DAG to be registered with the Daemon when the AM 
> notices that a new Daemon is up.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20853) Expose ShuffleHandler.registerDag in the llap daemon API

2018-11-05 Thread Jaume M (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jaume M updated HIVE-20853:
---
Attachment: HIVE-20853.3.patch
Status: Patch Available  (was: Open)

> Expose ShuffleHandler.registerDag in the llap daemon API
> 
>
> Key: HIVE-20853
> URL: https://issues.apache.org/jira/browse/HIVE-20853
> Project: Hive
>  Issue Type: Improvement
>  Components: llap
>Affects Versions: 3.1.0
>Reporter: Jaume M
>Assignee: Jaume M
>Priority: Critical
> Attachments: HIVE-20853.1.patch, HIVE-20853.2.patch, 
> HIVE-20853.3.patch
>
>
> Currently, DAGs are only registered when submitWork is called for that DAG. 
> At this point the credentials are added to the ShuffleHandler and it can 
> start serving.
> However, Tez might (and will) schedule tasks to fetch from the ShuffleHandler 
> before any of this happens, and all these tasks will fail, which may 
> result in the query failing.
> This happens in the scenario in which a LlapDaemon has just come up and Tez 
> fetchers try to open a connection before a DAG has been registered.
> Adding this API will allow the DAG to be registered with the Daemon when the AM 
> notices that a new Daemon is up.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20853) Expose ShuffleHandler.registerDag in the llap daemon API

2018-11-05 Thread Jaume M (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jaume M updated HIVE-20853:
---
Status: Open  (was: Patch Available)

> Expose ShuffleHandler.registerDag in the llap daemon API
> 
>
> Key: HIVE-20853
> URL: https://issues.apache.org/jira/browse/HIVE-20853
> Project: Hive
>  Issue Type: Improvement
>  Components: llap
>Affects Versions: 3.1.0
>Reporter: Jaume M
>Assignee: Jaume M
>Priority: Critical
> Attachments: HIVE-20853.1.patch, HIVE-20853.2.patch, 
> HIVE-20853.3.patch
>
>
> Currently, DAGs are only registered when submitWork is called for that DAG. 
> At this point the credentials are added to the ShuffleHandler and it can 
> start serving.
> However, Tez might (and will) schedule tasks to fetch from the ShuffleHandler 
> before any of this happens, and all these tasks will fail, which may 
> result in the query failing.
> This happens in the scenario in which a LlapDaemon has just come up and Tez 
> fetchers try to open a connection before a DAG has been registered.
> Adding this API will allow the DAG to be registered with the Daemon when the AM 
> notices that a new Daemon is up.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20838) Timestamps with timezone are set to null when using the streaming API

2018-11-05 Thread Jaume M (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jaume M updated HIVE-20838:
---
Status: Open  (was: Patch Available)

> Timestamps with timezone are set to null when using the streaming API
> -
>
> Key: HIVE-20838
> URL: https://issues.apache.org/jira/browse/HIVE-20838
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.0
>Reporter: Jaume M
>Assignee: Jaume M
>Priority: Major
> Attachments: HIVE-20838.1.patch, HIVE-20838.2.patch, 
> HIVE-20838.3.patch, HIVE-20838.3.patch, HIVE-20838.4.patch, HIVE-20838.5.patch
>
>
> For example:
> {code}
> beeline> create table default.timest (a TIMESTAMP) stored as orc
> TBLPROPERTIES('transactional'='true');
> # And then:
> connection.write("2018-10-19 10:35:00 America/Los_Angeles".getBytes());
> {code}
> inserts NULL.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20838) Timestamps with timezone are set to null when using the streaming API

2018-11-05 Thread Jaume M (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jaume M updated HIVE-20838:
---
Attachment: HIVE-20838.6.patch
Status: Patch Available  (was: Open)

> Timestamps with timezone are set to null when using the streaming API
> -
>
> Key: HIVE-20838
> URL: https://issues.apache.org/jira/browse/HIVE-20838
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.0
>Reporter: Jaume M
>Assignee: Jaume M
>Priority: Major
> Attachments: HIVE-20838.1.patch, HIVE-20838.2.patch, 
> HIVE-20838.3.patch, HIVE-20838.3.patch, HIVE-20838.4.patch, 
> HIVE-20838.5.patch, HIVE-20838.6.patch
>
>
> For example:
> {code}
> beeline> create table default.timest (a TIMESTAMP) stored as orc
> TBLPROPERTIES('transactional'='true');
> # And then:
> connection.write("2018-10-19 10:35:00 America/Los_Angeles".getBytes());
> {code}
> inserts NULL.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20782) Cleaning some unused code

2018-11-05 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16675909#comment-16675909
 ] 

Hive QA commented on HIVE-20782:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12946954/HIVE-20782.3.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 15525 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[kafka_storage_handler]
 (batchId=194)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14754/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14754/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14754/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12946954 - PreCommit-HIVE-Build

> Cleaning some unused code
> -
>
> Key: HIVE-20782
> URL: https://issues.apache.org/jira/browse/HIVE-20782
> Project: Hive
>  Issue Type: Improvement
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Major
> Attachments: HIVE-20782.2.patch, HIVE-20782.2.patch, 
> HIVE-20782.3.patch, HIVE-20782.3.patch, HIVE-20782.patch
>
>
> I am making my way into the vectorization code and trying to understand the 
> APIs. I ran into this unused one; I guess it is not used anymore.
> [~ashutoshc], maybe you can explain, as you are the main contributor to this 
> file: {code} 
> a/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedSerde.java{code}
>  ?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20858) Serializer is not correctly initialized with configuration in Utilities.createEmptyBuckets()

2018-11-05 Thread Wei Zheng (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16675919#comment-16675919
 ] 

Wei Zheng commented on HIVE-20858:
--

Nothing specific. I discovered this bug when reviewing another ticket, 
HIVE-9651. They suffer from the same initialization issue.

> Serializer is not correctly initialized with configuration in 
> Utilities.createEmptyBuckets()
> 
>
> Key: HIVE-20858
> URL: https://issues.apache.org/jira/browse/HIVE-20858
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
>Priority: Major
> Attachments: HIVE-20858.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20822) Improvements to push computation to JDBC from Calcite

2018-11-05 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16675927#comment-16675927
 ] 

Hive QA commented on HIVE-20822:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
41s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
14s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
0s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
39s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
39s{color} | {color:blue} ql in master has 2315 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
37s{color} | {color:red} ql: The patch generated 6 new + 175 unchanged - 0 
fixed = 181 total (was 175) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
1s{color} | {color:red} The patch has 552 line(s) that end in whitespace. Use 
git apply --whitespace=fix <>. Refer 
https://git-scm.com/docs/git-apply {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
7s{color} | {color:red} The patch 142 line(s) with tabs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 22m 54s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-14755/dev-support/hive-personality.sh
 |
| git revision | master / 12a6f93 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14755/yetus/diff-checkstyle-ql.txt
 |
| whitespace | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14755/yetus/whitespace-eol.txt
 |
| whitespace | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14755/yetus/whitespace-tabs.txt
 |
| modules | C: itests ql U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14755/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Improvements to push computation to JDBC from Calcite
> -
>
> Key: HIVE-20822
> URL: https://issues.apache.org/jira/browse/HIVE-20822
> Project: Hive
>  Issue Type: Improvement
>  Components: StorageHandler
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-20822.01.patch, HIVE-20822.02.patch, 
> HIVE-20822.02.patch, HIVE-20822.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20486) Kafka: Use Row SerDe + vectorization

2018-11-05 Thread slim bouguerra (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16675926#comment-16675926
 ] 

slim bouguerra commented on HIVE-20486:
---

okay

> Kafka: Use Row SerDe + vectorization
> 
>
> Key: HIVE-20486
> URL: https://issues.apache.org/jira/browse/HIVE-20486
> Project: Hive
>  Issue Type: Improvement
>  Components: kafka integration
>Reporter: Gopal V
>Assignee: slim bouguerra
>Priority: Major
>  Labels: kafka, vectorization
> Fix For: 4.0.0
>
> Attachments: HIVE-20486.3.patch, HIVE-20486.3.patch, 
> HIVE-20486.4.patch, HIVE-20486.4.patch, HIVE-20486.patch
>
>
> KafkaHandler returns unvectorized rows which causes the operators downstream 
> to be slower and sub-optimal.
> Hive has a vectorization shim which allows Kafka streams without complex 
> projections to be wrapped into a vectorized reader via 
> {{hive.vectorized.use.row.serde.deserialize}}.
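
For reference, a minimal way to turn that shim on programmatically (a sketch; the property name is the one quoted above, and normally it would just be set in the session or hive-site.xml):

{code:java}
import org.apache.hadoop.hive.conf.HiveConf;

public class RowSerdeVectorizationSketch {
  public static void main(String[] args) {
    HiveConf conf = new HiveConf();
    // Wrap non-vectorized row-mode readers (such as the Kafka handler's)
    // in the row-serde vectorization shim.
    conf.setBoolean("hive.vectorized.use.row.serde.deserialize", true);
    System.out.println(conf.get("hive.vectorized.use.row.serde.deserialize"));
  }
}
{code}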



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-20869) Fix test results file

2018-11-05 Thread slim bouguerra (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra reassigned HIVE-20869:
-


> Fix test results file
> -
>
> Key: HIVE-20869
> URL: https://issues.apache.org/jira/browse/HIVE-20869
> Project: Hive
>  Issue Type: Sub-task
>  Components: kafka integration
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Major
>
> It seems that between the time the tests ran and HIVE-20486 was merged, a new 
> Hive property was added: 
> {code}
> discover.partitions true
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20869) Fix test results file

2018-11-05 Thread slim bouguerra (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-20869:
--
Status: Patch Available  (was: Open)

> Fix test results file
> -
>
> Key: HIVE-20869
> URL: https://issues.apache.org/jira/browse/HIVE-20869
> Project: Hive
>  Issue Type: Sub-task
>  Components: kafka integration
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Major
> Attachments: HIVE-20869.patch
>
>
> It seems that between the time the tests ran and HIVE-20486 was merged, a new 
> Hive property was added: 
> {code}
> discover.partitions true
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20869) Fix test results file

2018-11-05 Thread slim bouguerra (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-20869:
--
Attachment: HIVE-20869.patch

> Fix test results file
> -
>
> Key: HIVE-20869
> URL: https://issues.apache.org/jira/browse/HIVE-20869
> Project: Hive
>  Issue Type: Sub-task
>  Components: kafka integration
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Major
> Attachments: HIVE-20869.patch
>
>
> It seems that between the time the tests ran and HIVE-20486 was merged, a new 
> Hive property was added: 
> {code}
> discover.partitions true
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20858) Serializer is not correctly initialized with configuration in Utilities.createEmptyBuckets()

2018-11-05 Thread Daniel Dai (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16675956#comment-16675956
 ] 

Daniel Dai commented on HIVE-20858:
---

+1.

> Serializer is not correctly initialized with configuration in 
> Utilities.createEmptyBuckets()
> 
>
> Key: HIVE-20858
> URL: https://issues.apache.org/jira/browse/HIVE-20858
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
>Priority: Major
> Attachments: HIVE-20858.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20822) Improvements to push computation to JDBC from Calcite

2018-11-05 Thread Daniel Dai (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16675965#comment-16675965
 ] 

Daniel Dai commented on HIVE-20822:
---

Thanks Jesus for the fix. What about the Calcite dependency? Do we rely on a new 
Calcite version to make it work? Are the ptest failures because of this?

> Improvements to push computation to JDBC from Calcite
> -
>
> Key: HIVE-20822
> URL: https://issues.apache.org/jira/browse/HIVE-20822
> Project: Hive
>  Issue Type: Improvement
>  Components: StorageHandler
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-20822.01.patch, HIVE-20822.02.patch, 
> HIVE-20822.02.patch, HIVE-20822.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20822) Improvements to push computation to JDBC from Calcite

2018-11-05 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16675970#comment-16675970
 ] 

Hive QA commented on HIVE-20822:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12946957/HIVE-20822.02.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 15526 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[kafka_storage_handler]
 (batchId=194)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[external_jdbc_table_perf]
 (batchId=181)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14755/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14755/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14755/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12946957 - PreCommit-HIVE-Build

> Improvements to push computation to JDBC from Calcite
> -
>
> Key: HIVE-20822
> URL: https://issues.apache.org/jira/browse/HIVE-20822
> Project: Hive
>  Issue Type: Improvement
>  Components: StorageHandler
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-20822.01.patch, HIVE-20822.02.patch, 
> HIVE-20822.02.patch, HIVE-20822.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20869) Fix test results file

2018-11-05 Thread slim bouguerra (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16675979#comment-16675979
 ] 

slim bouguerra commented on HIVE-20869:
---

cc [~vgarg]

> Fix test results file
> -
>
> Key: HIVE-20869
> URL: https://issues.apache.org/jira/browse/HIVE-20869
> Project: Hive
>  Issue Type: Sub-task
>  Components: kafka integration
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Major
> Attachments: HIVE-20869.patch
>
>
> It seems that between the time the tests ran and HIVE-20486 was merged, a new 
> Hive property was added: 
> {code}
> discover.partitions true
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18661) CachedStore: Use metastore notification log events to update cache

2018-11-05 Thread Daniel Dai (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-18661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16675988#comment-16675988
 ] 

Daniel Dai commented on HIVE-18661:
---

Discussed with [~thejas]: we still need a strongly consistent cache, which 
requires that old data not overwrite new data. There are two approaches to do that:
1. For every cache entry, add a timestamp field (we can use the notification_id 
for that purpose) and compare timestamps before every write. The complication is 
that we need a tombstone field for deleted entries; otherwise anomalies will 
happen. For example, metastore B adds a partition, then metastore A drops that 
partition later. However, on metastore A, it first gets the drop partition 
request and only then, from the notification log, creates the partition. If 
there is no tombstone entry in the partition cache to tell that the drop came 
after the creation, we end up consuming the creation event. Though eventually 
the drop partition notification arrives, during the interim the later-applied 
event takes precedence.
2. For every local cache write, apply earlier notifications first. That is, if 
metastore B adds a partition and then metastore A drops a partition later, 
metastore A will first apply all pending notifications before dropping the 
partition. That may delay the write operation (dropping the partition in the 
example), but it will serialize the events across metastores.

So approach 2 is probably easier and we can proceed with it.
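
A toy model of approach 2 (hypothetical names, not CachedStore code): a metastore replays every notification event it has not yet consumed before applying its own write, so writes land in notification order.

{code:java}
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicLong;

public class CatchUpThenWriteSketch {
  interface CacheOp { void applyTo(StringBuilder cache); }

  static final List<CacheOp> notificationLog = new CopyOnWriteArrayList<>();
  static final AtomicLong lastApplied = new AtomicLong(0);

  // Replay every not-yet-consumed notification, then apply the local write.
  static void applyLocalWrite(StringBuilder cache, CacheOp localWrite) {
    for (long i = lastApplied.get(); i < notificationLog.size(); i++) {
      notificationLog.get((int) i).applyTo(cache);
      lastApplied.set(i + 1);
    }
    localWrite.applyTo(cache);
  }

  public static void main(String[] args) {
    StringBuilder cache = new StringBuilder();
    notificationLog.add(c -> c.append("add_partition(p1); "));      // event from metastore B
    applyLocalWrite(cache, c -> c.append("drop_partition(p1); "));  // local drop on metastore A
    System.out.println(cache);  // the add is applied before the drop, in log order
  }
}
{code}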

> CachedStore: Use metastore notification log events to update cache
> --
>
> Key: HIVE-18661
> URL: https://issues.apache.org/jira/browse/HIVE-18661
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Vaibhav Gumashta
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-18661.02.patch, HIVE-18661.03.patch
>
>
> Currently, a background thread updates the entire cache which is pretty 
> inefficient. We capture the updates to metadata in NOTIFICATION_LOG table 
> which is getting used in the Replication work. We should have the background 
> thread apply these notifications to incrementally update the cache.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20835) Interaction between constraints and MV rewriting may create loop in Calcite planner

2018-11-05 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16675993#comment-16675993
 ] 

Hive QA commented on HIVE-20835:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
27s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
15s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
23s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
47s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
41s{color} | {color:blue} itests/util in master has 48 extant Findbugs 
warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
43s{color} | {color:blue} ql in master has 2315 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
12s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
9s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
27s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
36s{color} | {color:red} ql: The patch generated 3 new + 170 unchanged - 2 
fixed = 173 total (was 172) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 3 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
13s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 26m 41s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-14756/dev-support/hive-personality.sh
 |
| git revision | master / 12a6f93 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14756/yetus/diff-checkstyle-ql.txt
 |
| whitespace | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14756/yetus/whitespace-eol.txt
 |
| modules | C: itests itests/util ql U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14756/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Interaction between constraints and MV rewriting may create loop in Calcite 
> planner
> ---
>
> Key: HIVE-20835
> URL: https://issues.apache.org/jira/browse/HIVE-20835
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-20835.01.patch, HIVE-20835.01.patch, 
> HIVE-20835.01.patch, HIVE-20835.01.patch, HIVE-20835.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20835) Interaction between constraints and MV rewriting may create loop in Calcite planner

2018-11-05 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16676051#comment-16676051
 ] 

Hive QA commented on HIVE-20835:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12946955/HIVE-20835.01.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 15526 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[kafka_storage_handler]
 (batchId=194)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14756/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14756/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14756/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12946955 - PreCommit-HIVE-Build

> Interaction between constraints and MV rewriting may create loop in Calcite 
> planner
> ---
>
> Key: HIVE-20835
> URL: https://issues.apache.org/jira/browse/HIVE-20835
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-20835.01.patch, HIVE-20835.01.patch, 
> HIVE-20835.01.patch, HIVE-20835.01.patch, HIVE-20835.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20836) Fix TestJdbcDriver2.testYarnATSGuid flakiness

2018-11-05 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16676065#comment-16676065
 ] 

Hive QA commented on HIVE-20836:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
39s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
38s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
16s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
35s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
23s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 12m 40s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-14757/dev-support/hive-personality.sh
 |
| git revision | master / 12a6f93 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: itests/hive-unit U: itests/hive-unit |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14757/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Fix TestJdbcDriver2.testYarnATSGuid flakiness
> -
>
> Key: HIVE-20836
> URL: https://issues.apache.org/jira/browse/HIVE-20836
> Project: Hive
>  Issue Type: Test
>  Components: JDBC
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
> Attachments: HIVE-20836.patch
>
>
> We have seen flakiness in an internal test run.
> {code:java}
> Error Message
> Failed to set the YARN ATS Guid
> Stacktrace
> java.lang.AssertionError: Failed to set the YARN ATS Guid
>   at org.junit.Assert.fail(Assert.java:88)
>   at 
> org.apache.hive.jdbc.TestJdbcDriver2.testYarnATSGuid(TestJdbcDriver2.java:2434){code}
> The query finished too quickly, so the GUID thread never got a chance to check
> the value.
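
One common way to remove this kind of race in a test is to poll for the value with a
timeout rather than asserting after a single check. The sketch below only illustrates
that pattern; WaitUtil and waitForValue are hypothetical names, and the attached patch
may fix the flakiness differently.

{code:java}
// Hypothetical helper, not the actual HIVE-20836 patch: poll for a value with a
// timeout instead of asserting after a single racy check.
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

public final class WaitUtil {
  private WaitUtil() {}

  // Returns the first non-empty value produced within timeoutMs, or null on timeout.
  public static String waitForValue(Supplier<String> source, long timeoutMs)
      throws InterruptedException {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (System.currentTimeMillis() < deadline) {
      String value = source.get();
      if (value != null && !value.isEmpty()) {
        return value;
      }
      TimeUnit.MILLISECONDS.sleep(100); // back off briefly before re-checking
    }
    return null; // caller fails the test with a clear message if this is null
  }
}
{code}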



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18661) CachedStore: Use metastore notification log events to update cache

2018-11-05 Thread mahesh kumar behera (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-18661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16676081#comment-16676081
 ] 

mahesh kumar behera commented on HIVE-18661:


[~thejas], [~daijy]
In that case, should we skip the local metastore cache invalidation and just trigger
the background task to update the cache using events?

> CachedStore: Use metastore notification log events to update cache
> --
>
> Key: HIVE-18661
> URL: https://issues.apache.org/jira/browse/HIVE-18661
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Vaibhav Gumashta
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-18661.02.patch, HIVE-18661.03.patch
>
>
> Currently, a background thread updates the entire cache which is pretty 
> inefficient. We capture the updates to metadata in NOTIFICATION_LOG table 
> which is getting used in the Replication work. We should have the background 
> thread apply these notifications to incrementally update the cache.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20869) Fix test results file

2018-11-05 Thread Gopal V (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16676083#comment-16676083
 ] 

Gopal V commented on HIVE-20869:


+1 tests-pending.

> Fix test results file
> -
>
> Key: HIVE-20869
> URL: https://issues.apache.org/jira/browse/HIVE-20869
> Project: Hive
>  Issue Type: Sub-task
>  Components: kafka integration
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Major
> Attachments: HIVE-20869.patch
>
>
> It seems that between the time the test was run and the time HIVE-20486 was merged,
> a new Hive property was added:
> {code}
> discover.partitions true
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20836) Fix TestJdbcDriver2.testYarnATSGuid flakiness

2018-11-05 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16676093#comment-16676093
 ] 

Hive QA commented on HIVE-20836:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12946217/HIVE-20836.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 15523 tests 
executed
*Failed tests:*
{noformat}
TestMiniDruidCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=196)

[druidmini_masking.q,druidmini_test1.q,druidkafkamini_basic.q,druidmini_joins.q,druid_timestamptz.q]
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[kafka_storage_handler]
 (batchId=194)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14757/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14757/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14757/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12946217 - PreCommit-HIVE-Build

> Fix TestJdbcDriver2.testYarnATSGuid flakiness
> -
>
> Key: HIVE-20836
> URL: https://issues.apache.org/jira/browse/HIVE-20836
> Project: Hive
>  Issue Type: Test
>  Components: JDBC
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
> Attachments: HIVE-20836.patch
>
>
> We have seen flakiness in an internal test run.
> {code:java}
> Error Message
> Failed to set the YARN ATS Guid
> Stacktrace
> java.lang.AssertionError: Failed to set the YARN ATS Guid
>   at org.junit.Assert.fail(Assert.java:88)
>   at 
> org.apache.hive.jdbc.TestJdbcDriver2.testYarnATSGuid(TestJdbcDriver2.java:2434){code}
> The query finished too quickly, so the GUID thread never got a chance to check
> the value.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20833) package.jdo needs to be updated to conform with HIVE-20221 changes

2018-11-05 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16676094#comment-16676094
 ] 

Hive QA commented on HIVE-20833:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12946940/HIVE-20833.5.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14758/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14758/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14758/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Tests exited with: Exception: Patch URL 
https://issues.apache.org/jira/secure/attachment/12946940/HIVE-20833.5.patch 
was found in seen patch url's cache and a test was probably run already on it. 
Aborting...
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12946940 - PreCommit-HIVE-Build

> package.jdo needs to be updated to conform with HIVE-20221 changes
> --
>
> Key: HIVE-20833
> URL: https://issues.apache.org/jira/browse/HIVE-20833
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20833.1.patch, HIVE-20833.2.patch, 
> HIVE-20833.3.patch, HIVE-20833.4.patch, HIVE-20833.5.patch
>
>
> The following test fails when run with TestMiniLlapLocalCliDriver:
> {code:sql}
> CREATE TABLE `alterPartTbl`(
>`po_header_id` bigint,
>`vendor_num` string,
>`requester_name` string,
>`approver_name` string,
>`buyer_name` string,
>`preparer_name` string,
>`po_requisition_number` string,
>`po_requisition_id` bigint,
>`po_requisition_desc` string,
>`rate_type` string,
>`rate_date` date,
>`rate` double,
>`blanket_total_amount` double,
>`authorization_status` string,
>`revision_num` bigint,
>`revised_date` date,
>`approved_flag` string,
>`approved_date` timestamp,
>`amount_limit` double,
>`note_to_authorizer` string,
>`note_to_vendor` string,
>`note_to_receiver` string,
>`vendor_order_num` string,
>`comments` string,
>`acceptance_required_flag` string,
>`acceptance_due_date` date,
>`closed_date` timestamp,
>`user_hold_flag` string,
>`approval_required_flag` string,
>`cancel_flag` string,
>`firm_status_lookup_code` string,
>`firm_date` date,
>`frozen_flag` string,
>`closed_code` string,
>`org_id` bigint,
>`reference_num` string,
>`wf_item_type` string,
>`wf_item_key` string,
>`submit_date` date,
>`sap_company_code` string,
>`sap_fiscal_year` bigint,
>`po_number` string,
>`sap_line_item` bigint,
>`closed_status_flag` string,
>`balancing_segment` string,
>`cost_center_segment` string,
>`base_amount_limit` double,
>`base_blanket_total_amount` double,
>`base_open_amount` double,
>`base_ordered_amount` double,
>`cancel_date` timestamp,
>`cbc_accounting_date` date,
>`change_requested_by` string,
>`change_summary` string,
>`confirming_order_flag` string,
>`document_creation_method` string,
>`edi_processed_flag` string,
>`edi_processed_status` string,
>`enabled_flag` string,
>`encumbrance_required_flag` string,
>`end_date` date,
>`end_date_active` date,
>`from_header_id` bigint,
>`from_type_lookup_code` string,
>`global_agreement_flag` string,
>`government_context` string,
>`interface_source_code` string,
>`ledger_currency_code` string,
>`open_amount` double,
>`ordered_amount` double,
>`pay_on_code` string,
>`payment_term_name` string,
>`pending_signature_flag` string,
>`po_revision_num` double,
>`preparer_id` bigint,
>`price_update_tolerance` double,
>`print_count` double,
>`printed_date` date,
>`reply_date` date,
>`reply_method_lookup_code` string,
>`rfq_close_date` date,
>`segment2` string,
>`segment3` string,
>`segment4` string,
>`segment5` string,
>`shipping_control` string,
>`start_date` date,
>`start_date_active` date,
>`summary_flag` string,
>`supply_agreement_flag` string,
>`usd_amount_limit` double,
>`usd_blanket_total_amount` double,
>`usd_exchange_rate` double,
>`usd_open_amount` double,
>`usd_order_amount` double,
>`ussgl_transaction_code` string,
>`xml_flag` string,
>`purchasing_organization_id` bigint,
>`purchasing_group_code` string,
>`last_updated_by_name` string,
>`created_by_name` string,
>`incoterms_1` string,
>`incoterms_2` string,
>`ame_approval_id` double,
>`ame_transaction_type` string,
>

[jira] [Updated] (HIVE-20842) Fix logic introduced in HIVE-20660 to estimate statistics for group by

2018-11-05 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-20842:
---
Status: Open  (was: Patch Available)

> Fix logic introduced in HIVE-20660 to estimate statistics for group by
> --
>
> Key: HIVE-20842
> URL: https://issues.apache.org/jira/browse/HIVE-20842
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20842.1.patch, HIVE-20842.2.patch, 
> HIVE-20842.3.patch, HIVE-20842.4.patch, HIVE-20842.5.patch
>
>
> HIVE-20660 introduced better estimation for the group by operator, but the logic
> did not account for partial and full group by separately.
> For a partial group by, the parallelism (i.e. the number of tasks) should be taken
> into account.
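
For intuition, here is a minimal sketch of the kind of bound involved; the class name
and the formula are illustrative assumptions, not the code in the attached patches.

{code:java}
// Illustrative only, not the attached patches: a final group-by is bounded by the
// number of distinct keys, while a partial (map-side) group-by can emit up to that
// many rows per task, since each task aggregates only its own split of the input.
public final class GroupByEstimateSketch {
  private GroupByEstimateSketch() {}

  public static long estimateRows(long inputRows, long distinctKeys,
                                  int numTasks, boolean partial) {
    long bound = partial ? distinctKeys * numTasks : distinctKeys;
    return Math.min(inputRows, Math.max(1L, bound));
  }

  public static void main(String[] args) {
    // 1M input rows, 10K distinct keys, 50 map tasks:
    System.out.println(estimateRows(1_000_000, 10_000, 50, false)); // 10000  (full)
    System.out.println(estimateRows(1_000_000, 10_000, 50, true));  // 500000 (partial)
  }
}
{code}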



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20842) Fix logic introduced in HIVE-20660 to estimate statistics for group by

2018-11-05 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-20842:
---
Attachment: HIVE-20842.5.patch

> Fix logic introduced in HIVE-20660 to estimate statistics for group by
> --
>
> Key: HIVE-20842
> URL: https://issues.apache.org/jira/browse/HIVE-20842
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20842.1.patch, HIVE-20842.2.patch, 
> HIVE-20842.3.patch, HIVE-20842.4.patch, HIVE-20842.5.patch
>
>
> HIVE-20660 introduced better estimation for the group by operator, but the logic
> did not account for partial and full group by separately.
> For a partial group by, the parallelism (i.e. the number of tasks) should be taken
> into account.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20842) Fix logic introduced in HIVE-20660 to estimate statistics for group by

2018-11-05 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-20842:
---
Status: Patch Available  (was: Open)

> Fix logic introduced in HIVE-20660 to estimate statistics for group by
> --
>
> Key: HIVE-20842
> URL: https://issues.apache.org/jira/browse/HIVE-20842
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20842.1.patch, HIVE-20842.2.patch, 
> HIVE-20842.3.patch, HIVE-20842.4.patch, HIVE-20842.5.patch
>
>
> HIVE-20660 introduced better estimation for the group by operator, but the logic
> did not account for partial and full group by separately.
> For a partial group by, the parallelism (i.e. the number of tasks) should be taken
> into account.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20842) Fix logic introduced in HIVE-20660 to estimate statistics for group by

2018-11-05 Thread Vineet Garg (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16676095#comment-16676095
 ] 

Vineet Garg commented on HIVE-20842:


[~ashutoshc] Review request is at https://reviews.apache.org/r/69257/

> Fix logic introduced in HIVE-20660 to estimate statistics for group by
> --
>
> Key: HIVE-20842
> URL: https://issues.apache.org/jira/browse/HIVE-20842
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20842.1.patch, HIVE-20842.2.patch, 
> HIVE-20842.3.patch, HIVE-20842.4.patch, HIVE-20842.5.patch
>
>
> HIVE-20660 introduced better estimation for the group by operator, but the logic
> did not account for partial and full group by separately.
> For a partial group by, the parallelism (i.e. the number of tasks) should be taken
> into account.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20868) SMB Join fails intermittently when TezDummyOperator has child op in getFinalOp in MapRecordProcessor

2018-11-05 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16676116#comment-16676116
 ] 

Hive QA commented on HIVE-20868:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
41s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
59s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
36s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
41s{color} | {color:blue} ql in master has 2315 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
53s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
37s{color} | {color:red} ql: The patch generated 1 new + 9 unchanged - 0 fixed 
= 10 total (was 9) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 22m  3s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-14759/dev-support/hive-personality.sh
 |
| git revision | master / 12a6f93 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14759/yetus/diff-checkstyle-ql.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14759/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> SMB Join fails intermittently when TezDummyOperator has child op in 
> getFinalOp in MapRecordProcessor
> 
>
> Key: HIVE-20868
> URL: https://issues.apache.org/jira/browse/HIVE-20868
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-20868.1.patch
>
>
> In MapRecordProcessor::getFinalOp(), due to an external cause (not yet known), the
> TezDummyStoreOperator may intermittently have a MergeJoin operator as its child.
> Because of this, fetchDone remains set to true on the DummyOp, left over from the
> previous task; ideally, fetchDone should be reset for each task. This eventually
> causes the join operator to skip rows from that dummy operator, resulting in
> wrong results.
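
A minimal sketch of the reset the description calls for, using illustrative names
rather than the actual operator classes, would clear the per-task flag whenever the
operator is initialized for a new task:

{code:java}
// Hypothetical sketch (not the actual HIVE-20868 patch): per-task flags on a
// reused operator must be cleared when a new task starts, otherwise state left
// over from the previous task leaks into the next one.
public class DummyStoreOperatorSketch {
  private boolean fetchDone; // per-task state: true once this side has been fully read

  // Called whenever the operator tree is (re)initialized for a new task.
  public void initializeForTask() {
    fetchDone = false; // drop stale state carried over from the previous task
  }

  public boolean isFetchDone() { return fetchDone; }
  public void setFetchDone(boolean fetchDone) { this.fetchDone = fetchDone; }
}
{code}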



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

