[jira] [Assigned] (HIVE-25251) Reduce overhead of adding partitions during batch loading of partitions.
[ https://issues.apache.org/jira/browse/HIVE-25251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mahesh kumar behera reassigned HIVE-25251: -- > Reduce overhead of adding partitions during batch loading of partitions. > > > Key: HIVE-25251 > URL: https://issues.apache.org/jira/browse/HIVE-25251 > Project: Hive > Issue Type: Sub-task > Reporter: mahesh kumar behera > Assignee: mahesh kumar behera > Priority: Major > > The add partitions call made to HMS executes the DataNucleus calls that add > the partitions to the backend DB serially. This can be further optimised by > batching those SQL statements. -- This message was sent by Atlassian Jira (v8.3.4#803005)
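The batching described in HIVE-25251 amounts to collapsing N single-row INSERTs into one multi-row statement. A minimal sketch of that idea follows; the table and column names are hypothetical placeholders, not Hive's actual backend schema:

```java
import java.util.Collections;
import java.util.List;

public class BatchedInsertSketch {
    // Build one multi-row INSERT covering `rows` partitions instead of issuing
    // `rows` separate single-row INSERTs (table/columns are hypothetical).
    static String batchedInsert(String table, List<String> cols, int rows) {
        String row = "(" + String.join(", ", Collections.nCopies(cols.size(), "?")) + ")";
        return "INSERT INTO " + table + " (" + String.join(", ", cols) + ") VALUES "
            + String.join(", ", Collections.nCopies(rows, row));
    }

    public static void main(String[] args) {
        System.out.println(batchedInsert("PARTITIONS", List.of("PART_ID", "TBL_ID"), 2));
    }
}
```

The resulting statement can then be bound and executed once per chunk of partitions, replacing one round trip per partition with one per chunk.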
[jira] [Work logged] (HIVE-25204) Reduce overhead of adding notification log for update partition column statistics
[ https://issues.apache.org/jira/browse/HIVE-25204?focusedWorklogId=611734&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611734 ] ASF GitHub Bot logged work on HIVE-25204: - Author: ASF GitHub Bot Created on: 16/Jun/21 04:22 Start Date: 16/Jun/21 04:22 Worklog Time Spent: 10m Work Description: maheshk114 merged pull request #2365: URL: https://github.com/apache/hive/pull/2365 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 611734) Time Spent: 1h 10m (was: 1h) > Reduce overhead of adding notification log for update partition column > statistics > - > > Key: HIVE-25204 > URL: https://issues.apache.org/jira/browse/HIVE-25204 > Project: Hive > Issue Type: Sub-task > Components: Hive, HiveServer2 > Reporter: mahesh kumar behera > Assignee: mahesh kumar behera > Priority: Major > Labels: perfomance, pull-request-available > Time Spent: 1h 10m > Remaining Estimate: 0h > > The notification logs for partition column statistics can be optimised by > adding them in a batch. In the current implementation it is done one by one, > causing multiple SQL executions in the backend RDBMS. These SQL executions can > be batched to reduce the execution time. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-25204) Reduce overhead of adding notification log for update partition column statistics
[ https://issues.apache.org/jira/browse/HIVE-25204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mahesh kumar behera resolved HIVE-25204. Resolution: Fixed > Reduce overhead of adding notification log for update partition column > statistics > - > > Key: HIVE-25204 > URL: https://issues.apache.org/jira/browse/HIVE-25204 > Project: Hive > Issue Type: Sub-task > Components: Hive, HiveServer2 > Reporter: mahesh kumar behera > Assignee: mahesh kumar behera > Priority: Major > Labels: perfomance, pull-request-available > Time Spent: 1h 10m > Remaining Estimate: 0h > > The notification logs for partition column statistics can be optimised by > adding them in a batch. In the current implementation it is done one by one, > causing multiple SQL executions in the backend RDBMS. These SQL executions can > be batched to reduce the execution time. -- This message was sent by Atlassian Jira (v8.3.4#803005)
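The first step of batching the notification-log writes described in HIVE-25204 is grouping pending events into fixed-size chunks so each chunk can be written with one statement. A minimal, self-contained sketch of the chunking helper (names are illustrative, not Hive's actual code):

```java
import java.util.ArrayList;
import java.util.List;

public class NotificationBatcher {
    // Split pending notification events into fixed-size chunks; each chunk can
    // then be persisted with a single batched statement instead of one
    // statement per event (sketch only; names are hypothetical).
    static <T> List<List<T>> chunk(List<T> events, int batchSize) {
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < events.size(); i += batchSize) {
            batches.add(events.subList(i, Math.min(i + batchSize, events.size())));
        }
        return batches;
    }
}
```

A fixed batch size keeps any single statement from growing unboundedly while still amortising the per-statement round-trip cost.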
[jira] [Work logged] (HIVE-23633) Metastore some JDO query objects do not close properly
[ https://issues.apache.org/jira/browse/HIVE-23633?focusedWorklogId=611687&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611687 ] ASF GitHub Bot logged work on HIVE-23633: - Author: ASF GitHub Bot Created on: 16/Jun/21 01:11 Start Date: 16/Jun/21 01:11 Worklog Time Spent: 10m Work Description: dengzhhu653 opened a new pull request #2344: URL: https://github.com/apache/hive/pull/2344 …properly ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 611687) Time Spent: 6h 10m (was: 6h) > Metastore some JDO query objects do not close properly > -- > > Key: HIVE-23633 > URL: https://issues.apache.org/jira/browse/HIVE-23633 > Project: Hive > Issue Type: Bug > Reporter: Zhihua Deng > Assignee: Zhihua Deng > Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Attachments: HIVE-23633.01.patch > > Time Spent: 6h 10m > Remaining Estimate: 0h > > After [HIVE-10895|https://issues.apache.org/jira/browse/HIVE-10895] was patched, > the metastore has still seen a memory leak on DB resources: many > StatementImpls are left unclosed. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23633) Metastore some JDO query objects do not close properly
[ https://issues.apache.org/jira/browse/HIVE-23633?focusedWorklogId=611686&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611686 ] ASF GitHub Bot logged work on HIVE-23633: - Author: ASF GitHub Bot Created on: 16/Jun/21 01:09 Start Date: 16/Jun/21 01:09 Worklog Time Spent: 10m Work Description: dengzhhu653 closed pull request #2344: URL: https://github.com/apache/hive/pull/2344 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 611686) Time Spent: 6h (was: 5h 50m) > Metastore some JDO query objects do not close properly > -- > > Key: HIVE-23633 > URL: https://issues.apache.org/jira/browse/HIVE-23633 > Project: Hive > Issue Type: Bug > Reporter: Zhihua Deng > Assignee: Zhihua Deng > Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Attachments: HIVE-23633.01.patch > > Time Spent: 6h > Remaining Estimate: 0h > > After [HIVE-10895|https://issues.apache.org/jira/browse/HIVE-10895] was patched, > the metastore has still seen a memory leak on DB resources: many > StatementImpls are left unclosed. -- This message was sent by Atlassian Jira (v8.3.4#803005)
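The leak class described in HIVE-23633 (JDO query objects never closed, so their underlying statements linger) is commonly fixed by wrapping the query in an AutoCloseable so try-with-resources guarantees closeAll() on every path, including exceptions. The sketch below uses a stand-in interface rather than javax.jdo.Query so it is self-contained; it is a pattern illustration, not Hive's actual fix:

```java
public class QueryCloseSketch {
    // Stand-in for the one javax.jdo.Query method this pattern needs.
    interface JdoLikeQuery {
        void closeAll();
    }

    // Wrapper guaranteeing closeAll() runs even when the caller's code throws:
    // try-with-resources invokes close() on both normal and exceptional exit.
    static class AutoClosingQuery implements AutoCloseable {
        final JdoLikeQuery query;

        AutoClosingQuery(JdoLikeQuery query) {
            this.query = query;
        }

        @Override
        public void close() {
            query.closeAll();
        }
    }
}
```

With this shape, every call site that used to leak on an early return or exception gets deterministic cleanup for free from the language.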
[jira] [Commented] (HIVE-25055) Improve the exception handling in HMSHandler
[ https://issues.apache.org/jira/browse/HIVE-25055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17364001#comment-17364001 ] Zhihua Deng commented on HIVE-25055: Thank you [~vihangk1] for reviewing the changes! > Improve the exception handling in HMSHandler > > > Key: HIVE-25055 > URL: https://issues.apache.org/jira/browse/HIVE-25055 > Project: Hive > Issue Type: Improvement > Components: Standalone Metastore > Reporter: Zhihua Deng > Assignee: Zhihua Deng > Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 3.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-25055) Improve the exception handling in HMSHandler
[ https://issues.apache.org/jira/browse/HIVE-25055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihua Deng resolved HIVE-25055. Fix Version/s: 4.0.0 Resolution: Resolved > Improve the exception handling in HMSHandler > > > Key: HIVE-25055 > URL: https://issues.apache.org/jira/browse/HIVE-25055 > Project: Hive > Issue Type: Improvement > Components: Standalone Metastore >Reporter: Zhihua Deng >Assignee: Zhihua Deng >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 3.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25055) Improve the exception handling in HMSHandler
[ https://issues.apache.org/jira/browse/HIVE-25055?focusedWorklogId=611672=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611672 ] ASF GitHub Bot logged work on HIVE-25055: - Author: ASF GitHub Bot Created on: 16/Jun/21 00:16 Start Date: 16/Jun/21 00:16 Worklog Time Spent: 10m Work Description: vihangk1 merged pull request #2218: URL: https://github.com/apache/hive/pull/2218 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 611672) Time Spent: 3.5h (was: 3h 20m) > Improve the exception handling in HMSHandler > > > Key: HIVE-25055 > URL: https://issues.apache.org/jira/browse/HIVE-25055 > Project: Hive > Issue Type: Improvement > Components: Standalone Metastore >Reporter: Zhihua Deng >Assignee: Zhihua Deng >Priority: Minor > Labels: pull-request-available > Time Spent: 3.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24970) Reject location and managed locations in DDL for REMOTE databases.
[ https://issues.apache.org/jira/browse/HIVE-24970?focusedWorklogId=611515&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611515 ] ASF GitHub Bot logged work on HIVE-24970: - Author: ASF GitHub Bot Created on: 15/Jun/21 18:27 Start Date: 15/Jun/21 18:27 Worklog Time Spent: 10m Work Description: nrg4878 commented on a change in pull request #2389: URL: https://github.com/apache/hive/pull/2389#discussion_r652049945 ## File path: parser/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g ## @@ -1156,7 +1156,7 @@ createDatabaseStatement dbManagedLocation? dbConnectorName? (KW_WITH KW_DBPROPERTIES dbprops=dbProperties)? --> {$remote != null}? ^(TOK_CREATEDATABASE $name ifNotExists? databaseComment? $dbprops? dbConnectorName?) +-> {$remote != null}? ^(TOK_CREATEDATABASE $name ifNotExists? dbLocation? dbManagedLocation? databaseComment? $dbprops? dbConnectorName?) Review comment: Let me ponder on the first part: "create database localdb using connector xyz". If this does not throw an exception, we should change this behavior to force users to be explicit and specify REMOTE. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 611515) Time Spent: 50m (was: 40m) > Reject location and managed locations in DDL for REMOTE databases. > -- > > Key: HIVE-24970 > URL: https://issues.apache.org/jira/browse/HIVE-24970 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 > Affects Versions: 4.0.0 > Reporter: Naveen Gangam > Assignee: Dantong Dong > Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > This was part of the review feedback from Yongzhi. Creating a follow-up jira > to track this discussion. > So, using a DB connector for the DB will not create managed tables? 
> > @nrg4878 nrg4878 1 hour ago Author Member > we don't support create/drop/alter in REMOTE databases at this point. the > concepts of managed vs external are not in the picture at this point. When we > do support it, it will be applicable to the hive connectors only (or other > hive-based connectors like AWS Glue) > > @nrg4878 nrg4878 2 minutes ago Author Member > will file a separate jira for this. Basically, instead of ignoring the > location and managedlocation that may be specified for a remote database, the > grammar needs to not accept any locations in the DDL at all. > The argument is fair: why accept something we do not honor or that is entirely > irrelevant for such databases? However, this requires some thought when we > have additional connectors for remote hive instances. It might have some > relevance in terms of security with Ranger etc. > So will create a new jira for follow-up discussion. -- This message was sent by Atlassian Jira (v8.3.4#803005)
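The behavior proposed in the HIVE-24970 thread (reject, rather than silently ignore, location clauses on REMOTE databases) boils down to a semantic check like the following sketch. The method name and error message are illustrative assumptions, not Hive's actual analyzer code:

```java
public class RemoteDbDdlCheck {
    // Sketch of the proposed rule: a REMOTE database must not carry LOCATION
    // or MANAGEDLOCATION clauses (illustrative only, not Hive's real code).
    static void validate(boolean isRemote, String locationUri, String managedLocationUri) {
        if (isRemote && (locationUri != null || managedLocationUri != null)) {
            throw new IllegalArgumentException(
                "LOCATION/MANAGEDLOCATION clauses are not allowed for REMOTE databases");
        }
    }
}
```

Failing fast at analysis time is what the reviewers ask for: users are forced to be explicit instead of having clauses they wrote silently dropped.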
[jira] [Commented] (HIVE-25168) Add mutable validWriteIdList
[ https://issues.apache.org/jira/browse/HIVE-25168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17363817#comment-17363817 ] Yu-Wen Lai commented on HIVE-25168: --- We've decided not to put this in Hive since there is no other use case in Hive now. > Add mutable validWriteIdList > > > Key: HIVE-25168 > URL: https://issues.apache.org/jira/browse/HIVE-25168 > Project: Hive > Issue Type: New Feature > Components: storage-api > Reporter: Yu-Wen Lai > Assignee: Yu-Wen Lai > Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > Although the current implementation of validWriteIdList is not strictly > immutable, it is, in effect, meant to provide a read-only snapshot view. This > change adds another class that provides functionality for manipulating > the writeIdList. We could use this to keep the writeIdList up to date in an > external cache layer for event-based metadata refreshing. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24425) Create table in REMOTE db should fail
[ https://issues.apache.org/jira/browse/HIVE-24425?focusedWorklogId=611492=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611492 ] ASF GitHub Bot logged work on HIVE-24425: - Author: ASF GitHub Bot Created on: 15/Jun/21 17:46 Start Date: 15/Jun/21 17:46 Worklog Time Spent: 10m Work Description: dantongdong commented on a change in pull request #2393: URL: https://github.com/apache/hive/pull/2393#discussion_r652018186 ## File path: ql/src/test/results/clientnegative/createTbl_remoteDB_fail.q.out ## @@ -0,0 +1,42 @@ +PREHOOK: query: CREATE CONNECTOR IF NOT EXISTS mysql_test +TYPE 'mysql' +URL 'jdbc:mysql://nightly1.apache.org:3306/hive1' +COMMENT 'test connector' +WITH DCPROPERTIES ( +"hive.sql.dbcp.username"="hive1", +"hive.sql.dbcp.password"="hive1") +PREHOOK: type: CREATEDATACONNECTOR +PREHOOK: Output: connector:mysql_test +POSTHOOK: query: CREATE CONNECTOR IF NOT EXISTS mysql_test +TYPE 'mysql' +URL 'jdbc:mysql://nightly1.apache.org:3306/hive1' +COMMENT 'test connector' +WITH DCPROPERTIES ( +"hive.sql.dbcp.username"="hive1", +"hive.sql.dbcp.password"="hive1") +POSTHOOK: type: CREATEDATACONNECTOR +POSTHOOK: Output: connector:mysql_test +PREHOOK: query: SHOW CONNECTORS +PREHOOK: type: SHOWDATACONNECTORS +POSTHOOK: query: SHOW CONNECTORS +POSTHOOK: type: SHOWDATACONNECTORS +mysql_test +PREHOOK: query: CREATE REMOTE database mysql_db using mysql_test with DBPROPERTIES("connector.remoteDbName"="hive1") +PREHOOK: type: CREATEDATABASE +PREHOOK: Output: database:mysql_db + A masked pattern was here +POSTHOOK: query: CREATE REMOTE database mysql_db using mysql_test with DBPROPERTIES("connector.remoteDbName"="hive1") +POSTHOOK: type: CREATEDATABASE +POSTHOOK: Output: database:mysql_db + A masked pattern was here +PREHOOK: query: USE mysql_db +PREHOOK: type: SWITCHDATABASE +PREHOOK: Input: database:mysql_db +POSTHOOK: query: USE mysql_db +POSTHOOK: type: SWITCHDATABASE +POSTHOOK: Input: database:mysql_db +PREHOOK: query: create table bees (id int, 
name string) +PREHOOK: type: CREATETABLE +PREHOOK: Output: database:mysql_db +PREHOOK: Output: mysql_db@bees +FAILED: Execution Error, return code 4 from org.apache.hadoop.hive.ql.ddl.DDLTask. MetaException(message:Could not instantiate a provider for database mysql_db) Review comment: Yes. It seems I forgot to recompile the code when generating the golden .out file, and I am failing my own test in Jenkins because of that. Will change -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 611492) Time Spent: 0.5h (was: 20m) > Create table in REMOTE db should fail > - > > Key: HIVE-24425 > URL: https://issues.apache.org/jira/browse/HIVE-24425 > Project: Hive > Issue Type: Sub-task > Reporter: Naveen Gangam > Assignee: Dantong Dong > Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > Currently it creates the table in that DB but show tables does not show > anything. Preventing the creation of the table will resolve this inconsistency > too. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24970) Reject location and managed locations in DDL for REMOTE databases.
[ https://issues.apache.org/jira/browse/HIVE-24970?focusedWorklogId=611490&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611490 ] ASF GitHub Bot logged work on HIVE-24970: - Author: ASF GitHub Bot Created on: 15/Jun/21 17:40 Start Date: 15/Jun/21 17:40 Worklog Time Spent: 10m Work Description: dantongdong commented on a change in pull request #2389: URL: https://github.com/apache/hive/pull/2389#discussion_r652014451 ## File path: parser/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g ## @@ -1156,7 +1156,7 @@ createDatabaseStatement dbManagedLocation? dbConnectorName? (KW_WITH KW_DBPROPERTIES dbprops=dbProperties)? --> {$remote != null}? ^(TOK_CREATEDATABASE $name ifNotExists? databaseComment? $dbprops? dbConnectorName?) +-> {$remote != null}? ^(TOK_CREATEDATABASE $name ifNotExists? dbLocation? dbManagedLocation? databaseComment? $dbprops? dbConnectorName?) Review comment: It doesn't throw an exception because this line is not in charge of the actual parsing and compiling; it is in charge of generating the ASTNode based on the parsed and compiled result. Not including dbLocation and dbManagedLocation in this line only makes it ignore them when generating the ASTNode. The actual parsing and compiling is done in the lines above it (lines 1131-1139). All the create database statements share the same parse and compile code, which means that every create database statement (including remote) will parse and compile dbLocation and dbManagedLocation. Likewise, something like "create database localdb using connector xyz" will not throw an error either. I can also fix this if it is not the desired behavior. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 611490) Time Spent: 40m (was: 0.5h) > Reject location and managed locations in DDL for REMOTE databases. > -- > > Key: HIVE-24970 > URL: https://issues.apache.org/jira/browse/HIVE-24970 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 > Affects Versions: 4.0.0 > Reporter: Naveen Gangam > Assignee: Dantong Dong > Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > This was part of the review feedback from Yongzhi. Creating a follow-up jira > to track this discussion. > So, using a DB connector for the DB will not create managed tables? > > @nrg4878 nrg4878 1 hour ago Author Member > we don't support create/drop/alter in REMOTE databases at this point. the > concepts of managed vs external are not in the picture at this point. When we > do support it, it will be applicable to the hive connectors only (or other > hive-based connectors like AWS Glue) > > @nrg4878 nrg4878 2 minutes ago Author Member > will file a separate jira for this. Basically, instead of ignoring the > location and managedlocation that may be specified for a remote database, the > grammar needs to not accept any locations in the DDL at all. > The argument is fair: why accept something we do not honor or that is entirely > irrelevant for such databases? However, this requires some thought when we > have additional connectors for remote hive instances. It might have some > relevance in terms of security with Ranger etc. > So will create a new jira for follow-up discussion. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-25173) Fix build failure of hive-pre-upgrade due to missing dependency on pentaho-aggdesigner-algorithm
[ https://issues.apache.org/jira/browse/HIVE-25173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17363800#comment-17363800 ] Julian Hyde commented on HIVE-25173: I made a release of this library under my groupid on maven central. I don’t recall the coordinates but you can find them in Calcite (calcite depends on the new version). If conjars.org is at the root of this problem, let me know. I know the owner of that repo. He took it offline to find out who, if anyone, was using it. > Fix build failure of hive-pre-upgrade due to missing dependency on > pentaho-aggdesigner-algorithm > > > Key: HIVE-25173 > URL: https://issues.apache.org/jira/browse/HIVE-25173 > Project: Hive > Issue Type: Improvement >Affects Versions: 3.1.2 >Reporter: Masatake Iwasaki >Assignee: Masatake Iwasaki >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > {noformat} > [ERROR] Failed to execute goal on project hive-pre-upgrade: Could not resolve > dependencies for project org.apache.hive:hive-pre-upgrade:jar:4.0.0-SNAPSHOT: > Failure to find org.pentaho:pentaho-aggdesigner-algorithm:jar:5.1.5-jhyde in > https://repo.maven.apache.org/maven2 was cached in the local repository, > resolution will not be reattempted until the update interval of central has > elapsed or updates are forced > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24970) Reject location and managed locations in DDL for REMOTE databases.
[ https://issues.apache.org/jira/browse/HIVE-24970?focusedWorklogId=611478=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611478 ] ASF GitHub Bot logged work on HIVE-24970: - Author: ASF GitHub Bot Created on: 15/Jun/21 17:23 Start Date: 15/Jun/21 17:23 Worklog Time Spent: 10m Work Description: dantongdong commented on a change in pull request #2389: URL: https://github.com/apache/hive/pull/2389#discussion_r651999419 ## File path: ql/src/java/org/apache/hadoop/hive/ql/ddl/database/create/CreateDatabaseAnalyzer.java ## @@ -78,10 +78,17 @@ public void analyzeInternal(ASTNode root) throws SemanticException { ASTNode nextNode = (ASTNode) root.getChild(i); connectorName = ((ASTNode)nextNode).getChild(0).getText(); outputs.add(toWriteEntity(connectorName)); -if (managedLocationUri != null) { - outputs.remove(toWriteEntity(managedLocationUri)); - managedLocationUri = null; + +// HIVE-2436: Reject location and managed locations in DDL for REMOTE databases. +if (locationUri != null || managedLocationUri != null ) { + if (locationUri == null) { +outputs.remove(toWriteEntity(locationUri)); Review comment: Will change. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 611478) Time Spent: 0.5h (was: 20m) > Reject location and managed locations in DDL for REMOTE databases. > -- > > Key: HIVE-24970 > URL: https://issues.apache.org/jira/browse/HIVE-24970 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Affects Versions: 4.0.0 >Reporter: Naveen Gangam >Assignee: Dantong Dong >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > This was part of the review feedback from Yongzhi. Creating a followup jira > to track this discussion. 
> So, using a DB connector for the DB will not create managed tables? > > @nrg4878 nrg4878 1 hour ago Author Member > we don't support create/drop/alter in REMOTE databases at this point. the > concepts of managed vs external are not in the picture at this point. When we > do support it, it will be applicable to the hive connectors only (or other > hive-based connectors like AWS Glue) > > @nrg4878 nrg4878 2 minutes ago Author Member > will file a separate jira for this. Basically, instead of ignoring the > location and managedlocation that may be specified for a remote database, the > grammar needs to not accept any locations in the DDL at all. > The argument is fair: why accept something we do not honor or that is entirely > irrelevant for such databases? However, this requires some thought when we > have additional connectors for remote hive instances. It might have some > relevance in terms of security with Ranger etc. > So will create a new jira for follow-up discussion. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24970) Reject location and managed locations in DDL for REMOTE databases.
[ https://issues.apache.org/jira/browse/HIVE-24970?focusedWorklogId=611477=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611477 ] ASF GitHub Bot logged work on HIVE-24970: - Author: ASF GitHub Bot Created on: 15/Jun/21 17:22 Start Date: 15/Jun/21 17:22 Worklog Time Spent: 10m Work Description: dantongdong commented on a change in pull request #2389: URL: https://github.com/apache/hive/pull/2389#discussion_r651998832 ## File path: ql/src/java/org/apache/hadoop/hive/ql/ddl/database/create/CreateDatabaseAnalyzer.java ## @@ -78,10 +78,17 @@ public void analyzeInternal(ASTNode root) throws SemanticException { ASTNode nextNode = (ASTNode) root.getChild(i); connectorName = ((ASTNode)nextNode).getChild(0).getText(); outputs.add(toWriteEntity(connectorName)); -if (managedLocationUri != null) { - outputs.remove(toWriteEntity(managedLocationUri)); - managedLocationUri = null; + +// HIVE-2436: Reject location and managed locations in DDL for REMOTE databases. Review comment: It was not the correct reference indeed. Will change -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 611477) Time Spent: 20m (was: 10m) > Reject location and managed locations in DDL for REMOTE databases. > -- > > Key: HIVE-24970 > URL: https://issues.apache.org/jira/browse/HIVE-24970 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Affects Versions: 4.0.0 >Reporter: Naveen Gangam >Assignee: Dantong Dong >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > This was part of the review feedback from Yongzhi. Creating a followup jira > to track this discussion. > So, using DB connector for DB, will not create managed tables? 
> > @nrg4878 nrg4878 1 hour ago Author Member > we don't support create/drop/alter in REMOTE databases at this point. the > concepts of managed vs external are not in the picture at this point. When we > do support it, it will be applicable to the hive connectors only (or other > hive-based connectors like AWS Glue) > > @nrg4878 nrg4878 2 minutes ago Author Member > will file a separate jira for this. Basically, instead of ignoring the > location and managedlocation that may be specified for a remote database, the > grammar needs to not accept any locations in the DDL at all. > The argument is fair: why accept something we do not honor or that is entirely > irrelevant for such databases? However, this requires some thought when we > have additional connectors for remote hive instances. It might have some > relevance in terms of security with Ranger etc. > So will create a new jira for follow-up discussion. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25213) Implement List getTables() for existing connectors.
[ https://issues.apache.org/jira/browse/HIVE-25213?focusedWorklogId=611475&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611475 ] ASF GitHub Bot logged work on HIVE-25213: - Author: ASF GitHub Bot Created on: 15/Jun/21 17:19 Start Date: 15/Jun/21 17:19 Worklog Time Spent: 10m Work Description: dantongdong commented on a change in pull request #2371: URL: https://github.com/apache/hive/pull/2371#discussion_r651997059 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/dataconnector/jdbc/AbstractJDBCConnectorProvider.java ## @@ -129,7 +129,30 @@ protected Connection getConnection() { * @throws MetaException To indicate any failures with executing this API * @param regex */ - @Override public abstract List<Table> getTables(String regex) throws MetaException; + @Override public List<Table> getTables(String regex) throws MetaException { +ResultSet rs = null; +try { + rs = fetchTablesViaDBMetaData(regex); + if (rs != null) { +List<Table> tables = new ArrayList<>(); +while(rs.next()) { + tables.add(getTable(rs.getString(3))); Review comment: If I am looking at the right [place](https://javadoc.scijava.org/Java6/java/sql/DatabaseMetaData.html), it does not specify this behavior either. But I think it makes more sense to just leave out the one with the column exception and continue returning the rest instead of stopping. Because we want the behavior to be as close as possible to getTables(), we can just filter out the ones that have corrupted columns and return as many as we can. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 611475) Time Spent: 50m (was: 40m) > Implement List getTables() for existing connectors. 
> -- > > Key: HIVE-25213 > URL: https://issues.apache.org/jira/browse/HIVE-25213 > Project: Hive > Issue Type: Sub-task >Reporter: Naveen Gangam >Assignee: Dantong Dong >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > In the initial implementation, connector providers do not implement the > getTables(string pattern) spi. We had deferred it for later. Only > getTableNames() and getTable() were implemented. -- This message was sent by Atlassian Jira (v8.3.4#803005)
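The "return as many tables as we can" behavior discussed in the review above can be sketched independently of JDBC: attempt to load each table, skip the entries whose load throws, and keep the rest. (In the quoted code, `rs.getString(3)` reads column 3 of the `DatabaseMetaData.getTables` result set, which is TABLE_NAME per the JDBC specification.) The helper below is an illustration of that policy, not Hive's actual connector code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

public class LenientTableLoader {
    // Load each table by name; skip names whose load throws (e.g. a corrupted
    // column definition) so one bad table does not hide all the others.
    static <T> List<T> loadAll(List<String> names, Function<String, T> loader) {
        List<T> out = new ArrayList<>();
        for (String name : names) {
            try {
                out.add(loader.apply(name));
            } catch (RuntimeException e) {
                // deliberately swallowed: continue returning the remaining tables
            }
        }
        return out;
    }
}
```

This mirrors the reviewer's suggestion: filter out the corrupted entries and continue, rather than failing the whole getTables() call.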
[jira] [Resolved] (HIVE-25238) Make SSL cipher suites configurable for Hive Web UI and HS2
[ https://issues.apache.org/jira/browse/HIVE-25238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongzhi Chen resolved HIVE-25238. - Fix Version/s: 4.0.0 Resolution: Fixed > Make SSL cipher suites configurable for Hive Web UI and HS2 > --- > > Key: HIVE-25238 > URL: https://issues.apache.org/jira/browse/HIVE-25238 > Project: Hive > Issue Type: Improvement > Components: HiveServer2, Web UI > Reporter: Yongzhi Chen > Assignee: Yongzhi Chen > Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 50m > Remaining Estimate: 0h > > When starting a Jetty HTTP server, one can explicitly exclude certain > (insecure) SSL cipher suites. This can be especially important when Hive > needs to be compliant with security regulations. Properties need to be added so that > the Hive Web UI and HiveServer2 support this. > For the Hive binary CLI server, we can set the SSL cipher suites to include. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25213) Implement List getTables() for existing connectors.
[ https://issues.apache.org/jira/browse/HIVE-25213?focusedWorklogId=611443&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611443 ] ASF GitHub Bot logged work on HIVE-25213: - Author: ASF GitHub Bot Created on: 15/Jun/21 16:34 Start Date: 15/Jun/21 16:34 Worklog Time Spent: 10m Work Description: dantongdong commented on a change in pull request #2371: URL: https://github.com/apache/hive/pull/2371#discussion_r651962841 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java ## @@ -3799,10 +3799,33 @@ public Table get_table_core(GetTableRequest getTableRequest) throws MetaExceptio @Override public GetTablesResult get_table_objects_by_name_req(GetTablesRequest req) throws TException { String catName = req.isSetCatName() ? req.getCatName() : getDefaultCatalog(conf); +if (isDatabaseRemote(req.getDbName())) { + return new GetTablesResult(getRemoteTableObjectsInternal(req.getDbName(), req.getTblNames(), req.getTablesPattern())); +} return new GetTablesResult(getTableObjectsInternal(catName, req.getDbName(), req.getTblNames(), req.getCapabilities(), req.getProjectionSpec(), req.getTablesPattern())); } + private String tableNames2regex(List<String> tableNames) { +return "/^(" + String.join("|", tableNames) + ")$/"; Review comment: Good catch. Will look into that. As a side note, it seems that at least MySQL does not like ".*" being passed in as the regex in a getTables() API call and throws an error. I had to manually change the regex to null; is that normal? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 611443) Time Spent: 40m (was: 0.5h) > Implement List getTables() for existing connectors. 
> -- > > Key: HIVE-25213 > URL: https://issues.apache.org/jira/browse/HIVE-25213 > Project: Hive > Issue Type: Sub-task >Reporter: Naveen Gangam >Assignee: Dantong Dong >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > In the initial implementation, connector providers do not implement the > getTables(string pattern) spi. We had deferred it for later. Only > getTableNames() and getTable() were implemented. -- This message was sent by Atlassian Jira (v8.3.4#803005)
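The review discussion above flags that table names joined directly into a regex alternation are not escaped, so a name containing regex metacharacters would change the pattern's meaning. A minimal sketch of the safer variant is below; `tableNamesToRegex` is a hypothetical helper for illustration, not the method in HMSHandler. Note also that Java's `String.matches` expects a bare pattern, without the surrounding `/.../` delimiters that the patched snippet builds.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class TableNameRegex {

    // Build an exact-match alternation such as ^(\Qorders\E|\Qorder.items\E)$.
    // Pattern.quote() wraps each name in \Q...\E so metacharacters like '.'
    // are matched literally instead of as "any character".
    static String tableNamesToRegex(List<String> tableNames) {
        return tableNames.stream()
                .map(Pattern::quote)
                .collect(Collectors.joining("|", "^(", ")$"));
    }

    public static void main(String[] args) {
        String regex = tableNamesToRegex(List.of("orders", "order.items"));
        System.out.println("orders".matches(regex));      // exact name matches
        System.out.println("orderXitems".matches(regex)); // '.' is literal, so no match
    }
}
```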
[jira] [Assigned] (HIVE-25244) Hive predicate pushdown with Parquet format for `date` as partitioned column name produces an empty result set
[ https://issues.apache.org/jira/browse/HIVE-25244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aniket Adnaik reassigned HIVE-25244:
Assignee: Aniket Adnaik

> Hive predicate pushdown with Parquet format for `date` as partitioned column
> name produces an empty result set
> -
>
> Key: HIVE-25244
> URL: https://issues.apache.org/jira/browse/HIVE-25244
> Project: Hive
> Issue Type: Bug
> Components: Hive, Parquet
> Affects Versions: 3.1.0, 3.1.1, 3.1.2
> Reporter: Aniket Adnaik
> Assignee: Aniket Adnaik
> Priority: Major
> Fix For: 3.1.0, 3.1.1, 3.1.2, 3.2.0
>
> Attachments: test_table3_data.tar.gz
>
> Hive predicate push down with Parquet format for a partitioned column whose
> name is the keyword `date` produces an empty result set.
> If any of the following configs is set to false, then the select query
> returns results: hive.optimize.ppd.storage, hive.optimize.ppd, hive.optimize.index.filter.
> Repro steps:
> 1) Create an external partitioned table in Hive:
> CREATE EXTERNAL TABLE `test_table3`(`id` string) PARTITIONED BY (`date` string) STORED AS parquet;
> 2) In spark-shell, create a data frame and write the data to a parquet file:
> import java.sql.Timestamp
> import org.apache.spark.sql.Row
> import org.apache.spark.sql.types._
> import spark.implicits._
> val someDF = Seq(("1", "05172021"), ("2", "05172021"), ("3", "06182021"), ("4", "07192021")).toDF("id", "date")
> someDF.write.mode("overwrite").parquet("<path>/hive/warehouse/external/test_table3/date=05172021")
> 3) In Hive, change the permissions and add the partition to the table:
> $> hdfs dfs -chmod -R 777 /hive/warehouse/external/test_table3
> Hive Beeline ->
> ALTER TABLE test_table3 ADD PARTITION(`date`='05172021') LOCATION '<path>/hive/warehouse/external/test_table3/date=05172021'
> 4) SELECT * FROM test_table3; <- produces all rows
> SELECT * FROM test_table3 WHERE `date`='05172021'; <--- produces no rows
> SET hive.optimize.ppd.storage=false; <--- turn off the ppd storage optimization
> SELECT * FROM test_table3 WHERE `date`='05172021'; <--- produces rows after setting the above config to false
> Attaching parquet data files for reference.
--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-23556) Support hive.metastore.limit.partition.request for get_partitions_ps
[ https://issues.apache.org/jira/browse/HIVE-23556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17363755#comment-17363755 ] Zoltan Haindrich commented on HIVE-23556: - [~touchida]: could you please open a PR on github with your patch? > Support hive.metastore.limit.partition.request for get_partitions_ps > > > Key: HIVE-23556 > URL: https://issues.apache.org/jira/browse/HIVE-23556 > Project: Hive > Issue Type: Improvement >Reporter: Toshihiko Uchida >Assignee: Toshihiko Uchida >Priority: Minor > Attachments: HIVE-23556.2.patch, HIVE-23556.3.patch, > HIVE-23556.4.patch, HIVE-23556.patch > > > HIVE-13884 added the configuration hive.metastore.limit.partition.request to > limit the number of partitions that can be requested. > Currently, it takes in effect for the following MetaStore APIs > * get_partitions, > * get_partitions_with_auth, > * get_partitions_by_filter, > * get_partitions_spec_by_filter, > * get_partitions_by_expr, > but not for > * get_partitions_ps, > * get_partitions_ps_with_auth. > This issue proposes to apply the configuration also to get_partitions_ps and > get_partitions_ps_with_auth. -- This message was sent by Atlassian Jira (v8.3.4#803005)
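The guard this issue wants to extend to get_partitions_ps and get_partitions_ps_with_auth is conceptually small: before answering, compare the number of partitions a request would return against the configured cap. A minimal sketch of that check follows; the class and method names are hypothetical, not the actual HMS code, though the -1 meaning "unlimited" matches the documented default of hive.metastore.limit.partition.request.

```java
public class PartitionLimitCheck {

    // Mirrors the idea behind hive.metastore.limit.partition.request:
    // reject a request whose partition count exceeds the configured cap.
    // A limit of -1 means "unlimited", matching the config's default.
    static void checkLimit(int requested, int limit) {
        if (limit > -1 && requested > limit) {
            throw new IllegalStateException(
                "Number of partitions scanned (" + requested
                + ") exceeds limit (" + limit + ")");
        }
    }

    public static void main(String[] args) {
        checkLimit(500, -1);   // unlimited: passes
        checkLimit(500, 1000); // under the cap: passes
        try {
            checkLimit(5000, 1000); // over the cap: rejected
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Applying the config to the two missing APIs then amounts to calling the same guard on their code paths before materialising results.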
[jira] [Work logged] (HIVE-25213) Implement List<Table> getTables() for existing connectors.
[ https://issues.apache.org/jira/browse/HIVE-25213?focusedWorklogId=611436=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611436 ] ASF GitHub Bot logged work on HIVE-25213: - Author: ASF GitHub Bot Created on: 15/Jun/21 16:25 Start Date: 15/Jun/21 16:25 Worklog Time Spent: 10m Work Description: dantongdong commented on a change in pull request #2371: URL: https://github.com/apache/hive/pull/2371#discussion_r651955572 ## File path: itests/qtest/target/db_for_connectortest.db/service.properties ## @@ -0,0 +1,23 @@ +#/private/tmp/db_for_connectortest.db Review comment: Yes, we have discussed this over the meeting but I forgot to add it. Will change it. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 611436) Time Spent: 0.5h (was: 20m) > Implement List getTables() for existing connectors. > -- > > Key: HIVE-25213 > URL: https://issues.apache.org/jira/browse/HIVE-25213 > Project: Hive > Issue Type: Sub-task >Reporter: Naveen Gangam >Assignee: Dantong Dong >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > In the initial implementation, connector providers do not implement the > getTables(string pattern) spi. We had deferred it for later. Only > getTableNames() and getTable() were implemented. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25238) Make SSL cipher suites configurable for Hive Web UI and HS2
[ https://issues.apache.org/jira/browse/HIVE-25238?focusedWorklogId=611404=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611404 ]

ASF GitHub Bot logged work on HIVE-25238:
-
Author: ASF GitHub Bot
Created on: 15/Jun/21 15:38
Start Date: 15/Jun/21 15:38
Worklog Time Spent: 10m

Work Description: yongzhi merged pull request #2385:
URL: https://github.com/apache/hive/pull/2385

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 611404)
Time Spent: 50m (was: 40m)

> Make SSL cipher suites configurable for Hive Web UI and HS2
> ---
>
> Key: HIVE-25238
> URL: https://issues.apache.org/jira/browse/HIVE-25238
> Project: Hive
> Issue Type: Improvement
> Components: HiveServer2, Web UI
> Reporter: Yongzhi Chen
> Assignee: Yongzhi Chen
> Priority: Major
> Labels: pull-request-available
> Time Spent: 50m
> Remaining Estimate: 0h
>
> When starting a Jetty HTTP server, one can explicitly exclude certain
> (insecure) SSL cipher suites. This can be especially important when Hive
> needs to be compliant with security regulations. Properties need to be added
> so that the Hive Web UI and HiveServer2 support this.
> For the Hive binary CLI server, we can set the SSL cipher suites to include.
--
This message was sent by Atlassian Jira (v8.3.4#803005)
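The exclusion mechanism the issue describes amounts to filtering the supported cipher suites against an exclude list before the server starts (Jetty exposes this shape of exclusion via its SSL context factory). A minimal, JDK-only sketch of the filtering step; the suite names and exclusion substrings below are illustrative, not Hive's actual configuration values:

```java
import java.util.Arrays;
import java.util.List;

public class CipherSuiteFilter {

    // Drop any cipher suite whose name contains one of the excluded
    // substrings (e.g. legacy RC4 or 3DES suites), leaving the rest intact.
    static String[] exclude(String[] supported, List<String> excludedSubstrings) {
        return Arrays.stream(supported)
                .filter(s -> excludedSubstrings.stream().noneMatch(s::contains))
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        String[] supported = {
            "TLS_AES_128_GCM_SHA256",
            "SSL_RSA_WITH_RC4_128_SHA",
            "SSL_RSA_WITH_3DES_EDE_CBC_SHA"
        };
        String[] allowed = exclude(supported, List.of("RC4", "3DES"));
        System.out.println(Arrays.toString(allowed)); // only the TLS_AES suite remains
    }
}
```

The configurable part is then just wiring a comma-separated property into the exclude list handed to the HTTP server.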
[jira] [Work logged] (HIVE-25204) Reduce overhead of adding notification log for update partition column statistics
[ https://issues.apache.org/jira/browse/HIVE-25204?focusedWorklogId=611385=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611385 ]

ASF GitHub Bot logged work on HIVE-25204:
-
Author: ASF GitHub Bot
Created on: 15/Jun/21 15:07
Start Date: 15/Jun/21 15:07
Worklog Time Spent: 10m

Work Description: maheshk114 commented on a change in pull request #2365:
URL: https://github.com/apache/hive/pull/2365#discussion_r651886691

## File path: hcatalog/server-extensions/src/main/java/org/apache/hive/hcatalog/listener/DbNotificationListener.java ##
@@ -905,25 +909,32 @@ public void onUpdatePartitionColumnStat(UpdatePartitionColumnStatEvent updatePar
   }

   @Override
-  public void onUpdatePartitionColumnStatDirectSql(UpdatePartitionColumnStatEvent updatePartColStatEvent,
-      Connection dbConn, SQLGenerator sqlGenerator)
+  public void onUpdatePartitionColumnStatInBatch(UpdatePartitionColumnStatEventBatch updatePartColStatEventBatch,

Review comment: Yes; for normal listeners we cannot change this, as they may be expecting the data in that way. The change is done only for transactional listeners (DbNotificationListener is a transactional listener). The notifications for transactional listeners are done inside the direct SQL method, as they have to be within the same transaction. For normal listeners we need not have them in the same transaction.

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 611385) Time Spent: 1h (was: 50m) > Reduce overhead of adding notification log for update partition column > statistics > - > > Key: HIVE-25204 > URL: https://issues.apache.org/jira/browse/HIVE-25204 > Project: Hive > Issue Type: Sub-task > Components: Hive, HiveServer2 >Reporter: mahesh kumar behera >Assignee: mahesh kumar behera >Priority: Major > Labels: perfomance, pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > The notification logs for partition column statistics can be optimised by > adding them in batch. In the current implementation its done one by one > causing multiple sql execution in the backend RDBMS. These SQL executions can > be batched to reduce the execution time. -- This message was sent by Atlassian Jira (v8.3.4#803005)
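The batching this issue describes, writing many notification-log rows per statement instead of one statement per event, starts with grouping the events. A minimal sketch of just the grouping step; `toBatches` is a hypothetical helper (the actual change lives in DbNotificationListener and the direct-SQL layer), and each resulting batch would then be written with a single multi-row INSERT:

```java
import java.util.ArrayList;
import java.util.List;

public class BatchPartitioner {

    // Split a list of events into fixed-size batches so each batch can be
    // flushed to the backend RDBMS in one statement instead of one
    // statement per event. The last batch may be smaller than batchSize.
    static <T> List<List<T>> toBatches(List<T> events, int batchSize) {
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < events.size(); i += batchSize) {
            batches.add(events.subList(i, Math.min(i + batchSize, events.size())));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> events = List.of(1, 2, 3, 4, 5, 6, 7);
        System.out.println(toBatches(events, 3)); // [[1, 2, 3], [4, 5, 6], [7]]
    }
}
```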
[jira] [Work logged] (HIVE-25233) Removing deprecated unix_timestamp UDF
[ https://issues.apache.org/jira/browse/HIVE-25233?focusedWorklogId=611380=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611380 ]

ASF GitHub Bot logged work on HIVE-25233:
-
Author: ASF GitHub Bot
Created on: 15/Jun/21 14:52
Start Date: 15/Jun/21 14:52
Worklog Time Spent: 10m

Work Description: kgyrtkirk commented on pull request #2380:
URL: https://github.com/apache/hive/pull/2380#issuecomment-861567894
you seem to have hit the "surefire bug" in your testruns; you should run:
```
mvn install -pl itests/qtest -Pqsplits -Dtest=org.apache.hadoop.hive.cli.split6.TestMiniLlapLocalCliDriver -Dtest.output.overwrite
```
or similar locally to run the whole split which was "failed-to-read"

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 611380)
Time Spent: 20m (was: 10m)

> Removing deprecated unix_timestamp UDF
> --
>
> Key: HIVE-25233
> URL: https://issues.apache.org/jira/browse/HIVE-25233
> Project: Hive
> Issue Type: Task
> Components: UDF
> Affects Versions: All Versions
> Reporter: Ashish Sharma
> Assignee: Ashish Sharma
> Priority: Trivial
> Labels: pull-request-available
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Description
> The unix_timestamp() UDF was deprecated as part of
> https://issues.apache.org/jira/browse/HIVE-10728. The internal
> GenericUDFUnixTimeStamp extends GenericUDFToUnixTimeStamp and calls
> to_utc_timestamp() for unix_timestamp(string date) and unix_timestamp(string
> date, string pattern).
> unix_timestamp() => CURRENT_TIMESTAMP
> unix_timestamp(string date) => to_unix_timestamp()
> unix_timestamp(string date, string pattern) => to_unix_timestamp()
> We should clean up unix_timestamp() and point it to to_unix_timestamp().
--
This message was sent by Atlassian Jira (v8.3.4#803005)
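For reference, the to_unix_timestamp target of the mapping above parses a date string and returns seconds since the Unix epoch. The same computation can be sketched with java.time, pinned to UTC here for determinism (Hive itself applies the session time zone, and this is an illustration, not the UDF's code):

```java
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

public class ToUnixTimestampSketch {

    // Parse a date string with an explicit pattern and convert midnight of
    // that date (UTC) to seconds since 1970-01-01T00:00:00Z, the value a
    // call like to_unix_timestamp('2021-06-15', 'yyyy-MM-dd') produces for
    // a UTC session.
    static long toUnixTimestamp(String date, String pattern) {
        DateTimeFormatter fmt = DateTimeFormatter.ofPattern(pattern);
        return LocalDate.parse(date, fmt).atStartOfDay(ZoneOffset.UTC).toEpochSecond();
    }

    public static void main(String[] args) {
        System.out.println(toUnixTimestamp("2021-06-15", "yyyy-MM-dd")); // 1623715200
    }
}
```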
[jira] [Commented] (HIVE-25249) Fix TestWorker
[ https://issues.apache.org/jira/browse/HIVE-25249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17363684#comment-17363684 ] Zoltan Haindrich commented on HIVE-25249: - fyi: [~pvargacl], [~dkuzmenko] > Fix TestWorker > -- > > Key: HIVE-25249 > URL: https://issues.apache.org/jira/browse/HIVE-25249 > Project: Hive > Issue Type: Bug >Reporter: Zoltan Haindrich >Priority: Major > > http://ci.hive.apache.org/job/hive-precommit/job/PR-2381/1/ > http://ci.hive.apache.org/job/hive-flaky-check/236/ -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25246) Fix the clean up of open repl created transactions
[ https://issues.apache.org/jira/browse/HIVE-25246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-25246: -- Labels: pull-request-available (was: ) > Fix the clean up of open repl created transactions > -- > > Key: HIVE-25246 > URL: https://issues.apache.org/jira/browse/HIVE-25246 > Project: Hive > Issue Type: Improvement >Reporter: Haymant Mangla >Assignee: Haymant Mangla >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25246) Fix the clean up of open repl created transactions
[ https://issues.apache.org/jira/browse/HIVE-25246?focusedWorklogId=611367=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611367 ] ASF GitHub Bot logged work on HIVE-25246: - Author: ASF GitHub Bot Created on: 15/Jun/21 14:27 Start Date: 15/Jun/21 14:27 Worklog Time Spent: 10m Work Description: hmangla98 opened a new pull request #2396: URL: https://github.com/apache/hive/pull/2396 Fix the clean up of open repl created transactions -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 611367) Remaining Estimate: 0h Time Spent: 10m > Fix the clean up of open repl created transactions > -- > > Key: HIVE-25246 > URL: https://issues.apache.org/jira/browse/HIVE-25246 > Project: Hive > Issue Type: Improvement >Reporter: Haymant Mangla >Assignee: Haymant Mangla >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23633) Metastore some JDO query objects do not close properly
[ https://issues.apache.org/jira/browse/HIVE-23633?focusedWorklogId=611363=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611363 ]

ASF GitHub Bot logged work on HIVE-23633:
-
Author: ASF GitHub Bot
Created on: 15/Jun/21 14:16
Start Date: 15/Jun/21 14:16
Worklog Time Spent: 10m

Work Description: dengzhhu653 opened a new pull request #2344:
URL: https://github.com/apache/hive/pull/2344
…properly
### What changes were proposed in this pull request?
### Why are the changes needed?
### Does this PR introduce _any_ user-facing change?
### How was this patch tested?

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 611363)
Time Spent: 5h 50m (was: 5h 40m)

> Metastore some JDO query objects do not close properly
> --
>
> Key: HIVE-23633
> URL: https://issues.apache.org/jira/browse/HIVE-23633
> Project: Hive
> Issue Type: Bug
> Reporter: Zhihua Deng
> Assignee: Zhihua Deng
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-23633.01.patch
>
> Time Spent: 5h 50m
> Remaining Estimate: 0h
>
> After [HIVE-10895|https://issues.apache.org/jira/browse/HIVE-10895] was patched,
> the metastore has still seen a memory leak on DB resources: many
> StatementImpls are left unclosed.
--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-25248) Fix TestLlapTaskSchedulerService#testForcedLocalityMultiplePreemptionsSameHost1
[ https://issues.apache.org/jira/browse/HIVE-25248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis reassigned HIVE-25248: - Assignee: Panagiotis Garefalakis > Fix > TestLlapTaskSchedulerService#testForcedLocalityMultiplePreemptionsSameHost1 > --- > > Key: HIVE-25248 > URL: https://issues.apache.org/jira/browse/HIVE-25248 > Project: Hive > Issue Type: Bug >Reporter: Zoltan Haindrich >Assignee: Panagiotis Garefalakis >Priority: Major > > This test is failing randomly recently > http://ci.hive.apache.org/job/hive-flaky-check/233/testReport/org.apache.hadoop.hive.llap.tezplugins/TestLlapTaskSchedulerService/testForcedLocalityMultiplePreemptionsSameHost1/ -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23633) Metastore some JDO query objects do not close properly
[ https://issues.apache.org/jira/browse/HIVE-23633?focusedWorklogId=611362=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611362 ]

ASF GitHub Bot logged work on HIVE-23633:
-
Author: ASF GitHub Bot
Created on: 15/Jun/21 14:14
Start Date: 15/Jun/21 14:14
Worklog Time Spent: 10m

Work Description: dengzhhu653 closed pull request #2344:
URL: https://github.com/apache/hive/pull/2344

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 611362)
Time Spent: 5h 40m (was: 5.5h)

> Metastore some JDO query objects do not close properly
> --
>
> Key: HIVE-23633
> URL: https://issues.apache.org/jira/browse/HIVE-23633
> Project: Hive
> Issue Type: Bug
> Reporter: Zhihua Deng
> Assignee: Zhihua Deng
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-23633.01.patch
>
> Time Spent: 5h 40m
> Remaining Estimate: 0h
>
> After [HIVE-10895|https://issues.apache.org/jira/browse/HIVE-10895] was patched,
> the metastore has still seen a memory leak on DB resources: many
> StatementImpls are left unclosed.
--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25055) Improve the exception handling in HMSHandler
[ https://issues.apache.org/jira/browse/HIVE-25055?focusedWorklogId=611360=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611360 ] ASF GitHub Bot logged work on HIVE-25055: - Author: ASF GitHub Bot Created on: 15/Jun/21 14:12 Start Date: 15/Jun/21 14:12 Worklog Time Spent: 10m Work Description: dengzhhu653 commented on pull request #2218: URL: https://github.com/apache/hive/pull/2218#issuecomment-861534654 Hi, @vihangk1, would you mind taking another look if have secs? thanks! :) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 611360) Time Spent: 3h 20m (was: 3h 10m) > Improve the exception handling in HMSHandler > > > Key: HIVE-25055 > URL: https://issues.apache.org/jira/browse/HIVE-25055 > Project: Hive > Issue Type: Improvement > Components: Standalone Metastore >Reporter: Zhihua Deng >Assignee: Zhihua Deng >Priority: Minor > Labels: pull-request-available > Time Spent: 3h 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24802) Show operation log at webui
[ https://issues.apache.org/jira/browse/HIVE-24802?focusedWorklogId=611356=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611356 ]

ASF GitHub Bot logged work on HIVE-24802:
-
Author: ASF GitHub Bot
Created on: 15/Jun/21 14:10
Start Date: 15/Jun/21 14:10
Worklog Time Spent: 10m

Work Description: dengzhhu653 commented on pull request #1998:
URL: https://github.com/apache/hive/pull/1998#issuecomment-861532574
@pvary sorry for pinging, could the PR be moved a little further? thank you! :)

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 611356)
Time Spent: 6h (was: 5h 50m)

> Show operation log at webui
> ---
>
> Key: HIVE-24802
> URL: https://issues.apache.org/jira/browse/HIVE-24802
> Project: Hive
> Issue Type: Improvement
> Components: HiveServer2
> Reporter: Zhihua Deng
> Assignee: Zhihua Deng
> Priority: Minor
> Labels: pull-request-available
> Attachments: operationlog.png
>
> Time Spent: 6h
> Remaining Estimate: 0h
>
> Currently we provide getQueryLog in HiveStatement to fetch the operation log,
> and the operation log is deleted on operation closing (delayed for canceled
> operations). Sometimes it is not easy for the user (JDBC) or administrators
> to dig into the details of a finished (failed) operation, so we present the
> operation log on the web UI and keep it for some time for later analysis.
--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25248) Fix TestLlapTaskSchedulerService#testForcedLocalityMultiplePreemptionsSameHost1
[ https://issues.apache.org/jira/browse/HIVE-25248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich updated HIVE-25248: Summary: Fix TestLlapTaskSchedulerService#testForcedLocalityMultiplePreemptionsSameHost1 (was: Fix .TestLlapTaskSchedulerService#testForcedLocalityMultiplePreemptionsSameHost1) > Fix > TestLlapTaskSchedulerService#testForcedLocalityMultiplePreemptionsSameHost1 > --- > > Key: HIVE-25248 > URL: https://issues.apache.org/jira/browse/HIVE-25248 > Project: Hive > Issue Type: Bug >Reporter: Zoltan Haindrich >Priority: Major > > This test is failing randomly recently > http://ci.hive.apache.org/job/hive-flaky-check/233/testReport/org.apache.hadoop.hive.llap.tezplugins/TestLlapTaskSchedulerService/testForcedLocalityMultiplePreemptionsSameHost1/ -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-25248) Fix .TestLlapTaskSchedulerService#testForcedLocalityMultiplePreemptionsSameHost1
[ https://issues.apache.org/jira/browse/HIVE-25248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17363658#comment-17363658 ]

Zoltan Haindrich commented on HIVE-25248:
-
another testcase from this class might also be a candidate
http://ci.hive.apache.org/job/hive-precommit/job/PR-2381/4/testReport/junit/org.apache.hadoop.hive.llap.tezplugins/TestLlapTaskSchedulerService/Testing___split_17___PostProcess___testPreemption/

> Fix
> .TestLlapTaskSchedulerService#testForcedLocalityMultiplePreemptionsSameHost1
> 
>
> Key: HIVE-25248
> URL: https://issues.apache.org/jira/browse/HIVE-25248
> Project: Hive
> Issue Type: Bug
> Reporter: Zoltan Haindrich
> Priority: Major
>
> This test is failing randomly recently
> http://ci.hive.apache.org/job/hive-flaky-check/233/testReport/org.apache.hadoop.hive.llap.tezplugins/TestLlapTaskSchedulerService/testForcedLocalityMultiplePreemptionsSameHost1/
--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-25246) Fix the clean up of open repl created transactions
[ https://issues.apache.org/jira/browse/HIVE-25246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haymant Mangla reassigned HIVE-25246: - > Fix the clean up of open repl created transactions > -- > > Key: HIVE-25246 > URL: https://issues.apache.org/jira/browse/HIVE-25246 > Project: Hive > Issue Type: Improvement >Reporter: Haymant Mangla >Assignee: Haymant Mangla >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25234) Implement ALTER TABLE ... SET PARTITION SPEC to change partitioning on Iceberg tables
[ https://issues.apache.org/jira/browse/HIVE-25234?focusedWorklogId=611283=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611283 ]

ASF GitHub Bot logged work on HIVE-25234:
-
Author: ASF GitHub Bot
Created on: 15/Jun/21 12:18
Start Date: 15/Jun/21 12:18
Worklog Time Spent: 10m

Work Description: lcspinter commented on a change in pull request #2382:
URL: https://github.com/apache/hive/pull/2382#discussion_r651732394

## File path: iceberg/iceberg-handler/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerNoScan.java ##
@@ -155,6 +155,58 @@ public void after() throws Exception {
     HiveIcebergStorageHandlerTestUtils.close(shell);
   }

+  @Test
+  public void testSetPartitionTransform() {
+    Schema schema = new Schema(
+        optional(1, "id", Types.LongType.get()),
+        optional(2, "year_field", Types.DateType.get()),
+        optional(3, "month_field", Types.TimestampType.withZone()),
+        optional(4, "day_field", Types.TimestampType.withoutZone()),
+        optional(5, "hour_field", Types.TimestampType.withoutZone()),
+        optional(6, "truncate_field", Types.StringType.get()),
+        optional(7, "bucket_field", Types.StringType.get()),
+        optional(8, "identity_field", Types.StringType.get())
+    );
+
+    TableIdentifier identifier = TableIdentifier.of("default", "part_test");
+    shell.executeStatement("CREATE EXTERNAL TABLE " + identifier +
+        " PARTITIONED BY SPEC (year(year_field), hour(hour_field), " +
+        "truncate(2, truncate_field), bucket(2, bucket_field), identity_field)" +
+        " STORED BY ICEBERG " +
+        testTables.locationForCreateTableSQL(identifier) +
+        "TBLPROPERTIES ('" + InputFormatConfig.TABLE_SCHEMA + "'='" +
+        SchemaParser.toJson(schema) + "', " +
+        "'" + InputFormatConfig.CATALOG_NAME + "'='" + Catalogs.ICEBERG_DEFAULT_CATALOG_NAME + "')");
+
+    PartitionSpec spec = PartitionSpec.builderFor(schema)
+        .year("year_field")
+        .hour("hour_field")
+        .truncate("truncate_field", 2)
+        .bucket("bucket_field", 2)
+        .identity("identity_field")
+        .build();
+
+    Table table = testTables.loadTable(identifier);
+    Assert.assertEquals(spec, table.spec());
+
+    shell.executeStatement("ALTER TABLE default.part_test SET PARTITION SPEC(year(year_field), month(month_field), " +
+        "day(day_field))");
+
+    spec = PartitionSpec.builderFor(schema)
+        .withSpecId(1)
+        .year("year_field")
+        .alwaysNull("hour_field", "hour_field_hour")
+        .alwaysNull("truncate_field", "truncate_field_trunc")
+        .alwaysNull("bucket_field", "bucket_field_bucket")
+        .alwaysNull("identity_field", "identity_field")
+        .month("month_field")
+        .day("day_field")
+        .build();
+
+    table.refresh();
+    Assert.assertEquals(spec, table.spec());
+  }
+

Review comment: I added an additional test case to cover the partition evolution. Thanks for the idea!

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 611283)
Time Spent: 1h 10m (was: 1h)

> Implement ALTER TABLE ... SET PARTITION SPEC to change partitioning on
> Iceberg tables
> -
>
> Key: HIVE-25234
> URL: https://issues.apache.org/jira/browse/HIVE-25234
> Project: Hive
> Issue Type: Improvement
> Reporter: László Pintér
> Assignee: László Pintér
> Priority: Major
> Labels: pull-request-available
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> Provide a way to change the schema and the Iceberg partitioning specification
> using Hive syntax.
> {code:sql}
> ALTER TABLE tbl SET PARTITION SPEC(...)
> {code}
--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25204) Reduce overhead of adding notification log for update partition column statistics
[ https://issues.apache.org/jira/browse/HIVE-25204?focusedWorklogId=611282=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611282 ]

ASF GitHub Bot logged work on HIVE-25204:
-
Author: ASF GitHub Bot
Created on: 15/Jun/21 12:17
Start Date: 15/Jun/21 12:17
Worklog Time Spent: 10m

Work Description: aasha commented on a change in pull request #2365:
URL: https://github.com/apache/hive/pull/2365#discussion_r651731735

## File path: hcatalog/server-extensions/src/main/java/org/apache/hive/hcatalog/listener/DbNotificationListener.java ##
@@ -905,25 +909,32 @@ public void onUpdatePartitionColumnStat(UpdatePartitionColumnStatEvent updatePar
   }

   @Override
-  public void onUpdatePartitionColumnStatDirectSql(UpdatePartitionColumnStatEvent updatePartColStatEvent,
-      Connection dbConn, SQLGenerator sqlGenerator)
+  public void onUpdatePartitionColumnStatInBatch(UpdatePartitionColumnStatEventBatch updatePartColStatEventBatch,

Review comment: updatePartitionColStatsForOneBatch calls the direct SQL method, but it also does the following for each event:
MetaStoreListenerNotifier.notifyEvent(listeners, EventMessage.EventType.UPDATE_PARTITION_COLUMN_STAT,
    new UpdatePartitionColumnStatEvent(colStats, partVals, parameters, tbl, writeId, this));

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 611282) Time Spent: 50m (was: 40m) > Reduce overhead of adding notification log for update partition column > statistics > - > > Key: HIVE-25204 > URL: https://issues.apache.org/jira/browse/HIVE-25204 > Project: Hive > Issue Type: Sub-task > Components: Hive, HiveServer2 >Reporter: mahesh kumar behera >Assignee: mahesh kumar behera >Priority: Major > Labels: perfomance, pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > The notification logs for partition column statistics can be optimised by > adding them in batch. In the current implementation its done one by one > causing multiple sql execution in the backend RDBMS. These SQL executions can > be batched to reduce the execution time. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25242) Query performs extremely slowly with hive.vectorized.adaptor.usage.mode = chosen
[ https://issues.apache.org/jira/browse/HIVE-25242?focusedWorklogId=611280=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611280 ]

ASF GitHub Bot logged work on HIVE-25242:
-
Author: ASF GitHub Bot
Created on: 15/Jun/21 12:13
Start Date: 15/Jun/21 12:13
Worklog Time Spent: 10m

Work Description: zeroflag commented on pull request #2390:
URL: https://github.com/apache/hive/pull/2390#issuecomment-861446954
retest this please

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 611280)
Time Spent: 20m (was: 10m)

> Query performs extremely slowly with hive.vectorized.adaptor.usage.mode = chosen
> ---
>
> Key: HIVE-25242
> URL: https://issues.apache.org/jira/browse/HIVE-25242
> Project: Hive
> Issue Type: Bug
> Components: HiveServer2
> Affects Versions: 4.0.0
> Reporter: Attila Magyar
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> If hive.vectorized.adaptor.usage.mode is set to "chosen", only certain UDFs
> are vectorized through the vectorized adaptor.
> Queries like this one perform very slowly, because the concat is not chosen
> to be vectorized:
> {code:java}
> select count(*) from tbl where to_date(concat(year, '-', month, '-', day)) between to_date('2018-12-01') and to_date('2021-03-01'); {code}
--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25235) Remove ThreadPoolExecutorWithOomHook
[ https://issues.apache.org/jira/browse/HIVE-25235?focusedWorklogId=611220=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611220 ] ASF GitHub Bot logged work on HIVE-25235: - Author: ASF GitHub Bot Created on: 15/Jun/21 10:01 Start Date: 15/Jun/21 10:01 Worklog Time Spent: 10m Work Description: miklosgergely commented on a change in pull request #2383: URL: https://github.com/apache/hive/pull/2383#discussion_r651639876 ## File path: service/src/java/org/apache/hive/service/cli/session/SessionManager.java ## @@ -224,7 +224,7 @@ private void createBackgroundOperationPool() { // Threads terminate when they are idle for more than the keepAliveTime // A bounded blocking queue is used to queue incoming operations, if #operations > poolSize String threadPoolName = "HiveServer2-Background-Pool"; -final BlockingQueue queue = new LinkedBlockingQueue(poolQueueSize); +final BlockingQueue queue = new LinkedBlockingQueue(poolQueueSize); Review comment: You can use <> here -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 611220) Time Spent: 50m (was: 40m) > Remove ThreadPoolExecutorWithOomHook > > > Key: HIVE-25235 > URL: https://issues.apache.org/jira/browse/HIVE-25235 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > While I was looking at [HIVE-24846] to better perform OOM logging and I just > realized that this is not a good way to handle OOM. > https://stackoverflow.com/questions/1692230/is-it-possible-to-catch-out-of-memory-exception-in-java > bq. 
there's likely no easy way for you to recover from it if you do catch it > If we want to handle OOM, it's best to do it from outside. It's best to do it > with the JVM facilities: > {{-XX:+ExitOnOutOfMemoryError}} > {{-XX:OnOutOfMemoryError}} > It seems odd that the OOM handler attempts to load a handler and then do more > work when clearly the server is hosed at this point and just requesting to do > more work will further add to memory pressure. > The current OOM logic in {{HiveServer2OomHookRunner}} causes HiveServer2 to > shutdown, but we already have that with the JVM shutdown hook. This JVM > shutdown hook is triggered if {{-XX:OnOutOfMemoryError="kill -9 %p"}} exists > and is the appropriate thing to do. > https://github.com/apache/hive/blob/328d197431b2ff1000fd9c56ce758013eff81ad8/service/src/java/org/apache/hive/service/server/HiveServer2.java#L443-L444 > https://github.com/apache/hive/blob/cb0541a31b87016fae8e4c0e7130532c6e5f8de7/service/src/java/org/apache/hive/service/server/HiveServer2OomHookRunner.java#L42-L44 -- This message was sent by Atlassian Jira (v8.3.4#803005)
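The review comment above ("You can use <> here") refers to Java's diamond operator on the bounded queue in `SessionManager.createBackgroundOperationPool()`. A minimal, self-contained sketch of that pattern, where the pool size, queue size, and keep-alive are hypothetical placeholders rather than Hive's configured values:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class BackgroundPoolSketch {
    // Hypothetical stand-ins for the HiveConf-driven values in SessionManager.
    static final int POOL_SIZE = 4;
    static final int POOL_QUEUE_SIZE = 16;
    static final long KEEP_ALIVE_SECONDS = 10;

    static ThreadPoolExecutor createBackgroundOperationPool() {
        // The diamond operator (<>) lets the compiler infer <Runnable>,
        // as the review comment suggests; the queue stays bounded so incoming
        // operations queue up once all pool threads are busy.
        BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>(POOL_QUEUE_SIZE);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(POOL_SIZE, POOL_SIZE,
                KEEP_ALIVE_SECONDS, TimeUnit.SECONDS, queue);
        // Let threads terminate when idle for more than the keep-alive time.
        pool.allowCoreThreadTimeOut(true);
        return pool;
    }

    public static void main(String[] args) throws Exception {
        ThreadPoolExecutor pool = createBackgroundOperationPool();
        Future<Integer> answer = pool.submit(() -> 21 * 2);
        System.out.println(answer.get()); // prints 42
        pool.shutdown();
    }
}
```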
[jira] [Commented] (HIVE-25173) Fix build failure of hive-pre-upgrade due to missing dependency on pentaho-aggdesigner-algorithm
[ https://issues.apache.org/jira/browse/HIVE-25173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17363518#comment-17363518 ] Stamatis Zampetakis commented on HIVE-25173: The pentaho-aggdesigner-algorithm artifacts were present only in the spring repo and not in maven central. I think the spring repo was retired so it is not possible to find the artifacts any more. For sure we can exclude this dep and move forward for future releases but any old tag that relies on this cannot be built (including calcite-1.10.0 itself). Don't know if it is possible to push these artifacts to maven central at this point. CC [~jhyde] just for awareness. > Fix build failure of hive-pre-upgrade due to missing dependency on > pentaho-aggdesigner-algorithm > > > Key: HIVE-25173 > URL: https://issues.apache.org/jira/browse/HIVE-25173 > Project: Hive > Issue Type: Improvement >Affects Versions: 3.1.2 >Reporter: Masatake Iwasaki >Assignee: Masatake Iwasaki >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > {noformat} > [ERROR] Failed to execute goal on project hive-pre-upgrade: Could not resolve > dependencies for project org.apache.hive:hive-pre-upgrade:jar:4.0.0-SNAPSHOT: > Failure to find org.pentaho:pentaho-aggdesigner-algorithm:jar:5.1.5-jhyde in > https://repo.maven.apache.org/maven2 was cached in the local repository, > resolution will not be reattempted until the update interval of central has > elapsed or updates are forced > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-25224) Multi insert statements involving tables with different bucketing_versions results in error
[ https://issues.apache.org/jira/browse/HIVE-25224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich resolved HIVE-25224. - Fix Version/s: 4.0.0 Resolution: Fixed merged into master. Thank you Krisztian for reviewing the changes! > Multi insert statements involving tables with different bucketing_versions > results in error > --- > > Key: HIVE-25224 > URL: https://issues.apache.org/jira/browse/HIVE-25224 > Project: Hive > Issue Type: Bug >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 50m > Remaining Estimate: 0h > > {code} > drop table if exists t; > drop table if exists t2; > drop table if exists t3; > create table t (a integer); > create table t2 (a integer); > create table t3 (a integer); > alter table t set tblproperties ('bucketing_version'='1'); > explain from t3 insert into t select a insert into t2 select a; > {code} > results in > {code} > Error: Error while compiling statement: FAILED: RuntimeException Error > setting bucketingVersion for group: [[op: FS[2], bucketingVersion=1], [op: > FS[11], bucketingVersion=2]] (state=42000,code=4) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25224) Multi insert statements involving tables with different bucketing_versions results in error
[ https://issues.apache.org/jira/browse/HIVE-25224?focusedWorklogId=611212=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611212 ] ASF GitHub Bot logged work on HIVE-25224: - Author: ASF GitHub Bot Created on: 15/Jun/21 09:45 Start Date: 15/Jun/21 09:45 Worklog Time Spent: 10m Work Description: kgyrtkirk commented on pull request #2381: URL: https://github.com/apache/hive/pull/2381#issuecomment-861354189 I've merged this - testruns had multiple unrelated failures in a row; I'll go and clean up these flaky tests -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 611212) Time Spent: 50m (was: 40m) > Multi insert statements involving tables with different bucketing_versions > results in error > --- > > Key: HIVE-25224 > URL: https://issues.apache.org/jira/browse/HIVE-25224 > Project: Hive > Issue Type: Bug >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > {code} > drop table if exists t; > drop table if exists t2; > drop table if exists t3; > create table t (a integer); > create table t2 (a integer); > create table t3 (a integer); > alter table t set tblproperties ('bucketing_version'='1'); > explain from t3 insert into t select a insert into t2 select a; > {code} > results in > {code} > Error: Error while compiling statement: FAILED: RuntimeException Error > setting bucketingVersion for group: [[op: FS[2], bucketingVersion=1], [op: > FS[11], bucketingVersion=2]] (state=42000,code=4) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25224) Multi insert statements involving tables with different bucketing_versions results in error
[ https://issues.apache.org/jira/browse/HIVE-25224?focusedWorklogId=611209=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611209 ] ASF GitHub Bot logged work on HIVE-25224: - Author: ASF GitHub Bot Created on: 15/Jun/21 09:44 Start Date: 15/Jun/21 09:44 Worklog Time Spent: 10m Work Description: kgyrtkirk merged pull request #2381: URL: https://github.com/apache/hive/pull/2381 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 611209) Time Spent: 40m (was: 0.5h) > Multi insert statements involving tables with different bucketing_versions > results in error > --- > > Key: HIVE-25224 > URL: https://issues.apache.org/jira/browse/HIVE-25224 > Project: Hive > Issue Type: Bug >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > {code} > drop table if exists t; > drop table if exists t2; > drop table if exists t3; > create table t (a integer); > create table t2 (a integer); > create table t3 (a integer); > alter table t set tblproperties ('bucketing_version'='1'); > explain from t3 insert into t select a insert into t2 select a; > {code} > results in > {code} > Error: Error while compiling statement: FAILED: RuntimeException Error > setting bucketingVersion for group: [[op: FS[2], bucketingVersion=1], [op: > FS[11], bucketingVersion=2]] (state=42000,code=4) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25238) Make SSL cipher suites configurable for Hive Web UI and HS2
[ https://issues.apache.org/jira/browse/HIVE-25238?focusedWorklogId=611207=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611207 ] ASF GitHub Bot logged work on HIVE-25238: - Author: ASF GitHub Bot Created on: 15/Jun/21 09:38 Start Date: 15/Jun/21 09:38 Worklog Time Spent: 10m Work Description: yongzhi commented on a change in pull request #2385: URL: https://github.com/apache/hive/pull/2385#discussion_r651622848 ## File path: service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpCLIService.java ## @@ -24,10 +24,13 @@ import java.util.concurrent.SynchronousQueue; import java.util.concurrent.ThreadPoolExecutor; import java.util.concurrent.TimeUnit; +import java.util.Set; import javax.net.ssl.KeyManagerFactory; import javax.ws.rs.HttpMethod; +import com.google.common.base.Splitter; Review comment: It is not new; it has been used by HttpServer.java -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 611207) Time Spent: 40m (was: 0.5h) > Make SSL cipher suites configurable for Hive Web UI and HS2 > --- > > Key: HIVE-25238 > URL: https://issues.apache.org/jira/browse/HIVE-25238 > Project: Hive > Issue Type: Improvement > Components: HiveServer2, Web UI >Reporter: Yongzhi Chen >Assignee: Yongzhi Chen >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > When starting a Jetty HTTP server, one can explicitly exclude certain > (insecure) > SSL cipher suites. This can be especially important when Hive > needs to be compliant with security regulations. We need to add properties to > support this for the Hive Web UI and HiveServer2. > For the Hive binary CLI server, we can configure the set of included SSL cipher suites. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25238) Make SSL cipher suites configurable for Hive Web UI and HS2
[ https://issues.apache.org/jira/browse/HIVE-25238?focusedWorklogId=611203=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611203 ] ASF GitHub Bot logged work on HIVE-25238: - Author: ASF GitHub Bot Created on: 15/Jun/21 09:36 Start Date: 15/Jun/21 09:36 Worklog Time Spent: 10m Work Description: yongzhi commented on a change in pull request #2385: URL: https://github.com/apache/hive/pull/2385#discussion_r651621053 ## File path: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java ## @@ -4183,9 +4186,14 @@ private static void populateLlapDaemonVarsSet(Set llapDaemonVarsSetLocal HIVE_SERVER2_SSL_KEYSTORE_PASSWORD("hive.server2.keystore.password", "", "SSL certificate keystore password."), HIVE_SERVER2_SSL_KEYSTORE_TYPE("hive.server2.keystore.type", "", -"SSL certificate keystore type."), +"SSL certificate keystore type."), HIVE_SERVER2_SSL_KEYMANAGERFACTORY_ALGORITHM("hive.server2.keymanagerfactory.algorithm", "", -"SSL certificate keystore algorithm."), +"SSL certificate keystore algorithm."), + HIVE_SERVER2_SSL_HTTP_EXCLUDE_CIPHERSUITES("hive.server2.http.exclude.ciphersuites", "", Review comment: No, for binary Thrift, it uses TSSLTransportFactory.getServerSocket which does not support excluding cipher suites. The setting for HTTP (webui/hs2) can be different from binary as they have different clients. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 611203) Time Spent: 0.5h (was: 20m) > Make SSL cipher suites configurable for Hive Web UI and HS2 > --- > > Key: HIVE-25238 > URL: https://issues.apache.org/jira/browse/HIVE-25238 > Project: Hive > Issue Type: Improvement > Components: HiveServer2, Web UI >Reporter: Yongzhi Chen >Assignee: Yongzhi Chen >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > When starting a Jetty HTTP server, one can explicitly exclude certain > (insecure) > SSL cipher suites. This can be especially important when Hive > needs to be compliant with security regulations. We need to add properties to > support this for the Hive Web UI and HiveServer2. > For the Hive binary CLI server, we can configure the set of included SSL cipher suites. -- This message was sent by Atlassian Jira (v8.3.4#803005)
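The exclude-list behavior discussed in the HIVE-25238 thread above (the `hive.server2.http.exclude.ciphersuites` property appears in the quoted diff) amounts to filtering the suites a TLS endpoint would otherwise enable. A plain-JDK sketch of that filtering, assuming the helper name and the excluded suite are hypothetical illustrations rather than Hive's actual code:

```java
import javax.net.ssl.SSLContext;
import java.util.Arrays;
import java.util.Set;

public class CipherSuiteFilter {
    // Exclude-list semantics: start from the suites the JVM would enable
    // and drop any whose names appear in the configured exclude list.
    static String[] excludeCipherSuites(String[] enabled, Set<String> excluded) {
        return Arrays.stream(enabled)
                .filter(suite -> !excluded.contains(suite))
                .toArray(String[]::new);
    }

    public static void main(String[] args) throws Exception {
        String[] enabled = SSLContext.getDefault()
                .getDefaultSSLParameters().getCipherSuites();
        // Hypothetical exclude list; a real deployment would read it from a
        // comma-separated configuration value.
        Set<String> excluded = Set.of("TLS_RSA_WITH_AES_128_CBC_SHA");
        String[] kept = excludeCipherSuites(enabled, excluded);
        System.out.println((enabled.length - kept.length) + " suite(s) excluded");
    }
}
```

For Jetty-based endpoints such as the Web UI, the filtered list would likely be applied through Jetty's `SslContextFactory` exclude/include cipher-suite setters, which is presumably what the patch wires the new properties into.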
[jira] [Assigned] (HIVE-25245) Hive: merge r20 to cdpd-master
[ https://issues.apache.org/jira/browse/HIVE-25245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich reassigned HIVE-25245: --- Assignee: (was: Zoltan Haindrich) > Hive: merge r20 to cdpd-master > -- > > Key: HIVE-25245 > URL: https://issues.apache.org/jira/browse/HIVE-25245 > Project: Hive > Issue Type: Improvement > Components: Hive >Reporter: László Bodor >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-25245) Hive: merge r20 to cdpd-master
[ https://issues.apache.org/jira/browse/HIVE-25245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich reassigned HIVE-25245: --- Assignee: Zoltan Haindrich > Hive: merge r20 to cdpd-master > -- > > Key: HIVE-25245 > URL: https://issues.apache.org/jira/browse/HIVE-25245 > Project: Hive > Issue Type: Improvement > Components: Hive >Reporter: László Bodor >Assignee: Zoltan Haindrich >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-8143) Create root scratch dir with 733 instead of 777 perms
[ https://issues.apache.org/jira/browse/HIVE-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie reassigned HIVE-8143: --- Assignee: Vaibhav Gumashta > Create root scratch dir with 733 instead of 777 perms > - > > Key: HIVE-8143 > URL: https://issues.apache.org/jira/browse/HIVE-8143 > Project: Hive > Issue Type: Bug > Components: HiveServer2, JDBC >Affects Versions: 0.14.0 >Reporter: Vaibhav Gumashta >Assignee: Vaibhav Gumashta >Priority: Major > Labels: TODOC14 > Fix For: 0.14.0 > > Attachments: HIVE-8143.1.patch, HIVE-8143.2.patch > > > hive.exec.scratchdir which is treated as the root scratch directory on hdfs > only needs to be writable by all. We can use 733 instead of 777 for doing > that. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-8143) Create root scratch dir with 733 instead of 777 perms
[ https://issues.apache.org/jira/browse/HIVE-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie reassigned HIVE-8143: --- Assignee: (was: lujie) > Create root scratch dir with 733 instead of 777 perms > - > > Key: HIVE-8143 > URL: https://issues.apache.org/jira/browse/HIVE-8143 > Project: Hive > Issue Type: Bug > Components: HiveServer2, JDBC >Affects Versions: 0.14.0 >Reporter: Vaibhav Gumashta >Priority: Major > Labels: TODOC14 > Fix For: 0.14.0 > > Attachments: HIVE-8143.1.patch, HIVE-8143.2.patch > > > hive.exec.scratchdir which is treated as the root scratch directory on hdfs > only needs to be writable by all. We can use 733 instead of 777 for doing > that. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-8143) Create root scratch dir with 733 instead of 777 perms
[ https://issues.apache.org/jira/browse/HIVE-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie reassigned HIVE-8143: --- Assignee: lujie (was: Vaibhav Gumashta) > Create root scratch dir with 733 instead of 777 perms > - > > Key: HIVE-8143 > URL: https://issues.apache.org/jira/browse/HIVE-8143 > Project: Hive > Issue Type: Bug > Components: HiveServer2, JDBC >Affects Versions: 0.14.0 >Reporter: Vaibhav Gumashta >Assignee: lujie >Priority: Major > Labels: TODOC14 > Fix For: 0.14.0 > > Attachments: HIVE-8143.1.patch, HIVE-8143.2.patch > > > hive.exec.scratchdir which is treated as the root scratch directory on hdfs > only needs to be writable by all. We can use 733 instead of 777 for doing > that. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25204) Reduce overhead of adding notification log for update partition column statistics
[ https://issues.apache.org/jira/browse/HIVE-25204?focusedWorklogId=611172=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611172 ] ASF GitHub Bot logged work on HIVE-25204: - Author: ASF GitHub Bot Created on: 15/Jun/21 07:35 Start Date: 15/Jun/21 07:35 Worklog Time Spent: 10m Work Description: maheshk114 commented on a change in pull request #2365: URL: https://github.com/apache/hive/pull/2365#discussion_r651526249 ## File path: hcatalog/server-extensions/src/main/java/org/apache/hive/hcatalog/listener/DbNotificationListener.java ## @@ -1194,108 +1205,82 @@ static String quoteString(String input) { private void addNotificationLog(NotificationEvent event, ListenerEvent listenerEvent, Connection dbConn, SQLGenerator sqlGenerator) throws MetaException, SQLException { LOG.debug("DbNotificationListener: adding notification log for : {}", event.getMessage()); +addNotificationLogBatch(Collections.singletonList(event), Collections.singletonList(listenerEvent), +dbConn, sqlGenerator); + } + + private void addNotificationLogBatch(List eventList, List listenerEventList, + Connection dbConn, SQLGenerator sqlGenerator) throws MetaException, SQLException { if ((dbConn == null) || (sqlGenerator == null)) { LOG.info("connection or sql generator is not set so executing sql via DN"); - process(event, listenerEvent); + for (int idx = 0; idx < eventList.size(); idx++) { +LOG.debug("DbNotificationListener: adding notification log for : {}", eventList.get(idx).getMessage()); +process(eventList.get(idx), listenerEventList.get(idx)); + } return; } -Statement stmt = null; -PreparedStatement pst = null; -ResultSet rs = null; -try { - stmt = dbConn.createStatement(); - event.setMessageFormat(msgEncoder.getMessageFormat()); +try (Statement stmt = dbConn.createStatement()) { if (sqlGenerator.getDbProduct().isMYSQL()) { stmt.execute("SET @@session.sql_mode=ANSI_QUOTES"); } +} - long nextEventId = getNextEventId(dbConn, sqlGenerator); - - long nextNLId = 
getNextNLId(dbConn, sqlGenerator, - "org.apache.hadoop.hive.metastore.model.MNotificationLog"); - - String insertVal; - String columns; - List params = new ArrayList(); - - // Construct the values string, parameters and column string step by step simultaneously so - // that the positions of columns and of their corresponding values do not go out of sync. - - // Notification log id - columns = "\"NL_ID\""; - insertVal = "" + nextNLId; - - // Event id - columns = columns + ", \"EVENT_ID\""; - insertVal = insertVal + "," + nextEventId; - - // Event time - columns = columns + ", \"EVENT_TIME\""; - insertVal = insertVal + "," + event.getEventTime(); +long nextEventId = getNextEventId(dbConn, sqlGenerator, eventList.size()); +long nextNLId = getNextNLId(dbConn, sqlGenerator, +"org.apache.hadoop.hive.metastore.model.MNotificationLog", eventList.size()); - // Event type - columns = columns + ", \"EVENT_TYPE\""; - insertVal = insertVal + ", ?"; - params.add(event.getEventType()); +String columns = "\"NL_ID\"" + ", \"EVENT_ID\"" + ", \"EVENT_TIME\"" + ", \"EVENT_TYPE\"" + ", \"MESSAGE\"" ++ ", \"MESSAGE_FORMAT\"" + ", \"DB_NAME\"" + ", \"TBL_NAME\"" + ", \"CAT_NAME\""; +String insertVal = "insert into \"NOTIFICATION_LOG\" (" + columns + ") VALUES (" ++ "?,?,?,?,?,?,?,?,?" 
++ ")"; - // Message - columns = columns + ", \"MESSAGE\""; - insertVal = insertVal + ", ?"; - params.add(event.getMessage()); +try (PreparedStatement pst = dbConn.prepareStatement(insertVal)) { + int numRows = 0; - // Message format - columns = columns + ", \"MESSAGE_FORMAT\""; - insertVal = insertVal + ", ?"; - params.add(event.getMessageFormat()); + for (int idx = 0; idx < eventList.size(); idx++) { +NotificationEvent event = eventList.get(idx); +ListenerEvent listenerEvent = listenerEventList.get(idx); - // Database name, optional - String dbName = event.getDbName(); - if (dbName != null) { -assert dbName.equals(dbName.toLowerCase()); -columns = columns + ", \"DB_NAME\""; -insertVal = insertVal + ", ?"; -params.add(dbName); - } +LOG.debug("DbNotificationListener: adding notification log for : {}", event.getMessage()); +event.setMessageFormat(msgEncoder.getMessageFormat()); - // Table name, optional - String tableName = event.getTableName(); - if (tableName != null) { -assert
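The patch quoted above replaces per-event INSERTs into NOTIFICATION_LOG with a single prepared statement executed as a JDBC batch. A simplified, self-contained sketch of that pattern (not Hive's actual code: the row representation and helper names are hypothetical, while the column list matches the diff):

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

public class NotificationBatchSketch {
    // Column list as it appears in the patch; the batched version builds the
    // INSERT statement once instead of re-deriving columns per event.
    static final String COLUMNS = "\"NL_ID\", \"EVENT_ID\", \"EVENT_TIME\", "
            + "\"EVENT_TYPE\", \"MESSAGE\", \"MESSAGE_FORMAT\", "
            + "\"DB_NAME\", \"TBL_NAME\", \"CAT_NAME\"";

    static String buildInsertSql() {
        return "insert into \"NOTIFICATION_LOG\" (" + COLUMNS
                + ") VALUES (?,?,?,?,?,?,?,?,?)";
    }

    // One prepared statement, one addBatch() per event, one executeBatch():
    // the driver can send the whole batch to the RDBMS instead of issuing
    // one round trip per notification event.
    static void addNotificationLogBatch(Connection dbConn, List<Object[]> rows)
            throws SQLException {
        try (PreparedStatement pst = dbConn.prepareStatement(buildInsertSql())) {
            for (Object[] row : rows) {
                for (int i = 0; i < row.length; i++) {
                    pst.setObject(i + 1, row[i]);
                }
                pst.addBatch();
            }
            pst.executeBatch();
        }
    }

    public static void main(String[] args) {
        System.out.println(buildInsertSql());
    }
}
```

The serial version the JIRA describes executes one INSERT per event; the batched form amortizes statement preparation and network round trips across the whole event list.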
[jira] [Commented] (HIVE-25104) Backward incompatible timestamp serialization in Parquet for certain timezones
[ https://issues.apache.org/jira/browse/HIVE-25104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17363430#comment-17363430 ] Nikhil Gupta commented on HIVE-25104: - [~jcamachorodriguez] [~zabetak] I am seeing a lot of timestamp issues and backward compatibility issues (Parquet, Avro, ORC) being pushed. Can we track them under a single Umbrella Jira? > Backward incompatible timestamp serialization in Parquet for certain timezones > -- > > Key: HIVE-25104 > URL: https://issues.apache.org/jira/browse/HIVE-25104 > Project: Hive > Issue Type: Bug > Components: Serializers/Deserializers >Affects Versions: 3.1.0 >Reporter: Stamatis Zampetakis >Assignee: Stamatis Zampetakis >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 2h > Remaining Estimate: 0h > > HIVE-12192, HIVE-20007 changed the way that timestamp computations are > performed and to some extent how timestamps are serialized and deserialized > in files (Parquet, Avro). > In versions that include HIVE-12192 or HIVE-20007 the serialization in > Parquet files is not backwards compatible. In other words, writing timestamps > with a version of Hive that includes HIVE-12192/HIVE-20007 and reading them > with another (not including the previous issues) may lead to different > results depending on the default timezone of the system. > Consider the following scenario where the default system timezone is set to > US/Pacific. 
> At apache/master commit 37f13b02dff94e310d77febd60f93d5a205254d3 > {code:sql} > CREATE EXTERNAL TABLE employee(eid INT,birth timestamp) STORED AS PARQUET > LOCATION '/tmp/hiveexttbl/employee'; > INSERT INTO employee VALUES (1, '1880-01-01 00:00:00'); > INSERT INTO employee VALUES (2, '1884-01-01 00:00:00'); > INSERT INTO employee VALUES (3, '1990-01-01 00:00:00'); > SELECT * FROM employee; > {code} > |1|1880-01-01 00:00:00| > |2|1884-01-01 00:00:00| > |3|1990-01-01 00:00:00| > At apache/branch-2.3 commit 324f9faf12d4b91a9359391810cb3312c004d356 > {code:sql} > CREATE EXTERNAL TABLE employee(eid INT,birth timestamp) STORED AS PARQUET > LOCATION '/tmp/hiveexttbl/employee'; > SELECT * FROM employee; > {code} > |1|1879-12-31 23:52:58| > |2|1884-01-01 00:00:00| > |3|1990-01-01 00:00:00| > The timestamp for {{eid=1}} in branch-2.3 is different from the one in master. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25204) Reduce overhead of adding notification log for update partition column statistics
[ https://issues.apache.org/jira/browse/HIVE-25204?focusedWorklogId=611165=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611165 ] ASF GitHub Bot logged work on HIVE-25204: - Author: ASF GitHub Bot Created on: 15/Jun/21 06:56 Start Date: 15/Jun/21 06:56 Worklog Time Spent: 10m Work Description: aasha commented on a change in pull request #2365: URL: https://github.com/apache/hive/pull/2365#discussion_r651500422 ## File path: hcatalog/server-extensions/src/main/java/org/apache/hive/hcatalog/listener/DbNotificationListener.java ## @@ -905,25 +909,32 @@ public void onUpdatePartitionColumnStat(UpdatePartitionColumnStatEvent updatePar } @Override - public void onUpdatePartitionColumnStatDirectSql(UpdatePartitionColumnStatEvent updatePartColStatEvent, - Connection dbConn, SQLGenerator sqlGenerator) + public void onUpdatePartitionColumnStatInBatch(UpdatePartitionColumnStatEventBatch updatePartColStatEventBatch, Review comment: Can this be used for updatePartitionColStatsForOneBatch as well in HMSHandler -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 611165) Time Spent: 0.5h (was: 20m) > Reduce overhead of adding notification log for update partition column > statistics > - > > Key: HIVE-25204 > URL: https://issues.apache.org/jira/browse/HIVE-25204 > Project: Hive > Issue Type: Sub-task > Components: Hive, HiveServer2 >Reporter: mahesh kumar behera >Assignee: mahesh kumar behera >Priority: Major > Labels: perfomance, pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > The notification logs for partition column statistics can be optimised by > adding them in batch. In the current implementation its done one by one > causing multiple sql execution in the backend RDBMS. 
These SQL executions can > be batched to reduce the execution time. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-24991) Enable fetching deleted rows in vectorized mode
[ https://issues.apache.org/jira/browse/HIVE-24991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Kasa resolved HIVE-24991. --- Resolution: Fixed Pushed to master. Thanks [~pgaref] for review. > Enable fetching deleted rows in vectorized mode > --- > > Key: HIVE-24991 > URL: https://issues.apache.org/jira/browse/HIVE-24991 > Project: Hive > Issue Type: Improvement > Components: Vectorization >Reporter: Krisztian Kasa >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 4h 50m > Remaining Estimate: 0h > > HIVE-24855 enables loading deleted rows from ORC tables when table property > *acid.fetch.deleted.rows* is true. > The goal of this jira is to enable this feature in vectorized orc batch > reader. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24991) Enable fetching deleted rows in vectorized mode
[ https://issues.apache.org/jira/browse/HIVE-24991?focusedWorklogId=611159=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611159 ] ASF GitHub Bot logged work on HIVE-24991: - Author: ASF GitHub Bot Created on: 15/Jun/21 06:32 Start Date: 15/Jun/21 06:32 Worklog Time Spent: 10m Work Description: kasakrisz merged pull request #2264: URL: https://github.com/apache/hive/pull/2264 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 611159) Time Spent: 4h 50m (was: 4h 40m) > Enable fetching deleted rows in vectorized mode > --- > > Key: HIVE-24991 > URL: https://issues.apache.org/jira/browse/HIVE-24991 > Project: Hive > Issue Type: Improvement > Components: Vectorization >Reporter: Krisztian Kasa >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 4h 50m > Remaining Estimate: 0h > > HIVE-24855 enables loading deleted rows from ORC tables when table property > *acid.fetch.deleted.rows* is true. > The goal of this jira is to enable this feature in vectorized orc batch > reader. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-25245) Hive: merge r20 to cdpd-master
[ https://issues.apache.org/jira/browse/HIVE-25245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] László Bodor reassigned HIVE-25245: --- Assignee: (was: László Bodor) > Hive: merge r20 to cdpd-master > -- > > Key: HIVE-25245 > URL: https://issues.apache.org/jira/browse/HIVE-25245 > Project: Hive > Issue Type: Improvement > Components: Hive >Reporter: László Bodor >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25245) Hive: merge r20 to cdpd-master
[ https://issues.apache.org/jira/browse/HIVE-25245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] László Bodor updated HIVE-25245: Component/s: Hive > Hive: merge r20 to cdpd-master > -- > > Key: HIVE-25245 > URL: https://issues.apache.org/jira/browse/HIVE-25245 > Project: Hive > Issue Type: Improvement > Components: Hive >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-25245) Hive: merge r20 to cdpd-master
[ https://issues.apache.org/jira/browse/HIVE-25245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] László Bodor resolved HIVE-25245. - Resolution: Invalid > Hive: merge r20 to cdpd-master > -- > > Key: HIVE-25245 > URL: https://issues.apache.org/jira/browse/HIVE-25245 > Project: Hive > Issue Type: Improvement > Components: Hive >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-25245) Hive: merge r20 to cdpd-master
[ https://issues.apache.org/jira/browse/HIVE-25245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17363395#comment-17363395 ] László Bodor commented on HIVE-25245: - sorry, wrong jira :D > Hive: merge r20 to cdpd-master > -- > > Key: HIVE-25245 > URL: https://issues.apache.org/jira/browse/HIVE-25245 > Project: Hive > Issue Type: Improvement > Components: Hive >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-25245) Hive: merge r20 to cdpd-master
[ https://issues.apache.org/jira/browse/HIVE-25245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] László Bodor reassigned HIVE-25245: --- > Hive: merge r20 to cdpd-master > -- > > Key: HIVE-25245 > URL: https://issues.apache.org/jira/browse/HIVE-25245 > Project: Hive > Issue Type: Improvement >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25204) Reduce overhead of adding notification log for update partition column statistics
[ https://issues.apache.org/jira/browse/HIVE-25204?focusedWorklogId=611154&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611154 ] ASF GitHub Bot logged work on HIVE-25204: - Author: ASF GitHub Bot Created on: 15/Jun/21 06:04 Start Date: 15/Jun/21 06:04 Worklog Time Spent: 10m Work Description: aasha commented on a change in pull request #2365: URL: https://github.com/apache/hive/pull/2365#discussion_r651473262 ## File path: hcatalog/server-extensions/src/main/java/org/apache/hive/hcatalog/listener/DbNotificationListener.java ## @@ -1194,108 +1205,82 @@ static String quoteString(String input) { private void addNotificationLog(NotificationEvent event, ListenerEvent listenerEvent, Connection dbConn, SQLGenerator sqlGenerator) throws MetaException, SQLException { LOG.debug("DbNotificationListener: adding notification log for : {}", event.getMessage()); +addNotificationLogBatch(Collections.singletonList(event), Collections.singletonList(listenerEvent), +dbConn, sqlGenerator); + } + + private void addNotificationLogBatch(List<NotificationEvent> eventList, List<ListenerEvent> listenerEventList, + Connection dbConn, SQLGenerator sqlGenerator) throws MetaException, SQLException { if ((dbConn == null) || (sqlGenerator == null)) { LOG.info("connection or sql generator is not set so executing sql via DN"); - process(event, listenerEvent); + for (int idx = 0; idx < eventList.size(); idx++) { +LOG.debug("DbNotificationListener: adding notification log for : {}", eventList.get(idx).getMessage()); +process(eventList.get(idx), listenerEventList.get(idx)); + } return; } -Statement stmt = null; -PreparedStatement pst = null; -ResultSet rs = null; -try { - stmt = dbConn.createStatement(); - event.setMessageFormat(msgEncoder.getMessageFormat()); +try (Statement stmt = dbConn.createStatement()) { if (sqlGenerator.getDbProduct().isMYSQL()) { stmt.execute("SET @@session.sql_mode=ANSI_QUOTES"); } +} - long nextEventId = getNextEventId(dbConn, sqlGenerator); - - long nextNLId = getNextNLId(dbConn, 
sqlGenerator, - "org.apache.hadoop.hive.metastore.model.MNotificationLog"); - - String insertVal; - String columns; - List<String> params = new ArrayList<>(); - - // Construct the values string, parameters and column string step by step simultaneously so - // that the positions of columns and of their corresponding values do not go out of sync. - - // Notification log id - columns = "\"NL_ID\""; - insertVal = "" + nextNLId; - - // Event id - columns = columns + ", \"EVENT_ID\""; - insertVal = insertVal + "," + nextEventId; - - // Event time - columns = columns + ", \"EVENT_TIME\""; - insertVal = insertVal + "," + event.getEventTime(); +long nextEventId = getNextEventId(dbConn, sqlGenerator, eventList.size()); +long nextNLId = getNextNLId(dbConn, sqlGenerator, +"org.apache.hadoop.hive.metastore.model.MNotificationLog", eventList.size()); - // Event type - columns = columns + ", \"EVENT_TYPE\""; - insertVal = insertVal + ", ?"; - params.add(event.getEventType()); +String columns = "\"NL_ID\"" + ", \"EVENT_ID\"" + ", \"EVENT_TIME\"" + ", \"EVENT_TYPE\"" + ", \"MESSAGE\"" ++ ", \"MESSAGE_FORMAT\"" + ", \"DB_NAME\"" + ", \"TBL_NAME\"" + ", \"CAT_NAME\""; +String insertVal = "insert into \"NOTIFICATION_LOG\" (" + columns + ") VALUES (" ++ "?,?,?,?,?,?,?,?,?" 
++ ")"; - // Message - columns = columns + ", \"MESSAGE\""; - insertVal = insertVal + ", ?"; - params.add(event.getMessage()); +try (PreparedStatement pst = dbConn.prepareStatement(insertVal)) { + int numRows = 0; - // Message format - columns = columns + ", \"MESSAGE_FORMAT\""; - insertVal = insertVal + ", ?"; - params.add(event.getMessageFormat()); + for (int idx = 0; idx < eventList.size(); idx++) { +NotificationEvent event = eventList.get(idx); +ListenerEvent listenerEvent = listenerEventList.get(idx); - // Database name, optional - String dbName = event.getDbName(); - if (dbName != null) { -assert dbName.equals(dbName.toLowerCase()); -columns = columns + ", \"DB_NAME\""; -insertVal = insertVal + ", ?"; -params.add(dbName); - } +LOG.debug("DbNotificationListener: adding notification log for : {}", event.getMessage()); +event.setMessageFormat(msgEncoder.getMessageFormat()); - // Table name, optional - String tableName = event.getTableName(); - if (tableName != null) { -assert tableName.equals(tableName.toLowerCase()); -
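The diff above replaces per-event inserts with a single prepared statement whose parameters are bound once per event and flushed with a JDBC batch, after reserving a contiguous block of IDs up front. The sketch below illustrates that pattern in isolation; it is not Hive's actual code, the class and method names are hypothetical, and the NOTIFICATION_LOG row is reduced to two columns for brevity.

```java
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

public class NotificationBatchSketch {

    // Stand-in for the metastore sequence table: hand out the first ID of a
    // freshly reserved block of `count` IDs, advancing the sequence once
    // (one sequence update for the whole batch instead of one per event).
    private static long sequence = 1L;

    static long reserveIds(int count) {
        long first = sequence;
        sequence += count;
        return first;
    }

    // Build the parameter rows that would be bound to a single
    // "insert into NOTIFICATION_LOG (NL_ID, MESSAGE) VALUES (?,?)" statement.
    static List<Object[]> buildRows(List<String> messages) {
        long firstNlId = reserveIds(messages.size());
        List<Object[]> rows = new ArrayList<>();
        for (int i = 0; i < messages.size(); i++) {
            rows.add(new Object[] {firstNlId + i, messages.get(i)});
        }
        return rows;
    }

    // Bind every row into the one PreparedStatement and flush with a single
    // executeBatch() round trip instead of rows.size() individual executes.
    static int[] flush(PreparedStatement pst, List<Object[]> rows) throws SQLException {
        for (Object[] row : rows) {
            for (int col = 0; col < row.length; col++) {
                pst.setObject(col + 1, row[col]); // JDBC parameters are 1-based
            }
            pst.addBatch();
        }
        return pst.executeBatch();
    }
}
```

Pre-reserving the ID block matters because each row needs a distinct NL_ID/EVENT_ID; without it, the batch would still pay one sequence round trip per event.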