[jira] [Assigned] (HIVE-25251) Reduce overhead of adding partitions during batch loading of partitions.
[ https://issues.apache.org/jira/browse/HIVE-25251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mahesh kumar behera reassigned HIVE-25251: -- > Reduce overhead of adding partitions during batch loading of partitions. > > > Key: HIVE-25251 > URL: https://issues.apache.org/jira/browse/HIVE-25251 > Project: Hive > Issue Type: Sub-task > Reporter: mahesh kumar behera > Assignee: mahesh kumar behera > Priority: Major > > The add partitions call made to HMS executes the DataNucleus calls that add > the partitions to the backend DB serially. This can be further optimised by > batching those SQL statements. -- This message was sent by Atlassian Jira (v8.3.4#803005)
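The batching described in HIVE-25251 amounts to collapsing N single-row INSERTs into one multi-row statement. A minimal sketch of that idea follows; the table and column names are hypothetical placeholders, not Hive's actual backend schema:

```java
import java.util.Collections;
import java.util.List;

public class BatchedInsertSketch {
    // Build one multi-row INSERT covering `rows` partitions instead of issuing
    // `rows` separate single-row INSERTs (table/columns are hypothetical).
    static String batchedInsert(String table, List<String> cols, int rows) {
        String row = "(" + String.join(", ", Collections.nCopies(cols.size(), "?")) + ")";
        return "INSERT INTO " + table + " (" + String.join(", ", cols) + ") VALUES "
            + String.join(", ", Collections.nCopies(rows, row));
    }

    public static void main(String[] args) {
        System.out.println(batchedInsert("PARTITIONS", List.of("PART_ID", "TBL_ID"), 2));
    }
}
```

The resulting statement can then be bound and executed once per chunk of partitions, replacing one round trip per partition with one per chunk.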
[jira] [Work logged] (HIVE-25204) Reduce overhead of adding notification log for update partition column statistics
[ https://issues.apache.org/jira/browse/HIVE-25204?focusedWorklogId=611734&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611734 ] ASF GitHub Bot logged work on HIVE-25204: - Author: ASF GitHub Bot Created on: 16/Jun/21 04:22 Start Date: 16/Jun/21 04:22 Worklog Time Spent: 10m Work Description: maheshk114 merged pull request #2365: URL: https://github.com/apache/hive/pull/2365 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 611734) Time Spent: 1h 10m (was: 1h) > Reduce overhead of adding notification log for update partition column > statistics > - > > Key: HIVE-25204 > URL: https://issues.apache.org/jira/browse/HIVE-25204 > Project: Hive > Issue Type: Sub-task > Components: Hive, HiveServer2 > Reporter: mahesh kumar behera > Assignee: mahesh kumar behera > Priority: Major > Labels: perfomance, pull-request-available > Time Spent: 1h 10m > Remaining Estimate: 0h > > The notification logs for partition column statistics can be optimised by > adding them in a batch. In the current implementation it is done one by one, > causing multiple SQL executions in the backend RDBMS. These SQL executions can > be batched to reduce the execution time. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-25204) Reduce overhead of adding notification log for update partition column statistics
[ https://issues.apache.org/jira/browse/HIVE-25204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mahesh kumar behera resolved HIVE-25204. Resolution: Fixed > Reduce overhead of adding notification log for update partition column > statistics > - > > Key: HIVE-25204 > URL: https://issues.apache.org/jira/browse/HIVE-25204 > Project: Hive > Issue Type: Sub-task > Components: Hive, HiveServer2 > Reporter: mahesh kumar behera > Assignee: mahesh kumar behera > Priority: Major > Labels: perfomance, pull-request-available > Time Spent: 1h 10m > Remaining Estimate: 0h > > The notification logs for partition column statistics can be optimised by > adding them in a batch. In the current implementation it is done one by one, > causing multiple SQL executions in the backend RDBMS. These SQL executions can > be batched to reduce the execution time. -- This message was sent by Atlassian Jira (v8.3.4#803005)
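The first step of batching the notification-log writes described in HIVE-25204 is grouping pending events into fixed-size chunks so each chunk can be written with one statement. A minimal, self-contained sketch of the chunking helper (names are illustrative, not Hive's actual code):

```java
import java.util.ArrayList;
import java.util.List;

public class NotificationBatcher {
    // Split pending notification events into fixed-size chunks; each chunk can
    // then be persisted with a single batched statement instead of one
    // statement per event (sketch only; names are hypothetical).
    static <T> List<List<T>> chunk(List<T> events, int batchSize) {
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < events.size(); i += batchSize) {
            batches.add(events.subList(i, Math.min(i + batchSize, events.size())));
        }
        return batches;
    }
}
```

A fixed batch size keeps any single statement from growing unboundedly while still amortising the per-statement round-trip cost.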
[jira] [Work logged] (HIVE-23633) Metastore some JDO query objects do not close properly
[ https://issues.apache.org/jira/browse/HIVE-23633?focusedWorklogId=611687&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611687 ] ASF GitHub Bot logged work on HIVE-23633: - Author: ASF GitHub Bot Created on: 16/Jun/21 01:11 Start Date: 16/Jun/21 01:11 Worklog Time Spent: 10m Work Description: dengzhhu653 opened a new pull request #2344: URL: https://github.com/apache/hive/pull/2344 …properly ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 611687) Time Spent: 6h 10m (was: 6h) > Metastore some JDO query objects do not close properly > -- > > Key: HIVE-23633 > URL: https://issues.apache.org/jira/browse/HIVE-23633 > Project: Hive > Issue Type: Bug > Reporter: Zhihua Deng > Assignee: Zhihua Deng > Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Attachments: HIVE-23633.01.patch > > Time Spent: 6h 10m > Remaining Estimate: 0h > > After [HIVE-10895|https://issues.apache.org/jira/browse/HIVE-10895] was patched, > the metastore has still seen a memory leak on DB resources: many > StatementImpls are left unclosed. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23633) Metastore some JDO query objects do not close properly
[ https://issues.apache.org/jira/browse/HIVE-23633?focusedWorklogId=611686&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611686 ] ASF GitHub Bot logged work on HIVE-23633: - Author: ASF GitHub Bot Created on: 16/Jun/21 01:09 Start Date: 16/Jun/21 01:09 Worklog Time Spent: 10m Work Description: dengzhhu653 closed pull request #2344: URL: https://github.com/apache/hive/pull/2344 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 611686) Time Spent: 6h (was: 5h 50m) > Metastore some JDO query objects do not close properly > -- > > Key: HIVE-23633 > URL: https://issues.apache.org/jira/browse/HIVE-23633 > Project: Hive > Issue Type: Bug > Reporter: Zhihua Deng > Assignee: Zhihua Deng > Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Attachments: HIVE-23633.01.patch > > Time Spent: 6h > Remaining Estimate: 0h > > After [HIVE-10895|https://issues.apache.org/jira/browse/HIVE-10895] was patched, > the metastore has still seen a memory leak on DB resources: many > StatementImpls are left unclosed. -- This message was sent by Atlassian Jira (v8.3.4#803005)
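The leak class described in HIVE-23633 (JDO query objects never closed, so their underlying statements linger) is commonly fixed by wrapping the query in an AutoCloseable so try-with-resources guarantees closeAll() on every path, including exceptions. The sketch below uses a stand-in interface rather than javax.jdo.Query so it is self-contained; it is a pattern illustration, not Hive's actual fix:

```java
public class QueryCloseSketch {
    // Stand-in for the one javax.jdo.Query method this pattern needs.
    interface JdoLikeQuery {
        void closeAll();
    }

    // Wrapper guaranteeing closeAll() runs even when the caller's code throws:
    // try-with-resources invokes close() on both normal and exceptional exit.
    static class AutoClosingQuery implements AutoCloseable {
        final JdoLikeQuery query;

        AutoClosingQuery(JdoLikeQuery query) {
            this.query = query;
        }

        @Override
        public void close() {
            query.closeAll();
        }
    }
}
```

With this shape, every call site that used to leak on an early return or exception gets deterministic cleanup for free from the language.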
[jira] [Commented] (HIVE-25055) Improve the exception handling in HMSHandler
[ https://issues.apache.org/jira/browse/HIVE-25055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17364001#comment-17364001 ] Zhihua Deng commented on HIVE-25055: Thank you [~vihangk1] for reviewing the changes! > Improve the exception handling in HMSHandler > > > Key: HIVE-25055 > URL: https://issues.apache.org/jira/browse/HIVE-25055 > Project: Hive > Issue Type: Improvement > Components: Standalone Metastore > Reporter: Zhihua Deng > Assignee: Zhihua Deng > Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 3.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-25055) Improve the exception handling in HMSHandler
[ https://issues.apache.org/jira/browse/HIVE-25055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihua Deng resolved HIVE-25055. Fix Version/s: 4.0.0 Resolution: Resolved > Improve the exception handling in HMSHandler > > > Key: HIVE-25055 > URL: https://issues.apache.org/jira/browse/HIVE-25055 > Project: Hive > Issue Type: Improvement > Components: Standalone Metastore >Reporter: Zhihua Deng >Assignee: Zhihua Deng >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 3.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25055) Improve the exception handling in HMSHandler
[ https://issues.apache.org/jira/browse/HIVE-25055?focusedWorklogId=611672=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611672 ] ASF GitHub Bot logged work on HIVE-25055: - Author: ASF GitHub Bot Created on: 16/Jun/21 00:16 Start Date: 16/Jun/21 00:16 Worklog Time Spent: 10m Work Description: vihangk1 merged pull request #2218: URL: https://github.com/apache/hive/pull/2218 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 611672) Time Spent: 3.5h (was: 3h 20m) > Improve the exception handling in HMSHandler > > > Key: HIVE-25055 > URL: https://issues.apache.org/jira/browse/HIVE-25055 > Project: Hive > Issue Type: Improvement > Components: Standalone Metastore >Reporter: Zhihua Deng >Assignee: Zhihua Deng >Priority: Minor > Labels: pull-request-available > Time Spent: 3.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24970) Reject location and managed locations in DDL for REMOTE databases.
[ https://issues.apache.org/jira/browse/HIVE-24970?focusedWorklogId=611515&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611515 ] ASF GitHub Bot logged work on HIVE-24970: - Author: ASF GitHub Bot Created on: 15/Jun/21 18:27 Start Date: 15/Jun/21 18:27 Worklog Time Spent: 10m Work Description: nrg4878 commented on a change in pull request #2389: URL: https://github.com/apache/hive/pull/2389#discussion_r652049945 ## File path: parser/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g ## @@ -1156,7 +1156,7 @@ createDatabaseStatement dbManagedLocation? dbConnectorName? (KW_WITH KW_DBPROPERTIES dbprops=dbProperties)? --> {$remote != null}? ^(TOK_CREATEDATABASE $name ifNotExists? databaseComment? $dbprops? dbConnectorName?) +-> {$remote != null}? ^(TOK_CREATEDATABASE $name ifNotExists? dbLocation? dbManagedLocation? databaseComment? $dbprops? dbConnectorName?) Review comment: Let me ponder on the first part: "create database localdb using connector xyz". If this does not throw an exception, we should change this behavior to force users to be explicit and specify REMOTE. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 611515) Time Spent: 50m (was: 40m) > Reject location and managed locations in DDL for REMOTE databases. > -- > > Key: HIVE-24970 > URL: https://issues.apache.org/jira/browse/HIVE-24970 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 > Affects Versions: 4.0.0 > Reporter: Naveen Gangam > Assignee: Dantong Dong > Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > This was part of the review feedback from Yongzhi. Creating a follow-up jira > to track this discussion. > So, using a DB connector for the DB will not create managed tables? 
> > @nrg4878 nrg4878 1 hour ago Author Member > we don't support create/drop/alter in REMOTE databases at this point. the > concepts of managed vs external are not in the picture at this point. When we > do support it, it will be applicable to the hive connectors only (or other > hive-based connectors like AWS Glue) > > @nrg4878 nrg4878 2 minutes ago Author Member > will file a separate jira for this. Basically, instead of ignoring the > location and managedlocation that may be specified for a remote database, the > grammar needs to not accept any locations in the DDL at all. > The argument is fair: why accept something we do not honor or that is entirely > irrelevant for such databases? However, this requires some thought when we > have additional connectors for remote hive instances. It might have some > relevance in terms of security with Ranger etc. > So will create a new jira for follow-up discussion. -- This message was sent by Atlassian Jira (v8.3.4#803005)
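The behavior proposed in the HIVE-24970 thread (reject, rather than silently ignore, location clauses on REMOTE databases) boils down to a semantic check like the following sketch. The method name and error message are illustrative assumptions, not Hive's actual analyzer code:

```java
public class RemoteDbDdlCheck {
    // Sketch of the proposed rule: a REMOTE database must not carry LOCATION
    // or MANAGEDLOCATION clauses (illustrative only, not Hive's real code).
    static void validate(boolean isRemote, String locationUri, String managedLocationUri) {
        if (isRemote && (locationUri != null || managedLocationUri != null)) {
            throw new IllegalArgumentException(
                "LOCATION/MANAGEDLOCATION clauses are not allowed for REMOTE databases");
        }
    }
}
```

Failing fast at analysis time is what the reviewers ask for: users are forced to be explicit instead of having clauses they wrote silently dropped.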
[jira] [Commented] (HIVE-25168) Add mutable validWriteIdList
[ https://issues.apache.org/jira/browse/HIVE-25168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17363817#comment-17363817 ] Yu-Wen Lai commented on HIVE-25168: --- We've decided not to put this in Hive since there is no other use case in Hive now. > Add mutable validWriteIdList > > > Key: HIVE-25168 > URL: https://issues.apache.org/jira/browse/HIVE-25168 > Project: Hive > Issue Type: New Feature > Components: storage-api > Reporter: Yu-Wen Lai > Assignee: Yu-Wen Lai > Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > Although the current implementation of validWriteIdList is not strictly > immutable, it is, in effect, meant to provide a read-only snapshot view. This > change adds another class that provides functionality for manipulating > the writeIdList. We could use this to keep the writeIdList up to date in an > external cache layer for event-based metadata refreshing. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24425) Create table in REMOTE db should fail
[ https://issues.apache.org/jira/browse/HIVE-24425?focusedWorklogId=611492=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611492 ] ASF GitHub Bot logged work on HIVE-24425: - Author: ASF GitHub Bot Created on: 15/Jun/21 17:46 Start Date: 15/Jun/21 17:46 Worklog Time Spent: 10m Work Description: dantongdong commented on a change in pull request #2393: URL: https://github.com/apache/hive/pull/2393#discussion_r652018186 ## File path: ql/src/test/results/clientnegative/createTbl_remoteDB_fail.q.out ## @@ -0,0 +1,42 @@ +PREHOOK: query: CREATE CONNECTOR IF NOT EXISTS mysql_test +TYPE 'mysql' +URL 'jdbc:mysql://nightly1.apache.org:3306/hive1' +COMMENT 'test connector' +WITH DCPROPERTIES ( +"hive.sql.dbcp.username"="hive1", +"hive.sql.dbcp.password"="hive1") +PREHOOK: type: CREATEDATACONNECTOR +PREHOOK: Output: connector:mysql_test +POSTHOOK: query: CREATE CONNECTOR IF NOT EXISTS mysql_test +TYPE 'mysql' +URL 'jdbc:mysql://nightly1.apache.org:3306/hive1' +COMMENT 'test connector' +WITH DCPROPERTIES ( +"hive.sql.dbcp.username"="hive1", +"hive.sql.dbcp.password"="hive1") +POSTHOOK: type: CREATEDATACONNECTOR +POSTHOOK: Output: connector:mysql_test +PREHOOK: query: SHOW CONNECTORS +PREHOOK: type: SHOWDATACONNECTORS +POSTHOOK: query: SHOW CONNECTORS +POSTHOOK: type: SHOWDATACONNECTORS +mysql_test +PREHOOK: query: CREATE REMOTE database mysql_db using mysql_test with DBPROPERTIES("connector.remoteDbName"="hive1") +PREHOOK: type: CREATEDATABASE +PREHOOK: Output: database:mysql_db + A masked pattern was here +POSTHOOK: query: CREATE REMOTE database mysql_db using mysql_test with DBPROPERTIES("connector.remoteDbName"="hive1") +POSTHOOK: type: CREATEDATABASE +POSTHOOK: Output: database:mysql_db + A masked pattern was here +PREHOOK: query: USE mysql_db +PREHOOK: type: SWITCHDATABASE +PREHOOK: Input: database:mysql_db +POSTHOOK: query: USE mysql_db +POSTHOOK: type: SWITCHDATABASE +POSTHOOK: Input: database:mysql_db +PREHOOK: query: create table bees (id int, 
name string) +PREHOOK: type: CREATETABLE +PREHOOK: Output: database:mysql_db +PREHOOK: Output: mysql_db@bees +FAILED: Execution Error, return code 4 from org.apache.hadoop.hive.ql.ddl.DDLTask. MetaException(message:Could not instantiate a provider for database mysql_db) Review comment: Yes. It seems I forgot to recompile the code when generating the golden .out file, and I am failing my own test in Jenkins because of that. Will change -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 611492) Time Spent: 0.5h (was: 20m) > Create table in REMOTE db should fail > - > > Key: HIVE-24425 > URL: https://issues.apache.org/jira/browse/HIVE-24425 > Project: Hive > Issue Type: Sub-task > Reporter: Naveen Gangam > Assignee: Dantong Dong > Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > Currently it creates the table in that DB but show tables does not show > anything. Preventing the creation of the table will resolve this inconsistency > too. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24970) Reject location and managed locations in DDL for REMOTE databases.
[ https://issues.apache.org/jira/browse/HIVE-24970?focusedWorklogId=611490&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611490 ] ASF GitHub Bot logged work on HIVE-24970: - Author: ASF GitHub Bot Created on: 15/Jun/21 17:40 Start Date: 15/Jun/21 17:40 Worklog Time Spent: 10m Work Description: dantongdong commented on a change in pull request #2389: URL: https://github.com/apache/hive/pull/2389#discussion_r652014451 ## File path: parser/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g ## @@ -1156,7 +1156,7 @@ createDatabaseStatement dbManagedLocation? dbConnectorName? (KW_WITH KW_DBPROPERTIES dbprops=dbProperties)? --> {$remote != null}? ^(TOK_CREATEDATABASE $name ifNotExists? databaseComment? $dbprops? dbConnectorName?) +-> {$remote != null}? ^(TOK_CREATEDATABASE $name ifNotExists? dbLocation? dbManagedLocation? databaseComment? $dbprops? dbConnectorName?) Review comment: It doesn't throw an exception because this line is not in charge of the actual parsing and compiling; it is in charge of generating the ASTNode based on the parsed and compiled result. Not including dbLocation and dbManagedLocation in this line only makes it ignore them when generating the ASTNode. The actual parsing and compiling is done in the lines above it (lines 1131-1139). All the create database statements share the same parse and compile code, which means that every create database statement (including remote) will parse and compile dbLocation and dbManagedLocation. Likewise, something like "create database localdb using connector xyz" will not throw an error either. I can also fix this if it is not the desired behavior. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 611490) Time Spent: 40m (was: 0.5h) > Reject location and managed locations in DDL for REMOTE databases. > -- > > Key: HIVE-24970 > URL: https://issues.apache.org/jira/browse/HIVE-24970 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 > Affects Versions: 4.0.0 > Reporter: Naveen Gangam > Assignee: Dantong Dong > Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > This was part of the review feedback from Yongzhi. Creating a follow-up jira > to track this discussion. > So, using a DB connector for the DB will not create managed tables? > > @nrg4878 nrg4878 1 hour ago Author Member > we don't support create/drop/alter in REMOTE databases at this point. the > concepts of managed vs external are not in the picture at this point. When we > do support it, it will be applicable to the hive connectors only (or other > hive-based connectors like AWS Glue) > > @nrg4878 nrg4878 2 minutes ago Author Member > will file a separate jira for this. Basically, instead of ignoring the > location and managedlocation that may be specified for a remote database, the > grammar needs to not accept any locations in the DDL at all. > The argument is fair: why accept something we do not honor or that is entirely > irrelevant for such databases? However, this requires some thought when we > have additional connectors for remote hive instances. It might have some > relevance in terms of security with Ranger etc. > So will create a new jira for follow-up discussion. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-25173) Fix build failure of hive-pre-upgrade due to missing dependency on pentaho-aggdesigner-algorithm
[ https://issues.apache.org/jira/browse/HIVE-25173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17363800#comment-17363800 ] Julian Hyde commented on HIVE-25173: I made a release of this library under my groupid on maven central. I don’t recall the coordinates but you can find them in Calcite (calcite depends on the new version). If conjars.org is at the root of this problem, let me know. I know the owner of that repo. He took it offline to find out who, if anyone, was using it. > Fix build failure of hive-pre-upgrade due to missing dependency on > pentaho-aggdesigner-algorithm > > > Key: HIVE-25173 > URL: https://issues.apache.org/jira/browse/HIVE-25173 > Project: Hive > Issue Type: Improvement >Affects Versions: 3.1.2 >Reporter: Masatake Iwasaki >Assignee: Masatake Iwasaki >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > {noformat} > [ERROR] Failed to execute goal on project hive-pre-upgrade: Could not resolve > dependencies for project org.apache.hive:hive-pre-upgrade:jar:4.0.0-SNAPSHOT: > Failure to find org.pentaho:pentaho-aggdesigner-algorithm:jar:5.1.5-jhyde in > https://repo.maven.apache.org/maven2 was cached in the local repository, > resolution will not be reattempted until the update interval of central has > elapsed or updates are forced > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24970) Reject location and managed locations in DDL for REMOTE databases.
[ https://issues.apache.org/jira/browse/HIVE-24970?focusedWorklogId=611478=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611478 ] ASF GitHub Bot logged work on HIVE-24970: - Author: ASF GitHub Bot Created on: 15/Jun/21 17:23 Start Date: 15/Jun/21 17:23 Worklog Time Spent: 10m Work Description: dantongdong commented on a change in pull request #2389: URL: https://github.com/apache/hive/pull/2389#discussion_r651999419 ## File path: ql/src/java/org/apache/hadoop/hive/ql/ddl/database/create/CreateDatabaseAnalyzer.java ## @@ -78,10 +78,17 @@ public void analyzeInternal(ASTNode root) throws SemanticException { ASTNode nextNode = (ASTNode) root.getChild(i); connectorName = ((ASTNode)nextNode).getChild(0).getText(); outputs.add(toWriteEntity(connectorName)); -if (managedLocationUri != null) { - outputs.remove(toWriteEntity(managedLocationUri)); - managedLocationUri = null; + +// HIVE-2436: Reject location and managed locations in DDL for REMOTE databases. +if (locationUri != null || managedLocationUri != null ) { + if (locationUri == null) { +outputs.remove(toWriteEntity(locationUri)); Review comment: Will change. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 611478) Time Spent: 0.5h (was: 20m) > Reject location and managed locations in DDL for REMOTE databases. > -- > > Key: HIVE-24970 > URL: https://issues.apache.org/jira/browse/HIVE-24970 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Affects Versions: 4.0.0 >Reporter: Naveen Gangam >Assignee: Dantong Dong >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > This was part of the review feedback from Yongzhi. Creating a followup jira > to track this discussion. 
> So, using a DB connector for the DB will not create managed tables? > > @nrg4878 nrg4878 1 hour ago Author Member > we don't support create/drop/alter in REMOTE databases at this point. the > concepts of managed vs external are not in the picture at this point. When we > do support it, it will be applicable to the hive connectors only (or other > hive-based connectors like AWS Glue) > > @nrg4878 nrg4878 2 minutes ago Author Member > will file a separate jira for this. Basically, instead of ignoring the > location and managedlocation that may be specified for a remote database, the > grammar needs to not accept any locations in the DDL at all. > The argument is fair: why accept something we do not honor or that is entirely > irrelevant for such databases? However, this requires some thought when we > have additional connectors for remote hive instances. It might have some > relevance in terms of security with Ranger etc. > So will create a new jira for follow-up discussion. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24970) Reject location and managed locations in DDL for REMOTE databases.
[ https://issues.apache.org/jira/browse/HIVE-24970?focusedWorklogId=611477=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611477 ] ASF GitHub Bot logged work on HIVE-24970: - Author: ASF GitHub Bot Created on: 15/Jun/21 17:22 Start Date: 15/Jun/21 17:22 Worklog Time Spent: 10m Work Description: dantongdong commented on a change in pull request #2389: URL: https://github.com/apache/hive/pull/2389#discussion_r651998832 ## File path: ql/src/java/org/apache/hadoop/hive/ql/ddl/database/create/CreateDatabaseAnalyzer.java ## @@ -78,10 +78,17 @@ public void analyzeInternal(ASTNode root) throws SemanticException { ASTNode nextNode = (ASTNode) root.getChild(i); connectorName = ((ASTNode)nextNode).getChild(0).getText(); outputs.add(toWriteEntity(connectorName)); -if (managedLocationUri != null) { - outputs.remove(toWriteEntity(managedLocationUri)); - managedLocationUri = null; + +// HIVE-2436: Reject location and managed locations in DDL for REMOTE databases. Review comment: It was not the correct reference indeed. Will change -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 611477) Time Spent: 20m (was: 10m) > Reject location and managed locations in DDL for REMOTE databases. > -- > > Key: HIVE-24970 > URL: https://issues.apache.org/jira/browse/HIVE-24970 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Affects Versions: 4.0.0 >Reporter: Naveen Gangam >Assignee: Dantong Dong >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > This was part of the review feedback from Yongzhi. Creating a followup jira > to track this discussion. > So, using DB connector for DB, will not create managed tables? 
> > @nrg4878 nrg4878 1 hour ago Author Member > we don't support create/drop/alter in REMOTE databases at this point. the > concepts of managed vs external are not in the picture at this point. When we > do support it, it will be applicable to the hive connectors only (or other > hive-based connectors like AWS Glue) > > @nrg4878 nrg4878 2 minutes ago Author Member > will file a separate jira for this. Basically, instead of ignoring the > location and managedlocation that may be specified for a remote database, the > grammar needs to not accept any locations in the DDL at all. > The argument is fair: why accept something we do not honor or that is entirely > irrelevant for such databases? However, this requires some thought when we > have additional connectors for remote hive instances. It might have some > relevance in terms of security with Ranger etc. > So will create a new jira for follow-up discussion. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25213) Implement List getTables() for existing connectors.
[ https://issues.apache.org/jira/browse/HIVE-25213?focusedWorklogId=611475&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611475 ] ASF GitHub Bot logged work on HIVE-25213: - Author: ASF GitHub Bot Created on: 15/Jun/21 17:19 Start Date: 15/Jun/21 17:19 Worklog Time Spent: 10m Work Description: dantongdong commented on a change in pull request #2371: URL: https://github.com/apache/hive/pull/2371#discussion_r651997059 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/dataconnector/jdbc/AbstractJDBCConnectorProvider.java ## @@ -129,7 +129,30 @@ protected Connection getConnection() { * @throws MetaException To indicate any failures with executing this API * @param regex */ - @Override public abstract List<Table> getTables(String regex) throws MetaException; + @Override public List<Table> getTables(String regex) throws MetaException { +ResultSet rs = null; +try { + rs = fetchTablesViaDBMetaData(regex); + if (rs != null) { +List<Table> tables = new ArrayList<>(); +while(rs.next()) { + tables.add(getTable(rs.getString(3))); Review comment: If I am looking at the right [place](https://javadoc.scijava.org/Java6/java/sql/DatabaseMetaData.html), it does not specify this behavior either. But I think it makes more sense to just leave out the one with the column exception and continue returning the rest instead of stopping. Because we want the behavior to be as close as possible to getTables(), we can just filter out the ones that have corrupted columns and return as many as we can. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 611475) Time Spent: 50m (was: 40m) > Implement List getTables() for existing connectors. 
> -- > > Key: HIVE-25213 > URL: https://issues.apache.org/jira/browse/HIVE-25213 > Project: Hive > Issue Type: Sub-task >Reporter: Naveen Gangam >Assignee: Dantong Dong >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > In the initial implementation, connector providers do not implement the > getTables(string pattern) spi. We had deferred it for later. Only > getTableNames() and getTable() were implemented. -- This message was sent by Atlassian Jira (v8.3.4#803005)
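The "return as many tables as we can" behavior discussed in the review above can be sketched independently of JDBC: attempt to load each table, skip the entries whose load throws, and keep the rest. (In the quoted code, `rs.getString(3)` reads column 3 of the `DatabaseMetaData.getTables` result set, which is TABLE_NAME per the JDBC specification.) The helper below is an illustration of that policy, not Hive's actual connector code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

public class LenientTableLoader {
    // Load each table by name; skip names whose load throws (e.g. a corrupted
    // column definition) so one bad table does not hide all the others.
    static <T> List<T> loadAll(List<String> names, Function<String, T> loader) {
        List<T> out = new ArrayList<>();
        for (String name : names) {
            try {
                out.add(loader.apply(name));
            } catch (RuntimeException e) {
                // deliberately swallowed: continue returning the remaining tables
            }
        }
        return out;
    }
}
```

This mirrors the reviewer's suggestion: filter out the corrupted entries and continue, rather than failing the whole getTables() call.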
[jira] [Resolved] (HIVE-25238) Make SSL cipher suites configurable for Hive Web UI and HS2
[ https://issues.apache.org/jira/browse/HIVE-25238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongzhi Chen resolved HIVE-25238. - Fix Version/s: 4.0.0 Resolution: Fixed > Make SSL cipher suites configurable for Hive Web UI and HS2 > --- > > Key: HIVE-25238 > URL: https://issues.apache.org/jira/browse/HIVE-25238 > Project: Hive > Issue Type: Improvement > Components: HiveServer2, Web UI > Reporter: Yongzhi Chen > Assignee: Yongzhi Chen > Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 50m > Remaining Estimate: 0h > > When starting a Jetty HTTP server, one can explicitly exclude certain > (insecure) SSL cipher suites. This can be especially important when Hive > needs to be compliant with security regulations. Properties need to be added so that > the Hive Web UI and HiveServer2 support this. > For the Hive binary CLI server, we can set the SSL cipher suites to include. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25213) Implement List getTables() for existing connectors.
[ https://issues.apache.org/jira/browse/HIVE-25213?focusedWorklogId=611443&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611443 ] ASF GitHub Bot logged work on HIVE-25213: - Author: ASF GitHub Bot Created on: 15/Jun/21 16:34 Start Date: 15/Jun/21 16:34 Worklog Time Spent: 10m Work Description: dantongdong commented on a change in pull request #2371: URL: https://github.com/apache/hive/pull/2371#discussion_r651962841 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java ## @@ -3799,10 +3799,33 @@ public Table get_table_core(GetTableRequest getTableRequest) throws MetaExceptio @Override public GetTablesResult get_table_objects_by_name_req(GetTablesRequest req) throws TException { String catName = req.isSetCatName() ? req.getCatName() : getDefaultCatalog(conf); +if (isDatabaseRemote(req.getDbName())) { + return new GetTablesResult(getRemoteTableObjectsInternal(req.getDbName(), req.getTblNames(), req.getTablesPattern())); +} return new GetTablesResult(getTableObjectsInternal(catName, req.getDbName(), req.getTblNames(), req.getCapabilities(), req.getProjectionSpec(), req.getTablesPattern())); } + private String tableNames2regex(List<String> tableNames) { +return "/^(" + String.join("|", tableNames) + ")$/"; Review comment: Good catch. Will look into that. As a side note, it seems that at least MySQL does not like ".*" being passed in as the regex in a getTables() API call and throws an error. I had to manually change the regex to null; is that normal? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 611443) Time Spent: 40m (was: 0.5h) > Implement List getTables() for existing connectors. 
> -- > > Key: HIVE-25213 > URL: https://issues.apache.org/jira/browse/HIVE-25213 > Project: Hive > Issue Type: Sub-task >Reporter: Naveen Gangam >Assignee: Dantong Dong >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > In the initial implementation, connector providers do not implement the > getTables(string pattern) spi. We had deferred it for later. Only > getTableNames() and getTable() were implemented. -- This message was sent by Atlassian Jira (v8.3.4#803005)
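The review discussion above flags that table names joined directly into a regex alternation are not escaped, so a name containing regex metacharacters would change the pattern's meaning. A minimal sketch of the safer variant is below; `tableNamesToRegex` is a hypothetical helper for illustration, not the method in HMSHandler. Note also that Java's `String.matches` expects a bare pattern, without the surrounding `/.../` delimiters that the patched snippet builds.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class TableNameRegex {

    // Build an exact-match alternation such as ^(\Qorders\E|\Qorder.items\E)$.
    // Pattern.quote() wraps each name in \Q...\E so metacharacters like '.'
    // are matched literally instead of as "any character".
    static String tableNamesToRegex(List<String> tableNames) {
        return tableNames.stream()
                .map(Pattern::quote)
                .collect(Collectors.joining("|", "^(", ")$"));
    }

    public static void main(String[] args) {
        String regex = tableNamesToRegex(List.of("orders", "order.items"));
        System.out.println("orders".matches(regex));      // exact name matches
        System.out.println("orderXitems".matches(regex)); // '.' is literal, so no match
    }
}
```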
[jira] [Assigned] (HIVE-25244) Hive predicate pushdown with Parquet format for `date` as partitioned column name produces an empty result set
[ https://issues.apache.org/jira/browse/HIVE-25244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aniket Adnaik reassigned HIVE-25244:
Assignee: Aniket Adnaik

> Hive predicate pushdown with Parquet format for `date` as partitioned column
> name produces an empty result set
> -
>
> Key: HIVE-25244
> URL: https://issues.apache.org/jira/browse/HIVE-25244
> Project: Hive
> Issue Type: Bug
> Components: Hive, Parquet
> Affects Versions: 3.1.0, 3.1.1, 3.1.2
> Reporter: Aniket Adnaik
> Assignee: Aniket Adnaik
> Priority: Major
> Fix For: 3.1.0, 3.1.1, 3.1.2, 3.2.0
>
> Attachments: test_table3_data.tar.gz
>
> Hive predicate push down with Parquet format for a partitioned column whose
> name is the keyword `date` produces an empty result set.
> If any of the following configs is set to false, then the select query
> returns results: hive.optimize.ppd.storage, hive.optimize.ppd, hive.optimize.index.filter.
> Repro steps:
> 1) Create an external partitioned table in Hive:
> CREATE EXTERNAL TABLE `test_table3`(`id` string) PARTITIONED BY (`date` string) STORED AS parquet;
> 2) In spark-shell, create a data frame and write the data to a parquet file:
> import java.sql.Timestamp
> import org.apache.spark.sql.Row
> import org.apache.spark.sql.types._
> import spark.implicits._
> val someDF = Seq(("1", "05172021"), ("2", "05172021"), ("3", "06182021"), ("4", "07192021")).toDF("id", "date")
> someDF.write.mode("overwrite").parquet("<path>/hive/warehouse/external/test_table3/date=05172021")
> 3) In Hive, change the permissions and add the partition to the table:
> $> hdfs dfs -chmod -R 777 /hive/warehouse/external/test_table3
> Hive Beeline ->
> ALTER TABLE test_table3 ADD PARTITION(`date`='05172021') LOCATION '<path>/hive/warehouse/external/test_table3/date=05172021'
> 4) SELECT * FROM test_table3; <- produces all rows
> SELECT * FROM test_table3 WHERE `date`='05172021'; <--- produces no rows
> SET hive.optimize.ppd.storage=false; <--- turn off the ppd storage optimization
> SELECT * FROM test_table3 WHERE `date`='05172021'; <--- produces rows after setting the above config to false
> Attaching parquet data files for reference.
--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-23556) Support hive.metastore.limit.partition.request for get_partitions_ps
[ https://issues.apache.org/jira/browse/HIVE-23556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17363755#comment-17363755 ] Zoltan Haindrich commented on HIVE-23556: - [~touchida]: could you please open a PR on github with your patch? > Support hive.metastore.limit.partition.request for get_partitions_ps > > > Key: HIVE-23556 > URL: https://issues.apache.org/jira/browse/HIVE-23556 > Project: Hive > Issue Type: Improvement >Reporter: Toshihiko Uchida >Assignee: Toshihiko Uchida >Priority: Minor > Attachments: HIVE-23556.2.patch, HIVE-23556.3.patch, > HIVE-23556.4.patch, HIVE-23556.patch > > > HIVE-13884 added the configuration hive.metastore.limit.partition.request to > limit the number of partitions that can be requested. > Currently, it takes in effect for the following MetaStore APIs > * get_partitions, > * get_partitions_with_auth, > * get_partitions_by_filter, > * get_partitions_spec_by_filter, > * get_partitions_by_expr, > but not for > * get_partitions_ps, > * get_partitions_ps_with_auth. > This issue proposes to apply the configuration also to get_partitions_ps and > get_partitions_ps_with_auth. -- This message was sent by Atlassian Jira (v8.3.4#803005)
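The guard this issue wants to extend to get_partitions_ps and get_partitions_ps_with_auth is conceptually small: before answering, compare the number of partitions a request would return against the configured cap. A minimal sketch of that check follows; the class and method names are hypothetical, not the actual HMS code, though the -1 meaning "unlimited" matches the documented default of hive.metastore.limit.partition.request.

```java
public class PartitionLimitCheck {

    // Mirrors the idea behind hive.metastore.limit.partition.request:
    // reject a request whose partition count exceeds the configured cap.
    // A limit of -1 means "unlimited", matching the config's default.
    static void checkLimit(int requested, int limit) {
        if (limit > -1 && requested > limit) {
            throw new IllegalStateException(
                "Number of partitions scanned (" + requested
                + ") exceeds limit (" + limit + ")");
        }
    }

    public static void main(String[] args) {
        checkLimit(500, -1);   // unlimited: passes
        checkLimit(500, 1000); // under the cap: passes
        try {
            checkLimit(5000, 1000); // over the cap: rejected
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Applying the config to the two missing APIs then amounts to calling the same guard on their code paths before materialising results.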
[jira] [Work logged] (HIVE-25213) Implement List<Table> getTables() for existing connectors.
[ https://issues.apache.org/jira/browse/HIVE-25213?focusedWorklogId=611436=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611436 ] ASF GitHub Bot logged work on HIVE-25213: - Author: ASF GitHub Bot Created on: 15/Jun/21 16:25 Start Date: 15/Jun/21 16:25 Worklog Time Spent: 10m Work Description: dantongdong commented on a change in pull request #2371: URL: https://github.com/apache/hive/pull/2371#discussion_r651955572 ## File path: itests/qtest/target/db_for_connectortest.db/service.properties ## @@ -0,0 +1,23 @@ +#/private/tmp/db_for_connectortest.db Review comment: Yes, we have discussed this over the meeting but I forgot to add it. Will change it. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 611436) Time Spent: 0.5h (was: 20m) > Implement List getTables() for existing connectors. > -- > > Key: HIVE-25213 > URL: https://issues.apache.org/jira/browse/HIVE-25213 > Project: Hive > Issue Type: Sub-task >Reporter: Naveen Gangam >Assignee: Dantong Dong >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > In the initial implementation, connector providers do not implement the > getTables(string pattern) spi. We had deferred it for later. Only > getTableNames() and getTable() were implemented. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25238) Make SSL cipher suites configurable for Hive Web UI and HS2
[ https://issues.apache.org/jira/browse/HIVE-25238?focusedWorklogId=611404=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611404 ]

ASF GitHub Bot logged work on HIVE-25238:
-
Author: ASF GitHub Bot
Created on: 15/Jun/21 15:38
Start Date: 15/Jun/21 15:38
Worklog Time Spent: 10m

Work Description: yongzhi merged pull request #2385:
URL: https://github.com/apache/hive/pull/2385

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 611404)
Time Spent: 50m (was: 40m)

> Make SSL cipher suites configurable for Hive Web UI and HS2
> ---
>
> Key: HIVE-25238
> URL: https://issues.apache.org/jira/browse/HIVE-25238
> Project: Hive
> Issue Type: Improvement
> Components: HiveServer2, Web UI
> Reporter: Yongzhi Chen
> Assignee: Yongzhi Chen
> Priority: Major
> Labels: pull-request-available
> Time Spent: 50m
> Remaining Estimate: 0h
>
> When starting a Jetty HTTP server, one can explicitly exclude certain
> (insecure) SSL cipher suites. This can be especially important when Hive
> needs to be compliant with security regulations. Properties need to be added
> so that the Hive Web UI and HiveServer2 support this.
> For the Hive binary CLI server, we can set the SSL cipher suites to include.
--
This message was sent by Atlassian Jira (v8.3.4#803005)
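The exclusion mechanism the issue describes amounts to filtering the supported cipher suites against an exclude list before the server starts (Jetty exposes this shape of exclusion via its SSL context factory). A minimal, JDK-only sketch of the filtering step; the suite names and exclusion substrings below are illustrative, not Hive's actual configuration values:

```java
import java.util.Arrays;
import java.util.List;

public class CipherSuiteFilter {

    // Drop any cipher suite whose name contains one of the excluded
    // substrings (e.g. legacy RC4 or 3DES suites), leaving the rest intact.
    static String[] exclude(String[] supported, List<String> excludedSubstrings) {
        return Arrays.stream(supported)
                .filter(s -> excludedSubstrings.stream().noneMatch(s::contains))
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        String[] supported = {
            "TLS_AES_128_GCM_SHA256",
            "SSL_RSA_WITH_RC4_128_SHA",
            "SSL_RSA_WITH_3DES_EDE_CBC_SHA"
        };
        String[] allowed = exclude(supported, List.of("RC4", "3DES"));
        System.out.println(Arrays.toString(allowed)); // only the TLS_AES suite remains
    }
}
```

The configurable part is then just wiring a comma-separated property into the exclude list handed to the HTTP server.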
[jira] [Work logged] (HIVE-25204) Reduce overhead of adding notification log for update partition column statistics
[ https://issues.apache.org/jira/browse/HIVE-25204?focusedWorklogId=611385=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611385 ]

ASF GitHub Bot logged work on HIVE-25204:
-
Author: ASF GitHub Bot
Created on: 15/Jun/21 15:07
Start Date: 15/Jun/21 15:07
Worklog Time Spent: 10m

Work Description: maheshk114 commented on a change in pull request #2365:
URL: https://github.com/apache/hive/pull/2365#discussion_r651886691

## File path: hcatalog/server-extensions/src/main/java/org/apache/hive/hcatalog/listener/DbNotificationListener.java ##
@@ -905,25 +909,32 @@ public void onUpdatePartitionColumnStat(UpdatePartitionColumnStatEvent updatePar
   }

   @Override
-  public void onUpdatePartitionColumnStatDirectSql(UpdatePartitionColumnStatEvent updatePartColStatEvent,
-      Connection dbConn, SQLGenerator sqlGenerator)
+  public void onUpdatePartitionColumnStatInBatch(UpdatePartitionColumnStatEventBatch updatePartColStatEventBatch,

Review comment: Yes; for normal listeners we cannot change this, as they may be expecting the data in that way. The change is done only for transactional listeners (DbNotificationListener is a transactional listener). The notifications for transactional listeners are done inside the direct SQL method, as they have to be within the same transaction. For normal listeners we need not have them in the same transaction.

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 611385) Time Spent: 1h (was: 50m) > Reduce overhead of adding notification log for update partition column > statistics > - > > Key: HIVE-25204 > URL: https://issues.apache.org/jira/browse/HIVE-25204 > Project: Hive > Issue Type: Sub-task > Components: Hive, HiveServer2 >Reporter: mahesh kumar behera >Assignee: mahesh kumar behera >Priority: Major > Labels: perfomance, pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > The notification logs for partition column statistics can be optimised by > adding them in batch. In the current implementation its done one by one > causing multiple sql execution in the backend RDBMS. These SQL executions can > be batched to reduce the execution time. -- This message was sent by Atlassian Jira (v8.3.4#803005)
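The batching this issue describes, writing many notification-log rows per statement instead of one statement per event, starts with grouping the events. A minimal sketch of just the grouping step; `toBatches` is a hypothetical helper (the actual change lives in DbNotificationListener and the direct-SQL layer), and each resulting batch would then be written with a single multi-row INSERT:

```java
import java.util.ArrayList;
import java.util.List;

public class BatchPartitioner {

    // Split a list of events into fixed-size batches so each batch can be
    // flushed to the backend RDBMS in one statement instead of one
    // statement per event. The last batch may be smaller than batchSize.
    static <T> List<List<T>> toBatches(List<T> events, int batchSize) {
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < events.size(); i += batchSize) {
            batches.add(events.subList(i, Math.min(i + batchSize, events.size())));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> events = List.of(1, 2, 3, 4, 5, 6, 7);
        System.out.println(toBatches(events, 3)); // [[1, 2, 3], [4, 5, 6], [7]]
    }
}
```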
[jira] [Work logged] (HIVE-25233) Removing deprecated unix_timestamp UDF
[ https://issues.apache.org/jira/browse/HIVE-25233?focusedWorklogId=611380=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611380 ]

ASF GitHub Bot logged work on HIVE-25233:
-
Author: ASF GitHub Bot
Created on: 15/Jun/21 14:52
Start Date: 15/Jun/21 14:52
Worklog Time Spent: 10m

Work Description: kgyrtkirk commented on pull request #2380:
URL: https://github.com/apache/hive/pull/2380#issuecomment-861567894
you seem to have hit the "surefire bug" in your testruns; you should run:
```
mvn install -pl itests/qtest -Pqsplits -Dtest=org.apache.hadoop.hive.cli.split6.TestMiniLlapLocalCliDriver -Dtest.output.overwrite
```
or similar locally to run the whole split which was "failed-to-read"

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 611380)
Time Spent: 20m (was: 10m)

> Removing deprecated unix_timestamp UDF
> --
>
> Key: HIVE-25233
> URL: https://issues.apache.org/jira/browse/HIVE-25233
> Project: Hive
> Issue Type: Task
> Components: UDF
> Affects Versions: All Versions
> Reporter: Ashish Sharma
> Assignee: Ashish Sharma
> Priority: Trivial
> Labels: pull-request-available
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Description
> The unix_timestamp() UDF was deprecated as part of
> https://issues.apache.org/jira/browse/HIVE-10728. The internal
> GenericUDFUnixTimeStamp extends GenericUDFToUnixTimeStamp and calls
> to_utc_timestamp() for unix_timestamp(string date) and unix_timestamp(string
> date, string pattern).
> unix_timestamp() => CURRENT_TIMESTAMP
> unix_timestamp(string date) => to_unix_timestamp()
> unix_timestamp(string date, string pattern) => to_unix_timestamp()
> We should clean up unix_timestamp() and point it to to_unix_timestamp().
--
This message was sent by Atlassian Jira (v8.3.4#803005)
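For reference, the to_unix_timestamp target of the mapping above parses a date string and returns seconds since the Unix epoch. The same computation can be sketched with java.time, pinned to UTC here for determinism (Hive itself applies the session time zone, and this is an illustration, not the UDF's code):

```java
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

public class ToUnixTimestampSketch {

    // Parse a date string with an explicit pattern and convert midnight of
    // that date (UTC) to seconds since 1970-01-01T00:00:00Z, the value a
    // call like to_unix_timestamp('2021-06-15', 'yyyy-MM-dd') produces for
    // a UTC session.
    static long toUnixTimestamp(String date, String pattern) {
        DateTimeFormatter fmt = DateTimeFormatter.ofPattern(pattern);
        return LocalDate.parse(date, fmt).atStartOfDay(ZoneOffset.UTC).toEpochSecond();
    }

    public static void main(String[] args) {
        System.out.println(toUnixTimestamp("2021-06-15", "yyyy-MM-dd")); // 1623715200
    }
}
```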
[jira] [Commented] (HIVE-25249) Fix TestWorker
[ https://issues.apache.org/jira/browse/HIVE-25249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17363684#comment-17363684 ] Zoltan Haindrich commented on HIVE-25249: - fyi: [~pvargacl], [~dkuzmenko] > Fix TestWorker > -- > > Key: HIVE-25249 > URL: https://issues.apache.org/jira/browse/HIVE-25249 > Project: Hive > Issue Type: Bug >Reporter: Zoltan Haindrich >Priority: Major > > http://ci.hive.apache.org/job/hive-precommit/job/PR-2381/1/ > http://ci.hive.apache.org/job/hive-flaky-check/236/ -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25246) Fix the clean up of open repl created transactions
[ https://issues.apache.org/jira/browse/HIVE-25246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-25246: -- Labels: pull-request-available (was: ) > Fix the clean up of open repl created transactions > -- > > Key: HIVE-25246 > URL: https://issues.apache.org/jira/browse/HIVE-25246 > Project: Hive > Issue Type: Improvement >Reporter: Haymant Mangla >Assignee: Haymant Mangla >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25246) Fix the clean up of open repl created transactions
[ https://issues.apache.org/jira/browse/HIVE-25246?focusedWorklogId=611367=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611367 ] ASF GitHub Bot logged work on HIVE-25246: - Author: ASF GitHub Bot Created on: 15/Jun/21 14:27 Start Date: 15/Jun/21 14:27 Worklog Time Spent: 10m Work Description: hmangla98 opened a new pull request #2396: URL: https://github.com/apache/hive/pull/2396 Fix the clean up of open repl created transactions -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 611367) Remaining Estimate: 0h Time Spent: 10m > Fix the clean up of open repl created transactions > -- > > Key: HIVE-25246 > URL: https://issues.apache.org/jira/browse/HIVE-25246 > Project: Hive > Issue Type: Improvement >Reporter: Haymant Mangla >Assignee: Haymant Mangla >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23633) Metastore some JDO query objects do not close properly
[ https://issues.apache.org/jira/browse/HIVE-23633?focusedWorklogId=611363=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611363 ]

ASF GitHub Bot logged work on HIVE-23633:
-
Author: ASF GitHub Bot
Created on: 15/Jun/21 14:16
Start Date: 15/Jun/21 14:16
Worklog Time Spent: 10m

Work Description: dengzhhu653 opened a new pull request #2344:
URL: https://github.com/apache/hive/pull/2344
…properly
### What changes were proposed in this pull request?
### Why are the changes needed?
### Does this PR introduce _any_ user-facing change?
### How was this patch tested?

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 611363)
Time Spent: 5h 50m (was: 5h 40m)

> Metastore some JDO query objects do not close properly
> --
>
> Key: HIVE-23633
> URL: https://issues.apache.org/jira/browse/HIVE-23633
> Project: Hive
> Issue Type: Bug
> Reporter: Zhihua Deng
> Assignee: Zhihua Deng
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-23633.01.patch
>
> Time Spent: 5h 50m
> Remaining Estimate: 0h
>
> After [HIVE-10895|https://issues.apache.org/jira/browse/HIVE-10895] was patched,
> the metastore has still seen a memory leak on DB resources: many
> StatementImpls are left unclosed.
--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-25248) Fix TestLlapTaskSchedulerService#testForcedLocalityMultiplePreemptionsSameHost1
[ https://issues.apache.org/jira/browse/HIVE-25248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis reassigned HIVE-25248: - Assignee: Panagiotis Garefalakis > Fix > TestLlapTaskSchedulerService#testForcedLocalityMultiplePreemptionsSameHost1 > --- > > Key: HIVE-25248 > URL: https://issues.apache.org/jira/browse/HIVE-25248 > Project: Hive > Issue Type: Bug >Reporter: Zoltan Haindrich >Assignee: Panagiotis Garefalakis >Priority: Major > > This test is failing randomly recently > http://ci.hive.apache.org/job/hive-flaky-check/233/testReport/org.apache.hadoop.hive.llap.tezplugins/TestLlapTaskSchedulerService/testForcedLocalityMultiplePreemptionsSameHost1/ -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23633) Metastore some JDO query objects do not close properly
[ https://issues.apache.org/jira/browse/HIVE-23633?focusedWorklogId=611362=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611362 ]

ASF GitHub Bot logged work on HIVE-23633:
-
Author: ASF GitHub Bot
Created on: 15/Jun/21 14:14
Start Date: 15/Jun/21 14:14
Worklog Time Spent: 10m

Work Description: dengzhhu653 closed pull request #2344:
URL: https://github.com/apache/hive/pull/2344

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 611362)
Time Spent: 5h 40m (was: 5.5h)

> Metastore some JDO query objects do not close properly
> --
>
> Key: HIVE-23633
> URL: https://issues.apache.org/jira/browse/HIVE-23633
> Project: Hive
> Issue Type: Bug
> Reporter: Zhihua Deng
> Assignee: Zhihua Deng
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-23633.01.patch
>
> Time Spent: 5h 40m
> Remaining Estimate: 0h
>
> After [HIVE-10895|https://issues.apache.org/jira/browse/HIVE-10895] was patched,
> the metastore has still seen a memory leak on DB resources: many
> StatementImpls are left unclosed.
--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25055) Improve the exception handling in HMSHandler
[ https://issues.apache.org/jira/browse/HIVE-25055?focusedWorklogId=611360=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611360 ] ASF GitHub Bot logged work on HIVE-25055: - Author: ASF GitHub Bot Created on: 15/Jun/21 14:12 Start Date: 15/Jun/21 14:12 Worklog Time Spent: 10m Work Description: dengzhhu653 commented on pull request #2218: URL: https://github.com/apache/hive/pull/2218#issuecomment-861534654 Hi, @vihangk1, would you mind taking another look if have secs? thanks! :) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 611360) Time Spent: 3h 20m (was: 3h 10m) > Improve the exception handling in HMSHandler > > > Key: HIVE-25055 > URL: https://issues.apache.org/jira/browse/HIVE-25055 > Project: Hive > Issue Type: Improvement > Components: Standalone Metastore >Reporter: Zhihua Deng >Assignee: Zhihua Deng >Priority: Minor > Labels: pull-request-available > Time Spent: 3h 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24802) Show operation log at webui
[ https://issues.apache.org/jira/browse/HIVE-24802?focusedWorklogId=611356=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611356 ]

ASF GitHub Bot logged work on HIVE-24802:
-
Author: ASF GitHub Bot
Created on: 15/Jun/21 14:10
Start Date: 15/Jun/21 14:10
Worklog Time Spent: 10m

Work Description: dengzhhu653 commented on pull request #1998:
URL: https://github.com/apache/hive/pull/1998#issuecomment-861532574
@pvary sorry for pinging, could the PR be moved a little further? thank you! :)

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 611356)
Time Spent: 6h (was: 5h 50m)

> Show operation log at webui
> ---
>
> Key: HIVE-24802
> URL: https://issues.apache.org/jira/browse/HIVE-24802
> Project: Hive
> Issue Type: Improvement
> Components: HiveServer2
> Reporter: Zhihua Deng
> Assignee: Zhihua Deng
> Priority: Minor
> Labels: pull-request-available
> Attachments: operationlog.png
>
> Time Spent: 6h
> Remaining Estimate: 0h
>
> Currently we provide getQueryLog in HiveStatement to fetch the operation log,
> and the operation log is deleted on operation closing (delayed for canceled
> operations). Sometimes it is not easy for the user (JDBC) or administrators
> to dig into the details of a finished (failed) operation, so we present the
> operation log on the web UI and keep it for some time for later analysis.
--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25248) Fix TestLlapTaskSchedulerService#testForcedLocalityMultiplePreemptionsSameHost1
[ https://issues.apache.org/jira/browse/HIVE-25248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich updated HIVE-25248: Summary: Fix TestLlapTaskSchedulerService#testForcedLocalityMultiplePreemptionsSameHost1 (was: Fix .TestLlapTaskSchedulerService#testForcedLocalityMultiplePreemptionsSameHost1) > Fix > TestLlapTaskSchedulerService#testForcedLocalityMultiplePreemptionsSameHost1 > --- > > Key: HIVE-25248 > URL: https://issues.apache.org/jira/browse/HIVE-25248 > Project: Hive > Issue Type: Bug >Reporter: Zoltan Haindrich >Priority: Major > > This test is failing randomly recently > http://ci.hive.apache.org/job/hive-flaky-check/233/testReport/org.apache.hadoop.hive.llap.tezplugins/TestLlapTaskSchedulerService/testForcedLocalityMultiplePreemptionsSameHost1/ -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-25248) Fix .TestLlapTaskSchedulerService#testForcedLocalityMultiplePreemptionsSameHost1
[ https://issues.apache.org/jira/browse/HIVE-25248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17363658#comment-17363658 ]

Zoltan Haindrich commented on HIVE-25248:
-
another testcase from this class might also be a candidate
http://ci.hive.apache.org/job/hive-precommit/job/PR-2381/4/testReport/junit/org.apache.hadoop.hive.llap.tezplugins/TestLlapTaskSchedulerService/Testing___split_17___PostProcess___testPreemption/

> Fix
> .TestLlapTaskSchedulerService#testForcedLocalityMultiplePreemptionsSameHost1
> 
>
> Key: HIVE-25248
> URL: https://issues.apache.org/jira/browse/HIVE-25248
> Project: Hive
> Issue Type: Bug
> Reporter: Zoltan Haindrich
> Priority: Major
>
> This test is failing randomly recently
> http://ci.hive.apache.org/job/hive-flaky-check/233/testReport/org.apache.hadoop.hive.llap.tezplugins/TestLlapTaskSchedulerService/testForcedLocalityMultiplePreemptionsSameHost1/
--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-25246) Fix the clean up of open repl created transactions
[ https://issues.apache.org/jira/browse/HIVE-25246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haymant Mangla reassigned HIVE-25246: - > Fix the clean up of open repl created transactions > -- > > Key: HIVE-25246 > URL: https://issues.apache.org/jira/browse/HIVE-25246 > Project: Hive > Issue Type: Improvement >Reporter: Haymant Mangla >Assignee: Haymant Mangla >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25234) Implement ALTER TABLE ... SET PARTITION SPEC to change partitioning on Iceberg tables
[ https://issues.apache.org/jira/browse/HIVE-25234?focusedWorklogId=611283=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611283 ]

ASF GitHub Bot logged work on HIVE-25234:
-
Author: ASF GitHub Bot
Created on: 15/Jun/21 12:18
Start Date: 15/Jun/21 12:18
Worklog Time Spent: 10m

Work Description: lcspinter commented on a change in pull request #2382:
URL: https://github.com/apache/hive/pull/2382#discussion_r651732394

## File path: iceberg/iceberg-handler/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerNoScan.java ##
@@ -155,6 +155,58 @@ public void after() throws Exception {
     HiveIcebergStorageHandlerTestUtils.close(shell);
   }

+  @Test
+  public void testSetPartitionTransform() {
+    Schema schema = new Schema(
+        optional(1, "id", Types.LongType.get()),
+        optional(2, "year_field", Types.DateType.get()),
+        optional(3, "month_field", Types.TimestampType.withZone()),
+        optional(4, "day_field", Types.TimestampType.withoutZone()),
+        optional(5, "hour_field", Types.TimestampType.withoutZone()),
+        optional(6, "truncate_field", Types.StringType.get()),
+        optional(7, "bucket_field", Types.StringType.get()),
+        optional(8, "identity_field", Types.StringType.get())
+    );
+
+    TableIdentifier identifier = TableIdentifier.of("default", "part_test");
+    shell.executeStatement("CREATE EXTERNAL TABLE " + identifier +
+        " PARTITIONED BY SPEC (year(year_field), hour(hour_field), " +
+        "truncate(2, truncate_field), bucket(2, bucket_field), identity_field)" +
+        " STORED BY ICEBERG " +
+        testTables.locationForCreateTableSQL(identifier) +
+        "TBLPROPERTIES ('" + InputFormatConfig.TABLE_SCHEMA + "'='" +
+        SchemaParser.toJson(schema) + "', " +
+        "'" + InputFormatConfig.CATALOG_NAME + "'='" + Catalogs.ICEBERG_DEFAULT_CATALOG_NAME + "')");
+
+    PartitionSpec spec = PartitionSpec.builderFor(schema)
+        .year("year_field")
+        .hour("hour_field")
+        .truncate("truncate_field", 2)
+        .bucket("bucket_field", 2)
+        .identity("identity_field")
+        .build();
+
+    Table table = testTables.loadTable(identifier);
+    Assert.assertEquals(spec, table.spec());
+
+    shell.executeStatement("ALTER TABLE default.part_test SET PARTITION SPEC(year(year_field), month(month_field), " +
+        "day(day_field))");
+
+    spec = PartitionSpec.builderFor(schema)
+        .withSpecId(1)
+        .year("year_field")
+        .alwaysNull("hour_field", "hour_field_hour")
+        .alwaysNull("truncate_field", "truncate_field_trunc")
+        .alwaysNull("bucket_field", "bucket_field_bucket")
+        .alwaysNull("identity_field", "identity_field")
+        .month("month_field")
+        .day("day_field")
+        .build();
+
+    table.refresh();
+    Assert.assertEquals(spec, table.spec());
+  }
+

Review comment: I added an additional test case to cover the partition evolution. Thanks for the idea!

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 611283)
Time Spent: 1h 10m (was: 1h)

> Implement ALTER TABLE ... SET PARTITION SPEC to change partitioning on
> Iceberg tables
> -
>
> Key: HIVE-25234
> URL: https://issues.apache.org/jira/browse/HIVE-25234
> Project: Hive
> Issue Type: Improvement
> Reporter: László Pintér
> Assignee: László Pintér
> Priority: Major
> Labels: pull-request-available
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> Provide a way to change the schema and the Iceberg partitioning specification
> using Hive syntax.
> {code:sql}
> ALTER TABLE tbl SET PARTITION SPEC(...)
> {code}
--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25204) Reduce overhead of adding notification log for update partition column statistics
[ https://issues.apache.org/jira/browse/HIVE-25204?focusedWorklogId=611282=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611282 ]

ASF GitHub Bot logged work on HIVE-25204:
-
Author: ASF GitHub Bot
Created on: 15/Jun/21 12:17
Start Date: 15/Jun/21 12:17
Worklog Time Spent: 10m

Work Description: aasha commented on a change in pull request #2365:
URL: https://github.com/apache/hive/pull/2365#discussion_r651731735

## File path: hcatalog/server-extensions/src/main/java/org/apache/hive/hcatalog/listener/DbNotificationListener.java ##
@@ -905,25 +909,32 @@ public void onUpdatePartitionColumnStat(UpdatePartitionColumnStatEvent updatePar
   }

   @Override
-  public void onUpdatePartitionColumnStatDirectSql(UpdatePartitionColumnStatEvent updatePartColStatEvent,
-      Connection dbConn, SQLGenerator sqlGenerator)
+  public void onUpdatePartitionColumnStatInBatch(UpdatePartitionColumnStatEventBatch updatePartColStatEventBatch,

Review comment: updatePartitionColStatsForOneBatch calls the direct SQL method, but it also does the following for each event:
MetaStoreListenerNotifier.notifyEvent(listeners, EventMessage.EventType.UPDATE_PARTITION_COLUMN_STAT,
    new UpdatePartitionColumnStatEvent(colStats, partVals, parameters, tbl, writeId, this));

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 611282) Time Spent: 50m (was: 40m) > Reduce overhead of adding notification log for update partition column > statistics > - > > Key: HIVE-25204 > URL: https://issues.apache.org/jira/browse/HIVE-25204 > Project: Hive > Issue Type: Sub-task > Components: Hive, HiveServer2 >Reporter: mahesh kumar behera >Assignee: mahesh kumar behera >Priority: Major > Labels: perfomance, pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > The notification logs for partition column statistics can be optimised by > adding them in batch. In the current implementation its done one by one > causing multiple sql execution in the backend RDBMS. These SQL executions can > be batched to reduce the execution time. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25242) Query performs extremely slowly with hive.vectorized.adaptor.usage.mode = chosen
[ https://issues.apache.org/jira/browse/HIVE-25242?focusedWorklogId=611280=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611280 ]

ASF GitHub Bot logged work on HIVE-25242:
-
Author: ASF GitHub Bot
Created on: 15/Jun/21 12:13
Start Date: 15/Jun/21 12:13
Worklog Time Spent: 10m

Work Description: zeroflag commented on pull request #2390:
URL: https://github.com/apache/hive/pull/2390#issuecomment-861446954
retest this please

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 611280)
Time Spent: 20m (was: 10m)

> Query performs extremely slowly with hive.vectorized.adaptor.usage.mode = chosen
> ---
>
> Key: HIVE-25242
> URL: https://issues.apache.org/jira/browse/HIVE-25242
> Project: Hive
> Issue Type: Bug
> Components: HiveServer2
> Affects Versions: 4.0.0
> Reporter: Attila Magyar
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> If hive.vectorized.adaptor.usage.mode is set to "chosen", only certain UDFs
> are vectorized through the vectorized adaptor.
> Queries like this one perform very slowly, because the concat is not chosen
> to be vectorized:
> {code:java}
> select count(*) from tbl where to_date(concat(year, '-', month, '-', day)) between to_date('2018-12-01') and to_date('2021-03-01'); {code}
--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25235) Remove ThreadPoolExecutorWithOomHook
[ https://issues.apache.org/jira/browse/HIVE-25235?focusedWorklogId=611220=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611220 ] ASF GitHub Bot logged work on HIVE-25235: - Author: ASF GitHub Bot Created on: 15/Jun/21 10:01 Start Date: 15/Jun/21 10:01 Worklog Time Spent: 10m Work Description: miklosgergely commented on a change in pull request #2383: URL: https://github.com/apache/hive/pull/2383#discussion_r651639876 ## File path: service/src/java/org/apache/hive/service/cli/session/SessionManager.java ## @@ -224,7 +224,7 @@ private void createBackgroundOperationPool() { // Threads terminate when they are idle for more than the keepAliveTime // A bounded blocking queue is used to queue incoming operations, if #operations > poolSize String threadPoolName = "HiveServer2-Background-Pool"; -final BlockingQueue queue = new LinkedBlockingQueue(poolQueueSize); +final BlockingQueue queue = new LinkedBlockingQueue(poolQueueSize); Review comment: You can use <> here -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 611220) Time Spent: 50m (was: 40m) > Remove ThreadPoolExecutorWithOomHook > > > Key: HIVE-25235 > URL: https://issues.apache.org/jira/browse/HIVE-25235 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > While I was looking at [HIVE-24846] to better perform OOM logging and I just > realized that this is not a good way to handle OOM. > https://stackoverflow.com/questions/1692230/is-it-possible-to-catch-out-of-memory-exception-in-java > bq. 
there's likely no easy way for you to recover from it if you do catch it > If we want to handle OOM, it's best to do it from outside. It's best to do it > with the JVM facilities: > {{-XX:+ExitOnOutOfMemoryError}} > {{-XX:OnOutOfMemoryError}} > It seems odd that the OOM handler attempts to load a handler and then do more > work when clearly the server is hosed at this point and just requesting to do > more work will further add to memory pressure. > The current OOM logic in {{HiveServer2OomHookRunner}} causes HiveServer2 to > shutdown, but we already have that with the JVM shutdown hook. This JVM > shutdown hook is triggered if {{-XX:OnOutOfMemoryError="kill -9 %p"}} exists > and is the appropriate thing to do. > https://github.com/apache/hive/blob/328d197431b2ff1000fd9c56ce758013eff81ad8/service/src/java/org/apache/hive/service/server/HiveServer2.java#L443-L444 > https://github.com/apache/hive/blob/cb0541a31b87016fae8e4c0e7130532c6e5f8de7/service/src/java/org/apache/hive/service/server/HiveServer2OomHookRunner.java#L42-L44 -- This message was sent by Atlassian Jira (v8.3.4#803005)
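The review comment above ("You can use <> here") refers to Java's diamond operator on the bounded queue in `SessionManager.createBackgroundOperationPool()`. A minimal, self-contained sketch of that pattern, where the pool size, queue size, and keep-alive are hypothetical placeholders rather than Hive's configured values:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class BackgroundPoolSketch {
    // Hypothetical stand-ins for the HiveConf-driven values in SessionManager.
    static final int POOL_SIZE = 4;
    static final int POOL_QUEUE_SIZE = 16;
    static final long KEEP_ALIVE_SECONDS = 10;

    static ThreadPoolExecutor createBackgroundOperationPool() {
        // The diamond operator (<>) lets the compiler infer <Runnable>,
        // as the review comment suggests; the queue stays bounded so incoming
        // operations queue up once all pool threads are busy.
        BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>(POOL_QUEUE_SIZE);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(POOL_SIZE, POOL_SIZE,
                KEEP_ALIVE_SECONDS, TimeUnit.SECONDS, queue);
        // Let threads terminate when idle for more than the keep-alive time.
        pool.allowCoreThreadTimeOut(true);
        return pool;
    }

    public static void main(String[] args) throws Exception {
        ThreadPoolExecutor pool = createBackgroundOperationPool();
        Future<Integer> answer = pool.submit(() -> 21 * 2);
        System.out.println(answer.get()); // prints 42
        pool.shutdown();
    }
}
```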
[jira] [Commented] (HIVE-25173) Fix build failure of hive-pre-upgrade due to missing dependency on pentaho-aggdesigner-algorithm
[ https://issues.apache.org/jira/browse/HIVE-25173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17363518#comment-17363518 ] Stamatis Zampetakis commented on HIVE-25173: The pentaho-aggdesigner-algorithm artifacts were present only in the spring repo and not in maven central. I think the spring repo was retired so it is not possible to find the artifacts any more. For sure we can exclude this dep and move forward for future releases but any old tag that relies on this cannot be built (including calcite-1.10.0 itself). Don't know if it is possible to push these artifacts to maven central at this point. CC [~jhyde] just for awareness. > Fix build failure of hive-pre-upgrade due to missing dependency on > pentaho-aggdesigner-algorithm > > > Key: HIVE-25173 > URL: https://issues.apache.org/jira/browse/HIVE-25173 > Project: Hive > Issue Type: Improvement >Affects Versions: 3.1.2 >Reporter: Masatake Iwasaki >Assignee: Masatake Iwasaki >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > {noformat} > [ERROR] Failed to execute goal on project hive-pre-upgrade: Could not resolve > dependencies for project org.apache.hive:hive-pre-upgrade:jar:4.0.0-SNAPSHOT: > Failure to find org.pentaho:pentaho-aggdesigner-algorithm:jar:5.1.5-jhyde in > https://repo.maven.apache.org/maven2 was cached in the local repository, > resolution will not be reattempted until the update interval of central has > elapsed or updates are forced > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-25224) Multi insert statements involving tables with different bucketing_versions results in error
[ https://issues.apache.org/jira/browse/HIVE-25224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich resolved HIVE-25224. - Fix Version/s: 4.0.0 Resolution: Fixed merged into master. Thank you Krisztian for reviewing the changes! > Multi insert statements involving tables with different bucketing_versions > results in error > --- > > Key: HIVE-25224 > URL: https://issues.apache.org/jira/browse/HIVE-25224 > Project: Hive > Issue Type: Bug >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 50m > Remaining Estimate: 0h > > {code} > drop table if exists t; > drop table if exists t2; > drop table if exists t3; > create table t (a integer); > create table t2 (a integer); > create table t3 (a integer); > alter table t set tblproperties ('bucketing_version'='1'); > explain from t3 insert into t select a insert into t2 select a; > {code} > results in > {code} > Error: Error while compiling statement: FAILED: RuntimeException Error > setting bucketingVersion for group: [[op: FS[2], bucketingVersion=1], [op: > FS[11], bucketingVersion=2]] (state=42000,code=4) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25224) Multi insert statements involving tables with different bucketing_versions results in error
[ https://issues.apache.org/jira/browse/HIVE-25224?focusedWorklogId=611212=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611212 ] ASF GitHub Bot logged work on HIVE-25224: - Author: ASF GitHub Bot Created on: 15/Jun/21 09:45 Start Date: 15/Jun/21 09:45 Worklog Time Spent: 10m Work Description: kgyrtkirk commented on pull request #2381: URL: https://github.com/apache/hive/pull/2381#issuecomment-861354189 I've merged this - testruns had multiple unrelated failures in a row; I'll go and clean up these flaky tests -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 611212) Time Spent: 50m (was: 40m) > Multi insert statements involving tables with different bucketing_versions > results in error > --- > > Key: HIVE-25224 > URL: https://issues.apache.org/jira/browse/HIVE-25224 > Project: Hive > Issue Type: Bug >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > {code} > drop table if exists t; > drop table if exists t2; > drop table if exists t3; > create table t (a integer); > create table t2 (a integer); > create table t3 (a integer); > alter table t set tblproperties ('bucketing_version'='1'); > explain from t3 insert into t select a insert into t2 select a; > {code} > results in > {code} > Error: Error while compiling statement: FAILED: RuntimeException Error > setting bucketingVersion for group: [[op: FS[2], bucketingVersion=1], [op: > FS[11], bucketingVersion=2]] (state=42000,code=4) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25224) Multi insert statements involving tables with different bucketing_versions results in error
[ https://issues.apache.org/jira/browse/HIVE-25224?focusedWorklogId=611209=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611209 ] ASF GitHub Bot logged work on HIVE-25224: - Author: ASF GitHub Bot Created on: 15/Jun/21 09:44 Start Date: 15/Jun/21 09:44 Worklog Time Spent: 10m Work Description: kgyrtkirk merged pull request #2381: URL: https://github.com/apache/hive/pull/2381 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 611209) Time Spent: 40m (was: 0.5h) > Multi insert statements involving tables with different bucketing_versions > results in error > --- > > Key: HIVE-25224 > URL: https://issues.apache.org/jira/browse/HIVE-25224 > Project: Hive > Issue Type: Bug >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > {code} > drop table if exists t; > drop table if exists t2; > drop table if exists t3; > create table t (a integer); > create table t2 (a integer); > create table t3 (a integer); > alter table t set tblproperties ('bucketing_version'='1'); > explain from t3 insert into t select a insert into t2 select a; > {code} > results in > {code} > Error: Error while compiling statement: FAILED: RuntimeException Error > setting bucketingVersion for group: [[op: FS[2], bucketingVersion=1], [op: > FS[11], bucketingVersion=2]] (state=42000,code=4) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25238) Make SSL cipher suites configurable for Hive Web UI and HS2
[ https://issues.apache.org/jira/browse/HIVE-25238?focusedWorklogId=611207=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611207 ] ASF GitHub Bot logged work on HIVE-25238: - Author: ASF GitHub Bot Created on: 15/Jun/21 09:38 Start Date: 15/Jun/21 09:38 Worklog Time Spent: 10m Work Description: yongzhi commented on a change in pull request #2385: URL: https://github.com/apache/hive/pull/2385#discussion_r651622848 ## File path: service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpCLIService.java ## @@ -24,10 +24,13 @@ import java.util.concurrent.SynchronousQueue; import java.util.concurrent.ThreadPoolExecutor; import java.util.concurrent.TimeUnit; +import java.util.Set; import javax.net.ssl.KeyManagerFactory; import javax.ws.rs.HttpMethod; +import com.google.common.base.Splitter; Review comment: It is not new; it has been used by HttpServer.java -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 611207) Time Spent: 40m (was: 0.5h) > Make SSL cipher suites configurable for Hive Web UI and HS2 > --- > > Key: HIVE-25238 > URL: https://issues.apache.org/jira/browse/HIVE-25238 > Project: Hive > Issue Type: Improvement > Components: HiveServer2, Web UI >Reporter: Yongzhi Chen >Assignee: Yongzhi Chen >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > When starting a Jetty HTTP server, one can explicitly exclude certain > (insecure) > SSL cipher suites. This can be especially important when Hive > needs to be compliant with security regulations. We need to add properties to > support this for the Hive Web UI and HiveServer2. > For the Hive binary CLI server, we can configure the set of included SSL cipher suites. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25238) Make SSL cipher suites configurable for Hive Web UI and HS2
[ https://issues.apache.org/jira/browse/HIVE-25238?focusedWorklogId=611203=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611203 ] ASF GitHub Bot logged work on HIVE-25238: - Author: ASF GitHub Bot Created on: 15/Jun/21 09:36 Start Date: 15/Jun/21 09:36 Worklog Time Spent: 10m Work Description: yongzhi commented on a change in pull request #2385: URL: https://github.com/apache/hive/pull/2385#discussion_r651621053 ## File path: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java ## @@ -4183,9 +4186,14 @@ private static void populateLlapDaemonVarsSet(Set llapDaemonVarsSetLocal HIVE_SERVER2_SSL_KEYSTORE_PASSWORD("hive.server2.keystore.password", "", "SSL certificate keystore password."), HIVE_SERVER2_SSL_KEYSTORE_TYPE("hive.server2.keystore.type", "", -"SSL certificate keystore type."), +"SSL certificate keystore type."), HIVE_SERVER2_SSL_KEYMANAGERFACTORY_ALGORITHM("hive.server2.keymanagerfactory.algorithm", "", -"SSL certificate keystore algorithm."), +"SSL certificate keystore algorithm."), + HIVE_SERVER2_SSL_HTTP_EXCLUDE_CIPHERSUITES("hive.server2.http.exclude.ciphersuites", "", Review comment: No, for binary Thrift, it uses TSSLTransportFactory.getServerSocket which does not support excluding cipher suites. The setting for HTTP (webui/hs2) can be different from binary as they have different clients. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 611203) Time Spent: 0.5h (was: 20m) > Make SSL cipher suites configurable for Hive Web UI and HS2 > --- > > Key: HIVE-25238 > URL: https://issues.apache.org/jira/browse/HIVE-25238 > Project: Hive > Issue Type: Improvement > Components: HiveServer2, Web UI >Reporter: Yongzhi Chen >Assignee: Yongzhi Chen >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > When starting a Jetty HTTP server, one can explicitly exclude certain > (insecure) > SSL cipher suites. This can be especially important when Hive > needs to be compliant with security regulations. We need to add properties to > support this for the Hive Web UI and HiveServer2. > For the Hive binary CLI server, we can configure the set of included SSL cipher suites. -- This message was sent by Atlassian Jira (v8.3.4#803005)
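The exclude-list behavior discussed in the HIVE-25238 thread above (the `hive.server2.http.exclude.ciphersuites` property appears in the quoted diff) amounts to filtering the suites a TLS endpoint would otherwise enable. A plain-JDK sketch of that filtering, assuming the helper name and the excluded suite are hypothetical illustrations rather than Hive's actual code:

```java
import javax.net.ssl.SSLContext;
import java.util.Arrays;
import java.util.Set;

public class CipherSuiteFilter {
    // Exclude-list semantics: start from the suites the JVM would enable
    // and drop any whose names appear in the configured exclude list.
    static String[] excludeCipherSuites(String[] enabled, Set<String> excluded) {
        return Arrays.stream(enabled)
                .filter(suite -> !excluded.contains(suite))
                .toArray(String[]::new);
    }

    public static void main(String[] args) throws Exception {
        String[] enabled = SSLContext.getDefault()
                .getDefaultSSLParameters().getCipherSuites();
        // Hypothetical exclude list; a real deployment would read it from a
        // comma-separated configuration value.
        Set<String> excluded = Set.of("TLS_RSA_WITH_AES_128_CBC_SHA");
        String[] kept = excludeCipherSuites(enabled, excluded);
        System.out.println((enabled.length - kept.length) + " suite(s) excluded");
    }
}
```

For Jetty-based endpoints such as the Web UI, the filtered list would likely be applied through Jetty's `SslContextFactory` exclude/include cipher-suite setters, which is presumably what the patch wires the new properties into.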
[jira] [Assigned] (HIVE-25245) Hive: merge r20 to cdpd-master
[ https://issues.apache.org/jira/browse/HIVE-25245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich reassigned HIVE-25245: --- Assignee: (was: Zoltan Haindrich) > Hive: merge r20 to cdpd-master > -- > > Key: HIVE-25245 > URL: https://issues.apache.org/jira/browse/HIVE-25245 > Project: Hive > Issue Type: Improvement > Components: Hive >Reporter: László Bodor >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-25245) Hive: merge r20 to cdpd-master
[ https://issues.apache.org/jira/browse/HIVE-25245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich reassigned HIVE-25245: --- Assignee: Zoltan Haindrich > Hive: merge r20 to cdpd-master > -- > > Key: HIVE-25245 > URL: https://issues.apache.org/jira/browse/HIVE-25245 > Project: Hive > Issue Type: Improvement > Components: Hive >Reporter: László Bodor >Assignee: Zoltan Haindrich >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-8143) Create root scratch dir with 733 instead of 777 perms
[ https://issues.apache.org/jira/browse/HIVE-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie reassigned HIVE-8143: --- Assignee: Vaibhav Gumashta > Create root scratch dir with 733 instead of 777 perms > - > > Key: HIVE-8143 > URL: https://issues.apache.org/jira/browse/HIVE-8143 > Project: Hive > Issue Type: Bug > Components: HiveServer2, JDBC >Affects Versions: 0.14.0 >Reporter: Vaibhav Gumashta >Assignee: Vaibhav Gumashta >Priority: Major > Labels: TODOC14 > Fix For: 0.14.0 > > Attachments: HIVE-8143.1.patch, HIVE-8143.2.patch > > > hive.exec.scratchdir which is treated as the root scratch directory on hdfs > only needs to be writable by all. We can use 733 instead of 777 for doing > that. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-8143) Create root scratch dir with 733 instead of 777 perms
[ https://issues.apache.org/jira/browse/HIVE-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie reassigned HIVE-8143: --- Assignee: (was: lujie) > Create root scratch dir with 733 instead of 777 perms > - > > Key: HIVE-8143 > URL: https://issues.apache.org/jira/browse/HIVE-8143 > Project: Hive > Issue Type: Bug > Components: HiveServer2, JDBC >Affects Versions: 0.14.0 >Reporter: Vaibhav Gumashta >Priority: Major > Labels: TODOC14 > Fix For: 0.14.0 > > Attachments: HIVE-8143.1.patch, HIVE-8143.2.patch > > > hive.exec.scratchdir which is treated as the root scratch directory on hdfs > only needs to be writable by all. We can use 733 instead of 777 for doing > that. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-8143) Create root scratch dir with 733 instead of 777 perms
[ https://issues.apache.org/jira/browse/HIVE-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie reassigned HIVE-8143: --- Assignee: lujie (was: Vaibhav Gumashta) > Create root scratch dir with 733 instead of 777 perms > - > > Key: HIVE-8143 > URL: https://issues.apache.org/jira/browse/HIVE-8143 > Project: Hive > Issue Type: Bug > Components: HiveServer2, JDBC >Affects Versions: 0.14.0 >Reporter: Vaibhav Gumashta >Assignee: lujie >Priority: Major > Labels: TODOC14 > Fix For: 0.14.0 > > Attachments: HIVE-8143.1.patch, HIVE-8143.2.patch > > > hive.exec.scratchdir which is treated as the root scratch directory on hdfs > only needs to be writable by all. We can use 733 instead of 777 for doing > that. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25204) Reduce overhead of adding notification log for update partition column statistics
[ https://issues.apache.org/jira/browse/HIVE-25204?focusedWorklogId=611172=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611172 ] ASF GitHub Bot logged work on HIVE-25204: - Author: ASF GitHub Bot Created on: 15/Jun/21 07:35 Start Date: 15/Jun/21 07:35 Worklog Time Spent: 10m Work Description: maheshk114 commented on a change in pull request #2365: URL: https://github.com/apache/hive/pull/2365#discussion_r651526249 ## File path: hcatalog/server-extensions/src/main/java/org/apache/hive/hcatalog/listener/DbNotificationListener.java ## @@ -1194,108 +1205,82 @@ static String quoteString(String input) { private void addNotificationLog(NotificationEvent event, ListenerEvent listenerEvent, Connection dbConn, SQLGenerator sqlGenerator) throws MetaException, SQLException { LOG.debug("DbNotificationListener: adding notification log for : {}", event.getMessage()); +addNotificationLogBatch(Collections.singletonList(event), Collections.singletonList(listenerEvent), +dbConn, sqlGenerator); + } + + private void addNotificationLogBatch(List eventList, List listenerEventList, + Connection dbConn, SQLGenerator sqlGenerator) throws MetaException, SQLException { if ((dbConn == null) || (sqlGenerator == null)) { LOG.info("connection or sql generator is not set so executing sql via DN"); - process(event, listenerEvent); + for (int idx = 0; idx < eventList.size(); idx++) { +LOG.debug("DbNotificationListener: adding notification log for : {}", eventList.get(idx).getMessage()); +process(eventList.get(idx), listenerEventList.get(idx)); + } return; } -Statement stmt = null; -PreparedStatement pst = null; -ResultSet rs = null; -try { - stmt = dbConn.createStatement(); - event.setMessageFormat(msgEncoder.getMessageFormat()); +try (Statement stmt = dbConn.createStatement()) { if (sqlGenerator.getDbProduct().isMYSQL()) { stmt.execute("SET @@session.sql_mode=ANSI_QUOTES"); } +} - long nextEventId = getNextEventId(dbConn, sqlGenerator); - - long nextNLId = 
getNextNLId(dbConn, sqlGenerator, - "org.apache.hadoop.hive.metastore.model.MNotificationLog"); - - String insertVal; - String columns; - List params = new ArrayList(); - - // Construct the values string, parameters and column string step by step simultaneously so - // that the positions of columns and of their corresponding values do not go out of sync. - - // Notification log id - columns = "\"NL_ID\""; - insertVal = "" + nextNLId; - - // Event id - columns = columns + ", \"EVENT_ID\""; - insertVal = insertVal + "," + nextEventId; - - // Event time - columns = columns + ", \"EVENT_TIME\""; - insertVal = insertVal + "," + event.getEventTime(); +long nextEventId = getNextEventId(dbConn, sqlGenerator, eventList.size()); +long nextNLId = getNextNLId(dbConn, sqlGenerator, +"org.apache.hadoop.hive.metastore.model.MNotificationLog", eventList.size()); - // Event type - columns = columns + ", \"EVENT_TYPE\""; - insertVal = insertVal + ", ?"; - params.add(event.getEventType()); +String columns = "\"NL_ID\"" + ", \"EVENT_ID\"" + ", \"EVENT_TIME\"" + ", \"EVENT_TYPE\"" + ", \"MESSAGE\"" ++ ", \"MESSAGE_FORMAT\"" + ", \"DB_NAME\"" + ", \"TBL_NAME\"" + ", \"CAT_NAME\""; +String insertVal = "insert into \"NOTIFICATION_LOG\" (" + columns + ") VALUES (" ++ "?,?,?,?,?,?,?,?,?" 
++ ")"; - // Message - columns = columns + ", \"MESSAGE\""; - insertVal = insertVal + ", ?"; - params.add(event.getMessage()); +try (PreparedStatement pst = dbConn.prepareStatement(insertVal)) { + int numRows = 0; - // Message format - columns = columns + ", \"MESSAGE_FORMAT\""; - insertVal = insertVal + ", ?"; - params.add(event.getMessageFormat()); + for (int idx = 0; idx < eventList.size(); idx++) { +NotificationEvent event = eventList.get(idx); +ListenerEvent listenerEvent = listenerEventList.get(idx); - // Database name, optional - String dbName = event.getDbName(); - if (dbName != null) { -assert dbName.equals(dbName.toLowerCase()); -columns = columns + ", \"DB_NAME\""; -insertVal = insertVal + ", ?"; -params.add(dbName); - } +LOG.debug("DbNotificationListener: adding notification log for : {}", event.getMessage()); +event.setMessageFormat(msgEncoder.getMessageFormat()); - // Table name, optional - String tableName = event.getTableName(); - if (tableName != null) { -assert
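The patch quoted above replaces per-event INSERTs into NOTIFICATION_LOG with a single prepared statement executed as a JDBC batch. A simplified, self-contained sketch of that pattern (not Hive's actual code: the row representation and helper names are hypothetical, while the column list matches the diff):

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

public class NotificationBatchSketch {
    // Column list as it appears in the patch; the batched version builds the
    // INSERT statement once instead of re-deriving columns per event.
    static final String COLUMNS = "\"NL_ID\", \"EVENT_ID\", \"EVENT_TIME\", "
            + "\"EVENT_TYPE\", \"MESSAGE\", \"MESSAGE_FORMAT\", "
            + "\"DB_NAME\", \"TBL_NAME\", \"CAT_NAME\"";

    static String buildInsertSql() {
        return "insert into \"NOTIFICATION_LOG\" (" + COLUMNS
                + ") VALUES (?,?,?,?,?,?,?,?,?)";
    }

    // One prepared statement, one addBatch() per event, one executeBatch():
    // the driver can send the whole batch to the RDBMS instead of issuing
    // one round trip per notification event.
    static void addNotificationLogBatch(Connection dbConn, List<Object[]> rows)
            throws SQLException {
        try (PreparedStatement pst = dbConn.prepareStatement(buildInsertSql())) {
            for (Object[] row : rows) {
                for (int i = 0; i < row.length; i++) {
                    pst.setObject(i + 1, row[i]);
                }
                pst.addBatch();
            }
            pst.executeBatch();
        }
    }

    public static void main(String[] args) {
        System.out.println(buildInsertSql());
    }
}
```

The serial version the JIRA describes executes one INSERT per event; the batched form amortizes statement preparation and network round trips across the whole event list.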
[jira] [Commented] (HIVE-25104) Backward incompatible timestamp serialization in Parquet for certain timezones
[ https://issues.apache.org/jira/browse/HIVE-25104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17363430#comment-17363430 ] Nikhil Gupta commented on HIVE-25104: - [~jcamachorodriguez] [~zabetak] I am seeing a lot of timestamp issues and backward compatibility issues (Parquet, Avro, ORC) being pushed. Can we track them under a single Umbrella Jira? > Backward incompatible timestamp serialization in Parquet for certain timezones > -- > > Key: HIVE-25104 > URL: https://issues.apache.org/jira/browse/HIVE-25104 > Project: Hive > Issue Type: Bug > Components: Serializers/Deserializers >Affects Versions: 3.1.0 >Reporter: Stamatis Zampetakis >Assignee: Stamatis Zampetakis >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 2h > Remaining Estimate: 0h > > HIVE-12192, HIVE-20007 changed the way that timestamp computations are > performed and to some extent how timestamps are serialized and deserialized > in files (Parquet, Avro). > In versions that include HIVE-12192 or HIVE-20007 the serialization in > Parquet files is not backwards compatible. In other words, writing timestamps > with a version of Hive that includes HIVE-12192/HIVE-20007 and reading them > with another (not including the previous issues) may lead to different > results depending on the default timezone of the system. > Consider the following scenario where the default system timezone is set to > US/Pacific. 
> At apache/master commit 37f13b02dff94e310d77febd60f93d5a205254d3 > {code:sql} > CREATE EXTERNAL TABLE employee(eid INT,birth timestamp) STORED AS PARQUET > LOCATION '/tmp/hiveexttbl/employee'; > INSERT INTO employee VALUES (1, '1880-01-01 00:00:00'); > INSERT INTO employee VALUES (2, '1884-01-01 00:00:00'); > INSERT INTO employee VALUES (3, '1990-01-01 00:00:00'); > SELECT * FROM employee; > {code} > |1|1880-01-01 00:00:00| > |2|1884-01-01 00:00:00| > |3|1990-01-01 00:00:00| > At apache/branch-2.3 commit 324f9faf12d4b91a9359391810cb3312c004d356 > {code:sql} > CREATE EXTERNAL TABLE employee(eid INT,birth timestamp) STORED AS PARQUET > LOCATION '/tmp/hiveexttbl/employee'; > SELECT * FROM employee; > {code} > |1|1879-12-31 23:52:58| > |2|1884-01-01 00:00:00| > |3|1990-01-01 00:00:00| > The timestamp for {{eid=1}} in branch-2.3 is different from the one in master. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25204) Reduce overhead of adding notification log for update partition column statistics
[ https://issues.apache.org/jira/browse/HIVE-25204?focusedWorklogId=611165=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611165 ] ASF GitHub Bot logged work on HIVE-25204: - Author: ASF GitHub Bot Created on: 15/Jun/21 06:56 Start Date: 15/Jun/21 06:56 Worklog Time Spent: 10m Work Description: aasha commented on a change in pull request #2365: URL: https://github.com/apache/hive/pull/2365#discussion_r651500422 ## File path: hcatalog/server-extensions/src/main/java/org/apache/hive/hcatalog/listener/DbNotificationListener.java ## @@ -905,25 +909,32 @@ public void onUpdatePartitionColumnStat(UpdatePartitionColumnStatEvent updatePar } @Override - public void onUpdatePartitionColumnStatDirectSql(UpdatePartitionColumnStatEvent updatePartColStatEvent, - Connection dbConn, SQLGenerator sqlGenerator) + public void onUpdatePartitionColumnStatInBatch(UpdatePartitionColumnStatEventBatch updatePartColStatEventBatch, Review comment: Can this be used for updatePartitionColStatsForOneBatch as well in HMSHandler -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 611165) Time Spent: 0.5h (was: 20m) > Reduce overhead of adding notification log for update partition column > statistics > - > > Key: HIVE-25204 > URL: https://issues.apache.org/jira/browse/HIVE-25204 > Project: Hive > Issue Type: Sub-task > Components: Hive, HiveServer2 >Reporter: mahesh kumar behera >Assignee: mahesh kumar behera >Priority: Major > Labels: perfomance, pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > The notification logs for partition column statistics can be optimised by > adding them in batch. In the current implementation its done one by one > causing multiple sql execution in the backend RDBMS. 
These SQL executions can > be batched to reduce the execution time. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-24991) Enable fetching deleted rows in vectorized mode
[ https://issues.apache.org/jira/browse/HIVE-24991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Kasa resolved HIVE-24991. --- Resolution: Fixed Pushed to master. Thanks [~pgaref] for review. > Enable fetching deleted rows in vectorized mode > --- > > Key: HIVE-24991 > URL: https://issues.apache.org/jira/browse/HIVE-24991 > Project: Hive > Issue Type: Improvement > Components: Vectorization >Reporter: Krisztian Kasa >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 4h 50m > Remaining Estimate: 0h > > HIVE-24855 enables loading deleted rows from ORC tables when table property > *acid.fetch.deleted.rows* is true. > The goal of this jira is to enable this feature in vectorized orc batch > reader. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24991) Enable fetching deleted rows in vectorized mode
[ https://issues.apache.org/jira/browse/HIVE-24991?focusedWorklogId=611159=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611159 ] ASF GitHub Bot logged work on HIVE-24991: - Author: ASF GitHub Bot Created on: 15/Jun/21 06:32 Start Date: 15/Jun/21 06:32 Worklog Time Spent: 10m Work Description: kasakrisz merged pull request #2264: URL: https://github.com/apache/hive/pull/2264 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 611159) Time Spent: 4h 50m (was: 4h 40m) > Enable fetching deleted rows in vectorized mode > --- > > Key: HIVE-24991 > URL: https://issues.apache.org/jira/browse/HIVE-24991 > Project: Hive > Issue Type: Improvement > Components: Vectorization >Reporter: Krisztian Kasa >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 4h 50m > Remaining Estimate: 0h > > HIVE-24855 enables loading deleted rows from ORC tables when table property > *acid.fetch.deleted.rows* is true. > The goal of this jira is to enable this feature in vectorized orc batch > reader. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-25245) Hive: merge r20 to cdpd-master
[ https://issues.apache.org/jira/browse/HIVE-25245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] László Bodor reassigned HIVE-25245: --- Assignee: (was: László Bodor) > Hive: merge r20 to cdpd-master > -- > > Key: HIVE-25245 > URL: https://issues.apache.org/jira/browse/HIVE-25245 > Project: Hive > Issue Type: Improvement > Components: Hive >Reporter: László Bodor >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25245) Hive: merge r20 to cdpd-master
[ https://issues.apache.org/jira/browse/HIVE-25245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] László Bodor updated HIVE-25245: Component/s: Hive > Hive: merge r20 to cdpd-master > -- > > Key: HIVE-25245 > URL: https://issues.apache.org/jira/browse/HIVE-25245 > Project: Hive > Issue Type: Improvement > Components: Hive >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-25245) Hive: merge r20 to cdpd-master
[ https://issues.apache.org/jira/browse/HIVE-25245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] László Bodor resolved HIVE-25245. - Resolution: Invalid > Hive: merge r20 to cdpd-master > -- > > Key: HIVE-25245 > URL: https://issues.apache.org/jira/browse/HIVE-25245 > Project: Hive > Issue Type: Improvement > Components: Hive >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-25245) Hive: merge r20 to cdpd-master
[ https://issues.apache.org/jira/browse/HIVE-25245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17363395#comment-17363395 ] László Bodor commented on HIVE-25245: - sorry, wrong jira :D > Hive: merge r20 to cdpd-master > -- > > Key: HIVE-25245 > URL: https://issues.apache.org/jira/browse/HIVE-25245 > Project: Hive > Issue Type: Improvement > Components: Hive >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-25245) Hive: merge r20 to cdpd-master
[ https://issues.apache.org/jira/browse/HIVE-25245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] László Bodor reassigned HIVE-25245: --- > Hive: merge r20 to cdpd-master > -- > > Key: HIVE-25245 > URL: https://issues.apache.org/jira/browse/HIVE-25245 > Project: Hive > Issue Type: Improvement >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25204) Reduce overhead of adding notification log for update partition column statistics
[ https://issues.apache.org/jira/browse/HIVE-25204?focusedWorklogId=611154&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611154 ] ASF GitHub Bot logged work on HIVE-25204: - Author: ASF GitHub Bot Created on: 15/Jun/21 06:04 Start Date: 15/Jun/21 06:04 Worklog Time Spent: 10m Work Description: aasha commented on a change in pull request #2365: URL: https://github.com/apache/hive/pull/2365#discussion_r651473262 ## File path: hcatalog/server-extensions/src/main/java/org/apache/hive/hcatalog/listener/DbNotificationListener.java ## @@ -1194,108 +1205,82 @@ static String quoteString(String input) { private void addNotificationLog(NotificationEvent event, ListenerEvent listenerEvent, Connection dbConn, SQLGenerator sqlGenerator) throws MetaException, SQLException { LOG.debug("DbNotificationListener: adding notification log for : {}", event.getMessage()); +addNotificationLogBatch(Collections.singletonList(event), Collections.singletonList(listenerEvent), +dbConn, sqlGenerator); + } + + private void addNotificationLogBatch(List<NotificationEvent> eventList, List<ListenerEvent> listenerEventList, + Connection dbConn, SQLGenerator sqlGenerator) throws MetaException, SQLException { if ((dbConn == null) || (sqlGenerator == null)) { LOG.info("connection or sql generator is not set so executing sql via DN"); - process(event, listenerEvent); + for (int idx = 0; idx < eventList.size(); idx++) { +LOG.debug("DbNotificationListener: adding notification log for : {}", eventList.get(idx).getMessage()); +process(eventList.get(idx), listenerEventList.get(idx)); + } return; } -Statement stmt = null; -PreparedStatement pst = null; -ResultSet rs = null; -try { - stmt = dbConn.createStatement(); - event.setMessageFormat(msgEncoder.getMessageFormat()); +try (Statement stmt = dbConn.createStatement()) { if (sqlGenerator.getDbProduct().isMYSQL()) { stmt.execute("SET @@session.sql_mode=ANSI_QUOTES"); } +} - long nextEventId = getNextEventId(dbConn, sqlGenerator); - - long nextNLId = getNextNLId(dbConn, 
sqlGenerator, - "org.apache.hadoop.hive.metastore.model.MNotificationLog"); - - String insertVal; - String columns; - List<String> params = new ArrayList<>(); - - // Construct the values string, parameters and column string step by step simultaneously so - // that the positions of columns and of their corresponding values do not go out of sync. - - // Notification log id - columns = "\"NL_ID\""; - insertVal = "" + nextNLId; - - // Event id - columns = columns + ", \"EVENT_ID\""; - insertVal = insertVal + "," + nextEventId; - - // Event time - columns = columns + ", \"EVENT_TIME\""; - insertVal = insertVal + "," + event.getEventTime(); +long nextEventId = getNextEventId(dbConn, sqlGenerator, eventList.size()); +long nextNLId = getNextNLId(dbConn, sqlGenerator, +"org.apache.hadoop.hive.metastore.model.MNotificationLog", eventList.size()); - // Event type - columns = columns + ", \"EVENT_TYPE\""; - insertVal = insertVal + ", ?"; - params.add(event.getEventType()); +String columns = "\"NL_ID\"" + ", \"EVENT_ID\"" + ", \"EVENT_TIME\"" + ", \"EVENT_TYPE\"" + ", \"MESSAGE\"" ++ ", \"MESSAGE_FORMAT\"" + ", \"DB_NAME\"" + ", \"TBL_NAME\"" + ", \"CAT_NAME\""; +String insertVal = "insert into \"NOTIFICATION_LOG\" (" + columns + ") VALUES (" ++ "?,?,?,?,?,?,?,?,?" 
++ ")"; - // Message - columns = columns + ", \"MESSAGE\""; - insertVal = insertVal + ", ?"; - params.add(event.getMessage()); +try (PreparedStatement pst = dbConn.prepareStatement(insertVal)) { + int numRows = 0; - // Message format - columns = columns + ", \"MESSAGE_FORMAT\""; - insertVal = insertVal + ", ?"; - params.add(event.getMessageFormat()); + for (int idx = 0; idx < eventList.size(); idx++) { +NotificationEvent event = eventList.get(idx); +ListenerEvent listenerEvent = listenerEventList.get(idx); - // Database name, optional - String dbName = event.getDbName(); - if (dbName != null) { -assert dbName.equals(dbName.toLowerCase()); -columns = columns + ", \"DB_NAME\""; -insertVal = insertVal + ", ?"; -params.add(dbName); - } +LOG.debug("DbNotificationListener: adding notification log for : {}", event.getMessage()); +event.setMessageFormat(msgEncoder.getMessageFormat()); - // Table name, optional - String tableName = event.getTableName(); - if (tableName != null) { -assert tableName.equals(tableName.toLowerCase()); -
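The diff above replaces per-event inserts with a single prepared statement whose parameters are bound once per event and flushed with a JDBC batch, after reserving a contiguous block of IDs up front. The sketch below illustrates that pattern in isolation; it is not Hive's actual code, the class and method names are hypothetical, and the NOTIFICATION_LOG row is reduced to two columns for brevity.

```java
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

public class NotificationBatchSketch {

    // Stand-in for the metastore sequence table: hand out the first ID of a
    // freshly reserved block of `count` IDs, advancing the sequence once
    // (one sequence update for the whole batch instead of one per event).
    private static long sequence = 1L;

    static long reserveIds(int count) {
        long first = sequence;
        sequence += count;
        return first;
    }

    // Build the parameter rows that would be bound to a single
    // "insert into NOTIFICATION_LOG (NL_ID, MESSAGE) VALUES (?,?)" statement.
    static List<Object[]> buildRows(List<String> messages) {
        long firstNlId = reserveIds(messages.size());
        List<Object[]> rows = new ArrayList<>();
        for (int i = 0; i < messages.size(); i++) {
            rows.add(new Object[] {firstNlId + i, messages.get(i)});
        }
        return rows;
    }

    // Bind every row into the one PreparedStatement and flush with a single
    // executeBatch() round trip instead of rows.size() individual executes.
    static int[] flush(PreparedStatement pst, List<Object[]> rows) throws SQLException {
        for (Object[] row : rows) {
            for (int col = 0; col < row.length; col++) {
                pst.setObject(col + 1, row[col]); // JDBC parameters are 1-based
            }
            pst.addBatch();
        }
        return pst.executeBatch();
    }
}
```

Pre-reserving the ID block matters because each row needs a distinct NL_ID/EVENT_ID; without it, the batch would still pay one sequence round trip per event.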