[jira] [Commented] (HIVE-26157) Change Iceberg storage handler authz URI to metadata location
    [ https://issues.apache.org/jira/browse/HIVE-26157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17536462#comment-17536462 ]

László Pintér commented on HIVE-26157:
--------------------------------------

Merged into master. Thanks, [~pvary] for the review!

> Change Iceberg storage handler authz URI to metadata location
> -------------------------------------------------------------
>
>                Key: HIVE-26157
>                URL: https://issues.apache.org/jira/browse/HIVE-26157
>            Project: Hive
>         Issue Type: Improvement
>           Reporter: László Pintér
>           Assignee: László Pintér
>           Priority: Major
>             Labels: pull-request-available
>         Time Spent: 4h 20m
> Remaining Estimate: 0h
>
> In HIVE-25964, the authz URI has been changed to "iceberg://db.table".
> It is possible to set the metadata pointers of table A to point to table B,
> and therefore you could read table B's data via querying table A.
> {code:sql}
> alter table A set tblproperties
> ('metadata_location'='/path/to/B/snapshot.json',
> 'previous_metadata_location'='/path/to/B/prev_snapshot.json');
> {code}

--
This message was sent by Atlassian Jira
(v8.20.7#820007)
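The idea behind the fix can be sketched in a few lines of Python (the helper name and the exact URI shape below are illustrative assumptions, not Hive's actual code): if the authz URI is derived from the current metadata file location instead of the db.table name, then repointing table A's `metadata_location` at table B's snapshot file means the reader must be authorized for B's path.

```python
def authz_uri(metadata_location: str) -> str:
    """Hypothetical sketch: derive the storage-handler authz URI from the
    table's current metadata file rather than from its logical db.table
    name, so that a swapped 'metadata_location' pointer is authorized
    against the path it actually reads."""
    return "iceberg://" + metadata_location.lstrip("/")
```

With the name-based scheme, querying table A is authorized as table A no matter where its pointer goes; with a location-based URI, the swapped pointer surfaces B's path to the authorizer.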
[jira] [Resolved] (HIVE-26157) Change Iceberg storage handler authz URI to metadata location
    [ https://issues.apache.org/jira/browse/HIVE-26157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

László Pintér resolved HIVE-26157.
----------------------------------
    Resolution: Fixed
[jira] [Work logged] (HIVE-26157) Change Iceberg storage handler authz URI to metadata location
    [ https://issues.apache.org/jira/browse/HIVE-26157?focusedWorklogId=770038&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-770038 ]

ASF GitHub Bot logged work on HIVE-26157:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 13/May/22 06:44
            Start Date: 13/May/22 06:44
    Worklog Time Spent: 10m
      Work Description: lcspinter merged PR #3226:
URL: https://github.com/apache/hive/pull/3226

Issue Time Tracking
-------------------

    Worklog Id:     (was: 770038)
    Time Spent: 4h 20m  (was: 4h 10m)
[jira] [Work logged] (HIVE-26082) Upgrade DataNucleus dependency to 5.2.8
    [ https://issues.apache.org/jira/browse/HIVE-26082?focusedWorklogId=770024&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-770024 ]

ASF GitHub Bot logged work on HIVE-26082:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 13/May/22 04:57
            Start Date: 13/May/22 04:57
    Worklog Time Spent: 10m
      Work Description: ashish-kumar-sharma commented on PR #3148:
URL: https://github.com/apache/hive/pull/3148#issuecomment-1125657111

@nrg4878 Could you please point me to the metastore microbenchmarks tool, so that I can generate the performance numbers?

Issue Time Tracking
-------------------

    Worklog Id:     (was: 770024)
    Time Spent: 40m  (was: 0.5h)

> Upgrade DataNucleus dependency to 5.2.8
> ---------------------------------------
>
>                Key: HIVE-26082
>                URL: https://issues.apache.org/jira/browse/HIVE-26082
>            Project: Hive
>         Issue Type: Task
>           Reporter: Ashish Sharma
>           Assignee: Ashish Sharma
>           Priority: Minor
>             Labels: pull-request-available
>         Time Spent: 40m
> Remaining Estimate: 0h
>
> Upgrade
> datanucleus-api-jdo 5.2.4 to 5.2.8
> datanucleus-core 5.2.4 to 5.2.10
> datanucleus-rdbms 5.2.4 to 5.2.10
[jira] [Work logged] (HIVE-26046) MySQL's bit datatype is default to void datatype in hive
    [ https://issues.apache.org/jira/browse/HIVE-26046?focusedWorklogId=770009&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-770009 ]

ASF GitHub Bot logged work on HIVE-26046:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 13/May/22 03:18
            Start Date: 13/May/22 03:18
    Worklog Time Spent: 10m
      Work Description: zhangbutao commented on PR #3276:
URL: https://github.com/apache/hive/pull/3276#issuecomment-1125617331

@zabetak thx. I'll run a full test and try to fix this test.

Issue Time Tracking
-------------------

    Worklog Id:     (was: 770009)
    Time Spent: 40m  (was: 0.5h)

> MySQL's bit datatype is default to void datatype in hive
> --------------------------------------------------------
>
>                Key: HIVE-26046
>                URL: https://issues.apache.org/jira/browse/HIVE-26046
>            Project: Hive
>         Issue Type: Sub-task
>         Components: Standalone Metastore
>   Affects Versions: 4.0.0
>           Reporter: Naveen Gangam
>           Assignee: zhangbutao
>           Priority: Major
>             Labels: pull-request-available
>         Time Spent: 40m
> Remaining Estimate: 0h
>
> describe on a table that contains a "bit" datatype gets mapped to void. We
> need explicit conversion logic in the MySQL ConnectorProvider to map it to
> a suitable datatype in hive.
> {noformat}
> +---------------------+---------------+--------------------+
> |      col_name       |   data_type   |      comment       |
> +---------------------+---------------+--------------------+
> | tbl_id              | bigint        | from deserializer  |
> | create_time         | int           | from deserializer  |
> | db_id               | bigint        | from deserializer  |
> | last_access_time    | int           | from deserializer  |
> | owner               | varchar(767)  | from deserializer  |
> | owner_type          | varchar(10)   | from deserializer  |
> | retention           | int           | from deserializer  |
> | sd_id               | bigint        | from deserializer  |
> | tbl_name            | varchar(256)  | from deserializer  |
> | tbl_type            | varchar(128)  | from deserializer  |
> | view_expanded_text  | string        | from deserializer  |
> | view_original_text  | string        | from deserializer  |
> | is_rewrite_enabled  | void          | from deserializer  |
> | write_id            | bigint        | from deserializer  |
> +---------------------+---------------+--------------------+
> {noformat}
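The explicit conversion logic the description asks for amounts to a lookup from the JDBC-reported type name to a Hive type, with a fallback other than void. A rough Python sketch of the shape of such a mapping (the table contents, the choice of boolean for bit, and the function name are illustrative assumptions, not the actual ConnectorProvider code):

```python
# Hypothetical sketch of a JDBC-to-Hive type mapping; the entries and
# the bit -> boolean choice are illustrative, not Hive's real mapping.
MYSQL_TO_HIVE = {
    "bit": "boolean",       # assumption: bit(1) flag columns map to boolean
    "bigint": "bigint",
    "int": "int",
    "varchar": "string",
}

def map_mysql_type(mysql_type: str) -> str:
    # Strip any length/precision suffix, e.g. "bit(1)" -> "bit", then look
    # up the base name; fall back to the declared name instead of "void".
    base = mysql_type.split("(")[0].strip().lower()
    return MYSQL_TO_HIVE.get(base, base)
```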
[jira] [Work logged] (HIVE-25872) Skip tracking of alterDatabase events for replication specific properties.
    [ https://issues.apache.org/jira/browse/HIVE-25872?focusedWorklogId=769990&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-769990 ]

ASF GitHub Bot logged work on HIVE-25872:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 13/May/22 01:58
            Start Date: 13/May/22 01:58
    Worklog Time Spent: 10m
      Work Description: rbalamohan commented on PR #2950:
URL: https://github.com/apache/hive/pull/2950#issuecomment-1125581061

LGTM. +1 pending tests.

Issue Time Tracking
-------------------

    Worklog Id:     (was: 769990)
    Time Spent: 1h 10m  (was: 1h)

> Skip tracking of alterDatabase events for replication specific properties.
> --------------------------------------------------------------------------
>
>                Key: HIVE-25872
>                URL: https://issues.apache.org/jira/browse/HIVE-25872
>            Project: Hive
>         Issue Type: Improvement
>           Reporter: Haymant Mangla
>           Assignee: Haymant Mangla
>           Priority: Major
>             Labels: pull-request-available
>         Time Spent: 1h 10m
> Remaining Estimate: 0h
[jira] [Commented] (HIVE-25230) add position and occurrence to instr()
    [ https://issues.apache.org/jira/browse/HIVE-25230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17536397#comment-17536397 ]

Zhihua Deng commented on HIVE-25230:
------------------------------------

Merged into master. Thank you [~stigahuang] for the contribution!

> add position and occurrence to instr()
> --------------------------------------
>
>                Key: HIVE-25230
>                URL: https://issues.apache.org/jira/browse/HIVE-25230
>            Project: Hive
>         Issue Type: New Feature
>         Components: UDF
>           Reporter: Quanlong Huang
>           Assignee: Quanlong Huang
>           Priority: Major
>             Labels: pull-request-available
>            Fix For: 4.0.0, 4.0.0-alpha-2
>
>         Time Spent: 1h 40m
> Remaining Estimate: 0h
>
> Current instr() only supports two arguments:
> {code:java}
> instr(str, substr) - Returns the index of the first occurance of substr in str
> {code}
> Other systems (Vertica, Oracle, Impala etc) support additional position and
> occurrence arguments:
> {code:java}
> instr(str, substr[, pos[, occurrence]])
> {code}
> Oracle doc:
> [https://docs.oracle.com/database/121/SQLRF/functions089.htm#SQLRF00651]
> It'd be nice to support this as well. Otherwise, it's a SQL difference
> between Impala and Hive.
> Impala supports this in IMPALA-3973
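The extended semantics can be sketched in a few lines of Python (1-based indexing as in the SQL function; this sketch ignores Oracle's negative-position backwards search and is only an illustration of the contract, not the Hive UDF implementation):

```python
def instr(s: str, sub: str, pos: int = 1, occurrence: int = 1) -> int:
    """Sketch of instr(str, substr[, pos[, occurrence]]): find the
    `occurrence`-th match of `sub` starting at 1-based position `pos`.
    Returns 0 when there is no such match."""
    idx = pos - 1                    # convert to 0-based for str.find
    for _ in range(occurrence):
        idx = s.find(sub, idx)
        if idx < 0:
            return 0                 # not enough occurrences
        idx += 1                     # next search starts past this match
    return idx                       # already 1-based after the final += 1
```

For example, `instr('barbarbar', 'bar', 2)` skips the match at position 1 because the search starts at position 2.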
[jira] [Resolved] (HIVE-25230) add position and occurrence to instr()
    [ https://issues.apache.org/jira/browse/HIVE-25230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihua Deng resolved HIVE-25230.
--------------------------------
    Fix Version/s: 4.0.0
                   4.0.0-alpha-2
       Resolution: Fixed
[jira] [Work logged] (HIVE-25230) add position and occurrence to instr()
    [ https://issues.apache.org/jira/browse/HIVE-25230?focusedWorklogId=769973&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-769973 ]

ASF GitHub Bot logged work on HIVE-25230:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 13/May/22 01:15
            Start Date: 13/May/22 01:15
    Worklog Time Spent: 10m
      Work Description: dengzhhu653 merged PR #2378:
URL: https://github.com/apache/hive/pull/2378

Issue Time Tracking
-------------------

    Worklog Id:     (was: 769973)
    Time Spent: 1h 40m  (was: 1.5h)
[jira] [Work logged] (HIVE-25872) Skip tracking of alterDatabase events for replication specific properties.
    [ https://issues.apache.org/jira/browse/HIVE-25872?focusedWorklogId=769964&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-769964 ]

ASF GitHub Bot logged work on HIVE-25872:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 13/May/22 00:21
            Start Date: 13/May/22 00:21
    Worklog Time Spent: 10m
      Work Description: rbalamohan commented on code in PR #2950:
URL: https://github.com/apache/hive/pull/2950#discussion_r871907679

##########
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java:
##########
@@ -284,6 +285,19 @@ public static boolean checkIfDbNeedsToBeSkipped(Database db) {
     return false;
   }

+  public static List<String> getReplicationDbProps() {
+    return Arrays.stream(ReplConst.class.getDeclaredFields())
+        .filter(field -> Modifier.isStatic(field.getModifiers()))
+        .map(field -> {
+          try {
+            String prop = (String) field.get(String.class);
+            return prop.replace("\"", "");
+          } catch (IllegalAccessException e) {
+            throw new RuntimeException(e);

Review Comment:
   Log and add details about the reason for the error?

Review Comment:
   Log and add details about the reason for the error for easier debugging?

Issue Time Tracking
-------------------

    Worklog Id:     (was: 769964)
    Time Spent: 1h  (was: 50m)
[jira] [Work logged] (HIVE-25872) Skip tracking of alterDatabase events for replication specific properties.
    [ https://issues.apache.org/jira/browse/HIVE-25872?focusedWorklogId=769963&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-769963 ]

ASF GitHub Bot logged work on HIVE-25872:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 13/May/22 00:20
            Start Date: 13/May/22 00:20
    Worklog Time Spent: 10m
      Work Description: rbalamohan commented on code in PR #2950 (same
MetaStoreUtils.java hunk for getReplicationDbProps() as in worklog 769964):
URL: https://github.com/apache/hive/pull/2950#discussion_r871907679

Review Comment:
   Log ?

Issue Time Tracking
-------------------

    Worklog Id:     (was: 769963)
    Time Spent: 50m  (was: 40m)
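The Java hunk under review collects the values of ReplConst's static string fields via reflection, so the list of replication-specific property names stays in sync with the constants class. A rough Python analogue of the same idea (the class and constant names here are illustrative, not the real ReplConst fields):

```python
class ReplConst:
    # illustrative constants only; not the actual ReplConst field names
    REPL_TARGET_DB_PROPERTY = "repl.target.for"
    REPL_FAILOVER_ENDPOINT = "repl.failover.endpoint"

def replication_db_props(const_class=ReplConst):
    """Collect every public, all-caps string constant declared on the
    class, mirroring the Java stream over getDeclaredFields()."""
    return sorted(
        value
        for name, value in vars(const_class).items()
        if name.isupper() and isinstance(value, str)
    )
```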
[jira] [Comment Edited] (HIVE-26071) JWT authentication for Thrift over HTTP in HiveMetaStore
    [ https://issues.apache.org/jira/browse/HIVE-26071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17536298#comment-17536298 ]

Sourabh Goyal edited comment on HIVE-26071 at 5/12/22 7:09 PM:
--------------------------------------------------------------

[~adondon]: Please find the answers
* Would it be possible to run both protocols at the same time (thrift and http)? - No. Only one mode can be enabled. But if there is a need, it should be easy to extend the current implementation to support both modes together.
* What about the Authenticator interface, to get for example the username or groups in the jwt claims? From what I saw (but I am not sure), the Authenticator interface is quite coupled with the hadoop/kerberos ugi? - You are right, the authentication in JWT is not coupled with Kerberos. The user is expected to set the username (in the subject field) in the JWT and send that JWT in the header request to the metastore server. The server, after validating the token, extracts the username from the subject field and executes the operation as that user via ugi.doAs().
* Is there already a design today to allow something like a storage based authorization implementation, where the authenticator can get information on who is authenticated without being Hadoop related? - Not sure if I understand it correctly. In the current implementation, the metastore server, during its start phase, fetches the jwks from a configurable url and validates all future JWTs using this set.

Let me know if you have any thoughts/concerns.

> JWT authentication for Thrift over HTTP in HiveMetaStore
> --------------------------------------------------------
>
>                Key: HIVE-26071
>                URL: https://issues.apache.org/jira/browse/HIVE-26071
>            Project: Hive
>         Issue Type: New Feature
>         Components: Standalone Metastore
>           Reporter: Sourabh Goyal
>           Assignee: Sourabh Goyal
>           Priority: Major
>             Labels: pull-request-available
>         Time Spent: 7h
> Remaining Estimate: 0h
>
> HIVE-25575 recently added support for JWT authentication in HS2. This Jira
> aims to add the same feature in HMS
[jira] [Commented] (HIVE-26071) JWT authentication for Thrift over HTTP in HiveMetaStore
    [ https://issues.apache.org/jira/browse/HIVE-26071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17536298#comment-17536298 ]

Sourabh Goyal commented on HIVE-26071:
--------------------------------------

[~adondon]: Please find the answers
* Would it be possible to run both protocols at the same time (thrift and http)? - No. Only one mode can be enabled. But if there is a need, it should be easy to extend the current implementation to support both modes together.
* What about the Authenticator interface, to get for example the username or groups in the jwt claims? From what I saw (but I am not sure), the Authenticator interface is quite coupled with the hadoop/kerberos ugi? - You are right, the authentication in JWT is not coupled with Kerberos. The user is expected to set the username (in the subject field) in the JWT and send that JWT in the header request to the metastore server. The server, after validating the token, extracts the username from the subject field and executes the operation as that user via ugi.doAs().
* Is there already a design today to allow something like a storage based authorization implementation, where the authenticator can get information on who is authenticated without being Hadoop related? - Not sure if I understand it correctly. In the current implementation, the metastore server, during its start phase, fetches the jwks from a configurable url and validates all future JWTs using this set.

Let me know if you have any thoughts/concerns.
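As described above, the server first validates the token against the JWKS fetched from the configured URL and then reads the username from the subject claim. The claim-extraction step alone can be sketched in stdlib Python; signature verification against the JWKS is deliberately omitted here, so this sketch must never be used as-is for authentication:

```python
import base64
import json

def jwt_subject(token: str) -> str:
    """Sketch only: decode the payload segment of a JWT and return its
    'sub' claim. A real server must verify the signature against the
    JWKS before trusting any claim; this function skips that entirely."""
    payload_b64 = token.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore base64url padding
    claims = json.loads(base64.urlsafe_b64decode(payload_b64))
    return claims["sub"]
```

The server would then run the requested metastore operation as the returned user (the ugi.doAs() step mentioned above).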
[jira] [Resolved] (HIVE-26071) JWT authentication for Thrift over HTTP in HiveMetaStore
    [ https://issues.apache.org/jira/browse/HIVE-26071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sourabh Goyal resolved HIVE-26071.
----------------------------------
    Resolution: Fixed
[jira] [Work logged] (HIVE-26071) JWT authentication for Thrift over HTTP in HiveMetaStore
    [ https://issues.apache.org/jira/browse/HIVE-26071?focusedWorklogId=769816&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-769816 ]

ASF GitHub Bot logged work on HIVE-26071:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 12/May/22 18:46
            Start Date: 12/May/22 18:46
    Worklog Time Spent: 10m
      Work Description: sourabh912 commented on PR #3233:
URL: https://github.com/apache/hive/pull/3233#issuecomment-1125314717

Thanks @nrg4878 @dengzhhu653 @saihemanth-cloudera @hsnusonic for the review and @yongzhi for the merge.

Issue Time Tracking
-------------------

    Worklog Id:     (was: 769816)
    Time Spent: 7h  (was: 6h 50m)
[jira] [Work logged] (HIVE-26071) JWT authentication for Thrift over HTTP in HiveMetaStore
    [ https://issues.apache.org/jira/browse/HIVE-26071?focusedWorklogId=769811&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-769811 ]

ASF GitHub Bot logged work on HIVE-26071:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 12/May/22 18:40
            Start Date: 12/May/22 18:40
    Worklog Time Spent: 10m
      Work Description: yongzhi merged PR #3233:
URL: https://github.com/apache/hive/pull/3233

Issue Time Tracking
-------------------

    Worklog Id:     (was: 769811)
    Time Spent: 6h 50m  (was: 6h 40m)
[jira] [Work logged] (HIVE-26227) Add support of catalog related statements for Hive ql
    [ https://issues.apache.org/jira/browse/HIVE-26227?focusedWorklogId=769773&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-769773 ]

ASF GitHub Bot logged work on HIVE-26227:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 12/May/22 17:32
            Start Date: 12/May/22 17:32
    Worklog Time Spent: 10m
      Work Description: wecharyu opened a new pull request, #3288:
URL: https://github.com/apache/hive/pull/3288

### What changes were proposed in this pull request?
Implement the ddl statements related to catalog; the statements can be found in [HIVE-26227](https://issues.apache.org/jira/browse/HIVE-26227).

### Why are the changes needed?
To support basic ddl operations for catalogs through Hive ql.

### Does this PR introduce _any_ user-facing change?
Yes, we should add these new statements to the DDL document.

### How was this patch tested?
Added a qtest `catalog.q`, which can be run with the command:
```bash
mvn test -Dtest=TestMiniLlapLocalCliDriver -Dqfile=catalog.q
```

Issue Time Tracking
-------------------

    Worklog Id:     (was: 769773)
    Remaining Estimate: 0h
    Time Spent: 10m

> Add support of catalog related statements for Hive ql
> -----------------------------------------------------
>
>                Key: HIVE-26227
>                URL: https://issues.apache.org/jira/browse/HIVE-26227
>            Project: Hive
>         Issue Type: Task
>         Components: Hive
>           Reporter: Wechar
>           Assignee: Wechar
>           Priority: Minor
>            Fix For: 4.0.0-alpha-2
>
>         Time Spent: 10m
> Remaining Estimate: 0h
>
> Catalog concept is proposed in Hive 3.0 to allow different systems to connect
> to different catalogs in the metastore. But so far we cannot manage catalogs
> through Hive ql; this task aims to implement the ddl statements related to
> catalog.
> *Create Catalog*
> {code:sql}
> CREATE CATALOG [IF NOT EXISTS] catalog_name
> LOCATION hdfs_path
> [COMMENT catalog_comment];
> {code}
> LOCATION is required for creating a new catalog now.
> *Alter Catalog*
> {code:sql}
> ALTER CATALOG catalog_name SET LOCATION hdfs_path;
> {code}
> Only location metadata can be altered for catalog.
> *Drop Catalog*
> {code:sql}
> DROP CATALOG [IF EXISTS] catalog_name;
> {code}
> DROP CATALOG is always RESTRICT, which means DROP CATALOG will fail if there
> are non-default databases in the catalog.
> *Show Catalogs*
> {code:sql}
> SHOW CATALOGS [LIKE 'identifier_with_wildcards'];
> {code}
> SHOW CATALOGS lists all of the catalogs defined in the metastore.
> The optional LIKE clause allows the list of catalogs to be filtered using a
> regular expression.
> *Describe Catalog*
> {code:sql}
> DESC[RIBE] CATALOG [EXTENDED] cat_name;
> {code}
> DESCRIBE CATALOG shows the name of the catalog, its comment (if one has been
> set), and its root location on the filesystem.
> EXTENDED also shows the create time.
[jira] [Updated] (HIVE-26227) Add support of catalog related statements for Hive ql
    [ https://issues.apache.org/jira/browse/HIVE-26227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HIVE-26227:
----------------------------------
    Labels: pull-request-available  (was: )
[jira] [Work logged] (HIVE-25335) Unreasonable setting reduce number, when join big size table(but small row count) and small size table
    [ https://issues.apache.org/jira/browse/HIVE-25335?focusedWorklogId=769681&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-769681 ]

ASF GitHub Bot logged work on HIVE-25335:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 12/May/22 14:22
            Start Date: 12/May/22 14:22
    Worklog Time Spent: 10m
      Work Description: zabetak commented on PR #2490:
URL: https://github.com/apache/hive/pull/2490#issuecomment-1125059076

@zhengchenyu I don't think it's possible to reopen a closed PR. You can create a new one instead.

Issue Time Tracking
-------------------

    Worklog Id:     (was: 769681)
    Time Spent: 2.5h  (was: 2h 20m)

> Unreasonable setting reduce number, when join big size table(but small row
> count) and small size table
> --------------------------------------------------------------------------
>
>                Key: HIVE-25335
>                URL: https://issues.apache.org/jira/browse/HIVE-25335
>            Project: Hive
>         Issue Type: Improvement
>           Reporter: zhengchenyu
>           Assignee: zhengchenyu
>           Priority: Major
>             Labels: pull-request-available
>        Attachments: HIVE-25335.001.patch
>
>         Time Spent: 2.5h
> Remaining Estimate: 0h
>
> I found an application which is slow in our cluster, because the number of
> bytes processed by each reducer is very large, but there are only two
> reducers.
> When I debugged, I found the reason. In this sql, one table is big in size
> (about 30G) but has few rows (about 3.5M), while another table is small in
> size (about 100M) but has more rows (about 3.6M). So JoinStatsRule.process
> only uses the 100M to estimate the number of reducers, but we actually need
> to process 30G of bytes.
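The sizing logic at issue can be sketched as follows. This mirrors the usual bytes-per-reducer heuristic (cf. `hive.exec.reducers.bytes.per.reducer` and `hive.exec.reducers.max`) rather than Hive's exact code; the report's point is that the byte estimate fed into such a formula was taken from the small join side only:

```python
import math

def estimate_reducers(total_bytes: int, bytes_per_reducer: int,
                      max_reducers: int) -> int:
    """Rough sketch of reducer-count estimation from input bytes:
    one reducer per bytes_per_reducer of input, clamped to
    [1, max_reducers]. Not Hive's actual implementation."""
    return max(1, min(max_reducers,
                      math.ceil(total_bytes / bytes_per_reducer)))
```

Feeding the 100M side into this formula yields a single reducer; feeding the 30G that actually has to be shuffled yields over a hundred, which is the gap described in the issue.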
[jira] [Commented] (HIVE-26226) Remove outdated dependency to jdk.tools:jdk.tools:jar:1.7 (upgrade-acid)
    [ https://issues.apache.org/jira/browse/HIVE-26226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17536140#comment-17536140 ]

Sylwester Lachiewicz commented on HIVE-26226:
---------------------------------------------

{code}
hive-metastore 2.3.3 (but also 2.3.9)
  hbase-client 1.1.1
    -> hbase-annotations 1.1.1 (up to 1.21.13) -> jdk.tools 1.7 (system)
{code}
jdk.tools was dropped with hbase-annotations 1.2.0

> Remove outdated dependency to jdk.tools:jdk.tools:jar:1.7 (upgrade-acid)
> ------------------------------------------------------------------------
>
>                Key: HIVE-26226
>                URL: https://issues.apache.org/jira/browse/HIVE-26226
>            Project: Hive
>         Issue Type: Improvement
>         Components: Tests
>   Affects Versions: 3.1.3, 4.0.0-alpha-2
>           Reporter: Sylwester Lachiewicz
>           Priority: Minor
>
> The hive-metastore 2.3.3 used in upgrade-acid tests includes an unnecessary
> dependency that blocks the possibility to compile with Java versions newer
> than 8.
[jira] [Work logged] (HIVE-26046) MySQL's bit datatype is default to void datatype in hive
[ https://issues.apache.org/jira/browse/HIVE-26046?focusedWorklogId=769670&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-769670 ] ASF GitHub Bot logged work on HIVE-26046: - Author: ASF GitHub Bot Created on: 12/May/22 14:12 Start Date: 12/May/22 14:12 Worklog Time Spent: 10m Work Description: zabetak commented on PR #3276: URL: https://github.com/apache/hive/pull/3276#issuecomment-1125046939 @zhangbutao Instead of running the test individually try to run the whole split (`mvn -Pitests -Pqsplits test -Dtest=org.apache.hadoop.hive.cli.split6.TestMiniLlapLocalCliDriver`) locally. I suspect there is some kind of interference with other tests (possibly `dataconnector_mysql.q`). From the error message I suppose we are trying to use an "old" connection that is no longer open. Possibly we are closing a connection that I shouldn't or something along these lines. Issue Time Tracking --- Worklog Id: (was: 769670) Time Spent: 0.5h (was: 20m) > MySQL's bit datatype is default to void datatype in hive > > > Key: HIVE-26046 > URL: https://issues.apache.org/jira/browse/HIVE-26046 > Project: Hive > Issue Type: Sub-task > Components: Standalone Metastore >Affects Versions: 4.0.0 >Reporter: Naveen Gangam >Assignee: zhangbutao >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > describe on a table that contains a "bit" datatype gets mapped to void. We > need a explicit conversion logic in the MySQL ConnectorProvider to map it to > a suitable datatype in hive. 
> {noformat}
> +---------------------+---------------+--------------------+
> | col_name            | data_type     | comment            |
> +---------------------+---------------+--------------------+
> | tbl_id              | bigint        | from deserializer  |
> | create_time         | int           | from deserializer  |
> | db_id               | bigint        | from deserializer  |
> | last_access_time    | int           | from deserializer  |
> | owner               | varchar(767)  | from deserializer  |
> | owner_type          | varchar(10)   | from deserializer  |
> | retention           | int           | from deserializer  |
> | sd_id               | bigint        | from deserializer  |
> | tbl_name            | varchar(256)  | from deserializer  |
> | tbl_type            | varchar(128)  | from deserializer  |
> | view_expanded_text  | string        | from deserializer  |
> | view_original_text  | string        | from deserializer  |
> | is_rewrite_enabled  | void          | from deserializer  |
> | write_id            | bigint        | from deserializer  |
> +---------------------+---------------+--------------------+
> {noformat} -- This message was sent by Atlassian Jira (v8.20.7#820007)
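The explicit conversion the description asks for can be pictured as a plain type-name lookup. The sketch below is a standalone illustration only; it is not Hive's actual ConnectorProvider API, and all class and method names here are hypothetical:

```java
import java.util.Map;

// Illustrative sketch: an explicit MySQL-to-Hive type-name mapping of the
// kind the issue requests. Unmapped types fall back to "void", mirroring
// the reported behaviour for MySQL's "bit". The chosen target types are
// assumptions, not Hive's confirmed mapping.
public class MySqlTypeMapping {
    private static final Map<String, String> MYSQL_TO_HIVE = Map.of(
        "bit", "boolean",       // the missing case from the report
        "tinyint", "tinyint",
        "bigint", "bigint",
        "varchar", "string",
        "text", "string"
    );

    public static String toHiveType(String mysqlType) {
        // Case-insensitive lookup; unknown types default to "void"
        return MYSQL_TO_HIVE.getOrDefault(mysqlType.toLowerCase(), "void");
    }

    public static void main(String[] args) {
        if (!toHiveType("bit").equals("boolean")) throw new AssertionError();
        if (!toHiveType("BIGINT").equals("bigint")) throw new AssertionError();
        if (!toHiveType("geometry").equals("void")) throw new AssertionError();
    }
}
```

The real fix would hook such a lookup into the metastore's MySQL connector provider so that `describe` reports a usable type instead of void.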
[jira] [Updated] (HIVE-26228) Implement Iceberg table rollback feature
[ https://issues.apache.org/jira/browse/HIVE-26228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] László Pintér updated HIVE-26228: - Description: We should allow rolling back iceberg table's data to the state at an older table snapshot. Rollback to the last snapshot before a specific timestamp {code:java} ALTER TABLE ice_t EXECUTE ROLLBACK('2022-05-12 00:00:00') {code} Rollback to a specific snapshot ID {code:java} ALTER TABLE ice_t EXECUTE ROLLBACK(); {code} was: We should allow rolling back iceberg table's data to the state at an older table snapshot. Rollback to the last snapshot before a specific timestamp {code:java} ALTER TABLE ice_t EXECUTE ROLLBACK('1231244334324'); {code} Rollback to a specific snapshot ID {code:java} ALTER TABLE ice_t EXECUTE ROLLBACK(); {code} > Implement Iceberg table rollback feature > > > Key: HIVE-26228 > URL: https://issues.apache.org/jira/browse/HIVE-26228 > Project: Hive > Issue Type: New Feature >Reporter: László Pintér >Assignee: László Pintér >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > We should allow rolling back iceberg table's data to the state at an older > table snapshot. > Rollback to the last snapshot before a specific timestamp > {code:java} > ALTER TABLE ice_t EXECUTE ROLLBACK('2022-05-12 00:00:00') > {code} > Rollback to a specific snapshot ID > {code:java} > ALTER TABLE ice_t EXECUTE ROLLBACK(); > {code} -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-26228) Implement Iceberg table rollback feature
[ https://issues.apache.org/jira/browse/HIVE-26228?focusedWorklogId=769665&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-769665 ] ASF GitHub Bot logged work on HIVE-26228: - Author: ASF GitHub Bot Created on: 12/May/22 13:59 Start Date: 12/May/22 13:59 Worklog Time Spent: 10m Work Description: pvary commented on PR #3287: URL: https://github.com/apache/hive/pull/3287#issuecomment-1125031319 @lcspinter: Could we use a timestamp in the same format as we receive it back from the metadata table query, like `select * from default.table_to_rollback.history`? Issue Time Tracking --- Worklog Id: (was: 769665) Time Spent: 20m (was: 10m) > Implement Iceberg table rollback feature > > > Key: HIVE-26228 > URL: https://issues.apache.org/jira/browse/HIVE-26228 > Project: Hive > Issue Type: New Feature >Reporter: László Pintér >Assignee: László Pintér >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > We should allow rolling back iceberg table's data to the state at an older > table snapshot. > Rollback to the last snapshot before a specific timestamp > {code:java} > ALTER TABLE ice_t EXECUTE ROLLBACK('1231244334324'); > {code} > Rollback to a specific snapshot ID > {code:java} > ALTER TABLE ice_t EXECUTE ROLLBACK(); > {code} -- This message was sent by Atlassian Jira (v8.20.7#820007)
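Putting the review suggestion together with the proposed syntax, intended usage might look like the following sketch (table name, snapshot id, and the timestamp argument format are illustrative; the exact format was still under discussion on the PR):

{code:sql}
-- Inspect the snapshot history via the Iceberg metadata table first:
SELECT * FROM default.ice_t.history;

-- Roll back to an explicit snapshot id taken from the history output ...
ALTER TABLE ice_t EXECUTE ROLLBACK(1234567890123456789);

-- ... or to the last snapshot before a given timestamp:
ALTER TABLE ice_t EXECUTE ROLLBACK('2022-05-12 00:00:00');
{code}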
[jira] [Resolved] (HIVE-26202) Refactor Iceberg Writers
[ https://issues.apache.org/jira/browse/HIVE-26202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Vary resolved HIVE-26202. --- Fix Version/s: 4.0.0 Resolution: Fixed Pushed to master. Thanks for the review [~lpinter]! > Refactor Iceberg Writers > > > Key: HIVE-26202 > URL: https://issues.apache.org/jira/browse/HIVE-26202 > Project: Hive > Issue Type: Improvement >Reporter: Peter Vary >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 1h 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-26202) Refactor Iceberg Writers
[ https://issues.apache.org/jira/browse/HIVE-26202?focusedWorklogId=769660&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-769660 ] ASF GitHub Bot logged work on HIVE-26202: - Author: ASF GitHub Bot Created on: 12/May/22 13:53 Start Date: 12/May/22 13:53 Worklog Time Spent: 10m Work Description: pvary merged PR #3269: URL: https://github.com/apache/hive/pull/3269 Issue Time Tracking --- Worklog Id: (was: 769660) Time Spent: 1h 20m (was: 1h 10m) > Refactor Iceberg Writers > > > Key: HIVE-26202 > URL: https://issues.apache.org/jira/browse/HIVE-26202 > Project: Hive > Issue Type: Improvement >Reporter: Peter Vary >Priority: Major > Labels: pull-request-available > Time Spent: 1h 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Commented] (HIVE-26226) Remove outdated dependency to jdk.tools:jdk.tools:jar:1.7 (upgrade-acid)
[ https://issues.apache.org/jira/browse/HIVE-26226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17536121#comment-17536121 ] Sylwester Lachiewicz commented on HIVE-26226: - I found this issue only inside upgrade-acid - it prevents me from loading the projects with my IDE. I'm using Maven 3.8.5 and Java 18, so even more challenges. I don't think this small update would affect executing the tests (CI is green) - all code will still be compiled for Java 8 as before. Where have you found other jdk.tools references? mvn org.apache.maven.plugins:maven-dependency-plugin:3.3.0:tree -Dincludes=jdk.tools:jdk.tools doesn't show anything more. I'm checking branch-3.1 now. btw. [https://github.com/apache/hive/pull/3286] > Remove outdated dependency to jdk.tools:jdk.tools:jar:1.7 (upgrade-acid) > > > Key: HIVE-26226 > URL: https://issues.apache.org/jira/browse/HIVE-26226 > Project: Hive > Issue Type: Improvement > Components: Tests >Affects Versions: 3.1.3, 4.0.0-alpha-2 >Reporter: Sylwester Lachiewicz >Priority: Minor > > The hive-metastore 2.3.3 used in upgrade-acid tests includes unnecessary > dependency - that blocks the possibility to compile with newer java versions > > 8 -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-26228) Implement Iceberg table rollback feature
[ https://issues.apache.org/jira/browse/HIVE-26228?focusedWorklogId=769657&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-769657 ] ASF GitHub Bot logged work on HIVE-26228: - Author: ASF GitHub Bot Created on: 12/May/22 13:47 Start Date: 12/May/22 13:47 Worklog Time Spent: 10m Work Description: lcspinter opened a new pull request, #3287: URL: https://github.com/apache/hive/pull/3287 ### What changes were proposed in this pull request? Provide syntax support for the Iceberg table rollback operation ### Why are the changes needed? The Iceberg API already exposes a rollback operation. We should support it from Hive queries as well. ### Does this PR introduce _any_ user-facing change? The end user can execute a rollback to a specific snapshot ID `ALTER TABLE ice_t EXECUTE ROLLBACK(1)` or to a snapshot before a given timestamp `ALTER TABLE ice_t EXECUTE ROLLBACK('232124134213213')` ### How was this patch tested? Manual test, unit test Issue Time Tracking --- Worklog Id: (was: 769657) Remaining Estimate: 0h Time Spent: 10m > Implement Iceberg table rollback feature > > > Key: HIVE-26228 > URL: https://issues.apache.org/jira/browse/HIVE-26228 > Project: Hive > Issue Type: New Feature >Reporter: László Pintér >Assignee: László Pintér >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > We should allow rolling back iceberg table's data to the state at an older > table snapshot. > Rollback to the last snapshot before a specific timestamp > {code:java} > ALTER TABLE ice_t EXECUTE ROLLBACK('1231244334324'); > {code} > Rollback to a specific snapshot ID > {code:java} > ALTER TABLE ice_t EXECUTE ROLLBACK(); > {code} -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Updated] (HIVE-26228) Implement Iceberg table rollback feature
[ https://issues.apache.org/jira/browse/HIVE-26228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-26228: -- Labels: pull-request-available (was: ) > Implement Iceberg table rollback feature > > > Key: HIVE-26228 > URL: https://issues.apache.org/jira/browse/HIVE-26228 > Project: Hive > Issue Type: New Feature >Reporter: László Pintér >Assignee: László Pintér >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > We should allow rolling back iceberg table's data to the state at an older > table snapshot. > Rollback to the last snapshot before a specific timestamp > {code:java} > ALTER TABLE ice_t EXECUTE ROLLBACK('1231244334324'); > {code} > Rollback to a specific snapshot ID > {code:java} > ALTER TABLE ice_t EXECUTE ROLLBACK(); > {code} -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Assigned] (HIVE-26228) Implement Iceberg table rollback feature
[ https://issues.apache.org/jira/browse/HIVE-26228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] László Pintér reassigned HIVE-26228: > Implement Iceberg table rollback feature > > > Key: HIVE-26228 > URL: https://issues.apache.org/jira/browse/HIVE-26228 > Project: Hive > Issue Type: New Feature >Reporter: László Pintér >Assignee: László Pintér >Priority: Major > > We should allow rolling back iceberg table's data to the state at an older > table snapshot. > Rollback to the last snapshot before a specific timestamp > {code:java} > ALTER TABLE ice_t EXECUTE ROLLBACK('1231244334324'); > {code} > Rollback to a specific snapshot ID > {code:java} > ALTER TABLE ice_t EXECUTE ROLLBACK(); > {code} -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Commented] (HIVE-26226) Remove outdated dependency to jdk.tools:jdk.tools:jar:1.7 (upgrade-acid)
[ https://issues.apache.org/jira/browse/HIVE-26226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17536109#comment-17536109 ] Stamatis Zampetakis commented on HIVE-26226: Since jdk.tools comes from a dependency with provided scope, if compilation and tests run fine after removal I guess it is safe. There are some caveats though, because if the existing tests do not reach the code which needs the removed dependency, then by removing a dependency we could be breaking something without noticing. However, I doubt that jdk.tools is ever needed at runtime, so we should be good to go. Other than that, I am a bit skeptical about the upgrade-acid module and compilation with JDK 11. The {{PreUpgradeTool}} is meant to run with Hive 2 binaries, which are compiled with jdk <= 8. What will happen if we now compile the {{PreUpgradeTool}} with JDK 11? Which JDK should the end-user use to run the {{PreUpgradeTool}}? > Remove outdated dependency to jdk.tools:jdk.tools:jar:1.7 (upgrade-acid) > > > Key: HIVE-26226 > URL: https://issues.apache.org/jira/browse/HIVE-26226 > Project: Hive > Issue Type: Improvement > Components: Tests >Affects Versions: 3.1.3, 4.0.0-alpha-2 >Reporter: Sylwester Lachiewicz >Priority: Minor > > The hive-metastore 2.3.3 used in upgrade-acid tests includes unnecessary > dependency - that blocks the possibility to compile with newer java versions > > 8 -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Commented] (HIVE-26226) Remove outdated dependency to jdk.tools:jdk.tools:jar:1.7 (upgrade-acid)
[ https://issues.apache.org/jira/browse/HIVE-26226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17536106#comment-17536106 ] Stamatis Zampetakis commented on HIVE-26226: There are various places where jdk.tools appears. Most of the time it comes transitively from hadoop related modules but I see that it also comes from tez-api. Is there any particular reason that this JIRA and PR focus exclusively on upgrade-acid module? > Remove outdated dependency to jdk.tools:jdk.tools:jar:1.7 (upgrade-acid) > > > Key: HIVE-26226 > URL: https://issues.apache.org/jira/browse/HIVE-26226 > Project: Hive > Issue Type: Improvement > Components: Tests >Affects Versions: 3.1.3, 4.0.0-alpha-2 >Reporter: Sylwester Lachiewicz >Priority: Minor > > The hive-metastore 2.3.3 used in upgrade-acid tests includes unnecessary > dependency - that blocks the possibility to compile with newer java versions > > 8 -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-24484) Upgrade Hadoop to 3.3.1
[ https://issues.apache.org/jira/browse/HIVE-24484?focusedWorklogId=769624&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-769624 ] ASF GitHub Bot logged work on HIVE-24484: - Author: ASF GitHub Bot Created on: 12/May/22 13:07 Start Date: 12/May/22 13:07 Worklog Time Spent: 10m Work Description: ayushtkn commented on code in PR #3279: URL: https://github.com/apache/hive/pull/3279#discussion_r871357800 ## standalone-metastore/pom.xml: ## @@ -227,6 +227,10 @@ hadoop-mapreduce-client-core ${hadoop.version} + +org.jline +jline + Review Comment: yeps, the best answer is to upgrade Jline, which was stuck. So, I thought to upgrade Hadoop that shouldn't block if possible, we are already on 3.1.0 which died long back ## storage-api/src/java/org/apache/hadoop/hive/common/ValidReadTxnList.java: ## @@ -18,10 +18,10 @@ package org.apache.hadoop.hive.common; -import org.apache.commons.lang.StringUtils; +import org.apache.commons.lang3.StringUtils; Review Comment: Code doesn't compile with this. It is already marked as banned import, guess the logic has flaw. https://github.com/apache/hive/blob/master/pom.xml#L1529 The dependency was getting pulled in from Hadoop & now it isn't there, so I have to change it to make it compile Issue Time Tracking --- Worklog Id: (was: 769624) Time Spent: 9h 13m (was: 9.05h) > Upgrade Hadoop to 3.3.1 > --- > > Key: HIVE-24484 > URL: https://issues.apache.org/jira/browse/HIVE-24484 > Project: Hive > Issue Type: Improvement >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Major > Labels: pull-request-available > Time Spent: 9h 13m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-24484) Upgrade Hadoop to 3.3.1
[ https://issues.apache.org/jira/browse/HIVE-24484?focusedWorklogId=769620&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-769620 ] ASF GitHub Bot logged work on HIVE-24484: - Author: ASF GitHub Bot Created on: 12/May/22 13:05 Start Date: 12/May/22 13:05 Worklog Time Spent: 10m Work Description: ayushtkn commented on code in PR #3279: URL: https://github.com/apache/hive/pull/3279#discussion_r871355730 ## standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java: ## @@ -9361,7 +9362,8 @@ public NotificationEventsCountResponse get_notification_events_count(Notificatio private void authorizeProxyPrivilege() throws TException { // Skip the auth in embedded mode or if the auth is disabled if (!HiveMetaStore.isMetaStoreRemote() || -!MetastoreConf.getBoolVar(conf, ConfVars.EVENT_DB_NOTIFICATION_API_AUTH)) { +!MetastoreConf.getBoolVar(conf, ConfVars.EVENT_DB_NOTIFICATION_API_AUTH) || conf.getBoolean(HIVE_IN_TEST.getVarname(), +false)) { Review Comment: It is covered via test in TestReplicationScenarios#testAuthForNotificationAPIs This method is also used mostly in replication context only I suppose for getting NotificationLog entries... Issue Time Tracking --- Worklog Id: (was: 769620) Time Spent: 9.05h (was: 8h 53m) > Upgrade Hadoop to 3.3.1 > --- > > Key: HIVE-24484 > URL: https://issues.apache.org/jira/browse/HIVE-24484 > Project: Hive > Issue Type: Improvement >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Major > Labels: pull-request-available > Time Spent: 9.05h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-24484) Upgrade Hadoop to 3.3.1
[ https://issues.apache.org/jira/browse/HIVE-24484?focusedWorklogId=769619&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-769619 ] ASF GitHub Bot logged work on HIVE-24484: - Author: ASF GitHub Bot Created on: 12/May/22 13:04 Start Date: 12/May/22 13:04 Worklog Time Spent: 10m Work Description: ayushtkn commented on code in PR #3279: URL: https://github.com/apache/hive/pull/3279#discussion_r871355509 ## ql/src/java/org/apache/hadoop/hive/ql/io/RecordReaderWrapper.java: ## @@ -69,7 +70,14 @@ static RecordReader create(InputFormat inputFormat, HiveInputFormat.HiveInputSpl JobConf jobConf, Reporter reporter) throws IOException { int headerCount = Utilities.getHeaderCount(tableDesc); int footerCount = Utilities.getFooterCount(tableDesc, jobConf); -RecordReader innerReader = inputFormat.getRecordReader(split.getInputSplit(), jobConf, reporter); + +RecordReader innerReader = null; +try { + innerReader = inputFormat.getRecordReader(split.getInputSplit(), jobConf, reporter); +} catch (InterruptedIOException iioe) { + // If reading from the underlying record reader is interrupted, return a no-op record reader Review Comment: Answer is here & this does fixes a couple of test so I picked it: https://github.com/apache/hive/pull/1742/files#r674896581 ## itests/pom.xml: ## @@ -352,6 +352,12 @@ org.apache.hadoop hadoop-yarn-client ${hadoop.version} + + +org.jline +jline + Review Comment: Just tried. Started a Hive cluster with derby, init hive db, started HS2, then beeline. show databases; show tables; create table emp(id int) insert into emp values (1),(2),(3),(4); select * from emp; show create table emp; Jline was used in Beeline, I think it should have broken that. Let me know what else can be tested. 
## ql/src/java/org/apache/hadoop/hive/ql/security/authorization/StorageBasedAuthorizationProvider.java: ## @@ -178,8 +178,7 @@ public void authorize(Database db, Privilege[] readRequiredPriv, Privilege[] wri private static boolean userHasProxyPrivilege(String user, Configuration conf) { try { - if (MetaStoreServerUtils.checkUserHasHostProxyPrivileges(user, conf, - HMSHandler.getIPAddress())) { + if (MetaStoreServerUtils.checkUserHasHostProxyPrivileges(user, conf, HMSHandler.getIPAddress())) { Review Comment: Max LineLength allowed I guess is 120? https://github.com/apache/hive/blob/master/checkstyle/checkstyle.xml#L159-L160 Issue Time Tracking --- Worklog Id: (was: 769619) Time Spent: 8h 53m (was: 8h 43m) > Upgrade Hadoop to 3.3.1 > --- > > Key: HIVE-24484 > URL: https://issues.apache.org/jira/browse/HIVE-24484 > Project: Hive > Issue Type: Improvement >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Major > Labels: pull-request-available > Time Spent: 8h 53m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Updated] (HIVE-26217) Make CTAS use Direct Insert Semantics
[ https://issues.apache.org/jira/browse/HIVE-26217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sourabh Badhya updated HIVE-26217: -- Description: CTAS on transactional tables currently does a copy from staging location to table location. This can be avoided by using Direct Insert semantics. Added support for suffixed table locations as well. (was: CTAS on transactional tables currently does a copy from staging location to table location. This can be avoided by using Direct Insert semantics. However the table location must be suffixed with the transaction identifier so that if the data is not committed then this location will not be used by subsequent queries which create same table again. Follow-up will be to clean any uncommitted data if CTAS operation has failed.) > Make CTAS use Direct Insert Semantics > - > > Key: HIVE-26217 > URL: https://issues.apache.org/jira/browse/HIVE-26217 > Project: Hive > Issue Type: Improvement >Reporter: Sourabh Badhya >Assignee: Sourabh Badhya >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > CTAS on transactional tables currently does a copy from staging location to > table location. This can be avoided by using Direct Insert semantics. Added > support for suffixed table locations as well. -- This message was sent by Atlassian Jira (v8.20.7#820007)
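For illustration, this is the kind of statement affected by the change (table names are made up; the optimization itself requires no syntax change, only the write path differs):

{code:sql}
-- CTAS into a transactional (ACID) table: previously the data was written
-- to a staging directory and then copied to the table location; with direct
-- insert semantics it is written to the (possibly suffixed) table location.
CREATE TABLE target_acid STORED AS ORC
  TBLPROPERTIES ('transactional'='true')
AS SELECT * FROM source_tbl;
{code}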
[jira] [Commented] (HIVE-26226) Remove outdated dependency to jdk.tools:jdk.tools:jar:1.7 (upgrade-acid)
[ https://issues.apache.org/jira/browse/HIVE-26226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17536103#comment-17536103 ] Alessandro Solimando commented on HIVE-26226: - From the message in the ML: {noformat} I guess it's safe to add this exclusion, since the scope of the dependency is "provided" (meaning that the dependency is expected to be in the classpath already at runtime, so the exclusion won't interfere with that, and nothing is packaged differently by Hive due to the exclusion), and both compilation under JDK8 and the run of the full test suite in CI were OK.{noformat} Looking forward to hearing more opinions on this. > Remove outdated dependency to jdk.tools:jdk.tools:jar:1.7 (upgrade-acid) > > > Key: HIVE-26226 > URL: https://issues.apache.org/jira/browse/HIVE-26226 > Project: Hive > Issue Type: Improvement > Components: Tests >Affects Versions: 3.1.3, 4.0.0-alpha-2 >Reporter: Sylwester Lachiewicz >Priority: Minor > > The hive-metastore 2.3.3 used in upgrade-acid tests includes unnecessary > dependency - that blocks the possibility to compile with newer java versions > > 8 -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Commented] (HIVE-26226) Remove outdated dependency to jdk.tools:jdk.tools:jar:1.7 (upgrade-acid)
[ https://issues.apache.org/jira/browse/HIVE-26226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17536100#comment-17536100 ] Sylwester Lachiewicz commented on HIVE-26226: - Yes, I was convinced that I had already seen this problem reported somewhere, but I did not find it in Jira. I simply excluded this dependency (see the linked PRs for 3.1.x and 4.x). > Remove outdated dependency to jdk.tools:jdk.tools:jar:1.7 (upgrade-acid) > > > Key: HIVE-26226 > URL: https://issues.apache.org/jira/browse/HIVE-26226 > Project: Hive > Issue Type: Improvement > Components: Tests >Affects Versions: 3.1.3, 4.0.0-alpha-2 >Reporter: Sylwester Lachiewicz >Priority: Minor > > The hive-metastore 2.3.3 used in upgrade-acid tests includes unnecessary > dependency - that blocks the possibility to compile with newer java versions > > 8 -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Commented] (HIVE-26226) Remove outdated dependency to jdk.tools:jdk.tools:jar:1.7 (upgrade-acid)
[ https://issues.apache.org/jira/browse/HIVE-26226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17536095#comment-17536095 ] Alessandro Solimando commented on HIVE-26226: - FYI I have approached the same issue in the ML recently: [https://lists.apache.org/thread/rmdkys6ofsqslhbl2fd7yjxvsttt6wb1] Removal is as simple as an exclusion in the pom (more details in the ML message). > Remove outdated dependency to jdk.tools:jdk.tools:jar:1.7 (upgrade-acid) > > > Key: HIVE-26226 > URL: https://issues.apache.org/jira/browse/HIVE-26226 > Project: Hive > Issue Type: Improvement > Components: Tests >Affects Versions: 3.1.3, 4.0.0-alpha-2 >Reporter: Sylwester Lachiewicz >Priority: Minor > > The hive-metastore 2.3.3 used in upgrade-acid tests includes unnecessary > dependency - that blocks the possibility to compile with newer java versions > > 8 -- This message was sent by Atlassian Jira (v8.20.7#820007)
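For reference, the pom exclusion described in the ML message would look roughly like this (a sketch: the hive-metastore coordinates are taken from the dependency chain reported above, but the exact dependency element it attaches to in the upgrade-acid pom may differ):

{code:xml}
<dependency>
  <groupId>org.apache.hive</groupId>
  <artifactId>hive-metastore</artifactId>
  <version>2.3.3</version>
  <exclusions>
    <!-- jdk.tools no longer exists in JDK 9+, so exclude the transitive
         pull-in coming via hbase-annotations -->
    <exclusion>
      <groupId>jdk.tools</groupId>
      <artifactId>jdk.tools</artifactId>
    </exclusion>
  </exclusions>
</dependency>
{code}

Since the dependency has provided scope, the exclusion changes nothing in the packaged artifacts; it only stops Maven from resolving a jar that newer JDKs no longer ship.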
[jira] [Work logged] (HIVE-24484) Upgrade Hadoop to 3.3.1
[ https://issues.apache.org/jira/browse/HIVE-24484?focusedWorklogId=769572&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-769572 ] ASF GitHub Bot logged work on HIVE-24484: - Author: ASF GitHub Bot Created on: 12/May/22 12:07 Start Date: 12/May/22 12:07 Worklog Time Spent: 10m Work Description: kgyrtkirk commented on code in PR #3279: URL: https://github.com/apache/hive/pull/3279#discussion_r871272145 ## common/pom.xml: ## @@ -195,6 +194,11 @@ tez-api ${tez.version} + + org.fusesource.jansi + jansi + 2.3.4 Review Comment: move version to root pom ## itests/pom.xml: ## @@ -352,6 +352,12 @@ org.apache.hadoop hadoop-yarn-client ${hadoop.version} + + +org.jline +jline + Review Comment: I'm not sure if this fix will work; it could work for the tests; but you've just excluded the dependency; I think that will not prevent that dep from appearing on the classpath during runtime... have you tested a dist build as well? ## ql/src/java/org/apache/hadoop/hive/ql/io/RecordReaderWrapper.java: ## @@ -69,7 +70,14 @@ static RecordReader create(InputFormat inputFormat, HiveInputFormat.HiveInputSpl JobConf jobConf, Reporter reporter) throws IOException { int headerCount = Utilities.getHeaderCount(tableDesc); int footerCount = Utilities.getFooterCount(tableDesc, jobConf); -RecordReader innerReader = inputFormat.getRecordReader(split.getInputSplit(), jobConf, reporter); + +RecordReader innerReader = null; +try { + innerReader = inputFormat.getRecordReader(split.getInputSplit(), jobConf, reporter); +} catch (InterruptedIOException iioe) { + // If reading from the underlying record reader is interrupted, return a no-op record reader Review Comment: why not simply propagate the `Exception` ? 
This will hide away the exception ## ql/src/java/org/apache/hadoop/hive/ql/security/authorization/StorageBasedAuthorizationProvider.java: ## @@ -178,8 +178,7 @@ public void authorize(Database db, Privilege[] readRequiredPriv, Privilege[] wri private static boolean userHasProxyPrivilege(String user, Configuration conf) { try { - if (MetaStoreServerUtils.checkUserHasHostProxyPrivileges(user, conf, - HMSHandler.getIPAddress())) { + if (MetaStoreServerUtils.checkUserHasHostProxyPrivileges(user, conf, HMSHandler.getIPAddress())) { Review Comment: I think max_linelength should be <=100 ; are you using the `dev-support/eclipse-styles.xml` ? ## streaming/src/test/org/apache/hive/streaming/TestStreaming.java: ## @@ -1317,6 +1318,11 @@ public void testTransactionBatchEmptyCommit() throws Exception { connection.close(); } + /** + * Starting with HDFS 3.3.1, the underlying system NOW SUPPORTS hflush so this + * test fails. Review Comment: ok; then I think this test could be probably converted into a test which checks that it works ## hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatMultiOutputFormat.java: ## @@ -315,18 +320,19 @@ public void testOutputFormat() throws Throwable { // Check permisssion on partition dirs and files created for (int i = 0; i < tableNames.length; i++) { - Path partitionFile = new Path(warehousedir + "/" + tableNames[i] -+ "/ds=1/cluster=ag/part-m-0"); - FileSystem fs = partitionFile.getFileSystem(mrConf); - Assert.assertEquals("File permissions of table " + tableNames[i] + " is not correct", -fs.getFileStatus(partitionFile).getPermission(), -new FsPermission(tablePerms[i])); - Assert.assertEquals("File permissions of table " + tableNames[i] + " is not correct", -fs.getFileStatus(partitionFile.getParent()).getPermission(), -new FsPermission(tablePerms[i])); - Assert.assertEquals("File permissions of table " + tableNames[i] + " is not correct", - fs.getFileStatus(partitionFile.getParent().getParent()).getPermission(), -new 
FsPermission(tablePerms[i])); + final Path partitionFile = new Path(warehousedir + "/" + tableNames[i] + "/ds=1/cluster=ag/part-m-0"); + final Path grandParentOfPartitionFile = partitionFile.getParent(); Review Comment: I would expect `grandParent` to be parent-of-parent; I think this change could be revoked - it was more readable earlier; the last assert now checks for the `parent` dir and not for `parent.parent`; the second assert was also clobbered ## itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationOnHDFSEncryptedZones.java: ## @@ -123,57 +122,24 @@ public void targetAndSourceHaveDifferentEncryptionZoneKeys() throws Thr
[jira] [Commented] (HIVE-26226) Remove outdated dependency to jdk.tools:jdk.tools:jar:1.7 (upgrade-acid)
[ https://issues.apache.org/jira/browse/HIVE-26226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17536063#comment-17536063 ] Stamatis Zampetakis commented on HIVE-26226: [~slachiewicz] Are you sure that it is unnecessary? How do you plan to remove it? > Remove outdated dependency to jdk.tools:jdk.tools:jar:1.7 (upgrade-acid) > > > Key: HIVE-26226 > URL: https://issues.apache.org/jira/browse/HIVE-26226 > Project: Hive > Issue Type: Improvement > Components: Tests >Affects Versions: 3.1.3, 4.0.0-alpha-2 >Reporter: Sylwester Lachiewicz >Priority: Minor > > The hive-metastore 2.3.3 used in upgrade-acid tests includes unnecessary > dependency - that blocks the possibility to compile with newer java versions > > 8 -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-24484) Upgrade Hadoop to 3.3.1
[ https://issues.apache.org/jira/browse/HIVE-24484?focusedWorklogId=769545&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-769545 ] ASF GitHub Bot logged work on HIVE-24484: - Author: ASF GitHub Bot Created on: 12/May/22 11:05 Start Date: 12/May/22 11:05 Worklog Time Spent: 10m Work Description: ayushtkn commented on PR #3279: URL: https://github.com/apache/hive/pull/3279#issuecomment-1124857444 My last run here had 4 errors, of which I think I have fixed 3 more. The one remaining is some XML parsing error, which I think might get auto-resolved or may be an aftereffect. @kgyrtkirk I have sorted out the JLine issue here as well, which you mentioned in the previous PR. Can you take a look? Issue Time Tracking --- Worklog Id: (was: 769545) Time Spent: 8.55h (was: 8h 23m) > Upgrade Hadoop to 3.3.1 > --- > > Key: HIVE-24484 > URL: https://issues.apache.org/jira/browse/HIVE-24484 > Project: Hive > Issue Type: Improvement >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Major > Labels: pull-request-available > Time Spent: 8.55h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Updated] (HIVE-26227) Add support of catalog related statements for Hive ql
[ https://issues.apache.org/jira/browse/HIVE-26227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wechar updated HIVE-26227: -- Description: Catalog concept is proposed to Hive 3.0 to allow different systems to connect to different catalogs in the metastore. But so far we can not query catalog through Hive ql, this task aims to implement the ddl statements related to catalog. *Create Catalog* {code:sql} CREATE CATALOG [IF NOT EXISTS] catalog_name LOCATION hdfs_path [COMMENT catalog_comment]; {code} LOCATION is required for creating a new catalog now. *Alter Catalog* {code:sql} ALTER CATALOG catalog_name SET LOCATION hdfs_path; {code} Only location metadata can be altered for catalog. *Drop Catalog* {code:sql} DROP CATALOG [IF EXISTS] catalog_name; {code} DROP CATALOG is always RESTRICT, which means DROP CATALOG will fail if there are non-default databases in the catalog. *Show Catalogs* {code:sql} SHOW CATALOGS [LIKE 'identifier_with_wildcards']; {code} SHOW CATALOGS lists all of the catalogs defined in the metastore. The optional LIKE clause allows the list of catalogs to be filtered using a regular expression. *Describe Catalog* {code:sql} DESC[RIBE] CATALOG [EXTENDED] cat_name; {code} DESCRIBE CATALOG shows the name of the catalog, its comment (if one has been set), and its root location on the filesystem. EXTENDED also shows the create time. was: Catalog concept is proposed to Hive 3.0 to allow different systems to connect to different catalogs in the metastore. But so far we can not query catalog through Hive ql, this task aims to implement the ddl statements related to catalog. *Create Catalog* {code:sql} CREATE CATALOG [IF NOT EXISTS] catalog_name LOCATION hdfs_path [COMMENT catalog_comment]; {code} LOCATION is required for creating a new catalog now. *Alter Catalog* {code:sql} ALTER CATALOG catalog_name SET LOCATION hdfs_path; {code} Only location metadata can be altered for catalog. 
*Drop Catalog* {code:sql} DROP CATALOG [IF EXISTS] catalog_name; {code} DROP CATALOG is always RESTRICT, which means DROP CATALOG will fail if there are non-default databases in the catalog. *Show Catalog* {code:sql} SHOW CATALOGS [LIKE 'identifier_with_wildcards']; {code} SHOW CATALOGS lists all of the catalogs defined in the metastore. The optional LIKE clause allows the list of catalogs to be filtered using a regular expression. *Describe Catalog* {code:sql} DESC[RIBE] CATALOG [EXTENDED] cat_name; {code} DESCRIBE CATALOG shows the name of the catalog, its comment (if one has been set), and its root location on the filesystem. EXTENDED also shows the create time. > Add support of catalog related statements for Hive ql > - > > Key: HIVE-26227 > URL: https://issues.apache.org/jira/browse/HIVE-26227 > Project: Hive > Issue Type: Task > Components: Hive >Reporter: Wechar >Assignee: Wechar >Priority: Minor > Fix For: 4.0.0-alpha-2 > > > Catalog concept is proposed to Hive 3.0 to allow different systems to connect > to different catalogs in the metastore. But so far we can not query catalog > through Hive ql, this task aims to implement the ddl statements related to > catalog. > *Create Catalog* > {code:sql} > CREATE CATALOG [IF NOT EXISTS] catalog_name > LOCATION hdfs_path > [COMMENT catalog_comment]; > {code} > LOCATION is required for creating a new catalog now. > *Alter Catalog* > {code:sql} > ALTER CATALOG catalog_name SET LOCATION hdfs_path; > {code} > Only location metadata can be altered for catalog. > *Drop Catalog* > {code:sql} > DROP CATALOG [IF EXISTS] catalog_name; > {code} > DROP CATALOG is always RESTRICT, which means DROP CATALOG will fail if there > are non-default databases in the catalog. > *Show Catalogs* > {code:sql} > SHOW CATALOGS [LIKE 'identifier_with_wildcards']; > {code} > SHOW CATALOGS lists all of the catalogs defined in the metastore. > The optional LIKE clause allows the list of catalogs to be filtered using a > regular expression. 
> *Describe Catalog* > {code:sql} > DESC[RIBE] CATALOG [EXTENDED] cat_name; > {code} > DESCRIBE CATALOG shows the name of the catalog, its comment (if one has been > set), and its root location on the filesystem. > EXTENDED also shows the create time. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Assigned] (HIVE-26227) Add support of catalog related statements for Hive ql
[ https://issues.apache.org/jira/browse/HIVE-26227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wechar reassigned HIVE-26227: - > Add support of catalog related statements for Hive ql > - > > Key: HIVE-26227 > URL: https://issues.apache.org/jira/browse/HIVE-26227 > Project: Hive > Issue Type: Task > Components: Hive >Reporter: Wechar >Assignee: Wechar >Priority: Minor > Fix For: 4.0.0-alpha-2 > > > Catalog concept is proposed to Hive 3.0 to allow different systems to connect > to different catalogs in the metastore. But so far we can not query catalog > through Hive ql, this task aims to implement the ddl statements related to > catalog. > *Create Catalog* > {code:sql} > CREATE CATALOG [IF NOT EXISTS] catalog_name > LOCATION hdfs_path > [COMMENT catalog_comment]; > {code} > LOCATION is required for creating a new catalog now. > *Alter Catalog* > {code:sql} > ALTER CATALOG catalog_name SET LOCATION hdfs_path; > {code} > Only location metadata can be altered for catalog. > *Drop Catalog* > {code:sql} > DROP CATALOG [IF EXISTS] catalog_name; > {code} > DROP CATALOG is always RESTRICT, which means DROP CATALOG will fail if there > are non-default databases in the catalog. > *Show Catalog* > {code:sql} > SHOW CATALOGS [LIKE 'identifier_with_wildcards']; > {code} > SHOW CATALOGS lists all of the catalogs defined in the metastore. > The optional LIKE clause allows the list of catalogs to be filtered using a > regular expression. > *Describe Catalog* > {code:sql} > DESC[RIBE] CATALOG [EXTENDED] cat_name; > {code} > DESCRIBE CATALOG shows the name of the catalog, its comment (if one has been > set), and its root location on the filesystem. > EXTENDED also shows the create time. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Assigned] (HIVE-25976) Cleaner may remove files being accessed from a fetch-task-converted reader
[ https://issues.apache.org/jira/browse/HIVE-25976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] László Végh reassigned HIVE-25976: -- Assignee: László Végh > Cleaner may remove files being accessed from a fetch-task-converted reader > -- > > Key: HIVE-25976 > URL: https://issues.apache.org/jira/browse/HIVE-25976 > Project: Hive > Issue Type: Bug >Reporter: Zoltan Haindrich >Assignee: László Végh >Priority: Major > Attachments: fetch_task_conv_compactor_test.patch > > > in a nutshell the following happens: > * query is compiled in fetch-task-converted mode > * no real execution happensbut the locks are released > * the HS2 is communicating with the client and uses the fetch-task to get the > rows - which in this case will directly read files from the table's > directory > * client sleeps between reads - so there is ample time for other events... > * cleaner wakes up and removes some files > * in the next read the fetch-task encounters a read error... -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work started] (HIVE-25976) Cleaner may remove files being accessed from a fetch-task-converted reader
[ https://issues.apache.org/jira/browse/HIVE-25976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-25976 started by László Végh. -- > Cleaner may remove files being accessed from a fetch-task-converted reader > -- > > Key: HIVE-25976 > URL: https://issues.apache.org/jira/browse/HIVE-25976 > Project: Hive > Issue Type: Bug >Reporter: Zoltan Haindrich >Assignee: László Végh >Priority: Major > Attachments: fetch_task_conv_compactor_test.patch > > > in a nutshell the following happens: > * query is compiled in fetch-task-converted mode > * no real execution happensbut the locks are released > * the HS2 is communicating with the client and uses the fetch-task to get the > rows - which in this case will directly read files from the table's > directory > * client sleeps between reads - so there is ample time for other events... > * cleaner wakes up and removes some files > * in the next read the fetch-task encounters a read error... -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-26202) Refactor Iceberg Writers
[ https://issues.apache.org/jira/browse/HIVE-26202?focusedWorklogId=769441&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-769441 ] ASF GitHub Bot logged work on HIVE-26202: - Author: ASF GitHub Bot Created on: 12/May/22 07:04 Start Date: 12/May/22 07:04 Worklog Time Spent: 10m Work Description: pvary commented on code in PR #3269: URL: https://github.com/apache/hive/pull/3269#discussion_r871017909 ## iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandler.java: ## @@ -571,6 +571,15 @@ public static boolean isUpdate(Configuration conf, String tableName) { conf.get(InputFormatConfig.OPERATION_TYPE_PREFIX + tableName)); } + public static Operation operation(Configuration conf, String tableName) { Review Comment: I did this. Could you please check the result? ## iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/writer/WriterBuilder.java: ## @@ -0,0 +1,133 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ + +package org.apache.iceberg.mr.hive.writer; + +import java.util.Locale; +import java.util.Map; +import org.apache.hadoop.hive.ql.Context.Operation; +import org.apache.hadoop.mapred.TaskAttemptID; +import org.apache.iceberg.FileFormat; +import org.apache.iceberg.PartitionSpec; +import org.apache.iceberg.Schema; +import org.apache.iceberg.Table; +import org.apache.iceberg.TableProperties; +import org.apache.iceberg.io.FileIO; +import org.apache.iceberg.io.OutputFileFactory; +import org.apache.iceberg.util.PropertyUtil; + +import static org.apache.iceberg.TableProperties.DEFAULT_FILE_FORMAT; +import static org.apache.iceberg.TableProperties.DEFAULT_FILE_FORMAT_DEFAULT; +import static org.apache.iceberg.TableProperties.DELETE_DEFAULT_FILE_FORMAT; + +public class WriterBuilder { + private Table table; Review Comment: Done Issue Time Tracking --- Worklog Id: (was: 769441) Time Spent: 1h 10m (was: 1h) > Refactor Iceberg Writers > > > Key: HIVE-26202 > URL: https://issues.apache.org/jira/browse/HIVE-26202 > Project: Hive > Issue Type: Improvement >Reporter: Peter Vary >Priority: Major > Labels: pull-request-available > Time Spent: 1h 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-26223) Integrate ESRI GeoSpatial UDFs
[ https://issues.apache.org/jira/browse/HIVE-26223?focusedWorklogId=769438&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-769438 ] ASF GitHub Bot logged work on HIVE-26223: - Author: ASF GitHub Bot Created on: 12/May/22 07:00 Start Date: 12/May/22 07:00 Worklog Time Spent: 10m Work Description: ayushtkn opened a new pull request, #3283: URL: https://github.com/apache/hive/pull/3283 Adds GeoSpatial UDFs Issue Time Tracking --- Worklog Id: (was: 769438) Remaining Estimate: 0h Time Spent: 10m > Integrate ESRI GeoSpatial UDFs > --- > > Key: HIVE-26223 > URL: https://issues.apache.org/jira/browse/HIVE-26223 > Project: Hive > Issue Type: Sub-task >Reporter: Ayush Saxena >Assignee: Ayush Saxena >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > Add the GeoSpatial UDFs to hive -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-26225) Delete operations in ObjectStore.cleanWriteNotificationEvents should be performed in different transactions
[ https://issues.apache.org/jira/browse/HIVE-26225?focusedWorklogId=769437&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-769437 ] ASF GitHub Bot logged work on HIVE-26225: - Author: ASF GitHub Bot Created on: 12/May/22 07:00 Start Date: 12/May/22 07:00 Worklog Time Spent: 10m Work Description: hmangla98 opened a new pull request, #3282: URL: https://github.com/apache/hive/pull/3282 We need to improve the ObjectStore.cleanWriteNotificationEvents in the same way as it was done for notification log table in: https://issues.apache.org/jira/browse/HIVE-24432 Issue Time Tracking --- Worklog Id: (was: 769437) Remaining Estimate: 0h Time Spent: 10m > Delete operations in ObjectStore.cleanWriteNotificationEvents should be > performed in different transactions > --- > > Key: HIVE-26225 > URL: https://issues.apache.org/jira/browse/HIVE-26225 > Project: Hive > Issue Type: Improvement >Reporter: Haymant Mangla >Assignee: Haymant Mangla >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > We need to improve the ObjectStore.cleanWriteNotificationEvents in the same > way as it was done for notification log table in: > https://issues.apache.org/jira/browse/HIVE-24432 -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Updated] (HIVE-26225) Delete operations in ObjectStore.cleanWriteNotificationEvents should be performed in different transactions
[ https://issues.apache.org/jira/browse/HIVE-26225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-26225: -- Labels: pull-request-available (was: ) > Delete operations in ObjectStore.cleanWriteNotificationEvents should be > performed in different transactions > --- > > Key: HIVE-26225 > URL: https://issues.apache.org/jira/browse/HIVE-26225 > Project: Hive > Issue Type: Improvement >Reporter: Haymant Mangla >Assignee: Haymant Mangla >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > We need to improve the ObjectStore.cleanWriteNotificationEvents in the same > way as it was done for notification log table in: > https://issues.apache.org/jira/browse/HIVE-24432 -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Updated] (HIVE-26223) Integrate ESRI GeoSpatial UDFs
[ https://issues.apache.org/jira/browse/HIVE-26223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-26223: -- Labels: pull-request-available (was: ) > Integrate ESRI GeoSpatial UDFs > --- > > Key: HIVE-26223 > URL: https://issues.apache.org/jira/browse/HIVE-26223 > Project: Hive > Issue Type: Sub-task >Reporter: Ayush Saxena >Assignee: Ayush Saxena >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Add the GeoSpatial UDFs to hive -- This message was sent by Atlassian Jira (v8.20.7#820007)