[jira] [Work logged] (HIVE-26149) Non blocking DROP DATABASE implementation
[ https://issues.apache.org/jira/browse/HIVE-26149?focusedWorklogId=783807=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-783807 ] ASF GitHub Bot logged work on HIVE-26149: - Author: ASF GitHub Bot Created on: 22/Jun/22 10:22 Start Date: 22/Jun/22 10:22 Worklog Time Spent: 10m Work Description: deniskuzZ merged PR #3395: URL: https://github.com/apache/hive/pull/3395 Issue Time Tracking --- Worklog Id: (was: 783807) Time Spent: 3h 20m (was: 3h 10m) > Non blocking DROP DATABASE implementation > - > > Key: HIVE-26149 > URL: https://issues.apache.org/jira/browse/HIVE-26149 > Project: Hive > Issue Type: Task >Reporter: Denys Kuzmenko >Assignee: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 3h 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-26149) Non blocking DROP DATABASE implementation
[ https://issues.apache.org/jira/browse/HIVE-26149?focusedWorklogId=783537=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-783537 ] ASF GitHub Bot logged work on HIVE-26149: - Author: ASF GitHub Bot Created on: 21/Jun/22 19:20 Start Date: 21/Jun/22 19:20 Worklog Time Spent: 10m Work Description: deniskuzZ opened a new pull request, #3395: URL: https://github.com/apache/hive/pull/3395 …oved immediately from the FS when non-default MetaStoreFilterHook is configured ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? Issue Time Tracking --- Worklog Id: (was: 783537) Time Spent: 3h 10m (was: 3h) > Non blocking DROP DATABASE implementation > - > > Key: HIVE-26149 > URL: https://issues.apache.org/jira/browse/HIVE-26149 > Project: Hive > Issue Type: Task >Reporter: Denys Kuzmenko >Assignee: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 3h 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-26149) Non blocking DROP DATABASE implementation
[ https://issues.apache.org/jira/browse/HIVE-26149?focusedWorklogId=783536=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-783536 ] ASF GitHub Bot logged work on HIVE-26149: - Author: ASF GitHub Bot Created on: 21/Jun/22 19:20 Start Date: 21/Jun/22 19:20 Worklog Time Spent: 10m Work Description: deniskuzZ closed pull request #3394: HIVE-26149: Addendum: Soft-delete materialized views shouldn't be removed immediately from the FS when non-default MetaStoreFilterHook is configured URL: https://github.com/apache/hive/pull/3394 Issue Time Tracking --- Worklog Id: (was: 783536) Time Spent: 3h (was: 2h 50m) > Non blocking DROP DATABASE implementation > - > > Key: HIVE-26149 > URL: https://issues.apache.org/jira/browse/HIVE-26149 > Project: Hive > Issue Type: Task >Reporter: Denys Kuzmenko >Assignee: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 3h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-26149) Non blocking DROP DATABASE implementation
[ https://issues.apache.org/jira/browse/HIVE-26149?focusedWorklogId=783535=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-783535 ] ASF GitHub Bot logged work on HIVE-26149: - Author: ASF GitHub Bot Created on: 21/Jun/22 19:18 Start Date: 21/Jun/22 19:18 Worklog Time Spent: 10m Work Description: deniskuzZ opened a new pull request, #3394: URL: https://github.com/apache/hive/pull/3394 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? Issue Time Tracking --- Worklog Id: (was: 783535) Time Spent: 2h 50m (was: 2h 40m) > Non blocking DROP DATABASE implementation > - > > Key: HIVE-26149 > URL: https://issues.apache.org/jira/browse/HIVE-26149 > Project: Hive > Issue Type: Task >Reporter: Denys Kuzmenko >Assignee: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 2h 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-26149) Non blocking DROP DATABASE implementation
[ https://issues.apache.org/jira/browse/HIVE-26149?focusedWorklogId=764021=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-764021 ] ASF GitHub Bot logged work on HIVE-26149: - Author: ASF GitHub Bot Created on: 29/Apr/22 06:38 Start Date: 29/Apr/22 06:38 Worklog Time Spent: 10m Work Description: deniskuzZ merged PR #3220: URL: https://github.com/apache/hive/pull/3220 Issue Time Tracking --- Worklog Id: (was: 764021) Time Spent: 2h 40m (was: 2.5h) > Non blocking DROP DATABASE implementation > - > > Key: HIVE-26149 > URL: https://issues.apache.org/jira/browse/HIVE-26149 > Project: Hive > Issue Type: Task >Reporter: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 2h 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-26149) Non blocking DROP DATABASE implementation
[ https://issues.apache.org/jira/browse/HIVE-26149?focusedWorklogId=763484=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-763484 ] ASF GitHub Bot logged work on HIVE-26149: - Author: ASF GitHub Bot Created on: 28/Apr/22 12:47 Start Date: 28/Apr/22 12:47 Worklog Time Spent: 10m Work Description: deniskuzZ commented on code in PR #3220: URL: https://github.com/apache/hive/pull/3220#discussion_r860846614 ## standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java: ## @@ -1869,8 +1880,7 @@ void dropDatabase(String catName, String dbName, boolean deleteData, boolean ign * @throws MetaException something went wrong, usually either in the RDBMS or storage. * @throws TException general thrift error. */ - default void dropDatabase(String catName, String dbName, boolean deleteData, -boolean ignoreUnknownDb) + default void dropDatabase(String catName, String dbName, boolean deleteData, boolean ignoreUnknownDb) Review Comment: marked as deprecated Issue Time Tracking --- Worklog Id: (was: 763484) Time Spent: 2.5h (was: 2h 20m) > Non blocking DROP DATABASE implementation > - > > Key: HIVE-26149 > URL: https://issues.apache.org/jira/browse/HIVE-26149 > Project: Hive > Issue Type: Task >Reporter: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 2.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.7#820007)
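One way the deprecation discussed above could look is sketched below. This is an illustration only: `DropRequest` and `Client` are hypothetical stand-ins (not the real `DropDatabaseRequest` or `IMetaStoreClient`), and the assumption that the old overload delegates to a request-based one is mine, not something the thread confirms.

```java
final class DeprecationSketch {

  /** Hypothetical request object standing in for a request-style argument. */
  static final class DropRequest {
    final String catalogName;
    final String name;
    final boolean deleteData;
    final boolean ignoreUnknownDb;

    DropRequest(String catalogName, String name, boolean deleteData, boolean ignoreUnknownDb) {
      this.catalogName = catalogName;
      this.name = name;
      this.deleteData = deleteData;
      this.ignoreUnknownDb = ignoreUnknownDb;
    }
  }

  /** Hypothetical client interface with a single abstract, request-based method. */
  interface Client {
    void dropDatabase(DropRequest req);

    /**
     * Old positional overload, kept for source compatibility.
     *
     * @deprecated use {@link #dropDatabase(DropRequest)} instead
     */
    @Deprecated
    default void dropDatabase(String catName, String dbName, boolean deleteData, boolean ignoreUnknownDb) {
      // Assumed behavior: the deprecated overload simply forwards to the request-based one.
      dropDatabase(new DropRequest(catName, dbName, deleteData, ignoreUnknownDb));
    }
  }

  public static void main(String[] args) {
    Client client = req -> System.out.println("dropping " + req.catalogName + "." + req.name);
    client.dropDatabase("hive", "db1", true, false); // compiles with a deprecation warning
  }
}
```

Keeping the old overload as a thin forwarding default method lets existing callers compile while the compiler warning points them at the replacement.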
[jira] [Work logged] (HIVE-26149) Non blocking DROP DATABASE implementation
[ https://issues.apache.org/jira/browse/HIVE-26149?focusedWorklogId=763483=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-763483 ] ASF GitHub Bot logged work on HIVE-26149: - Author: ASF GitHub Bot Created on: 28/Apr/22 12:46 Start Date: 28/Apr/22 12:46 Worklog Time Spent: 10m Work Description: deniskuzZ commented on code in PR #3220: URL: https://github.com/apache/hive/pull/3220#discussion_r860846223 ## ql/src/java/org/apache/hadoop/hive/ql/ddl/database/drop/DropDatabaseAnalyzer.java: ## @@ -49,28 +52,36 @@ public void analyzeInternal(ASTNode root) throws SemanticException { String databaseName = unescapeIdentifier(root.getChild(0).getText()); boolean ifExists = root.getFirstChildWithType(HiveParser.TOK_IFEXISTS) != null; boolean cascade = root.getFirstChildWithType(HiveParser.TOK_CASCADE) != null; +boolean isSoftDelete = HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED); Database database = getDatabase(databaseName, !ifExists); if (database == null) { return; } - // if cascade=true, then we need to authorize the drop table action as well, and add the tables to the outputs +boolean allTablesWithSuffix = false; if (cascade) { try { -for (Table table : db.getAllTableObjects(databaseName)) { - // We want no lock here, as the database lock will cover the tables, - // and putting a lock will actually cause us to deadlock on ourselves. - outputs.add(new WriteEntity(table, WriteEntity.WriteType.DDL_NO_LOCK)); +List tables = db.getAllTableObjects(databaseName); +allTablesWithSuffix = tables.stream().allMatch( +table -> AcidUtils.isTableSoftDeleteEnabled(table, conf)); +for (Table table : tables) { + // Optimization used to limit number of requested locks. Check if table lock is needed or we could get away with single DB level lock, + boolean isTableLockNeeded = isSoftDelete && !allTablesWithSuffix; + outputs.add(new WriteEntity(table, isTableLockNeeded ? +AcidUtils.isTableSoftDeleteEnabled(table, conf) ? 
+WriteEntity.WriteType.DDL_EXCL_WRITE : WriteEntity.WriteType.DDL_EXCLUSIVE : +WriteEntity.WriteType.DDL_NO_LOCK)); Review Comment: refactored Issue Time Tracking --- Worklog Id: (was: 763483) Time Spent: 2h 20m (was: 2h 10m) > Non blocking DROP DATABASE implementation > - > > Key: HIVE-26149 > URL: https://issues.apache.org/jira/browse/HIVE-26149 > Project: Hive > Issue Type: Task >Reporter: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 2h 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-26149) Non blocking DROP DATABASE implementation
[ https://issues.apache.org/jira/browse/HIVE-26149?focusedWorklogId=763382=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-763382 ] ASF GitHub Bot logged work on HIVE-26149: - Author: ASF GitHub Bot Created on: 28/Apr/22 09:20 Start Date: 28/Apr/22 09:20 Worklog Time Spent: 10m Work Description: pvary commented on code in PR #3220: URL: https://github.com/apache/hive/pull/3220#discussion_r860672337 ## standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java: ## @@ -1869,8 +1880,7 @@ void dropDatabase(String catName, String dbName, boolean deleteData, boolean ign * @throws MetaException something went wrong, usually either in the RDBMS or storage. * @throws TException general thrift error. */ - default void dropDatabase(String catName, String dbName, boolean deleteData, -boolean ignoreUnknownDb) + default void dropDatabase(String catName, String dbName, boolean deleteData, boolean ignoreUnknownDb) Review Comment: Nit: I think this could be deprecated as well Issue Time Tracking --- Worklog Id: (was: 763382) Time Spent: 2h 10m (was: 2h) > Non blocking DROP DATABASE implementation > - > > Key: HIVE-26149 > URL: https://issues.apache.org/jira/browse/HIVE-26149 > Project: Hive > Issue Type: Task >Reporter: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 2h 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-26149) Non blocking DROP DATABASE implementation
[ https://issues.apache.org/jira/browse/HIVE-26149?focusedWorklogId=763381=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-763381 ] ASF GitHub Bot logged work on HIVE-26149: - Author: ASF GitHub Bot Created on: 28/Apr/22 09:16 Start Date: 28/Apr/22 09:16 Worklog Time Spent: 10m Work Description: pvary commented on code in PR #3220: URL: https://github.com/apache/hive/pull/3220#discussion_r860668405 ## ql/src/java/org/apache/hadoop/hive/ql/ddl/database/drop/DropDatabaseAnalyzer.java: ## @@ -49,28 +52,36 @@ public void analyzeInternal(ASTNode root) throws SemanticException { String databaseName = unescapeIdentifier(root.getChild(0).getText()); boolean ifExists = root.getFirstChildWithType(HiveParser.TOK_IFEXISTS) != null; boolean cascade = root.getFirstChildWithType(HiveParser.TOK_CASCADE) != null; +boolean isSoftDelete = HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED); Database database = getDatabase(databaseName, !ifExists); if (database == null) { return; } - // if cascade=true, then we need to authorize the drop table action as well, and add the tables to the outputs +boolean allTablesWithSuffix = false; if (cascade) { try { -for (Table table : db.getAllTableObjects(databaseName)) { - // We want no lock here, as the database lock will cover the tables, - // and putting a lock will actually cause us to deadlock on ourselves. - outputs.add(new WriteEntity(table, WriteEntity.WriteType.DDL_NO_LOCK)); +List tables = db.getAllTableObjects(databaseName); +allTablesWithSuffix = tables.stream().allMatch( +table -> AcidUtils.isTableSoftDeleteEnabled(table, conf)); +for (Table table : tables) { + // Optimization used to limit number of requested locks. Check if table lock is needed or we could get away with single DB level lock, + boolean isTableLockNeeded = isSoftDelete && !allTablesWithSuffix; + outputs.add(new WriteEntity(table, isTableLockNeeded ? +AcidUtils.isTableSoftDeleteEnabled(table, conf) ? 
+WriteEntity.WriteType.DDL_EXCL_WRITE : WriteEntity.WriteType.DDL_EXCLUSIVE : +WriteEntity.WriteType.DDL_NO_LOCK)); Review Comment: Would this be better:
```
LockType lockType = WriteEntity.WriteType.DDL_NO_LOCK;
if (isTableLockNeeded) {
  lockType = AcidUtils.isTableSoftDeleteEnabled(table, conf)
      ? WriteEntity.WriteType.DDL_EXCL_WRITE
      : WriteEntity.WriteType.DDL_EXCLUSIVE;
}
outputs.add(new WriteEntity(table, lockType));
```
I think having too many `:` and `?` is really hard to read. Issue Time Tracking --- Worklog Id: (was: 763381) Time Spent: 2h (was: 1h 50m) > Non blocking DROP DATABASE implementation > - > > Key: HIVE-26149 > URL: https://issues.apache.org/jira/browse/HIVE-26149 > Project: Hive > Issue Type: Task >Reporter: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 2h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-26149) Non blocking DROP DATABASE implementation
[ https://issues.apache.org/jira/browse/HIVE-26149?focusedWorklogId=761718=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761718 ] ASF GitHub Bot logged work on HIVE-26149: - Author: ASF GitHub Bot Created on: 25/Apr/22 11:29 Start Date: 25/Apr/22 11:29 Worklog Time Spent: 10m Work Description: deniskuzZ commented on code in PR #3220: URL: https://github.com/apache/hive/pull/3220#discussion_r857525828 ## standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java: ## @@ -1534,43 +1538,50 @@ public void dropDatabase(String catalogName, String dbName, boolean deleteData, * @param maxBatchSize * @throws TException */ - private void dropDatabaseCascadePerTable(String catName, String dbName, List tableList, - boolean deleteData, int maxBatchSize) throws TException { -String dbNameWithCatalog = prependCatalogToDbName(catName, dbName, conf); -for (Table table : new TableIterable(this, catName, dbName, tableList, maxBatchSize)) { + private void dropDatabaseCascadePerTable(DropDatabaseRequest req, List tableList, int maxBatchSize) + throws TException { +String dbNameWithCatalog = prependCatalogToDbName(req.getCatalogName(), req.getName(), conf); +for (Table table : new TableIterable( +this, req.getCatalogName(), req.getName(), tableList, maxBatchSize)) { boolean success = false; HiveMetaHook hook = getHook(table); - if (hook == null) { -continue; - } try { -hook.preDropTable(table); -client.drop_table_with_environment_context(dbNameWithCatalog, table.getTableName(), deleteData, null); -hook.commitDropTable(table, deleteData); +if (hook != null) { + hook.preDropTable(table); +} +boolean isSoftDelete = req.isSoftDelete() && Boolean.parseBoolean( + table.getParameters().getOrDefault(SOFT_DELETE_TABLE, "false")); +EnvironmentContext context = null; +if (req.isSetTxnId()) { + context = new EnvironmentContext(); + context.putToProperties("txnId", String.valueOf(req.getTxnId())); + 
req.setDeleteManagedDir(false); +} +client.drop_table_with_environment_context(dbNameWithCatalog, table.getTableName(), +req.isDeleteData() && !isSoftDelete, context); +if (hook != null) { + hook.commitDropTable(table, req.isDeleteData()); +} success = true; } finally { -if (!success) { +if (!success && hook != null) { hook.rollbackDropTable(table); } } } -client.drop_database(dbNameWithCatalog, deleteData, true); +client.drop_database_req(req); } /** * Handles dropDatabase by invoking drop_database in HMS. * Useful when table list in DB can fit in memory, it will retrieve all tables at once and * call drop_database once. Also handles drop_table hooks. - * @param catName - * @param dbName + * @param req * @param tableList - * @param deleteData * @throws TException */ - private void dropDatabaseCascadePerDb(String catName, String dbName, List tableList, -boolean deleteData) throws TException { -String dbNameWithCatalog = prependCatalogToDbName(catName, dbName, conf); -List tables = getTableObjectsByName(catName, dbName, tableList); + private void dropDatabaseCascadePerDb(DropDatabaseRequest req, List tableList) throws TException { Review Comment: That remains the case if soft delete is disabled; otherwise, we'll be locking just the tables, not the DB, with an appropriate type of lock. allTablesWithSuffix is an optimization: if all tables under the DB are soft-delete eligible, grab just a table-level lock. Issue Time Tracking --- Worklog Id: (was: 761718) Time Spent: 1h 40m (was: 1.5h) > Non blocking DROP DATABASE implementation > - > > Key: HIVE-26149 > URL: https://issues.apache.org/jira/browse/HIVE-26149 > Project: Hive > Issue Type: Task >Reporter: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 1h 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-26149) Non blocking DROP DATABASE implementation
[ https://issues.apache.org/jira/browse/HIVE-26149?focusedWorklogId=761719=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761719 ] ASF GitHub Bot logged work on HIVE-26149: - Author: ASF GitHub Bot Created on: 25/Apr/22 11:29 Start Date: 25/Apr/22 11:29 Worklog Time Spent: 10m Work Description: deniskuzZ commented on code in PR #3220: URL: https://github.com/apache/hive/pull/3220#discussion_r857525828 ## standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java: ## @@ -1534,43 +1538,50 @@ public void dropDatabase(String catalogName, String dbName, boolean deleteData, * @param maxBatchSize * @throws TException */ - private void dropDatabaseCascadePerTable(String catName, String dbName, List tableList, - boolean deleteData, int maxBatchSize) throws TException { -String dbNameWithCatalog = prependCatalogToDbName(catName, dbName, conf); -for (Table table : new TableIterable(this, catName, dbName, tableList, maxBatchSize)) { + private void dropDatabaseCascadePerTable(DropDatabaseRequest req, List tableList, int maxBatchSize) + throws TException { +String dbNameWithCatalog = prependCatalogToDbName(req.getCatalogName(), req.getName(), conf); +for (Table table : new TableIterable( +this, req.getCatalogName(), req.getName(), tableList, maxBatchSize)) { boolean success = false; HiveMetaHook hook = getHook(table); - if (hook == null) { -continue; - } try { -hook.preDropTable(table); -client.drop_table_with_environment_context(dbNameWithCatalog, table.getTableName(), deleteData, null); -hook.commitDropTable(table, deleteData); +if (hook != null) { + hook.preDropTable(table); +} +boolean isSoftDelete = req.isSoftDelete() && Boolean.parseBoolean( + table.getParameters().getOrDefault(SOFT_DELETE_TABLE, "false")); +EnvironmentContext context = null; +if (req.isSetTxnId()) { + context = new EnvironmentContext(); + context.putToProperties("txnId", String.valueOf(req.getTxnId())); + 
req.setDeleteManagedDir(false); +} +client.drop_table_with_environment_context(dbNameWithCatalog, table.getTableName(), +req.isDeleteData() && !isSoftDelete, context); +if (hook != null) { + hook.commitDropTable(table, req.isDeleteData()); +} success = true; } finally { -if (!success) { +if (!success && hook != null) { hook.rollbackDropTable(table); } } } -client.drop_database(dbNameWithCatalog, deleteData, true); +client.drop_database_req(req); } /** * Handles dropDatabase by invoking drop_database in HMS. * Useful when table list in DB can fit in memory, it will retrieve all tables at once and * call drop_database once. Also handles drop_table hooks. - * @param catName - * @param dbName + * @param req * @param tableList - * @param deleteData * @throws TException */ - private void dropDatabaseCascadePerDb(String catName, String dbName, List tableList, -boolean deleteData) throws TException { -String dbNameWithCatalog = prependCatalogToDbName(catName, dbName, conf); -List tables = getTableObjectsByName(catName, dbName, tableList); + private void dropDatabaseCascadePerDb(DropDatabaseRequest req, List tableList) throws TException { Review Comment: That remains the case if soft delete is disabled; otherwise, we'll be locking just the tables, not the DB, with an appropriate type of lock. allTablesWithSuffix is an optimization: if all tables under the DB are soft-delete eligible, grab just a DB-level lock. Issue Time Tracking --- Worklog Id: (was: 761719) Time Spent: 1h 50m (was: 1h 40m) > Non blocking DROP DATABASE implementation > - > > Key: HIVE-26149 > URL: https://issues.apache.org/jira/browse/HIVE-26149 > Project: Hive > Issue Type: Task >Reporter: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 1h 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.7#820007)
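The per-table lock choice described in this comment, and shown in the DropDatabaseAnalyzer diff elsewhere in the thread, can be sketched in isolation. This is a minimal sketch under stated assumptions: the nested `WriteType` enum is a stand-in for `WriteEntity.WriteType`, `locklessReadsEnabled` stands in for the `HIVE_ACID_LOCKLESS_READS_ENABLED` flag, and each table is reduced to a boolean that plays the role of `AcidUtils.isTableSoftDeleteEnabled(table, conf)`. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

final class DropDatabaseLockSketch {

  /** Stand-in for WriteEntity.WriteType, limited to the three values named in the diff. */
  enum WriteType { DDL_NO_LOCK, DDL_EXCL_WRITE, DDL_EXCLUSIVE }

  /**
   * Picks the lock type for one table of a cascading DROP DATABASE.
   *
   * @param locklessReadsEnabled stand-in for the lockless-reads configuration flag
   * @param softDeleteFlags one flag per table: true if the table is soft-delete eligible
   * @param tableIndex index of the table whose lock type is requested
   */
  static WriteType lockTypeFor(boolean locklessReadsEnabled, List<Boolean> softDeleteFlags, int tableIndex) {
    // True only when every table under the database is soft-delete eligible.
    boolean allTablesWithSuffix = softDeleteFlags.stream().allMatch(Boolean::booleanValue);
    // Table-level locks are requested only for a mixed database with lockless reads on.
    boolean isTableLockNeeded = locklessReadsEnabled && !allTablesWithSuffix;

    WriteType lockType = WriteType.DDL_NO_LOCK;
    if (isTableLockNeeded) {
      lockType = softDeleteFlags.get(tableIndex) ? WriteType.DDL_EXCL_WRITE : WriteType.DDL_EXCLUSIVE;
    }
    return lockType;
  }

  public static void main(String[] args) {
    List<Boolean> mixed = Arrays.asList(true, false);
    for (int i = 0; i < mixed.size(); i++) {
      System.out.println("table " + i + " -> " + lockTypeFor(true, mixed, i));
    }
  }
}
```

In this model a database whose tables are all soft-delete eligible requests no table locks at all, which matches the "single DB level lock" remark in the diff comment.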
[jira] [Work logged] (HIVE-26149) Non blocking DROP DATABASE implementation
[ https://issues.apache.org/jira/browse/HIVE-26149?focusedWorklogId=761654=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761654 ] ASF GitHub Bot logged work on HIVE-26149: - Author: ASF GitHub Bot Created on: 25/Apr/22 08:42 Start Date: 25/Apr/22 08:42 Worklog Time Spent: 10m Work Description: pvary commented on code in PR #3220: URL: https://github.com/apache/hive/pull/3220#discussion_r857390546 ## standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java: ## @@ -1534,43 +1538,50 @@ public void dropDatabase(String catalogName, String dbName, boolean deleteData, * @param maxBatchSize * @throws TException */ - private void dropDatabaseCascadePerTable(String catName, String dbName, List tableList, - boolean deleteData, int maxBatchSize) throws TException { -String dbNameWithCatalog = prependCatalogToDbName(catName, dbName, conf); -for (Table table : new TableIterable(this, catName, dbName, tableList, maxBatchSize)) { + private void dropDatabaseCascadePerTable(DropDatabaseRequest req, List tableList, int maxBatchSize) + throws TException { +String dbNameWithCatalog = prependCatalogToDbName(req.getCatalogName(), req.getName(), conf); +for (Table table : new TableIterable( +this, req.getCatalogName(), req.getName(), tableList, maxBatchSize)) { boolean success = false; HiveMetaHook hook = getHook(table); - if (hook == null) { -continue; - } try { -hook.preDropTable(table); -client.drop_table_with_environment_context(dbNameWithCatalog, table.getTableName(), deleteData, null); -hook.commitDropTable(table, deleteData); +if (hook != null) { + hook.preDropTable(table); +} +boolean isSoftDelete = req.isSoftDelete() && Boolean.parseBoolean( + table.getParameters().getOrDefault(SOFT_DELETE_TABLE, "false")); +EnvironmentContext context = null; +if (req.isSetTxnId()) { + context = new EnvironmentContext(); + context.putToProperties("txnId", String.valueOf(req.getTxnId())); + 
req.setDeleteManagedDir(false); +} +client.drop_table_with_environment_context(dbNameWithCatalog, table.getTableName(), +req.isDeleteData() && !isSoftDelete, context); +if (hook != null) { + hook.commitDropTable(table, req.isDeleteData()); +} success = true; } finally { -if (!success) { +if (!success && hook != null) { hook.rollbackDropTable(table); } } } -client.drop_database(dbNameWithCatalog, deleteData, true); +client.drop_database_req(req); } /** * Handles dropDatabase by invoking drop_database in HMS. * Useful when table list in DB can fit in memory, it will retrieve all tables at once and * call drop_database once. Also handles drop_table hooks. - * @param catName - * @param dbName + * @param req * @param tableList - * @param deleteData * @throws TException */ - private void dropDatabaseCascadePerDb(String catName, String dbName, List tableList, -boolean deleteData) throws TException { -String dbNameWithCatalog = prependCatalogToDbName(catName, dbName, conf); -List tables = getTableObjectsByName(catName, dbName, tableList); + private void dropDatabaseCascadePerDb(DropDatabaseRequest req, List tableList) throws TException { Review Comment: How does this work together with:
```
// We want no lock here, as the database lock will cover the tables,
// and putting a lock will actually cause us to deadlock on ourselves.
```
Wouldn't it cause issues with the locks? Issue Time Tracking --- Worklog Id: (was: 761654) Time Spent: 1.5h (was: 1h 20m) > Non blocking DROP DATABASE implementation > - > > Key: HIVE-26149 > URL: https://issues.apache.org/jira/browse/HIVE-26149 > Project: Hive > Issue Type: Task >Reporter: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 1.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-26149) Non blocking DROP DATABASE implementation
[ https://issues.apache.org/jira/browse/HIVE-26149?focusedWorklogId=761649=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761649 ] ASF GitHub Bot logged work on HIVE-26149: - Author: ASF GitHub Bot Created on: 25/Apr/22 08:25 Start Date: 25/Apr/22 08:25 Worklog Time Spent: 10m Work Description: deniskuzZ commented on code in PR #3220: URL: https://github.com/apache/hive/pull/3220#discussion_r857376124 ## standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java: ## @@ -1534,43 +1538,50 @@ public void dropDatabase(String catalogName, String dbName, boolean deleteData, * @param maxBatchSize * @throws TException */ - private void dropDatabaseCascadePerTable(String catName, String dbName, List tableList, - boolean deleteData, int maxBatchSize) throws TException { -String dbNameWithCatalog = prependCatalogToDbName(catName, dbName, conf); -for (Table table : new TableIterable(this, catName, dbName, tableList, maxBatchSize)) { + private void dropDatabaseCascadePerTable(DropDatabaseRequest req, List tableList, int maxBatchSize) + throws TException { +String dbNameWithCatalog = prependCatalogToDbName(req.getCatalogName(), req.getName(), conf); +for (Table table : new TableIterable( +this, req.getCatalogName(), req.getName(), tableList, maxBatchSize)) { boolean success = false; HiveMetaHook hook = getHook(table); - if (hook == null) { -continue; - } try { -hook.preDropTable(table); -client.drop_table_with_environment_context(dbNameWithCatalog, table.getTableName(), deleteData, null); -hook.commitDropTable(table, deleteData); +if (hook != null) { + hook.preDropTable(table); +} +boolean isSoftDelete = req.isSoftDelete() && Boolean.parseBoolean( + table.getParameters().getOrDefault(SOFT_DELETE_TABLE, "false")); +EnvironmentContext context = null; +if (req.isSetTxnId()) { + context = new EnvironmentContext(); + context.putToProperties("txnId", String.valueOf(req.getTxnId())); + 
req.setDeleteManagedDir(false); +} +client.drop_table_with_environment_context(dbNameWithCatalog, table.getTableName(), +req.isDeleteData() && !isSoftDelete, context); +if (hook != null) { + hook.commitDropTable(table, req.isDeleteData()); +} success = true; } finally { -if (!success) { +if (!success && hook != null) { hook.rollbackDropTable(table); } } } -client.drop_database(dbNameWithCatalog, deleteData, true); +client.drop_database_req(req); } /** * Handles dropDatabase by invoking drop_database in HMS. * Useful when table list in DB can fit in memory, it will retrieve all tables at once and * call drop_database once. Also handles drop_table hooks. - * @param catName - * @param dbName + * @param req * @param tableList - * @param deleteData * @throws TException */ - private void dropDatabaseCascadePerDb(String catName, String dbName, List tableList, -boolean deleteData) throws TException { -String dbNameWithCatalog = prependCatalogToDbName(catName, dbName, conf); -List tables = getTableObjectsByName(catName, dbName, tableList); + private void dropDatabaseCascadePerDb(DropDatabaseRequest req, List tableList) throws TException { Review Comment: If the DB has a mix of soft-delete (prefixed) and managed tables, we acquire exclusive locks on the managed/external tables and remove them as usual; for soft-delete tables, however, we acquire an excl_write lock and delegate cleanup to the cleaner process. Read locks are only removed on soft-delete tables. Note: if lockless reads are enabled, we do not remove the DB folder. Issue Time Tracking --- Worklog Id: (was: 761649) Time Spent: 1h 20m (was: 1h 10m) > Non blocking DROP DATABASE implementation > - > > Key: HIVE-26149 > URL: https://issues.apache.org/jira/browse/HIVE-26149 > Project: Hive > Issue Type: Task >Reporter: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 1h 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.7#820007)
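The split between immediate removal and delegated cleanup corresponds to the `req.isDeleteData() && !isSoftDelete` expression in the diff. Below is a minimal sketch of that decision, assuming table parameters are a plain `Map<String, String>`. The key string `"soft_delete"` is a placeholder: the diff shows only the constant name `SOFT_DELETE_TABLE`, not its value, and the class and method names here are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

final class SoftDeleteDropSketch {

  // Placeholder value; the thread shows only the constant name, not the actual key.
  static final String SOFT_DELETE_TABLE = "soft_delete";

  /** A table is soft-deleted only if the request asks for it AND the table is flagged. */
  static boolean isSoftDelete(boolean requestSoftDelete, Map<String, String> tableParams) {
    return requestSoftDelete
        && Boolean.parseBoolean(tableParams.getOrDefault(SOFT_DELETE_TABLE, "false"));
  }

  /** Data is removed right away only when deletion is requested and the table is not soft-deleted. */
  static boolean deleteDataNow(boolean deleteData, boolean requestSoftDelete, Map<String, String> tableParams) {
    return deleteData && !isSoftDelete(requestSoftDelete, tableParams);
  }

  public static void main(String[] args) {
    Map<String, String> softTable = new HashMap<>();
    softTable.put(SOFT_DELETE_TABLE, "true");
    Map<String, String> managedTable = new HashMap<>();

    System.out.println("soft-delete table removed now: " + deleteDataNow(true, true, softTable));
    System.out.println("managed table removed now: " + deleteDataNow(true, true, managedTable));
  }
}
```

Under these assumptions, a flagged table keeps its files when the request is a soft delete (cleanup is left to a later process), while an unflagged table in the same database is removed immediately.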
[jira] [Work logged] (HIVE-26149) Non blocking DROP DATABASE implementation
[ https://issues.apache.org/jira/browse/HIVE-26149?focusedWorklogId=761637=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761637 ] ASF GitHub Bot logged work on HIVE-26149: - Author: ASF GitHub Bot Created on: 25/Apr/22 08:00 Start Date: 25/Apr/22 08:00 Worklog Time Spent: 10m Work Description: pvary commented on code in PR #3220: URL: https://github.com/apache/hive/pull/3220#discussion_r857355770 ## standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java: ## @@ -1534,43 +1538,50 @@ public void dropDatabase(String catalogName, String dbName, boolean deleteData, * @param maxBatchSize * @throws TException */ - private void dropDatabaseCascadePerTable(String catName, String dbName, List tableList, - boolean deleteData, int maxBatchSize) throws TException { -String dbNameWithCatalog = prependCatalogToDbName(catName, dbName, conf); -for (Table table : new TableIterable(this, catName, dbName, tableList, maxBatchSize)) { + private void dropDatabaseCascadePerTable(DropDatabaseRequest req, List tableList, int maxBatchSize) + throws TException { +String dbNameWithCatalog = prependCatalogToDbName(req.getCatalogName(), req.getName(), conf); +for (Table table : new TableIterable( +this, req.getCatalogName(), req.getName(), tableList, maxBatchSize)) { boolean success = false; HiveMetaHook hook = getHook(table); - if (hook == null) { -continue; - } try { -hook.preDropTable(table); -client.drop_table_with_environment_context(dbNameWithCatalog, table.getTableName(), deleteData, null); -hook.commitDropTable(table, deleteData); +if (hook != null) { + hook.preDropTable(table); +} +boolean isSoftDelete = req.isSoftDelete() && Boolean.parseBoolean( + table.getParameters().getOrDefault(SOFT_DELETE_TABLE, "false")); +EnvironmentContext context = null; +if (req.isSetTxnId()) { + context = new EnvironmentContext(); + context.putToProperties("txnId", String.valueOf(req.getTxnId())); + 
req.setDeleteManagedDir(false); +} +client.drop_table_with_environment_context(dbNameWithCatalog, table.getTableName(), +req.isDeleteData() && !isSoftDelete, context); +if (hook != null) { + hook.commitDropTable(table, req.isDeleteData()); +} success = true; } finally { -if (!success) { +if (!success && hook != null) { hook.rollbackDropTable(table); } } } -client.drop_database(dbNameWithCatalog, deleteData, true); +client.drop_database_req(req); } /** * Handles dropDatabase by invoking drop_database in HMS. * Useful when table list in DB can fit in memory, it will retrieve all tables at once and * call drop_database once. Also handles drop_table hooks. - * @param catName - * @param dbName + * @param req * @param tableList - * @param deleteData * @throws TException */ - private void dropDatabaseCascadePerDb(String catName, String dbName, List tableList, -boolean deleteData) throws TException { -String dbNameWithCatalog = prependCatalogToDbName(catName, dbName, conf); -List tables = getTableObjectsByName(catName, dbName, tableList); + private void dropDatabaseCascadePerDb(DropDatabaseRequest req, List tableList) throws TException { Review Comment: What happens when the tables inside the DB have different configurations, i.e. some of the tables are soft delete and some are hard delete? Also, what happens if the DB and table soft-delete configurations differ? Issue Time Tracking --- Worklog Id: (was: 761637) Time Spent: 1h 10m (was: 1h) > Non blocking DROP DATABASE implementation > - > > Key: HIVE-26149 > URL: https://issues.apache.org/jira/browse/HIVE-26149 > Project: Hive > Issue Type: Task >Reporter: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 1h 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.7#820007)
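The diff quoted in this message also changes how per-table hooks are handled: a table without a hook is no longer skipped, and rollback runs only when a hook exists and the drop failed. A minimal sketch of that control flow follows. `DropHook`, `DropAction`, and `RecordingHook` are hypothetical stand-ins for `HiveMetaHook` and the Thrift call, used only to make the pattern runnable.

```java
import java.util.ArrayList;
import java.util.List;

final class DropTableHookSketch {

  /** Hypothetical stand-in for the hook callbacks used in the diff. */
  interface DropHook {
    void preDrop(String table);
    void commitDrop(String table);
    void rollbackDrop(String table);
  }

  /** Hypothetical stand-in for the actual drop call. */
  interface DropAction {
    void drop(String table);
  }

  static void dropWithHook(String table, DropHook hook, DropAction action) {
    boolean success = false;
    try {
      if (hook != null) {
        hook.preDrop(table);
      }
      // The drop itself runs whether or not a hook is present.
      action.drop(table);
      if (hook != null) {
        hook.commitDrop(table);
      }
      success = true;
    } finally {
      // Roll back only if something failed and there is a hook to notify.
      if (!success && hook != null) {
        hook.rollbackDrop(table);
      }
    }
  }

  /** Records callback order so the flow can be inspected. */
  static final class RecordingHook implements DropHook {
    final List<String> events = new ArrayList<>();

    @Override public void preDrop(String table) { events.add("pre:" + table); }
    @Override public void commitDrop(String table) { events.add("commit:" + table); }
    @Override public void rollbackDrop(String table) { events.add("rollback:" + table); }
  }

  public static void main(String[] args) {
    RecordingHook hook = new RecordingHook();
    dropWithHook("t1", hook, t -> { });
    System.out.println(hook.events);
  }
}
```

The `success` flag with a `finally` block keeps the rollback path in one place, so any exception thrown by the pre-drop hook, the drop, or the commit hook triggers the same cleanup.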
[jira] [Work logged] (HIVE-26149) Non blocking DROP DATABASE implementation
[ https://issues.apache.org/jira/browse/HIVE-26149?focusedWorklogId=761632=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761632 ] ASF GitHub Bot logged work on HIVE-26149: - Author: ASF GitHub Bot Created on: 25/Apr/22 07:55 Start Date: 25/Apr/22 07:55 Worklog Time Spent: 10m Work Description: pvary commented on code in PR #3220: URL: https://github.com/apache/hive/pull/3220#discussion_r857351954 ## standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java: ## @@ -1457,39 +1458,42 @@ public void dropDatabase(String name) @Override public void dropDatabase(String name, boolean deleteData, boolean ignoreUnknownDb) - throws NoSuchObjectException, InvalidOperationException, MetaException, TException { + throws TException { Review Comment: Could we make the old methods deprecated? Issue Time Tracking --- Worklog Id: (was: 761632) Time Spent: 1h (was: 50m) > Non blocking DROP DATABASE implementation > - > > Key: HIVE-26149 > URL: https://issues.apache.org/jira/browse/HIVE-26149 > Project: Hive > Issue Type: Task >Reporter: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.7#820007)
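For readers unfamiliar with the pattern behind this request, a minimal sketch of deprecating a positional-argument overload in favour of a request-object overload might look like the following. Everything here is hypothetical: `DeprecatedOverloadSketch` and `DropRequest` are stand-ins invented for illustration and do not reproduce the signatures of `HiveMetaStoreClient` or `DropDatabaseRequest`.

```java
// Hypothetical stand-in; not the real HiveMetaStoreClient API.
final class DeprecatedOverloadSketch {

  // Hypothetical request object bundling the former positional arguments.
  static final class DropRequest {
    final String name;
    final boolean deleteData;
    final boolean ignoreUnknownDb;

    DropRequest(String name, boolean deleteData, boolean ignoreUnknownDb) {
      this.name = name;
      this.deleteData = deleteData;
      this.ignoreUnknownDb = ignoreUnknownDb;
    }
  }

  // Records the last request so the delegation can be observed in this sketch.
  DropRequest lastRequest;

  /**
   * @deprecated use {@link #dropDatabase(DropRequest)} instead.
   */
  @Deprecated
  void dropDatabase(String name, boolean deleteData, boolean ignoreUnknownDb) {
    // The old overload only builds the request and delegates.
    dropDatabase(new DropRequest(name, deleteData, ignoreUnknownDb));
  }

  // New entry point; a real client would issue the metastore call here.
  void dropDatabase(DropRequest req) {
    lastRequest = req;
  }

  public static void main(String[] args) {
    DeprecatedOverloadSketch client = new DeprecatedOverloadSketch();
    client.dropDatabase(new DropRequest("db1", true, false));
    System.out.println(client.lastRequest.name);
  }
}
```

Keeping the old overload as a thin delegate means existing callers continue to compile, with a deprecation warning, while all behaviour lives in one place.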
[jira] [Work logged] (HIVE-26149) Non blocking DROP DATABASE implementation
[ https://issues.apache.org/jira/browse/HIVE-26149?focusedWorklogId=761631=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761631 ] ASF GitHub Bot logged work on HIVE-26149: - Author: ASF GitHub Bot Created on: 25/Apr/22 07:54 Start Date: 25/Apr/22 07:54 Worklog Time Spent: 10m Work Description: pvary commented on code in PR #3220: URL: https://github.com/apache/hive/pull/3220#discussion_r857351082 ## ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager2.java: ## @@ -3914,4 +3914,72 @@ private void testRenamePartition(boolean blocking) throws Exception { driver.getFetchTask().fetch(res); Assert.assertEquals("Expecting 1 rows and found " + res.size(), 1, res.size()); } + + @Test + public void testDropDatabaseNonBlocking() throws Exception { +dropDatabaseNonBlocking(false, false); + } + @Test Review Comment: nit: newlines Issue Time Tracking --- Worklog Id: (was: 761631) Time Spent: 50m (was: 40m) > Non blocking DROP DATABASE implementation > - > > Key: HIVE-26149 > URL: https://issues.apache.org/jira/browse/HIVE-26149 > Project: Hive > Issue Type: Task >Reporter: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-26149) Non blocking DROP DATABASE implementation
[ https://issues.apache.org/jira/browse/HIVE-26149?focusedWorklogId=761630=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761630 ] ASF GitHub Bot logged work on HIVE-26149: - Author: ASF GitHub Bot Created on: 25/Apr/22 07:53 Start Date: 25/Apr/22 07:53 Worklog Time Spent: 10m Work Description: pvary commented on code in PR #3220: URL: https://github.com/apache/hive/pull/3220#discussion_r857350352 ## ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands.java: ## @@ -1685,7 +1688,89 @@ public void testDropWithBaseMultiplePartitions() throws Exception { } } } + + @Test + public void testDropDatabaseCascadePerTableNonBlocking() throws Exception { +MetastoreConf.setLongVar(hiveConf, MetastoreConf.ConfVars.BATCH_RETRIEVE_MAX, 1); +dropDatabaseCascadeNonBlocking(); + } + @Test + public void testDropDatabaseCascadePerDbNonBlocking() throws Exception { +dropDatabaseCascadeNonBlocking(); + } + private void dropDatabaseCascadeNonBlocking() throws Exception { Review Comment: Nit: newline ## ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands.java: ## @@ -1685,7 +1688,89 @@ public void testDropWithBaseMultiplePartitions() throws Exception { } } } + + @Test + public void testDropDatabaseCascadePerTableNonBlocking() throws Exception { +MetastoreConf.setLongVar(hiveConf, MetastoreConf.ConfVars.BATCH_RETRIEVE_MAX, 1); +dropDatabaseCascadeNonBlocking(); + } + @Test Review Comment: Nit: newline Issue Time Tracking --- Worklog Id: (was: 761630) Time Spent: 40m (was: 0.5h) > Non blocking DROP DATABASE implementation > - > > Key: HIVE-26149 > URL: https://issues.apache.org/jira/browse/HIVE-26149 > Project: Hive > Issue Type: Task >Reporter: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-26149) Non blocking DROP DATABASE implementation
[ https://issues.apache.org/jira/browse/HIVE-26149?focusedWorklogId=761629=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761629 ] ASF GitHub Bot logged work on HIVE-26149: - Author: ASF GitHub Bot Created on: 25/Apr/22 07:53 Start Date: 25/Apr/22 07:53 Worklog Time Spent: 10m Work Description: pvary commented on code in PR #3220: URL: https://github.com/apache/hive/pull/3220#discussion_r857349784 ## ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java: ## @@ -661,16 +662,36 @@ public void dropDatabase(String name, boolean deleteData, boolean ignoreUnknownD */ public void dropDatabase(String name, boolean deleteData, boolean ignoreUnknownDb, boolean cascade) throws HiveException, NoSuchObjectException { +dropDatabase( + new DropDatabaseDesc(name, ignoreUnknownDb, cascade, deleteData)); + } + + public void dropDatabase(DropDatabaseDesc desc) Review Comment: Nit: I would guess that we do not need the new line here Issue Time Tracking --- Worklog Id: (was: 761629) Time Spent: 0.5h (was: 20m) > Non blocking DROP DATABASE implementation > - > > Key: HIVE-26149 > URL: https://issues.apache.org/jira/browse/HIVE-26149 > Project: Hive > Issue Type: Task >Reporter: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-26149) Non blocking DROP DATABASE implementation
[ https://issues.apache.org/jira/browse/HIVE-26149?focusedWorklogId=761628=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761628 ] ASF GitHub Bot logged work on HIVE-26149: - Author: ASF GitHub Bot Created on: 25/Apr/22 07:50 Start Date: 25/Apr/22 07:50 Worklog Time Spent: 10m Work Description: pvary commented on code in PR #3220: URL: https://github.com/apache/hive/pull/3220#discussion_r857347623 ## ql/src/java/org/apache/hadoop/hive/ql/ddl/database/drop/DropDatabaseAnalyzer.java: ## @@ -49,28 +52,37 @@ public void analyzeInternal(ASTNode root) throws SemanticException { String databaseName = unescapeIdentifier(root.getChild(0).getText()); boolean ifExists = root.getFirstChildWithType(HiveParser.TOK_IFEXISTS) != null; boolean cascade = root.getFirstChildWithType(HiveParser.TOK_CASCADE) != null; +boolean isSoftDelete = HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED); Database database = getDatabase(databaseName, !ifExists); if (database == null) { return; } - // if cascade=true, then we need to authorize the drop table action as well, and add the tables to the outputs +boolean allTablesWithSuffix = false; if (cascade) { try { -for (Table table : db.getAllTableObjects(databaseName)) { +List tables = db.getAllTableObjects(databaseName); +allTablesWithSuffix = tables.stream().allMatch( +table -> AcidUtils.isTableSoftDeleteEnabled(table, conf)); +for (Table table : tables) { // We want no lock here, as the database lock will cover the tables, // and putting a lock will actually cause us to deadlock on ourselves. - outputs.add(new WriteEntity(table, WriteEntity.WriteType.DDL_NO_LOCK)); + outputs.add( +new WriteEntity(table, isSoftDelete && !allTablesWithSuffix ? Review Comment: Nit: Could we create boolean variables with descriptive names? It is hard to follow what happens here. 
Issue Time Tracking --- Worklog Id: (was: 761628) Time Spent: 20m (was: 10m) > Non blocking DROP DATABASE implementation > - > > Key: HIVE-26149 > URL: https://issues.apache.org/jira/browse/HIVE-26149 > Project: Hive > Issue Type: Task >Reporter: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.7#820007)
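One way to act on the request for descriptive boolean names is to give the compound condition `isSoftDelete && !allTablesWithSuffix` a name of its own. The sketch below is a stand-alone illustration under stated assumptions: the class and variable names are hypothetical, each table is reduced to a single flag saying whether soft delete is enabled for it, and the two branches of the ternary are deliberately left out because they are cut off in the quoted hunk.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration of naming the condition from the quoted hunk.
final class DropDatabaseFlags {

  // locklessReadsEnabled corresponds to isSoftDelete in the diff;
  // tableSoftDeleteEnabled holds one flag per table in the database.
  static boolean softDeleteRequestedButNotForAllTables(
      boolean locklessReadsEnabled, List<Boolean> tableSoftDeleteEnabled) {
    // Corresponds to allTablesWithSuffix in the diff.
    boolean allTablesSupportSoftDelete =
        tableSoftDeleteEnabled.stream().allMatch(Boolean::booleanValue);
    boolean someTablesLackSoftDelete = !allTablesSupportSoftDelete;
    return locklessReadsEnabled && someTablesLackSoftDelete;
  }

  public static void main(String[] args) {
    // Mixed database: one soft-delete table, one without.
    System.out.println(
        softDeleteRequestedButNotForAllTables(true, Arrays.asList(true, false)));
    // Empty database: allMatch is vacuously true, so the condition is false.
    System.out.println(
        softDeleteRequestedButNotForAllTables(true, Collections.<Boolean>emptyList()));
  }
}
```

With the condition named, the call site would read as a single boolean choosing between two write types rather than an inline expression.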
[jira] [Work logged] (HIVE-26149) Non blocking DROP DATABASE implementation
[ https://issues.apache.org/jira/browse/HIVE-26149?focusedWorklogId=757913=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-757913 ] ASF GitHub Bot logged work on HIVE-26149: - Author: ASF GitHub Bot Created on: 18/Apr/22 14:55 Start Date: 18/Apr/22 14:55 Worklog Time Spent: 10m Work Description: deniskuzZ opened a new pull request, #3220: URL: https://github.com/apache/hive/pull/3220 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? Issue Time Tracking --- Worklog Id: (was: 757913) Remaining Estimate: 0h Time Spent: 10m > Non blocking DROP DATABASE implementation > - > > Key: HIVE-26149 > URL: https://issues.apache.org/jira/browse/HIVE-26149 > Project: Hive > Issue Type: Task >Reporter: Denys Kuzmenko >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.1#820001)