[jira] [Work logged] (HIVE-24753) Non blocking DROP PARTITION implementation
[ https://issues.apache.org/jira/browse/HIVE-24753?focusedWorklogId=607572&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607572 ]

ASF GitHub Bot logged work on HIVE-24753:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 07/Jun/21 00:18
            Start Date: 07/Jun/21 00:18
    Worklog Time Spent: 10m
      Work Description: github-actions[bot] commented on pull request #2017:
URL: https://github.com/apache/hive/pull/2017#issuecomment-855489220

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
Feel free to reach out on the d...@hive.apache.org list if the patch is in need of reviews.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

    Worklog Id:     (was: 607572)
    Time Spent: 2h 20m  (was: 2h 10m)

> Non blocking DROP PARTITION implementation
> ------------------------------------------
>
>                 Key: HIVE-24753
>                 URL: https://issues.apache.org/jira/browse/HIVE-24753
>             Project: Hive
>          Issue Type: New Feature
>            Reporter: Zoltan Chovan
>            Assignee: Zoltan Chovan
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> Implement a way to execute drop partition operations in a way that doesn't
> have to wait for currently running read operations to be finished.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Work logged] (HIVE-24753) Non blocking DROP PARTITION implementation
[ https://issues.apache.org/jira/browse/HIVE-24753?focusedWorklogId=611051&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611051 ]

ASF GitHub Bot logged work on HIVE-24753:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 15/Jun/21 00:09
            Start Date: 15/Jun/21 00:09
    Worklog Time Spent: 10m
      Work Description: github-actions[bot] closed pull request #2017:
URL: https://github.com/apache/hive/pull/2017

Issue Time Tracking
-------------------

    Worklog Id:     (was: 611051)
    Time Spent: 2.5h  (was: 2h 20m)
[jira] [Work logged] (HIVE-24753) Non blocking DROP PARTITION implementation
[ https://issues.apache.org/jira/browse/HIVE-24753?focusedWorklogId=557298&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-557298 ]

ASF GitHub Bot logged work on HIVE-24753:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 24/Feb/21 20:23
            Start Date: 24/Feb/21 20:23
    Worklog Time Spent: 10m
      Work Description: zchovan opened a new pull request #2017:
URL: https://github.com/apache/hive/pull/2017

Change-Id: Ie989ced8a1ef88e397d4d310628143cdb53ee8ca

### What changes were proposed in this pull request?
This makes the drop partition operation asynchronous. The data files of transactional tables are not deleted right away; instead, a truncated base file is written, and the old files are cleaned up later by the Compactor/Cleaner.

### Why are the changes needed?
Together with a few other changes, this will let us stop using read locks, which gives a performance boost to transactional tables.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Added tests to TestTxnCommands.

Issue Time Tracking
-------------------

            Worklog Id:     (was: 557298)
    Remaining Estimate: 0h
            Time Spent: 10m
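
The idea in the PR description (write a new base instead of deleting files, so running reads are not disturbed) can be sketched with a small in-memory model. This is only an illustration under simplifying assumptions, not the code from the pull request: directories are named `base_<writeId>` and `delta_<writeId>` (real ACID delta directories carry a write-id range), open and aborted transactions are ignored, and the class and method names are made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical, simplified model of which ACID directories a reader sees. */
final class EmptyBaseDropSketch {

    /**
     * Returns the directories visible to a reader whose snapshot ends at
     * highWaterMark: the newest base at or below the mark, plus every delta
     * that is newer than that base and still at or below the mark.
     */
    static List<String> visibleDirs(List<String> dirs, long highWaterMark) {
        long bestBase = -1;
        for (String d : dirs) {
            if (d.startsWith("base_")) {
                long id = Long.parseLong(d.substring("base_".length()));
                if (id <= highWaterMark && id > bestBase) {
                    bestBase = id;
                }
            }
        }
        List<String> visible = new ArrayList<>();
        if (bestBase >= 0) {
            visible.add("base_" + bestBase);
        }
        for (String d : dirs) {
            if (d.startsWith("delta_")) {
                long id = Long.parseLong(d.substring("delta_".length()));
                if (id > bestBase && id <= highWaterMark) {
                    visible.add(d);
                }
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        // The drop ran with write id 3 and left an empty base_3 behind.
        List<String> dirs = Arrays.asList("delta_1", "delta_2", "base_3");
        System.out.println(visibleDirs(dirs, 2)); // reader that started before the drop
        System.out.println(visibleDirs(dirs, 3)); // reader that started after the drop
    }
}
```

In this model a reader that opened its snapshot before the drop keeps reading `delta_1` and `delta_2`, while a later reader only sees the empty `base_3`. The older files become obsolete and can be removed once no reader needs them.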
[jira] [Work logged] (HIVE-24753) Non blocking DROP PARTITION implementation
[ https://issues.apache.org/jira/browse/HIVE-24753?focusedWorklogId=557299&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-557299 ]

ASF GitHub Bot logged work on HIVE-24753:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 24/Feb/21 20:24
            Start Date: 24/Feb/21 20:24
    Worklog Time Spent: 10m
      Work Description: zchovan commented on pull request #2017:
URL: https://github.com/apache/hive/pull/2017#issuecomment-785350797

cc @deniskuzZ @pvargacl

Issue Time Tracking
-------------------

    Worklog Id:     (was: 557299)
    Time Spent: 20m  (was: 10m)
[jira] [Work logged] (HIVE-24753) Non blocking DROP PARTITION implementation
[ https://issues.apache.org/jira/browse/HIVE-24753?focusedWorklogId=557766&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-557766 ]

ASF GitHub Bot logged work on HIVE-24753:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 25/Feb/21 07:33
            Start Date: 25/Feb/21 07:33
    Worklog Time Spent: 10m
      Work Description: pvargacl commented on a change in pull request #2017:
URL: https://github.com/apache/hive/pull/2017#discussion_r582602457

## File path: standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java ##
@@ -687,6 +687,8 @@ public static ConfVars getMetaConf(String name) {
         "hive-metastore/_h...@example.com",
         "The service principal for the metastore Thrift server. \n" +
         "The special string _HOST will be replaced automatically with the correct host name."),
+    LOCKLESS_READS_ENABLED("metastore.lockless.reads.enabled", "metastore.lockless.reads.enabled",

Review comment:
I think there should be a separate config just for the drop partition.

Issue Time Tracking
-------------------

    Worklog Id:     (was: 557766)
    Time Spent: 0.5h  (was: 20m)
[jira] [Work logged] (HIVE-24753) Non blocking DROP PARTITION implementation
[ https://issues.apache.org/jira/browse/HIVE-24753?focusedWorklogId=557767&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-557767 ]

ASF GitHub Bot logged work on HIVE-24753:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 25/Feb/21 07:36
            Start Date: 25/Feb/21 07:36
    Worklog Time Spent: 10m
      Work Description: pvargacl commented on a change in pull request #2017:
URL: https://github.com/apache/hive/pull/2017#discussion_r582603575

## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/AcidEventListener.java ##
@@ -72,6 +77,22 @@ public void onDropPartition(DropPartitionEvent partitionEvent) throws MetaExcep
     txnHandler = getTxnHandler();
     txnHandler.cleanupRecords(HiveObjectType.PARTITION, null, partitionEvent.getTable(),
         partitionEvent.getPartitionIterator());
+
+    if (MetastoreConf.getBoolVar(conf, ConfVars.LOCKLESS_READS_ENABLED)) {
+      CompactionRequest rqst = new CompactionRequest(partitionEvent.getTable().getDbName(), partitionEvent.getTable().getTableName(),

Review comment:
Is this going to work? If the partition record is dropped, I think the compaction will automatically fail; there should be a test for this.
I think you should just create a compaction record that is in the ready_for_cleaning state. And since there is a base file, I would prefer a major compaction, but it does not really matter.

Issue Time Tracking
-------------------

    Worklog Id:     (was: 557767)
    Time Spent: 40m  (was: 0.5h)
[jira] [Work logged] (HIVE-24753) Non blocking DROP PARTITION implementation
[ https://issues.apache.org/jira/browse/HIVE-24753?focusedWorklogId=557768&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-557768 ]

ASF GitHub Bot logged work on HIVE-24753:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 25/Feb/21 07:38
            Start Date: 25/Feb/21 07:38
    Worklog Time Spent: 10m
      Work Description: pvargacl commented on a change in pull request #2017:
URL: https://github.com/apache/hive/pull/2017#discussion_r582604658

## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java ##
@@ -4865,10 +4871,26 @@ private boolean drop_partition_common(RawStore ms, String catName, String db_nam
         if (isArchived) {
           assert (archiveParentDir != null);
-          wh.deleteDir(archiveParentDir, true, mustPurge, needsCm);
+          if (writeTruncatedBase) {
+            try {
+              addTruncateBaseFile(archiveParentDir, writeId, archiveParentDir.getFileSystem(getConf()));
+            } catch (Exception e) {
+              throw newMetaException(e);
+            }
+          } else {
+            wh.deleteDir(archiveParentDir, true, mustPurge, needsCm);
+          }
         } else {
           assert (partPath != null);
-          wh.deleteDir(partPath, true, mustPurge, needsCm);
+          if (writeTruncatedBase) {
+            try {
+              addTruncateBaseFile(partPath, writeId, archiveParentDir.getFileSystem(getConf()));

Review comment:
The metadata file carries type information. I think you should not use the truncate type, but rather create a drop type. That information might be useful during cleanup, when you have to decide whether to delete the whole partition directory or not.

Issue Time Tracking
-------------------

    Worklog Id:     (was: 557768)
    Time Spent: 50m  (was: 40m)
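
The review suggestion above (record a separate "drop" type instead of reusing the "truncate" type) could look roughly like the sketch below. The enum, its values, and the helper are hypothetical names chosen for illustration; the thread does not show how the metadata file encodes its type or how the Cleaner would read it.

```java
/** Hypothetical sketch: a marker type that tells cleanup why the base was written. */
final class DropMarkerSketch {

    /** Assumed marker types; only "truncate" and "drop" are mentioned in the review. */
    enum MarkerType { TRUNCATED, DROPPED }

    /**
     * The whole partition directory may only go away when the base was written
     * by a drop and the partition has not been recreated since.
     */
    static boolean mayRemovePartitionDir(MarkerType type, boolean partitionRecreated) {
        return type == MarkerType.DROPPED && !partitionRecreated;
    }

    public static void main(String[] args) {
        System.out.println(mayRemovePartitionDir(MarkerType.TRUNCATED, false));
        System.out.println(mayRemovePartitionDir(MarkerType.DROPPED, false));
        System.out.println(mayRemovePartitionDir(MarkerType.DROPPED, true));
    }
}
```

With a distinct type, a truncated table keeps its partition directory, while a dropped and never recreated partition can have its directory removed entirely.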
[jira] [Work logged] (HIVE-24753) Non blocking DROP PARTITION implementation
[ https://issues.apache.org/jira/browse/HIVE-24753?focusedWorklogId=557771&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-557771 ]

ASF GitHub Bot logged work on HIVE-24753:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 25/Feb/21 07:43
            Start Date: 25/Feb/21 07:43
    Worklog Time Spent: 10m
      Work Description: pvargacl commented on pull request #2017:
URL: https://github.com/apache/hive/pull/2017#issuecomment-785691173

I am missing some features; should these be part of this PR?
- Shouldn't the lock type be changed from exclusive to excl_write if we use this kind of drop?
- Shouldn't the Cleaner later delete the whole partition directory if it was not recreated?
- I think you should add many tests around drop and recreate: what happens if you recreate in different scenarios (while an old read is still running, when you overlap with the Cleaner, when it was already cleaned up, ...)?

Issue Time Tracking
-------------------

    Worklog Id:     (was: 557771)
    Time Spent: 1h  (was: 50m)
[jira] [Work logged] (HIVE-24753) Non blocking DROP PARTITION implementation
[ https://issues.apache.org/jira/browse/HIVE-24753?focusedWorklogId=557778&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-557778 ]

ASF GitHub Bot logged work on HIVE-24753:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 25/Feb/21 07:47
            Start Date: 25/Feb/21 07:47
    Worklog Time Spent: 10m
      Work Description: pvargacl commented on a change in pull request #2017:
URL: https://github.com/apache/hive/pull/2017#discussion_r582609667

## File path: standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/PartitionDropOptions.java ##
@@ -27,6 +27,8 @@
   public boolean ifExists = false;
   public boolean returnResults = true;
   public boolean purgeData = false;
+  public String validWriteIds;

Review comment:
Is the validWriteIds used anywhere?

Issue Time Tracking
-------------------

    Worklog Id:     (was: 557778)
    Time Spent: 1h 10m  (was: 1h)
[jira] [Work logged] (HIVE-24753) Non blocking DROP PARTITION implementation
[ https://issues.apache.org/jira/browse/HIVE-24753?focusedWorklogId=557782&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-557782 ]

ASF GitHub Bot logged work on HIVE-24753:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 25/Feb/21 07:55
            Start Date: 25/Feb/21 07:55
    Worklog Time Spent: 10m
      Work Description: pvargacl edited a comment on pull request #2017:
URL: https://github.com/apache/hive/pull/2017#issuecomment-785691173

I am missing some features; should these be part of this PR?
- Shouldn't the lock type be changed from exclusive to excl_write if we use this kind of drop?
- Shouldn't the Cleaner later delete the whole partition directory if it was not recreated?
- I think you should add many tests around drop and recreate: what happens if you recreate in different scenarios (while an old read is still running, when you overlap with the Cleaner, when it was already cleaned up, ...)?
- I think there will be a race condition between the Cleaner and create partition: what happens if the Cleaner starts to run, checks whether the partition was dropped, and decides it has to delete the partition directory, but then a create partition arrives and a new query starts to write a new delta? This should be handled.

Issue Time Tracking
-------------------

    Worklog Id:     (was: 557782)
    Time Spent: 1h 20m  (was: 1h 10m)
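
The race described in the last bullet can be replayed deterministically with a toy model. This is only an illustration of the interleaving from the comment, not Hive code: the metastore and the file system are plain in-memory sets, and every name below is hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical in-memory replay of the Cleaner vs. create-partition interleaving. */
final class CleanerRaceSketch {
    private final Set<String> metastorePartitions = new HashSet<>();
    private final Set<String> filesOnDisk = new HashSet<>();

    /** Cleaner step 1: decide to delete the directory if the partition record is gone. */
    boolean cleanerDecidesToDelete(String partition) {
        return !metastorePartitions.contains(partition);
    }

    /** Concurrent writer: recreate the partition and write a new delta into it. */
    void createPartitionAndWrite(String partition, String deltaDir) {
        metastorePartitions.add(partition);
        filesOnDisk.add(partition + "/" + deltaDir);
    }

    /** Cleaner step 2: remove everything under the partition directory. */
    void cleanerDeleteDir(String partition) {
        filesOnDisk.removeIf(f -> f.startsWith(partition + "/"));
    }

    /** Replays the problematic order and reports whether the new delta is still there. */
    static boolean newDeltaSurvives() {
        CleanerRaceSketch state = new CleanerRaceSketch();
        state.filesOnDisk.add("p=1/base_3");                      // left behind by the drop
        boolean delete = state.cleanerDecidesToDelete("p=1");      // check happens first
        state.createPartitionAndWrite("p=1", "delta_4");           // partition is recreated
        if (delete) {
            state.cleanerDeleteDir("p=1");                         // acts on a stale decision
        }
        return state.filesOnDisk.contains("p=1/delta_4");
    }

    public static void main(String[] args) {
        System.out.println("new delta survives: " + newDeltaSurvives());
    }
}
```

Because the check and the delete are separate steps, the recreated partition loses its freshly written delta in this ordering, which is why the comment asks for the case to be handled.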
[jira] [Work logged] (HIVE-24753) Non blocking DROP PARTITION implementation
[ https://issues.apache.org/jira/browse/HIVE-24753?focusedWorklogId=557792&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-557792 ]

ASF GitHub Bot logged work on HIVE-24753:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 25/Feb/21 08:26
            Start Date: 25/Feb/21 08:26
    Worklog Time Spent: 10m
      Work Description: zchovan commented on a change in pull request #2017:
URL: https://github.com/apache/hive/pull/2017#discussion_r582631700

## File path: standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/PartitionDropOptions.java ##
@@ -27,6 +27,8 @@
   public boolean ifExists = false;
   public boolean returnResults = true;
   public boolean purgeData = false;
+  public String validWriteIds;

Review comment:
No, that was left in by mistake.

Issue Time Tracking
-------------------

    Worklog Id:     (was: 557792)
    Time Spent: 1.5h  (was: 1h 20m)
[jira] [Work logged] (HIVE-24753) Non blocking DROP PARTITION implementation
[ https://issues.apache.org/jira/browse/HIVE-24753?focusedWorklogId=557793&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-557793 ]

ASF GitHub Bot logged work on HIVE-24753:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 25/Feb/21 08:28
            Start Date: 25/Feb/21 08:28
    Worklog Time Spent: 10m
      Work Description: zchovan commented on a change in pull request #2017:
URL: https://github.com/apache/hive/pull/2017#discussion_r582632897

## File path: standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java ##
@@ -687,6 +687,8 @@ public static ConfVars getMetaConf(String name) {
         "hive-metastore/_h...@example.com",
         "The service principal for the metastore Thrift server. \n" +
         "The special string _HOST will be replaced automatically with the correct host name."),
+    LOCKLESS_READS_ENABLED("metastore.lockless.reads.enabled", "metastore.lockless.reads.enabled",

Review comment:
Do you mean there should be one main config (e.g. this one) and an additional one for each related operation (dropTable/partition/etc.), or no main one and one for each operation?

Issue Time Tracking
-------------------

    Worklog Id:     (was: 557793)
    Time Spent: 1h 40m  (was: 1.5h)
[jira] [Work logged] (HIVE-24753) Non blocking DROP PARTITION implementation
[ https://issues.apache.org/jira/browse/HIVE-24753?focusedWorklogId=557798&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-557798 ]

ASF GitHub Bot logged work on HIVE-24753:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 25/Feb/21 08:36
            Start Date: 25/Feb/21 08:36
    Worklog Time Spent: 10m
      Work Description: zchovan commented on a change in pull request #2017:
URL: https://github.com/apache/hive/pull/2017#discussion_r582638123

## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java ##
@@ -4865,10 +4871,26 @@ private boolean drop_partition_common(RawStore ms, String catName, String db_nam
         if (isArchived) {
           assert (archiveParentDir != null);
-          wh.deleteDir(archiveParentDir, true, mustPurge, needsCm);
+          if (writeTruncatedBase) {
+            try {
+              addTruncateBaseFile(archiveParentDir, writeId, archiveParentDir.getFileSystem(getConf()));
+            } catch (Exception e) {
+              throw newMetaException(e);
+            }
+          } else {
+            wh.deleteDir(archiveParentDir, true, mustPurge, needsCm);
+          }
         } else {
           assert (partPath != null);
-          wh.deleteDir(partPath, true, mustPurge, needsCm);
+          if (writeTruncatedBase) {
+            try {
+              addTruncateBaseFile(partPath, writeId, archiveParentDir.getFileSystem(getConf()));

Review comment:
Good idea, will do.

Issue Time Tracking
-------------------

    Worklog Id:     (was: 557798)
    Time Spent: 1h 50m  (was: 1h 40m)
[jira] [Work logged] (HIVE-24753) Non blocking DROP PARTITION implementation
[ https://issues.apache.org/jira/browse/HIVE-24753?focusedWorklogId=557804&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-557804 ]

ASF GitHub Bot logged work on HIVE-24753:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 25/Feb/21 08:56
            Start Date: 25/Feb/21 08:56
    Worklog Time Spent: 10m
      Work Description: zchovan commented on pull request #2017:
URL: https://github.com/apache/hive/pull/2017#issuecomment-785731823

@pvargacl
The Cleaner changes are planned for a separate commit. I agree that the scenarios you've mentioned have to be tested, but without the final Cleaner changes they don't really make sense here.
The way I see it, the Cleaner first checks whether the partition still exists in the HMS. If it doesn't, the partition has not yet been recreated and the whole location dir can be deleted; no compaction is needed.
If the partition exists, it was recreated between the dropPartition and the start of the compaction and should be compacted, i.e. the files created before the truncated/deleted base file was written can be compacted/deleted.
This still leaves the last scenario, where the Cleaner is already running when the partition is recreated, so yes, that should be checked and tested.

Issue Time Tracking
-------------------

    Worklog Id:     (was: 557804)
    Time Spent: 2h  (was: 1h 50m)
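
The Cleaner logic outlined in the comment above can be written down as a small decision function. This is a sketch of the described plan only, with hypothetical names; the metastore lookup is modeled as a plain map, and the recreate-while-cleaning race is deliberately not addressed here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch of the Cleaner decision described in the comment. */
final class CleanerDecisionSketch {

    enum Action { DELETE_PARTITION_DIR, CLEAN_FILES_BELOW_BASE }

    /** Stand-in for a metastore lookup: partition name to location, absent if dropped. */
    static Optional<String> lookupPartition(Map<String, String> metastore, String name) {
        return Optional.ofNullable(metastore.get(name));
    }

    /**
     * A missing record means the partition was dropped and not recreated, so the
     * whole directory can go. A present record means it was recreated, so only the
     * files older than the base written by the drop are obsolete.
     */
    static Action decide(Optional<String> partitionRecord) {
        return partitionRecord.isPresent()
            ? Action.CLEAN_FILES_BELOW_BASE
            : Action.DELETE_PARTITION_DIR;
    }

    public static void main(String[] args) {
        Map<String, String> metastore = new HashMap<>();
        System.out.println(decide(lookupPartition(metastore, "p=1")));
        metastore.put("p=1", "/warehouse/t/p=1");
        System.out.println(decide(lookupPartition(metastore, "p=1")));
    }
}
```

Treating the missing record as an ordinary outcome, rather than an error, is the point of the sketch: a Cleaner that assumes the partition record always exists would fail on exactly the dropped-partition case.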
[jira] [Work logged] (HIVE-24753) Non blocking DROP PARTITION implementation
[ https://issues.apache.org/jira/browse/HIVE-24753?focusedWorklogId=557807&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-557807 ]

ASF GitHub Bot logged work on HIVE-24753:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 25/Feb/21 09:16
            Start Date: 25/Feb/21 09:16
    Worklog Time Spent: 10m
      Work Description: pvargacl commented on pull request #2017:
URL: https://github.com/apache/hive/pull/2017#issuecomment-785744497

> @pvargacl
> The Cleaner changes are planned for a separate commit. I agree that the scenarios you've mentioned have to be tested, but without the final Cleaner changes they don't really make sense here.
> The way I see it, the Cleaner first checks whether the partition still exists in the HMS. If it doesn't, the partition has not yet been recreated and the whole location dir can be deleted; no compaction is needed.
> If the partition exists, it was recreated between the dropPartition and the start of the compaction and should be compacted, i.e. the files created before the truncated/deleted base file was written can be compacted/deleted.
> This still leaves the last scenario, where the Cleaner is already running when the partition is recreated, so yes, that should be checked and tested.

I don't think this can go in without some basic Cleaner change. Even if the Cleaner does not delete the partition directory, you have to handle the case where the partition record is missing; otherwise the Cleaner will just fail.

Issue Time Tracking
-------------------

    Worklog Id:     (was: 557807)
    Time Spent: 2h 10m  (was: 2h)