[jira] [Work logged] (HIVE-24753) Non blocking DROP PARTITION implementation

2021-06-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24753?focusedWorklogId=607572&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607572
 ]

ASF GitHub Bot logged work on HIVE-24753:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 00:18
Start Date: 07/Jun/21 00:18
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #2017:
URL: https://github.com/apache/hive/pull/2017#issuecomment-855489220


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 607572)
Time Spent: 2h 20m  (was: 2h 10m)

> Non blocking DROP PARTITION implementation
> --
>
> Key: HIVE-24753
> URL: https://issues.apache.org/jira/browse/HIVE-24753
> Project: Hive
> Issue Type: New Feature
> Reporter: Zoltan Chovan
> Assignee: Zoltan Chovan
> Priority: Major
> Labels: pull-request-available
> Time Spent: 2h 20m
> Remaining Estimate: 0h
>
> Implement a way to execute drop partition operations in a way that doesn't 
> have to wait for currently running read operations to be finished.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24753) Non blocking DROP PARTITION implementation

2021-06-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24753?focusedWorklogId=611051&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611051
 ]

ASF GitHub Bot logged work on HIVE-24753:
-

Author: ASF GitHub Bot
Created on: 15/Jun/21 00:09
Start Date: 15/Jun/21 00:09
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #2017:
URL: https://github.com/apache/hive/pull/2017


   




Issue Time Tracking
---

Worklog Id: (was: 611051)
Time Spent: 2.5h  (was: 2h 20m)



[jira] [Work logged] (HIVE-24753) Non blocking DROP PARTITION implementation

2021-02-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24753?focusedWorklogId=557298&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-557298
 ]

ASF GitHub Bot logged work on HIVE-24753:
-

Author: ASF GitHub Bot
Created on: 24/Feb/21 20:23
Start Date: 24/Feb/21 20:23
Worklog Time Spent: 10m 
  Work Description: zchovan opened a new pull request #2017:
URL: https://github.com/apache/hive/pull/2017


   Change-Id: Ie989ced8a1ef88e397d4d310628143cdb53ee8ca
   
   
   
   ### What changes were proposed in this pull request?
   
   This makes the drop partition operation asynchronous. The data files of 
transactional tables are not deleted at drop time; instead, a truncated base 
file is written, which the Compactor/Cleaner cleans up later.
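
   As a rough illustration of the idea above (not the code from this pull request), the sketch below writes an empty base directory into a partition directory on the local filesystem instead of deleting the partition's files. The class name, the `writeEmptyBase` helper, and the `base_` plus zero-padded write ID naming are assumptions made for the example; the actual patch works against the Hadoop `FileSystem` API and its own helper.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class TruncatedBaseSketch {

    // Hypothetical helper: create an empty base_<writeId> directory inside
    // the partition directory. Existing delta directories are left in place;
    // the assumption is that a later cleanup pass removes them.
    public static Path writeEmptyBase(Path partDir, long writeId) throws IOException {
        // The zero-padded naming is an assumption made for this sketch.
        Path base = partDir.resolve(String.format("base_%07d", writeId));
        return Files.createDirectories(base);
    }

    public static void main(String[] args) throws IOException {
        Path partDir = Files.createTempDirectory("part");
        // Simulate data written before the drop.
        Files.createDirectories(partDir.resolve("delta_0000001_0000001"));
        Path base = writeEmptyBase(partDir, 2L);
        System.out.println("marker written: " + base.getFileName());
        System.out.println("old delta still present: "
            + Files.exists(partDir.resolve("delta_0000001_0000001")));
    }
}
```

   The point of the sketch is only that the drop itself touches no existing data files, so running readers are not disturbed; removal of the old files is deferred.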
   
   ### Why are the changes needed?
   
   This, along with a few other changes, will enable us to stop using read 
locks, which should improve the performance of transactional tables.
   
   ### Does this PR introduce _any_ user-facing change?
   
   No
   
   ### How was this patch tested?
   
   Added tests to the TestTxnCommands
   





Issue Time Tracking
---

Worklog Id: (was: 557298)
Remaining Estimate: 0h
Time Spent: 10m



[jira] [Work logged] (HIVE-24753) Non blocking DROP PARTITION implementation

2021-02-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24753?focusedWorklogId=557299&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-557299
 ]

ASF GitHub Bot logged work on HIVE-24753:
-

Author: ASF GitHub Bot
Created on: 24/Feb/21 20:24
Start Date: 24/Feb/21 20:24
Worklog Time Spent: 10m 
  Work Description: zchovan commented on pull request #2017:
URL: https://github.com/apache/hive/pull/2017#issuecomment-785350797


   cc @deniskuzZ @pvargacl 





Issue Time Tracking
---

Worklog Id: (was: 557299)
Time Spent: 20m  (was: 10m)



[jira] [Work logged] (HIVE-24753) Non blocking DROP PARTITION implementation

2021-02-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24753?focusedWorklogId=557766&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-557766
 ]

ASF GitHub Bot logged work on HIVE-24753:
-

Author: ASF GitHub Bot
Created on: 25/Feb/21 07:33
Start Date: 25/Feb/21 07:33
Worklog Time Spent: 10m 
  Work Description: pvargacl commented on a change in pull request #2017:
URL: https://github.com/apache/hive/pull/2017#discussion_r582602457



##
File path: 
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java
##
@@ -687,6 +687,8 @@ public static ConfVars getMetaConf(String name) {
 "hive-metastore/_h...@example.com",
 "The service principal for the metastore Thrift server. \n" +
 "The special string _HOST will be replaced automatically with the correct host name."),
+LOCKLESS_READS_ENABLED("metastore.lockless.reads.enabled", "metastore.lockless.reads.enabled",

Review comment:
   I think there should be a separate config just for drop partition.







Issue Time Tracking
---

Worklog Id: (was: 557766)
Time Spent: 0.5h  (was: 20m)



[jira] [Work logged] (HIVE-24753) Non blocking DROP PARTITION implementation

2021-02-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24753?focusedWorklogId=557767&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-557767
 ]

ASF GitHub Bot logged work on HIVE-24753:
-

Author: ASF GitHub Bot
Created on: 25/Feb/21 07:36
Start Date: 25/Feb/21 07:36
Worklog Time Spent: 10m 
  Work Description: pvargacl commented on a change in pull request #2017:
URL: https://github.com/apache/hive/pull/2017#discussion_r582603575



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/AcidEventListener.java
##
@@ -72,6 +77,22 @@ public void onDropPartition(DropPartitionEvent partitionEvent)  throws MetaExcep
   txnHandler = getTxnHandler();
   txnHandler.cleanupRecords(HiveObjectType.PARTITION, null, partitionEvent.getTable(),
       partitionEvent.getPartitionIterator());
+
+  if (MetastoreConf.getBoolVar(conf, ConfVars.LOCKLESS_READS_ENABLED)) {
+    CompactionRequest rqst = new CompactionRequest(partitionEvent.getTable().getDbName(), partitionEvent.getTable().getTableName(),

Review comment:
   Is this going to work? If the partition record is dropped, I think the 
compaction will automatically fail. There should be a test for this.
   I think you should just create a compaction record that is in the 
ready_for_cleaning state. And since there is a base file, I would prefer a 
major compaction, but it does not really matter.







Issue Time Tracking
---

Worklog Id: (was: 557767)
Time Spent: 40m  (was: 0.5h)



[jira] [Work logged] (HIVE-24753) Non blocking DROP PARTITION implementation

2021-02-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24753?focusedWorklogId=557768&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-557768
 ]

ASF GitHub Bot logged work on HIVE-24753:
-

Author: ASF GitHub Bot
Created on: 25/Feb/21 07:38
Start Date: 25/Feb/21 07:38
Worklog Time Spent: 10m 
  Work Description: pvargacl commented on a change in pull request #2017:
URL: https://github.com/apache/hive/pull/2017#discussion_r582604658



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java
##
@@ -4865,10 +4871,26 @@ private boolean drop_partition_common(RawStore ms, String catName, String db_nam
 
   if (isArchived) {
     assert (archiveParentDir != null);
-    wh.deleteDir(archiveParentDir, true, mustPurge, needsCm);
+    if (writeTruncatedBase) {
+      try {
+        addTruncateBaseFile(archiveParentDir, writeId, archiveParentDir.getFileSystem(getConf()));
+      } catch (Exception e) {
+        throw newMetaException(e);
+      }
+    } else {
+      wh.deleteDir(archiveParentDir, true, mustPurge, needsCm);
+    }
   } else {
     assert (partPath != null);
-    wh.deleteDir(partPath, true, mustPurge, needsCm);
+    if (writeTruncatedBase) {
+      try {
+        addTruncateBaseFile(partPath, writeId, archiveParentDir.getFileSystem(getConf()));

Review comment:
   The metadata file carries type information. I think you should not use the 
truncate type, but rather create a drop type.
   That info might be useful when you do the cleanup and have to decide whether 
to delete the whole partition directory or not.







Issue Time Tracking
---

Worklog Id: (was: 557768)
Time Spent: 50m  (was: 40m)



[jira] [Work logged] (HIVE-24753) Non blocking DROP PARTITION implementation

2021-02-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24753?focusedWorklogId=557771&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-557771
 ]

ASF GitHub Bot logged work on HIVE-24753:
-

Author: ASF GitHub Bot
Created on: 25/Feb/21 07:43
Start Date: 25/Feb/21 07:43
Worklog Time Spent: 10m 
  Work Description: pvargacl commented on pull request #2017:
URL: https://github.com/apache/hive/pull/2017#issuecomment-785691173


   I am missing some features. Should these be part of this PR?
   
   -  Shouldn't the lock type be changed from exclusive to excl_write if we use 
this kind of drop?
   -  Shouldn't the Cleaner later delete the whole partition directory if it was 
not recreated?
   -  I think you should add many tests around drop and recreate: what happens 
if you recreate in different scenarios (while there is an old read still 
running, when you overlap with the Cleaner, when it was already cleaned up, ...)?





Issue Time Tracking
---

Worklog Id: (was: 557771)
Time Spent: 1h  (was: 50m)



[jira] [Work logged] (HIVE-24753) Non blocking DROP PARTITION implementation

2021-02-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24753?focusedWorklogId=557778&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-557778
 ]

ASF GitHub Bot logged work on HIVE-24753:
-

Author: ASF GitHub Bot
Created on: 25/Feb/21 07:47
Start Date: 25/Feb/21 07:47
Worklog Time Spent: 10m 
  Work Description: pvargacl commented on a change in pull request #2017:
URL: https://github.com/apache/hive/pull/2017#discussion_r582609667



##
File path: 
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/PartitionDropOptions.java
##
@@ -27,6 +27,8 @@
   public boolean ifExists = false;
   public boolean returnResults = true;
   public boolean purgeData = false;
+  public String validWriteIds;

Review comment:
   Is the validWriteIds used anywhere?







Issue Time Tracking
---

Worklog Id: (was: 557778)
Time Spent: 1h 10m  (was: 1h)



[jira] [Work logged] (HIVE-24753) Non blocking DROP PARTITION implementation

2021-02-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24753?focusedWorklogId=557782&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-557782
 ]

ASF GitHub Bot logged work on HIVE-24753:
-

Author: ASF GitHub Bot
Created on: 25/Feb/21 07:55
Start Date: 25/Feb/21 07:55
Worklog Time Spent: 10m 
  Work Description: pvargacl edited a comment on pull request #2017:
URL: https://github.com/apache/hive/pull/2017#issuecomment-785691173


   I am missing some features. Should these be part of this PR?
   
   -  Shouldn't the lock type be changed from exclusive to excl_write if we use 
this kind of drop?
   -  Shouldn't the Cleaner later delete the whole partition directory if it was 
not recreated?
   -  I think you should add many tests around drop and recreate: what happens 
if you recreate in different scenarios (while there is an old read still 
running, when you overlap with the Cleaner, when it was already cleaned up, ...)?
   -  I think there will be a race condition between the Cleaner and create 
partition. What happens if the Cleaner starts to run, checks whether the 
partition was dropped, and decides it has to delete the partition directory, 
but then a create partition comes in and a new query starts to write a new 
delta? This should be handled.
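
   One way to picture the race described in the last point is the in-memory sketch below, in which the Cleaner's existence check and its delete happen under the same lock that create partition takes, so a recreate cannot slip in between the two steps. Everything here is hypothetical: the class, the method names, and the use of a `ReentrantLock` are stand-ins for illustration only, and a real fix in Hive would presumably rely on metastore locks or transactions rather than an in-process lock.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.locks.ReentrantLock;

public class DropRecreateRaceSketch {
    private final ReentrantLock lock = new ReentrantLock();
    // Stand-in for partition records in the metastore.
    private final Set<String> partitionRecords = new HashSet<>();
    // Stand-in for partition directories on the filesystem.
    private final Set<String> partitionDirs = new HashSet<>();

    public void createPartition(String name) {
        lock.lock();
        try {
            partitionRecords.add(name);
            partitionDirs.add(name);
        } finally {
            lock.unlock();
        }
    }

    // Non-blocking drop: only the record goes away, the directory stays.
    public void dropPartition(String name) {
        lock.lock();
        try {
            partitionRecords.remove(name);
        } finally {
            lock.unlock();
        }
    }

    // Cleaner step: check and delete atomically with respect to create.
    // Returns true only if the directory was actually removed.
    public boolean cleanIfStillDropped(String name) {
        lock.lock();
        try {
            if (partitionRecords.contains(name)) {
                return false; // partition was recreated, keep the directory
            }
            return partitionDirs.remove(name);
        } finally {
            lock.unlock();
        }
    }

    public boolean dirExists(String name) {
        lock.lock();
        try {
            return partitionDirs.contains(name);
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        DropRecreateRaceSketch s = new DropRecreateRaceSketch();
        s.createPartition("p=1");
        s.dropPartition("p=1");
        s.createPartition("p=1"); // recreate before the Cleaner runs
        System.out.println("cleaned: " + s.cleanIfStillDropped("p=1"));
        System.out.println("dir exists: " + s.dirExists("p=1"));
    }
}
```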





Issue Time Tracking
---

Worklog Id: (was: 557782)
Time Spent: 1h 20m  (was: 1h 10m)



[jira] [Work logged] (HIVE-24753) Non blocking DROP PARTITION implementation

2021-02-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24753?focusedWorklogId=557792&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-557792
 ]

ASF GitHub Bot logged work on HIVE-24753:
-

Author: ASF GitHub Bot
Created on: 25/Feb/21 08:26
Start Date: 25/Feb/21 08:26
Worklog Time Spent: 10m 
  Work Description: zchovan commented on a change in pull request #2017:
URL: https://github.com/apache/hive/pull/2017#discussion_r582631700



##
File path: 
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/PartitionDropOptions.java
##
@@ -27,6 +27,8 @@
   public boolean ifExists = false;
   public boolean returnResults = true;
   public boolean purgeData = false;
+  public String validWriteIds;

Review comment:
   no, that was left in by mistake







Issue Time Tracking
---

Worklog Id: (was: 557792)
Time Spent: 1.5h  (was: 1h 20m)



[jira] [Work logged] (HIVE-24753) Non blocking DROP PARTITION implementation

2021-02-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24753?focusedWorklogId=557793&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-557793
 ]

ASF GitHub Bot logged work on HIVE-24753:
-

Author: ASF GitHub Bot
Created on: 25/Feb/21 08:28
Start Date: 25/Feb/21 08:28
Worklog Time Spent: 10m 
  Work Description: zchovan commented on a change in pull request #2017:
URL: https://github.com/apache/hive/pull/2017#discussion_r582632897



##
File path: 
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java
##
@@ -687,6 +687,8 @@ public static ConfVars getMetaConf(String name) {
 "hive-metastore/_h...@example.com",
 "The service principal for the metastore Thrift server. \n" +
 "The special string _HOST will be replaced automatically with the correct host name."),
+LOCKLESS_READS_ENABLED("metastore.lockless.reads.enabled", "metastore.lockless.reads.enabled",

Review comment:
   Do you mean there should be one main config (e.g. this one) and an 
additional one for each related operation (dropTable/partition/etc.), or no main 
one and one for each op?







Issue Time Tracking
---

Worklog Id: (was: 557793)
Time Spent: 1h 40m  (was: 1.5h)



[jira] [Work logged] (HIVE-24753) Non blocking DROP PARTITION implementation

2021-02-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24753?focusedWorklogId=557798&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-557798
 ]

ASF GitHub Bot logged work on HIVE-24753:
-

Author: ASF GitHub Bot
Created on: 25/Feb/21 08:36
Start Date: 25/Feb/21 08:36
Worklog Time Spent: 10m 
  Work Description: zchovan commented on a change in pull request #2017:
URL: https://github.com/apache/hive/pull/2017#discussion_r582638123



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java
##
@@ -4865,10 +4871,26 @@ private boolean drop_partition_common(RawStore ms, String catName, String db_nam
 
   if (isArchived) {
     assert (archiveParentDir != null);
-    wh.deleteDir(archiveParentDir, true, mustPurge, needsCm);
+    if (writeTruncatedBase) {
+      try {
+        addTruncateBaseFile(archiveParentDir, writeId, archiveParentDir.getFileSystem(getConf()));
+      } catch (Exception e) {
+        throw newMetaException(e);
+      }
+    } else {
+      wh.deleteDir(archiveParentDir, true, mustPurge, needsCm);
+    }
   } else {
     assert (partPath != null);
-    wh.deleteDir(partPath, true, mustPurge, needsCm);
+    if (writeTruncatedBase) {
+      try {
+        addTruncateBaseFile(partPath, writeId, archiveParentDir.getFileSystem(getConf()));

Review comment:
   good idea, will do







Issue Time Tracking
---

Worklog Id: (was: 557798)
Time Spent: 1h 50m  (was: 1h 40m)



[jira] [Work logged] (HIVE-24753) Non blocking DROP PARTITION implementation

2021-02-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24753?focusedWorklogId=557804&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-557804
 ]

ASF GitHub Bot logged work on HIVE-24753:
-

Author: ASF GitHub Bot
Created on: 25/Feb/21 08:56
Start Date: 25/Feb/21 08:56
Worklog Time Spent: 10m 
  Work Description: zchovan commented on pull request #2017:
URL: https://github.com/apache/hive/pull/2017#issuecomment-785731823


   @pvargacl 
   The Cleaner changes are planned for a separate commit. I agree that the 
scenarios you've mentioned have to be tested, but without the final Cleaner 
changes they don't really make sense here. 
   The way I see it, the Cleaner first checks whether the partition still exists 
in the HMS. If it doesn't, then the partition has not yet been recreated and the 
whole location dir can be deleted, no compaction needed.
   If the partition exists, that means that between the dropPartition and the 
compaction's start the partition was recreated and should be compacted, i.e. the 
files created before the truncated/deleted base file was written can be 
compacted/deleted.
   This still leaves the last scenario, where the Cleaner is already running and 
the partition is recreated, so yeah, that should be checked and tested.
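
   The decision described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the planned Cleaner change: the class, the `Action` enum, and both helper methods are hypothetical, and directory names are assumed to have the simple form `base_N` or `delta_min_max`, so that the last numeric token is the highest write ID the directory covers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CleanerDecisionSketch {

    public enum Action { DELETE_PARTITION_DIR, REMOVE_OBSOLETE_FILES }

    // If the partition record is gone, the partition was not recreated and
    // the whole directory can go; otherwise only older files are removed.
    public static Action decide(boolean partitionExistsInHms) {
        return partitionExistsInHms
            ? Action.REMOVE_OBSOLETE_FILES
            : Action.DELETE_PARTITION_DIR;
    }

    // Assumes names like base_0000002 or delta_0000001_0000001.
    static long maxWriteId(String dirName) {
        String[] parts = dirName.split("_");
        return Long.parseLong(parts[parts.length - 1]);
    }

    // Directories written before the marker base are treated as obsolete.
    public static List<String> obsoleteDirs(List<String> dirNames, long markerWriteId) {
        List<String> obsolete = new ArrayList<>();
        for (String name : dirNames) {
            if (maxWriteId(name) < markerWriteId) {
                obsolete.add(name);
            }
        }
        return obsolete;
    }

    public static void main(String[] args) {
        List<String> dirs = Arrays.asList(
            "delta_0000001_0000001", "base_0000002", "delta_0000003_0000003");
        System.out.println(decide(false));
        System.out.println(decide(true) + " " + obsoleteDirs(dirs, 2L));
    }
}
```

   The sketch does not cover the remaining scenario mentioned above, where the partition is recreated while the Cleaner is already running.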





Issue Time Tracking
---

Worklog Id: (was: 557804)
Time Spent: 2h  (was: 1h 50m)



[jira] [Work logged] (HIVE-24753) Non blocking DROP PARTITION implementation

2021-02-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24753?focusedWorklogId=557807&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-557807
 ]

ASF GitHub Bot logged work on HIVE-24753:
-

Author: ASF GitHub Bot
Created on: 25/Feb/21 09:16
Start Date: 25/Feb/21 09:16
Worklog Time Spent: 10m 
  Work Description: pvargacl commented on pull request #2017:
URL: https://github.com/apache/hive/pull/2017#issuecomment-785744497


   > @pvargacl
   > The Cleaner changes are planned for a separate commit. I agree that the 
scenarios you've mentioned have to be tested, but without the final Cleaner 
changes they don't really make sense here.
   > The way I see it, the Cleaner first checks whether the partition still 
exists in the HMS. If it doesn't, then the partition has not yet been recreated 
and the whole location dir can be deleted, no compaction needed.
   > If the partition exists, that means that between the dropPartition and the 
compaction's start the partition was recreated and should be compacted, i.e. the 
files created before the truncated/deleted base file was written can be 
compacted/deleted.
   > This still leaves the last scenario, where the Cleaner is already running 
and the partition is recreated, so yeah, that should be checked and tested.
   
   I don't think this can go in without some basic Cleaner change. Even if it 
does not delete the partition directory, you have to handle the case where the 
partition record is missing, otherwise the Cleaner will just fail.





Issue Time Tracking
---

Worklog Id: (was: 557807)
Time Spent: 2h 10m  (was: 2h)
