[jira] [Commented] (HIVE-14626) Support Trash in Truncate Table
[ https://issues.apache.org/jira/browse/HIVE-14626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15435944#comment-15435944 ]

Chaoyu Tang commented on HIVE-14626:
------------------------------------

The patch has been uploaded to https://reviews.apache.org/r/51395/ and a review has been requested. Thanks in advance.

> Support Trash in Truncate Table
> -------------------------------
>
>                 Key: HIVE-14626
>                 URL: https://issues.apache.org/jira/browse/HIVE-14626
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Query Processor
>            Reporter: Chaoyu Tang
>            Assignee: Chaoyu Tang
>            Priority: Minor
>         Attachments: HIVE-14626.patch
>
>
> Currently, Truncate Table (or Partition) is implemented by calling
> FileSystem.delete and then recreating the directory, so:
> 1. it does not support HDFS Trash
> 2. if the table/partition directory is initially encryption protected, after
> being deleted and recreated, it is no longer protected.
> The new implementation clears the contents of the directory using
> multi-threaded trashFiles. If Trash is enabled and has a lower encryption
> level than the data directory, the files under the data directory will be
> deleted. Otherwise, they will be Trashed.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
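The behaviour described in the issue can be illustrated with a small local-filesystem sketch. This is not Hive's code: `truncate_dir`, the integer `data_level`/`trash_level` values, and the use of `pathlib`/`shutil` in place of the Hadoop `FileSystem` API are all assumptions made for illustration. It shows only the idea of clearing a directory's children in parallel while leaving the directory itself in place, and of bypassing trash when trash is less protected than the data. Treating "trash disabled" as a plain delete is also an assumption.

```python
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Dict, Optional


def _remove(entry: Path, trash_dir: Optional[Path]) -> str:
    """Move one entry into trash_dir, or delete it when trash_dir is None."""
    if trash_dir is not None:
        # Name clashes inside the trash directory are not handled here.
        shutil.move(str(entry), str(trash_dir / entry.name))
        return "trashed"
    if entry.is_dir() and not entry.is_symlink():
        shutil.rmtree(entry)
    else:
        entry.unlink()
    return "deleted"


def truncate_dir(data_dir: Path, trash_dir: Optional[Path] = None,
                 data_level: int = 0, trash_level: int = 0,
                 workers: int = 4) -> Dict[str, str]:
    """Empty data_dir without removing data_dir itself.

    Hypothetical model: trash_dir=None means trash is disabled, and the
    two *_level integers stand in for encryption strength.
    """
    # Skip trash when it is less protected than the data (per the issue
    # description); with trash disabled we simply delete (assumption).
    if trash_dir is not None and trash_level < data_level:
        trash_dir = None
    if trash_dir is not None:
        trash_dir.mkdir(parents=True, exist_ok=True)
    entries = list(data_dir.iterdir())
    # Each child is handled by a worker thread; the parent directory is
    # never deleted, so whatever attributes it carries are left untouched.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        outcomes = list(pool.map(lambda e: _remove(e, trash_dir), entries))
    return {e.name: o for e, o in zip(entries, outcomes)}
```

Because only the children are removed, the directory is never recreated, which is the property the issue relies on to keep an encryption-protected directory protected.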
[jira] [Commented] (HIVE-14626) Support Trash in Truncate Table
[ https://issues.apache.org/jira/browse/HIVE-14626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15436340#comment-15436340 ]

Hive QA commented on HIVE-14626:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12825364/HIVE-14626.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 10459 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_mapjoin]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ctas]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_join_part_col_char]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[acid_bucket_pruning]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3]
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver[schemeAuthority]
org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler
org.apache.hive.service.cli.operation.TestOperationLoggingLayout.testSwitchLogLayout
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/981/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/981/console
Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-981/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12825364 - PreCommit-HIVE-MASTER-Build
[jira] [Commented] (HIVE-14626) Support Trash in Truncate Table
[ https://issues.apache.org/jira/browse/HIVE-14626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15437128#comment-15437128 ]

Chaoyu Tang commented on HIVE-14626:
------------------------------------

The new failed tests (schemeAuthority & TestPigHBaseStorageHandler) do not seem to be related to this patch. I could not reproduce the schemeAuthority failure on my local machine. As for TestPigHBaseStorageHandler, the Yarn server was started and the failure is not related. The other failures are aged and are not related to this patch either.
[~spena], [~ychena] Could you review the patch? Thanks
[jira] [Commented] (HIVE-14626) Support Trash in Truncate Table
[ https://issues.apache.org/jira/browse/HIVE-14626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15437258#comment-15437258 ]

Sergio Peña commented on HIVE-14626:
------------------------------------

Isn't deleting one file at a time slower than deleting the directory? For blobstore systems, like S3, this may add performance overhead, as Hive.trashFile() is calling listStatus() again.

Is there a way to detect whether the location is encrypted and, if it is not, just delete the directory?
[jira] [Commented] (HIVE-14626) Support Trash in Truncate Table
[ https://issues.apache.org/jira/browse/HIVE-14626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15468005#comment-15468005 ]

Chaoyu Tang commented on HIVE-14626:
------------------------------------

Thanks, [~spena], for the review. I am not quite sure about Hive on S3: does it support something like encryption zones and trash? If not, will it be possible to delete/recreate the directory in S3?
[jira] [Commented] (HIVE-14626) Support Trash in Truncate Table
[ https://issues.apache.org/jira/browse/HIVE-14626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15468260#comment-15468260 ]

Sergio Peña commented on HIVE-14626:
------------------------------------

S3 is treated as another filesystem by Hadoop, and it does not have encryption zones, but trash may be allowed (but not recommended).
Yes, I think you can delete/recreate the directory on S3. That should work.
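The fast path raised in this exchange (delete and recreate the directory when the location is not encrypted, and fall back to clearing its contents otherwise) might look roughly like the sketch below. It is only an illustration on the local filesystem and makes no claim about what the committed patch does: `truncate`, the `is_encrypted` flag, and the `clear_contents` callback are hypothetical names, and how encryption would actually be detected is left out.

```python
import shutil
from pathlib import Path
from typing import Callable


def truncate(data_dir: Path, is_encrypted: bool,
             clear_contents: Callable[[Path], None]) -> str:
    """Pick a truncate strategy based on whether data_dir is encrypted.

    is_encrypted is supplied by the caller (hypothetical); on a real
    cluster it would have to come from the filesystem.
    """
    if is_encrypted:
        # Keep the directory itself and remove only what is inside it.
        clear_contents(data_dir)
        return "cleared"
    # Unencrypted location: one recursive delete, then recreate the
    # (now empty) directory. Permissions of the old directory are not
    # carried over in this sketch.
    shutil.rmtree(data_dir)
    data_dir.mkdir()
    return "recreated"
```

The trade-off is the one noted in the comments above: the recursive delete avoids per-file calls, which matters most on blobstores, while the per-entry path is what keeps an encrypted directory in place.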
[jira] [Commented] (HIVE-14626) Support Trash in Truncate Table
[ https://issues.apache.org/jira/browse/HIVE-14626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15470771#comment-15470771 ]

Hive QA commented on HIVE-14626:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12827362/HIVE-14626.1.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10453 tests executed

*Failed tests:*
{noformat}
TestBeeLineWithArgs - did not produce a TEST-*.xml file
TestHiveCli - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_join_part_col_char]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[acid_bucket_pruning]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3]
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/1120/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/1120/console
Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-1120/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12827362 - PreCommit-HIVE-MASTER-Build
[jira] [Commented] (HIVE-14626) Support Trash in Truncate Table
[ https://issues.apache.org/jira/browse/HIVE-14626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15470792#comment-15470792 ]

Chaoyu Tang commented on HIVE-14626:
------------------------------------

The failed tests are not related to this patch; they are aged.
[jira] [Commented] (HIVE-14626) Support Trash in Truncate Table
[ https://issues.apache.org/jira/browse/HIVE-14626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15470814#comment-15470814 ]

Sergio Peña commented on HIVE-14626:
------------------------------------

Looks good [~ctang.ma]
+1
[jira] [Commented] (HIVE-14626) Support Trash in Truncate Table
[ https://issues.apache.org/jira/browse/HIVE-14626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15475786#comment-15475786 ]

Lefty Leverenz commented on HIVE-14626:
---------------------------------------

Should this change be documented in the wiki?

* [DDL -- Truncate Table | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-TruncateTable]

Also, how do Hive tables/partitions get encrypted? A search of the wiki for "encrypt" returned only one result, for the UDF aes_encrypt(). Is there also a table property?

* [Hive Operators and Functions -- Misc. Functions -- aes_encrypt | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-Misc.Functions]

> Support Trash in Truncate Table
> -------------------------------
>
>             Fix For: 2.2.0
>         Attachments: HIVE-14626.1.patch, HIVE-14626.patch
[jira] [Commented] (HIVE-14626) Support Trash in Truncate Table
[ https://issues.apache.org/jira/browse/HIVE-14626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15477287#comment-15477287 ]

Chaoyu Tang commented on HIVE-14626:
------------------------------------

[~leftylev] I updated the wiki to reflect this change. As for encryption, it refers to HDFS encryption zones; you can find the details in HIVE-8065.
[jira] [Commented] (HIVE-14626) Support Trash in Truncate Table
[ https://issues.apache.org/jira/browse/HIVE-14626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15481159#comment-15481159 ]

Lefty Leverenz commented on HIVE-14626:
---------------------------------------

Thanks for the doc and the jira for encryption, [~ctang.ma]. I had forgotten about HIVE-8065, which still needs to be documented in the wiki.

I added version information in the Truncate Table section, with a link to this issue. Should we also explain what the behavior was before this patch?
[jira] [Commented] (HIVE-14626) Support Trash in Truncate Table
[ https://issues.apache.org/jira/browse/HIVE-14626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15482740#comment-15482740 ]

Chaoyu Tang commented on HIVE-14626:
------------------------------------

Thanks, [~leftylev]. The patch enhances Truncate with Trash support and raises no backward-compatibility issue, so I do not think we need to explain the behavior before this patch.
[jira] [Commented] (HIVE-14626) Support Trash in Truncate Table
[ https://issues.apache.org/jira/browse/HIVE-14626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15483299#comment-15483299 ]

Lefty Leverenz commented on HIVE-14626:
---------------------------------------

Okay, thanks.