[ 
https://issues.apache.org/jira/browse/HIVE-7100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14122135#comment-14122135
 ] 

Mithun Radhakrishnan commented on HIVE-7100:
--------------------------------------------

Hey, David, et al. I've taken a look at the patch you have so far. (You should 
see some comments on RB.) Thanks for working on solving this.

In its current form, patch(.5) attempts to solve the problem for dropTable(), 
and leaves a TODO for dropPartitions(). I'd like very much to see the solution 
extended to dropPartitions(). Did you run into something hard in the 
partitions case? (The {{HCatClient}} API would need to expose PURGE as an 
option. That won't be difficult.)

Question: Would it be possible to introduce a PURGE.default parameter to 
TBLPROPERTIES for a table?
I have users that face the same problem as the one you're solving, but in the 
context of dropPartitions. While I approve of the ability to 
dropPartitions(purge=true) on a per-call basis, I'd also like the ability to 
choose the default drop action (used when ifPurge isn't set) at the per-table 
level. 
This way:
# Table-owners can decide whether to spam their ~/.Trash on drop.
# The user wouldn't need to change their Hive script (or Oozie action, or 
HCatClient call) to be able to skip the trash.
# AFAICT, it'll not conflict with Sushanth's work on HIVE-6465, which might 
just store the new table-semantics in TBLPROPERTIES.

I don't think the protocol needs to be complicated:
|| Use-case || {{dropTable(purge=<unset>)}} || {{dropTable(purge=true)}} ||
| Default (e.g. pre-existing tables) | Dropped data goes to ~/.Trash | Trash skipped |
| Tables with PURGE.default=true | Trash skipped | Trash skipped |

When HiveQL language support is added, {{DROP TABLE my_table PURGE}} will call 
{{dropTable(purge=true)}}, and behave identically.
{{dropPartitions()}} would work in similar fashion.
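
To make the table concrete, here is a rough sketch of how the decision could be made. The names {{PurgeDefaults}} and {{skipTrash()}} are made up for illustration and are not existing Hive or HCatClient API. The property key is simply the one proposed above, and the sketch only distinguishes purge=true from purge left unset, since the table doesn't cover an explicit purge=false.

```java
import java.util.Collections;
import java.util.Map;

/** Hypothetical helper for illustration only; not part of Hive or HCatalog. */
class PurgeDefaults {
    // Key as proposed in this comment; the final property name is not settled.
    static final String PURGE_DEFAULT_KEY = "PURGE.default";

    /**
     * Decides whether dropped data should bypass ~/.Trash.
     *
     * @param purgeRequested true if the caller passed purge=true,
     *                       false if purge was left unset
     * @param tblProperties  the table's TBLPROPERTIES (may be null)
     */
    static boolean skipTrash(boolean purgeRequested, Map<String, String> tblProperties) {
        if (purgeRequested) {
            return true; // An explicit purge always skips the trash.
        }
        if (tblProperties == null) {
            return false; // No properties: keep the existing Trash behaviour.
        }
        // parseBoolean(null) is false, so tables without the property
        // (e.g. pre-existing tables) keep sending data to ~/.Trash.
        return Boolean.parseBoolean(tblProperties.get(PURGE_DEFAULT_KEY));
    }

    public static void main(String[] args) {
        Map<String, String> plain = Collections.emptyMap();
        Map<String, String> purgeByDefault =
                Collections.singletonMap(PURGE_DEFAULT_KEY, "true");

        // Prints one line per cell of the table above.
        System.out.println("default table, purge unset: " + skipTrash(false, plain));
        System.out.println("default table, purge=true:  " + skipTrash(true, plain));
        System.out.println("PURGE.default, purge unset: " + skipTrash(false, purgeByDefault));
        System.out.println("PURGE.default, purge=true:  " + skipTrash(true, purgeByDefault));
    }
}
```

The same check would presumably sit in front of both dropTable() and dropPartitions(), so that the per-call flag and the table-level default are resolved in one place.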

> Users of hive should be able to specify skipTrash when dropping tables.
> -----------------------------------------------------------------------
>
>                 Key: HIVE-7100
>                 URL: https://issues.apache.org/jira/browse/HIVE-7100
>             Project: Hive
>          Issue Type: Improvement
>    Affects Versions: 0.13.0
>            Reporter: Ravi Prakash
>            Assignee: Jayesh
>         Attachments: HIVE-7100.1.patch, HIVE-7100.2.patch, HIVE-7100.3.patch, 
> HIVE-7100.4.patch, HIVE-7100.5.patch, HIVE-7100.patch
>
>
> Users of our clusters are often running up against their quota limits because 
> of Hive tables. When they drop tables, they have to then manually delete the 
> files from HDFS using skipTrash. This is cumbersome and unnecessary. We 
> should enable users to skipTrash directly when dropping tables.
> We should also be able to provide this functionality without polluting SQL 
> syntax.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
