[ 
https://issues.apache.org/jira/browse/HIVE-2219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13052941#comment-13052941
 ] 

jirapos...@reviews.apache.org commented on HIVE-2219:
-----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/941/
-----------------------------------------------------------

Review request for hive and Paul Yang.


Summary
-------

Improve the efficiency of the function that handles dropping multiple 
partitions by finding the partitions to drop at the JDO level instead of 
iterating through all given partitions and existing partitions.


This addresses bug HIVE-2219.
    https://issues.apache.org/jira/browse/HIVE-2219


Diffs
-----

  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 1138144 

Diff: https://reviews.apache.org/r/941/diff


Testing
-------

Still passes drop_multi_partitions.q.  Tested speed by dropping ~10k partitions 
from a table with ~400k partitions.  This section of code took ~10 minutes after 
the change, versus more than 30 minutes before.


Thanks,

Sohan



> Make "alter table drop partition" more efficient
> ------------------------------------------------
>
>                 Key: HIVE-2219
>                 URL: https://issues.apache.org/jira/browse/HIVE-2219
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Sohan Jain
>            Assignee: Sohan Jain
>         Attachments: HIVE-2219.1.patch
>
>
> The current function dropTable() that handles dropping multiple partitions is 
> somewhat inefficient.  For each partition you want to drop, it loops through 
> each partition in the table to see if the partition exists.  This is an 
> _O(mn)_ operation, where _m_ is the number of partitions to drop, and _n_ is 
> the number of partitions in the table.  The running time of this function can 
> be improved, which is useful for tables with many partitions.
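
To make the O(mn) cost concrete: with m = ~10k partitions to drop and n = ~400k 
partitions in the table, a nested scan can need up to 10,000 x 400,000 = 4 x 10^9 
comparisons.  The sketch below is only an illustration of that access pattern and 
of one generic way to avoid it.  It is not the code in DDLTask.java and not the 
JDO-level lookup used by the patch: the class and method names are hypothetical, 
partitions are modelled as plain key/value maps, and an in-memory hash index 
stands in for the metastore lookup.  It also assumes every spec names a full 
partition, so partial specs that match several partitions are not handled.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration; none of these names come from Hive.
final class DropPartitionSketch {

    // O(m*n): for each requested spec, scan the existing partitions.
    static List<Map<String, String>> matchByScan(
            List<Map<String, String>> toDrop,
            List<Map<String, String>> existing) {
        List<Map<String, String>> found = new ArrayList<>();
        for (Map<String, String> spec : toDrop) {
            for (Map<String, String> part : existing) {
                if (part.equals(spec)) {
                    found.add(part);
                    break; // a full spec matches at most one partition
                }
            }
        }
        return found;
    }

    // Expected O(m + n): index the existing partitions once, then probe.
    static List<Map<String, String>> matchByIndex(
            List<Map<String, String>> toDrop,
            List<Map<String, String>> existing) {
        Set<Map<String, String>> index = new HashSet<>(existing);
        List<Map<String, String>> found = new ArrayList<>();
        for (Map<String, String> spec : toDrop) {
            if (index.contains(spec)) {
                found.add(spec);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        List<Map<String, String>> existing = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            existing.add(Map.of("ds", "day" + i));
        }
        List<Map<String, String>> toDrop = List.of(
                Map.of("ds", "day3"),
                Map.of("ds", "day999"),
                Map.of("ds", "missing")); // not in the table, so skipped
        // Both strategies select the same partitions.
        System.out.println(
                matchByScan(toDrop, existing).equals(matchByIndex(toDrop, existing)));
    }
}
```

Both methods return the same partitions; only the number of comparisons differs.  
Pushing the lookup down to the metastore, as the patch does, follows the same 
idea of not walking the full partition list once per requested partition.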

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
