[
https://issues.apache.org/jira/browse/HIVE-1699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ning Zhang updated HIVE-1699:
-----------------------------
Attachment: HIVE-1699.patch
This patch includes the following changes:
1) Correctly prune partitions based on the partition specification in the
ANALYZE TABLE command.
2) Add a Hive.getPartitionsByNames() method to get a list of partitions
based on their names. Previously we had to use Hive.getPartitions, which gets
all partitions as Partition objects, and then filter out the partitions that do
not satisfy the spec. This is very expensive for tables with a large number of
partitions. This could be further improved by using the partition filtering
pushdown feature once it is fully supported.
3) Cache the list of partitions in tableSpec so that StatsTask does not
need to get the list of partitions again.
4) Add an explicit variable to tableSpec to indicate its type (TABLE_ONLY,
STATIC_PARTITION, DYNAMIC_PARTITION) rather than relying on implicit checks
on partHandle.
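
The idea behind changes 1, 2 and 4 can be sketched roughly as follows. This is
an illustration only, not the code in the patch: the class
PartitionPruningSketch, its methods classify(), parseName() and pruneByName(),
and the rule used to tell a static spec from a dynamic one are all assumptions
made for this example. Partition names are assumed to look like
"ds=2010-10-01/hr=12", and escaping of special characters is ignored. The point
is that pruning can be done on cheap name strings first, so that only the
surviving names would need to be turned into full Partition objects.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; these are not Hive's actual classes or methods. */
final class PartitionPruningSketch {

    /** The three spec types named in the patch description. */
    enum SpecType { TABLE_ONLY, STATIC_PARTITION, DYNAMIC_PARTITION }

    /**
     * Assumed rule: no spec means TABLE_ONLY; a value for every partition
     * column means STATIC_PARTITION; anything else is DYNAMIC_PARTITION.
     * A null value stands for a column listed without a value.
     */
    static SpecType classify(Map<String, String> spec, int partitionColumnCount) {
        if (spec == null || spec.isEmpty()) {
            return SpecType.TABLE_ONLY;
        }
        boolean fullySpecified =
            spec.size() == partitionColumnCount && !spec.containsValue(null);
        return fullySpecified ? SpecType.STATIC_PARTITION : SpecType.DYNAMIC_PARTITION;
    }

    /** Splits "k1=v1/k2=v2" into an ordered key/value map (no unescaping). */
    static Map<String, String> parseName(String name) {
        Map<String, String> kv = new LinkedHashMap<>();
        for (String part : name.split("/")) {
            int eq = part.indexOf('=');
            if (eq < 0) {
                throw new IllegalArgumentException("not a key=value pair: " + part);
            }
            kv.put(part.substring(0, eq), part.substring(eq + 1));
        }
        return kv;
    }

    /** Keeps only the names whose values agree with every value given in spec. */
    static List<String> pruneByName(List<String> names, Map<String, String> spec) {
        List<String> kept = new ArrayList<>();
        for (String name : names) {
            Map<String, String> kv = parseName(name);
            boolean matches = true;
            for (Map.Entry<String, String> e : spec.entrySet()) {
                // A null spec value places no constraint on that column.
                if (e.getValue() != null && !e.getValue().equals(kv.get(e.getKey()))) {
                    matches = false;
                    break;
                }
            }
            if (matches) {
                kept.add(name);
            }
        }
        return kept;
    }
}
```

With a spec that fixes ds to 2010-10-01 and leaves hr open, pruneByName() keeps
only the names under that ds value, and classify() reports DYNAMIC_PARTITION for
a table with two partition columns.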
> incorrect partition pruning ANALYZE TABLE
> -----------------------------------------
>
> Key: HIVE-1699
> URL: https://issues.apache.org/jira/browse/HIVE-1699
> Project: Hadoop Hive
> Issue Type: Bug
> Reporter: Ning Zhang
> Assignee: Ning Zhang
> Attachments: HIVE-1699.patch
>
>
> If table T is partitioned, ANALYZE TABLE T PARTITION (...) COMPUTE
> STATISTICS; gathers stats for all partitions even though the partition spec
> selects only a subset.