[ https://issues.apache.org/jira/browse/HIVE-2121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Carl Steinbach updated HIVE-2121:
---------------------------------
Component/s: Query Processor
Fix Version/s: 0.8.0
> Input Sampling By Splits
> ------------------------
>
> Key: HIVE-2121
> URL: https://issues.apache.org/jira/browse/HIVE-2121
> Project: Hive
> Issue Type: New Feature
> Components: Query Processor
> Reporter: Siying Dong
> Assignee: Siying Dong
> Fix For: 0.8.0
>
> Attachments: HIVE-2121.1.patch, HIVE-2121.2.patch, HIVE-2121.3.patch,
> HIVE-2121.4.patch, HIVE-2121.5.patch, HIVE-2121.6.patch, HIVE-2121.7.patch,
> HIVE-2121.8.patch
>
>
> We need better input sampling to serve at least two purposes:
> 1. Let users test their queries against a smaller data set.
> 2. Let users understand what the data looks like without scanning the whole
> table.
> A simple function that returns a subset of the splits will help in those
> cases. It doesn't have to be strict sampling.
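
The idea in the description (return only a subset of the input splits, without strict sampling guarantees) can be sketched roughly as below. This is only an illustration, not the code in the attached patches: the `Split` type, the `sample_splits` function, and the `fraction` and `seed` parameters are hypothetical names made up for this sketch.

```python
import random
from typing import List, NamedTuple


class Split(NamedTuple):
    """Hypothetical stand-in for an input split: a file path and a byte length."""
    path: str
    length: int


def sample_splits(splits: List[Split], fraction: float, seed: int = 0) -> List[Split]:
    """Return a subset of splits covering roughly `fraction` of the total bytes.

    This is not strict sampling: whole splits are kept or dropped, so the
    result can overshoot the target by up to one split.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")

    target = sum(s.length for s in splits) * fraction

    # Shuffle a copy so the caller's list is left untouched; a fixed seed
    # keeps the choice reproducible between runs.
    shuffled = list(splits)
    random.Random(seed).shuffle(shuffled)

    chosen: List[Split] = []
    size = 0
    for split in shuffled:
        if size >= target:
            break
        chosen.append(split)
        size += split.length
    return chosen
```

Because whole splits are either read or skipped, the cost of a query over the sample scales with the number of splits chosen rather than with the size of the table, which is what makes this useful for the two purposes listed above.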