allow the number of files to be a multiple of bucketed table
------------------------------------------------------------
Key: HIVE-2775
URL: https://issues.apache.org/jira/browse/HIVE-2775
Project: Hive
Issue Type: New Feature
Components: Metastore
Reporter: xiaoyu wang
Currently, a Hive bucketed table requires the number of files to match the
bucket number for sampling to be correct. This is very restrictive: for
example, we can only populate the table using a fixed number of reducers,
which can be a bottleneck.
The idea is to introduce the concepts of "physical bucket" and "logical bucket".
The "physical bucket" count is the number of files, and the "logical bucket"
count is the number of buckets stored in the metadata for the bucketed table.
By allowing the "physical bucket" count to be a multiple of the "logical bucket"
count, we can still sample correctly while scaling up the number of reducers.
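
A minimal sketch of why sampling can stay correct, under assumptions not stated
in this issue: rows are assigned to files by hash modulo the physical bucket
count, and buckets and files are numbered from 0. The function names are
hypothetical and are not part of Hive. Because the physical count is a multiple
of the logical count, (h mod P) mod N equals h mod N, so each file belongs to
exactly one logical bucket.

```python
def physical_file_for_hash(row_hash, num_physical):
    """Assumed placement rule: a row goes to file (hash mod physical count)."""
    return row_hash % num_physical


def files_for_logical_bucket(logical_bucket, num_logical, num_physical):
    """Return the 0-based file indexes that make up one logical bucket.

    Assumes num_physical is a positive multiple of num_logical, as proposed.
    """
    if num_logical <= 0 or num_physical % num_logical != 0:
        raise ValueError("physical bucket count must be a multiple of logical bucket count")
    if not 0 <= logical_bucket < num_logical:
        raise ValueError("logical bucket index out of range")
    # File f holds rows with hash mod P == f, and since N divides P those rows
    # all satisfy hash mod N == f mod N.
    return [f for f in range(num_physical) if f % num_logical == logical_bucket]


if __name__ == "__main__":
    # 4 logical buckets in the metadata, 12 files written by 12 reducers.
    print(files_for_logical_bucket(1, 4, 12))
```

Under these assumptions, sampling one logical bucket means reading every file
whose index is congruent to that bucket modulo the logical bucket count.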