Rui Li created HIVE-7956:
----------------------------
Summary: When inserting into a bucketed table, all data goes to a
single bucket [Spark Branch]
Key: HIVE-7956
URL: https://issues.apache.org/jira/browse/HIVE-7956
Project: Hive
Issue Type: Bug
Components: Spark
Reporter: Rui Li
I created a bucketed table:
{code}
create table testBucket(x int,y string) clustered by(x) into 10 buckets;
{code}
Then I ran a query like:
{code}
set hive.enforce.bucketing = true;
insert overwrite table testBucket select intCol,stringCol from src;
{code}
Here {{src}} is a simple, non-bucketed, textfile-based table containing
40,000,000 records. The query launches 10 reduce tasks, but all the data goes
to only one of them.
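A possible way to check the distribution, as a rough sketch: the first query counts rows per output file using Hive's {{INPUT__FILE__NAME}} virtual column, and the second estimates how many rows each bucket should receive. The second query assumes that the bucket for an int column is the column value modulo the bucket count, which is my understanding of the default hashing and has not been confirmed for the Spark branch.
{code}
-- Rows actually written to each bucket file of the target table.
select INPUT__FILE__NAME, count(*) from testBucket group by INPUT__FILE__NAME;
-- Rows expected per bucket, assuming bucket = pmod(intCol, 10) for an int key.
select pmod(intCol, 10), count(*) from src group by pmod(intCol, 10);
{code}
If the second query shows rows spread across several buckets while the first shows a single non-empty file, the skew comes from the insert rather than from the source data.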