Hi Team,
I found that list bucketing does not work when a skew value contains
uppercase characters.
I filed https://issues.apache.org/jira/browse/HIVE-13697 for this.
For example:
1. This works (all skew values are lowercase):
CREATE TABLE testskew (id INT, a STRING)
SKEWED BY (a) ON ('abc', 'xyz') STORED AS DIRECTORIES;
set hive.mapred.supports.subdirectories=true;
set mapred.input.dir.recursive=true;
INSERT OVERWRITE TABLE testskew
SELECT 123,'abc' FROM dual
union all
SELECT 123,'xyz' FROM dual
union all
SELECT 123,'others' FROM dual;
# hadoop fs -ls /user/hive/warehouse/testskew
Found 3 items
drwxrwxrwx - mapr mapr 1 2016-05-05 14:56 /user/hive/warehouse/testskew/HIVE_DEFAULT_LIST_BUCKETING_DIR_NAME
drwxrwxrwx - mapr mapr 1 2016-05-05 14:56 /user/hive/warehouse/testskew/a=abc
drwxrwxrwx - mapr mapr 1 2016-05-05 14:56 /user/hive/warehouse/testskew/a=xyz
This is the expected result: the "a=abc" and "a=xyz" directories were both created.
2. This fails (one skew value, 'US', is uppercase):
DROP TABLE testskew2;
CREATE TABLE testskew2 (id INT, a STRING)
SKEWED BY (a) ON ('aus', 'US') STORED AS DIRECTORIES;
set hive.mapred.supports.subdirectories=true;
set mapred.input.dir.recursive=true;
INSERT OVERWRITE TABLE testskew2
SELECT 123, 'aus' FROM dual
union all
SELECT 123, 'US' FROM dual
union all
SELECT 123, 'others' FROM dual;
# hadoop fs -ls /user/hive/warehouse/testskew2
Found 2 items
drwxrwxrwx - mapr mapr 1 2016-05-05 15:11 /user/hive/warehouse/testskew2/HIVE_DEFAULT_LIST_BUCKETING_DIR_NAME
drwxrwxrwx - mapr mapr 1 2016-05-05 15:11 /user/hive/warehouse/testskew2/a=aus
As you can see, only the "a=aus" directory was created. No directory was created for the uppercase skew value 'US'.
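To narrow this down, it may help to check what the metastore recorded for the skewed values and which file each row actually landed in. The following is only a sketch, assuming the testskew2 table from case 2 and the same session settings as above. The output is not shown here.

```sql
-- Show the table metadata, including the skewed columns and
-- skewed values recorded in the metastore.
DESCRIBE FORMATTED testskew2;

-- Allow reads from the list-bucketing subdirectories.
set hive.mapred.supports.subdirectories=true;
set mapred.input.dir.recursive=true;

-- INPUT__FILE__NAME is a Hive virtual column holding the path of the
-- file each row was read from, so this shows where the 'US' row went.
SELECT a, INPUT__FILE__NAME FROM testskew2;
```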
--
Thanks,
www.openkb.info
(Open KnowledgeBase for Hadoop/Database/OS/Network/Tool)