Thanks a lot for the reply, Prasanth. I would never have figured that out, as neither the Hive Wiki DDL page <https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-SkewedTables> nor the design page <https://cwiki.apache.org/confluence/display/Hive/ListBucketing> mentions this.

One additional point: it seems the skewed table doesn't work when the table is created with CTAS. The statement below doesn't create separate files. Is it a bug, or is it intentional?

create table t1(r1 string, r2 string) skewed by (r2) on ('a') stored as directories as select r1, r2 from t2;

On Thu, Apr 24, 2014 at 6:12 AM, Prasanth Jayachandran <[email protected]> wrote:

> Hi Mayur,
>
> The reason why you see single file is, you have not enabled storing skewed
> columns/values as directories.
> You can do the following to enable storing the skewed columns and values
> as directories
>
> set hive.mapred.supports.subdirectories=true;
> set mapred.input.dir.recursive=true;
> create table t1(r1 string, r2 string) skewed by (r2) on ('a') stored as
> directories;
>
> This will enable you to store the skewed columns as directories below
>
> /user/hive/warehouse/t1/r2=a/000000_0 (skewed values go here)
> /user/hive/warehouse/t1/HIVE_DEFAULT_LIST_BUCKETING_DIR_NAME/000000_0 (all
> other values go here)
>
> With respect to your desc extended question where
> skewedColValueLocationMaps is empty, it's a bug in implementation. I just
> verified that it shows empty for unpartitioned tables. But it shows
> correctly for partitioned tables.
> I have created a bug for unpartitioned tables here which you can track for
> progress on this issue https://issues.apache.org/jira/browse/HIVE-6968
>
>
> Thanks
> Prasanth Jayachandran
>
> On Apr 23, 2014, at 6:52 AM, Mayur Gupta <[email protected]> wrote:
>
> Below is my skewedInfo
>
> skewedInfo:SkewedInfo(skewedColNames:[r2], skewedColValues:[[a]],
> skewedColValueLocationMaps:{})
>
> Any idea why is the skewedColValueLocationMaps empty?
>
>
> On Mon, Apr 21, 2014 at 11:19 AM, Mayur Gupta <[email protected]> wrote:
>
>> Hey There,
>>
>> I was trying to use Skewed tables but I am facing the issue that it is
>> not creating separate files for the skewed data. Even with a simple example
>> I am having the same issue. The hive version is 0.11.
>>
>> create table t(col1 string, col2 string);
>> load data local inpath '/home/hadoop/a.txt' into table t;
>>
>> create table t1(r1 string, r2 string) skewed by (r2) on ('a');
>> insert into table t1 select * from t;
>>
>> The contents of a.txt are :
>> 1 ^Aa
>> 2^A b
>> 3 ^Ac
>> 4 ^Aa
>> 5 ^Ab
>> 6 ^Aa
>>
>> I see only single file.
>>
>> /user/hive/warehouse/t1/000000_0
>>
>> Any pointers on what I am doing wrong?
>>
>
>
>
> CONFIDENTIALITY NOTICE
> NOTICE: This message is intended for the use of the individual or entity
> to which it is addressed and may contain information that is confidential,
> privileged and exempt from disclosure under applicable law. If the reader
> of this message is not the intended recipient, you are hereby notified that
> any printing, copying, dissemination, distribution, disclosure or
> forwarding of this communication is strictly prohibited. If you have
> received this communication in error, please contact the sender immediately
> and delete it from your system. Thank You.
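
For anyone finding this thread later, here is a minimal sketch of the non-CTAS sequence, put together from the settings and DDL in Prasanth's reply. It assumes the source table t2 from the CTAS attempt already exists with columns r1 and r2, and that creating the table first and loading it with a separate insert is what populates the skewed directories. That second assumption has not been verified here against CTAS on 0.11.

```sql
-- Settings from Prasanth's reply: allow subdirectories under a table
-- location and let readers recurse into them.
set hive.mapred.supports.subdirectories=true;
set mapred.input.dir.recursive=true;

-- Create the skewed table on its own (no CTAS), storing skewed values
-- as directories.
create table t1(r1 string, r2 string)
  skewed by (r2) on ('a')
  stored as directories;

-- Load it in a separate step; t2 is assumed to exist already.
insert into table t1 select r1, r2 from t2;
```

Going by Prasanth's description, if list bucketing takes effect there should be an r2=a directory and a HIVE_DEFAULT_LIST_BUCKETING_DIR_NAME directory under /user/hive/warehouse/t1/ rather than a single 000000_0 file.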
