Thanks a lot for the reply, Prasanth. I would never have figured that out,
as neither the Hive Wiki DDL page
<https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-SkewedTables>
nor the design page
<https://cwiki.apache.org/confluence/display/Hive/ListBucketing>
mentions this.

One additional point: skewed tables don't seem to work when the table is
created with CTAS. The statement below doesn't create separate files. Is
this a bug, or is it intentional?

create table t1(r1 string, r2 string) skewed by (r2) on ('a') stored as
directories select r1, r2 from t2;
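
For comparison, here is a two-step sketch of the same load, pieced together
from the settings and DDL in Prasanth's reply quoted below. It is untested;
using insert overwrite (rather than insert into) is an assumption, as is the
expectation that it produces the directory layout described in that reply.

-- Settings taken from the quoted reply.
set hive.mapred.supports.subdirectories=true;
set mapred.input.dir.recursive=true;

-- Create the skewed table first, without a select clause.
create table t1(r1 string, r2 string) skewed by (r2) on ('a') stored as
directories;

-- Populate it in a separate statement (assumption: insert overwrite).
insert overwrite table t1 select r1, r2 from t2;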


On Thu, Apr 24, 2014 at 6:12 AM, Prasanth Jayachandran <
[email protected]> wrote:

> Hi Mayur,
>
> You see a single file because you have not enabled storing skewed
> columns/values as directories.
> You can enable storing the skewed columns and values as directories as
> follows:
>
> set hive.mapred.supports.subdirectories=true;
> set mapred.input.dir.recursive=true;
> create table t1(r1 string, r2 string) skewed by (r2) on ('a') stored as
> directories;
>
> This stores the skewed values in separate directories, as shown below:
>
> /user/hive/warehouse/t1/r2=a/000000_0 (skewed values go here)
> /user/hive/warehouse/t1/HIVE_DEFAULT_LIST_BUCKETING_DIR_NAME/000000_0 (all
> other values go here)
>
> Regarding your desc extended question, where skewedColValueLocationMaps
> is empty: it's a bug in the implementation. I just verified that it shows
> empty for unpartitioned tables, but it shows correctly for partitioned
> tables.
> I have created a bug for unpartitioned tables, which you can track for
> progress on this issue: https://issues.apache.org/jira/browse/HIVE-6968
>
>
> Thanks
> Prasanth Jayachandran
>
> On Apr 23, 2014, at 6:52 AM, Mayur Gupta <[email protected]> wrote:
>
> Below is my skewedInfo
>
> skewedInfo:SkewedInfo(skewedColNames:[r2], skewedColValues:[[a]],
> skewedColValueLocationMaps:{})
>
> Any idea why skewedColValueLocationMaps is empty?
>
>
> On Mon, Apr 21, 2014 at 11:19 AM, Mayur Gupta <[email protected]> wrote:
>
>> Hey There,
>>
>> I was trying to use skewed tables, but Hive is not creating separate
>> files for the skewed data. Even with a simple example I have the same
>> issue. The Hive version is 0.11.
>>
>> create table t(col1 string, col2 string);
>> load data local inpath '/home/hadoop/a.txt' into table t;
>>
>> create table t1(r1 string, r2 string) skewed by (r2) on ('a');
>> insert into table t1 select * from t;
>>
>> The contents of a.txt are :
>> 1 ^Aa
>> 2^A b
>> 3 ^Ac
>> 4 ^Aa
>> 5 ^Ab
>> 6 ^Aa
>>
>> I see only a single file.
>>
>> /user/hive/warehouse/t1/000000_0
>>
>> Any pointers on what I am doing wrong?
>>
>
>
>
