Hi zhixin,
   As I remember, if you set a "shard by" column on the cube design page, Kylin
uses that column as the "distribute by" condition, rather than the first
three fields of the rowkey.
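
A rough sketch of that selection rule in Python, for illustration only. This is not Kylin's code: the function name and parameters are hypothetical, and the rule is written from memory as described above.

```python
def redistribute_statement(table, rowkey_columns, shard_by=None):
    """Build the HiveQL for the "Redistribute intermediate table" step.

    Assumption: a configured "shard by" column takes priority; otherwise
    the first three rowkey columns are used.
    """
    columns = [shard_by] if shard_by else list(rowkey_columns[:3])
    return (
        f"INSERT OVERWRITE TABLE {table} SELECT * FROM {table} "
        f"DISTRIBUTE BY {', '.join(columns)}"
    )


rowkey = ["Field1", "Field2", "Field3", "Field4"]

# No "shard by" column: falls back to the first three rowkey fields.
print(redistribute_statement("table_intermediate", rowkey))

# With a "shard by" column: only that column is used.
print(redistribute_statement("table_intermediate", rowkey, shard_by="Field4"))
```

The first call reproduces the statement from your original mail; the second shows what I would expect after setting "shard by" on Field4.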




------------------ Original Message ------------------
From: "liuzhixin"<[email protected]>;
Sent: 2018-11-02 (Friday) 3:11
To: "dev"<[email protected]>;
Cc: "Chao Long"<[email protected]>; 
Subject: Re: Redistribute intermediate table default not by rand()



Hi Chao Long,

Thank you for the answer.
#
Step1: Create Intermediate Flat Hive Table
Step2: Redistribute intermediate table
#
Perhaps Kylin could add a random-valued column to the intermediate Hive table
in Step 1 and, by default, shard on that column in Step 2.
At the same time, Kylin should support a custom shard column (this is already
provided).
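
To make the idea concrete, here is a small Python simulation, not Kylin or Hive code. It assumes the problem in the linked issue is that rand() is re-evaluated when the redistribute step is re-run, so rows can land on a different reducer the second time. The column name shard_col is hypothetical, and crc32 is only a stand-in for Hive's real hash function.

```python
import random
import zlib


def partition_by_rand(rows, num_reducers, rng):
    # Models DISTRIBUTE BY RAND(): the target reducer is drawn fresh on every run.
    return [rng.randrange(num_reducers) for _ in rows]


def partition_by_column(rows, num_reducers):
    # Models DISTRIBUTE BY shard_col: the target depends only on the stored value.
    return [
        zlib.crc32(str(row["shard_col"]).encode()) % num_reducers
        for row in rows
    ]


rng = random.Random(0)

# Step 1 (hypothetical): the random value is stored once with each row.
rows = [{"id": i, "shard_col": rng.random()} for i in range(1000)]

# Step 2, run twice to imitate a retried task.
stable_first = partition_by_column(rows, 4)
stable_retry = partition_by_column(rows, 4)
print(stable_first == stable_retry)

rand_first = partition_by_rand(rows, 4, rng)
rand_retry = partition_by_rand(rows, 4, rng)
print(rand_first == rand_retry)
```

With the stored column, both runs assign every row to the same reducer. With a fresh random draw per run, the two assignments will almost certainly differ.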

Best Wishes.

> On 2018-11-02, at 1:38, Chao Long <[email protected]> wrote:
> 
> Hi zhixin,
> The data may become incorrect if "distribute by rand()" is used. See:
> https://issues.apache.org/jira/browse/KYLIN-3388
> 
> 
> 
> 
> ------------------ Original Message ------------------
> From: "liuzhixin"<[email protected]>;
> Sent: 2018-11-02 (Friday) 12:53
> To: "dev"<[email protected]>;
> Cc: "ShaoFeng Shi"<[email protected]>; 
> Subject: Re: Redistribute intermediate table default not by rand()
> 
> 
> 
> Hi kylin team:
> 
> Step: Redistribute intermediate table
> #
> [text lost to encoding damage] DISTRIBUTE BY [text lost] DISTRIBUTE BY 
> RAND() [text lost to encoding damage]
> 
> Best Regards,
> 
>> On 2018-11-02, at 12:03, liuzhixin <[email protected]> wrote:
>> 
>> Hi kylin team:
>> 
>> Version: Kylin2.5-hadoop3.1 for hdp3.0
>> #
>> Step: Redistribute intermediate table
>> #
>> The generated DISTRIBUTE BY statement is:
>> INSERT OVERWRITE TABLE table_intermediate SELECT * FROM table_intermediate 
>> DISTRIBUTE BY Field1, Field2, Field3;
>> #
>> Not DISTRIBUTE BY RAND()
>> #
>> Is DISTRIBUTE BY Field1, Field2, Field3 the default? How can I make it 
>> DISTRIBUTE BY RAND() instead?
>> 
>> Best wishes.
