clustered bucket and tablesample

2016-05-14 Thread no jihun
Hello. I want to ask the correct bucketing and tablesample way. There is a table X which I created by CREATE TABLE `X`(`action_id` string,`classifier` string) CLUSTERED BY (action_id,classifier) INTO 256 BUCKETS STORED AS ORC Then I inserted 500M of rows into X by set hive.enforce.bucketing=tr

Re: clustered bucket and tablesample

2016-05-14 Thread Mich Talebzadeh
Is action_id can be created as a numeric column: CREATE TABLE X ( action_id bigint, ..) Bucketing or hash partitioning best works on numeric columns with high cardinality (say a primary key). >From my old notes: Bucketing in Hive refers to hash partitioning where a hashing function is applied.

Re: clustered bucket and tablesample

2016-05-14 Thread no jihun
ah, as i mentioned both field type of action_id and classifier is STRING. and I can not change the type. CREATE TABLE `X`(`action_id` string,`classifier` string) CLUSTERED BY (action_id,classifier) INTO 256 BUCKETS STORED AS ORC I use two fields for hash then bucketing because each one field is n

Re: clustered bucket and tablesample

2016-05-14 Thread Mich Talebzadeh
This is your code SELECT COUNT(*) FROM X TABLESAMPLE(BUCKET 1 OUT OF 256 ON action_id, classifier) WHERE action_id='aaa' AND classifier='bbb' Well I have a table dummy with 1 billion rows imported from Oracle as ORC format hive> show create table dummy; OK CREATE TABLE `dummy`( `id` int, `c

Query Failing while querying on ORC Format

2016-05-14 Thread Mahender Sarangam
Hi, We are dumping our data into ORC Partition Bucketed table. We have loaded almost 6 months data and here month is Partition by column. Now we have modified ORC partition bucketed table schema. We have added 2 more columns to the ORC table. Now whenever we are running select statement for olde

Re: Query Failing while querying on ORC Format

2016-05-14 Thread Mich Talebzadeh
check this thread. alter table add columns aternatives or hive refresh that night help HTH Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw * ht