Bucketing- Identify Number of Buckets

Db-Blog Sun, 06 Sep 2015 13:22:23 -0700

Hi, 

I need to join two big tables in Hive. The join key is the grain of both 
tables, so clustering and sorting both tables on that key should give a 
significant performance improvement for the join.


However, I am not sure how to calculate the right number of buckets when 
creating these tables. Can someone please share any pointers on this? 

I am planning to store these clustered and sorted tables as Parquet or ORC, 
for columnar storage and better compression. 
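
To make the setup concrete, here is a rough sketch of the kind of DDL I have 
in mind. The table and column names are made up for illustration, and the 
bucket count of 64 is only a placeholder, since that is exactly the number I 
am trying to work out. The SET lines are the settings I believe are relevant 
on Hive 1.x; please correct me if they do not apply.

```sql
-- Hypothetical names throughout; 64 is a placeholder bucket count, not a recommendation.
-- On Hive 1.x these make inserts honour the bucketing and sorting declared on the table.
SET hive.enforce.bucketing = true;
SET hive.enforce.sorting = true;

-- First table: clustered and sorted on the join key, stored as ORC.
CREATE TABLE orders_bucketed (
  order_id BIGINT,
  order_ts TIMESTAMP,
  amount   DOUBLE
)
CLUSTERED BY (order_id) SORTED BY (order_id ASC) INTO 64 BUCKETS
STORED AS ORC;

-- Second table: same join key and, as I understand it, the same bucket count
-- so that the buckets of the two tables line up for the join.
CREATE TABLE shipments_bucketed (
  order_id   BIGINT,
  shipped_ts TIMESTAMP,
  carrier    STRING
)
CLUSTERED BY (order_id) SORTED BY (order_id ASC) INTO 64 BUCKETS
STORED AS ORC;
```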

Thanks,
Saurabh

