Hi Anand

The row group size of an RCFile is defined by
hive.io.rcfile.record.buffer.size. The default value is 4 MB.
It is good to set it to a higher value, such as 32 MB:

SET hive.io.rcfile.record.buffer.size = 33554432 ;
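
As a rough sketch of how this could fit into a load job: the property is read when the RCFile is written, so it needs to be set in the same session before the INSERT that populates the table. The staging table name (order_fact_staging) and the partition value below are hypothetical, made up only for illustration. The two hive.enforce.* settings are an assumption about your setup: older Hive releases need them for inserts into a bucketed, sorted table.

```sql
-- 32 MB expressed in bytes: 32 * 1024 * 1024 = 33554432
SET hive.io.rcfile.record.buffer.size=33554432;

-- Assumption: older Hive releases need these so the insert
-- honours CLUSTERED BY / SORTED BY on the target table.
SET hive.enforce.bucketing=true;
SET hive.enforce.sorting=true;

-- order_fact_staging and the partition value are hypothetical.
INSERT OVERWRITE TABLE OrderFactPartClustRcFile
PARTITION (order_date='2012-04-19')
SELECT order_id, emp_id, order_amt, order_cost, qty_sold, freight,
       gross_dollar_sales, ship_date, rush_order, customer_id,
       pymt_type, shipper_id
FROM order_fact_staging
WHERE order_date='2012-04-19';
```

Files that were already written keep the row group size they were written with, so existing partitions would have to be reloaded to pick up the new value.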


Regards
Bejoy KS 



________________________________
 From: "Ladda, Anand" <lan...@microstrategy.com>
To: "user@hive.apache.org" <user@hive.apache.org> 
Sent: Thursday, April 19, 2012 9:12 AM
Subject: Row Group Size of RCFile
 

 
How do I set the Row Group Size of an RCFile in Hive?
 
CREATE TABLE OrderFactPartClustRcFile(
  order_id INT,
  emp_id INT,
  order_amt FLOAT,
  order_cost FLOAT,
  qty_sold FLOAT,
  freight FLOAT,
  gross_dollar_sales FLOAT,
  ship_date STRING,
  rush_order STRING,
  customer_id INT,
  pymt_type INT,
  shipper_id INT
  ) 
PARTITIONED BY (order_date STRING) 
CLUSTERED BY (order_id) SORTED BY (order_id) INTO 256 BUCKETS 
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe' 
STORED AS RCFILE;
