Hi Anand,

The row group size of an RCFile is controlled by hive.io.rcfile.record.buffer.size. The value is given in bytes, and the default is 4 MB (4194304). It is a good idea to set it to a higher value, such as 32 MB:
SET hive.io.rcfile.record.buffer.size=33554432;

Regards
Bejoy KS

________________________________
From: "Ladda, Anand" <lan...@microstrategy.com>
To: "user@hive.apache.org" <user@hive.apache.org>
Sent: Thursday, April 19, 2012 9:12 AM
Subject: Row Group Size of RCFile

How do I set the Row Group Size of RCFile in Hive?

CREATE TABLE OrderFactPartClustRcFile(
  order_id INT,
  emp_id INT,
  order_amt FLOAT,
  order_cost FLOAT,
  qty_sold FLOAT,
  freight FLOAT,
  gross_dollar_sales FLOAT,
  ship_date STRING,
  rush_order STRING,
  customer_id INT,
  pymt_type INT,
  shipper_id INT
)
PARTITIONED BY (order_date STRING)
CLUSTERED BY (order_id) SORTED BY (order_id) INTO 256 BUCKETS
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe'
STORED as RCFILE;
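
For illustration, here is a rough sketch of how the setting might be used when loading the table from the question. The buffer size is read when the RCFile is written, so it has to be set in the same session before the INSERT runs; files that already exist are not changed. The staging table order_fact_staging and the partition value '2012-04-19' are hypothetical, and the staging table is assumed to have the same columns plus an order_date column.

```sql
-- Row group buffer size in bytes: 32 * 1024 * 1024 = 33554432.
SET hive.io.rcfile.record.buffer.size=33554432;

-- Assumption: on Hive releases before 2.0, these two properties make the
-- insert honor the table's CLUSTERED BY / SORTED BY clauses.
SET hive.enforce.bucketing=true;
SET hive.enforce.sorting=true;

-- order_fact_staging is a hypothetical source table, not part of the original thread.
INSERT OVERWRITE TABLE OrderFactPartClustRcFile
PARTITION (order_date = '2012-04-19')
SELECT order_id, emp_id, order_amt, order_cost, qty_sold, freight,
       gross_dollar_sales, ship_date, rush_order, customer_id,
       pymt_type, shipper_id
FROM order_fact_staging
WHERE order_date = '2012-04-19';
```

Running SET hive.io.rcfile.record.buffer.size; with no value afterwards prints the value currently in effect for the session, which is a quick way to confirm the change took hold.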