1. Create an external text-file Hive table pointing to /extract/DBCLOC and specify the CSV SerDe.
   If you are using Hive 0.14 or newer, use https://cwiki.apache.org/confluence/display/Hive/CSV+Serde; for Hive 0.13 and older, use https://github.com/ogrodnek/csv-serde. You do not even need to gunzip the file: Hive automatically decompresses the data on select.

2. Run a simple query to load the data:

   insert overwrite table <orc_table> select * from <csv_table>

On Wed, Apr 29, 2015 at 3:26 PM, Kumar Jayapal <kjayapa...@gmail.com> wrote:

> Hello All,
>
> I have this table:
>
> CREATE TABLE DBCLOC(
>   BLwhse int COMMENT 'DECIMAL(5,0) Whse',
>   BLsdat string COMMENT 'DATE Sales Date',
>   BLreg_num smallint COMMENT 'DECIMAL(3,0) Reg#',
>   BLtrn_num int COMMENT 'DECIMAL(5,0) Trn#',
>   BLscnr string COMMENT 'CHAR(1) Scenario',
>   BLareq string COMMENT 'CHAR(1) Act Requested',
>   BLatak string COMMENT 'CHAR(1) Act Taken',
>   BLmsgc string COMMENT 'CHAR(3) Msg Code')
> PARTITIONED BY (FSCAL_YEAR smallint, FSCAL_PERIOD smallint)
> STORED AS PARQUET;
>
> I have to load from the HDFS location /extract/DBCLOC/DBCL0301P.csv.gz into
> the table above.
>
> Can anyone tell me what is the most efficient way of doing it?
>
> Thanks
> Jay
>
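A concrete sketch of the two steps for the table in the quoted question, assuming Hive 0.14+ and its built-in OpenCSVSerde (the staging table name csv_dbcloc and the partition values are illustrative, and note that OpenCSVSerde treats every column as string, so the load query casts back to the target types):

```sql
-- Step 1: external text-file staging table over the extract directory.
-- Hive decompresses the .csv.gz file transparently on read, so there is
-- no need to gunzip it first. OpenCSVSerde exposes all columns as
-- string, hence the string declarations here and the casts below.
CREATE EXTERNAL TABLE csv_dbcloc(
  BLwhse    string,
  BLsdat    string,
  BLreg_num string,
  BLtrn_num string,
  BLscnr    string,
  BLareq    string,
  BLatak    string,
  BLmsgc    string)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
STORED AS TEXTFILE
LOCATION '/extract/DBCLOC';

-- Step 2: load the staged rows into the partitioned Parquet table from
-- the question. The partition values (2015, 4) are illustrative.
INSERT OVERWRITE TABLE DBCLOC PARTITION (FSCAL_YEAR=2015, FSCAL_PERIOD=4)
SELECT CAST(BLwhse AS int), BLsdat, CAST(BLreg_num AS smallint),
       CAST(BLtrn_num AS int), BLscnr, BLareq, BLatak, BLmsgc
FROM csv_dbcloc;
```

If the partition values vary per file, dynamic partitioning (SET hive.exec.dynamic.partition.mode=nonstrict;) with the partition columns derived in the SELECT would avoid hard-coding them.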