Hello! I have a set of small CSV files that were moved to HDFS in the SequenceFile format. The key is the file name and the contents of the file become the value.
Now I want to create an external table on these CSV files using Hive, but when I do, I get only the first row of each CSV file.

For example, assume the CSV files contain three columns (Col1, Col2, Col3) and I have 3 CSV files (File1, File2, File3).

File1
    10,20,30
    40,50,60
    70,80,90

File2
    100,110,120
    130,140,150
    160,170,180

File3
    200,210,220
    230,240,250
    260,270,280

A sequence file is created with these key-value pairs:

    File1    <Contents of File1>
    File2    <Contents of File2>
    File3    <Contents of File3>

Now when I create an external table STORED AS SEQUENCEFILE and run a SELECT * query in Hive, I get the following result:

    10     20     30
    100    110    120
    200    210    220

I am aware that I need to write a custom InputFormat, a custom RecordReader and a custom SerDe. I also understand that a sequence file treats one key-value pair as one row. What I don't understand is how to split one row (corresponding to one value) of a sequence file into multiple rows in a Hive table.

Any suggestions on how to go about this?

Regards,
VR
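P.S. To make the question more concrete, here is a minimal sketch in plain Java of the splitting I think the custom RecordReader would have to do. It is only an illustration: the class and method names (SequenceValueSplitter, splitValueIntoLines, toRows) are made up by me and are not part of Hadoop or Hive, and it assumes each value is the file's text with rows separated by line breaks. In a real RecordReader I assume this logic would sit inside next(), buffering the lines of the current value and handing them out one at a time, but I have not written that part.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical helper (not a Hadoop or Hive class): illustrates turning one
 * SequenceFile value, which holds a whole CSV file, into one row per line.
 */
class SequenceValueSplitter {

    /** Splits one value (the full contents of a CSV file) into its non-empty lines. */
    static List<String> splitValueIntoLines(String value) {
        List<String> lines = new ArrayList<>();
        // \R matches any line break (\n, \r\n or \r); needs Java 8 or later.
        for (String line : value.split("\\R")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                lines.add(trimmed);
            }
        }
        return lines;
    }

    /** Flattens (file name, file contents) pairs into one row per CSV line. */
    static List<String> toRows(List<String[]> keyValuePairs) {
        List<String> rows = new ArrayList<>();
        for (String[] pair : keyValuePairs) {
            // pair[0] is the key (file name); pair[1] is the value (file contents).
            rows.addAll(splitValueIntoLines(pair[1]));
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String[]> pairs = new ArrayList<>();
        pairs.add(new String[] {"File1", "10,20,30\n40,50,60\n70,80,90"});
        pairs.add(new String[] {"File2", "100,110,120\n130,140,150\n160,170,180"});
        // Prints every CSV line of every file, one per output line.
        for (String row : toRows(pairs)) {
            System.out.println(row);
        }
    }
}
```

With something like this in place, each emitted line could then be passed to the SerDe as a single row, which is the behaviour I am hoping to get from the Hive table.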