Hi guys, I ran into an interesting problem while implementing my own custom InputFormat, which extends FileInputFormat. (I rewrote the RecordReader class but not the InputSplit class.)
My RecordReader extends LineRecordReader. It returns a record when it reaches #Trailer# and the accumulated text contains #Header#. I have only one input file, which is composed of many basic records in the following format:

    #Header#
    ..... (many lines; may be 0 lines or 1000 lines, it varies)
    #Trailer#

Everything works fine if the number of basic records in the file is an integer multiple of the number of mappers. For example, I use 2 mappers and there are 2 basic records in my input file, or I use 3 mappers and there are 6 basic records in the input file.

However, if I use 4 mappers and there are 3 basic records in the input file (not an integer multiple), the final output is incorrect. The "Map Input Bytes" value in the job counters is also less than the input file size.

How can I fix it? Do I need to rewrite the InputSplit?

Any reply will be appreciated!

Regards!
Chen
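
P.S. To make the record rule above concrete, here is a minimal, Hadoop-free sketch of the grouping logic as described (emit a record on #Trailer# only if a #Header# was seen). It is an illustration, not the actual RecordReader: the class name RecordGrouper and the method group are hypothetical, and it ignores input splits and byte offsets entirely.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not the real RecordReader.
// Groups lines into "#Header# ... #Trailer#" records.
class RecordGrouper {
    static final String HEADER = "#Header#";
    static final String TRAILER = "#Trailer#";

    static List<String> group(List<String> lines) {
        List<String> records = new ArrayList<>();
        StringBuilder current = null; // null means no header has been seen yet
        for (String line : lines) {
            if (line.contains(HEADER)) {
                current = new StringBuilder(); // start a new record
            }
            if (current != null) {
                current.append(line).append('\n');
                if (line.contains(TRAILER)) {
                    records.add(current.toString()); // record is complete
                    current = null;
                }
            }
            // Lines outside a header/trailer pair are ignored.
        }
        return records;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "#Header#", "a", "b", "#Trailer#",
            "#Header#", "#Trailer#");
        System.out.println(group(lines).size()); // prints 2
    }
}
```

In this sketch, a fragment that starts without #Header# or ends without #Trailer# is silently dropped. Whether the same thing happens at split boundaries in the real job is not established here.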