I have a 654 MB sequence file of <Text,BytesWritable> pairs that I'm using as the input to a MapReduce job. It is loaded into HDFS on my cluster. The first job is simple: iterate through the text files stored in the sequence file and generate some counts. Nothing CPU intensive. The job seems to stall periodically, with no map tasks executing; all of them are waiting for the next key/value pair.
Task attempts time out after 600 seconds and are then killed, and the map progress percentage drops back. I added logging to the map function, and each call runs start to finish in milliseconds to a few seconds. After that, the next map task just doesn't seem to start. Thanks.
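
In case it's relevant: the 600 seconds matches Hadoop's default task timeout of 600000 ms, which kills a task that neither reads input, writes output, nor updates its status within that window. So the attempts are presumably being killed for inactivity rather than for a failure in the map code. A sketch of the corresponding setting is below. The property name is an assumption about the Hadoop version in use: older releases call it mapred.task.timeout, newer ones mapreduce.task.timeout.

```xml
<!-- mapred-site.xml (sketch): task timeout in milliseconds. -->
<!-- 600000 ms = 600 s is the default value. -->
<!-- Assumes the classic property name; newer releases use mapreduce.task.timeout. -->
<property>
  <name>mapred.task.timeout</name>
  <value>600000</value>
</property>
```

Raising this value would only delay the kill; it would not explain why the tasks stop receiving key/value pairs.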