just a guess, for a long-running sequence of MR jobs, how's the namenode behaving during that time? if it gets corrupted, one might see that behavior.
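one rough way to check that, as a sketch only: run the jobs one at a time, blocking on each, and record a health check after every job so you can see at which point things go bad. `run_pipeline`, the job commands and the check command are all placeholders i made up for illustration. i'm assuming something like `hadoop dfsadmin -report` as the check, but swap in whatever fits your cluster.

```python
import subprocess
import time


def run_pipeline(job_cmds, check_cmd, log=print):
    """Run each job command in order, blocking on each one, and run a
    health-check command after every job. Returns one dict per job run."""
    results = []
    for i, job in enumerate(job_cmds, start=1):
        started = time.time()
        job_rc = subprocess.call(job)  # blocks until this job exits
        elapsed = time.time() - started

        # Run the (assumed) health check and keep its output for later review.
        check = subprocess.run(check_cmd, capture_output=True, text=True)
        entry = {
            "job": i,
            "job_rc": job_rc,
            "seconds": elapsed,
            "check_rc": check.returncode,
            "check_output": check.stdout,
        }
        results.append(entry)
        log("job %d exit=%d check exit=%d (%.0fs)"
            % (i, job_rc, check.returncode, elapsed))

        if job_rc != 0:
            break  # do not start the next job after a failed one
    return results


# Hypothetical usage (jar and class names are made up):
#   jobs = [["hadoop", "jar", "myjobs.jar", "Step%d" % n] for n in (1, 2, 3)]
#   run_pipeline(jobs, ["hadoop", "dfsadmin", "-report"])
```

the `log` argument is just a hook: pass a function that appends to a file or sends a message somewhere if you want the record kept outside the process that launches the jobs.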
we have a similar situation, with 9 MR jobs back-to-back, taking much of the day. might be good to add some notification to an external process after the end of each of those 3 MR jobs.

paco

On Mon, Aug 11, 2008 at 12:34 PM, Mori Bellamy <[EMAIL PROTECTED]> wrote:
> hey all,
> i have a job consisting of three MR jobs back to back to back. each job
> takes an appreciable percent of a day to complete (30% to 70%). even though
> i execute these jobs in a blocking fashion: