Replying to myself: I'm using streaming and the task timeout was set to 0, which disables it. That's why the stuck reduce was never failed and rescheduled.
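For anyone who hits the same thing, here is a rough sketch of the setting involved. It assumes the property is mapred.task.timeout, as in the stock hadoop-default.xml, and that it is overridden in hadoop-site.xml. The value is in milliseconds; 600000 is the shipped default as far as I know, and the right number depends on how long your streaming tasks can legitimately stay silent.

```xml
<!-- hadoop-site.xml (assumed location for site overrides in 0.17.x) -->
<property>
  <name>mapred.task.timeout</name>
  <!-- Milliseconds a task may go without reading input, writing output,
       or updating its status before the framework kills it.
       A value of 0 disables the timeout entirely. -->
  <value>600000</value>
</property>
```

With a non-zero value, a task that makes no progress for that long should be killed and retried rather than hanging indefinitely.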

On Fri, Sep 19, 2008 at 3:34 AM, Rong-en Fan <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I'm using 0.17.2.1 and see a reduce hang in the shuffle phase due
> to an unresponsive node. From the reduce log (sorry that I didn't
> keep it around), it was stuck copying map output from a dead
> node (I cannot ssh to that one). At that point, all maps had already
> finished. I'm wondering why this slowness does not trigger a reduce
> task failure, mark the corresponding map as failed (even though it
> had finished), and then redo the map task on another node so that
> the reduce can work.
>
> Thanks,
> Rong-En Fan