To complete the picture: not only was our network swamped, I realized
tonight that the NameNode/JobTracker was running on a 99% full disk (it hit
100% full about thirty minutes ago). That poor JobTracker was fighting
against a lot of odds. As soon as we upgrade to a bigger disk and switch it
back on, I'll apply the supplied patch to the cluster.

Thank you for looking into this!
- Aaron

On Thu, Oct 30, 2008 at 3:42 PM, Raghu Angadi <[EMAIL PROTECTED]> wrote:

> Raghu Angadi wrote:
>
>> Devaraj forwarded the stacks that Aaron sent. As he suspected, there is a
>> deadlock in the RPC server. I will file a blocker for 0.18 and above. This
>> deadlock is more likely on a busy network.
>>
>>
> Aaron,
>
> Could you try the patch attached to
> https://issues.apache.org/jira/browse/HADOOP-4552 ?
>
> Thanks,
> Raghu.
>
