Re: Hadoop 0.20.2 namenode inactivity issue

2010-04-27 Thread Todd Lipcon
On Tue, Apr 27, 2010 at 12:50 PM, elsif wrote:
> The network checks out fine. Each node has 4G of memory and there are
> no other java processes running besides the data node. The nodes all
> have a very low load and very little swap in use:
>
> top - 12:40:45 up 272 days, 1:41, 2 users, loa

Re: Hadoop 0.20.2 namenode inactivity issue

2010-04-27 Thread elsif
The network checks out fine. Each node has 4G of memory and there are no other java processes running besides the data node. The nodes all have a very low load and very little swap in use:

top - 12:40:45 up 272 days, 1:41, 2 users, load average: 0.10, 0.13, 0.22
Tasks: 63 total, 1 running,

Re: Hadoop 0.20.2 namenode inactivity issue

2010-04-27 Thread Todd Lipcon
Those errors would indicate problems at the DN or client level, not the NN level. I'd double check your networking, make sure you don't have any switching issues, etc. Also double check for swapping on your DNs (if you see more than a few MB swapped out, you need to oversubscribe your memory less).
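
For anyone who wants to script the swap check suggested above, here is a minimal sketch in Python. It assumes Linux DataNodes and parses the text of /proc/meminfo (the standard SwapTotal and SwapFree fields, reported in kB). The function names and the 5 MB default threshold are hypothetical, chosen only as a stand-in for "more than a few MB"; they are not from this thread or from Hadoop.

```python
def swap_used_mb(meminfo_text):
    """Return swap in use, in MB, given the text of /proc/meminfo."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[key.strip()] = int(parts[0])  # value in kB
    return (fields["SwapTotal"] - fields["SwapFree"]) / 1024.0


def is_swapping(meminfo_text, threshold_mb=5.0):
    """True if swap in use exceeds threshold_mb (assumed threshold)."""
    return swap_used_mb(meminfo_text) > threshold_mb
```

On a node, something like `is_swapping(open("/proc/meminfo").read())` would give a quick yes/no answer. Note that this reports swap currently in use, not ongoing swap activity, so a tool such as vmstat is still the better way to see whether pages are actively moving in and out.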

Hadoop 0.20.2 namenode inactivity issue

2010-04-27 Thread elsif
Hello,

We have a 35 node cluster running Hadoop 0.20.2 and are seeing periods of namenode inactivity that cause block read and write errors. We have enabled garbage collection logging and have determined that garbage collection is not the cause of the inactivity. We have been able to reproduce t