I verified that the DN was down using both jps and java. Anyway, "top" alone was enough to tell, since, as mentioned, the DN consumes 100% of one CPU whenever it is running.

2011/11/29 Stephen Boesch <java...@gmail.com>

> Hi Uma,
> I mentioned that I have restarted the datanode *many* times, and in
> fact the entire cluster more than ten times.
>
>
> 2011/11/29 Uma Maheswara Rao G <mahesw...@huawei.com>
>
>> Looks you are getting HDFS-2553.
>>
>> The cause might be that, you cleared the datadirectories directly without
>> DN restart. Workaround would be to restart DNs.
>>
>> Regards,
>> Uma
>>
>> ------------------------------
>> *From:* Stephen Boesch [java...@gmail.com]
>> *Sent:* Tuesday, November 29, 2011 8:53 PM
>> *To:* mapreduce-user@hadoop.apache.org
>> *Subject:* Re: MRv2 DataNode problem: isBPServiceAlive invoked order of
>> 200K times per second
>>
>> Update on this: I've shut down all the servers multiple times. Also
>> cleared the data directories and reformatted the namenode. Restarted it and
>> the same results: 100% cpu and millions of these calls to isBPServiceAlive.
>>
>>
>> 2011/11/29 Stephen Boesch <java...@gmail.com>
>>
>>> I am just trying to get off the ground with MRv2. The first node (in
>>> pseudo distributed mode) is working fine - ran a couple of TeraSort's on
>>> it.
>>>
>>> The second node has a serious issue with its single DataNode: it
>>> consumes 100% of one of the CPU's. Looking at it through JVisualVM, there
>>> are over 8 million invocations of isBPServiceAlive in a matter of a minute
>>> or so and continually incrementing at a steady clip. A screenshot of the
>>> JvisualVM cpu profile - showing just shy of 8M invocations is attached.
>>>
>>> What kind of configuration error could lead to this? The conf/masters
>>> and conf/slaves simply say localhost. If need be I'll copy the
>>> *-site.xml's. They are boilerplate from the Cloudera page by Ahmed Radwan.