Re: Unable to start hadoop-0.20.2 but able to start hadoop-0.20.203 cluster

2012-01-24 Thread srinivas


Harsh J harsh@... writes:

 
 Hello RX,
 
 Could you paste your DFS configuration and the DN end-to-end log into
 a mail/pastebin-link?
 
 On Fri, May 27, 2011 at 5:31 AM, Xu, Richard richard.xu@... wrote:
  Hi Folks,
 
  We are trying to get hbase and hadoop running on clusters; take 2 Solaris servers for now.
 
  Because of the incompatibility issue between hbase and hadoop, we have to stick with the hadoop 0.20.2-append release.
 
  It is very straightforward to get hadoop-0.20.203 running, but we have been stuck for several days with hadoop-0.20.2, even with the official release, not the append version.
 
  1. Once we try to run start-mapred.sh (hadoop-daemon.sh --config $HADOOP_CONF_DIR start jobtracker), the following errors show up in the namenode and jobtracker logs:
 
  2011-05-26 12:30:29,169 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Not able to place enough replicas, still in need of 1
  2011-05-26 12:30:29,175 INFO org.apache.hadoop.ipc.Server: IPC Server handler 4 on 9000, call addBlock(/tmp/hadoop-cfadm/mapred/system/jobtracker.info, DFSClient_2146408809) from 169.193.181.212:55334: error: java.io.IOException: File /tmp/hadoop-cfadm/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
  java.io.IOException: File /tmp/hadoop-cfadm/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1271)
         at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
         at java.lang.reflect.Method.invoke(Method.java:597)
         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
         at java.security.AccessController.doPrivileged(Native Method)
         at javax.security.auth.Subject.doAs(Subject.java:396)
         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
 
 
  2. Also, Configured Capacity is 0; we cannot put any file into HDFS.
 
  3. On the datanode server there are no errors in the logs, but the tasktracker log has the following suspicious thing:
  2011-05-25 23:36:10,839 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
  2011-05-25 23:36:10,839 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 41904: starting
  2011-05-25 23:36:10,852 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 41904: starting
  2011-05-25 23:36:10,853 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 41904: starting
  2011-05-25 23:36:10,853 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 41904: starting
  2011-05-25 23:36:10,853 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 41904: starting
  ...
  2011-05-25 23:36:10,855 INFO org.apache.hadoop.ipc.Server: IPC Server handler 63 on 41904: starting
  2011-05-25 23:36:10,950 INFO org.apache.hadoop.mapred.TaskTracker: TaskTracker up at: localhost/127.0.0.1:41904
  2011-05-25 23:36:10,950 INFO org.apache.hadoop.mapred.TaskTracker: Starting tracker tracker_loanps3d:localhost/127.0.0.1:41904
 
 
  I have tried all suggestions found so far, including
      1) remove hadoop-name and hadoop-data folders and reformat namenode;
      2) clean up all temp files/folders under /tmp;
 
  But nothing works.
 
  Your help is greatly appreciated.
 
  Thanks,
 
  RX
 
 

Hi,


I am able to start the namenode and datanode, but while starting the jobtracker it throws an error like:

FATAL mapred.JobTracker: java.net.BindException: Problem binding to localhost/127.0.0.1:5102 : Address already in use
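
For reference, one way to see which process already holds that port (assuming netstat is available on the box) is something like:

    netstat -an | grep 5102

On Linux, netstat -anp run as root, or lsof -i :5102, should also show the owning process; the jobtracker's host:port itself comes from the mapred.job.tracker setting in mapred-site.xml.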

Kindly help me ASAP.


regards,
Srinivas






Re: Unable to start hadoop-0.20.2 but able to start hadoop-0.20.203 cluster

2011-05-27 Thread Konstantin Boudnik
On Thu, May 26, 2011 at 07:01PM, Xu, Richard  wrote:
 2011-05-26 12:30:29,175 INFO org.apache.hadoop.ipc.Server: IPC Server handler 4 on 9000, call addBlock(/tmp/hadoop-cfadm/mapred/system/jobtracker.info, DFSClient_2146408809) from 169.193.181.212:55334: error: java.io.IOException: File /tmp/hadoop-cfadm/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
 java.io.IOException: File /tmp/hadoop-cfadm/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1

Is your DFS up and running, by any chance?
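
A quick way to check with 0.20.x:

    bin/hadoop dfsadmin -report

If the datanodes have registered with the namenode, that should list them and show a non-zero Configured Capacity.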

Cos


Re: Unable to start hadoop-0.20.2 but able to start hadoop-0.20.203 cluster

2011-05-27 Thread Allen Wittenauer

On May 27, 2011, at 7:26 AM, DAN wrote:
 You see you have 2 Solaris servers for now, and dfs.replication is set to 3.
 These don't match.


That doesn't matter.  HDFS will basically flag any files written with a warning that they are under-replicated.

The problem is that the datanode processes aren't running and/or aren't communicating with the namenode. That's what the java.io.IOException: File /tmp/hadoop-cfadm/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1 means.
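
A quick sanity check on each datanode host, assuming a standard install layout, is something like:

    jps | grep DataNode
    tail -50 logs/hadoop-*-datanode-*.log

jps should list a DataNode process if one is up, and the datanode's own log will usually say whether it registered with the namenode or why it could not.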

It should also be pointed out that writing to /tmp (the default) is a bad idea.  This should get changed.
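
For example (the path below is only illustrative), pointing hadoop.tmp.dir at a persistent location in conf/core-site.xml moves the default name, data and mapred.system directories out of /tmp:

    <property>
      <name>hadoop.tmp.dir</name>
      <value>/var/hadoop/tmp</value>
    </property>

This goes inside the <configuration> element; dfs.name.dir, dfs.data.dir and mapred.local.dir can also be set individually in hdfs-site.xml and mapred-site.xml for finer control.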

Also, since you are running Solaris, check the FAQ on some settings you'll need in order to make Hadoop's broken username detection work properly, amongst other things.
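
Roughly, the relevant FAQ entry (assuming it is the Solaris one about whoami) is about making sure a working whoami is on the PATH of the user running the daemons, since the 0.20 scripts call whoami and Solaris keeps it in /usr/ucb; for example, adding the following to conf/hadoop-env.sh. Treat this as a paraphrase and check the FAQ itself for the exact guidance.

    export PATH=/usr/ucb:$PATH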

RE: Unable to start hadoop-0.20.2 but able to start hadoop-0.20.203 cluster

2011-05-27 Thread Xu, Richard
Hi Allen,

Thanks a lot for your response.

I agree with you that the replication setting is not the issue.

What really bothers me is that in the same environment, with the same configuration, hadoop 0.20.203 took us 3 minutes, while 0.20.2 has taken 3 days.

Can you please shed more light on how to make Hadoop's broken username detection work properly?

-Original Message-
From: Allen Wittenauer [mailto:a...@apache.org]
Sent: Friday, May 27, 2011 11:42 AM
To: common-user@hadoop.apache.org
Cc: Xu, Richard [ICG-IT]
Subject: Re: Unable to start hadoop-0.20.2 but able to start hadoop-0.20.203 cluster


On May 27, 2011, at 7:26 AM, DAN wrote:
 You see you have 2 Solaris servers for now, and dfs.replication is set to 3.
 These don't match.


That doesn't matter.  HDFS will basically flag any files written with a warning that they are under-replicated.

The problem is that the datanode processes aren't running and/or aren't communicating with the namenode. That's what the java.io.IOException: File /tmp/hadoop-cfadm/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1 means.

It should also be pointed out that writing to /tmp (the default) is a bad idea.  This should get changed.

Also, since you are running Solaris, check the FAQ on some settings you'll need in order to make Hadoop's broken username detection work properly, amongst other things.


RE: Unable to start hadoop-0.20.2 but able to start hadoop-0.20.203 cluster

2011-05-27 Thread Xu, Richard
Add more to that:

I also tried starting 0.20.2 on a Linux machine in distributed mode; same error.

I had successfully started 0.20.203 on this Linux machine with the same config.

Seems that it is not related to Solaris.

Could it be caused by a port issue? I checked a few and did not find any blocked.
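
One more thing that may be worth checking (substitute your actual namenode host; 9000 is the RPC port shown in the namenode log above) is whether the namenode's port is reachable from the datanode box at all:

    telnet <namenode-host> 9000

If that is refused or times out while the namenode is up, a firewall rule or a namenode bound only to localhost (see the fs.default.name value in core-site.xml) would explain datanodes failing to register.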



-Original Message-
From: Xu, Richard [ICG-IT]
Sent: Friday, May 27, 2011 4:18 PM
To: 'Allen Wittenauer'; 'common-user@hadoop.apache.org'
Subject: RE: Unable to start hadoop-0.20.2 but able to start hadoop-0.20.203 cluster

Hi Allen,

Thanks a lot for your response.

I agree with you that the replication setting is not the issue.

What really bothers me is that in the same environment, with the same configuration, hadoop 0.20.203 took us 3 minutes, while 0.20.2 has taken 3 days.

Can you please shed more light on how to make Hadoop's broken username detection work properly?

-Original Message-
From: Allen Wittenauer [mailto:a...@apache.org]
Sent: Friday, May 27, 2011 11:42 AM
To: common-user@hadoop.apache.org
Cc: Xu, Richard [ICG-IT]
Subject: Re: Unable to start hadoop-0.20.2 but able to start hadoop-0.20.203 cluster


On May 27, 2011, at 7:26 AM, DAN wrote:
 You see you have 2 Solaris servers for now, and dfs.replication is set to 3.
 These don't match.


That doesn't matter.  HDFS will basically flag any files written with a warning that they are under-replicated.

The problem is that the datanode processes aren't running and/or aren't communicating with the namenode. That's what the java.io.IOException: File /tmp/hadoop-cfadm/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1 means.

It should also be pointed out that writing to /tmp (the default) is a bad idea.  This should get changed.

Also, since you are running Solaris, check the FAQ on some settings you'll need in order to make Hadoop's broken username detection work properly, amongst other things.


Re: Unable to start hadoop-0.20.2 but able to start hadoop-0.20.203 cluster

2011-05-27 Thread Allen Wittenauer

On May 27, 2011, at 1:18 PM, Xu, Richard wrote:

 Hi Allen,
 
 Thanks a lot for your response.
 
 I agree with you that the replication setting is not the issue.
 
 What really bothers me is that in the same environment, with the same configuration, hadoop 0.20.203 took us 3 minutes, while 0.20.2 has taken 3 days.
 
 Can you please shed more light on how to make Hadoop's broken username detection work properly?

It's in the FAQ so that I don't have to do that.

http://wiki.apache.org/hadoop/FAQ


Also, check your logs.  All your logs.  Not just the namenode log.
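
In a default 0.20 install the daemon logs live under the logs directory of the Hadoop installation (or wherever HADOOP_LOG_DIR points), one .log file per daemon, so something like this on each host shows them all (path illustrative):

    ls $HADOOP_HOME/logs/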

Unable to start hadoop-0.20.2 but able to start hadoop-0.20.203 cluster

2011-05-26 Thread Xu, Richard
Hi Folks,

We are trying to get hbase and hadoop running on clusters; take 2 Solaris servers for now.

Because of the incompatibility issue between hbase and hadoop, we have to stick with the hadoop 0.20.2-append release.

It is very straightforward to get hadoop-0.20.203 running, but we have been stuck for several days with hadoop-0.20.2, even with the official release, not the append version.

1. Once we try to run start-mapred.sh (hadoop-daemon.sh --config $HADOOP_CONF_DIR start jobtracker), the following errors show up in the namenode and jobtracker logs:

2011-05-26 12:30:29,169 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Not able to place enough replicas, still in need of 1
2011-05-26 12:30:29,175 INFO org.apache.hadoop.ipc.Server: IPC Server handler 4 on 9000, call addBlock(/tmp/hadoop-cfadm/mapred/system/jobtracker.info, DFSClient_2146408809) from 169.193.181.212:55334: error: java.io.IOException: File /tmp/hadoop-cfadm/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
java.io.IOException: File /tmp/hadoop-cfadm/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1271)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)


2. Also, Configured Capacity is 0; we cannot put any file into HDFS.

3. On the datanode server there are no errors in the logs, but the tasktracker log has the following suspicious thing:
2011-05-25 23:36:10,839 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2011-05-25 23:36:10,839 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 41904: starting
2011-05-25 23:36:10,852 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 41904: starting
2011-05-25 23:36:10,853 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 41904: starting
2011-05-25 23:36:10,853 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 41904: starting
2011-05-25 23:36:10,853 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 41904: starting
...
2011-05-25 23:36:10,855 INFO org.apache.hadoop.ipc.Server: IPC Server handler 63 on 41904: starting
2011-05-25 23:36:10,950 INFO org.apache.hadoop.mapred.TaskTracker: TaskTracker up at: localhost/127.0.0.1:41904
2011-05-25 23:36:10,950 INFO org.apache.hadoop.mapred.TaskTracker: Starting tracker tracker_loanps3d:localhost/127.0.0.1:41904


I have tried all suggestions found so far, including
 1) removing the hadoop-name and hadoop-data folders and reformatting the namenode (reformat command noted below this list);
 2) cleaning up all temp files/folders under /tmp;
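
For reference, the reformat in 1) refers to the standard 0.20.x command, run on the namenode host with the daemons stopped:

    bin/hadoop namenode -format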

But nothing works.

Your help is greatly appreciated.

Thanks,

RX