Re: Hadoop 0.21 running problems, no namenode to stop

2011-03-02 Thread rahul patodi
Hi,
Please check the logs; an error may have occurred while starting the daemons.
Please post the error.
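A minimal sketch for finding the relevant log, assuming a default layout where
logs land under $HADOOP_HOME/logs (the user and hostname in the file names will
differ on your machine):

  cd $HADOOP_HOME/logs
  ls -lt *.log | head                      # newest log files first
  tail -n 100 hadoop-*-namenode-*.log      # last lines of the namenode log
  tail -n 100 hadoop-*-datanode-*.log      # last lines of the datanode log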

On Thu, Mar 3, 2011 at 10:24 AM, Shivani Rao sg...@purdue.edu wrote:

 Problems running a local installation of Hadoop on a single-node cluster

 I followed the instructions given by tutorials to run hadoop-0.21 on a
 single-node cluster.

 The first problem I encountered was HADOOP-6953. Thankfully, that
 has been fixed.

 The other problem I am facing is that the datanode does not start. I guess
 this because when I run stop-dfs.sh, I get the message
 no datanode to stop

 I am wondering if it is remotely related to the two different loopback
 entries in /etc/hosts on my computer:

 127.0.0.1   localhost
 127.0.1.1   my-laptop

 Although I am aware of this, I do not know how to fix it.

 I am unable to run even a simple pi estimation example on the hadoop
 installation.

 This is the output I get:

 bin/hadoop jar hadoop-mapred-examples-0.21.0.jar pi 10 10
 Number of Maps  = 10
 Samples per Map = 10
 11/03/02 23:38:47 INFO security.Groups: Group mapping
 impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping;
 cacheTimeout=30

 And nothing else for a long, long time.

 I have not set dfs.name.dir and dfs.data.dir in my hdfs-site.xml. But
 after running bin/hadoop namenode -format, I see that hadoop.tmp.dir has a
 folder with dfs/name and dfs/data subdirectories for the two directories.

 What am I doing wrong? Any help is appreciated.

 Here are my configuration files

 Regards,
 Shivani

 hdfs-site.xml

 <property>
   <name>dfs.replication</name>
   <value>1</value>
   <description>Default block replication.
   The actual number of replications can be specified when the file is created.
   The default is used if replication is not specified in create time.
   </description>
 </property>


 core-site.xml

 <property>
   <name>hadoop.tmp.dir</name>
   <value>/usr/local/hadoop-${user.name}</value>
   <description>A base for other temporary directories.</description>
 </property>

 <property>
   <name>fs.default.name</name>
   <value>hdfs://localhost:54310</value>
   <description>The name of the default file system.  A URI whose
   scheme and authority determine the FileSystem implementation.  The
   uri's scheme determines the config property (fs.SCHEME.impl) naming
   the FileSystem implementation class.  The uri's authority is used to
   determine the host, port, etc. for a filesystem.</description>
 </property>



 mapred-site.xml

 <property>
   <name>mapred.job.tracker</name>
   <value>localhost:54311</value>
   <description>The host and port that the MapReduce job tracker runs
   at.  If "local", then jobs are run in-process as a single map
   and reduce task.
   </description>
 </property>






Re: no jobtracker to stop, no namenode to stop

2011-02-08 Thread rahul patodi
Please check that:
1. The required ports are free
2. Another instance of Hadoop is not already running
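A quick way to check both (a sketch; 9001 is the jobtracker port from the log
below, and the exact flags vary between systems):

  netstat -tln | grep 9001          # is anything already listening on the port?
  lsof -i :9001                     # if so, which process?
  jps -l | grep org.apache.hadoop   # any leftover Hadoop daemons from an earlier start?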


On Tue, Feb 8, 2011 at 9:58 PM, ahmed nagy ahmed_said_n...@hotmail.com wrote:


 I am facing the same problem. I looked in the log files and found this error:

 FATAL org.apache.hadoop.mapred.JobTracker: java.net.BindException:
 Problem binding to cannonau.isti.cnr.it/146.48.82.190:9001 : Address
 already
 in use

 I also did a netstat to see whether the port is in use, but it does not show
 that the port is in use. I also changed the port, did another netstat, and
 the error is the same.
 Any ideas? Please help.
 When I stop Hadoop, here is what I get: there is no namenode to stop and
 there is no jobtracker.
 ahmednagy@cannonau:~/HadoopStandalone/hadoop-0.21.0/bin$ ./stop-all.sh
 This script is Deprecated. Instead use stop-dfs.sh and stop-mapred.sh
 no namenode to stop
 n01: stopping datanode
 n02: stopping datanode
 n07: stopping datanode
 n06: stopping datanode
 n03: stopping datanode
 n04: stopping datanode
 n08: stopping datanode
 n05: stopping datanode
 localhost: no secondarynamenode to stop
 no jobtracker to stop
 n03: stopping tasktracker
 n01: stopping tasktracker
 n04: stopping tasktracker
 n06: stopping tasktracker
 n02: stopping tasktracker
 n05: stopping tasktracker
 n08: stopping tasktracker
 n07: stopping tasktracker



 I ran jps, and here is the result:
 1580 Jps
 20972 RunJar
 22216 RunJar


 2011-02-08 15:25:45,610 INFO org.apache.hadoop.mapred.JobTracker:
 STARTUP_MSG:
 /
 STARTUP_MSG: Starting JobTracker
 STARTUP_MSG:   host = cannonau.isti.cnr.it/146.48.82.190
 STARTUP_MSG:   args = []
 STARTUP_MSG:   version = 0.21.0
 STARTUP_MSG:   classpath =

 /home/ahmednagy/HadoopStandalone/hadoop-0.21.0/bin/../conf:/usr/lib/jvm/java-6-sun/lib/tools.jar:/home/ahmednagy/HadoopStandalone$
 STARTUP_MSG:   build =
 https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.21 -r
 985326; compiled by 'tomwhite' on Tue Aug 17 01:02:28 EDT 2010
 /
 2011-02-08 15:25:46,737 INFO org.apache.hadoop.security.Groups: Group
 mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping;
 cacheTimeout=3000$
 2011-02-08 15:25:46,752 INFO org.apache.hadoop.mapred.JobTracker: Starting
 jobtracker with owner as ahmednagy and supergroup as supergroup
 2011-02-08 15:25:46,755 INFO

 org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager:
 Updating the current master key for generatin$
 2011-02-08 15:25:46,758 INFO

 org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager:
 Starting expired delegation token remover thr$
 2011-02-08 15:25:46,759 INFO

 org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager:
 Updating the current master key for generatin$
 2011-02-08 15:25:46,760 INFO org.apache.hadoop.mapred.JobTracker: Scheduler
 configured with (memSizeForMapSlotOnJT, memSizeForReduceSlotOnJT,
 limitMaxMemFor$
 2011-02-08 15:25:46,762 INFO org.apache.hadoop.util.HostsFileReader:
 Refreshing hosts (include/exclude) list
 2011-02-08 15:25:46,791 INFO org.apache.hadoop.mapred.QueueManager:
 AllQueues : {default=default}; LeafQueues : {default=default}
 2011-02-08 15:25:46,873
 FATAL org.apache.hadoop.mapred.JobTracker: java.net.BindException:
 Problem binding to cannonau.isti.cnr.it/146.48.82.190:9001 : Address
 already
 in use
at org.apache.hadoop.ipc.Server.bind(Server.java:218)
at org.apache.hadoop.ipc.Server$Listener.init(Server.java:289)
at org.apache.hadoop.ipc.Server.init(Server.java:1443)
at org.apache.hadoop.ipc.RPC$Server.init(RPC.java:343)
at

 org.apache.hadoop.ipc.WritableRpcEngine$Server.init(WritableRpcEngine.java:324)
at

 org.apache.hadoop.ipc.WritableRpcEngine.getServer(WritableRpcEngine.java:284)
at

 org.apache.hadoop.ipc.WritableRpcEngine.getServer(WritableRpcEngine.java:45)
at org.apache.hadoop.ipc.RPC.getServer(RPC.java:331)
at org.apache.hadoop.mapred.JobTracker.init(JobTracker.java:1450)
at
 org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:258)
at
 org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:250)
at
 org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:245)
at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:4164)
 Caused by: java.net.BindException: Address already in use
at sun.nio.ch.Net.bind(Native Method)
at
 sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
at org.apache.hadoop.ipc.Server.bind(Server.java:216)
... 12 more

 2011-02-08 15:25:46,875 INFO org.apache.hadoop.mapred.JobTracker:
 SHUTDOWN_MSG:


Re: Data Nodes do not start

2011-02-08 Thread rahul patodi
I think you should copy the namespaceID of your master, which is in the
name/current/VERSION file, to all the slaves.
Also, use ./start-dfs.sh and then ./start-mapred.sh to start the respective daemons.
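A rough sketch of that check, assuming the default layout where the name and
data directories live under hadoop.tmp.dir (adjust the paths to your
dfs.name.dir / dfs.data.dir settings):

  # on the master: the namespaceID the namenode was formatted with
  grep namespaceID /path/to/dfs/name/current/VERSION
  # on each slave: the datanode's namespaceID must match
  grep namespaceID /path/to/dfs/data/current/VERSION
  # if it does not match, stop the datanode and either edit that line to the
  # master's value or delete the data directory (losing that node's blocks),
  # then restart the datanode so it re-registers.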

http://hadoop-tutorial.blogspot.com/2010/11/running-hadoop-in-distributed-mode.html

Regards,
Rahul Patodi
Software Engineer,
Impetus Infotech (India) Pvt Ltd,
www.impetus.com
Mob:09907074413


On Wed, Feb 9, 2011 at 11:48 AM, madhu phatak phatak@gmail.com wrote:

 Don't use start-all.sh; use the datanode daemon script to start the datanode.
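 For example (a sketch, run from $HADOOP_HOME on the node whose datanode is
 down):

   bin/hadoop-daemon.sh start datanode      # start only the datanode daemon
   bin/hadoop-daemon.sh start tasktracker   # and the tasktracker, if needed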

 On Mon, Feb 7, 2011 at 11:52 PM, ahmednagy ahmed_said_n...@hotmail.com
 wrote:

 
  Dear All,
  Please help. I have tried to start the data nodes with ./start-all.sh on a
  7-node cluster; however, I receive an incompatible namespaceIDs error when I
  try to put any file on the HDFS. I tried the suggestions in the known issues
  for changing the VERSION number in HDFS, but it did not work. Any ideas?
  Please advise. I am attaching the error from the datanode log file.
  Regards
 
 
  https://issues.apache.org/jira/browse/HDFS-107
 
 
  2011-02-07 18:52:28,691 INFO
  org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
  /
  STARTUP_MSG: Starting DataNode
  STARTUP_MSG:   host = n01/192.168.0.1
  STARTUP_MSG:   args = []
  STARTUP_MSG:   version = 0.21.0
  STARTUP_MSG:   classpath =
 
 
 /home/ahmednagy/HadoopStandalone/hadoop-0.21.0/bin/../conf:/usr/lib/jvm/java-6-sun/lib/tools.jar:/home/ahmednagy/HadoopStandalone$
  STARTUP_MSG:   build =
  https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.21 -r
  985326; compiled by 'tomwhite' on Tue Aug 17 01:02:28 EDT 2010
  /
  2011-02-07 18:52:28,881 WARN org.apache.hadoop.hdfs.server.common.Util:
  Path
  /tmp/mylocal/ should be specified as a URI in configuration files. Please
  updat$
  2011-02-07 18:52:29,115 INFO org.apache.hadoop.security.Groups: Group
  mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping;
  cacheTimeout=3000$
  2011-02-07 18:52:29,580 ERROR
  org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException:
  Incompatible namespaceIDs in /tmp/mylocal: namenode name$
 at
 
 
 org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:237)
 at
 
 
 org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:152)
 at
 
 
 org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:336)
 at
  org.apache.hadoop.hdfs.server.datanode.DataNode.init(DataNode.java:260)
 at
  org.apache.hadoop.hdfs.server.datanode.DataNode.init(DataNode.java:237)
 at
 
 
 org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1440)
 at
 
 
 org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1393)
 at
 
 
 org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1407)
 at
  org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1552)
 
  --
  View this message in context:
  http://old.nabble.com/Data-Nodes-do-not-start-tp30866323p30866323.html
  Sent from the Hadoop core-user mailing list archive at Nabble.com.
 
 






Re: Cannot copy files to HDFS

2011-01-27 Thread rahul patodi
Hi,
Your datanode is not up.
Please run the jps command to check that all the required daemons are running.
You can refer to http://www.hadoop-tutorial.blogspot.com/
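For example, on a single node that runs everything, jps should show something
like the following (the pids are illustrative):

  $ jps
  12001 NameNode
  12102 DataNode
  12203 SecondaryNameNode
  12304 JobTracker
  12405 TaskTracker
  12506 Jps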


-- 
Regards,
Rahul Patodi
Software Engineer,
Impetus Infotech (India) Pvt Ltd,
www.impetus.com
Mob:09907074413


Re: SSH problem in hadoop installation

2011-01-25 Thread rahul patodi
Hi,
Have you installed ssh on all the nodes?
If yes, configure it (passwordless ssh between the nodes). You can refer to
http://hadoop-tutorial.blogspot.com/2010/11/running-hadoop-in-distributed-mode.html
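A couple of quick checks (a sketch; the service script name differs between
distributions, e.g. "ssh" on Debian/Ubuntu and "sshd" on Fedora/RHEL):

  /etc/init.d/sshd status      # is the ssh server installed and running?
  ssh -v localhost true        # can we connect at all? -v shows where it fails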
-- 
Regards,
Rahul Patodi
Software Engineer,
Impetus Infotech (India) Pvt Ltd,
www.impetus.com
Mob:09907074413

On Tue, Jan 25, 2011 at 5:20 PM, real great..
greatness.hardn...@gmail.com wrote:

 Hi,
 @Saurabh: it's a simple cluster in the lab, so there is nothing like a
 separate server and administrator.
 We always have the passwords with us.

 On Tue, Jan 25, 2011 at 11:49 AM, Saurabh Dutta 
 saurabh.du...@impetus.co.in
  wrote:

  Hi,
 
  Do you have access to the server? If you don't, you'll have to ask the
  administrator to check that the client's IP address is present in the
  server's /etc/hosts.allow and make sure it is not present in the
  /etc/hosts.deny file.

  There could be other reasons too, like the fingerprint keys getting
  corrupted, but this should be the first step, and only then should you look
  for other options.
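  A quick way to check on the server (a sketch; 192.168.1.10 stands in for the
  client's IP address):

    grep -n sshd /etc/hosts.allow /etc/hosts.deny
    grep -n 192.168.1.10 /etc/hosts.allow /etc/hosts.deny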
 
  Thanks,
  Saurabh Dutta
 
  -Original Message-
  From: real great.. [mailto:greatness.hardn...@gmail.com]
  Sent: Tuesday, January 25, 2011 11:38 AM
  To: common-user
  Subject: SSH problem in hadoop installation
 
  Hi,
  I am trying to install Hadoop on a Linux cluster (Fedora 12).
  However, I am not able to SSH to localhost; it gives the following error:

  ssh_exchange_identification: Connection closed by remote host

  I know this is not the correct forum for asking this question, yet it could
  save me a lot of time if any of you could help me.
  Thanks,
 
 
 
  --
  Regards,
  R.V.
 
  
 
 



 --
 Regards,
 R.V.



Re: exceptions copying files into HDFS

2010-12-12 Thread rahul patodi
You can follow this tutorial:

http://hadoop-tutorial.blogspot.com/2010/11/running-hadoop-in-distributed-mode.html
http://cloudera-tutorial.blogspot.com/2010/11/running-cloudera-in-distributed-mode.html (for Cloudera)
Also, before running any job, please ensure that all the required processes are
running on the correct nodes, i.e. on the master:
NameNode, JobTracker, SecondaryNameNode (if you are not running the secondary
namenode on another system)

and on the slaves:
DataNode, TaskTracker
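One quick way to verify this from the master (a sketch; "slave1" and "slave2"
are placeholders for the hostnames in your slaves file):

  for host in master slave1 slave2; do
    echo "== $host =="
    ssh "$host" jps    # assumes jps is on the PATH of the remote shell
  done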


On Sun, Dec 12, 2010 at 2:46 PM, Varadharajan Mukundan srinath...@gmail.com
 wrote:

 HI,

  jps reports DataNode, NameNode, and SecondaryNameNode as running:
 
  r...@ritter:/tmp/hadoop-rock jps
  31177 Jps
  29909 DataNode
  29751 NameNode
  30052 SecondaryNameNode

 On the master node, the output of jps will contain a TaskTracker,
 JobTracker, NameNode, SecondaryNameNode and DataNode (optional, depending on
 your config), and your slaves will have TaskTracker and DataNode in their jps
 output. If you need more help on configuring Hadoop, I recommend you take
 a look at

 http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/




  Here are the contents of the Hadoop node tree.  The only things that look
  like log files are the dncp_block_verification.log.curr files, and those
  are empty.
  Note the presence of the in_use.lock files, which suggests that this node is
  indeed being used.


 The logs will be in the logs directory under $HADOOP_HOME (the Hadoop home
 directory); are you looking for logs in that directory?


 --
 Thanks,
 M. Varadharajan

 

 Experience is what you get when you didn't get what you wanted
   -By Prof. Randy Pausch in The Last Lecture

 My Journal :- www.thinkasgeek.wordpress.com




-- 
-Thanks and Regards,
Rahul Patodi
Associate Software Engineer,
Impetus Infotech (India) Private Limited,
www.impetus.com
Mob:09907074413


Re: Mapreduce Exceptions with hadoop 0.20.2

2010-12-12 Thread rahul patodi
Hi Praveen,
Whenever we restart the cluster, the namenode goes into safe mode for a
particular interval of time; during this period we cannot write to
HDFS, and after that the namenode automatically comes out of safe mode. What
you did was turn safe mode off and then restart the cluster; when the cluster
starts again, the namenode is back in safe mode to perform essential
checks. What you should do is, after restarting the cluster, just wait
for a while so the namenode comes out of safe mode.
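You can watch this from the command line, for example (a sketch, run from
$HADOOP_HOME):

  bin/hadoop dfsadmin -safemode get    # reports whether safe mode is ON or OFF
  bin/hadoop dfsadmin -safemode wait   # blocks until the namenode leaves safe mode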


-- 
-Thanks and Regards,
Rahul Patodi
Associate Software Engineer,
Impetus Infotech (India) Private Limited,
www.impetus.com
Mob:09907074413


On Fri, Dec 10, 2010 at 10:07 AM, Konstantin Boudnik c...@apache.org wrote:

 On Thu, Dec 9, 2010 at 19:55, Praveen Bathala pbatha...@gmail.com wrote:
  I did this
  prav...@praveen-desktop:~/hadoop/hadoop-0.20.2$ bin/hadoop dfsadmin
  -safemode leave
  Safe mode is OFF
  prav...@praveen-desktop:~/hadoop/hadoop-0.20.2$ bin/hadoop dfsadmin
  -safemode get
  Safe mode is OFF

 This is not a configuration setting: this is only a runtime on/off
 switch. Once you have restarted the cluster your NN will go into
 safemode (for a number of reasons). TTs are made to quit if they can't
 connect to HDFS after some timeout (60 seconds if I remember
 correctly). Once your NN is back from its safemode you can safely
 start MR daemons and everything should be just fine.

 Simply put: be patient ;)


  and then I restarted my cluster, and I still see the INFO messages in the
  namenode logs saying it is in safe mode.

  Somehow I am getting my map output fine, but job.isSuccessful() is
  returning false.

  Any help on that?
 
  Thanks
  + Praveen
 
  On Thu, Dec 9, 2010 at 9:28 PM, Mahadev Konar maha...@yahoo-inc.com
 wrote:
 
  Hi Praveen,
   Looks like it's your namenode that's still in safe mode.
 
 
  http://wiki.apache.org/hadoop/FAQ
 
  The safe mode feature in the namenode waits until a certain threshold of
  HDFS blocks has been reported by the datanodes before letting clients make
  edits to the namespace. It usually happens when you reboot your namenode.
  You can read more about safe mode in the above FAQ.
 
  Thanks
  mahadev
 
 
  On 12/9/10 6:09 PM, Praveen Bathala pbatha...@gmail.com wrote:
 
  Hi,
 
  I am running a MapReduce job to get some email addresses out of a huge text
  file. I used to use hadoop 0.19 and had no issues; now I am using
  hadoop 0.20.2, and when I run my MapReduce job the log says the job
  failed, and in the jobtracker log I see:

  Can someone please help me?
 
  2010-12-09 20:53:00,399 INFO org.apache.hadoop.mapred.JobTracker:
 problem
  cleaning system directory:
  hdfs://localhost:9000/home/praveen/hadoop/temp/mapred/system
  org.apache.hadoop.ipc.RemoteException:
  org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot delete
  /home/praveen/hadoop/temp/mapred/system. Name node is in safe mode.
  The ratio of reported blocks 0. has not reached the threshold
 0.9990.
  Safe mode will be turned off automatically.
 at
 
 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:1700)
 at
 
 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:1680)
 at
 
 org.apache.hadoop.hdfs.server.namenode.NameNode.delete(NameNode.java:517)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at
 
 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at
 
 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
 
   at org.apache.hadoop.ipc.Client.call(Client.java:740)
 at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
 at $Proxy4.delete(Unknown Source)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at
 
 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at
 
 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at
 
 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
 at
 
 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
 at $Proxy4.delete(Unknown Source)
 at org.apache.hadoop.hdfs.DFSClient.delete(DFSClient.java:582)
 at
 
 
 org.apache.hadoop.hdfs.DistributedFileSystem.delete

Re: exceptions copying files into HDFS

2010-12-12 Thread rahul patodi
Sanford,
I have read your previous posts; the blog URL I gave also contains the
configuration for running Hadoop in pseudo-distributed mode.
Also, the exception you are getting is because your datanode is down.
I would suggest starting from scratch.
To be more specific, if you need a quick install tutorial:
for Hadoop:
http://hadoop-tutorial.blogspot.com/2010/11/running-hadoop-in-pseudo-distributed.html
for Cloudera:
http://cloudera-tutorial.blogspot.com/2010/11/running-cloudera-in-pseudo-distributed.html
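A rough outline of what "starting from scratch" looks like in pseudo-distributed
mode (a sketch; the hadoop.tmp.dir path is whatever your core-site.xml sets,
and reformatting erases anything already stored in HDFS):

  bin/stop-all.sh                       # stop any running daemons
  rm -rf /path/to/hadoop.tmp.dir/*      # clear old namenode/datanode state (destroys HDFS data)
  bin/hadoop namenode -format           # reformat the namenode
  bin/start-dfs.sh && bin/start-mapred.sh
  jps                                   # expect NameNode, DataNode, SecondaryNameNode,
                                        # JobTracker, TaskTracker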

On Sun, Dec 12, 2010 at 11:12 PM, Sanford Rockowitz
rockow...@minsoft.com wrote:

 Rahul,

 I should have been more explicit.  I am simply trying to run in
 pseudo-distributed mode.   For further comments, see my previous post to
 Varadharajan.

 Thanks,
 Sanford


 On 12/12/2010 2:24 AM, rahul patodi wrote:

 you can follow this tutorial:


 http://hadoop-tutorial.blogspot.com/2010/11/running-hadoop-in-distributed-mode.html

 http://cloudera-tutorial.blogspot.com/2010/11/running-cloudera-in-distributed-mode.html
 also, before running any job please ensure all the required processes are
 running on the correct node
 like on master:
 Namenode, jobtracker, secondarynamenode(if you are not running secondary
 name node on another system)

 on slave:
 datanode, tasktracker


 On Sun, Dec 12, 2010 at 2:46 PM, Varadharajan Mukundan
 srinath...@gmail.com

 wrote:
 HI,

  jps reports DataNode, NameNode, and SecondaryNameNode as running:

 r...@ritter:/tmp/hadoop-rock  jps
 31177 Jps
 29909 DataNode
 29751 NameNode
 30052 SecondaryNameNode

 On the master node, the output of jps will contain a TaskTracker,
 JobTracker, NameNode, SecondaryNameNode and DataNode (optional, depending on
 your config), and your slaves will have TaskTracker and DataNode in their
 jps output. If you need more help on configuring Hadoop, I recommend you
 take a look at


 http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/




  Here are the contents of the Hadoop node tree.  The only things that look
  like log files are the dncp_block_verification.log.curr files, and those
  are empty.
  Note the presence of the in_use.lock files, which suggests that this node
  is indeed being used.


 The logs will be in the logs directory in $HADOOP_HOME (hadoop home
 directory), are you looking for logs in this directory?


 --
 Thanks,
 M. Varadharajan

 

 Experience is what you get when you didn't get what you wanted
   -By Prof. Randy Pausch in The Last Lecture

 My Journal :- www.thinkasgeek.wordpress.com







-- 
Regards,
Rahul Patodi
Associate Software Engineer,
Impetus Infotech (India) Pvt Ltd,
www.impetus.com
Mob:09907074413


Re: Abandoning Block

2010-12-06 Thread rahul patodi
I think you should set up passwordless ssh from the master to all the VMs.
You can do this by running these commands on the master:
ssh-keygen -t rsa -P ""
ssh-copy-id -i $HOME/.ssh/id_rsa.pub slave
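Repeat the ssh-copy-id line for each VM, then verify that logging in no longer
asks for a password ("slave" is the same placeholder hostname as above):

ssh slave hostname    # should print the slave's hostname without a password prompt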

-Thanks and Regards,
Rahul Patodi
Associate Software Engineer,
Impetus Infotech (India) Private Limited,
www.impetus.com
Mob:09907074413

On Tue, Dec 7, 2010 at 10:38 AM, Adarsh Sharma adarsh.sha...@orkash.com wrote:

 li ping wrote:

 Make sure the VMs can reach each other (e.g., check iptables), and that the
 DNS/IP configuration is correct.

 On Mon, Dec 6, 2010 at 7:05 PM, Adarsh Sharma adarsh.sha...@orkash.com
 wrote:



 Dear all,

 I am facing the below problem while running Hadoop on VMs. I am using
 hadoop-0.20.2 with JDK 6.

 My jobtracker log says:

 2010-12-06 15:16:06,618 INFO
 org.apache.hadoop.mapred.JobTracker: JobTracker up at: 54311
 2010-12-06 15:16:06,618 INFO org.apache.hadoop.mapred.JobTracker:
 JobTracker webserver: 50030
 2010-12-06 15:16:06,738 INFO org.apache.hadoop.mapred.JobTracker:
 Cleaning
 up the system directory
 2010-12-06 15:16:06,801 INFO
 org.apache.hadoop.mapred.CompletedJobStatusStore: Completed job store is
 inactive
 2010-12-06 15:17:15,830 INFO org.apache.hadoop.hdfs.DFSClient: Exception
 in
 createBlockOutputStream java.net.SocketTimeoutException: 69000 millis
 timeout while waiting for channel to be ready for connect. ch :
 java.nio.channels.SocketChannel[connection-pending remote=/
 192.168.0.56:50010]
 2010-12-06 15:17:15,830 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning
 block blk_377241628391316172_1001
 2010-12-06 15:17:15,832 INFO org.apache.hadoop.hdfs.DFSClient: Waiting to
 find target node: 192.168.0.56:50010
 2010-12-06 15:18:30,836 INFO org.apache.hadoop.hdfs.DFSClient: Exception
 in
 createBlockOutputStream java.net.SocketTimeoutException: 69000 millis
 timeout while waiting for channel to be ready for connect. ch :
 java.nio.channels.SocketChannel[connection-pending remote=/
 192.168.0.56:50010]
 2010-12-06 15:18:30,836 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning
 block blk_2025622418653738085_1001
 2010-12-06 15:18:30,838 INFO org.apache.hadoop.hdfs.DFSClient: Waiting to
 find target node: 192.168.0.56:50010
 2010-12-06 15:19:45,842 INFO org.apache.hadoop.hdfs.DFSClient: Exception
 in
 createBlockOutputStream java.net.SocketTimeoutException: 69000 millis
 timeout while waiting for channel to be ready for connect. ch :
 java.nio.channels.SocketChannel[connection-pending remote=/
 192.168.0.61:50010]
 2010-12-06 15:19:45,843 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning
 block blk_696328516245550547_1001
 2010-12-06 15:19:45,845 INFO org.apache.hadoop.hdfs.DFSClient: Waiting to
 find target node: 192.168.0.61:50010
 2010-12-06 15:21:00,849 INFO org.apache.hadoop.hdfs.DFSClient: Exception
 in
 createBlockOutputStream java.net.SocketTimeoutException: 69000 millis
 timeout while waiting for channel to be ready for connect. ch :
 java.nio.channels.SocketChannel[connection-pending remote=/
 192.168.0.55:50010]
 2010-12-06 15:21:00,849 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning
 block blk_6110605884701761678_1001
 2010-12-06 15:21:00,853 INFO org.apache.hadoop.hdfs.DFSClient: Waiting to
 find target node: 192.168.0.55:50010
 2010-12-06 15:21:06,854 WARN org.apache.hadoop.hdfs.DFSClient:
 DataStreamer
 Exception: java.io.IOException: Unable to create new block.
  at

 org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2845)
  at

 org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2102)
  at

 org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2288)

 2010-12-06 15:21:06,855 WARN org.apache.hadoop.hdfs.DFSClient: Error
 Recovery for block blk_6110605884701761678_1001 bad datanode[0] nodes ==
 null
 2010-12-06 15:21:06,855 WARN org.apache.hadoop.hdfs.DFSClient: Could not
 get block locations. Source file /home/hadoop/mapred/system/
 jobtracker.info - Aborting...
 2010-12-06 15:21:06,855 WARN org.apache.hadoop.mapred.JobTracker: Writing
 to file
 hdfs://ws-test:54310/home/hadoop/mapred/system/jobtracker.info failed!

 2010-12-06 15:21:06,855 WARN org.apache.hadoop.mapred.JobTracker:
 FileSystem is not ready yet!
 2010-12-06 15:21:06,862 WARN org.apache.hadoop.mapred.JobTracker: Failed
 to
 initialize recovery manager.
 java.net.SocketTimeoutException: 69000 millis timeout while waiting for
 channel to be ready for connect. ch :
 java.nio.channels.SocketChannel[connection-pending remote=/
 192.168.0.55:50010]
  at

 org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:213)
  at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:404)
  at

 org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:2870)
  at

 org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2826)
  at

 org.apache.hadoop.hdfs.DFSClient

Re: Re: Re: where is example of the configuration about multi nodes on one machine?

2010-11-30 Thread rahul patodi
The last option I gave was to run Hadoop in fully distributed mode,

but you can also run Hadoop in pseudo-distributed mode:
http://hadoop-tutorial.blogspot.com/2010/11/running-hadoop-in-pseudo-distributed.html
or in standalone mode:
http://hadoop-tutorial.blogspot.com/2010/11/running-hadoop-in-standalone-mode.html
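For reference, the usual minimal pseudo-distributed configuration boils down to
something like this (a sketch; the ports here are the conventional choices and
the tutorial above may use different values):

  <!-- core-site.xml -->
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>

  <!-- hdfs-site.xml -->
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>

  <!-- mapred-site.xml -->
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>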



2010/11/30 beneo_7 bene...@163.com

 If you want to just use one machine, why do you want to use hadoop?
 Hadoop's
 power lies in distributed computing. That being said, it is possible to
 use
 hadoop on a single machine by using the pseudo-distributed mode (Read
 http://hadoop.apache.org/common/docs/current/single_node_setup.html and
 
 http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/
 ).
 If you are using just one machine, at least ensure that your machine has
 lots of cores (8 core/16 cores would be great) to get benefit out of
 hadoop.
 
 I am not sure, but using virtual machines won't be helpful here as a
 virtual
 machine is just an abstraction and not real hardware.


 Thanks very much. I use Hadoop because the Apache Mahout project needs
 it for clustering.

 I have only one machine, but it is powerful: 16 cores and 32 GB of memory.
 Since I have only one, I need a configuration for multiple nodes on one machine.

 I have used the pseudo-distributed mode; however, the job always uses only
 1 core (CPU usage stays at 100% ~ 103%), and the execution time is 4 hours,
 which is too slow.

 I cannot change the Mahout project source code; the trunk is updated from
 time to time, and it would be difficult to resolve the conflicts.


 Is there any way to configure several slaves on one machine?

 At 2010-11-30 17:07:49,Hari Sreekumar hsreeku...@clickable.com wrote:

 Hi beneo,
 
 If you want to just use one machine, why do you want to use hadoop?
 Hadoop's
 power lies in distributed computing. That being said, it is possible to
 use
 hadoop on a single machine by using the pseudo-distributed mode (Read
 http://hadoop.apache.org/common/docs/current/single_node_setup.html and
 
 http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/
 ).
 If you are using just one machine, at least ensure that your machine has
 lots of cores (8 core/16 cores would be great) to get benefit out of
 hadoop.
 
 I am not sure, but using virtual machines won't be helpful here as a
 virtual
 machine is just an abstraction and not real hardware.
 
 Cheers,
 Hari
 
 2010/11/30 beneo_7 bene...@163.com
 
  I'm sorry, but are you sure?
  At 2010-11-30 15:53:58,rahul patodi patodira...@gmail.com wrote:
  you can create virtual machines on your single machine:
  for you have to install sun virtual box(other tools are also available
  like
  VMware)
  now you can create as many virtual machine as you want
  then you can create one master and all slaves
  
  -Thanks and Regards,
  Rahul Patodi
  Associate Software Engineer,
  Impetus Infotech (India) Private Limited,
  www.impetus.com
  Mob:09907074413
  
  2010/11/30 beneo_7 bene...@163.com
  
   i have only one machine and it's powerful.
   so, i want the all the slaves and master on one machine?
  
   thx in advanced
  
  
  
  
  --
 
 




-- 
-Thanks and Regards,
Rahul Patodi
Associate Software Engineer,
Impetus Infotech (India) Private Limited,
www.impetus.com
Mob:09907074413


Re: where is example of the configuration about multi nodes on one machine?

2010-11-29 Thread rahul patodi
You can create virtual machines on your single machine:
for this you have to install Sun VirtualBox (other tools, such as VMware, are
also available).
Now you can create as many virtual machines as you want,
and then make one of them the master and the others slaves.

-Thanks and Regards,
Rahul Patodi
Associate Software Engineer,
Impetus Infotech (India) Private Limited,
www.impetus.com
Mob:09907074413

2010/11/30 beneo_7 bene...@163.com

 I have only one machine, and it's powerful,
 so I want all the slaves and the master on one machine.

 Thanks in advance




--


Re: Is there a single command to start the whole cluster in CDH3?

2010-11-23 Thread rahul patodi
Hi Hari,
When I try to start the Hadoop daemons with bin/start-dfs.sh (from /usr/lib/hadoop) on
the name node, it gives this error: "May not run daemons as root. Please
specify HADOOP_NAMENODE_USER" (same for the other daemons),
but when I try to start them using /etc/init.d/hadoop-0.20-namenode start,
they start successfully.
What is the reason behind that?

On Wed, Nov 24, 2010 at 10:04 AM, Hari Sreekumar
hsreeku...@clickable.com wrote:

 Hi Ricky,

  Yes, that's how it is meant to be. The machine where you run
 start-dfs.sh will become the namenode, and the machine which you specify in
 your masters file becomes the secondary namenode.

 Hari

 On Wed, Nov 24, 2010 at 2:13 AM, Ricky Ho rickyphyl...@yahoo.com wrote:

  Thanks for pointing me to the right command.  I am using the CDH3
  distribution.
  I figured out that no matter what I put in the masters file, it always
  starts the NameNode on the machine where I issue the start-all.sh command,
  and always starts a SecondaryNameNode on all the other machines.  Any clue?
 
 
  Rgds,
  Ricky
 
  -Original Message-
  From: Hari Sreekumar [mailto:hsreeku...@clickable.com]
  Sent: Tuesday, November 23, 2010 10:25 AM
  To: common-user@hadoop.apache.org
  Subject: Re: Is there a single command to start the whole cluster in CDH3
 ?
 
  Hi Ricky,
 
  Which hadoop version are you using? I am using the Apache hadoop-0.20.2
  version, and I generally just run the $HADOOP_HOME/bin/start-dfs.sh and
  start-mapred.sh scripts on my master node. If passwordless ssh is
  configured, these scripts will start the required services on each node. You
  shouldn't have to start the services on each node individually. The
  secondary namenode is specified in the conf/masters file. The node where you
  call the start-*.sh script becomes the namenode (for start-dfs) or
  jobtracker (for start-mapred). The node mentioned in the masters file
  becomes the secondary namenode, and the datanodes and tasktrackers are the
  nodes which are mentioned in the slaves file.
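  For example (a sketch; the host names are placeholders):

    # conf/masters -- the host that should run the secondary namenode
    snn-host
    # conf/slaves  -- one datanode/tasktracker host per line
    slave1
    slave2

    # then, on the machine that should become the namenode / jobtracker:
    $HADOOP_HOME/bin/start-dfs.sh
    $HADOOP_HOME/bin/start-mapred.sh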
 
  HTH,
  Hari
 
  On Tue, Nov 23, 2010 at 11:43 PM, Ricky Ho rickyphyl...@yahoo.com
 wrote:
 
    I set up the cluster configuration in masters, slaves, core-site.xml,
    hdfs-site.xml, and mapred-site.xml and copied them to all the machines.

    Then I log in to one of the machines and use the following to start the
    cluster:
    for service in /etc/init.d/hadoop-0.20-*; do sudo $service start; done

    I expected this command to SSH to all the other machines (based on the
    masters and slaves files) to start the corresponding daemons, but
    obviously it is not doing that in my setup.

    Am I missing something in my setup?

    Also, where do I specify where the Secondary NameNode runs?
  
   Rgds,
   Ricky
  
  
  
  
  
 
 
 
 




-- 
-Thanks and Regards,
Rahul Patodi
Associate Software Engineer,
Impetus Infotech (India) Private Limited,
www.impetus.com
Mob:09907074413


Re: Is there a single command to start the whole cluster in CDH3?

2010-11-23 Thread rahul patodi
Hi Ricky,
For installing CDH3 you can refer to this tutorial:
http://cloudera-tutorial.blogspot.com/2010/11/running-cloudera-in-distributed-mode.html
All the steps in this tutorial are well tested. (In case of any query, please
leave a comment.)


On Wed, Nov 24, 2010 at 11:48 AM, rahul patodi patodira...@gmail.com wrote:

 hi Hari,
 when I try to start the Hadoop daemons with bin/start-dfs.sh (from /usr/lib/hadoop) on
 the name node, it gives this error: "May not run daemons as root. Please
 specify HADOOP_NAMENODE_USER" (same for the other daemons),
 but when I try to start them using /etc/init.d/hadoop-0.20-namenode start,
 they start successfully.
 What is the reason behind that?

 On Wed, Nov 24, 2010 at 10:04 AM, Hari Sreekumar hsreeku...@clickable.com
  wrote:

 Hi Ricky,

  Yes, that's how it is meant to be. The machine where you run
 start-dfs.sh will become the namenode, and the machine which you specify
 in
 your masters file becomes the secondary namenode.

 Hari

 On Wed, Nov 24, 2010 at 2:13 AM, Ricky Ho rickyphyl...@yahoo.com wrote:

  Thanks for pointing me to the right command.  I am using the CDH3
  distribution.
  I figure out no matter what I put in the masters file, it always start
 the
  NamedNode at the machine where I issue the start-all.sh command.  And
  always
  start a SecondaryNamedNode in all other machines.  Any clue ?
 
 
  Rgds,
  Ricky
 
  -Original Message-
  From: Hari Sreekumar [mailto:hsreeku...@clickable.com]
  Sent: Tuesday, November 23, 2010 10:25 AM
  To: common-user@hadoop.apache.org
  Subject: Re: Is there a single command to start the whole cluster in
 CDH3 ?
 
  Hi Ricky,
 
  Which hadoop version are you using? I am using hadoop-0.20.2
 apache
  version, and I generally just run the $HADOOP_HOME/bin/start-dfs.sh and
  start-mapred.sh script on my master node. If passwordless ssh is
  configured,
  this script will start the required services on each node. You shouldn't
  have to start the services on each node individually. The secondary
  namenode
  is specified in the conf/masters file. The node where you call the
  start-*.sh script becomes the namenode(for start-dfs) or jobtracker(for
  start-mapred). The node mentioned in the masters file becomes the 2ndary
  namenode, and the datanodes and tasktrackers are the nodes which are
  mentioned in the slaves file.
 
  HTH,
  Hari
 
  On Tue, Nov 23, 2010 at 11:43 PM, Ricky Ho rickyphyl...@yahoo.com
 wrote:
 
   I setup the cluster configuration in masters, slaves,
  core-site.xml,
   hdfs-site.xml, mapred-site.xml and copy to all the machines.
  
   And I login to one of the machines and use the following to start the
   cluster.
   for service in /etc/init.d/hadoop-0.20-*; do sudo $service start; done
  
   I expect this command will SSH to all the other machines (based on the
   master
   and slaves files) to start the corresponding daemons, but obviously
 it
  is
   not
   doing that in my setup.
  
   Am I missing something in my setup ?
  
   Also, where do I specify where the Secondary Name Node is run.
  
   Rgds,
   Ricky
  
  
  
  
  
 
 
 
 




 --
 -Thanks and Regards,
 Rahul Patodi
 Associate Software Engineer,
 Impetus Infotech (India) Private Limited,
 www.impetus.com
 Mob:09907074413




-- 
-Thanks and Regards,
Rahul Patodi
Associate Software Engineer,
Impetus Infotech (India) Private Limited,
www.impetus.com
Mob:09907074413