Re: Error using hadoop distcp
DistCp runs as a MapReduce job, so the TaskTrackers need the hostname mappings to contact the nodes of the other cluster. Please configure the mappings correctly on both machines and try again.

Regards,
Uma

----- Original Message -----
From: trang van anh
Date: Wednesday, October 5, 2011 1:41 pm
Subject: Re: Error using hadoop distcp
To: common-user@hadoop.apache.org

> Which host runs the task that throws the exception? Ensure that each
> data node knows the other data nodes in the Hadoop cluster -> add a
> "ub16" entry in /etc/hosts on the host where the task is running.
>
> [...]
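Following the advice above, the host-to-IP mappings have to exist on every node of both clusters, since the map task that opens the remote filesystem can be scheduled on any TaskTracker. A minimal sketch of what this could look like; the IPs are taken from praveenesh's follow-up later in this thread and the `check_host` helper is purely illustrative:

```shell
# Hypothetical /etc/hosts entries, needed on every node of BOTH clusters
# (IPs taken from later in this thread; adjust to the real cluster):
#
#   162.192.100.53   ub13
#   162.192.100.16   ub16
#
# Small illustrative helper: does the OS resolver know this hostname?
check_host() {
    getent hosts "$1" > /dev/null
}

# Sanity-check the helper itself:
check_host localhost && echo "localhost resolves"
# Once /etc/hosts is fixed, the same check should pass on the node that
# ran the failing task:
# check_host ub16
```

On the cluster nodes themselves, `check_host ub16` should succeed everywhere, not just on the NameNodes, because the failing attempt in the trace above is an ordinary map task.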
Re: Error using hadoop distcp
I tried that as well. When I am using the IP address, it says I should use the hostname:

hadoop@ub13:~$ hadoop distcp hdfs://162.192.100.53:54310/user/hadoop/weblog hdfs://162.192.100.16:54310/user/hadoop/weblog
11/10/05 14:53:50 INFO tools.DistCp: srcPaths=[hdfs://162.192.100.53:54310/user/hadoop/weblog]
11/10/05 14:53:50 INFO tools.DistCp: destPath=hdfs://162.192.100.16:54310/user/hadoop/weblog
java.lang.IllegalArgumentException: Wrong FS: hdfs://162.192.100.53:54310/user/hadoop/weblog, expected: hdfs://ub13:54310
        at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:310)
        at org.apache.hadoop.hdfs.DistributedFileSystem.checkPath(DistributedFileSystem.java:99)
        at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:155)
        at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:464)
        at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:648)
        at org.apache.hadoop.tools.DistCp.checkSrcPath(DistCp.java:621)
        at org.apache.hadoop.tools.DistCp.copy(DistCp.java:638)
        at org.apache.hadoop.tools.DistCp.run(DistCp.java:857)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
        at org.apache.hadoop.tools.DistCp.main(DistCp.java:884)

I have the entries for both machines in /etc/hosts...

On Wed, Oct 5, 2011 at 1:55 PM, wrote:
> Hi Praveenesh,
> Can you try repeating the distcp using the IP instead of the host name?
> From the error it looks like an RPC exception, not being able to
> identify the host, so I believe it can't be due to not setting up
> passwordless ssh. Just try it out.
>
> Regards,
> Bejoy K S
>
> [...]
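For what it's worth, the "Wrong FS ... expected: hdfs://ub13:54310" message comes from FileSystem.checkPath, visible at the top of the trace above: the client compares the authority of each supplied URI against its default filesystem name, so the paths passed to distcp have to use exactly the authority configured on the client, the hostname rather than the IP. A sketch of the relevant core-site.xml property as it is presumably set on the ub13 cluster (the value is inferred from the error message, not confirmed in this thread):

```xml
<!-- core-site.xml on the ub13 cluster (value assumed from the error):
     this authority is what checkPath expects in every hdfs:// URI -->
<property>
  <name>fs.default.name</name>
  <value>hdfs://ub13:54310</value>
</property>
```

This also suggests why the original hostname-based command was the right form, and why fixing resolution of "ub16" on the task nodes, rather than switching to IPs, is the way forward.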
Re: Error using hadoop distcp
Hi Praveenesh,

Can you try repeating the distcp using the IP instead of the host name? From the error it looks like an RPC exception, not being able to identify the host, so I believe it can't be due to not setting up passwordless ssh. Just try it out.

Regards,
Bejoy K S

-----Original Message-----
From: trang van anh
Date: Wed, 05 Oct 2011 14:06:11
Reply-To: common-user@hadoop.apache.org
Subject: Re: Error using hadoop distcp

> Which host runs the task that throws the exception? Ensure that each
> data node knows the other data nodes in the Hadoop cluster -> add a
> "ub16" entry in /etc/hosts on the host where the task is running.
>
> [...]
Re: Error using hadoop distcp
Which host runs the task that throws the exception? Ensure that each data node knows the other data nodes in the Hadoop cluster -> add a "ub16" entry in /etc/hosts on the host where the task is running.

On 10/5/2011 12:15 PM, praveenesh kumar wrote:
> [...]
Error using hadoop distcp
I am trying to use distcp to copy a file from one HDFS cluster to another, but while copying I am getting the following exception:

hadoop distcp hdfs://ub13:54310/user/hadoop/weblog hdfs://ub16:54310/user/hadoop/weblog

11/10/05 10:41:01 INFO mapred.JobClient: Task Id : attempt_201110031447_0005_m_07_0, Status : FAILED
java.net.UnknownHostException: unknown host: ub16
        at org.apache.hadoop.ipc.Client$Connection.<init>(Client.java:195)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:850)
        at org.apache.hadoop.ipc.Client.call(Client.java:720)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
        at $Proxy1.getProtocolVersion(Unknown Source)
        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:359)
        at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:113)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:215)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:177)
        at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:82)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1378)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1390)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:196)
        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:175)
        at org.apache.hadoop.mapred.FileOutputCommitter.setupJob(FileOutputCommitter.java:48)
        at org.apache.hadoop.mapred.OutputCommitter.setupJob(OutputCommitter.java:124)
        at org.apache.hadoop.mapred.Task.runJobSetupTask(Task.java:835)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:296)
        at org.apache.hadoop.mapred.Child.main(Child.java:170)

It says it cannot find ub16, but the entry is there in the /etc/hosts files and I am able to ssh to both machines. Do I need passwordless ssh between these two NameNodes? What can be the issue? Is there anything I am missing before using distcp?

Thanks,
Praveenesh