Not sure if this helps, but copyFromLocal simply writes data from the current client machine to HDFS, whereas distcp starts a MapReduce job to do the copy; that means the NodeManager/TaskTracker machines need to be able to write data to the remote HDFS cluster.
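Your target datanode log points the same way: the SocketTimeoutException on the WRITE_BLOCK operation shows a worker node of the source cluster (172.16.100.14) stalling while writing a block to a target datanode (172.16.100.8:50010). As a quick sanity check (just a sketch; the hosts and ports below are taken from your logs and configs, and it assumes nc is installed on the workers), log in to one of the NodeManager machines and verify that it can reach both the target namenode's RPC port and a target datanode's data-transfer port:

    worker$ nc -zv 172.16.100.5 9000                  # target namenode RPC port (fs.default.name)
    worker$ nc -zv 172.16.100.8 50010                 # target datanode data-transfer port
    worker$ hdfs dfs -ls hdfs://WorkGroup0010:9000/   # full HDFS round trip from a worker node

If the namenode is reachable but the datanode port times out from the workers (while both work from the machine where copyFromLocal succeeded), the problem is network reachability between the VLANs rather than HDFS permissions.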
Regards,
*Stanley Shi*

On Sun, Apr 13, 2014 at 9:44 PM, xeon <xeonmailingl...@gmail.com> wrote:

> Curiously, I can copy a file from host1 to host2 with copyFromLocal, but
> I cannot do this with distcp, so I think the problem is related to the
> distcp command. Maybe it is a question of permissions? I have already set
>
>   <property>
>     <name>dfs.permissions</name>
>     <value>false</value>
>   </property>
>
> in hdfs-site.xml and core-site.xml. Any suggestion to fix this error?
>
> WorkGroup0000$ hdfs dfs -copyFromLocal setup.py 'hdfs://WorkGroup0010:9000/'   <<- this command works
> WorkGroup0000$ hadoop distcp hdfs://WorkGroup0000:9000/wiki hdfs://WorkGroup0010:9000/wiki   <<- this command doesn't work
>
> On 04/13/2014 11:43 AM, xeon wrote:
>
> As far as I can see, the source host can't write to the destination
> host. This is a similar problem to the one described here
> (https://groups.google.com/a/cloudera.org/forum/#!topic/cdh-user/0nTALQjbrT0),
> but I can't find a way to fix this error. Any help to try to fix this?
>
> On 04/13/2014 11:08 AM, xeon wrote:
>
> Hi,
>
> I am trying to copy data between HDFS clusters that are located far from
> each other, and when I run the distcp command I get the errors below in
> the namenode and the datanode of the target host. What is happening?
>
> The two MapReduce runtimes are running in a VLAN. The hosts are
> physically distant, but they use the same IP range.
>
> Command:
>
> hadoop distcp hdfs://WorkGroup0000:9000/wiki hdfs://WorkGroup0010:9000/wiki
>
> Namenode logs of the source host:
>
> 2014-04-13 10:01:24,213 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* allocateBlock: /tmp/hadoop-yarn/staging/root/.staging/job_1397327306299_0045/job.splitmetainfo. BP-1662111526-172.16.100.13-1397327293758 blk_-5400567409103494582_6969{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[172.16.100.15:50010|RBW]]}
> 2014-04-13 10:01:24,218 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 172.16.100.15:50010 is added to blk_-5400567409103494582_6969{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[172.16.100.15:50010|RBW]]} size 0
> 2014-04-13 10:01:24,219 INFO org.apache.hadoop.hdfs.StateChange: DIR* completeFile: /tmp/hadoop-yarn/staging/root/.staging/job_1397327306299_0045/job.splitmetainfo is closed by DFSClient_NONMAPREDUCE_458023096_1
> 2014-04-13 10:01:24,320 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* allocateBlock: /tmp/hadoop-yarn/staging/root/.staging/job_1397327306299_0045/job.xml. BP-1662111526-172.16.100.13-1397327293758 blk_489660666766888075_6971{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[172.16.100.14:50010|RBW]]}
> 2014-04-13 10:01:24,328 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 172.16.100.14:50010 is added to blk_489660666766888075_6971{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[172.16.100.14:50010|RBW]]} size 0
> 2014-04-13 10:01:24,329 INFO org.apache.hadoop.hdfs.StateChange: DIR* completeFile: /tmp/hadoop-yarn/staging/root/.staging/job_1397327306299_0045/job.xml is closed by DFSClient_NONMAPREDUCE_458023096_1
> 2014-04-13 10:01:24,389 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* allocateBlock: /tmp/hadoop-yarn/staging/root/.staging/job_1397327306299_0045/appTokens. BP-1662111526-172.16.100.13-1397327293758 blk_-5469411569413407886_6973{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[172.16.100.16:50010|RBW]]}
> 2014-04-13 10:01:24,396 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 172.16.100.16:50010 is added to blk_-5469411569413407886_6973{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[172.16.100.16:50010|RBW]]} size 0
> 2014-04-13 10:01:24,397 INFO org.apache.hadoop.hdfs.StateChange: DIR* completeFile: /tmp/hadoop-yarn/staging/root/.staging/job_1397327306299_0045/appTokens is closed by DFSClient_NONMAPREDUCE_458023096_1
> 2014-04-13 10:01:26,904 INFO BlockStateChange: BLOCK* ask 172.16.100.14:50010 to replicate blk_8868060350766479646_6965 to datanode(s) 172.16.100.16:50010 172.16.100.15:50010
> 2014-04-13 10:01:27,932 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* allocateBlock: /tmp/hadoop-yarn/staging/root/.staging/job_1397327306299_0045/job_1397327306299_0045_1_conf.xml. BP-1662111526-172.16.100.13-1397327293758 blk_1012068924814169940_6976{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[172.16.100.14:50010|RBW]]}
> 2014-04-13 10:01:27,972 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 172.16.100.14:50010 is added to blk_1012068924814169940_6976{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[172.16.100.14:50010|RBW]]} size 0
> 2014-04-13 10:01:27,973 INFO org.apache.hadoop.hdfs.StateChange: DIR* completeFile: /tmp/hadoop-yarn/staging/root/.staging/job_1397327306299_0045/job_1397327306299_0045_1_conf.xml is closed by DFSClient_NONMAPREDUCE_-1471236856_1
> 2014-04-13 10:01:28,603 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 172.16.100.16:50010 is added to blk_8868060350766479646_6965 size 80053
> 2014-04-13 10:01:28,605 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 172.16.100.15:50010 is added to blk_8868060350766479646_6965 size 80053
>
> Datanode logs of the source host:
>
> 2014-04-13 10:01:24,234 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-1662111526-172.16.100.13-1397327293758:blk_-4909667885926150941_6959 src: /172.16.100.13:51419 dest: /172.16.100.16:50010
> 2014-04-13 10:01:24,248 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /172.16.100.13:51419, dest: /172.16.100.16:50010, bytes: 233, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_458023096_1, offset: 0, srvID: DS-1825202225-172.16.100.16-50010-1397327304669, blockid: BP-1662111526-172.16.100.13-1397327293758:blk_-4909667885926150941_6959, duration: 12489490
> 2014-04-13 10:01:24,248 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-1662111526-172.16.100.13-1397327293758:blk_-4909667885926150941_6959, type=LAST_IN_PIPELINE, downstreams=0:[] terminating
> 2014-04-13 10:01:24,282 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /172.16.100.16:50010, dest: /172.16.100.13:51420, bytes: 237, op: HDFS_READ, cliID: DFSClient_NONMAPREDUCE_458023096_1, offset: 0, srvID: DS-1825202225-172.16.100.16-50010-1397327304669, blockid: BP-1662111526-172.16.100.13-1397327293758:blk_-4909667885926150941_6959, duration: 78237
> 2014-04-13 10:01:24,319 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /172.16.100.16:50010, dest: /172.16.100.13:51420, bytes: 237, op: HDFS_READ, cliID: DFSClient_NONMAPREDUCE_458023096_1, offset: 0, srvID: DS-1825202225-172.16.100.16-50010-1397327304669, blockid: BP-1662111526-172.16.100.13-1397327293758:blk_-4909667885926150941_6959, duration: 98085
> 2014-04-13 10:01:24,868 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /172.16.100.16:50010, dest: /172.16.100.13:51420, bytes: 237, op: HDFS_READ, cliID: DFSClient_NONMAPREDUCE_458023096_1, offset: 0, srvID: DS-1825202225-172.16.100.16-50010-1397327304669, blockid: BP-1662111526-172.16.100.13-1397327293758:blk_-4909667885926150941_6959, duration: 68844
> 2014-04-13 10:01:24,902 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-1662111526-172.16.100.13-1397327293758:blk_7617264601876797381_6967 src: /172.16.100.14:43154 dest: /172.16.100.16:50010
> 2014-04-13 10:01:24,907 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /172.16.100.14:43154, dest: /172.16.100.16:50010, bytes: 142, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_458023096_1, offset: 0, srvID: DS-1825202225-172.16.100.16-50010-1397327304669, blockid: BP-1662111526-172.16.100.13-1397327293758:blk_7617264601876797381_6967, duration: 2647953
> 2014-04-13 10:01:24,907 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-1662111526-172.16.100.13-1397327293758:blk_7617264601876797381_6967, type=LAST_IN_PIPELINE, downstreams=0:[] terminating
> 2014-04-13 10:01:25,094 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-1662111526-172.16.100.13-1397327293758:blk_-5469411569413407886_6973 src: /172.16.100.13:51429 dest: /172.16.100.16:50010
> 2014-04-13 10:01:25,098 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /172.16.100.13:51429, dest: /172.16.100.16:50010, bytes: 7, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_458023096_1, offset: 0, srvID: DS-1825202225-172.16.100.16-50010-1397327304669, blockid: BP-1662111526-172.16.100.13-1397327293758:blk_-5469411569413407886_6973, duration: 3098167
> 2014-04-13 10:01:25,098 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-1662111526-172.16.100.13-1397327293758:blk_-5469411569413407886_6973, type=LAST_IN_PIPELINE, downstreams=0:[] terminating
> 2014-04-13 10:01:25,434 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /172.16.100.16:50010, dest: /172.16.100.14:43158, bytes: 11, op: HDFS_READ, cliID: DFSClient_NONMAPREDUCE_-746601717_397829, offset: 0, srvID: DS-1825202225-172.16.100.16-50010-1397327304669, blockid: BP-1662111526-172.16.100.13-1397327293758:blk_-5469411569413407886_6973, duration: 75148
> 2014-04-13 10:01:29,294 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-1662111526-172.16.100.13-1397327293758:blk_8868060350766479646_6965 src: /172.16.100.14:43167 dest: /172.16.100.16:50010
> 2014-04-13 10:01:29,304 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Received BP-1662111526-172.16.100.13-1397327293758:blk_8868060350766479646_6965 src: /172.16.100.14:43167 dest: /172.16.100.16:50010 of size 80053
> 2014-04-13 10:01:30,686 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /172.16.100.16:50010, dest: /172.16.100.16:46021, bytes: 80681, op: HDFS_READ, cliID: DFSClient_NONMAPREDUCE_113851557_419414, offset: 0, srvID: DS-1825202225-172.16.100.16-50010-1397327304669, blockid: BP-1662111526-172.16.100.13-1397327293758:blk_8868060350766479646_6965, duration: 129669
> Namenode logs of the target host:
>
> 2014-04-13 09:48:23,471 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* allocateBlock: /wiki/.distcp.tmp.attempt_1397327306299_0044_m_000008_0. BP-862979082-172.16.100.5-1397327020274 blk_-5593655825051572228_6376{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[172.16.100.8:50010|RBW]]}
>
> Datanode logs of the target host:
>
> 2014-04-13 09:48:43,789 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-862979082-172.16.100.5-1397327020274:blk_3331673758913146011_6363, type=LAST_IN_PIPELINE, downstreams=0:[]: Thread is interrupted.
> 2014-04-13 09:48:43,789 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-862979082-172.16.100.5-1397327020274:blk_3331673758913146011_6363, type=LAST_IN_PIPELINE, downstreams=0:[] terminating
> 2014-04-13 09:48:43,789 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: opWriteBlock BP-862979082-172.16.100.5-1397327020274:blk_3331673758913146011_6363 received exception java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/172.16.100.8:50010 remote=/172.16.100.14:44724]
> 2014-04-13 09:48:43,789 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: WorkGroup0013:50010:DataXceiver error processing WRITE_BLOCK operation src: /172.16.100.14:44724 dest: /172.16.100.8:50010
> java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/172.16.100.8:50010 remote=/172.16.100.14:44724]
>         at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
>         at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:159)
>         at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:129)
>         at java.io.FilterInputStream.read(FilterInputStream.java:116)
>         at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
>         at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
>         at java.io.DataInputStream.read(DataInputStream.java:132)
>         at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:192)
>
> core-site.xml and hdfs-site.xml of the source host:
>
> # cat ~/Programs/hadoop/etc/hadoop/core-site.xml
> <?xml version="1.0" encoding="UTF-8"?>
> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>
> <!-- Put site-specific property overrides in this file. -->
> <configuration>
>   <property> <name>fs.default.name</name> <value>hdfs://172.16.100.13:9000</value> </property>
>   <property> <name>hadoop.tmp.dir</name> <value>/tmp/hadoop-temp</value> </property>
>   <property> <name>hadoop.proxyuser.root.hosts</name> <value>*</value> </property>
>   <property> <name>hadoop.proxyuser.root.groups</name> <value>*</value> </property>
>   <property> <name>dfs.permissions</name> <value>false</value> </property>
> </configuration>
>
> # cat ~/Programs/hadoop/etc/hadoop/hdfs-site.xml
> <?xml version="1.0" encoding="UTF-8"?>
> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>
> <!-- Put site-specific property overrides in this file. -->
> <configuration>
>   <property> <name>dfs.replication</name> <value>1</value> </property>
>   <property> <name>dfs.permissions</name> <value>false</value> </property>
>   <property> <name>dfs.name.dir</name> <value>/tmp/data/dfs/name/</value> </property>
>   <property> <name>dfs.data.dir</name> <value>/tmp/data/dfs/data/</value> </property>
>   <property> <name>dfs.webhdfs.enabled</name> <value>true</value> </property>
> </configuration>
>
> core-site.xml and hdfs-site.xml of the target host:
>
> # cat Programs/hadoop/etc/hadoop/core-site.xml
> <?xml version="1.0" encoding="UTF-8"?>
> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>
> <!-- Put site-specific property overrides in this file. -->
> <configuration>
>   <property> <name>fs.default.name</name> <value>hdfs://172.16.100.5:9000</value> </property>
>   <property> <name>hadoop.tmp.dir</name> <value>/tmp/hadoop-temp</value> </property>
>   <property> <name>hadoop.proxyuser.root.hosts</name> <value>*</value> </property>
>   <property> <name>hadoop.proxyuser.root.groups</name> <value>*</value> </property>
>   <property> <name>dfs.permissions</name> <value>false</value> </property>
> </configuration>
>
> # cat Programs/hadoop/etc/hadoop/hdfs-site.xml
> <?xml version="1.0" encoding="UTF-8"?>
> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>
> <!-- Put site-specific property overrides in this file. -->
> <configuration>
>   <property> <name>dfs.replication</name> <value>1</value> </property>
>   <property> <name>dfs.permissions</name> <value>false</value> </property>
>   <property> <name>dfs.name.dir</name> <value>/tmp/data/dfs/name/</value> </property>
>   <property> <name>dfs.data.dir</name> <value>/tmp/data/dfs/data/</value> </property>
>   <property> <name>dfs.webhdfs.enabled</name> <value>true</value> </property>
> </configuration>
>
> --
> Thanks,