[ https://issues.apache.org/jira/browse/MAPREDUCE-684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Suhas Gogate resolved MAPREDUCE-684.
------------------------------------
Resolution: Invalid
Sorry, I did not realize that the -i option would return success even when some
files fail to copy.
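
For anyone hitting the same behaviour, a minimal sketch of a stricter invocation (not part of the original report; the paths and queue name are simply reused from the transcript quoted below, and the "Files failed" counter name is taken from the job counters shown there): without -i a failed copy should fail the job and yield a non-zero exit status, and the client output can additionally be scanned for failed files.

  # Sketch only, not from the original report. Same source/target as in the
  # report below, but -i is dropped so a failed copy fails the job itself.
  hadoop distcp -Dmapred.job.queue.name=xxxx -p -update -delete \
      hftp://xxx.mydomain.com:50070/user/gogate/mirror_test2 \
      hdfs://yyy.mydomain.com:8020/user/gogate/mirror_test2 2>&1 | tee distcp.out
  status=${PIPESTATUS[0]}        # exit status of distcp itself, not of tee
  # Also check the "Files failed" counter printed by the job client.
  if [ "$status" -ne 0 ] || grep -q 'Files failed=[1-9]' distcp.out; then
      echo "distcp failed or skipped some files" >&2
      exit 1
  fi
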
> distcp returns success but does not copy files due to a connection problem.
> The error is logged in the target HDFS log directory
> ----------------------------------------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-684
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-684
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: distcp
> Affects Versions: 0.20.1
> Reporter: Suhas Gogate
>
> Distcp returns success even though files are not copied due to a connection
> problem. It creates an empty directory structure on the target and logs the
> error message in the target HDFS log directory.
> The distcp command was run on a Hadoop 0.20 cluster, fetching data from a Hadoop 0.18 cluster.
> -bash-3.1$ hadoop distcp -Dmapred.job.queue.name=xxxx -i -p -update -delete hftp://xxx.mydomain.com:50070/user/gogate/mirror_test2 hdfs://yyy.mydomain.com:8020/user/gogate/mirror_test2
> 09/06/30 18:41:29 INFO tools.DistCp: srcPaths=[hftp://xxx.mydomain.com:50070/user/gogate/mirror_test2]
> 09/06/30 18:41:29 INFO tools.DistCp: destPath=hdfs://yyy.mydomain.com:8020/user/gogate/mirror_test2
> 09/06/30 18:41:30 INFO tools.DistCp: hdfs://yyy.mydomain.com:8020/user/gogate/mirror_test2 does not exist.
> 09/06/30 18:41:30 INFO tools.DistCp: srcCount=4
> 09/06/30 18:41:36 INFO mapred.JobClient: Running job: job_200906290541_3336
> 09/06/30 18:41:37 INFO mapred.JobClient: map 0% reduce 0%
> 09/06/30 18:43:05 INFO mapred.JobClient: map 100% reduce 0%
> 09/06/30 18:43:28 INFO mapred.JobClient: Job complete: job_200906290541_3336
> echo $?
> 09/06/30 18:43:35 INFO mapred.JobClient: Counters: 8
> 09/06/30 18:43:35 INFO mapred.JobClient:   Job Counters
> 09/06/30 18:43:35 INFO mapred.JobClient:     Launched map tasks=1
> 09/06/30 18:43:35 INFO mapred.JobClient:   FileSystemCounters
> 09/06/30 18:43:35 INFO mapred.JobClient:     HDFS_BYTES_READ=534
> 09/06/30 18:43:35 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=3655
> 09/06/30 18:43:35 INFO mapred.JobClient:   distcp
> 09/06/30 18:43:35 INFO mapred.JobClient:     Files failed=2
> 09/06/30 18:43:35 INFO mapred.JobClient:   Map-Reduce Framework
> 09/06/30 18:43:35 INFO mapred.JobClient:     Map input records=3
> 09/06/30 18:43:35 INFO mapred.JobClient:     Spilled Records=0
> 09/06/30 18:43:35 INFO mapred.JobClient:     Map input bytes=434
> 09/06/30 18:43:35 INFO mapred.JobClient:     Map output records=2
> -bash-3.1$ echo $?
> 0
> Message in the target HDFS log directory:
> -bash-3.1$ hadoop fs -cat /user/gogate/_distcp_logs_f7twl9/part-00000
> FAIL pig_1245890239320.log : java.net.ConnectException: Connection refused
> at java.net.PlainSocketImpl.socketConnect(Native Method)
> at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
> at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
> at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
> at java.net.Socket.connect(Socket.java:519)
> at java.net.Socket.connect(Socket.java:469)
> at sun.net.NetworkClient.doConnect(NetworkClient.java:157)
> at sun.net.www.http.HttpClient.openServer(HttpClient.java:394)
> at sun.net.www.http.HttpClient.openServer(HttpClient.java:529)
> at sun.net.www.http.HttpClient.<init>(HttpClient.java:233)
> at sun.net.www.http.HttpClient.New(HttpClient.java:306)
> at sun.net.www.http.HttpClient.New(HttpClient.java:323)
> at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:788)
> at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:729)
> at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:654)
> at sun.net.www.protocol.http.HttpURLConnection.followRedirect(HttpURLConnection.java:1868)
> at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1172)
> at org.apache.hadoop.hdfs.HftpFileSystem.open(HftpFileSystem.java:142)
> at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:351)
> at org.apache.hadoop.tools.DistCp$CopyFilesMapper.copy(DistCp.java:410)
> at org.apache.hadoop.tools.DistCp$CopyFilesMapper.map(DistCp.java:537)
> at org.apache.hadoop.tools.DistCp$CopyFilesMapper.map(DistCp.java:306)
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:356)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
> at org.apache.hadoop.mapred.Child.main(Child.java:170)
> FAIL dir1/xxx.pig : java.net.ConnectException: Connection refused
> at java.net.PlainSocketImpl.socketConnect(Native Method)
> at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
> at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
> at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
> at java.net.Socket.connect(Socket.java:519)
> at java.net.Socket.connect(Socket.java:469)
> at sun.net.NetworkClient.doConnect(NetworkClient.java:157)
> at sun.net.www.http.HttpClient.openServer(HttpClient.java:394)
> at sun.net.www.http.HttpClient.openServer(HttpClient.java:529)
> at sun.net.www.http.HttpClient.<init>(HttpClient.java:233)
> at sun.net.www.http.HttpClient.New(HttpClient.java:306)
> at sun.net.www.http.HttpClient.New(HttpClient.java:323)
> at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:788)
> at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:729)
> at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:654)
> at sun.net.www.protocol.http.HttpURLConnection.followRedirect(HttpURLConnection.java:1868)
> at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1172)
> at org.apache.hadoop.hdfs.HftpFileSystem.open(HftpFileSystem.java:142)
> at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:351)
> at org.apache.hadoop.tools.DistCp$CopyFilesMapper.copy(DistCp.java:410)
> at org.apache.hadoop.tools.DistCp$CopyFilesMapper.map(DistCp.java:537)
> at org.apache.hadoop.tools.DistCp$CopyFilesMapper.map(DistCp.java:306)
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:356)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
> at org.apache.hadoop.mapred.Child.main(Child.java:170)
> -bash-3.1$
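
A possible follow-up check when -i is kept (a sketch only, not part of the original report): the per-file status log shown above records a FAIL line for every file that could not be copied, so the log directory on the target (the auto-created _distcp_logs_* directory, or one passed with -log) can be scanned after the job even though the exit status is 0.

  # Sketch only; the log directory name is the one from this particular run.
  LOGDIR=/user/gogate/_distcp_logs_f7twl9
  FAILED=$(hadoop fs -cat "$LOGDIR"/part-* | grep -c '^FAIL')
  if [ "$FAILED" -gt 0 ]; then
      echo "$FAILED file(s) failed to copy" >&2
      exit 1
  fi
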