I'm trying to copy from 0.20.2 to 0.20.3. I was following the DistCp Guide, but I think I know the problem: I was running the command on the destination cluster, but when I call hadoop, my PATH resolves to the hadoop1 executable. So I went to the hadoop2 install and ran it with "./hadoop distcp -update hftp://mc00001:50070/ hdfs://mc00000:55310/", but now I get this error:
11/02/07 12:38:09 INFO tools.DistCp: srcPaths=[hftp://mc00001:50070/]
11/02/07 12:38:09 INFO tools.DistCp: destPath=hdfs://mc00000:55310/
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.mapred.JobConf.getCredentials()Lorg/apache/hadoop/security/Credentials;
        at org.apache.hadoop.tools.DistCp.checkSrcPath(DistCp.java:632)
        at org.apache.hadoop.tools.DistCp.copy(DistCp.java:656)
        at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
        at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)

________________________________________
From: Sonal Goyal [[email protected]]
Sent: Monday, February 07, 2011 12:11 PM
To: [email protected]
Subject: Re: Hadoop XML Error

Mike,

This error is not caused by malformed XML in the files you are trying to copy; it occurs because, for some reason, the source or destination listing cannot be retrieved or parsed. Are you trying to copy between different versions of the clusters? As far as I know, the destination should be writable, and distcp should be run from the destination cluster. See more here: http://hadoop.apache.org/common/docs/r0.20.2/distcp.html

Let us know how it goes.

Thanks and Regards,
Sonal
Connect Hadoop with databases, Salesforce, FTP servers and others <https://github.com/sonalgoyal/hiho>
Nube Technologies <http://www.nubetech.co>
<http://in.linkedin.com/in/sonalgoyal>

On Mon, Feb 7, 2011 at 9:21 PM, Korb, Michael [USA] <[email protected]> wrote:
> I am running two instances of Hadoop on a cluster and want to copy all the
> data from hadoop1 to the updated hadoop2. From hadoop2, I am running the
> command "hadoop distcp -update hftp://mc00001:50070/ hftp://mc00000:50070/"
> where mc00001 is the namenode of hadoop1 and mc00000 is the namenode of
> hadoop2.
> I get the following error:
>
> 11/02/07 10:12:31 INFO tools.DistCp: srcPaths=[hftp://mc00001:50070/]
> 11/02/07 10:12:31 INFO tools.DistCp: destPath=hftp://mc00000:50070/
> [Fatal Error] :1:215: XML document structures must start and end within the same entity.
> With failures, global counters are inaccurate; consider running with -i
> Copy failed: java.io.IOException: invalid xml directory content
>         at org.apache.hadoop.hdfs.HftpFileSystem$LsParser.fetchList(HftpFileSystem.java:350)
>         at org.apache.hadoop.hdfs.HftpFileSystem$LsParser.getFileStatus(HftpFileSystem.java:355)
>         at org.apache.hadoop.hdfs.HftpFileSystem.getFileStatus(HftpFileSystem.java:384)
>         at org.apache.hadoop.tools.DistCp.sameFile(DistCp.java:1227)
>         at org.apache.hadoop.tools.DistCp.setup(DistCp.java:1120)
>         at org.apache.hadoop.tools.DistCp.copy(DistCp.java:666)
>         at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>         at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
> Caused by: org.xml.sax.SAXParseException: XML document structures must start and end within the same entity.
>         at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1231)
>         at org.apache.hadoop.hdfs.HftpFileSystem$LsParser.fetchList(HftpFileSystem.java:344)
>         ... 9 more
>
> I am fairly certain that none of the XML files are malformed or corrupted.
> This thread (http://www.mail-archive.com/[email protected]/msg18064.html)
> discusses a similar problem caused by file permissions but doesn't seem to offer a solution.
> Any help would be appreciated.
>
> Thanks,
> Mike
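P.S. For reference, the invocation I'm attempting looks roughly like the sketch below. The install path (HADOOP2_HOME) is hypothetical, not my actual layout; the point is to call the destination cluster's own bin/hadoop with its own conf dir, so only the 0.20.3 jars are on the classpath (mixing in 0.20.2 classes is presumably what triggers the NoSuchMethodError). The script just prints the command as a dry run rather than executing it:

```shell
#!/bin/sh
# Sketch: run DistCp from the destination (0.20.3) cluster.
# HADOOP2_HOME is a hypothetical install path -- substitute your own.
HADOOP2_HOME=/opt/hadoop-0.20.3

# Source is read over HFTP (read-only, works across versions); the
# destination is written over HDFS, so distcp runs on the destination
# cluster using its own binary and config, not whatever is on $PATH.
CMD="$HADOOP2_HOME/bin/hadoop --config $HADOOP2_HOME/conf distcp -update hftp://mc00001:50070/ hdfs://mc00000:55310/"

# Dry run: print the command instead of executing it.
echo "$CMD"
```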
