Mike,

I've seen this when a directory has been removed or goes missing between
the time distcp starts and when it stats the source files.  You'll
probably want to make sure that no code or person is modifying the
filesystem during your copy.  Also, you should use hdfs as the
destination protocol.
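Something along these lines should work -- hftp for the read-only source,
hdfs for the writable destination.  The RPC port below (8020) is just the
common default; check fs.default.name in your core-site.xml for the
actual value:

```shell
# Read from hadoop1 over HFTP (NameNode HTTP port), write to hadoop2
# over the native HDFS protocol (NameNode RPC port, often 8020).
hadoop distcp -update hftp://mc00001:50070/ hdfs://mc00000:8020/
```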

Cheers,

-Xavier


On 2/7/11 7:51 AM, Korb, Michael [USA] wrote:
> I am running two instances of Hadoop on a cluster and want to copy all the 
> data from hadoop1 to the updated hadoop2. From hadoop2, I am running the 
> command "hadoop distcp -update hftp://mc00001:50070/ hftp://mc00000:50070/" 
> where mc00001 is the namenode of hadoop1 and mc00000 is the namenode of 
> hadoop2. I get the following error:
>
> 11/02/07 10:12:31 INFO tools.DistCp: srcPaths=[hftp://mc00001:50070/]
> 11/02/07 10:12:31 INFO tools.DistCp: destPath=hftp://mc00000:50070/
> [Fatal Error] :1:215: XML document structures must start and end within the 
> same entity.
> With failures, global counters are inaccurate; consider running with -i
> Copy failed: java.io.IOException: invalid xml directory content
>       at 
> org.apache.hadoop.hdfs.HftpFileSystem$LsParser.fetchList(HftpFileSystem.java:350)
>       at 
> org.apache.hadoop.hdfs.HftpFileSystem$LsParser.getFileStatus(HftpFileSystem.java:355)
>       at 
> org.apache.hadoop.hdfs.HftpFileSystem.getFileStatus(HftpFileSystem.java:384)
>       at org.apache.hadoop.tools.DistCp.sameFile(DistCp.java:1227)
>       at org.apache.hadoop.tools.DistCp.setup(DistCp.java:1120)
>       at org.apache.hadoop.tools.DistCp.copy(DistCp.java:666)
>       at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
>       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>       at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
> Caused by: org.xml.sax.SAXParseException: XML document structures must start 
> and end within the same entity.
>       at 
> com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1231)
>       at 
> org.apache.hadoop.hdfs.HftpFileSystem$LsParser.fetchList(HftpFileSystem.java:344)
>       ... 9 more
>
> I am fairly certain that none of the XML files are malformed or corrupted. 
> This thread 
> (http://www.mail-archive.com/[email protected]/msg18064.html) 
> discusses a similar problem caused by file permissions but doesn't seem to 
> offer a solution. Any help would be appreciated.
>
> Thanks,
> Mike