Mike,

This error is not caused by malformed XML files in the data you are trying to
copy; it means that, for some reason, the source or destination directory
listing could not be retrieved or parsed over HFTP. Are you trying to copy
between clusters running different Hadoop versions? Also note that HFTP is
read-only, so the destination must be a writable hdfs:// path and distcp
should be run from the destination cluster. See more here:
http://hadoop.apache.org/common/docs/r0.20.2/distcp.html
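As a sketch, the copy in your case would look something like the command below, run from a node on hadoop2 (the hostnames and ports are taken from your message; the NameNode RPC port 8020 is a common default and may differ in your configuration):

```shell
# Run on the destination cluster (hadoop2).
# Source uses hftp:// (read-only, version-independent);
# destination uses a writable hdfs:// URI, not hftp://.
hadoop distcp -update hftp://mc00001:50070/ hdfs://mc00000:8020/
```

If distcp is run on the destination cluster, the destination can also be given as a plain path (e.g. `/`), which resolves against that cluster's default filesystem.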

Let us know how it goes.

Thanks and Regards,
Sonal
Nube Technologies <http://www.nubetech.co>
HIHO - Connect Hadoop with databases, Salesforce, FTP servers and others
<https://github.com/sonalgoyal/hiho>
<http://in.linkedin.com/in/sonalgoyal>

On Mon, Feb 7, 2011 at 9:21 PM, Korb, Michael [USA] <korb_mich...@bah.com>wrote:

> I am running two instances of Hadoop on a cluster and want to copy all the
> data from hadoop1 to the updated hadoop2. From hadoop2, I am running the
> command "hadoop distcp -update hftp://mc00001:50070/ hftp://mc00000:50070/"
> where mc00001 is the namenode of hadoop1 and mc00000 is the namenode of
> hadoop2. I get the following error:
>
> 11/02/07 10:12:31 INFO tools.DistCp: srcPaths=[hftp://mc00001:50070/]
> 11/02/07 10:12:31 INFO tools.DistCp: destPath=hftp://mc00000:50070/
> [Fatal Error] :1:215: XML document structures must start and end within the
> same entity.
> With failures, global counters are inaccurate; consider running with -i
> Copy failed: java.io.IOException: invalid xml directory content
>        at
> org.apache.hadoop.hdfs.HftpFileSystem$LsParser.fetchList(HftpFileSystem.java:350)
>        at
> org.apache.hadoop.hdfs.HftpFileSystem$LsParser.getFileStatus(HftpFileSystem.java:355)
>        at
> org.apache.hadoop.hdfs.HftpFileSystem.getFileStatus(HftpFileSystem.java:384)
>        at org.apache.hadoop.tools.DistCp.sameFile(DistCp.java:1227)
>        at org.apache.hadoop.tools.DistCp.setup(DistCp.java:1120)
>        at org.apache.hadoop.tools.DistCp.copy(DistCp.java:666)
>        at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
>        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>        at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
> Caused by: org.xml.sax.SAXParseException: XML document structures must
> start and end within the same entity.
>        at
> com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1231)
>        at
> org.apache.hadoop.hdfs.HftpFileSystem$LsParser.fetchList(HftpFileSystem.java:344)
>        ... 9 more
>
> I am fairly certain that none of the XML files are malformed or corrupted.
> This thread (
> http://www.mail-archive.com/core-dev@hadoop.apache.org/msg18064.html)
> discusses a similar problem caused by file permissions but doesn't seem to
> offer a solution. Any help would be appreciated.
>
> Thanks,
> Mike
>