Owen O'Malley wrote:
Sure, the info server on the name node of HDFS has a read-only interface
that lists directories in XML and allows the client to read files over
HTTP. There is a FileSystem implementation that provides the client-side
interface to the XML/HTTP access.
To use it, you need
On Feb 28, 2008, at 8:20 AM, Steve Sapovits wrote:
Can you further explain the hftp part of this? I'm not familiar
with that. We have a similar need to go cross-data center.
Owen O'Malley wrote:
To copy between clusters, there is a tool called distcp. Look at
"bin/hadoop distcp". It runs a map/reduce job that copies a group of
files. It can also be used to copy between versions of Hadoop, if the
source file system is hftp, which uses XML to read HDFS.
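
As a rough illustration of the distcp description above (not part of the original message), the sketch below assembles a "bin/hadoop distcp" command line whose source is an hftp URI. The host names, the paths, the info server port 50070 and the HDFS port 9000 are all assumptions made for the example, and the helper functions hftp_uri and distcp_command are hypothetical; substitute the values from your own cluster configuration.

```python
import shlex


def hftp_uri(namenode_host, info_port, path):
    """Build an hftp:// URI that points at the name node's info (HTTP) server.

    Hypothetical helper: host, port and path are supplied by the caller.
    """
    if not path.startswith("/"):
        path = "/" + path
    return f"hftp://{namenode_host}:{info_port}{path}"


def distcp_command(src_uri, dst_uri, hadoop_bin="bin/hadoop"):
    """Return the argument list for a distcp run: source first, then destination."""
    return [hadoop_bin, "distcp", src_uri, dst_uri]


# Assumed values for illustration only.
src = hftp_uri("namenode-a.example.com", 50070, "/crawl/segments")
dst = "hdfs://namenode-b.example.com:9000/crawl/segments"

cmd = distcp_command(src, dst)

# Print the command instead of running it, so it can be reviewed first.
print(shlex.join(cmd))
```

Because the hftp interface is read-only, it can only serve as the source of the copy; the destination has to be a writable file system such as hdfs.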
On Feb 28, 2008, at 2:43 AM, Miles Osborne wrote:
Currently, we have the following setup:
--cluster A, running Nutch: small RAM per node
--cluster B, just running Hadoop: lots of RAM per node
At some point in the future we will want cluster B to talk to
cluster A, and ideally this should