Julien Nioche wrote:
I have tried setting *slave.host.name* to the public address of my
data node. I can now see the node listed with its public address on
dfshealth.jsp; however, when I try to send a file to HDFS from my
external server I still get:

08/09/08 15:58:41 INFO dfs.DFSClient: Waiting to find target node: 10.251.75.177:50010
08/09/08 15:59:50 INFO dfs.DFSClient: Exception in createBlockOutputStream java.net.SocketTimeoutException
08/09/08 15:59:50 INFO dfs.DFSClient: Abandoning block blk_-8257572465338588575
08/09/08 15:59:50 INFO dfs.DFSClient: Waiting to find target node: 10.251.75.177:50010
08/09/08 15:59:56 WARN dfs.DFSClient: DataStreamer Exception: java.io.IOException: Unable to create new block.
        at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2246)
        at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1700(DFSClient.java:1702)
        at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1842)

08/09/08 15:59:56 WARN dfs.DFSClient: Error Recovery for block blk_-8257572465338588575 bad datanode[0]
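
For reference, I am not doing anything exotic on the client side; a minimal sketch of the kind of copy involved is below (the namenode host, port and paths are placeholders, not my real ones). The namenode RPC itself succeeds; the timeout comes when the client connects directly to the datanode address the namenode hands back (10.251.75.177:50010 above), which is the instance's private address and unreachable from outside EC2.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class PutFromOutsideEc2 {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Point the client at the namenode's public EC2 address (placeholder).
        conf.set("fs.default.name",
                 "hdfs://ec2-xx-xx-xx-xx.compute-1.amazonaws.com:9000");
        FileSystem fs = FileSystem.get(conf);

        // Metadata operations against the namenode work; the timeout happens
        // when the client opens a socket to the datanode to stream the block.
        fs.copyFromLocalFile(new Path("/tmp/local-file.txt"),
                             new Path("/user/julien/local-file.txt"));
        fs.close();
    }
}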

Is there another parameter I could specify to force the address my
datanode advertises? I have been searching the EC2 forums and documentation,
and apparently there is no way to use *dfs.datanode.dns.interface* or
*dfs.datanode.dns.nameserver* to make the datanode report the public IP of my
instance.
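
One way to check what the namenode is actually telling an external client, independent of what dfshealth.jsp displays, is to ask for the block locations of a file that is already in HDFS. A rough sketch with the FileSystem API (hostname and path are placeholders) is below; if the names come back as 10.x.x.x addresses, the client will keep dialling the private address whatever the web UI shows.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ShowBlockLocations {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.default.name",
                 "hdfs://ec2-xx-xx-xx-xx.compute-1.amazonaws.com:9000");
        FileSystem fs = FileSystem.get(conf);

        Path p = new Path("/user/julien/some-existing-file");
        FileStatus stat = fs.getFileStatus(p);
        BlockLocation[] blocks = fs.getFileBlockLocations(stat, 0, stat.getLen());
        for (BlockLocation block : blocks) {
            // Each name is host:port as registered by the datanode,
            // e.g. 10.251.75.177:50010.
            for (String name : block.getNames()) {
                System.out.println(name);
            }
        }
        fs.close();
    }
}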

Has anyone else managed to send/retrieve stuff from HDFS on an EC2 cluster
from an external machine?

I think most people try to avoid allowing remote access for security reasons. If you can add a file, I can mount your filesystem too, and maybe even delete things. Whereas with an EC2-only filesystem, your files are *only* exposed to everyone else who knows, or can scan for, your IPAddr and ports.
