Philip Zeyliger wrote:

You could use ssh to set up a SOCKS proxy between your machine and
ec2, and set org.apache.hadoop.net.SocksSocketFactory to be the
socket factory.
http://www.cloudera.com/blog/2008/12/03/securing-a-hadoop-cluster-through-a-gateway/
has more information.

Very useful write-up. Regarding the reverse DNS problem mentioned there (which is why you had to add a DNS record for the internal IP): it is fixed in https://issues.apache.org/jira/browse/HADOOP-5191, at least for HDFS access. Some mapred parts are still affected (HADOOP-5610). Depending on reverse DNS should be avoided.

Ideally, setting fs.default.name to the internal IP should just work for clients, both internally and externally (through proxies).
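
For anyone wondering what "through proxies" amounts to on the client side, here is a rough sketch using only the JDK. It shows the general mechanism a SOCKS socket factory relies on: sockets are created against a java.net.Proxy of type SOCKS, so later connects are tunnelled through it. The class name SocksSketch, the helper names, and the localhost:6666 endpoint (for example, one opened with "ssh -D 6666 user@gateway") are assumptions for illustration, not anything taken from Hadoop itself.

```java
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.Socket;

public class SocksSketch {
    // Describe a SOCKS proxy. The host and port are hypothetical; they would
    // be whatever local endpoint the ssh tunnel listens on.
    static Proxy socksProxy(String host, int port) {
        // createUnresolved skips the local DNS lookup of the proxy host.
        return new Proxy(Proxy.Type.SOCKS,
                InetSocketAddress.createUnresolved(host, port));
    }

    // Create an unconnected socket bound to the proxy. A later connect()
    // on this socket is routed through the SOCKS proxy.
    static Socket unconnectedSocket(Proxy proxy) {
        return new Socket(proxy);
    }

    public static void main(String[] args) {
        Proxy proxy = socksProxy("localhost", 6666);
        Socket socket = unconnectedSocket(proxy);
        System.out.println(proxy.type() + " connected=" + socket.isConnected());
    }
}
```

In a real client you would not write this yourself; you would point Hadoop at SocksSocketFactory in the configuration, as described in the write-up linked above.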

Raghu.
