I looked in JIRA but didn't see this reported, so I thought I'd see what this 
list thinks.  We've been using SOCKS proxying to access a Hadoop cluster, 
generally following the setup described in the Cloudera blog post 
(http://www.cloudera.com/blog/2008/12/03/securing-a-hadoop-cluster-through-a-gateway/
 ).  This works great when hadoop.rpc.socket.factory.class.default is set to 
org.apache.hadoop.net.SocksSocketFactory.  Generally things work well: hadoop 
dfs commands such as -ls, -rmr, and -cat all work fine.  The one command that 
doesn't work is fsck.  Note the following command and error:

hadoop fsck /
Exception in thread "main" java.net.NoRouteToHostException: No route to host

   Looking at org.apache.hadoop.hdfs.tools.DFSck.java, the connection is 
created using URLConnection, so it makes sense that it wouldn't work: 
URLConnection doesn't seem to use the configured socket factory.
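
   To illustrate one possible direction (a rough sketch only, not what DFSck 
does today): plain Java lets a URLConnection be opened through an explicit 
java.net.Proxy of type SOCKS.  The class name, method names, and the 
namenode/proxy addresses below are hypothetical, and I'm assuming the proxy 
address is available as a "host:port" string (I believe SocksSocketFactory 
reads it from hadoop.socks.server, but I haven't verified that).

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URI;
import java.net.URL;
import java.net.URLConnection;

// Hypothetical sketch: route an HTTP URLConnection through a SOCKS proxy.
public class SocksUrlSketch {

    // Parse a "host:port" string into a SOCKS proxy descriptor.
    // The address is left unresolved so no DNS lookup happens here.
    static Proxy socksProxy(String hostPort) {
        int idx = hostPort.lastIndexOf(':');
        if (idx <= 0 || idx == hostPort.length() - 1) {
            throw new IllegalArgumentException("expected host:port, got " + hostPort);
        }
        String host = hostPort.substring(0, idx);
        int port = Integer.parseInt(hostPort.substring(idx + 1));
        return new Proxy(Proxy.Type.SOCKS,
                InetSocketAddress.createUnresolved(host, port));
    }

    // Create (but do not yet connect) a connection that will use the proxy.
    static URLConnection open(URL url, String hostPort) throws IOException {
        return url.openConnection(socksProxy(hostPort));
    }

    public static void main(String[] args) throws IOException {
        // Both addresses are placeholders, not values from a real cluster.
        URL fsckUrl = URI.create("http://namenode.example.com:50070/fsck?path=/").toURL();
        URLConnection conn = open(fsckUrl, "localhost:1080");
        // No network traffic occurs until the stream is requested,
        // e.g. via conn.getInputStream().
        System.out.println("Prepared connection to " + conn.getURL());
    }
}
```

   Whether something like this belongs in DFSck itself or in a shared helper 
is exactly the part I'm unsure about.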

   So to me this seems like an issue.  Can someone please confirm?  If it is, 
I'll add a JIRA.  I'm happy to take a crack at making a change as well, but 
it's unclear to me what the easiest way to change it is.  I haven't run across 
any code in the codebase that uses hadoop.rpc.socket.factory.class.default for 
HTTP connections.

   Any thoughts would be appreciated.

   Thanks

   Andy
