Hi All,

I have just set up a CDH cluster on EC2 using Cloudera Manager 4.5. I have been trying to run a couple of MapReduce jobs as part of an Oozie workflow, but I am blocked by the following exception (my reducer always hangs because of it):

2013-04-17 00:32:02,268 WARN org.apache.hadoop.mapred.ReduceTask: attempt_201304170021_0003_r_000000_0 copy failed: attempt_201304170021_0003_m_000000_0 from ip-10-174-49-51.us-west-1.compute.internal
2013-04-17 00:32:02,269 WARN org.apache.hadoop.mapred.ReduceTask: java.net.SocketTimeoutException: connect timed out
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
    at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213)
    at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
    at java.net.Socket.connect(Socket.java:529)
    at sun.net.NetworkClient.doConnect(NetworkClient.java:158)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:395)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:530)
    at sun.net.www.http.HttpClient.<init>(HttpClient.java:234)
    at sun.net.www.http.HttpClient.New(HttpClient.java:307)
    at sun.net.www.http.HttpClient.New(HttpClient.java:324)
    at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:970)
    at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:911)
    at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:836)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getInputStream(ReduceTask.java:1573)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.setupSecureConnection(ReduceTask.java:1530)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1466)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1360)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1292)
2013-04-17 00:32:02,269 INFO org.apache.hadoop.mapred.ReduceTask: Task attempt_201304170021_0003_r_000000_0: Failed fetch #1 from attempt_201304170021_0003_m_000000_0
2013-04-17 00:32:02,269 WARN org.apache.hadoop.mapred.ReduceTask: attempt_201304170021_0003_r_000000_0 adding host ip-10-174-49-51.us-west-1.compute.internal to penalty box, next contact in 12 seconds

Any suggestions that can help me get around this? Really appreciate any help here.

Thanks,
Som