Thanks for the responses. In our case the port is 0, so according to the link Ted mentioned (<http://wiki.apache.org/hadoop/BindException>) a port collision is highly unlikely:
"If the port is "0", then the OS is looking for any free port -so the port-in-use and port-below-1024 problems are highly unlikely to be the cause of the problem."

I think load may be the culprit, since the nodes are heavily used during the times the exception occurs. Is there any way to set or increase the timeout for the call/connection attempt? In every case so far it seems to happen on a call to delete a file in HDFS. I searched through the HDFS code base but couldn't see an obvious way to set a timeout, nor could I see one being set.

Krishna

On 28 February 2015 at 15:20, Ted Yu <yuzhih...@gmail.com> wrote:

> Krishna:
> Please take a look at:
> http://wiki.apache.org/hadoop/BindException
>
> Cheers
>
> On Thu, Feb 26, 2015 at 10:30 PM, <hadoop.supp...@visolve.com> wrote:
>
>> Hello Krishna,
>>
>> The exception seems to be IP specific. It might occur due to
>> unavailability of an IP address in the system to assign. Double-check the
>> IP address availability and run the job.
>>
>> Thanks,
>> S.RagavendraGanesh
>> ViSolve Hadoop Support Team
>> ViSolve Inc. | San Jose, California
>> Website: www.visolve.com
>> email: servi...@visolve.com | Phone: 408-850-2243
>>
>> From: Krishna Rao [mailto:krishnanj...@gmail.com]
>> Sent: Thursday, February 26, 2015 9:48 PM
>> To: user@hive.apache.org; u...@hadoop.apache.org
>> Subject: Intermittent BindException during long MR jobs
>>
>> Hi,
>>
>> we occasionally run into a BindException causing long-running jobs to
>> occasionally fail.
>>
>> The stacktrace is below.
>>
>> Any ideas what this could be caused by?
>>
>> Cheers,
>>
>> Krishna
>>
>> Stacktrace:
>>
>> 379969 [Thread-980] ERROR org.apache.hadoop.hive.ql.exec.Task - Job Submission failed with exception 'java.net.BindException(Problem binding to [back10/10.4.2.10:0] java.net.BindException: Cannot assign requested address; For more details see: http://wiki.apache.org/hadoop/BindException)'
>> java.net.BindException: Problem binding to [back10/10.4.2.10:0] java.net.BindException: Cannot assign requested address; For more details see: http://wiki.apache.org/hadoop/BindException
>>         at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:718)
>>         at org.apache.hadoop.ipc.Client.call(Client.java:1242)
>>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
>>         at com.sun.proxy.$Proxy10.create(Unknown Source)
>>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:193)
>>         at sun.reflect.GeneratedMethodAccessor43.invoke(Unknown Source)
>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>         at java.lang.reflect.Method.invoke(Method.java:597)
>>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
>>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
>>         at com.sun.proxy.$Proxy11.create(Unknown Source)
>>         at org.apache.hadoop.hdfs.DFSOutputStream.<init>(DFSOutputStream.java:1376)
>>         at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1395)
>>         at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1255)
>>         at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1212)
>>         at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:276)
>>         at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:265)
>>         at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:82)
>>         at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:888)
>>         at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:869)
>>         at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:768)
>>         at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:757)
>>         at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:558)
>>         at org.apache.hadoop.mapreduce.split.JobSplitWriter.createFile(JobSplitWriter.java:96)
>>         at org.apache.hadoop.mapreduce.split.JobSplitWriter.createSplitFiles(JobSplitWriter.java:85)
>>         at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:517)
>>         at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:487)
>>         at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:369)
>>         at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1286)
>>         at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1283)
>>         at java.security.AccessController.doPrivileged(Native Method)
>>         at javax.security.auth.Subject.doAs(Subject.java:396)
>>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
>>         at org.apache.hadoop.mapreduce.Job.submit(Job.java:1283)
>>         at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:606)
>>         at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:601)
>>         at java.security.AccessController.doPrivileged(Native Method)
>>         at javax.security.auth.Subject.doAs(Subject.java:396)
>>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
>>         at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:601)
>>         at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:586)
>>         at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:448)
>>         at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:138)
>>         at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:138)
>>         at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:66)
>>         at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:56)
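
On the timeout question: the IPC client's connect behaviour is driven by configuration rather than an API-level setter, which may be why nothing obvious shows up in the HDFS code base. A sketch of the relevant core-site.xml properties follows — the property names come from hadoop-common's `ipc.Client`, the values are purely illustrative, and the defaults in the comments are approximate and worth checking against your Hadoop version. Note also that these govern connect attempts, so whether they help with a bind-time failure is uncertain:

```xml
<!-- core-site.xml: IPC client connect tuning (values are illustrative) -->
<property>
  <name>ipc.client.connect.timeout</name>
  <value>60000</value> <!-- ms per connect attempt; default is 20000 -->
</property>
<property>
  <name>ipc.client.connect.max.retries</name>
  <value>20</value> <!-- retries on connect failure; default is 10 -->
</property>
```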
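
P.S. Since the BindException here is thrown while binding the client side of the connection to port 0, one plausible cause under heavy load is ephemeral-port exhaustion rather than a true port collision. A quick way to check on a Linux node (these are standard Linux interfaces, not anything from the thread; `ss` assumes iproute2 is installed):

```shell
# Ephemeral port range the kernel draws from when a socket binds to port 0
cat /proc/sys/net/ipv4/ip_local_port_range

# Sockets parked in TIME_WAIT; if this count approaches the size of the
# range above, new port-0 binds can fail with "Cannot assign requested address"
ss -tan state time-wait | wc -l
```

If the TIME_WAIT count is in the tens of thousands while the job runs, widening `ip_local_port_range` (or spreading the submission load) would be the thing to try first.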