Re: Intermittent BindException during long MR jobs
Thanks for the responses. In our case the port is 0, and so the link Ted mentioned (http://wiki.apache.org/hadoop/BindException) says a collision is highly unlikely: "If the port is 0, then the OS is looking for any free port - so the port-in-use and port-below-1024 problems are highly unlikely to be the cause of the problem."

I think load may be the culprit, since the nodes are heavily used at the times the exception occurs. Is there any way to set or increase the timeout for the call/connection attempt? In all cases so far it seems to happen on a call to delete a file in HDFS. I had a search through the HDFS code base but couldn't see an obvious way to set a timeout, nor could I see one being set.

Krishna

On 28 February 2015 at 15:20, Ted Yu <yuzhih...@gmail.com> wrote:
> Krishna: Please take a look at: http://wiki.apache.org/hadoop/BindException
> Cheers
>
> On Thu, Feb 26, 2015 at 10:30 PM, <hadoop.supp...@visolve.com> wrote:
>> Hello Krishna,
>> The exception seems to be IP-specific. It might have occurred because no IP address was available in the system to assign. Double-check the IP address availability and run the job.
>>
>> Thanks,
>> S.RagavendraGanesh
>> ViSolve Hadoop Support Team
>> ViSolve Inc. | San Jose, California
>> Website: www.visolve.com
>> email: servi...@visolve.com | Phone: 408-850-2243
>>
>> From: Krishna Rao [mailto:krishnanj...@gmail.com]
>> Sent: Thursday, February 26, 2015 9:48 PM
>> To: u...@hive.apache.org; user@hadoop.apache.org
>> Subject: Intermittent BindException during long MR jobs
>>
>> Hi, we occasionally run into a BindException causing long-running jobs to fail. The stacktrace is below. Any ideas what this could be caused by?
Cheers, Krishna

Stacktrace:

379969 [Thread-980] ERROR org.apache.hadoop.hive.ql.exec.Task - Job Submission failed with exception 'java.net.BindException(Problem binding to [back10/10.4.2.10:0] java.net.BindException: Cannot assign requested address; For more details see: http://wiki.apache.org/hadoop/BindException)'
java.net.BindException: Problem binding to [back10/10.4.2.10:0] java.net.BindException: Cannot assign requested address; For more details see: http://wiki.apache.org/hadoop/BindException
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:718)
        at org.apache.hadoop.ipc.Client.call(Client.java:1242)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
        at com.sun.proxy.$Proxy10.create(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:193)
        at sun.reflect.GeneratedMethodAccessor43.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
        at com.sun.proxy.$Proxy11.create(Unknown Source)
        at org.apache.hadoop.hdfs.DFSOutputStream.init(DFSOutputStream.java:1376)
        at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1395)
        at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1255)
        at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1212)
        at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:276)
        at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:265)
        at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:82)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:888)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:869)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:768)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:757)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:558)
        at org.apache.hadoop.mapreduce.split.JobSplitWriter.createFile(JobSplitWriter.java:96)
        at org.apache.hadoop.mapreduce.split.JobSplitWriter.createSplitFiles(JobSplitWriter.java:85)
        at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:517)
        at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:487)
        at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:369)
        at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1286)
        at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1283)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396
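The timeout question above turns on why a bind to port 0 can fail under load at all: on a heavily used node, a shortage of free ephemeral ports (often from sockets lingering in TIME_WAIT) produces exactly this "Cannot assign requested address" error. A hedged diagnostic sketch (Linux paths assumed; the fallback range is only an illustrative common default so the sketch runs anywhere):

```shell
# Rough check for ephemeral-port exhaustion on a busy node.
# /proc is Linux-specific; the fallback values are a typical default,
# used here only so the sketch runs on any machine.
range=$(cat /proc/sys/net/ipv4/ip_local_port_range 2>/dev/null || echo "32768 61000")
low=$(echo "$range" | awk '{print $1}')
high=$(echo "$range" | awk '{print $2}')
headroom=$((high - low + 1))
echo "ephemeral ports in range: $headroom"
# Sockets in TIME_WAIT hold ephemeral ports; a count approaching the
# range size means new client binds will intermittently fail.
tw=$(netstat -ant 2>/dev/null | awk '$NF == "TIME_WAIT"' | wc -l)
echo "sockets in TIME_WAIT: $tw"
```

If TIME_WAIT counts track the job failures, widening the port range or enabling TCP TIME_WAIT reuse on the nodes would be the usual next step to investigate.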
Intermittent BindException during long MR jobs
Hi, we occasionally run into a BindException causing long-running jobs to fail. The stacktrace is below. Any ideas what this could be caused by?

Cheers, Krishna

Stacktrace:

379969 [Thread-980] ERROR org.apache.hadoop.hive.ql.exec.Task - Job Submission failed with exception 'java.net.BindException(Problem binding to [back10/10.4.2.10:0] java.net.BindException: Cannot assign requested address; For more details see: http://wiki.apache.org/hadoop/BindException)'
java.net.BindException: Problem binding to [back10/10.4.2.10:0] java.net.BindException: Cannot assign requested address; For more details see: http://wiki.apache.org/hadoop/BindException
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:718)
        at org.apache.hadoop.ipc.Client.call(Client.java:1242)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
        at com.sun.proxy.$Proxy10.create(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:193)
        at sun.reflect.GeneratedMethodAccessor43.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
        at com.sun.proxy.$Proxy11.create(Unknown Source)
        at org.apache.hadoop.hdfs.DFSOutputStream.init(DFSOutputStream.java:1376)
        at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1395)
        at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1255)
        at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1212)
        at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:276)
        at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:265)
        at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:82)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:888)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:869)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:768)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:757)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:558)
        at org.apache.hadoop.mapreduce.split.JobSplitWriter.createFile(JobSplitWriter.java:96)
        at org.apache.hadoop.mapreduce.split.JobSplitWriter.createSplitFiles(JobSplitWriter.java:85)
        at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:517)
        at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:487)
        at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:369)
        at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1286)
        at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1283)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
        at org.apache.hadoop.mapreduce.Job.submit(Job.java:1283)
        at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:606)
        at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:601)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
        at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:601)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:586)
        at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:448)
        at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:138)
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:138)
        at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:66)
        at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:56)
HBase checksum vs HDFS checksum
Hi all, I understand that there is a significant performance gain from turning on short-circuit reads, and additionally from having HBase do checksums rather than HDFS. However, I'm a little confused by this: do I need to turn off checksums within HDFS for the entire file system? We don't just use HBase on our cluster, so that would seem to be a bad idea, right? Cheers, Krishna
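For reference, HBase-level checksums do not require disabling HDFS checksums filesystem-wide: with `hbase.regionserver.checksum.verify` enabled, HBase stores and verifies its own checksums inside HFile blocks and skips only the redundant HDFS checksum verification on its own reads; other HDFS users are unaffected. A sketch of the relevant settings (property names as in the HBase 0.94+/CDH4-era documentation; verify against your version):

```xml
<!-- hbase-site.xml: have HBase verify its own HFile-block checksums -->
<property>
  <name>hbase.regionserver.checksum.verify</name>
  <value>true</value>
</property>

<!-- hdfs-site.xml: short-circuit reads, which this setting pairs with -->
<property>
  <name>dfs.client.read.shortcircuit</name>
  <value>true</value>
</property>
```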
Job froze for hours because of an unresponsive disk on one of the task trackers
Hi, we have a daily Hive script that usually takes a few hours to run. The other day I noticed one of the jobs was taking well in excess of that. Digging into it, I saw that there were 3 attempts to launch a task on a single node:

Task Id                        Start Time  Finish Time  Error
task_201312241250_46714_r_48                            Error launching task
task_201312241250_46714_r_49                            Error launching task
task_201312241250_46714_r_50                            Error launching task

I later found out that this node had a dodgy/unresponsive disk (still being tested right now). We've seen tasks fail in the past, but they were re-submitted to another node and succeeded. So, shouldn't this task have been kicked off on another node after the first failure? Is there anything I could be missing in terms of configuration? We're using CDH4.4.0. Cheers, Krishna
Re: Job froze for hours because of an unresponsive disk on one of the task trackers
I noticed, but none of the jobs ended up being re-submitted! And all 3 of those attempts failed on the same node. All we know is that the disk on that node became unresponsive.

On 27 March 2014 09:33, Dieter De Witte <drdwi...@gmail.com> wrote:
> The ids of the tasks are different, so the node got killed after failing on 3 different(!) reduce tasks. Reduce task 48 will probably have been resubmitted to another node.
>
> 2014-03-27 10:22 GMT+01:00 Krishna Rao <krishnanj...@gmail.com>:
>> Hi, we have a daily Hive script that usually takes a few hours to run. The other day I noticed one of the jobs was taking in excess of a few hours. Digging into it I saw that there were 3 attempts to launch a task on a single node:
>>
>> task_201312241250_46714_r_48  Error launching task
>> task_201312241250_46714_r_49  Error launching task
>> task_201312241250_46714_r_50  Error launching task
>>
>> I later found out that this node had a dodgy/unresponsive disk (still being tested right now). We've seen tasks fail in the past, but be re-submitted to another node and succeed. So, shouldn't this task have been kicked off on another node after the first failure? Is there anything I could be missing in terms of configuration? We're using CDH4.4.0. Cheers, Krishna
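For what it's worth, the blacklisting and re-submission behaviour discussed in this thread is tunable in MR1 (the CDH4 JobTracker). The settings below are pointers from the MR1-era configuration to verify, not a definitive fix; the script path is a hypothetical placeholder:

```xml
<!-- mapred-site.xml (MR1). After this many task failures on one
     TaskTracker, the job blacklists that tracker and schedules its
     remaining attempts on other nodes. -->
<property>
  <name>mapred.max.tracker.failures</name>
  <value>4</value>
</property>

<!-- Optional node-health script; a TaskTracker whose script reports an
     ERROR is taken out of service, which would catch a node with an
     unresponsive disk before it stalls a job. -->
<property>
  <name>mapred.healthChecker.script.path</name>
  <value>/etc/hadoop/health_check.sh</value>
</property>
```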
Possible to list all failed jobs?
Is it possible to list just the failed jobs, either in the JobTracker's JobHistory or otherwise? Cheers, Krishna
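One hedged approach from the command line, assuming the MR1 `hadoop job -list all` output format (a header followed by rows whose second column is a numeric job state; in that era's JobStatus constants, FAILED is 3 - verify against your version's output):

```shell
# Sketch: filter failed jobs out of the MR1 job listing.
# Rows look like: job_<id> <state> <startTime> <user> <priority> ...
# and FAILED corresponds to state 3.
list_failed() {
  awk '$2 == 3 {print $1}'
}

# On a cluster node this would print only the failed job ids:
hadoop job -list all 2>/dev/null | list_failed
```

The JobTracker web UI's "Failed Jobs" table shows the same information interactively; the CLI form is handier for scripting or cron alerts.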
Re: Possible to run an application jar as a hadoop daemon?
Thanks for the replies. Harsh J, `hadoop classpath` was exactly what I needed. Got it working now. Cheers, Krishna

On 6 January 2013 11:14, John Hancock <jhancock1...@gmail.com> wrote:
> Krishna, You should be able to take the command you are using to start the hadoop job (hadoop jar ...) and paste it into a text file. Then make the file executable and call it as a shell script in a cron job (crontab -e). To be safe, use absolute paths to reference any files in the command. Or, I suppose what you crazy kids and your object-oriented programming would do is use Quartz. -John
>
> On Sat, Jan 5, 2013 at 4:33 PM, Chitresh Deshpande <chitreshdeshpa...@gmail.com> wrote:
>> Hi Krishna, I don't know what you mean by a Hadoop daemon, but if you mean run when all the other Hadoop daemons like the namenode, datanode etc. are started, then you can change the start-all file in the conf directory. Thanks and Regards, Chitresh Deshpande
>>
>> On Fri, Jan 4, 2013 at 6:40 AM, Krishna Rao <krishnanj...@gmail.com> wrote:
>>> Hi all, I have a Java application jar that converts some files and writes directly into HDFS. If I want to run the jar I need to run it using "hadoop jar <application jar>", so that it can access HDFS (that is, running "java -jar <application jar>" results in an HDFS error). Is it possible to run a jar as a hadoop daemon? Cheers, Krishna
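The `hadoop classpath` trick that resolved this can be sketched as a small wrapper script suitable for cron; the jar path and main class below are hypothetical placeholders:

```shell
#!/bin/sh
# Build a classpath containing the Hadoop/HDFS client jars so a plain
# "java" process can reach HDFS without going through "hadoop jar".
# The fallback is only so this sketch runs where hadoop isn't installed.
CP=$(hadoop classpath 2>/dev/null || echo ".")
echo "classpath entries: $(echo "$CP" | tr ':' '\n' | wc -l)"

# To run detached, daemon-style (jar path, class, and log are placeholders):
#   nohup java -cp "/path/to/myapp.jar:$CP" com.example.MyApp \
#     >> /var/log/myapp.log 2>&1 &
```

Made executable and referenced by absolute path, the same script works as a crontab entry, per John's suggestion above.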
Possible to run an application jar as a hadoop daemon?
Hi all, I have a Java application jar that converts some files and writes directly into HDFS. If I want to run the jar I need to run it using "hadoop jar <application jar>", so that it can access HDFS (that is, running "java -jar <application jar>" results in an HDFS error). Is it possible to run a jar as a hadoop daemon? Cheers, Krishna