Re: Hadoop startup problems (FileSystem is not ready yet!)
Changing the /etc/hosts line from "127.0.0.1 localhost, prasen-host" to "127.0.0.1 localhost" fixed the problem...

On Sat, Jun 16, 2012 at 12:30 PM, prasenjit mukherjee prasen@gmail.com wrote:

I started Hadoop in single-node/pseudo-distributed mode and took all the precautionary measures (dfsck, namenode -format etc.) before running start-all.sh. After startup, the jobtracker log keeps getting flooded with the stack traces below. I have a hunch it is related to the localhost/127.0.0.1 setup. Any pointers on how to fix this? Because of this I can't put anything into HDFS.

$ tail -f hadoop-prasen-jobtracker-oilreadproud-lm.log
2012-06-16 12:09:36,037 WARN org.apache.hadoop.mapred.JobTracker: Retrying...
2012-06-16 12:09:36,049 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception: java.lang.NumberFormatException: For input string: "0:0:0:0:0:0:1%0:50010"
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
    at java.lang.Integer.parseInt(Integer.java:458)
    at java.lang.Integer.parseInt(Integer.java:499)
    at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:148)
    at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:125)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:3025)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2983)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2255)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2446)
2012-06-16 12:09:36,049 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_-5253437002798877541_1048 bad datanode[0] nodes == null
2012-06-16 12:09:36,050 WARN org.apache.hadoop.hdfs.DFSClient: Could not get block locations. Source file /tmp/hadoop-prasen/mapred/system/jobtracker.info - Aborting...
2012-06-16 12:09:36,050 WARN org.apache.hadoop.mapred.JobTracker: Writing to file hdfs://localhost:9000/tmp/hadoop-prasen/mapred/system/jobtracker.info failed!
2012-06-16 12:09:36,050 WARN org.apache.hadoop.mapred.JobTracker: FileSystem is not ready yet!
2012-06-16 12:09:36,052 WARN org.apache.hadoop.mapred.JobTracker: Failed to initialize recovery manager.
java.io.IOException: Could not get block locations. Source file /tmp/hadoop-prasen/mapred/system/jobtracker.info - Aborting...
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2691)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1600(DFSClient.java:2255)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2423)
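For reference, the NumberFormatException above ("0:0:0:0:0:0:1%0:50010") suggests the datanode address resolved to a scoped IPv6 loopback literal, whose extra colons likely broke the host:port parsing in NetUtils.createSocketAddr. The fix described at the top of the thread amounts to the following change (hostnames taken from the original post):

```
# /etc/hosts -- before: extra alias (and stray comma) on the loopback line
127.0.0.1 localhost, prasen-host

# /etc/hosts -- after
127.0.0.1 localhost
```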
Setting number of mappers according to number of TextInput lines
Hello,

I have a very small input (a few kB), but processing it to produce the output takes several minutes. Is there a way to say: the file has 100 lines, I need 10 mappers, and each mapper node has to process 10 lines of the input file?

Thanks for advice.
Ondrej Klimpera
Re: Setting number of mappers according to number of TextInput lines
Hi Ondrej,

You can use NLineInputFormat with N set to 10.

--Original Message--
From: Ondřej Klimpera
To: common-user@hadoop.apache.org
ReplyTo: common-user@hadoop.apache.org
Subject: Setting number of mappers according to number of TextInput lines
Sent: Jun 16, 2012 14:31

Regards
Bejoy KS

Sent from handheld, please excuse typos.
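To make the suggestion concrete, here is a small sketch (plain Python, not Hadoop code) of the splitting behavior NLineInputFormat provides: the input is carved into splits of N lines each, and each split becomes one map task, so a 100-line file with N=10 yields 10 map tasks:

```python
def nline_splits(lines, n):
    """Group input lines into splits of n lines each, mimicking
    NLineInputFormat's one-split-per-N-lines behavior."""
    return [lines[i:i + n] for i in range(0, len(lines), n)]

# A 100-line input with N=10 produces 10 splits, i.e. 10 map tasks.
lines = ["line %d" % i for i in range(100)]
splits = nline_splits(lines, 10)
print(len(splits))  # -> 10
```

(In the old mapred API this corresponds to selecting NLineInputFormat as the job's input format and setting its lines-per-map property to 10.)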
Re: Setting number of mappers according to number of TextInput lines
I tried this approach, but the job is not distributed among 10 mapper nodes. Hadoop seems to ignore this property. :( My first thought is that the small file size is the problem and Hadoop doesn't split it properly.

Thanks for any ideas.

On 06/16/2012 11:27 AM, Bejoy KS wrote:
You can use NLineInputFormat with N set to 10.
Re: Map works well, but Reduce failed
Hi Raj,

I think you should increase the reducer worker threads to fetch the map output.

Regards
Abhishek

Sent from my iPhone

On Jun 15, 2012, at 9:42 PM, Raj Vishwanathan rajv...@yahoo.com wrote:

Most probably you have a network problem. Check your hostname and IP address mapping.

From: Yongwei Xing jdxyw2...@gmail.com
To: common-user@hadoop.apache.org
Sent: Thursday, June 14, 2012 10:15 AM
Subject: Map works well, but Reduce failed

Hi all,

I run a simple sort program, but I hit errors like the ones below.

12/06/15 01:13:17 WARN mapred.JobClient: Error reading task output Server returned HTTP response code: 403 for URL: http://192.168.1.106:50060/tasklog?plaintext=true&attemptid=attempt_201206150102_0002_m_01_1&filter=stdout
12/06/15 01:13:18 WARN mapred.JobClient: Error reading task output Server returned HTTP response code: 403 for URL: http://192.168.1.106:50060/tasklog?plaintext=true&attemptid=attempt_201206150102_0002_m_01_1&filter=stderr
12/06/15 01:13:20 INFO mapred.JobClient: map 50% reduce 0%
12/06/15 01:13:23 INFO mapred.JobClient: map 100% reduce 0%
12/06/15 01:14:19 INFO mapred.JobClient: Task Id : attempt_201206150102_0002_m_00_2, Status : FAILED Too many fetch-failures
12/06/15 01:14:20 WARN mapred.JobClient: Error reading task output Server returned HTTP response code: 403 for URL: http://192.168.1.106:50060/tasklog?plaintext=true&attemptid=attempt_201206150102_0002_m_00_2&filter=stdout

Does anyone know the reason and how to resolve it?

Best Regards,
--
Welcome to my ET Blog http://www.jdxyw.com
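A quick way to sanity-check the hostname/IP mapping Raj mentions is a sketch like the following (run on each node; fetch failures often occur when a node's hostname resolves to the loopback address, so other nodes cannot reach its task tracker):

```python
import socket

# Resolve this machine's hostname and check what address it maps to.
hostname = socket.gethostname()
try:
    ip = socket.gethostbyname(hostname)
except socket.gaierror:
    ip = None
    print("warning: %s does not resolve at all" % hostname)

if ip and ip.startswith("127."):
    print("warning: %s resolves to loopback (%s); other nodes may "
          "fail to fetch map output from this host" % (hostname, ip))
elif ip:
    print("%s -> %s" % (hostname, ip))
```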
Re: Map works well, but Reduce failed
I did the following steps:

1. stop-all.sh
2. Deleted the tmp folder
3. Formatted the namenode
4. start-all.sh

The problem has gone, though I'm not sure what the root cause was.

Best Regards,

2012/6/16 Abhishek abhishek.dod...@gmail.com:
Hi Raj, I think you should increase the reducer worker threads to fetch the map output.

--
Welcome to my ET Blog http://www.jdxyw.com
Re: Setting number of mappers according to number of TextInput lines
No. The number of lines is not known at planning time; all you know is the size of the blocks. You want to look at mapred.max.split.size.

On Sat, Jun 16, 2012 at 5:31 AM, Ondřej Klimpera klimp...@fit.cvut.cz wrote:
I tried this approach, but the job is not distributed among 10 mapper nodes. Hadoop seems to ignore this property. My first thought is that the small file size is the problem and Hadoop doesn't split it properly.
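To see why a byte-based limit helps here: when mapred.max.split.size caps the split length, the number of map tasks for a file is roughly the file size divided by the maximum split size, rounded up. A plain-Python sketch (the sizes are illustrative, not taken from the thread):

```python
import math

def num_splits(file_size_bytes, max_split_size_bytes):
    """Approximate the split count when mapred.max.split.size
    caps the length of each input split."""
    return math.ceil(file_size_bytes / max_split_size_bytes)

# e.g. a 5 kB input with a 512-byte max split size -> 10 splits,
# hence roughly 10 map tasks even though the file is tiny.
print(num_splits(5120, 512))  # -> 10
```

Unlike NLineInputFormat, this splits on bytes rather than lines, so the per-task line counts are only approximately equal.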
Re: Setting number of mappers according to number of TextInput lines
How did you try it? I had no problem with NLineInputFormat. It just works exactly as expected. Shi
Re: Setting number of mappers according to number of TextInput lines
Ondřej,

While NLineInputFormat will indeed give you N lines per task, it does not guarantee that the N map tasks produced for a file will all be sent to different nodes. Which one is your need exactly: simply having N lines per map task, or N widely distributed maps?

On Sat, Jun 16, 2012 at 3:01 PM, Ondřej Klimpera klimp...@fit.cvut.cz wrote:
I tried this approach, but the job is not distributed among 10 mapper nodes. Hadoop seems to ignore this property. My first thought is that the small file size is the problem and Hadoop doesn't split it properly.

--
Harsh J