Awareness of Map tasks
Hi all, I have some queries about a map task's awareness of its input. From what I understand, every map task instance is assigned the data in a specific input split (which can span HDFS blocks).

1) Do these map tasks have a unique instance number? If yes, how are they mapped to their specific input splits, and using what parameters (for example, map task number to input file byte offset)? Where exactly is this mapping kept (at what level: jobtracker, tasktracker, or each task)?

2) Coming to a practical scenario: when I run Hadoop in local mode with a MapReduce job of 10 maps, and the node can afford to run 2 map task JVMs simultaneously, I assume some map tasks run concurrently. Since HDFS does not play a role in this case, how is the map-task-to-input-split mapping carried out? Is there a concept of an input split at all, or will all the maps start scanning from the start of the input file?

Please help me with these queries. Thanks, Matthew John
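For what it's worth, you can watch the mapping from inside a task. A minimal sketch with the old (mapred) API, assuming a FileInputFormat-based job (SplitAwareMapper is just an illustrative name):

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileSplit;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class SplitAwareMapper extends MapReduceBase
    implements Mapper<LongWritable, Text, Text, LongWritable> {

  private String attemptId;

  @Override
  public void configure(JobConf conf) {
    // Unique id of this map task attempt, e.g. attempt_201103301130_0011_m_000003_0
    attemptId = conf.get("mapred.task.id");
  }

  public void map(LongWritable key, Text value,
                  OutputCollector<Text, LongWritable> out, Reporter reporter)
      throws IOException {
    // For FileInputFormat jobs the split is a FileSplit: a file plus a byte offset and length.
    FileSplit split = (FileSplit) reporter.getInputSplit();
    System.err.println(attemptId + " -> " + split.getPath()
        + " offset=" + split.getStart() + " len=" + split.getLength());
    out.collect(value, key);
  }
}

As far as I know the same applies in local mode: the LocalJobRunner still computes splits with the job's InputFormat, it just reads them from the local file system instead of HDFS.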
Hadoop Pipes Error
Dear all, Today I faced a problem while running a map-reduce job in C++. I am not able to understand to find the reason of the below error : 11/03/30 12:09:02 INFO mapred.JobClient: Task Id : attempt_201103301130_0011_m_00_0, Status : FAILED java.io.IOException: pipe child exception at org.apache.hadoop.mapred.pipes.Application.abort(Application.java:151) at org.apache.hadoop.mapred.pipes.PipesMapRunner.run(PipesMapRunner.java:101) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307) at org.apache.hadoop.mapred.Child.main(Child.java:170) Caused by: java.io.EOFException at java.io.DataInputStream.readByte(DataInputStream.java:250) at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:298) at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:319) at org.apache.hadoop.mapred.pipes.BinaryProtocol$UplinkReaderThread.run(BinaryProtocol.java:114) attempt_201103301130_0011_m_00_0: Hadoop Pipes Exception: failed to open at wordcount-nopipe.cc:82 in WordCountReader::WordCountReader(HadoopPipes::MapContext) 11/03/30 12:09:02 INFO mapred.JobClient: Task Id : attempt_201103301130_0011_m_01_0, Status : FAILED java.io.IOException: pipe child exception at org.apache.hadoop.mapred.pipes.Application.abort(Application.java:151) at org.apache.hadoop.mapred.pipes.PipesMapRunner.run(PipesMapRunner.java:101) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307) at org.apache.hadoop.mapred.Child.main(Child.java:170) Caused by: java.io.EOFException at java.io.DataInputStream.readByte(DataInputStream.java:250) at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:298) at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:319) at org.apache.hadoop.mapred.pipes.BinaryProtocol$UplinkReaderThread.run(BinaryProtocol.java:114) attempt_201103301130_0011_m_01_0: Hadoop Pipes Exception: failed to open at wordcount-nopipe.cc:82 in WordCountReader::WordCountReader(HadoopPipes::MapContext) 11/03/30 12:09:02 INFO mapred.JobClient: Task Id : attempt_201103301130_0011_m_02_0, Status : FAILED java.io.IOException: pipe child exception at org.apache.hadoop.mapred.pipes.Application.abort(Application.java:151) at org.apache.hadoop.mapred.pipes.PipesMapRunner.run(PipesMapRunner.java:101) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307) at org.apache.hadoop.mapred.Child.main(Child.java:170) Caused by: java.io.EOFException at java.io.DataInputStream.readByte(DataInputStream.java:250) at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:298) at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:319) at org.apache.hadoop.mapred.pipes.BinaryProtocol$UplinkReaderThread.run(BinaryProtocol.java:114) attempt_201103301130_0011_m_02_1: Hadoop Pipes Exception: failed to open at wordcount-nopipe.cc:82 in WordCountReader::WordCountReader(HadoopPipes::MapContext) 11/03/30 12:09:15 INFO mapred.JobClient: Task Id : attempt_201103301130_0011_m_00_2, Status : FAILED java.io.IOException: pipe child exception at org.apache.hadoop.mapred.pipes.Application.abort(Application.java:151) at org.apache.hadoop.mapred.pipes.PipesMapRunner.run(PipesMapRunner.java:101) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:35 I tried to run *wordcount-nopipe.cc* program in */home/hadoop/project/hadoop-0.20.2/src/examples/pipes/impl* directory. 
make wordcount-nopipe
bin/hadoop fs -put wordcount-nopipe bin/wordcount-nopipe
bin/hadoop pipes -D hadoop.pipes.java.recordreader=true -D hadoop.pipes.java.recordwriter=true -input gutenberg -output gutenberg-out11 -program bin/wordcount-nopipe

or

bin/hadoop pipes -D hadoop.pipes.java.recordreader=false -D hadoop.pipes.java.recordwriter=false -input gutenberg -output gutenberg-out11 -program bin/wordcount-nopipe

but the error remains the same. I have attached my Makefile as well; please comment on it. I am able to run a simple wordcount.cpp program on the Hadoop cluster, but I don't know why this program fails with a broken pipe error.

Thanks, best regards, Adarsh Sharma

CC = g++
HADOOP_INSTALL = /home/hadoop/project/hadoop-0.20.2
PLATFORM = Linux-amd64-64
CPPFLAGS = -m64 -I/home/hadoop/project/hadoop-0.20.2/c++/Linux-amd64-64/include -I/usr/local/cuda/include

wordcount-nopipe : wordcount-nopipe.cc
	$(CC) $(CPPFLAGS) $< -Wall -L/home/hadoop/project/hadoop-0.20.2/c++/Linux-amd64-64/lib -L/usr/local/cuda/lib64 -lhadooppipes \
	-lhadooputils
how to set a different hadoop.tmp.dir for each machine?
Hey guys, I'm new here, and I'm currently configuring a cluster with 32 nodes. However, there are some problems, which I describe below. The cluster consists of nodes on which I don't have root, so I can't configure them as I wish. We only have the space /localhost_name/local to use; that is, we only have /machine_a/local, /machine_b/local, and so on. So I guessed that setting hadoop.tmp.dir=/${HOSTNAME}/local would work, but sadly it didn't... Almost all the tutorials online set hadoop.tmp.dir to a single path, which assumes the path is the same on each machine... but in my case it's not... I did some googling (e.g. "hadoop.tmp.dir different") but found no results... Can anybody help? I'd appreciate it; I've been working on this problem for more than 30 hours... -- Name: Ke Xie, Eddy Research Group of Information Retrieval, State Key Laboratory of Intelligent Technology and Systems, Tsinghua University
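One (untested) idea: Hadoop's Configuration expands ${...} references from Java system properties before falling back to other config properties, so you may be able to inject the hostname as a system property on every node and reference it from a single shared core-site.xml. A sketch only, assuming the property name host.name is otherwise unused and that the path layout below matches your /machine_x/local scheme:

# conf/hadoop-env.sh, on every node
export HADOOP_OPTS="$HADOOP_OPTS -Dhost.name=$(hostname)"

<!-- conf/core-site.xml, identical on every node -->
<property>
  <name>hadoop.tmp.dir</name>
  <value>/${host.name}/local/hadoop-tmp-data</value>
</property>

Otherwise, the plain approach in the replies below also works: keep a separate core-site.xml per node, each with its own literal path.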
Re: how to set a different hadoop.tmp.dir for each machine?
Ok, so if I understand correctly, you want to change the location of the datastore on individual computers. I've tested it on my cluster, and it seems to work. Just for the sake of troubleshooting, you didn't mention the following: 1) which computer were you editing the files on, and 2) which file were you editing?

** Here's my typical DataNode configuration:
Computer: DataNode
FileName: core-site.xml
Contents:
<name>hadoop.tmp.dir</name>
<value>/usr/local/hadoop/datastore/hadoop-${user.name}</value>
...

** Here's the configuration of another DataNode I modified to test what you were asking:
Computer: DataNode2
FileName: core-site.xml
Contents:
<name>hadoop.tmp.dir</name>
<value>/usr/local/hadoop/ANOTHERDATASTORE/hadoop-${user.name}</value>

** Then, I moved datastore to ANOTHERDATASTORE on DataNode1. I started my cluster back up, and it worked perfectly.
Re: how to set a different hadoop.tmp.dir for each machine?
Thank you modemide for your quick response. Sorry for not being clear... your understanding is right. I have one machine called grande and another called pseg. I'm using grande as the master (by putting grande in the masters file) and pseg as a slave.

The configuration of grande is (core-site.xml):

<property>
  <name>fs.default.name</name>
  <value>hdfs://grande:8500</value>
  <description>The name of the default file system. A URI whose scheme and authority determine the FileSystem implementation.</description>
</property>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/grande/local/xieke-cluster/hadoop-tmp-data/</value>
  <description>A base for other temporary directories.</description>
</property>

and the configuration of pseg is:

<property>
  <name>fs.default.name</name>
  <value>hdfs://grandonf/8500</value>
  <description>The name of the default file system. A URI whose scheme and authority determine the FileSystem implementation.</description>
</property>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/pseg/local/xieke-cluster/hadoop-tmp-data/</value>
  <description>A base for other temporary directories.</description>
</property>

Just the same as yours, I think? Then I ran ./bin/hadoop namenode -format to format the namenode, and ./bin/start-all.sh to start the machines. But now:

grande% ./bin/start-all.sh
starting namenode, logging to /grande/local/hadoop/bin/../logs/hadoop-kx19-namenode-grande.out
*pseg: /grande/local/hadoop/bin/..: No such file or directory.*
grande: starting datanode, logging to /grande/local/hadoop/bin/../logs/hadoop-kx19-datanode-grande.out
grande: starting secondarynamenode, logging to /grande/local/hadoop/bin/../logs/hadoop-kx19-secondarynamenode-grande.out
starting jobtracker, logging to /grande/local/hadoop/bin/../logs/hadoop-kx19-jobtracker-grande.out
pseg: /grande/local/hadoop/bin/..: No such file or directory.
grande: starting tasktracker, logging to /grande/local/hadoop/bin/../logs/hadoop-kx19-tasktracker-grande.out

Any ideas?
-- Name: Ke Xie, Eddy Research Group of Information Retrieval, State Key Laboratory of Intelligent Technology and Systems, Tsinghua University
Re: how to set a different hadoop.tmp.dir for each machine?
I'm a little confused as to why you're putting /pseg/local/... as the location. Are you sure that you've been given a folder name at the root of the drive called /pseg/ ? Maybe try to ssh to your server and navigate to your datastore folder, then do pwd. That should give you the working directory of the datastore. Use that as the value for the tmp datastore location. Sorry if that seems like a stupid suggestion. Just trying to get a handle on your actual problem. My Linux skillset is limited to the basics, so I'm troubleshooting by looking for the type of mistake that I would make. If the above is not the issue, then I'm not sure what the issue could be. But, I'd be glad to continue trying to help (with my limited knowledge) :-)
Re:Re: how to get each task's process
Harsh: I found that jvmManager.getPid(...) returned the pid of the MapTaskRunner, but I want to get the task's pid. For example, when I ran the randomwriter example, the pid of the task that was actually writing was 8268, but jvmManager.getPid(...) seemed to return its parent pid. I cannot figure out the relationship between the task runner and the real task. Thanks, zhutao

At 2011-03-30 01:17:30, Harsh J qwertyman...@gmail.com wrote: I think jvmManager.getPid(...) is correct. It should give you the launched JVM's PID properly. In fact, the same is even used to kill the JVMs. 2011/3/29 朱韬 ryanzhu...@163.com: hi guys: I want to monitor each task's IO usage; how can I get each task's process id? I used jvmManager.getPid(tip.getTaskRunner()); but it didn't seem to be the task's process. Thanks, zhutao -- Harsh J http://harshj.com
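If getPid(...) really hands back the parent (for example the shell that launched the child JVM), one workaround is to walk down to its children before sampling IO. A rough sketch, assuming a Linux tasktracker host with pgrep installed (ChildPids is just an illustrative name, not Hadoop API):

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

public class ChildPids {
  /** Returns the pids of the direct children of the given pid (Linux only, needs pgrep). */
  public static List<String> childrenOf(String pid) throws IOException {
    List<String> pids = new ArrayList<String>();
    Process p = new ProcessBuilder("pgrep", "-P", pid).start();
    BufferedReader r = new BufferedReader(new InputStreamReader(p.getInputStream()));
    String line;
    while ((line = r.readLine()) != null) {
      pids.add(line.trim());
    }
    r.close();
    return pids;
  }

  public static void main(String[] args) throws IOException {
    // args[0] would be the value returned by jvmManager.getPid(...)
    System.out.println(childrenOf(args[0]));
  }
}

If the list comes back empty, the pid you passed in is probably already the task JVM itself.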
Map Tasks re-executing
Hello, My map tasks are freezing after 100%. I suspect my mapper.close() function, which does some sorting. Any better suggestion of where I should put my sorting method? I chose mapper.close() so that each map task sorts its own output (which is local) and is hence faster. The output is the following:

11/03/30 08:13:54 INFO mapred.JobClient: map 95% reduce 0%
11/03/30 08:14:09 INFO mapred.JobClient: map 96% reduce 0%
11/03/30 08:14:27 INFO mapred.JobClient: map 97% reduce 0%
11/03/30 08:14:42 INFO mapred.JobClient: map 98% reduce 0%
11/03/30 08:15:06 INFO mapred.JobClient: map 99% reduce 0%
11/03/30 08:15:45 INFO mapred.JobClient: map 100% reduce 0%
11/03/30 08:25:41 INFO mapred.JobClient: map 50% reduce 0%
11/03/30 08:25:49 INFO mapred.JobClient: Task Id : attempt_201103291035_0016_m_01_0, Status : FAILED
Task attempt_201103291035_0016_m_01_0 failed to report status for 600 seconds. Killing!
11/03/30 08:25:50 INFO mapred.JobClient: map 0% reduce 0%
11/03/30 08:25:52 INFO mapred.JobClient: Task Id : attempt_201103291035_0016_m_00_0, Status : FAILED
Task attempt_201103291035_0016_m_00_0 failed to report status for 600 seconds. Killing!
11/03/30 08:26:29 INFO mapred.JobClient: map 1% reduce 0%
11/03/30 08:26:53 INFO mapred.JobClient: map 2% reduce 0%
11/03/30 08:27:05 INFO mapred.JobClient: map 3% reduce 0%
11/03/30 08:27:29 INFO mapred.JobClient: map 4% reduce 0%
11/03/30 08:27:41 INFO mapred.JobClient: map 5% reduce 0%
...

Thank you for any thought, Maha
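The "failed to report status for 600 seconds. Killing!" lines mean the tasks stopped sending progress while close() was running, so the framework killed them and scheduled fresh attempts, which is why the map percentage drops back down. A blunt stopgap is to raise the timeout in the job driver; the property in 0.20 is mapred.task.timeout, in milliseconds. A sketch only (the better fix is to keep reporting progress from close(), as in the follow-up further down):

import org.apache.hadoop.mapred.JobConf;

public class RaiseTaskTimeout {
  public static void main(String[] args) {
    JobConf conf = new JobConf();
    // Default is 600000 ms (10 minutes); allow 30 minutes for the sort in close().
    conf.setLong("mapred.task.timeout", 30 * 60 * 1000L);
    System.out.println("mapred.task.timeout = " + conf.get("mapred.task.timeout"));
  }
}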
Re: live/dead node problem
I haven't used 0.21. You can compare the source code of the two versions. I set these in the namenode's hdfs-site.xml to 1. I'm not sure you'd want to do that on a production cluster if it's a big one.

On 3/29/11 7:13 PM, Rita rmorgan...@gmail.com wrote: what about for 0.21? Also, where do you set this? In the datanode configuration or the namenode? It seems the default is set to 3 seconds.

On Tue, Mar 29, 2011 at 5:37 PM, Ravi Prakash ravip...@yahoo-inc.com wrote: I set these parameters for quickly discovering live / dead nodes. For 0.20: heartbeat.recheck.interval. For 0.22: dfs.namenode.heartbeat.recheck-interval and dfs.heartbeat.interval. Cheers, Ravi

On 3/29/11 10:24 AM, Michael Segel michael_se...@hotmail.com wrote: Rita, When the NameNode doesn't see a heartbeat for 10 minutes, it then recognizes that the node is down. Per the Hadoop online documentation: Each DataNode sends a Heartbeat message to the NameNode periodically. A network partition can cause a subset of DataNodes to lose connectivity with the NameNode. The NameNode detects this condition by the absence of a Heartbeat message. The NameNode marks DataNodes without recent Heartbeats as dead and does not forward any new IO requests to them. Any data that was registered to a dead DataNode is not available to HDFS any more. DataNode death may cause the replication factor of some blocks to fall below their specified value. The NameNode constantly tracks which blocks need to be replicated and initiates replication whenever necessary. The necessity for re-replication may arise due to many reasons: a DataNode may become unavailable, a replica may become corrupted, a hard disk on a DataNode may fail, or the replication factor of a file may be increased. I was trying to find out if there's an hdfs-site parameter that could be set to decrease this time period, but wasn't successful. HTH -Mike

Date: Tue, 29 Mar 2011 08:13:43 -0400 Subject: live/dead node problem From: rmorgan...@gmail.com To: common-user@hadoop.apache.org Hello All, Is there a parameter or procedure to check more aggressively for a live/dead node? Despite me killing the hadoop process, I see the node active for more than 10+ minutes in the Live Nodes page. Fortunately, the last contact increments. Using branch-0.21, 0985326 -- --- Get your facts first, then you can distort them as you please.--
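For what it's worth, and from memory of the 0.20 code (so please double-check against your version): the namenode declares a datanode dead after roughly 2 x heartbeat.recheck.interval + 10 x dfs.heartbeat.interval. With the defaults (recheck interval 300000 ms, i.e. 5 minutes, and a 3 second heartbeat) that works out to 2 x 300 s + 10 x 3 s = 630 s, about 10.5 minutes, which matches the "10+ minutes" Rita is seeing. Lowering heartbeat.recheck.interval in the namenode's configuration to, say, 15000 ms would bring the window down to 2 x 15 s + 10 x 3 s = 60 s. Note that the recheck interval is in milliseconds while the heartbeat interval is in seconds, which is easy to trip over.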
NameNode web interface error in 0.21.0
Hi, When I click the "Browse the filesystem" link, I am redirected to http://localhost.localdomain:50075/browseDirectory.jsp?namenodeInfoPort=50070&dir=/, which is an invalid URL. I think it is related to the domain name of my server. I am setting up a pseudo-distributed cluster environment. Regards, Xiaobo Gu
Re: Map Tasks re-executing
It's not the sorting, since the sorted files are produced in the output; it's the mapper not exiting cleanly. So can anyone tell me if it's wrong to write the mapper's close() function like this?

@Override
public void close() throws IOException {
  helper.CleanUp();
  writer.close();
  // SORT PRODUCED OUTPUT
  try {
    Sorter SeqSort = new Sorter(hdfs, DocDocWritable.class, IntWritable.class, new Configuration());
    SeqSort.sort(tempSeq, new Path("sorted/S" + TaskID.getName()));
  } catch (Exception e) {
    e.printStackTrace();
  }
  return;
}

And by the way, when it's executed it's not part of the cleanup phase that is shown in the UI... which I think it is supposed to be, right?

Thank you, Maha
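A sketch of the usual workaround for long-running work in close() with the old API: keep a handle on the Reporter passed to map() and ping it from the slow section so the framework keeps seeing progress (SortingMapper and doOneSortChunk are placeholders, not your actual code):

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class SortingMapper extends MapReduceBase
    implements Mapper<LongWritable, Text, Text, LongWritable> {

  private Reporter lastReporter;   // remembered so close() can still report progress

  public void map(LongWritable key, Text value,
                  OutputCollector<Text, LongWritable> out, Reporter reporter)
      throws IOException {
    lastReporter = reporter;
    out.collect(value, key);
  }

  @Override
  public void close() throws IOException {
    // Do the expensive sort in slices and ping the framework between slices,
    // so the task is not killed for "failed to report status".
    for (int chunk = 0; chunk < 100; chunk++) {
      // doOneSortChunk(chunk);   // placeholder for a slice of the real sorting work
      if (lastReporter != null) {
        lastReporter.progress();                        // resets the task timeout
        lastReporter.setStatus("sorting chunk " + chunk);
      }
    }
  }
}

If the sort cannot easily be sliced, an alternative is a small background thread that calls reporter.progress() every minute or so while close() runs.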
Re: namenode wont start
Thanks for that tidbit, it appears to be the problem... Maybe that's a well-known issue? Or perhaps it should be added to the setup wiki? -Bill

On 03/29/2011 09:47 PM, Harsh J wrote: On Wed, Mar 30, 2011 at 3:59 AM, Bill Brune bbr...@decarta.com wrote: Hi, I've been running hadoop 0.20.2 for a while now on 2 different clusters that I set up. Now on this new cluster I can't get the namenode to stay up. It exits with an IOException "incomplete hdfs uri" and prints the uri: hdfs://rmsi_combined.rmsi.com:54310 - which looks complete to me. Underscores are generally not valid in hostnames.
Re: namenode wont start
On Thu, Mar 31, 2011 at 12:59 AM, Bill Brune bbr...@decarta.com wrote: Thanks for that tidbit, it appears to be the problem... Maybe that's a well known issue? or perhaps it should be added to the setup WIKI ??? It isn't really a Hadoop issue. See here for what defines a valid hostname (The behavior of '_' is undefined, and was not part of the actual RFC spec): http://www.zytrax.com/books/dns/apa/names.html -- Harsh J http://harshj.com
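To see why the message says "incomplete hdfs uri": java.net.URI will not accept a hostname containing '_' as a server-based authority, so getHost() returns null and Hadoop concludes the URI has no host. A small check you can run yourself (the hostnames are just the one from Bill's mail and a hyphenated variant):

import java.net.URI;

public class UriHostCheck {
  public static void main(String[] args) throws Exception {
    URI bad = new URI("hdfs://rmsi_combined.rmsi.com:54310");
    URI ok  = new URI("hdfs://rmsi-combined.rmsi.com:54310");
    System.out.println("underscore host: " + bad.getHost());  // prints: null
    System.out.println("hyphen host:     " + ok.getHost());   // prints: rmsi-combined.rmsi.com
  }
}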
JVM reuse and log files
It seems like when JVM reuse is enabled map task log data is not getting written to their corresponding log files; log data from certain map tasks gets appended to log files corresponding to some other map task. For example, I have a case here where 8 map JVMs are running simultaneously and all syslog data from map task 9, 17 and 25 gets appended in to log file for map task 0. Whereas no syslog file gets generated in attempt_*m_09_0/ , attempt_*m_17_0/ and attempt_*m_25_0/ folders. This job creates 32 map tasks. This behavior might also be applicable to reduce log files, however, in our case total # of reduce tasks is not more than max reduce JVMs running at the same time and hence it might not be manifesting. BTW, this is on Apache distro 0.21.0. -Shrinivas
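For context (and as a hedge: please double-check the property name against 0.21), JVM reuse is what triggers this behavior; it is off unless mapred.job.reuse.jvm.num.tasks is set to something other than 1, for example:

import org.apache.hadoop.mapred.JobConf;

public class EnableJvmReuse {
  public static void main(String[] args) {
    JobConf conf = new JobConf();
    // -1 means an unlimited number of tasks per JVM (same key: mapred.job.reuse.jvm.num.tasks)
    conf.setNumTasksToExecutePerJvm(-1);
    System.out.println("tasks per JVM: " + conf.getNumTasksToExecutePerJvm());
  }
}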
Re: JVM reuse and log files
Hi Shrinivas, Yes, this is the behavior of the task logs when using JVM Reuse. You should notice in the log directories for the other tasks a log index file which specifies the byte offsets into the log files where the task starts and stops. When viewing logs through the web UI, it will use these index files to show you the right portion of the logs. -Todd

-- Todd Lipcon, Software Engineer, Cloudera
How to apply Patch
Dear all, Can someone please tell me how to apply a patch to the hadoop-0.20.2 package? I attached the patch; please find the attachment. I just followed the steps below for Hadoop:
1. Download hadoop-0.20.2.tar.gz
2. Extract the file.
3. Set configurations in the site.xml files.
Thanks, best regards, Adarsh Sharma
Re: How to apply Patch
Sorry, just check the attachment now. Adarsh Sharma

Index: src/test/org/apache/hadoop/mapred/pipes/TestPipes.java
===================================================================
--- src/test/org/apache/hadoop/mapred/pipes/TestPipes.java (revision 565616)
+++ src/test/org/apache/hadoop/mapred/pipes/TestPipes.java (working copy)
@@ -150,7 +150,8 @@
     JobConf job = mr.createJobConf();
     job.setInputFormat(WordCountInputFormat.class);
     FileSystem local = FileSystem.getLocal(job);
-    Path testDir = new Path(System.getProperty("test.build.data"), "pipes");
+    Path testDir = new Path("file:" + System.getProperty("test.build.data"),
+                            "pipes");
     Path inDir = new Path(testDir, "input");
     Path outDir = new Path(testDir, "output");
     Path wordExec = new Path("/testing/bin/application");
Index: src/test/org/apache/hadoop/mapred/pipes/WordCountInputFormat.java
===================================================================
--- src/test/org/apache/hadoop/mapred/pipes/WordCountInputFormat.java (revision 565616)
+++ src/test/org/apache/hadoop/mapred/pipes/WordCountInputFormat.java (working copy)
@@ -35,7 +35,7 @@
     private String filename;
     WordCountInputSplit() { }
     WordCountInputSplit(Path filename) {
-      this.filename = filename.toString();
+      this.filename = filename.toUri().getPath();
     }
     public void write(DataOutput out) throws IOException {
       Text.writeString(out, filename);
Index: src/examples/pipes/impl/wordcount-nopipe.cc
===================================================================
--- src/examples/pipes/impl/wordcount-nopipe.cc (revision 565616)
+++ src/examples/pipes/impl/wordcount-nopipe.cc (working copy)
@@ -87,9 +87,15 @@
     const HadoopPipes::JobConf* job = context.getJobConf();
     int part = job->getInt("mapred.task.partition");
     std::string outDir = job->get("mapred.output.dir");
+    // remove the file: schema substring
+    std::string::size_type posn = outDir.find(":");
+    HADOOP_ASSERT(posn != std::string::npos,
+                  "no schema found in output dir: " + outDir);
+    outDir.erase(0, posn+1);
     mkdir(outDir.c_str(), 0777);
     std::string outFile = outDir + "/part-" + HadoopUtils::toString(part);
     file = fopen(outFile.c_str(), "wt");
+    HADOOP_ASSERT(file != NULL, "can't open file for writing: " + outFile);
   }
   ~WordCountWriter() {
Re: How to apply Patch
There is a utility available for Unix called 'patch'. You can use that with a suitable -p(num) argument (man patch, for more info).

-- Harsh J http://harshj.com
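To make that concrete for the attached diff: its paths are relative to the source root (Index: src/test/...), so -p0 is the right strip level. The file name fix.patch below is just a placeholder for whatever name you saved the attachment under, and this is simply the standard 'patch' workflow Harsh describes, nothing Hadoop-specific:

cd /home/hadoop/project/hadoop-0.20.2
patch -p0 --dry-run < fix.patch    # check that every hunk applies cleanly first
patch -p0 < fix.patch
ant                                # rebuild; plain ant recompiles the Java sources, the C++ pipes examples are built separately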
Re: Hadoop Pipes Error
Any update on the error I reported earlier (Hadoop Pipes Exception: failed to open at wordcount-nopipe.cc:82 in WordCountReader::WordCountReader)? Please guide. Thanks, best regards, Adarsh Sharma