Re: Frustrated with Cluster Setup: Reduce Tasks Stop at 16% - could not find taskTracker/jobcache...
Hi Scott,

Your reducer classes are unable to read the map outputs. Check the mapred.local.dir property in your conf/hadoop-default.xml and conf/hadoop-site.xml. These directories must be valid directories on your slaves. You can give multiple directories as comma-separated values.

- Prasad.

On Thursday 30 October 2008 11:32:02 pm Scott Whitecross wrote:

So it's not just at 16%, but depends on the task:

2008-10-30 13:58:29,702 INFO org.apache.hadoop.mapred.TaskTracker: attempt_200810301345_0001_r_00_0 0.25675678% reduce > copy (57 of 74 at 13.58 MB/s)
2008-10-30 13:58:29,357 WARN org.apache.hadoop.mapred.TaskTracker: getMapOutput(attempt_200810301345_0001_m_48_0,0) failed :
org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find taskTracker/jobcache/job_200810301345_0001/attempt_200810301345_0001_m_48_0/output/file.out.index in any of the configured local directories
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:359)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:138)
    at org.apache.hadoop.mapred.TaskTracker$MapOutputServlet.doGet(TaskTracker.java:2402)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:689)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)

I'm out of thoughts on what the problem could be.

On Oct 30, 2008, at 12:35 PM, Scott Whitecross wrote:

I'm growing very frustrated with a simple cluster setup. I can get the cluster running on two machines, but have trouble when trying to extend the installation to 3 or more boxes. I keep seeing the errors below. It seems the reduce tasks can't get access to the data, and I can't figure out how to fix this error. What amazes me is that the file-not-found issues appear on the master box as well as the slaves. What causes the reduce tasks to fail to find information via localhost?
Setup/Errors:

My basic setup comes from Michael Noll's tutorial: http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_(Multi-Node_Cluster)

I've put the following in my /etc/hosts file:

127.0.0.1 localhost
10.1.1.12 master
10.1.1.10 slave
10.1.1.13 slave1

And I have set up passwordless ssh to all boxes (and it works). All boxes can see each other, etc. My base-level hadoop-site.xml is:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/hadoop-datastore</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:54310</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>master:54311</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>

Errors:

WARN org.apache.hadoop.mapred.TaskTracker: getMapOutput(attempt_200810301206_0004_m_01_0,0) failed :
org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find taskTracker/jobcache/job_200810301206_0004/attempt_200810301206_0004_m_01_0/output/file.out.index in any of the configured local directories
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:359)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:138)
...

and in the userlog of the attempt:

2008-10-30 12:28:00,806 WARN org.apache.hadoop.mapred.ReduceTask: java.io.FileNotFoundException: http://localhost:50060/mapOutput?job=job_200810301206_0004&map=attempt_200810301206_0004_m_01_0&reduce=0
    at sun.reflect.GeneratedConstructorAccessor3.newInstance(Unknown Source)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
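For reference, a minimal fragment for the mapred.local.dir suggestion above, following the /opt/hadoop-datastore layout used in this thread (the exact subdirectory names here are placeholders of mine; any valid, writable local paths on each slave will do):

```xml
<property>
  <name>mapred.local.dir</name>
  <!-- comma-separated list of local directories; every path must
       exist and be writable by the tasktracker user on each slave -->
  <value>/opt/hadoop-datastore/mapred/local1,/opt/hadoop-datastore/mapred/local2</value>
</property>
```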
Re: Maps running after reducers complete successfully?
thanks Owen. So this may be an enhancement? - Prasad.

On Thursday 02 October 2008 09:58:03 pm Owen O'Malley wrote:

It isn't optimal, but it is the expected behavior. In general, when we lose a TaskTracker we want the map outputs regenerated so that they are available to any reduces that need to re-run (including speculative execution). We could handle it as a special case if:

1. We didn't lose any running reduces.
2. All of the reduces (including speculative tasks) are done with shuffling.
3. We don't plan on launching any more speculative reduces.

If all 3 hold, we don't need to re-run the map tasks. Actually doing so would be a pretty involved patch to the JobTracker/Schedulers.

-- Owen
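Restating Owen's three conditions as a quick sketch (the function and parameter names are mine, for illustration only; nothing here is actual JobTracker code):

```python
def must_regenerate_map_outputs(lost_running_reduce: bool,
                                all_reduces_done_shuffling: bool,
                                more_speculative_reduces_planned: bool) -> bool:
    """Re-running the dead TaskTracker's completed maps can be skipped
    only when all three of Owen's conditions hold."""
    safe_to_skip = (not lost_running_reduce
                    and all_reduces_done_shuffling
                    and not more_speculative_reduces_planned)
    return not safe_to_skip

# Prasad's scenario: the dead node carried no running reduce and the one
# remaining reduce had already finished shuffling, so in principle the
# maps need not have been re-run -- hence it reads as an enhancement.
print(must_regenerate_map_outputs(False, True, False))  # -> False
```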
Maps running after reducers complete successfully?
Hello,

The following is the scenario which caused this weird behavior. Is this expected?

- All map tasks first completed successfully.
- Then all the reducers except 1 completed.
- While the last reduce task was running, one of the tasktrackers died.
- This caused all the map tasks executed on that node to be marked failed.
- The jobtracker re-assigned these map tasks to other nodes, and the map status is "running".
- The last reduce task finished execution, so I have reduce 100% but maps running.

Is it correct to have maps running after the reducers are completed?

- Prasad Pingali.
IIIT Hyderabad.
AlreadyBeingCreatedException in reduce
I tried to run a Nutch fetch job and got this exception during the reduce phase using 0.18.1-dev. I do not get the exception for the same job on 0.16.4. Any pointers on where I am going wrong?

org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to create file /user/jobs/crawl/segments/20080917193538/crawl_fetch/part-1/index for DFSClient_attempt_200809161607_0012_r_01_1 on client 67.215.230.24 because current leaseholder is trying to recreate file.
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1047)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:990)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.create(NameNode.java:298)
    at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:452)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:888)

    at org.apache.hadoop.ipc.Client.call(Client.java:707)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
    at $Proxy1.create(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
    at $Proxy1.create(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.<init>(DFSClient.java:2432)
    at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:453)
    at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:170)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:485)
    at org.apache.hadoop.io.SequenceFile$BlockCompressWriter.<init>(SequenceFile.java:1198)
    at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:401)
    at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:306)
    at org.apache.hadoop.io.MapFile$Writer.<init>(MapFile.java:160)
    at org.apache.hadoop.io.MapFile$Writer.<init>(MapFile.java:134)
    at org.apache.hadoop.io.MapFile$Writer.<init>(MapFile.java:92)
    at org.apache.nutch.fetcher.FetcherOutputFormat.getRecordWriter(FetcherOutputFormat.java:66)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:371)
    at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2359)

thanks and regards,
Prasad Pingali.
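For what it's worth, this lease conflict can show up when a second (e.g. speculative) attempt of the same reduce tries to create the same output file while an earlier attempt still holds the HDFS lease. As an experiment -- this is a guess on my part, not a confirmed diagnosis for this job -- you could turn off speculative reduces and see if the error goes away:

```xml
<property>
  <name>mapred.reduce.tasks.speculative.execution</name>
  <value>false</value>
</property>
```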
Re: scp to namenode faster than dfs put?
thanks for the replies. So it looks like replication might be the real overhead compared to scp: dfs put copies multiple replicas, unlike scp.

Lohit

On Sep 17, 2008, at 6:03 AM, å¶åæ [EMAIL PROTECTED] wrote:

Actually, no. As you said, I understand that dfs -put breaks the data into blocks and then copies them to datanodes, but scp does not break the data into blocks; it just copies the data to the namenode!

2008/9/17, Prasad Pingali [EMAIL PROTECTED]:

Hello, I observe that scp of data to the namenode is faster than actually putting it into dfs (all nodes come off the same switch and have the same ethernet cards; homogeneous nodes)? I understand that dfs -put breaks the data into blocks and then copies them to datanodes, but shouldn't that be at least as fast as copying data to the namenode from a single machine, if not faster?

thanks and regards,
Prasad Pingali, IIIT Hyderabad.

--
Sorry for my english!! Please help me to correct my english expression and errors in syntax.
Re: scp to namenode faster than dfs put?
The time seemed to be around double the time taken to scp. I didn't realize it could be due to replication.

Regarding dfs being faster than scp, the statement came more out of expectation (or a wish list) than anything else. Since scp is the most elementary way of copying files, I was wondering whether the network topology of the cluster could be exploited in any way. The only intuition I had was that there may be approaches faster than scp if concepts from P2P file sharing were used here. Though I haven't fully explored P2P, I thought there might be new developments in that area which could be useful here. After Napster's centralized way of copying, I think there were quite a few improvements? Just thinking aloud.

- Prasad.

How much slower is 'dfs -put' anyway? How large is the file you are copying?

but shouldn't that be at least as fast as copying data to namenode from a single machine,

It would be at most as fast as scp, assuming you are not cpu bound. Why would you think dfs would be faster even if it is copying to a single replica?

Raghu.

Dennis Kubes wrote:

While an scp will copy data to the namenode machine, it does *not* store the data in DFS; it simply copies the data to the namenode machine. This is the same as copying data to any other machine. The data isn't in DFS and is not accessible from DFS. If the box running the namenode fails, you lose your data. The reason put is slower is that the data is actually being stored into the DFS on multiple machines in block format. It is then accessible from programs accessing the DFS, such as MR jobs.

Dennis

Prasad Pingali wrote:

Hello, I observe that scp of data to the namenode is faster than actually putting it into dfs (all nodes come off the same switch and have the same ethernet cards; homogeneous nodes)? I understand that dfs -put breaks the data into blocks and then copies them to datanodes, but shouldn't that be at least as fast as copying data to the namenode from a single machine, if not faster? thanks and regards, Prasad Pingali, IIIT Hyderabad.
Re: scp to namenode faster than dfs put?
yes, the client was a namenode and also a datanode. thanks Raghu, I will try not running a datanode there. - Prasad.

On Thursday 18 September 2008 12:00:30 am Raghu Angadi wrote:

pvvpr wrote:
The time seemed to be around double the time taken to scp. Didn't realize it could be due to replication.

Twice as slow is not expected. One possibility is that your client is also one of the datanodes (i.e. you are reading from and writing to the same disk).

Raghu.

Regarding dfs being faster than scp, the statement came more out of expectation (or a wish list) than anything else. Since scp is the most elementary way of copying files, I was wondering whether the network topology of the cluster could be exploited in any way. The only intuition I had was that there may be approaches faster than scp if concepts from P2P file sharing were used. Though I haven't fully explored P2P, I thought there might be new developments in that area which could be useful here. After Napster's centralized way of copying, I think there were quite a few improvements? Just thinking aloud. - Prasad.

How much slower is 'dfs -put' anyway? How large is the file you are copying?

but shouldn't that be at least as fast as copying data to namenode from a single machine,

It would be at most as fast as scp, assuming you are not cpu bound. Why would you think dfs would be faster even if it is copying to a single replica?

Raghu.

Dennis Kubes wrote:

While an scp will copy data to the namenode machine, it does *not* store the data in DFS; it simply copies the data to the namenode machine. This is the same as copying data to any other machine. The data isn't in DFS and is not accessible from DFS. If the box running the namenode fails, you lose your data. The reason put is slower is that the data is actually being stored into the DFS on multiple machines in block format. It is then accessible from programs accessing the DFS, such as MR jobs.

Dennis

Prasad Pingali wrote:

Hello, I observe that scp of data to the namenode is faster than actually putting it into dfs (all nodes come off the same switch and have the same ethernet cards; homogeneous nodes)? I understand that dfs -put breaks the data into blocks and then copies them to datanodes, but shouldn't that be at least as fast as copying data to the namenode from a single machine, if not faster? thanks and regards, Prasad Pingali, IIIT Hyderabad.
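A back-of-envelope sketch of Raghu's point (the model is my own assumption, not a measurement): when the client box also runs a datanode, `dfs -put` reads the source file from the local disk *and* writes the first replica of every block back to that same disk, so the disk does roughly twice the I/O of a plain scp, which only reads the file once before shipping it out.

```python
def local_disk_io_bytes(file_bytes: int, client_is_datanode: bool):
    """Rough model of bytes moved through the client's local disk."""
    scp_io = file_bytes  # scp just reads the file and sends it over the wire
    # dfs -put also reads the file; if a datanode runs on the same box,
    # the first replica of each block is written back to the same disk.
    put_io = file_bytes + (file_bytes if client_is_datanode else 0)
    return scp_io, put_io

scp_io, put_io = local_disk_io_bytes(1 * 2**30, client_is_datanode=True)
print(put_io / scp_io)  # -> 2.0, consistent with the "around double" report
```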
reduce task progress above 100%?
Hello,

A strange thing happened in my job. In the reduce phase, one task's status showed 101.44% complete, ran up to about 102%, and on successful completion dropped back to 100%. Is this the right behavior? Below is a quick copy-paste of the web GUI reporting completion of the tasks (sorry for the bad formatting).

thanks,
Prasad Pingali, IIIT Hyderabad.

Task                            Complete  Status           Start Time            Finish Time  Errors  Counters
task_200809151235_0026_r_00      90.91%   reduce > reduce  15-Sep-2008 18:40:06                       0
task_200809151235_0026_r_04      86.13%   reduce > reduce  15-Sep-2008 18:40:11                       0
task_200809151235_0026_r_08      85.08%   reduce > reduce  15-Sep-2008 18:40:16                       0
task_200809151235_0026_r_12      82.57%   reduce > reduce  15-Sep-2008 18:40:21                       0
task_200809151235_0026_r_14      97.90%   reduce > reduce  15-Sep-2008 18:40:22                       0
task_200809151235_0026_r_17     101.44%   reduce > reduce  15-Sep-2008 23:34:04                       0
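My guess at the mechanism (an assumption, not something confirmed from the Hadoop source): the reduce-phase progress is reported as a ratio of bytes processed to an *estimated* total, so when the estimate undershoots, the displayed figure passes 100% until the task finishes and is clamped back.

```python
def reported_progress(bytes_processed: int, estimated_total: int) -> float:
    """Progress as a percentage of an estimate; exceeds 100 whenever the
    estimate is lower than the real total."""
    return 100.0 * bytes_processed / estimated_total

# hypothetical byte counts, chosen only to mimic the 101.44% in the GUI
print(round(reported_progress(1_044_000, 1_029_180), 2))  # -> 101.44
```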
Reduce task failed: org.apache.hadoop.fs.FSError: java.io.IOException
Hello,

Never came across this error before. I upgraded to 0.18.0 this morning and ran a nutch fetch job. I got this exception in both reduce attempts of one task, and they failed. All the other reducers seemed to work fine, except for this one task. Any ideas what the problem could be?

- Prasad Pingali.
IIIT, Hyderabad.

2008-09-11 06:31:19,837 INFO org.apache.hadoop.mapred.ReduceTask: attempt_200809101353_0021_r_04_0: Got 1 new map-outputs; number of known map outputs is 1
2008-09-11 06:31:19,837 INFO org.apache.hadoop.mapred.ReduceTask: attempt_200809101353_0021_r_04_0 Scheduled 1 of 1 known outputs (0 slow hosts and 0 dup hosts)
2008-09-11 06:31:20,133 INFO org.apache.hadoop.mapred.ReduceTask: Shuffling 8230686 bytes (8230686 raw bytes) into RAM from attempt_200809101353_0021_m_95_0
2008-09-11 06:31:22,332 INFO org.apache.hadoop.mapred.ReduceTask: Read 8230686 bytes from map-output for attempt_200809101353_0021_m_95_0
2008-09-11 06:31:22,333 INFO org.apache.hadoop.mapred.ReduceTask: Rec #1 from attempt_200809101353_0021_m_95_0 -> (33, 134) from machine10
2008-09-11 06:31:28,837 INFO org.apache.hadoop.mapred.ReduceTask: attempt_200809101353_0021_r_04_0: Got 1 new map-outputs; number of known map outputs is 1
2008-09-11 06:31:28,838 INFO org.apache.hadoop.mapred.ReduceTask: attempt_200809101353_0021_r_04_0 Scheduled 1 of 1 known outputs (0 slow hosts and 0 dup hosts)
2008-09-11 06:31:29,585 INFO org.apache.hadoop.mapred.ReduceTask: Shuffling 21454877 bytes (21454877 raw bytes) into Local-FS from attempt_200809101353_0021_m_74_0
2008-09-11 06:31:37,831 INFO org.apache.hadoop.mapred.ReduceTask: Read 21454877 bytes from map-output for attempt_200809101353_0021_m_74_0
2008-09-11 06:31:37,832 INFO org.apache.hadoop.mapred.ReduceTask: attempt_200809101353_0021_r_04_0 We have 19 map outputs on disk. Triggering merge of 10 files
2008-09-11 06:31:38,033 INFO org.apache.hadoop.mapred.Merger: Merging 10 sorted segments
2008-09-11 06:31:43,838 INFO org.apache.hadoop.mapred.ReduceTask: attempt_200809101353_0021_r_04_0 Need another 9 map output(s) where 0 is already in progress
2008-09-11 06:31:43,838 INFO org.apache.hadoop.mapred.ReduceTask: attempt_200809101353_0021_r_04_0: Got 0 new map-outputs; number of known map outputs is 0
2008-09-11 06:31:43,838 INFO org.apache.hadoop.mapred.ReduceTask: attempt_200809101353_0021_r_04_0 Scheduled 0 of 0 known outputs (0 slow hosts and 0 dup hosts)
2008-09-11 06:32:03,095 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 10 segments left of total size: 262561859 bytes
2008-09-11 06:32:51,044 WARN org.apache.hadoop.mapred.ReduceTask: attempt_200809101353_0021_r_04_0 Merging of the local FS files threw an exception:
org.apache.hadoop.fs.FSError: java.io.IOException: Input/output error
    at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileInputStream.read(RawLocalFileSystem.java:149)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
    at java.io.DataInputStream.read(DataInputStream.java:132)
    at org.apache.hadoop.fs.FSInputChecker.readFully(FSInputChecker.java:380)
    at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.readChunk(ChecksumFileSystem.java:208)
    at org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:236)
    at org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:191)
    at org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:159)
    at java.io.DataInputStream.read(DataInputStream.java:132)
    at org.apache.hadoop.mapred.IFile$Reader.readData(IFile.java:263)
    at org.apache.hadoop.mapred.IFile$Reader.rejigData(IFile.java:293)
    at org.apache.hadoop.mapred.IFile$Reader.readNextBlock(IFile.java:277)
    at org.apache.hadoop.mapred.IFile$Reader.next(IFile.java:339)
    at org.apache.hadoop.mapred.Merger$Segment.next(Merger.java:134)
    at org.apache.hadoop.mapred.Merger$MergeQueue.adjustPriorityQueue(Merger.java:225)
    at org.apache.hadoop.mapred.Merger$MergeQueue.next(Merger.java:242)
    at org.apache.hadoop.mapred.Merger.writeFile(Merger.java:83)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$LocalFSMerger.run(ReduceTask.java:2021)
Caused by: java.io.IOException: Input/output error
    at java.io.FileInputStream.readBytes(Native Method)
    at java.io.FileInputStream.read(FileInputStream.java:199)
    at org.apache.hadoop.fs.RawLocalFileSystem$TrackingFileInputStream.read(RawLocalFileSystem.java:90)
    at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileInputStream.read(RawLocalFileSystem.java:143)
IP based cluster - address space issues
Hello,

We have a cluster that we initially configured using IP addresses instead of hostnames (i.e., all namenode and datanode references are given as IP addresses rather than hostnames; we did this with the intuition that the cluster might run faster). However, our entire IP address subnet is now changing in our center for reasons beyond our control. When we changed the machines' IP addresses to the new subnet, the datanodes refused to start. How can I get a clean start without losing my old data?

thanks in advance for the help.

- Prasad Pingali,
LTRC, IIIT, Hyderabad.
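For anyone hitting this later: as far as I understand, HDFS identifies each datanode's storage by a storage ID written into the data directories, not by its IP address, so a subnet change by itself should not require losing data. A checklist sketch of what I would update (adapt the paths and property names to your own conf layout):

```
conf/hadoop-site.xml        - fs.default.name and mapred.job.tracker must
                              point at the namenode/jobtracker's new address
conf/slaves, conf/masters   - replace every old IP with the new one
/etc/hosts on every node    - if the old IPs are mapped there
dfs.datanode.* bind settings - only if you pinned them to specific IPs
```

Then restart the daemons. Do not run "hadoop namenode -format": that would wipe the namespace and orphan the old blocks.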