Re: Datanodes using public ip, why?
Hi,

Thank you all for the comments. Setting dfs.datanode.dns.interface and putting the private IPs in the slaves and masters files didn't work. So, as Alex suggested, I changed the public-IP-to-hostname mappings in the /etc/hosts files, and all datanodes now communicate over the private network. I'm not fully content, though, since in some situations I would want the hostnames mapped to public IPs while Hadoop still communicates over the private network. I don't understand why dfs.datanode.dns.interface has no effect.

One interesting thing I found: if I change fs.default.name from the private IP to the public IP, all datanodes report themselves with public IPs. So confusing. Why?

By the way, I'm using Hadoop 1.0.3, with no nameserver and no firewalls.

Thank you,
Ben

On Fri, Jul 12, 2013 at 12:29 PM, Alex Levin ale...@gmail.com wrote: Make sure that your hostnames resolve (via DNS and/or hosts files) to the private IPs. If the nodes' hosts files have records like "public-IP hostname", remove (or comment out) them. Alex

On Jul 11, 2013 2:21 AM, Ben Kim benkimkim...@gmail.com wrote: Hello Hadoop Community! I've set up the datanodes on a private network by adding the private hostnames to the slaves file, but when I look at the web UI the datanodes are registered with public hostnames. Are they actually communicating over the public network? All datanodes have eth0 with a public address and eth1 with a private address. What am I missing? Thanks a whole lot. *Benjamin Kim* *benkimkimben at gmail*

--
*Benjamin Kim*
*benkimkimben at gmail*
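For anyone landing here later, a minimal sketch of the layout that ended up working; the hostnames and addresses are only illustrative (dn1/dn2 with eth0 public and eth1 private). Resolve each hostname to its private address on every node, and keep the interface hint in hdfs-site.xml:

    # /etc/hosts on every node (addresses are illustrative)
    10.0.0.11   dn1
    10.0.0.12   dn2

    <!-- hdfs-site.xml -->
    <property><name>dfs.datanode.dns.interface</name><value>eth1</value></property>

As far as I understand, dfs.datanode.dns.interface in 1.0.3 only changes the hostname the datanode reports about itself, so the hosts-file resolution and the address in fs.default.name still determine which network the traffic actually uses, which would be consistent with the behavior described above.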
Datanodes using public ip, why?
Hello Hadoop Community! I've set up the datanodes on a private network by adding the private hostnames to the slaves file, but when I look at the web UI the datanodes are registered with public hostnames. Are they actually communicating over the public network? All datanodes have eth0 with a public address and eth1 with a private address. What am I missing?

Thanks a whole lot.

*Benjamin Kim*
*benkimkimben at gmail*
is time sync required among all nodes?
Hi, this is a very basic, fundamental question: does the time on all nodes need to be synced? I've never even thought about timing in a Hadoop cluster, but I recently noticed my servers drifting out of sync. I know HBase requires synced clocks because of its timestamp handling, but I wonder whether any Hadoop functionality requires time sync: perhaps checkpointing, namenode HA, datanode reports, etc. Hmm.

--
*Benjamin Kim*
*benkimkimben at gmail*
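Either way, keeping the clocks synced is cheap insurance. A minimal sketch for CentOS-style nodes, assuming the ntp package is installed (package and service names may differ on other distros):

    ntpdate pool.ntp.org        # one-time correction before starting the daemon
    chkconfig ntpd on
    service ntpd start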
Hadoop 2.0.4: Unable to load native-hadoop library for your platform
Hi, I downloaded Hadoop 2.0.4 and keep getting this warning from the Hadoop CLI and in the MapReduce task logs:

13/05/24 14:34:17 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

I tried adding $HADOOP_HOME/lib/native/* to CLASSPATH and LD_LIBRARY_PATH, but neither worked. Has anyone had a similar problem? TY!

--
*Benjamin Kim*
*benkimkimben at gmail*
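In case it helps the next person, a minimal sketch of pointing the JVM at the native directory itself rather than a glob, assuming the .so files really are under $HADOOP_HOME/lib/native and were built for your platform and bitness (if they weren't, they have to be rebuilt and no path setting will make the warning go away). In hadoop-env.sh:

    export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
    export HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=$HADOOP_HOME/lib/native"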
Re: issue with hadoop mapreduce using same job.jar
Things I tried so far, without luck:
- restart Hadoop
- sync; echo 3 > /proc/sys/vm/drop_caches
- clear the namenode Java cache using jcontrol
- check permissions on the /user/hadoop/.staging folder
- delete everything under the .staging folder
- rename the test.jar file
- run as a different user

What did work, though, was running the MR job remotely through the Hadoop API, so it seems this only happens with the Hadoop CLI.

On Wed, Mar 13, 2013 at 1:00 AM, Ben Kim benkimkim...@gmail.com wrote: Hi there, it looks like the job.jar created in the /user/hadoop/.staging/ folder is always the same no matter which jar file I give. If I download the job.jar file, it turns out to be the jar I used to run an MR job a few hours ago. I'm using Hadoop 1.0.3 on top of CentOS 6.2. Anyone have any ideas? *Benjamin Kim* *benkimkimben at gmail*

--
*Benjamin Kim*
*benkimkimben at gmail*
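One more diagnostic worth sketching, to confirm whether the staged jar really is stale rather than just mislabeled; the job directory name below is a placeholder:

    md5sum test.jar
    hadoop fs -ls /user/hadoop/.staging/
    hadoop fs -cat /user/hadoop/.staging/job_XXXX/job.jar | md5sum    # compare with the local jar's checksum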
issue with hadoop mapreduce using same job.jar
Hi there, it looks like the job.jar created in the /user/hadoop/.staging/ folder is always the same no matter which jar file I give. If I download the job.jar file, it turns out to be the jar I used to run an MR job a few hours ago. I'm using Hadoop 1.0.3 on top of CentOS 6.2. Anyone have any ideas?

*Benjamin Kim*
*benkimkimben at gmail*
Re: Multiple reduce task retries running at same time
Attached a screenshot showing the retries.

On Tue, Jan 29, 2013 at 4:35 PM, Ben Kim benkimkim...@gmail.com wrote: Hi! I have come across a situation where a single reduce task is executing with multiple retries simultaneously, which has the potential to slow down the whole reduce phase for large data sets. Is this normal to you all for Hadoop 1.0.3?

--
*Benjamin Kim*
*benkimkimben at gmail*

attachment: sc2.png
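For later readers: what is described here sounds like speculative execution, which is on by default in 1.0.3 and launches duplicate attempts of tasks it considers slow. A minimal sketch for turning it off on the reduce side, using the 1.x property name, inside <configuration> in mapred-site.xml:

    <property>
      <name>mapred.reduce.tasks.speculative.execution</name>
      <value>false</value>
    </property>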
Re: Decommissioning a datanode takes forever
UPDATE: the WARN about the edit log had nothing to do with the current problem. However, the replica placement warnings look suspicious. Please have a look at the following logs:

2013-01-22 09:12:10,885 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Not able to place enough replicas, still in need of 1
2013-01-22 00:02:17,541 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Block: blk_4844131893883391179_3440513, Expected Replicas: 10, live replicas: 9, corrupt replicas: 0, decommissioned replicas: 1, excess replicas: 0, Is Open File: false, Datanodes having this block: 203.235.211.155:50010 203.235.211.156:50010 203.235.211.145:50010 203.235.211.144:50010 203.235.211.146:50010 203.235.211.158:50010 203.235.211.159:50010 203.235.211.157:50010 203.235.211.160:50010 203.235.211.143:50010, Current Datanode: 203.235.211.155:50010, Is current datanode decommissioning: true

I have set my replication factor to 3, so I don't understand why Hadoop is trying to replicate this block to 10 nodes. I have decommissioned one node, so I currently have 9 nodes in operation; it will never be replicated to 10 nodes. I also see that all the repeated warnings like the one above are for blk_4844131893883391179_3440513. How would I delete that block? It's not showing as a corrupted block in fsck. :(

BEN

On Tue, Jan 22, 2013 at 9:28 AM, Ben Kim benkimkim...@gmail.com wrote: Hi Varun, thank you for the response. No, there don't seem to be any corrupted blocks in my cluster. I ran hadoop fsck -blocks / and it didn't report any corrupted blocks. However, these two WARNings have been repeating constantly in the namenode log since the decommission:
- 2013-01-22 09:16:30,908 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Cannot roll edit log, edits.new files already exists in all healthy directories:
- 2013-01-22 09:12:10,885 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Not able to place enough replicas, still in need of 1
There isn't any WARN or ERROR in the decommissioning datanode's log. Ben

On Mon, Jan 21, 2013 at 3:05 PM, varun kumar varun@gmail.com wrote: Hi Ben, are there any corrupted blocks in your hadoop cluster? Regards, Varun Kumar

On Mon, Jan 21, 2013 at 8:22 AM, Ben Kim benkimkim...@gmail.com wrote: Hi! I followed the decommissioning guide on the Hadoop HDFS wiki. The HDFS web UI shows that the decommissioning process has successfully begun: it started redeploying 80,000 blocks through the cluster, but for some reason it stopped at 9059 blocks. I've waited 30 hours and still no progress. Anyone with any ideas? *Benjamin Kim* *benkimkimben at gmail*

--
*Benjamin Kim*
*benkimkimben at gmail*

--
Regards,
Varun Kumar.P

--
*Benjamin Kim*
*benkimkimben at gmail*
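One guess at the "Expected Replicas: 10" puzzle, offered tentatively: job submission files under /user/<user>/.staging are written with mapred.submit.replication, which defaults to 10 in 1.x, and a 9-node cluster can never satisfy that during a decommission. A rough sketch for checking which file owns the block and lowering its replication; the staging path shown is hypothetical:

    hadoop fsck / -files -blocks -locations > fsck.out    # search fsck.out for blk_4844131893883391179 to find the owning file
    hadoop fs -setrep -w 3 /user/hadoop/.staging/job_201301210000_0001/job.jar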
Re: Decommissioning a datanode takes forever
Impatient as I am, I just shut down the cluster and restarted it with an empty exclude file. If I add the datanode hostname back to the exclude file and run hadoop dfsadmin -refreshNodes, *the datanode goes straight to the dead nodes list* without going through the decommission process. I'm done for today; maybe someone else can figure it out before I come back tomorrow :)

Best regards,
Ben

On Tue, Jan 22, 2013 at 5:38 PM, Ben Kim benkimkim...@gmail.com wrote: UPDATE: the WARN about the edit log had nothing to do with the current problem. However, the replica placement warnings look suspicious. Please have a look at the following logs:

2013-01-22 09:12:10,885 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Not able to place enough replicas, still in need of 1
2013-01-22 00:02:17,541 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Block: blk_4844131893883391179_3440513, Expected Replicas: 10, live replicas: 9, corrupt replicas: 0, decommissioned replicas: 1, excess replicas: 0, Is Open File: false, Datanodes having this block: 203.235.211.155:50010 203.235.211.156:50010 203.235.211.145:50010 203.235.211.144:50010 203.235.211.146:50010 203.235.211.158:50010 203.235.211.159:50010 203.235.211.157:50010 203.235.211.160:50010 203.235.211.143:50010, Current Datanode: 203.235.211.155:50010, Is current datanode decommissioning: true

I have set my replication factor to 3, so I don't understand why Hadoop is trying to replicate this block to 10 nodes. I have decommissioned one node, so I currently have 9 nodes in operation; it will never be replicated to 10 nodes. I also see that all the repeated warnings like the one above are for blk_4844131893883391179_3440513. How would I delete that block? It's not showing as a corrupted block in fsck. :(

BEN

On Tue, Jan 22, 2013 at 9:28 AM, Ben Kim benkimkim...@gmail.com wrote: Hi Varun, thank you for the response. No, there don't seem to be any corrupted blocks in my cluster. I ran hadoop fsck -blocks / and it didn't report any corrupted blocks. However, these two WARNings have been repeating constantly in the namenode log since the decommission:
- 2013-01-22 09:16:30,908 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Cannot roll edit log, edits.new files already exists in all healthy directories:
- 2013-01-22 09:12:10,885 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Not able to place enough replicas, still in need of 1
There isn't any WARN or ERROR in the decommissioning datanode's log. Ben

On Mon, Jan 21, 2013 at 3:05 PM, varun kumar varun@gmail.com wrote: Hi Ben, are there any corrupted blocks in your hadoop cluster? Regards, Varun Kumar

On Mon, Jan 21, 2013 at 8:22 AM, Ben Kim benkimkim...@gmail.com wrote: Hi! I followed the decommissioning guide on the Hadoop HDFS wiki. The HDFS web UI shows that the decommissioning process has successfully begun: it started redeploying 80,000 blocks through the cluster, but for some reason it stopped at 9059 blocks. I've waited 30 hours and still no progress. Anyone with any ideas? *Benjamin Kim* *benkimkimben at gmail*

--
Regards,
Varun Kumar.P

--
*Benjamin Kim*
*benkimkimben at gmail*

--
*Benjamin Kim*
*benkimkimben at gmail*

--
*Benjamin Kim*
*benkimkimben at gmail*
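One thing worth double-checking, offered only as a guess: that dfs.hosts.exclude points at the file the namenode is actually reading, and that the hostname in the exclude file matches exactly how the datanode registered (FQDN vs. short name vs. IP); a mismatch there is a common reason -refreshNodes marks a node dead instead of starting decommission. A minimal sketch, with illustrative paths and hostnames:

    <!-- hdfs-site.xml -->
    <property><name>dfs.hosts.exclude</name><value>/home/hadoop/conf/excludes</value></property>

    echo dn5.example.com >> /home/hadoop/conf/excludes
    hadoop dfsadmin -refreshNodes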
Re: Decommissioning a datanode takes forever
Hi Varun,

Thank you for the response. No, there don't seem to be any corrupted blocks in my cluster. I ran hadoop fsck -blocks / and it didn't report any corrupted blocks. However, these two WARNings have been repeating constantly in the namenode log since the decommission:

- 2013-01-22 09:16:30,908 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Cannot roll edit log, edits.new files already exists in all healthy directories:
- 2013-01-22 09:12:10,885 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Not able to place enough replicas, still in need of 1

There isn't any WARN or ERROR in the decommissioning datanode's log.

Ben

On Mon, Jan 21, 2013 at 3:05 PM, varun kumar varun@gmail.com wrote: Hi Ben, are there any corrupted blocks in your hadoop cluster? Regards, Varun Kumar

On Mon, Jan 21, 2013 at 8:22 AM, Ben Kim benkimkim...@gmail.com wrote: Hi! I followed the decommissioning guide on the Hadoop HDFS wiki. The HDFS web UI shows that the decommissioning process has successfully begun: it started redeploying 80,000 blocks through the cluster, but for some reason it stopped at 9059 blocks. I've waited 30 hours and still no progress. Anyone with any ideas? *Benjamin Kim* *benkimkimben at gmail*

--
Regards,
Varun Kumar.P

--
*Benjamin Kim*
*benkimkimben at gmail*
Streaming Job map/reduce not working with scripts on 1.0.3
Hi!

I'm using hadoop-1.0.3 to run streaming jobs with map/reduce shell scripts, like this:

bin/hadoop jar ./contrib/streaming/hadoop-streaming-1.0.3.jar -input /input -output /output/015 -mapper streaming-map.sh -reducer streaming-reduce.sh -file /home/hadoop/streaming/streaming-map.sh -file /home/hadoop/streaming-reduce.sh

but the job fails, and the task attempt log shows this:

java.lang.RuntimeException: Error in configuring object
 at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
 at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
 at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
 at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:432)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
 at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
 at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.reflect.InvocationTargetException
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
 ... 9 more
Caused by: java.lang.RuntimeException: Error in configuring object
 at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
 at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
 at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
 at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
 ... 14 more
Caused by: java.lang.reflect.InvocationTargetException
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
 ... 17 more
Caused by: java.lang.RuntimeException: configuration exception
 at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:230)
 at org.apache.hadoop.streaming.PipeMapper.configure(PipeMapper.java:66)
 ... 22 more
Caused by: java.io.IOException: Cannot run program /tmp/hadoop-hadoop/mapred/local/taskTracker/hadoop/jobcache/job_201301041944_0001/attempt_201301041944_0001_m_00_0/work/./streaming-map.sh: java.io.IOException: error=2, No such file or directory
 at java.lang.ProcessBuilder.start(ProcessBuilder.java:460)
 at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:214)
 ... 23 more
Caused by: java.io.IOException: java.io.IOException: error=2, No such file or directory
 at java.lang.UNIXProcess.<init>(UNIXProcess.java:148)
 at java.lang.ProcessImpl.start(ProcessImpl.java:65)
 at java.lang.ProcessBuilder.start(ProcessBuilder.java:453)
 ... 24 more

I tried to see what the problem is and found out that the missing file is a symbolic link that Hadoop isn't able to create; in fact, the /tmp/hadoop-hadoop/...0_0/work directory doesn't exist at all.

Here's an excerpt from the task attempt syslog (full text attached):

2013-01-04 19:44:43,304 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /tmp/hadoop-hadoop/mapred/local/taskTracker/hadoop/jobcache/job_201301041944_0001/jars/streaming-map.sh - /tmp/hadoop-hadoop/mapred/local/taskTracker/hadoop/jobcache/job_201301041944_0001/attempt_201301041944_0001_m_00_0/work/streaming-map.sh

Hadoop thinks it has successfully created the symbolic link from job_201301041944_0001/jars/streaming-map.sh to job_201301041944_0001/attempt_201301041944_0001_m_00_0/work/streaming-map.sh, but it actually hasn't, and therefore it throws the error. If you have had the same experience or know a workaround, please comment! Otherwise I'll file a JIRA tomorrow, since this seems to be an obvious bug.

Best regards,
*Benjamin Kim*
*benkimkimben at gmail*

attachment: syslog
Re: Streaming Job map/reduce not working with scripts on 1.0.3
Nevermind, the problem has been fixed. The problem was a trailing {control-v}{control-m} (carriage return) character at the end of the first line, #!/bin/bash (which I blame on my teammate for writing the script in Windows Notepad!). See the note after this message for a quick way to check for and strip those characters.

On Fri, Jan 4, 2013 at 8:09 PM, Ben Kim benkimkim...@gmail.com wrote: Hi! I'm using hadoop-1.0.3 to run streaming jobs with map/reduce shell scripts, like this: bin/hadoop jar ./contrib/streaming/hadoop-streaming-1.0.3.jar -input /input -output /output/015 -mapper streaming-map.sh -reducer streaming-reduce.sh -file /home/hadoop/streaming/streaming-map.sh -file /home/hadoop/streaming-reduce.sh but the job fails, and the task attempt log shows this: java.lang.RuntimeException: Error in configuring object at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93) at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64) at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:432) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121) at org.apache.hadoop.mapred.Child.main(Child.java:249) Caused by: java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88) ... 9 more Caused by: java.lang.RuntimeException: Error in configuring object at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93) at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64) at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117) at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34) ... 14 more Caused by: java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88) ... 17 more Caused by: java.lang.RuntimeException: configuration exception at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:230) at org.apache.hadoop.streaming.PipeMapper.configure(PipeMapper.java:66) ... 22 more Caused by: java.io.IOException: Cannot run program /tmp/hadoop-hadoop/mapred/local/taskTracker/hadoop/jobcache/job_201301041944_0001/attempt_201301041944_0001_m_00_0/work/./streaming-map.sh: java.io.IOException: error=2, No such file or directory at java.lang.ProcessBuilder.start(ProcessBuilder.java:460) at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:214) ... 23 more Caused by: java.io.IOException: java.io.IOException: error=2, No such file or directory at java.lang.UNIXProcess.<init>(UNIXProcess.java:148) at java.lang.ProcessImpl.start(ProcessImpl.java:65) at java.lang.ProcessBuilder.start(ProcessBuilder.java:453) ... 24 more

I tried to see what the problem is and found out that the missing file is a symbolic link that Hadoop isn't able to create; in fact, the /tmp/hadoop-hadoop/...0_0/work directory doesn't exist at all. Here's an excerpt from the task attempt syslog (full text attached): 2013-01-04 19:44:43,304 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /tmp/hadoop-hadoop/mapred/local/taskTracker/hadoop/jobcache/job_201301041944_0001/jars/streaming-map.sh - /tmp/hadoop-hadoop/mapred/local/taskTracker/hadoop/jobcache/job_201301041944_0001/attempt_201301041944_0001_m_00_0/work/streaming-map.sh Hadoop thinks it has successfully created the symbolic link from job_201301041944_0001/jars/streaming-map.sh to job_201301041944_0001/attempt_201301041944_0001_m_00_0/work/streaming-map.sh, but it actually hasn't, and therefore it throws the error. If you have had the same experience or know a workaround, please comment! Otherwise I'll file a JIRA tomorrow, since this seems to be an obvious bug. Best regards, *Benjamin Kim* *benkimkimben at gmail*

--
*Benjamin Kim*
*benkimkimben at gmail*
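For anyone else bitten by this, a minimal sketch for spotting and stripping the stray carriage returns (assuming GNU sed; dos2unix does the same job if it is installed):

    head -1 streaming-map.sh | od -c    # a \r before the \n means DOS line endings
    sed -i 's/\r$//' streaming-map.sh streaming-reduce.sh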
Re: use S3 as input to MR job
I'm having a similar issue. I'm running a wordcount MR job as follows:

hadoop jar WordCount.jar wordcount.WordCountDriver s3n://bucket/wordcount/input s3n://bucket/wordcount/output

s3n://bucket/wordcount/input is an S3 object that contains other input files. However, I get the following NPE:

12/10/02 18:56:23 INFO mapred.JobClient: map 0% reduce 0%
12/10/02 18:56:54 INFO mapred.JobClient: map 50% reduce 0%
12/10/02 18:56:56 INFO mapred.JobClient: Task Id : attempt_201210021853_0001_m_01_0, Status : FAILED
java.lang.NullPointerException
 at org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.close(NativeS3FileSystem.java:106)
 at java.io.BufferedInputStream.close(BufferedInputStream.java:451)
 at java.io.FilterInputStream.close(FilterInputStream.java:155)
 at org.apache.hadoop.util.LineReader.close(LineReader.java:83)
 at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.close(LineRecordReader.java:144)
 at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.close(MapTask.java:497)
 at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:765)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
 at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
 at org.apache.hadoop.mapred.Child.main(Child.java:249)

The MR job runs fine if I specify a more specific input path such as s3n://bucket/wordcount/input/file.txt. What I want is to be able to pass S3 folders as parameters. Does anyone know how to do this?

Best regards,
Ben Kim

On Fri, Jul 20, 2012 at 10:33 AM, Harsh J ha...@cloudera.com wrote: Dan, can you share your error? Plain .gz files (not .tar.gz) are natively supported by Hadoop via its GzipCodec, and if you are facing an error, I believe its cause is something other than compression.

On Fri, Jul 20, 2012 at 6:14 AM, Dan Yi d...@mediosystems.com wrote: I have an MR job to read files on Amazon S3 and process the data on the local HDFS. The files are gzipped text files (.gz). I tried to set up the job as below, but it won't work. Anyone know what might be wrong? Do I need to add an extra step to unzip the files first? Thanks.

String S3_LOCATION = "s3n://access_key:private_key@bucket_name";

protected void prepareHadoopJob() throws Exception {
  this.getHadoopJob().setMapperClass(Mapper1.class);
  this.getHadoopJob().setInputFormatClass(TextInputFormat.class);
  FileInputFormat.addInputPath(this.getHadoopJob(), new Path(S3_LOCATION));
  this.getHadoopJob().setNumReduceTasks(0);
  this.getHadoopJob().setOutputFormatClass(TableOutputFormat.class);
  this.getHadoopJob().getConfiguration().set(TableOutputFormat.OUTPUT_TABLE, myTable.getTableName());
  this.getHadoopJob().setOutputKeyClass(ImmutableBytesWritable.class);
  this.getHadoopJob().setOutputValueClass(Put.class);
}

* Dan Yi | Software Engineer, Analytics Engineering
Medio Systems Inc | 701 Pike St. #1500 Seattle, WA 98101
Predictive Analytics for a Connected World *

--
Harsh J

--
*Benjamin Kim*
*benkimkimben at gmail*
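A couple of things that might be worth trying, offered as guesses rather than a known fix: put the S3 credentials in core-site.xml instead of embedding them in the URI, and pass a glob rather than the bare folder so only real files are matched (the bucket and paths are the ones from the example above):

    <property><name>fs.s3n.awsAccessKeyId</name><value>YOUR_ACCESS_KEY</value></property>
    <property><name>fs.s3n.awsSecretAccessKey</name><value>YOUR_SECRET_KEY</value></property>

    hadoop jar WordCount.jar wordcount.WordCountDriver 's3n://bucket/wordcount/input/*' s3n://bucket/wordcount/output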
Re: Error reading task output
Hi, I'm having a similar problem, so I'll continue on this thread to describe my issue.

I ran an MR job that takes 70 GB of input and creates 1098 mappers and 100 reducers (on a 9-node Hadoop cluster), but the job fails and 4 datanodes die after a few minutes (the processes are still running, but the master recognizes them as dead). When I investigate the job, it looks like 20 mappers fail with these errors:

ProcfsBasedProcessTree: java.io.IOException: Cannot run program getconf: java.io.IOException: error=11, Resource temporarily unavailable
..
OutOfMemoryError: unable to create new native thread
..
# There is insufficient memory for the Java Runtime Environment to continue.
# Cannot create GC thread. Out of system resources.

The reducers also fail because they can't retrieve the failed mappers' outputs. I'm guessing that somehow the JVM memory reaches its limit and the tasktrackers and datanodes can't create new threads, so they die. But given my lack of experience with Hadoop, I don't know what's actually causing it, and of course I don't have answers yet.

Here are some configurations:

HADOOP_HEAPSIZE=4096
HADOOP_NAMENODE_OPTS = .. -Xmx2g ..
HADOOP_DATANODE_OPTS = .. -Xmx4g ..
HADOOP_JOBTRACKER_OPTS = .. -Xmx4g ..
dfs.datanode.max.xcievers = 6
mapred.child.java.opts = -Xmx400m
mapred.tasktracker.map.tasks.maximum = 14
mapred.tasktracker.reduce.tasks.maximum = 14

I've also attached the logs. If anyone knows the answer, please, please let me know. I would appreciate any help on this.

Best regards,
Ben

On Fri, Jun 15, 2012 at 1:05 PM, Harsh J ha...@cloudera.com wrote: Do you ship a lot of dist-cache files or perhaps have a bad mapred.child.java.opts parameter?

On Fri, Jun 15, 2012 at 1:39 AM, Shamshad Ansari sans...@apixio.com wrote: Hi all, when I run Hadoop jobs, I observe the following errors. Also, I notice that a data node dies every time the job is initiated. Does anyone know what may be causing this and how to solve it?
==
12/06/14 19:57:17 INFO input.FileInputFormat: Total input paths to process : 1
12/06/14 19:57:17 INFO mapred.JobClient: Running job: job_201206141136_0002
12/06/14 19:57:18 INFO mapred.JobClient: map 0% reduce 0%
12/06/14 19:57:27 INFO mapred.JobClient: Task Id : attempt_201206141136_0002_m_01_0, Status : FAILED
java.lang.Throwable: Child Error
 at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
Caused by: java.io.IOException: Task process exit with nonzero status of 1.
 at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:258)
12/06/14 19:57:27 WARN mapred.JobClient: Error reading task output http://node1:50060/tasklog?plaintext=true&attemptid=attempt_201206141136_0002_m_01_0&filter=stdout
12/06/14 19:57:27 WARN mapred.JobClient: Error reading task output http://node1:50060/tasklog?plaintext=true&attemptid=attempt_201206141136_0002_m_01_0&filter=stderr
12/06/14 19:57:33 INFO mapred.JobClient: Task Id : attempt_201206141136_0002_r_02_0, Status : FAILED
java.lang.Throwable: Child Error
 at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
Caused by: java.io.IOException: Task process exit with nonzero status of 1.
 at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:258)
12/06/14 19:57:33 WARN mapred.JobClient: Error reading task output http://node1:50060/tasklog?plaintext=true&attemptid=attempt_201206141136_0002_r_02_0&filter=stdout
12/06/14 19:57:33 WARN mapred.JobClient: Error reading task output http://node1:50060/tasklog?plaintext=true&attemptid=attempt_201206141136_0002_r_02_0&filter=stderr
^Chadoop@ip-10-174-87-251:~/apixio-pipeline/pipeline-trigger$ 12/06/14 19:57:27 WARN mapred.JobClient: Error reading task output http://node1:50060/tasklog?plaintext=true&attemptid=attempt_201206141136_0002_m_01_0&filter=stdout

Thank you,
--Shamshad

--
Harsh J

--
*Benjamin Kim*
*benkimkimben at gmail*

attachments: datanode.log, mapper.log, reducer.log, tasktracker.log
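A note for anyone hitting the same stack: "unable to create new native thread" usually points at a per-user process/thread limit or exhausted native memory rather than Java heap. A minimal sketch of what to check on each node, assuming the daemons run as the hadoop user; the limit values are only illustrative:

    ulimit -u    # max user processes for the hadoop user

    # /etc/security/limits.conf (illustrative values)
    hadoop  soft  nproc   32768
    hadoop  hard  nproc   32768
    hadoop  soft  nofile  65536
    hadoop  hard  nofile  65536

The slot arithmetic is also worth doing with the configuration above: 14 map slots plus 14 reduce slots at -Xmx400m each is roughly 11 GB of task heap per node before JVM overhead, on top of the 4 GB datanode and tasktracker heaps.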
Hadoop topology not working (all servers belong to default rack)
Hi, I got my topology script from http://wiki.apache.org/hadoop/topology_rack_awareness_scripts and verified that the script works correctly on its own. But in the Hadoop cluster, all my servers get assigned to the default rack. I'm using Hadoop 1.0.3, and I had the same problem with 1.0.0. Yunhong hit the same problem in the past without any resolution: http://mail-archives.apache.org/mod_mbox/hadoop-common-user/200807.mbox/%3cpine.lnx.4.64.0807031453070.28...@bert.cs.uic.edu%3E

*Benjamin Kim*
*benkimkimben at gmail*
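A minimal sketch of the wiring the script needs, in case any piece is missing; the path is illustrative and the property name is the 1.x one. The property goes in core-site.xml on the namenode and jobtracker, the script must be executable, and the daemons need a restart afterwards, since rack mappings are resolved when datanodes register:

    <property>
      <name>topology.script.file.name</name>
      <value>/home/hadoop/conf/topology.sh</value>
    </property>

    chmod +x /home/hadoop/conf/topology.sh
    /home/hadoop/conf/topology.sh 10.0.0.11    # should print something like /rack1, not /default-rack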