Thanks Jason. I went inside the code of that statement and found that it eventually makes a binaryRead call to read a binary file, and that is where it gets stuck.
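To convince myself that plain binary reading is not the problem, I ran a minimal stand-alone check (everything here, including the file name, is made up for illustration; it uses only java.io, not the Hadoop cache, and only the same general kind of read I assume Classifier.binaryRead performs):

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

public class BinaryReadCheck {
    public static void main(String[] args) throws IOException {
        // Round-trip a few values through a binary file on the local
        // filesystem (the "fake-model" name is a placeholder).
        File f = File.createTempFile("fake-model", ".level1");
        f.deleteOnExit();

        DataOutputStream out = new DataOutputStream(new FileOutputStream(f));
        out.writeInt(42);
        out.writeDouble(3.14);
        out.close();

        DataInputStream in = new DataInputStream(new FileInputStream(f));
        System.out.println(in.readInt());
        System.out.println(in.readDouble());
        in.close();
    }
}
```

This passes for me, which makes me suspect the files are not reaching the cache rather than that binary reads themselves fail.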
Do you know whether there is any problem with adding a binary file to the distributed cache? In the statement

DistributedCache.addCacheFile(new URI("/home/akhil1988/Ner/OriginalNer/Data/"), conf);

Data is a directory which contains some text files as well as some binary files. In the statement

Parameters.readConfigAndLoadExternalData("Config/allLayer1.config");

I can see (from the output messages) that it is able to read the text files, but it gets stuck at the binary files. So I think the problem is that it cannot read the binary files: either they have not been transferred to the cache, or a binary file cannot be read from it. Do you know the solution to this?

Thanks,
Akhil

jason hadoop wrote:
>
> Something is happening inside of your (Parameters.
> readConfigAndLoadExternalData("Config/allLayer1.config");)
> code, and the framework is killing the job for not heartbeating for 600
> seconds
>
> On Tue, Jun 16, 2009 at 8:32 PM, akhil1988 <akhilan...@gmail.com> wrote:
>
>>
>> One more thing, finally it terminates there (after some time) by giving
>> the final Exception:
>>
>> java.io.IOException: Job failed!
>>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1217)
>>         at LbjTagger.NerTagger.main(NerTagger.java:109)
>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>         at java.lang.reflect.Method.invoke(Method.java:597)
>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:165)
>>         at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
>>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>>         at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)
>>
>> akhil1988 wrote:
>> >
>> > Thank you Jason for your reply.
>> >
>> > My Map class is an inner class and it is a static class. Here is the
>> > structure of my code:
>> >
>> > public class NerTagger {
>> >
>> >     public static class Map extends MapReduceBase implements
>> >             Mapper<LongWritable, Text, Text, Text> {
>> >         private Text word = new Text();
>> >         private static NETaggerLevel1 tagger1 = new NETaggerLevel1();
>> >         private static NETaggerLevel2 tagger2 = new NETaggerLevel2();
>> >
>> >         Map() {
>> >             System.out.println("HI2\n");
>> >             Parameters.readConfigAndLoadExternalData("Config/allLayer1.config");
>> >             System.out.println("HI3\n");
>> >             Parameters.forceNewSentenceOnLineBreaks = Boolean.parseBoolean("true");
>> >
>> >             System.out.println("loading the tagger");
>> >             tagger1 = (NETaggerLevel1) Classifier.binaryRead(Parameters.pathToModelFile + ".level1");
>> >             System.out.println("HI5\n");
>> >             tagger2 = (NETaggerLevel2) Classifier.binaryRead(Parameters.pathToModelFile + ".level2");
>> >             System.out.println("Done- loading the tagger");
>> >         }
>> >
>> >         public void map(LongWritable key, Text value,
>> >                 OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
>> >             String inputline = value.toString();
>> >
>> >             /* Processing of the input pair is done here */
>> >         }
>> >     }
>> >
>> >     public static void main(String[] args) throws Exception {
>> >         JobConf conf = new JobConf(NerTagger.class);
>> >         conf.setJobName("NerTagger");
>> >
>> >         conf.setOutputKeyClass(Text.class);
>> >         conf.setOutputValueClass(IntWritable.class);
>> >
>> >         conf.setMapperClass(Map.class);
>> >         conf.setNumReduceTasks(0);
>> >
>> >         conf.setInputFormat(TextInputFormat.class);
>> >         conf.setOutputFormat(TextOutputFormat.class);
>> >
>> >         conf.set("mapred.job.tracker", "local");
>> >         conf.set("fs.default.name", "file:///");
>> >
>> >         DistributedCache.addCacheFile(new URI("/home/akhil1988/Ner/OriginalNer/Data/"), conf);
>> >         DistributedCache.addCacheFile(new URI("/home/akhil1988/Ner/OriginalNer/Config/"), conf);
>> >         DistributedCache.createSymlink(conf);
>> >
>> >         conf.set("mapred.child.java.opts", "-Xmx4096m");
>> >
>> >         FileInputFormat.setInputPaths(conf, new Path(args[0]));
>> >         FileOutputFormat.setOutputPath(conf, new Path(args[1]));
>> >
>> >         System.out.println("HI1\n");
>> >
>> >         JobClient.runJob(conf);
>> >     }
>> > }
>> >
>> > Jason, when the program executes, HI1 and HI2 are printed but it never
>> > reaches HI3. In the statement
>> > Parameters.readConfigAndLoadExternalData("Config/allLayer1.config"); it
>> > is able to access the Config/allLayer1.config file (while executing this
>> > statement it prints some messages, e.g. which data it is loading), but
>> > it gets stuck there (while loading some classifier) and never reaches
>> > HI3.
>> >
>> > This program runs fine when executed normally (without MapReduce).
>> >
>> > Thanks, Akhil
>> >
>> > jason hadoop wrote:
>> >>
>> >> Is it possible that your map class is an inner class and not static?
>> >>
>> >> On Tue, Jun 16, 2009 at 10:51 AM, akhil1988 <akhilan...@gmail.com> wrote:
>> >>
>> >>>
>> >>> Hi All,
>> >>>
>> >>> I am running my mapred program in local mode by setting
>> >>> mapred.job.tracker to local so that I can debug my code.
>> >>> The mapred program is a direct porting of my original sequential code.
>> >>> There is no reduce phase.
>> >>> Basically, I have just put my program in the map class.
>> >>>
>> >>> My program takes around 1-2 min. to instantiate the data objects,
>> >>> which are created in the constructor of the Map class (it loads some
>> >>> data model files, therefore it takes some time). After the
>> >>> instantiation part in the constructor of the Map class, the map
>> >>> function is supposed to process the input split.
>> >>>
>> >>> The problem is that the data objects do not get instantiated
>> >>> completely, and in between (while it is still in the constructor) the
>> >>> program stops, giving the exceptions pasted at the bottom.
>> >>> The program runs fine without MapReduce and does not require more
>> >>> than 2GB memory, but in MapReduce, even after doing export
>> >>> HADOOP_HEAPSIZE=2500 (I am working on machines with 16GB RAM), the
>> >>> program fails. I have also set
>> >>> HADOOP_OPTS="-server -XX:-UseGCOverheadLimit", as sometimes I was
>> >>> also getting GC Overhead Limit Exceeded exceptions.
>> >>>
>> >>> Somebody, please help me with this problem: I have been trying to
>> >>> debug it for the last 3 days, without success. Thanks!
>> >>>
>> >>> java.lang.OutOfMemoryError: Java heap space
>> >>>         at sun.misc.FloatingDecimal.toJavaFormatString(FloatingDecimal.java:889)
>> >>>         at java.lang.Double.toString(Double.java:179)
>> >>>         at java.text.DigitList.set(DigitList.java:272)
>> >>>         at java.text.DecimalFormat.format(DecimalFormat.java:584)
>> >>>         at java.text.DecimalFormat.format(DecimalFormat.java:507)
>> >>>         at java.text.NumberFormat.format(NumberFormat.java:269)
>> >>>         at org.apache.hadoop.util.StringUtils.formatPercent(StringUtils.java:110)
>> >>>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1147)
>> >>>         at LbjTagger.NerTagger.main(NerTagger.java:109)
>> >>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> >>>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>> >>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> >>>         at java.lang.reflect.Method.invoke(Method.java:597)
>> >>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:165)
>> >>>         at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
>> >>>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>> >>>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>> >>>         at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)
>> >>>
>> >>> 09/06/16 12:34:41 WARN mapred.LocalJobRunner: job_local_0001
>> >>> java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
>> >>>         at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:81)
>> >>>         at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
>> >>>         at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:58)
>> >>>         at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:83)
>> >>>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:328)
>> >>>         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:138)
>> >>> Caused by: java.lang.reflect.InvocationTargetException
>> >>>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>> >>>         at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>> >>>         at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>> >>>         at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>> >>>         at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:79)
>> >>>         ... 5 more
>> >>> Caused by: java.lang.ThreadDeath
>> >>>         at java.lang.Thread.stop(Thread.java:715)
>> >>>         at org.apache.hadoop.mapred.LocalJobRunner.killJob(LocalJobRunner.java:310)
>> >>>         at org.apache.hadoop.mapred.JobClient$NetworkedJob.killJob(JobClient.java:315)
>> >>>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1224)
>> >>>         at LbjTagger.NerTagger.main(NerTagger.java:109)
>> >>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> >>>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>> >>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> >>>         at java.lang.reflect.Method.invoke(Method.java:597)
>> >>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:165)
>> >>>         at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
>> >>>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>> >>>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>> >>>         at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)
>> >>>
>> >>> --
>> >>> View this message in context:
>> >>> http://www.nabble.com/Nor-%22OOM-Java-Heap-Space%22-neither-%22GC-OverHead-Limit-Exeeceded%22-tp24059508p24059508.html
>> >>> Sent from the Hadoop core-user mailing list archive at Nabble.com.
>> >>
>> >>
>> >> --
>> >> Pro Hadoop, a book to guide you from beginner to hadoop mastery,
>> >> http://www.amazon.com/dp/1430219424?tag=jewlerymall
>> >> www.prohadoopbook.com a community for Hadoop Professionals
>> >
>>
>> --
>> View this message in context:
>> http://www.nabble.com/Nor-%22OOM-Java-Heap-Space%22-neither-%22GC-OverHead-Limit-Exeeceded%22-tp24059508p24066426.html
>> Sent from the Hadoop core-user mailing list archive at Nabble.com.
>
>
> --
> Pro Hadoop, a book to guide you from beginner to hadoop mastery,
> http://www.amazon.com/dp/1430219424?tag=jewlerymall
> www.prohadoopbook.com a community for Hadoop Professionals

--
View this message in context:
http://www.nabble.com/Nor-%22OOM-Java-Heap-Space%22-neither-%22GC-OverHead-Limit-Exeeceded%22-tp24059508p24074211.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.
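A note on Jason's heartbeat point, since it would also explain the ThreadDeath/killJob trace above: the framework treats a task as hung if it reports no progress within the task timeout (600 seconds by default), and model loading inside the Map constructor reports nothing. The usual old-API fixes are to do heavy setup in configure(JobConf) and to call reporter.progress() during long-running work. The mechanism can be sketched in plain Java; the Watchdog class below is a made-up stand-in for the framework's check, not a Hadoop API:

```java
public class HeartbeatSketch {
    // Hypothetical stand-in for the framework's liveness check:
    // a task is presumed dead if it has not pinged within the timeout.
    static class Watchdog {
        private final long timeoutMillis;
        private volatile long lastPing;

        Watchdog(long timeoutMillis) {
            this.timeoutMillis = timeoutMillis;
            this.lastPing = System.currentTimeMillis();
        }

        // Analogous to calling reporter.progress() from the task.
        void progress() {
            lastPing = System.currentTimeMillis();
        }

        boolean expired() {
            return System.currentTimeMillis() - lastPing > timeoutMillis;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Watchdog silent = new Watchdog(150);   // Hadoop's real default is 600 s
        Watchdog chatty = new Watchdog(150);

        // Simulate a ~300 ms "model load"; only one task reports during it.
        for (int i = 0; i < 3; i++) {
            Thread.sleep(100);
            chatty.progress();                 // keeps heartbeating while working
        }

        System.out.println("silent expired: " + silent.expired());
        System.out.println("chatty expired: " + chatty.expired());
    }
}
```

The silent task is the one that gets killed, which matches a long, non-reporting constructor; the chatty one survives the same amount of work.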