On Sat, Jul 18, 2009 at 00:02, MilleBii<[email protected]> wrote:
> Looks great my indexing is now working and I observe a constant memory usage
> instead of the ever-growing slope. Thx a lot, why is this patch not in the
> standard build ?
>

Because I never tested it very well, I never got around to committing the
patch. I will try to review it before 1.1 and hopefully include it in the
next release.

Anyway, I am glad it solves your problem.

> I just get some weird message in ANT/eclipse
> [jar] Warning: skipping jar archive
> C:\xxx\workspace\nutch\build\nutch-extensionpoints\nutch-extensionpoints.jar
> because no files were included.
> [jar] Building MANIFEST-only jar:
> C:\xxx\workspace\nutch\build\nutch-extensionpoints\nutch-extensionpoints.jar
> Not sure what that means.
>
>
> 2009/7/17 Doğacan Güney <[email protected]>
>
>> On Fri, Jul 17, 2009 at 00:30, MilleBii<[email protected]> wrote:
>> > Just trying indexing a smaller segment 300k URLs ... and the memory is just
>> > going up and up... but it does NOT hit the physical boundary limit. Sounds
>> > like a "memory leak" ???
>> > How come I thought Java was doing the garbage collection automatically ????
>> >
>>
>> Can you try the patch at
>>
>> https://issues.apache.org/jira/browse/NUTCH-356
>>
>> (try cache_classes.patch)
>>
>> >
>> > 2009/7/16 MilleBii <[email protected]>
>> >
>> >> I get more details now for my error.
>> >> What can I do about it, I have 4GB of memory, but it is not fully used (I
>> >> think).
>> >> I use cygwin/windows/local filesystem
>> >>
>> >> java.lang.OutOfMemoryError: Java heap space
>> >>     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:498)
>> >>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>> >>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:138)
>> >>
>> >> ---------- Forwarded message ----------
>> >> From: MilleBii <[email protected]>
>> >> Date: 2009/7/15
>> >> Subject: Errorr when using language-identifier plugin ?
>> >> To: [email protected]
>> >>
>> >>
>> >> I decided to add the language-identifier plugin... but I get the following
>> >> error when I start indexing my crawldb. Not really explicit.
>> >> If I remove it works just fine. I tried on a smaller crawl database that I
>> >> use for testing and it works fine too.
>> >> Any idea where to look for ?
>> >>
>> >>
>> >> 2009-07-15 16:19:54,875 WARN mapred.LocalJobRunner - job_local_0001
>> >> 2009-07-15 16:19:54,891 FATAL indexer.Indexer - Indexer: java.io.IOException: Job failed!
>> >>     at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1232)
>> >>     at org.apache.nutch.indexer.Indexer.index(Indexer.java:72)
>> >>     at org.apache.nutch.indexer.Indexer.run(Indexer.java:92)
>> >>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>> >>     at org.apache.nutch.indexer.Indexer.main(Indexer.java:101)
>> >>
>> >>
>> >> --
>> >> -MilleBii-
>> >>
>> >>
>> >> --
>> >> -MilleBii-
>> >>
>> >
>> >
>> > --
>> > -MilleBii-
>> >
>>
>>
>> --
>> Doğacan Güney
>>
>
>
> --
> -MilleBii-
>

--
Doğacan Güney
