Thanks for your answer, Remi!
That is a different case, though. I run my crawls through Eclipse, not through
the standard script. In Run Configurations, on the Arguments tab, I added this
under VM arguments: -Xms512M -Xmx2048M.
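
One way to check whether those VM arguments actually reach the JVM that Eclipse launches is to print the heap limits from inside the process. The following is only a minimal sketch: the class name HeapCheck and the helper toMiB are made up for illustration and are not part of Nutch. It relies only on the standard java.lang.Runtime methods.

```java
// Hypothetical helper class (not part of Nutch): prints the heap limits
// of the JVM it runs in, so the effect of -Xms/-Xmx can be verified.
public class HeapCheck {

    // Converts a byte count to whole mebibytes (integer division).
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory(): upper bound the heap may grow to (driven by -Xmx).
        System.out.println("max heap (MiB):       " + toMiB(rt.maxMemory()));
        // totalMemory(): heap currently reserved by the JVM.
        System.out.println("committed heap (MiB): " + toMiB(rt.totalMemory()));
        // freeMemory(): unused part of the currently reserved heap.
        System.out.println("free heap (MiB):      " + toMiB(rt.freeMemory()));
    }
}
```

Running it with the same Run Configuration as the crawl should report a maximum close to 2048 MiB if -Xmx2048M is applied; the exact figure can be somewhat lower depending on the JVM and garbage collector.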

> Date: Fri, 21 Mar 2014 17:12:21 +0800
> Subject: Re: Java Heap Space error
> From: [email protected]
> To: [email protected]
> 
> Hi,
> 
> JAVA_HEAP_MAX value can be modified in the bin/nutch script
> 
> Remi
> 
> 
> On Thu, Mar 20, 2014 at 11:11 PM, Vangelis karv 
> <[email protected]> wrote:
> 
> > I managed to crawl again but I have something else now:
> >
> > https://www.dropbox.com/s/853xf1evi8sb51v/error .
> >
> > Also, I found this :
> > 2014-03-20 14:04:33,885 INFO  mapreduce.GoraRecordWriter - Flushing the
> > datastore after 20000 records.
> >
> > Thank you in advance!
> >
> > From: [email protected]
> > To: [email protected]
> > Subject: Java Heap Space error
> > Date: Thu, 20 Mar 2014 10:59:27 +0200
> >
> >
> >
> >
> > Hello everybody! Yesterday, I tried to run a crawl at depth 5 and topN
> > 120000. In the middle of the 5th depth I got this error:
> >
> > 2014-03-19 19:16:11,608 WARN  fetcher.FetcherJob - fetch of
> > http://www.weather.com/outlook/health/allergies/common/allergens/FL-allergen-716 failed
> > with: java.lang.OutOfMemoryError: Java heap space
> > 2014-03-19 19:16:11,608 INFO  fetcher.FetcherJob - fetching
> > http://www.weather.com/outlook/health/allergies/pollenalert/USCA9000 (queue
> > crawl delay=0ms)
> > 2014-03-19 19:16:22,291 ERROR http.Http - Failed with the following error:
> > java.lang.OutOfMemoryError: Java heap space
> > 2014-03-19 19:16:24,677 INFO  fetcher.FetcherJob - fetching
> > http://www.weather.com/outlook/recreation/outdoors/fishing/29547:21 (queue
> > crawl delay=0ms)
> > 2014-03-19 19:16:24,677 WARN  fetcher.FetcherJob - fetch of
> > http://www.weather.com/outlook/health/allergies/pollenalert/USCA9000 failed
> > with: java.lang.OutOfMemoryError: Java heap space
> > 2014-03-19 19:16:33,550 ERROR http.Http - Failed with the following error:
> > java.lang.OutOfMemoryError: Java heap space
> > 2014-03-19 19:16:35,568 INFO  fetcher.FetcherJob - fetching
> > http://www.weather.com/outlook/health/allergies/common/allergens/NV-allergen-1187 (queue
> > crawl delay=0ms)
> > 2014-03-19 19:16:35,568 WARN  fetcher.FetcherJob - fetch of
> > http://www.weather.com/outlook/recreation/outdoors/fishing/29547:21 failed
> > with: java.lang.OutOfMemoryError: Java heap space
> > 2014-03-19 19:16:41,928 ERROR http.Http - Failed with the following error:
> > java.lang.OutOfMemoryError: Java heap space
> > 2014-03-19 19:16:43,535 INFO  fetcher.FetcherJob - fetching
> > http://www.weather.com/outlook/health/allergies/common/allergens/OH-allergen-928 (queue
> > crawl delay=0ms)
> > 2014-03-19 19:16:43,535 WARN  fetcher.FetcherJob - fetch of
> > http://www.weather.com/outlook/health/allergies/common/allergens/NV-allergen-1187 failed
> > with: java.lang.OutOfMemoryError: Java heap space
> > 2014-03-19 19:16:50,432 ERROR http.Http - Failed with the following error:
> > java.lang.OutOfMemoryError: Java heap space
> > 2014-03-19 19:16:50,888 WARN  fetcher.FetcherJob - fetch of
> > http://www.weather.com/outlook/health/allergies/common/allergens/OH-allergen-928 failed
> > with: java.lang.OutOfMemoryError: Java heap space
> > 2014-03-19 19:16:51,580 INFO  fetcher.FetcherJob - fetching
> > http://www.weather.com/outlook/health/allergies/common/allergens/FL-allergen-235 (queue
> > crawl delay=0ms)
> > 2014-03-19 19:16:53,120 ERROR http.Http - Failed with the following error:
> > 2014-03-19 19:16:53,711 INFO  fetcher.FetcherJob - fetching
> > http://www.weather.com/outlook/recreation/outdoors/fishing/27891:21 (queue
> > crawl delay=0ms)
> > 2014-03-19 19:16:54,659 INFO  fetcher.FetcherJob - -finishing thread
> > FetcherThread20, activeThreads=46
> > 2014-03-19 19:17:06,734 INFO  fetcher.FetcherJob - -finishing thread
> > FetcherThread48, activeThreads=44
> > 2014-03-19 19:17:08,348 ERROR http.Http - Failed with the following error:
> > java.lang.OutOfMemoryError: Java heap space
> >
> > As you can see, I have problems with the Java heap space. I ran this crawl
> > using Nutch 2.2.1, Eclipse and MySQL.
> >
> > Any ideas on how to solve this thing?
> > Recently, I changed metadata field from blob to longblob and put
> > http.content.limit to -1 (None of them caused any trouble so far though).
> >
> >
> >
                                          
