Re: Fetcher keeps having heap-size problems

2011-06-16 Thread Markus Jelsma
Are you parsing during the fetch cycle? On Thursday 16 June 2011 13:53:22 MilleBii wrote: The fetcher is causing weird problems on the master node only, and not on the data node, despite the following actions: + increased HADOOP_HEAPSIZE from 1000 to 2000 + reduced the number of threads +

Problem with Nutch Search

2011-06-16 Thread Jefferson
Hi, I'm testing Nutch. I followed the Nutch tutorial, but I ran into a problem. I ran the command bin/nutch crawl on 6 plain-text sites containing only about 400 lines of text; so far, so good. When I do a search with Nutch, it scans about 50 lines, and after that it does not scan

Re: Problem with Nutch Search

2011-06-16 Thread lewis john mcgibbney
Off the top of my head, one property springs to mind, which you may or may not have configured in nutch-site: http.content.limit. However, I think that this is not the source of the problem. I would advise you to have a look at your hadoop log file for any obvious warnings... how do you know he
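
For readers following up on the http.content.limit suggestion, here is a minimal sketch of how that property could be overridden. It assumes the override goes in conf/nutch-site.xml and that the stock default of 65536 bytes from nutch-default.xml is in effect. The value -1 (no truncation) is only an illustration, not something the thread recommends, and the reply itself doubts this setting is the cause.

```xml
<?xml version="1.0"?>
<!-- Assumed location: conf/nutch-site.xml (values here override nutch-default.xml) -->
<configuration>
  <property>
    <name>http.content.limit</name>
    <!-- Illustrative value: a negative number disables truncation;
         a non-negative number caps downloaded content at that many bytes -->
    <value>-1</value>
  </property>
</configuration>
```

Truncation is applied when a page is fetched, so pages that were already crawled would most likely need to be fetched again before a changed limit shows up in search results.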