Re: [Nutch-general] OutOfMemoryError - Why should the while(1) loop stop?

2007-05-31 Thread Vishal Shah
Hi Manoharam, You can use the parse command to parse a segment after it has been fetched with the -noParsing option. The result will be equivalent to running fetch without the -noParsing option. In your Nutch installation directory, try the command bin/nutch. It will give you the usage for the parse
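
A minimal sketch of the fetch-then-parse sequence described in this reply. The function name fetch_then_parse, the NUTCH variable, and the default thread count are assumptions made for illustration; only the fetch and parse commands and the -noParsing option come from the thread.

```shell
# Path to the nutch launcher; overridable. The variable itself is an assumption.
NUTCH="${NUTCH:-bin/nutch}"

# Hypothetical helper: fetch a segment without parsing, then parse it separately.
#   $1 = segment directory, $2 = fetcher thread count (default 10, an assumption)
fetch_then_parse() {
  segment="$1"
  threads="${2:-10}"

  # Fetch only; parsing is deferred by -noParsing.
  "$NUTCH" fetch "$segment" -threads "$threads" -noParsing || return 1

  # Parse the already-fetched segment as its own step.
  "$NUTCH" parse "$segment"
}
```

It could be called as fetch_then_parse "$seg1" 50, where seg1 is the segment directory picked by the recrawl script.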

Re: [Nutch-general] OutOfMemoryError - Why should the while(1) loop stop?

2007-05-31 Thread Manoharam Reddy
Thanks. I do my crawl using the Intranet Recrawl script available in the wiki. I have put these statements in a loop iterating 10 times.
1. bin/nutch generate crawl/crawldb crawl/segments -topN 1000
2. seg1=`ls -d crawl/segments/* | tail -1`
3. bin/nutch fetch $seg1 -threads 50
4. bin/nutch
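
The loop described in this message might look roughly like the sketch below. Step 4 is cut off in the archive, so the updatedb call is an assumption based on the "generate, fetch and update" description elsewhere in the thread. The function name crawl_rounds, the NUTCH variable, and the stop-on-failure checks are also illustrative and are not part of the original script.

```shell
# Path to the nutch launcher; overridable. The variable itself is an assumption.
NUTCH="${NUTCH:-bin/nutch}"

# Hypothetical helper: run generate / fetch / update for a number of rounds.
#   $1 = crawldb directory, $2 = segments directory, $3 = rounds (default 10)
crawl_rounds() {
  crawldb="$1"
  segments="$2"
  rounds="${3:-10}"

  i=1
  while [ "$i" -le "$rounds" ]; do
    # Step 1: generate a new segment of up to 1000 URLs.
    "$NUTCH" generate "$crawldb" "$segments" -topN 1000 || return 1

    # Step 2: pick the most recently created segment.
    seg=$(ls -d "$segments"/* | tail -1)

    # Step 3: fetch it with 50 threads.
    "$NUTCH" fetch "$seg" -threads 50 || return 1

    # Step 4 (assumed): update the crawldb from the fetched segment.
    "$NUTCH" updatedb "$crawldb" "$seg" || return 1

    i=$((i + 1))
  done
}
```

With the paths from the message, this would be called as crawl_rounds crawl/crawldb crawl/segments 10.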

Re: [Nutch-general] OutOfMemoryError - Why should the while(1) loop stop?

2007-05-31 Thread Doğacan Güney
On 5/31/07, Manoharam Reddy [EMAIL PROTECTED] wrote: Thanks. I do my crawl using the Intranet Recrawl script available in the wiki. I have put these statements in a loop iterating 10 times. 1. bin/nutch generate crawl/crawldb crawl/segments -topN 1000 2. seg1=`ls -d crawl/segments/* |

[Nutch-general] OutOfMemoryError - Why should the while(1) loop stop?

2007-05-30 Thread Manoharam Reddy
Time and again I get this error, and as a result the segment remains incomplete. This wastes one iteration of the for() loop in which I am doing generate, fetch and update. Can someone please tell me what measures I can take to avoid this error? And isn't it possible to make some code

Re: [Nutch-general] OutOfMemoryError - Why should the while(1) loop stop?

2007-05-30 Thread Doğacan Güney
On 5/30/07, Manoharam Reddy [EMAIL PROTECTED] wrote: Time and again I get this error and as a result the segment remains incomplete. This wastes one iteration of the for() loop in which I am doing generate, fetch and update. Can someone please tell me what are the measures I can take to avoid

Re: [Nutch-general] OutOfMemoryError - Why should the while(1) loop stop?

2007-05-30 Thread Manoharam Reddy
If I run the fetcher in non-parsing mode, how can I later parse the pages so that ultimately, when a user searches in the Nutch search engine, he can see the content of PDF files, etc., as a summary? Please help or point me to proper articles or a wiki where I can learn this. On 5/30/07, Doğacan Güney [EMAIL