That issue gets a lot of discussion on this list and some folks
have come up with their own workarounds.  Those generally involve
different implementations of the search bean.  I haven't heard of any
definitive solution for the next release.

Jake.

-----Original Message-----
From: Sugra Llistaire [mailto:[EMAIL PROTECTED] 
Sent: Thursday, February 23, 2006 10:00 AM
To: [email protected]
Subject: Re: Simple indexation and reindexation


Thanks, Jacob, for the help.
It is a pity that the results of the previous crawl must be removed,
especially because restarting the container (JBoss, in my case) is a
problem.
Is this behavior inherited from Lucene? Or might it be improved in the
future?

Thanks again.

Vanderdray, Jacob wrote:

>       If you look at the section of the tutorial for doing intranet
>crawls, you should be able to use that for your small number of
>websites.  The bin/nutch script wraps up all the crawl functions for you
>(fetching, indexing, deduping, etc).  You'll just need to delete the
>results of your previous day's crawl, copy over the results of the new
>crawl and restart tomcat each night.
>
>Jake.
>
>-----Original Message-----
>From: Sugra Llistaire [mailto:[EMAIL PROTECTED] 
>Sent: Thursday, February 23, 2006 4:55 AM
>To: [email protected]
>Subject: Simple indexation and reindexation
>
>
>Hello,
>I have a small number of websites to be indexed. Formerly, my search 
>engine was udmGoSearch. But I'm glad to see there is this J2EE search 
>engine.
>However, I'm trying to emulate the udm search process with Nutch, and it 
>doesn't seem to be possible.
>
>The system was simple.
>First day, I indexed the web site.
>Nightly, I executed a script to reindex the website.
>I didn't have to think about fetching, deduplicating, or injecting. All of 
>this was handled by udm's script.
>Of course, reconfiguring the URL filters and similar settings is
>unavoidable.
>Is it possible to use Nutch with this simple process? Has anyone 
>implemented a script that does this whole job? I am looking for an initial 
>indexing script and a nightly reindexing script.
>
>Thanks in advance.
>  
>
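
The nightly routine Jake describes (run the crawl, delete the previous day's results, copy over the new ones, restart Tomcat) could be sketched roughly as follows. This is only an illustration under stated assumptions: the install paths, directory names, and depth are hypothetical; the `bin/nutch crawl <urls> -dir <dir> -depth <n>` form is taken from the intranet crawl tutorial of that era and should be checked against your Nutch version; and stopping Tomcat before the swap, rather than only restarting it afterwards, is a choice made here, not something stated in the thread.

```python
import shutil
import subprocess
from pathlib import Path


def nightly_commands(nutch_home, url_list, new_dir, depth, tomcat_home):
    """Return the external commands for one nightly cycle, in order.

    All paths are hypothetical; adjust them to the local installation.
    """
    return [
        # Crawl into a fresh directory (fetching, indexing, deduping).
        [f"{nutch_home}/bin/nutch", "crawl", url_list,
         "-dir", new_dir, "-depth", str(depth)],
        # Standard Tomcat stop and start scripts.
        [f"{tomcat_home}/bin/shutdown.sh"],
        [f"{tomcat_home}/bin/startup.sh"],
    ]


def swap_crawl(live_dir, new_dir):
    """Delete the previous crawl results and move the new ones into place."""
    live, new = Path(live_dir), Path(new_dir)
    if live.exists():
        shutil.rmtree(live)
    shutil.move(str(new), str(live))


def run_nightly(nutch_home, url_list, live_dir, new_dir, depth, tomcat_home):
    """Run one full cycle: crawl, stop Tomcat, swap results, start Tomcat."""
    crawl, stop, start = nightly_commands(
        nutch_home, url_list, new_dir, depth, tomcat_home)
    subprocess.run(crawl, check=True)   # abort if the crawl fails
    subprocess.run(stop)                # may fail if Tomcat is not running
    swap_crawl(live_dir, new_dir)
    subprocess.run(start, check=True)
```

A cron entry could then call `run_nightly(...)` once per night. Because the crawl goes into a separate directory, a failed crawl leaves the previous day's results in place.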
