My earlier reply about searching multiple indexes with a single
instance of Nutch bounced because of an attachment.

To search multiple indexes with a single instance of Nutch:

- Modified web.xml to include the paths to the various search indexes
- Modified Nutch.java to read all the indexes and create IndexReaders
- Modified IndexSearcher.java to handle multiple IndexReaders

https://issues.apache.org/jira/browse/NUTCH-480 contains the patch.
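
For readers who want a feel for the approach without opening the patch,
here is a minimal, self-contained sketch of the two generic pieces
involved: reading a list of index paths from a single configuration
value, and merging the hits returned by several indexes into one ranked
list. This is not the code from NUTCH-480. The class name, the method
names, and the idea of a comma-separated parameter value are all
assumptions made for illustration, and it uses no Nutch or Lucene
classes.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: none of these names come from the NUTCH-480 patch.
class MultiIndexSketch {

    /** One search hit: the index it came from, the document URL, and its score. */
    static final class Hit {
        final String index;
        final String url;
        final float score;

        Hit(String index, String url, float score) {
            this.index = index;
            this.url = url;
            this.score = score;
        }
    }

    /**
     * Splits a comma-separated configuration value (assumed format) into
     * trimmed, non-empty index directory paths.
     */
    static List<String> parseIndexDirs(String paramValue) {
        List<String> dirs = new ArrayList<>();
        if (paramValue == null) {
            return dirs;
        }
        for (String part : paramValue.split(",")) {
            String dir = part.trim();
            if (!dir.isEmpty()) {
                dirs.add(dir);
            }
        }
        return dirs;
    }

    /**
     * Merges the hit lists of several indexes into a single list ordered
     * by descending score, keeping at most maxHits entries.
     */
    static List<Hit> merge(List<List<Hit>> perIndexHits, int maxHits) {
        List<Hit> all = new ArrayList<>();
        for (List<Hit> hits : perIndexHits) {
            all.addAll(hits);
        }
        all.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
        int limit = Math.max(0, Math.min(maxHits, all.size()));
        return new ArrayList<>(all.subList(0, limit));
    }
}
```

One caveat with this kind of merge: scores computed separately by each
index are not necessarily comparable with each other, so a plain sort
by score is only a rough ordering.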

The attached file contains the patch against the Nutch 0.8 code
base, together with the newly added files:

- SearchServlet - a servlet that provides the web interface for search.
It is a simplified version of the JSP pages (without the i18n) and
outputs the results in text, XML or JSON format.
- SearchConstants - an interface for messages and constants

Please note that the patch also includes spell-check functionality,
aka "Did you mean?"

With this implementation, you can add a check box to the search page
for each index that you host, by reading the list of indexes from the
web.xml file. With the check boxes, the user can narrow or widen the
search across the indexes. The results page can also display the
number of hits in each index.
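
The check-box idea can be sketched roughly as follows. Again, this is
an illustration and not code from the patch: the class and method names
are made up, and it assumes the checked boxes arrive as an array of
index names (for example from request.getParameterValues in a servlet,
which returns null when nothing is checked). If nothing is checked, the
sketch falls back to searching every configured index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: none of these names come from the NUTCH-480 patch.
class IndexSelectionSketch {

    /**
     * Returns the configured indexes the user ticked, in configured order.
     * Unknown names are ignored; no selection means "search everything".
     */
    static List<String> selectIndexes(List<String> configured, String[] checked) {
        if (checked == null || checked.length == 0) {
            return new ArrayList<>(configured);
        }
        Set<String> wanted = new HashSet<>(Arrays.asList(checked));
        List<String> selected = new ArrayList<>();
        for (String name : configured) {
            if (wanted.contains(name)) {
                selected.add(name);
            }
        }
        return selected;
    }

    /**
     * Counts hits per searched index. hitIndexNames holds, for each hit,
     * the name of the index it came from. Indexes with no hits report 0.
     */
    static Map<String, Integer> countHitsPerIndex(List<String> searched,
                                                  List<String> hitIndexNames) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String name : searched) {
            counts.put(name, 0);
        }
        for (String name : hitIndexNames) {
            if (counts.containsKey(name)) {
                counts.merge(name, 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

The per-index counts are kept in a LinkedHashMap so that the results
page can list them in the same order as the check boxes.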

Hope this helps.

- Ravi

On 5/3/07, visava <[EMAIL PROTECTED]> wrote:
>
> One way is to use separate crawls and indexes and store them in different
> directories
> e.g. /usr/home/idxdir1
>       /usr/home/idxdir2
>
> You can then use 2 different nutch-site.xml files (e.g. nutch-site.xml,
> nutch-site2.xml).
> For the first search you can use the default nutch-site.xml, point it to
> the first index directory, and use the default search.jsp that was
> provided.
>
> For searching the second index, use nutch-site2.xml and point it to the
> second index directory.
> Then use search2.jsp, which is a copy of search.jsp with the following
> modifications.
>
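> /*
>  Comment out this original line of code and use the code below.
>      Configuration nutchConf = NutchConfiguration.get(application);
> */
>
> // Reuse the configuration cached in the servlet context, if there is one.
> // getAttribute returns Object, so the cast is required.
> Configuration nutchConf = (Configuration) application.getAttribute("myconfig");
> if (nutchConf == null) {
>     // First request: build a configuration that reads nutch-site2.xml.
>     nutchConf = new Configuration();
>     nutchConf.addDefaultResource("nutch-default.xml");
>     nutchConf.addFinalResource("nutch-site2.xml");
>     application.setAttribute("myconfig", nutchConf);
> }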
>
> You can extend this idea to as many different indexes as you want. Note
> that I have used this with version 0.8.
> If you look at the source code for NutchConfiguration.java you will get an
> idea of what the code above does,
> and you can do something similar in other versions if it is different.
>
> Maher wrote:
> >
> > Hello everybody,
> >
> > I'm building a small document search engine using Nutch 0.7 and Tomcat
> > 5.5.16, and I'm wondering whether it can handle 3 different indexes (db),
> > one for each of the three types of documents I'm going to crawl, so that
> > I have three independent dbs and can search each of them from a
> > single front-end page.
> >
> > The main problem is the path to the index in nutch-site.xml
> > (searcher.dir): how do I use 3 different paths, etc.?
> >
> >  Thanks
> >
