> -----Original Message-----
> From: Doug Cutting [mailto:[EMAIL PROTECTED] 
> Sent: Monday, April 17, 2006 1:27 PM
> To: [email protected]
> Subject: Re: Using Nutch's distributed search server mode
>
> Search performance is not good with DFS-based indexes & segments.  
> This is not recommended.
>
> Distributed search is not meant for a single merged index, but 
> rather for searching multiple indexes.  With distributed search, 
> each node will typically have (a local copy of) a few segments and 
> either a merged index for just those segments, or separate indexes 
> for each segment.

I don't quite understand how to set up distributed searching in
relation to DFS (the Tom White documents don't discuss this either).
As I understand it, there are three databases in Nutch:

1. Web database (dfs)
2. Segments (regular fs)
3. The index (regular fs)

From your message above, I assume that the segments and index go on the
regular file system and the web database is distributed across DFS. We
put only a portion of the segments and index on each node, and the
search is distributed from Tomcat to all the nodes at once.
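
To make sure I have the picture right, here is a minimal sketch of the
fan-out as I imagine it. This is not Nutch code: the node names, the
in-memory NODE_SHARDS table, and the search_node and distributed_search
functions are all made up for illustration, and the term-count scoring
is only a placeholder for real ranking.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the per-node indexes: each node holds only
# its own portion of the documents (doc id -> text).
NODE_SHARDS = {
    "node1": {"d1": "nutch distributed search", "d2": "tomcat front end"},
    "node2": {"d3": "search search index", "d4": "web database"},
}

def search_node(node, query, k):
    """Pretend remote call: score the node's local docs by term count."""
    hits = []
    for doc_id, text in NODE_SHARDS[node].items():
        score = text.split().count(query)
        if score > 0:
            hits.append((score, doc_id, node))
    return heapq.nlargest(k, hits)

def distributed_search(nodes, query, k=10):
    """Send the query to every node at once and merge the top k hits."""
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        partials = pool.map(lambda n: search_node(n, query, k), nodes)
        merged = [hit for part in partials for hit in part]
    return heapq.nlargest(k, merged)

# Both nodes answer: hits from each shard are merged by score.
results = distributed_search(["node1", "node2"], "search", k=2)

# Only node1 answers: whatever node2 held is missing from the results.
partial_results = distributed_search(["node1"], "search", k=2)
```

In this sketch the front end needs an explicit list of nodes, and a
node left out of that list contributes no hits at all.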

If we don't use DFS for the segments and index, we lose redundancy: if
a node goes down, we may lose the search results it serves. Is this
true?

Also, how does the Tomcat engine know which nodes to send the search to?

Thank you very much.


_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
