> -----Original Message-----
> From: Doug Cutting [mailto:[EMAIL PROTECTED]
> Sent: Monday, April 17, 2006 1:27 PM
> To: [email protected]
> Subject: Re: Using Nutch's distributed search server mode
>
> Search performance is not good with DFS-based indexes & segments.
> This is not recommended.
>
> Distributed search is not meant for a single merged index, but
> rather for searching multiple indexes. With distributed search,
> each node will typically have (a local copy of) a few segments and
> either a merged index for just those segments, or separate indexes
> for each segment.

I don't quite understand how to set up distributed searching in relation to DFS (and the Tom White documents don't discuss this either). There are three databases in relation to Nutch:

1. Web database (dfs)
2. Segments (regular fs)
3. The index (regular fs)

From your message above, I assume that the segments and index go in the regular file system and the web database is distributed across DFS. We put only a portion of the segments and index on each node, and the search is distributed from Tomcat to all the nodes at once.

If we don't use DFS for the segments and index, we lose the redundancy: if a node is dead, we may lose search results. Is this true?

Also, how does the Tomcat engine know which nodes to send the search to?

Thank you very much.

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
