Hi everyone! While reading the Nutch source code, I ran into one thing that confused me. It concerns distributed search.
As I understand it, Nutch implements distributed search by dividing one big index into many small indexes and then searching them in parallel on several machines. But in the code I saw the following:

    public IndexSearcher(Path[] indexDirs, Configuration conf) throws IOException {
      IndexReader[] readers = new IndexReader[indexDirs.length];
      this.conf = conf;
      this.fs = FileSystem.get(conf);
      for (int i = 0; i < indexDirs.length; i++) {
        readers[i] = IndexReader.open(getDirectory(indexDirs[i]));
      }
      init(new MultiReader(readers), conf);
    }

In a DFS environment, the index files have probably been stored on other machines, so reading them means going over the network. Here is the problem: the network is much slower than a local disk, so I think the search will take too long. I guess the developers have considered this issue, and I’m eager to know how it works.

-- 
View this message in context: http://www.nabble.com/About-Nutch-distributed-search-implement-tp21306743p21306743.html
Sent from the Nutch - Dev mailing list archive at Nabble.com.
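For illustration, the split-and-search-in-parallel model described in this message could be sketched roughly as follows. This is not Nutch code and makes no claim about how Nutch actually does it: ShardedSearchSketch, Shard, and Hit are hypothetical names, and each Shard simply stands in for a machine that searches one small index on its own local disk.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: query several small indexes in parallel and merge the results.
final class ShardedSearchSketch {

    /** One scored hit; "shard" records which small index produced it. */
    static final class Hit {
        final int shard;
        final String doc;
        final double score;

        Hit(int shard, String doc, double score) {
            this.shard = shard;
            this.doc = doc;
            this.score = score;
        }
    }

    /** Assumed stand-in for a machine that searches one small index locally. */
    interface Shard {
        List<Hit> search(String query, int topK);
    }

    /** Sends the query to every shard in parallel, then keeps the best topK hits overall. */
    static List<Hit> search(List<Shard> shards, String query, int topK) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, shards.size()));
        try {
            List<Callable<List<Hit>>> tasks = new ArrayList<>();
            for (Shard shard : shards) {
                tasks.add(() -> shard.search(query, topK));
            }
            List<Hit> merged = new ArrayList<>();
            // invokeAll blocks until every shard has answered.
            for (Future<List<Hit>> result : pool.invokeAll(tasks)) {
                merged.addAll(result.get());
            }
            // Highest score first, then truncate to the requested size.
            merged.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
            return new ArrayList<>(merged.subList(0, Math.min(topK, merged.size())));
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("search interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a shard failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

In this model only the query and the small per-shard hit lists cross the network, while the index data is read from local disk. That is the opposite of opening every index directory through the DFS from one process, which is what the constructor quoted above appears to do.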