Or look at Solr and its distributed search: http://search-lucene.com/?q=distributed+search&fc_project=Solr
It has nothing to do with Hadoop, though.

Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/

----- Original Message ----
> From: Ian Soboroff <ian.sobor...@nist.gov>
> To: common-user@hadoop.apache.org
> Sent: Mon, May 17, 2010 1:42:55 PM
> Subject: Re: Build a indexing and search service with Hadoop
>
> http://lucene.apache.org/java/3_0_1/api/all/org/apache/lucene/search/ParallelMultiSearcher.html
>
> Ian
>
> Aécio <aecio.sola...@gmail.com> writes:
>
> > Thanks for the replies.
> >
> > I'm already investigating how Katta works and how I can extend it.
> > What do you mean by distributed search capability? Does Lucene provide
> > any way to "merge" hits from different indexes?
> >
> > 2010/5/14 Ian Soboroff <ian.sobor...@nist.gov>
> >
> >> Aécio <aecio.sola...@gmail.com> writes:
> >>
> >> > 2. Search
> >> > - The query received is used as input of the map function. This function
> >> > would search the document on the local shard using our custom library
> >> > and emit the hits. The reduce function would group the hits from all
> >> > shards.
> >>
> >> There is no way you can do interactive searches via MapReduce in Hadoop,
> >> because the JVM start time will kill you. If your shard backend is
> >> Lucene, just use the distributed search capability already there, or
> >> look at Katta.
> >>
> >> Ian
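
On the quoted question about "merging" hits from different indexes: here is a minimal sketch, in plain Java, of the general idea of combining per-shard result lists into one global top-k. It is not the code behind Lucene's ParallelMultiSearcher, Solr, or Katta. The names ShardHitMerger, Hit and mergeTopK are hypothetical and exist only for this illustration. It also assumes that scores from different shards are directly comparable, which may not hold when each shard computes its own term statistics.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

public class ShardHitMerger {

    /** Hypothetical hit type: shard + docId identify the document, score ranks it. */
    public static final class Hit {
        public final String shard;
        public final int docId;
        public final float score;

        public Hit(String shard, int docId, float score) {
            this.shard = shard;
            this.docId = docId;
            this.score = score;
        }

        @Override
        public String toString() {
            return shard + "/" + docId + " (" + score + ")";
        }
    }

    /**
     * Merges per-shard hit lists and returns the k highest-scoring hits,
     * best first. Assumes scores are comparable across shards.
     */
    public static List<Hit> mergeTopK(List<List<Hit>> perShardHits, int k) {
        if (k <= 0) {
            return new ArrayList<>();
        }
        // Min-heap on score: the weakest hit kept so far sits at the head,
        // so it is the one evicted when the heap grows beyond k entries.
        PriorityQueue<Hit> heap =
                new PriorityQueue<>((a, b) -> Float.compare(a.score, b.score));
        for (List<Hit> shardHits : perShardHits) {
            for (Hit hit : shardHits) {
                heap.offer(hit);
                if (heap.size() > k) {
                    heap.poll();
                }
            }
        }
        // Heap iteration order is unspecified, so sort explicitly, best first.
        List<Hit> merged = new ArrayList<>(heap);
        merged.sort((a, b) -> Float.compare(b.score, a.score));
        return merged;
    }

    public static void main(String[] args) {
        // Made-up hits from two hypothetical shards.
        List<Hit> shardA = Arrays.asList(new Hit("A", 1, 0.9f), new Hit("A", 2, 0.4f));
        List<Hit> shardB = Arrays.asList(new Hit("B", 7, 1.2f), new Hit("B", 8, 0.5f));
        for (Hit hit : mergeTopK(Arrays.asList(shardA, shardB), 3)) {
            System.out.println(hit);
        }
    }
}
```

In a real deployment each shard would be searched in parallel by a long-running server process and only its top k hits sent back for merging, which is what avoids the per-query JVM start cost mentioned in the quoted reply.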