Re: Spliting the Lucene

Doug Cutting Fri, 08 Dec 2006 12:22:10 -0800

howard chen wrote:

Can you suggest if using Hadoop + Lucene, how to make a simple
distributed indexing & searching program, i.e. what are the mapping /
reducing processes involved in both indexing and searching?


There is not yet a universal best practice for this.

Nutch provides an example of how to use Lucene for distributed indexing. Nutch's current distributed search implementation builds on Hadoop's RPC mechanism, but is not based on Hadoop's MapReduce.


http://lucene.apache.org/nutch/apidocs/org/apache/nutch/searcher/DistributedSearch.html
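
To make the RPC-style approach a little more concrete, here is a minimal sketch of the scatter-gather pattern such a search typically follows: the client sends the query to every index shard, each shard returns its local top hits, and the client merges them. This is not Nutch's code. The in-memory SHARDS data, the search_shard and distributed_search names, the thread pool standing in for RPC calls, and the term-count scoring are all assumptions made for illustration.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "shards": each maps doc id -> text.
# In a real deployment each search server would hold its own Lucene index.
SHARDS = [
    {"a1": "hadoop lucene search", "a2": "lucene index"},
    {"b1": "hadoop hadoop mapreduce", "b2": "nutch crawler"},
]

def search_shard(shard, term, k):
    """Return up to k (score, doc_id) hits from one shard.

    The score is simply how often the term occurs (an assumed toy ranking).
    """
    hits = []
    for doc_id, text in shard.items():
        score = text.split().count(term)
        if score > 0:
            hits.append((score, doc_id))
    return heapq.nlargest(k, hits)

def distributed_search(shards, term, k=3):
    """Query all shards in parallel (threads stand in for RPC), merge top k."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = list(pool.map(lambda s: search_shard(s, term, k), shards))
    merged = [hit for part in partials for hit in part]
    return heapq.nlargest(k, merged)

if __name__ == "__main__":
    print(distributed_search(SHARDS, "hadoop"))
```

Each shard only needs to return k hits, because no document outside a shard's local top k can appear in the global top k.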

There has been some discussion of MapReduce-based distributed search on the Nutch lists, e.g.:


http://mail-archives.apache.org/mod_mbox/lucene-nutch-user/200604.mbox/[EMAIL PROTECTED]

I think Andrzej Bialecki has explored this approach some.
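
For what it's worth, one way a MapReduce-style search could be laid out is roughly as follows: the map step runs a batch of queries against one index shard and emits (query, hit) pairs, and the reduce step keeps the best hits per query across all shards. This is only a sketch under assumptions, not what was proposed on the list and not the Hadoop API. The map_search, reduce_hits and run_job names, the in-process grouping that stands in for the framework's shuffle, and the term-count scoring are all hypothetical.

```python
import heapq
from collections import defaultdict

def map_search(shard, queries):
    """Map step: for one index shard, emit (query, (score, doc_id)) pairs."""
    for query in queries:
        for doc_id, text in shard.items():
            score = text.split().count(query)  # assumed toy ranking
            if score > 0:
                yield query, (score, doc_id)

def reduce_hits(query, hits, k):
    """Reduce step: keep the k best hits for one query across all shards."""
    return query, heapq.nlargest(k, hits)

def run_job(shards, queries, k=2):
    """Tiny in-process stand-in for the framework: map, group by key, reduce."""
    grouped = defaultdict(list)
    for shard in shards:
        for query, hit in map_search(shard, queries):
            grouped[query].append(hit)
    return dict(reduce_hits(q, hits, k) for q, hits in grouped.items())

if __name__ == "__main__":
    shards = [
        {"d1": "lucene index lucene", "d2": "hadoop job"},
        {"d3": "lucene search", "d4": "hadoop hadoop hadoop"},
    ]
    print(run_job(shards, ["lucene", "hadoop"]))
```

A layout like this suits batches of queries better than interactive search, since every query pays the cost of starting a job.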

Another approach is to build a non-MapReduce-based system specifically for supporting distributed search and indexing. I started a discussion about this a few months ago and hope to start work on this project before long.


http://www.nabble.com/-PROPOSAL--index-server-project-tf2469695.html

Doug


