howard chen wrote:
Can you suggest if using Hadoop + Lucene, how to make a simple
distributed indexing & searching program, i.e. what are the mapping /
reducing processes involved in both indexing and searching?

There is not yet a universal best practice for this.

Nutch provides an example of how to use Lucene for distributed indexing. Nutch's current distributed search implementation builds on Hadoop's RPC mechanism, but is not based on Hadoop's MapReduce.

http://lucene.apache.org/nutch/apidocs/org/apache/nutch/searcher/DistributedSearch.html
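To make the general shape concrete, here is a toy sketch in plain Python. It is not Nutch, Hadoop, or Lucene code, and every name in it (partition_docs, build_index, search_shard, distributed_search) is hypothetical. It assumes documents are routed to shards by doc id, each shard builds its own inverted index, and a query is sent to every shard with the per-shard top hits merged at the end. The raw term count used as a score is only a stand-in for real ranking.

```python
import heapq
from collections import defaultdict


def partition_docs(docs, num_shards):
    """Map-like step: route each (doc_id, text) pair to a shard by doc_id."""
    shards = defaultdict(list)
    for doc_id, text in docs:
        shards[doc_id % num_shards].append((doc_id, text))
    return shards


def build_index(shard_docs):
    """Reduce-like step: build one inverted index (term -> {doc_id: count})."""
    index = defaultdict(dict)
    for doc_id, text in shard_docs:
        for term in text.lower().split():
            index[term][doc_id] = index[term].get(doc_id, 0) + 1
    return index


def search_shard(index, term, k):
    """Return this shard's top-k (score, doc_id) hits; score is the term count."""
    postings = index.get(term.lower(), {})
    return heapq.nlargest(k, ((count, doc_id) for doc_id, count in postings.items()))


def distributed_search(indexes, term, k):
    """Query every shard and merge the per-shard hits into a global top-k.

    The loop is sequential here; in a real deployment each call would be
    a remote request to a search node.
    """
    all_hits = []
    for index in indexes:
        all_hits.extend(search_shard(index, term, k))
    return heapq.nlargest(k, all_hits)


docs = [
    (0, "hadoop runs mapreduce jobs"),
    (1, "lucene builds an index and lucene searches it"),
    (2, "nutch uses lucene and hadoop"),
    (3, "lucene lucene lucene"),
]
shards = partition_docs(docs, num_shards=2)
indexes = [build_index(shards[i]) for i in range(2)]
top_hits = distributed_search(indexes, "lucene", k=2)
```

Each shard only needs to return its own top k hits, since no document outside a shard's local top k can appear in the global top k. That keeps the merge step cheap regardless of index size.
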

There has been some discussion of MapReduce-based distributed search on the Nutch lists, e.g.:

http://mail-archives.apache.org/mod_mbox/lucene-nutch-user/200604.mbox/[EMAIL PROTECTED]

I think Andrzej Bialecki has explored this approach to some extent.

Another approach is to build a non-MapReduce-based system specifically for supporting distributed search and indexing. I started a discussion about this a few months ago and hope to start work on this project before long.

http://www.nabble.com/-PROPOSAL--index-server-project-tf2469695.html

Doug


