I believe there is a subproject over at Hadoop for doing distributed
work with Lucene, but I am not sure whether it covers the search side
or only indexing. I was always under the impression that Hadoop was too
slow for the search side, since I don't think even Nutch uses it there,
but I don't know if that is still the case.
On Jul 10, 2008, at 10:16 PM, Jason Rutherglen wrote:
Has anyone taken a look at using Hadoop RPC for enabling distributed
Lucene? I am thinking it would implement the Searchable interface
and use serialization to be compatible with the current RMI
version. That somewhat defeats the purpose of using Hadoop RPC and
serialization; however, Hadoop RPC scales far beyond what RMI can at
the networking level. RMI uses a thread per socket and
reportedly has latency issues. Hadoop RPC uses NIO and is proven to
scale to thousands of servers. Serialization unfortunately must be
used with Lucene due to the Weight, Query and Filter classes. There
could be an extended version of Searchable that allows passing
Weight, Query, and Filter classes that implement Hadoop's Writable
interface if a user wants to bypass using serialization.
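
For what it's worth, here is a rough sketch of what the Writable route
might look like for a single query type. Everything in it is an
assumption made for illustration: TermQueryWritable, toBytes and
fromBytes are hypothetical names, not Lucene or Hadoop classes, and the
nested Writable interface is a local stand-in that only mirrors the two
methods of org.apache.hadoop.io.Writable so the example compiles
without Hadoop on the classpath. It shows the round trip only, not the
RPC layer or an extended Searchable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

public class WritableTermQuerySketch {

    // Local stand-in mirroring the two methods of org.apache.hadoop.io.Writable.
    // With Hadoop available, the real interface would be implemented instead.
    interface Writable {
        void write(DataOutput out) throws IOException;
        void readFields(DataInput in) throws IOException;
    }

    // Hypothetical query object carrying only a field, a term, and a boost.
    static final class TermQueryWritable implements Writable {
        String field = "";
        String text = "";
        float boost = 1.0f;

        // No-arg constructor so the receiver can create an empty instance
        // and then populate it with readFields().
        TermQueryWritable() {}

        TermQueryWritable(String field, String text, float boost) {
            this.field = field;
            this.text = text;
            this.boost = boost;
        }

        @Override
        public void write(DataOutput out) throws IOException {
            out.writeUTF(field);
            out.writeUTF(text);
            out.writeFloat(boost);
        }

        @Override
        public void readFields(DataInput in) throws IOException {
            // Fields must be read in the same order they were written.
            field = in.readUTF();
            text = in.readUTF();
            boost = in.readFloat();
        }
    }

    // Encode any Writable into a byte array, as the sending side would.
    static byte[] toBytes(Writable w) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            w.write(out);
        }
        return buf.toByteArray();
    }

    // Decode a TermQueryWritable from bytes, as the receiving side would.
    static TermQueryWritable fromBytes(byte[] bytes) throws IOException {
        TermQueryWritable q = new TermQueryWritable();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            q.readFields(in);
        }
        return q;
    }

    public static void main(String[] args) throws IOException {
        TermQueryWritable sent = new TermQueryWritable("body", "hadoop", 2.0f);
        byte[] wire = toBytes(sent);
        TermQueryWritable received = fromBytes(wire);
        System.out.println(received.field + ":" + received.text + "^" + received.boost
                + " (" + wire.length + " bytes)");
    }
}
```

The point of the sketch is that each class writes exactly the fields it
needs in a fixed order, instead of relying on java.io.Serializable to
walk the object graph. Weight and Filter would presumably need the same
treatment, which is where most of the work would be.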