Re: anyone use hadoop+solr?

Yonik Seeley Mon, 06 Sep 2010 07:05:49 -0700

On Mon, Sep 6, 2010 at 9:47 AM, MitchK <mitc...@web.de> wrote:
> are there any discussions about SolrCloud-indexing?


Not recently - personally I've been sidetracked by other stuff.

Mapping docs to shards is the easy part... take a hash of the id, and
then I imagine the shard id (the label for the index) can just be a
range of hashes (e.g. 0x00000000-0xffffffff would be the entire
collection).
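
A rough sketch of that idea in Python, to make the hash-range labels
concrete. Everything here is an assumption for illustration: the
function names are made up, CRC32 is just a convenient 32-bit hash, and
the even split of the range is one possible layout, not anything Solr
defines.

```python
import zlib

HASH_SPACE = 1 << 32  # 32-bit hash space, 0x00000000-0xffffffff


def hash_id(doc_id):
    """Hash a document id into the 32-bit space (CRC32 is a stand-in)."""
    return zlib.crc32(doc_id.encode("utf-8")) & 0xFFFFFFFF


def make_shards(n):
    """Split the whole hash space into n contiguous, inclusive (lo, hi) ranges."""
    bounds = [i * HASH_SPACE // n for i in range(n + 1)]
    return [(bounds[i], bounds[i + 1] - 1) for i in range(n)]


def shard_label(shard):
    """Render a range as a shard id, e.g. '00000000-3fffffff'."""
    lo, hi = shard
    return "%08x-%08x" % (lo, hi)


def shard_for(doc_id, shards):
    """Return the (lo, hi) range whose span contains the hash of doc_id."""
    h = hash_id(doc_id)
    for lo, hi in shards:
        if lo <= h <= hi:
            return (lo, hi)
    raise ValueError("hash %08x not covered by any shard" % h)
```

With a single shard the label is the entire collection
(00000000-ffffffff); with four shards each one covers a quarter of the
range, and the same id always lands in the same shard.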

The harder part in all this is actually handling updates.
For high availability, updates need to go to multiple hosts (the
replication factor).
But some hosts may be down, and updates may go to any shard replica.
The shards/nodes need to be able to exchange info and fix things up.
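
To illustrate the problem (not a design), here is a toy sketch in
Python. It assumes each update carries a version number supplied by the
caller and that the newest version wins; the Replica class and the
write/sync helpers are hypothetical names, and a real system would need
far more (failure detection, conflict handling, deletes).

```python
class Replica:
    """One copy of a shard, holding doc id -> (version, document)."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.docs = {}

    def apply(self, doc_id, version, doc):
        """Keep the update only if it is newer than what is already held."""
        current = self.docs.get(doc_id)
        if current is None or version > current[0]:
            self.docs[doc_id] = (version, doc)


def write(replicas, doc_id, version, doc):
    """Send an update to every replica that is up; return how many took it."""
    acked = 0
    for replica in replicas:
        if replica.up:
            replica.apply(doc_id, version, doc)
            acked += 1
    return acked


def sync(a, b):
    """Exchange contents so both replicas end up with the newest versions."""
    for doc_id, (version, doc) in list(a.docs.items()):
        b.apply(doc_id, version, doc)
    for doc_id, (version, doc) in list(b.docs.items()):
        a.apply(doc_id, version, doc)
```

For example, with three replicas and one marked down, write() reaches
only two of them; once the third is back up, sync() against either peer
brings it up to date.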

> I would be glad to join them, if I find some interesting papers about that
> topic.

Awesome!  Here's some stuff to browse:
http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html
http://sourceforge.net/mailarchive/forum.php?forum_name=bailey-developers
http://cassandra.apache.org/


-Yonik
http://lucenerevolution.org  Lucene/Solr Conference, Boston Oct 7-8
