I'm not sure whether you are commenting on how Katta handles the 
LoadBalancers part, but Katta doesn't do that as far as I know.  Passing the 
shard URLs in the request is the Solr way, but I think we concluded that shard 
URLs can also live in the handler's "defaults", no?
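
To make the two options concrete, here is a minimal sketch of what the client 
sends in each case. The host names, the query and the select_url helper are 
made up for illustration; only the "shards" request parameter is Solr's own.

```python
from urllib.parse import urlencode

# Hypothetical shard hosts, for illustration only.
SHARDS = ["host1:8983/solr", "host2:8983/solr"]


def select_url(base, query, shards=None):
    """Build a /select URL; add `shards` only when the client supplies it."""
    params = {"q": query}
    if shards:
        # Solr expects a comma-separated list of shard addresses.
        params["shards"] = ",".join(shards)
    return base + "/select?" + urlencode(params)


# Option 1: the client names the shards itself in every request.
explicit = select_url("http://host1:8983/solr", "ipod", SHARDS)

# Option 2: the client sends a plain query and stays unaware of the shards.
# This assumes the shard list is configured in the handler's "defaults".
implicit = select_url("http://host1:8983/solr", "ipod")
```

With the second option the client request looks the same whether or not the 
search is distributed.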


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch




________________________________
From: Noble Paul നോബിള്‍ नोब्ळ् <[EMAIL PROTECTED]>
To: solr-dev@lucene.apache.org
Sent: Wednesday, November 12, 2008 11:06:21 PM
Subject: Re: Katta's goodness for Solr

The way we do distributed search is not straightforward. Introducing
extra layers (LoadBalancers) in between the shards looks like a hack
to me. Moreover, passing in the shard URL in the request is not a
very nice design. Ideally, the clients should be unaware of the fact
that they are doing a distributed search.

We must move fast in order to catch up with the developments in other projects.

On Tue, Nov 11, 2008 at 11:45 PM, Otis Gospodnetic
<[EMAIL PROTECTED]> wrote:
> Quick thought.  I saw Stefan's Katta presentation last night.  Katta seems 
> nice and simple.  If I understood correctly, juicy stuff that is interesting 
> to Solr is:
> - Katta has a notion of a Primary Master and N Secondary Slaves (no SPOF 
> there)
> - Search Nodes serve index shards copied locally from some shared storage
> - Zookeeper instances (again Primary Master and N Secondary Slaves) that 
> facilitate communication among distributed components
>
> The master:
> -- knows how to distribute a set of index shards it is given across a number 
> of search nodes (distribution policy pluggable, similar to Hadoop's, but 
> different)
> -- has a map of which shard is on which search node (in Zookeeper)
> -- knows how to replicate each shard (replication factor configurable)
> -- knows when a search node goes down (via Zookeeper notification)
> -- knows how to create more replicas of the shards that were on a dead 
> search node (and remove extra replicas when the search node is revived)
> -- can notify search nodes when a new index is available (via Zookeeper)
>
> More in:
> http://joa23.files.wordpress.com/2008/09/katta-overview.pdf
>
> Paul Noble will like slide #13 ;)
>
> In particular, I think that:
> - Making use of Zookeeper for index snapshot + replication might be useful 
> (Master publishes the info about a new snapshot to Zookeeper and Search 
> Slaves get notified immediately and start copying the index)
> - Making use of Zookeeper for keeping a map of index shards + applying a 
> replication factor would be very useful
> - Making use of pluggable shard placement policy would be useful
>
> Thoughts?
>
> Also:
> While Katta provides shard->search server functionality via pluggable impl, 
> what both Solr and Katta are still missing is the doc->shard functionality.  
> However, this might not be terribly hard if we do something similar to 
> Katta's pluggable shard->search server distribution policy.  Please mind I'm 
> saying this without having looked at any of the Katta code.
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
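
For reference, the doc->shard functionality mentioned in the quoted message 
could be sketched roughly as below. This is only an illustration of a 
pluggable policy, not anything taken from Katta or Solr: the class name, the 
shard_for method and the choice of MD5 hashing are all assumptions.

```python
import hashlib


class HashShardPolicy:
    """Hypothetical doc->shard policy: hash the document id, mod shard count."""

    def __init__(self, num_shards):
        if num_shards < 1:
            raise ValueError("num_shards must be at least 1")
        self.num_shards = num_shards

    def shard_for(self, doc_id):
        # Use a stable hash so every indexing client picks the same shard
        # for the same document id, across processes and restarts.
        digest = hashlib.md5(doc_id.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.num_shards


policy = HashShardPolicy(num_shards=4)
shard = policy.shard_for("doc-42")
```

A different policy (by date range, by customer, etc.) could be swapped in as 
long as it offers the same shard_for method. Note that with plain modulo 
hashing, changing the number of shards remaps most documents.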



-- 
--Noble Paul
