I'm not sure whether your LoadBalancers comment is about how Katta does things, but as far as I know Katta doesn't do that. Passing the shard URLs in the request is the Solr thing, but I think we concluded that shard URLs can also live in the handler's "defaults", no?

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

________________________________
From: Noble Paul നോബിള് नोब्ळ् <[EMAIL PROTECTED]>
To: solr-dev@lucene.apache.org
Sent: Wednesday, November 12, 2008 11:06:21 PM
Subject: Re: Katta's goodness for Solr

The way we do distributed search is not straightforward. Introducing extra layers (LoadBalancers) in between the shards looks like a hack to me. Moreover, passing in the shard URL in the request is not a very nice design. The clients should ideally be unaware of the fact that they are doing a distributed search.

We must move fast in order to catch up with the developments in other projects.

On Tue, Nov 11, 2008 at 11:45 PM, Otis Gospodnetic <[EMAIL PROTECTED]> wrote:
> Quick thought. I saw Stefan's Katta presentation last night. Katta seems
> nice and simple. If I understood correctly, juicy stuff that is interesting
> to Solr is:
> - Katta has a notion of a Primary Master and N Secondary Slaves (no SPOF there)
> - Search Nodes serve index shards copied locally from some shared storage
> - Zookeeper instances (again Primary Master and N Secondary Slaves) that
>   facilitate communication among distributed components
>
> The master:
> -- knows how to distribute a set of index shards it is given across a number
>    of search nodes (distribution policy pluggable, similar to Hadoop's, but different)
> -- has a map of which shard is on which search node (in Zookeeper)
> -- knows how to replicate each shard (replication factor configurable)
> -- knows when a search node goes down (via Zookeeper notification)
> -- knows how to create more replicas of shards on dead search node (and
>    remove extra replicas when search node is revived)
> -- can notify search nodes when a new index is available (via Zookeeper)
>
> More in:
> http://joa23.files.wordpress.com/2008/09/katta-overview.pdf
>
> Paul Noble will like slide #13 ;)
>
> In particular, I think that:
> - Making use of Zookeeper for index snapshot + replication might be useful
>   (Master publishes the info about a new snapshot to Zookeeper and Search Slaves
>   get notified immediately and start copying the index)
> - Making use of Zookeeper for keeping a map of index shards + applying a
>   replication factor would be very useful
> - Making use of pluggable shard placement policy would be useful
>
> Thoughts?
>
> Also:
> While Katta provides shard->search server functionality via pluggable impl,
> what both Solr and Katta are still missing is the doc->shard functionality.
> However, this might not be terribly hard if we do something similar to
> Katta's pluggable shard->search server distribution policy. Please mind I'm
> saying this without having looked at any of the Katta code.
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>

--
--Noble Paul
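
To make the two mappings discussed in the thread concrete (shard->search node with a replication factor, and doc->shard), here is a minimal Python sketch. Everything in it is an assumption for illustration: the function names `place_shards` and `doc_to_shard`, the round-robin placement policy, and the MD5-based hashing are hypothetical and are not taken from Katta or Solr code.

```python
import hashlib
from typing import Dict, List


def place_shards(shards: List[str], nodes: List[str],
                 replication_factor: int) -> Dict[str, List[str]]:
    """Hypothetical round-robin placement policy (not Katta's actual one).

    Assigns each shard to `replication_factor` distinct search nodes.
    """
    if replication_factor < 1 or replication_factor > len(nodes):
        raise ValueError("replication_factor must be between 1 and len(nodes)")
    placement: Dict[str, List[str]] = {}
    for i, shard in enumerate(shards):
        # Start at a different node for each shard, then take the next
        # replication_factor nodes, wrapping around the node list.
        placement[shard] = [nodes[(i + r) % len(nodes)]
                            for r in range(replication_factor)]
    return placement


def doc_to_shard(doc_id: str, shards: List[str]) -> str:
    """Hypothetical doc->shard policy: stable hash of the document id."""
    if not shards:
        raise ValueError("at least one shard is required")
    # MD5 is used only as a stable, process-independent hash, not for security.
    digest = hashlib.md5(doc_id.encode("utf-8")).digest()
    return shards[int.from_bytes(digest[:8], "big") % len(shards)]


if __name__ == "__main__":
    shards = ["shard0", "shard1", "shard2"]
    nodes = ["node0", "node1", "node2"]
    print(place_shards(shards, nodes, replication_factor=2))
    print(doc_to_shard("doc-42", shards))
```

In a Katta-style setup the resulting shard->node map would be what the master keeps in Zookeeper; a pluggable policy would simply swap `place_shards` or `doc_to_shard` for another function with the same signature.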