Re: Solr Cloud wiki and branch notes

2010-01-17 Thread Ted Dunning
Control is easily retained if you make the selection of shards for the horizontal broadcast pluggable. The shard management layer shouldn't know or care what query you are doing, and in most cases it should just use the trivial all-shards selection policy. On Sun, Jan 17, 2010
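
A minimal sketch of the pluggable shard selection described in this message, not code from Solr or Katta. Every name here (ShardSelector, all_shards, broadcast, fake_search) is hypothetical and chosen only for illustration. The broadcast layer calls the policy and never inspects the query itself; the default is the trivial all-shards policy.

```python
from typing import Callable, Dict, List

# Hypothetical type: a policy takes every known shard plus the query and
# returns the shards that should receive the broadcast.
ShardSelector = Callable[[List[str], str], List[str]]


def all_shards(shards: List[str], query: str) -> List[str]:
    """Trivial policy: ignore the query and select every shard."""
    return list(shards)


def broadcast(
    shards: List[str],
    query: str,
    search_one: Callable[[str, str], object],
    select: ShardSelector = all_shards,
) -> Dict[str, object]:
    """Send the query to each selected shard and collect results by shard.

    The management layer only calls `select`; it never inspects the query.
    """
    return {shard: search_one(shard, query) for shard in select(shards, query)}


def fake_search(shard: str, query: str) -> str:
    # Stand-in for a real per-shard request (an assumption for this sketch).
    return f"{shard} answered {query!r}"


if __name__ == "__main__":
    shards = ["shard1", "shard2", "shard3"]
    print(broadcast(shards, "title:solr", fake_search))
    # A caller that wants control plugs in its own policy.
    print(broadcast(shards, "title:solr", fake_search, select=lambda s, q: s[:1]))
```

A caller that needs to restrict the broadcast passes its own select function; everyone else gets all shards without touching the management layer.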

Re: Solr Cloud wiki and branch notes

2010-01-17 Thread Ted Dunning
+1. Hadoop still calls it a copy of a block if you have a replication factor of 1. Why not? (For that matter, I still call it an integer if it has a value of 1.) On Sun, Jan 17, 2010 at 6:06 AM, Andrzej Bialecki wrote: > I originally started off with "replica" too... but there may only be >> one

Re: Solr Cloud wiki and branch notes

2010-01-17 Thread Ted Dunning
Jason V and Jason R have done just that. Great idea. Cool work. But a unified management interface would *really* be nice. On Sun, Jan 17, 2010 at 6:06 AM, Andrzej Bialecki wrote: > Well, then if we don't intend to support updates in this iteration then > perhaps there is no need to change an

Re: Solr Cloud wiki and branch notes

2010-01-17 Thread Yonik Seeley
On Sun, Jan 17, 2010 at 9:06 AM, Andrzej Bialecki wrote: > On 2010-01-16 21:11, Yonik Seeley wrote: >> If we were building from scratch perhaps - but it seems like if we can >> just model what people do today with Solr (but just make it a lot >> easier), that's a good start.  The opaque model is w

Re: Solr Cloud wiki and branch notes

2010-01-17 Thread Andrzej Bialecki
On 2010-01-16 21:11, Yonik Seeley wrote: Agreed - but it could be as simple as qualifying this with "from shardX on node2". Right - it's pretty clear there are both physical and logical shards... but it's less clear to me at this point if distinguishing them in the vocabulary helps or hurts.

Re: Solr Cloud wiki and branch notes

2010-01-16 Thread Ted Dunning
My experience with Katta is that my developers very quickly adopted "index" to mean the aggregate of all the shards, which is exactly what Andrzej is proposing. Confusion with the "index contains shards", "nodes host shards" terminology has been minimal. On Sat, Jan 16, 2010 at 11:40 AM, Andrzej Bialecki
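
The "index contains shards", "nodes host shards" vocabulary can be pictured with a small data model. This is only an illustrative sketch: the class and field names (Index, Node, hosted_shards, hosts_by_shard) are assumptions for this example and are not taken from Katta or Solr.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Index:
    """The aggregate of all shards (the logical, global index)."""
    name: str
    shards: List[str] = field(default_factory=list)


@dataclass
class Node:
    """A machine that hosts copies of one or more shards."""
    host: str
    hosted_shards: List[str] = field(default_factory=list)


def hosts_by_shard(index: Index, nodes: List[Node]) -> Dict[str, List[str]]:
    """For each shard of the index, list the nodes that host a copy of it."""
    return {
        shard: [node.host for node in nodes if shard in node.hosted_shards]
        for shard in index.shards
    }


if __name__ == "__main__":
    index = Index("products", ["shard1", "shard2"])
    nodes = [Node("node1", ["shard1"]), Node("node2", ["shard1", "shard2"])]
    print(hosts_by_shard(index, nodes))
```

In this picture a shard belongs to exactly one index, while the same shard may be hosted on several nodes.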

Re: Solr Cloud wiki and branch notes

2010-01-16 Thread Yonik Seeley
On Sat, Jan 16, 2010 at 2:40 PM, Andrzej Bialecki wrote: > I avoided the word "collection", because Solr deploys various cores under > "collectionX" names, leading users to assume that core == collection. For distributed search, it's already common to name the cores the same thing for shards of t

Re: Solr Cloud wiki and branch notes

2010-01-16 Thread Mark Miller
Andrzej Bialecki wrote: > > I avoided the word "collection", because Solr deploys various cores > under "collectionX" names, leading users to assume that core == > collection. "Global index" is two words but it's unambiguous. I'm fine > with the "collection" if we clarify the definition and avoid u

Re: Solr Cloud wiki and branch notes

2010-01-16 Thread Andrzej Bialecki
On 2010-01-16 18:18, Yonik Seeley wrote: On Fri, Jan 15, 2010 at 7:36 PM, Andrzej Bialecki wrote: Hi, My 0.02 PLN on the subject ... Terminology --- First the terminology: reading your emails I have a feeling that my head is about to explode. We have to agree on the vocabulary, otherw

Re: Solr Cloud wiki and branch notes

2010-01-16 Thread Yonik Seeley
On Fri, Jan 15, 2010 at 7:36 PM, Andrzej Bialecki wrote: > Hi, > > My 0.02 PLN on the subject ... > > Terminology > --- > First the terminology: reading your emails I have a feeling that my head is > about to explode. We have to agree on the vocabulary, otherwise we have no > hope of reach

Re: Solr Cloud wiki and branch notes

2010-01-15 Thread Ted Dunning
On Fri, Jan 15, 2010 at 4:36 PM, Andrzej Bialecki wrote: > My 0.02 PLN on the subject ... > Polish currency seems pretty strong lately. There are a lot of good ideas for this small sum. > > Terminology > > * (global) search index > * index shard: > * partitioning: > * search node: > * search

Re: Solr Cloud wiki and branch notes

2010-01-15 Thread Andrzej Bialecki
Hi, My 0.02 PLN on the subject ... Terminology --- First the terminology: reading your emails I have a feeling that my head is about to explode. We have to agree on the vocabulary, otherwise we have no hope of reaching any consensus. I propose the following vocabulary that has been in

Re: Solr Cloud wiki and branch notes

2010-01-15 Thread Jason Rutherglen
> This is really about doing not-so-much in the very near term, > while thinking ahead to the longer term. Let's have a page dedicated to release 1.0 of cloud? I feel uncomfortable editing the existing wiki because I don't know what the plans are for the first release. I need to revisit Katta as m

Re: Solr Cloud wiki and branch notes

2010-01-15 Thread Yonik Seeley
On Fri, Jan 15, 2010 at 4:12 PM, Jason Rutherglen wrote: > The page is huge, which signals to me maybe we're trying to do > too much This is really about doing not-so-much in the very near term, while thinking ahead to the longer term. > Revamping distributed search could be in a different branc

Solr Cloud wiki and branch notes

2010-01-15 Thread Jason Rutherglen
Here are some rough notes after running the unit tests, reviewing some of the code (though not understanding it), and reviewing the wiki page (http://wiki.apache.org/solr/SolrCloud). We need a protocol in the URL; otherwise it's inflexible. I'm overwhelmed with all the ?? question areas of the documen