True - though I think as of 4.2, numShards has never been respected in the 
<cores> defs, for various reasons.

In 4.0 and 4.1, things should have still worked though - you didn't need to 
give numShards; everything should have worked just based on configuring a 
different shard name for each core or accepting the default shard names.

In 4.2 this went away - not passing numShards now means that you must 
distribute updates yourself. There are various technical reasons for this, 
given the new features being added.

So, you can really only preconfigure *one* collection in solr.xml and then use 
the numShards sys prop. If you wanted to create another collection the same way 
with a *different* number of shards, you would have to stop Solr, set a new 
numShards sys prop after preconfiguring the next collection, then start Solr 
again. Not really a good option.
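
(For illustration, the sys-prop approach looks something like this - a 
minimal sketch assuming the stock Jetty example layout, configs already 
uploaded to ZooKeeper, and a collection preconfigured in solr.xml:

  java -DnumShards=3 -DzkHost=zk1:2181,zk2:2181,zk3:2181 -jar start.jar

Since numShards is only consulted when the collection is first registered in 
ZooKeeper, a second collection with a different shard count can't be 
bootstrapped this way without a restart.)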

And so, the collections API is the way to go - though it's fairly poor in 4.1 
due to its lack of result responses (you have to search the overseer logs). 
It's slightly better in 4.2 (you will get some response) and much better in 
4.2.1 (you will get decent responses).
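
(As a sketch, creating a collection through the API is a single HTTP call - 
the collection name and shard count here are just placeholders:

  curl 'http://localhost:8983/solr/admin/collections?action=CREATE&name=col201305&numShards=12'

Each collection created this way gets its own numShards, which sidesteps the 
one-sys-prop-per-restart problem above.)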

Now that it's much more central, it will continue to improve rapidly.

- Mark

On Mar 28, 2013, at 6:08 PM, Chris R <corg...@gmail.com> wrote:

> So, by using numShards at initialization time with the sample
> collection1 solr.xml, I'm able to create a sharded and distributed index.
> Also, by removing any initial cores from the solr.xml file, I'm able to use
> the collections API via the web to create multiple collections with sharded
> indexes that work correctly; however, I can't create distributed
> collections by using the solr.xml alone.   Adding the numShards parameter
> to the first instance of a collection core in the solr.xml file is ignored:
> cores are created, but update distribution doesn't happen.  When booting up
> Solr, the config INFO messages show numShards=null.  I get the impression
> from the documentation that you should be able to do this, but I haven't
> seen a specific example.
> 
> Without that, it seems that I'm relegated to the shard names, locations,
> etc. provided by the collections API.  I've done this testing under 4.1.
> 
> True or False?
> 
> Chris
> On Mar 27, 2013 9:46 PM, "corg...@gmail.com" <corg...@gmail.com> wrote:
> 
>> I realized my error shortly after - more docs, better spread.  I continued
>> to do some testing to see how I could manually lay out the shards in what
>> I thought was a more organized manner, and with more descriptive names
>> than the numShards parameter alone produced.  I also gen'd up a few
>> thousand docs and a schema to test with.
>> 
>> Appreciate the help.
>> 
>> 
>> 
>> ----- Reply message -----
>> From: "Erick Erickson" <erickerick...@gmail.com>
>> To: <solr-user@lucene.apache.org>
>> Subject: Solrcloud 4.1 Collection with multiple slices only use
>> Date: Wed, Mar 27, 2013 9:30 pm
>> 
>> 
>> First, three documents isn't enough to really test. The formula for
>> assigning shards is to hash on the unique ID. It _is_ possible that
>> all three just happened to land on the same shard. If you index all 32
>> docs in the example dir and they're all on the same shard, we should
>> talk.
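>> 
>> (If it helps, the stock example docs can be indexed with the bundled post
>> tool - paths assume the standard 4.x example directory:
>> 
>>   cd example/exampledocs
>>   java -jar post.jar *.xml
>> )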
>> 
>> Second, a regular query to the cluster will always search all the
>> shards. Use &distrib=false on the URL to restrict the search to just
>> the node you fire the request at.....
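>> 
>> (For example, borrowing the host and core name from the solr.xml below -
>> 
>>   http://server1:8080/solr/col201301s04sh01/select?q=*:*&distrib=false
>> 
>> will return only the documents in that one core.)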
>> 
>> Let us know if you index more docs and still see the problem.
>> 
>> Best
>> Erick
>> 
>> On Wed, Mar 27, 2013 at 9:39 AM, Chris R <corg...@gmail.com> wrote:
>>> So - I must be missing something very basic here, and I've gone back to
>>> the Wiki example.  After setting up the two-shard example in the first
>>> tutorial and indexing the three example documents, I looked at the shards
>>> in the Admin UI.  The documents are stored in the index where the update
>>> was directed - they aren't distributed across both shards.
>>> 
>>> The release notes state that the compositeId router is the default when
>>> using the numShards parameter?  I want an even distribution of documents
>>> based on ID across all shards.... suggestions on what I'm screwing up.
>>> 
>>> Chris
>>> 
>>> On Mon, Mar 25, 2013 at 11:34 PM, Mark Miller <markrmil...@gmail.com> wrote:
>>> 
>>>> I'm guessing you didn't specify numShards. Things changed in 4.1 - if you
>>>> don't specify numShards, it goes into a mode where it's up to you to
>>>> distribute updates.
>>>> 
>>>> - Mark
>>>> 
>>>> On Mar 25, 2013, at 10:29 PM, Chris R <corg...@gmail.com> wrote:
>>>> 
>>>>> I have two issues and I'm unsure if they are related:
>>>>> 
>>>>> Problem:  After setting up a multiple-collection SolrCloud 4.1 instance
>>>>> on seven servers, when I index documents they aren't distributed across
>>>>> the index slices.  It feels as though I don't actually have a "cloud"
>>>>> implementation, yet everything I see in the admin interface and
>>>>> ZooKeeper implies I do.  I feel as if I'm overlooking something obvious,
>>>>> but have not been able to figure out what.
>>>>> 
>>>>> Configuration: Seven servers and four collections, each with 12 slices
>>>>> (no replica shards yet).  ZooKeeper is configured in a three-node
>>>>> ensemble.  When I send documents to Server1/Collection1 (which holds
>>>>> two slices of collection1), all the documents show up in a single index
>>>>> shard (core).  Perhaps related, I have found it impossible to get Solr
>>>>> to recognize the server names with anything but a literal
>>>>> host="servername" parameter in the solr.xml.  Hostname parameters, host
>>>>> files, network, and DNS are all configured correctly....
>>>>> 
>>>>> I have a Solr 4.0 single-collection setup configured similarly and it
>>>>> works just fine.  I'm using the same schema.xml and solrconfig.xml
>>>>> files on the 4.1 implementation, with only the luceneMatchVersion
>>>>> changed to LUCENE_41.
>>>>> 
>>>>> sample solr.xml from server1
>>>>> 
>>>>> <?xml version="1.0" encoding="UTF-8" ?>
>>>>> <solr persistent="true">
>>>>> <cores adminPath="/admin/cores" hostPort="8080" host="server1"
>>>>> shareSchema="true" zkClientTimeout="60000">
>>>>> <core collection="col201301" shard="col201301s04"
>>>>> instanceDir="/solr/col201301/col201301s04sh01" name="col201301s04sh01"
>>>>> dataDir="/solr/col201301/col201301s04sh01/data"/>
>>>>> <core collection="col201301" shard="col201301s11"
>>>>> instanceDir="/solr/col201301/col201301s11sh01" name="col201301s11sh01"
>>>>> dataDir="/solr/col201301/col201301s11sh01/data"/>
>>>>> <core collection="col201302" shard="col201302s06"
>>>>> instanceDir="/solr/col201302/col201302s06sh01" name="col201302s06sh01"
>>>>> dataDir="/solr/col201302/col201302s06sh01/data"/>
>>>>> <core collection="col201303" shard="col201303s01"
>>>>> instanceDir="/solr/col201303/col201303s01sh01" name="col201303s01sh01"
>>>>> dataDir="/solr/col201303/col201303s01sh01/data"/>
>>>>> <core collection="col201303" shard="col201303s08"
>>>>> instanceDir="/solr/col201303/col201303s08sh01" name="col201303s08sh01"
>>>>> dataDir="/solr/col201303/col201303s08sh01/data"/>
>>>>> <core collection="col201304" shard="col201304s03"
>>>>> instanceDir="/solr/col201304/col201304s03sh01" name="col201304s03sh01"
>>>>> dataDir="/solr/col201304/col201304s03sh01/data"/>
>>>>> <core collection="col201304" shard="col201304s10"
>>>>> instanceDir="/solr/col201304/col201304s10sh01" name="col201304s10sh01"
>>>>> dataDir="/solr/col201304/col201304s10sh01/data"/>
>>>>> </cores>
>>>>> </solr>
>>>>> 
>>>>> Thanks
>>>>> Chris
>>>> 
>>>> 
>> 
