[ https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13504954#comment-13504954 ]

Per Steffensen commented on SOLR-4114:
--------------------------------------

bq. So that we don't lose functionality we currently have?

So now you care about backwards compatibility? :-) You didn't care about 
backwards compatibility from 3.6 to 4.0 when you introduced optimistic locking 
(including an error in case of updating an existing document without providing 
the correct version), which is forced upon you in 4.0 if you choose to run with 
version-field and update-log. There are perfectly valid reasons for wanting to 
use version-field and update-log without wanting full-blown optimistic 
locking. My solution to SOLR-3178 supports this kind of backwards compatibility 
by letting you explicitly choose among the update-semantics modes "classic", 
"consistency" and "classic-consistency-hybrid". So if you come from 3.6 and 
want backwards-compatible update-semantics, but also want version-field and 
update-log, you just choose update-semantics "classic" :-) See 
http://wiki.apache.org/solr/Per%20Steffensen/Update%20semantics.
I'm just teasing you a little :-)
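
For the record, the 4.0 behaviour I am referring to is optimistic concurrency 
through the _version_ field. A minimal SolrJ sketch of what 4.0 forces on you 
(the collection URL and version value here are just made-up examples):

{code:java}
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrException;
import org.apache.solr.common.SolrInputDocument;

public class OptimisticUpdate {
  public static void main(String[] args) throws Exception {
    SolrServer solr = new HttpSolrServer("http://localhost:8983/solr/collection1");

    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", "doc-1");
    doc.addField("title_s", "updated title");
    // With version-field + update-log enabled, a positive _version_ makes the
    // update conditional: it only succeeds if it matches the stored version.
    doc.addField("_version_", 1418829677821788160L);

    try {
      solr.add(doc);
      solr.commit();
    } catch (SolrException e) {
      // A stale _version_ is rejected with a version-conflict error in 4.0 -
      // the behaviour change relative to 3.6 discussed above.
      System.err.println("Version conflict: " + e.getMessage());
    }
  }
}
{code}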

But anyway, I like backwards compatibility, so you are right: we probably do 
not want to do something that changes default behaviour in 4.0.0. I will have a 
look at a solution tomorrow. It is kinda late in Europe now.

bq. Example: you have 24 servers and create a collection with 8 shards and a 
target replication factor of 3... but one of the servers goes down in the 
meantime so one shard has only 2 replicas. It's entirely reasonable for a user 
to want to wait until that machine comes back up rather than doubling up on a 
different node.

I assume you mean a replication-factor of 2? With a replication-factor of 2 you 
will get 3 shards per slice (the leader plus 2 replicas - and 8 slices x 3 
shards = 24 servers).

With your current solution there will be no "waiting until that machine comes 
back up". You will just end up with 8 slices, where 7 of them have 2 replicas 
and the last has only 1 replica. With the patch I provided today you will end 
up with 8 slices where all of them have 2 replicas - but one of the servers 
will be running two shards, and the Solr that was down will not be running any 
(when it comes back up). I would probably prefer my current solution - at least 
you achieve the property that any two servers can crash (including disk crash) 
without you losing data, which is basically what you want to achieve when you 
request a replication-factor of 2.
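
For concreteness, the creation request behind your scenario would be something 
along these lines (host and collection name are made up; numShards and 
replicationFactor are the Collections API parameter names, and note that what 
exactly the factor counts - total shards per slice or additional replicas - is 
the terminology question just raised):

{code}
http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection&numShards=8&replicationFactor=2
{code}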

But waiting for the machine to come back up before creating the collection 
would certainly be the best solution. It is just extremely hard to know if a 
machine is down or not - or if you intended to run one more server than is 
currently running. In general there is no information in Solr/ZK about that - 
and there shouldn't be. In this case a maxShardsPerNode could be a nice way to 
tell the system that you just want to wait. But then it would have to be 
implemented correctly, and that is really hard. In OverseerCollectionProcessor 
you can check if you can meet the maxShardsPerNode requirement with the current 
set of live Solrs, and if you can't, just not initiate the creation process. 
But a server can go down between the time where the OverseerCollectionProcessor 
checks and the time where it is supposed to create a shard. Therefore it is 
impossible to guarantee that the OverseerCollectionProcessor does not create 
some shards of a new collection without being able to create them all while 
still living up to the maxShardsPerNode requirement. In that case, if you 
really want to live up to the maxShardsPerNode requirement, the 
OverseerCollectionProcessor would have to try to delete the shards of the 
collection that were successfully created. But this deletion process can also 
fail. Ahhh, there is no guaranteed way.
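
To make the race concrete: the up-front check in OverseerCollectionProcessor 
can only ever be something of this form (just a sketch - the class, method and 
variable names are made up, only the arithmetic matters):

{code:java}
import java.util.Set;

public class PlacementCheck {
  // Hypothetical feasibility check, not actual OverseerCollectionProcessor
  // code. True if the currently live nodes could host all shards of the new
  // collection without any node exceeding maxShardsPerNode.
  static boolean canCreate(int numShards, int replicationFactor,
                           int maxShardsPerNode, Set<String> liveNodes) {
    int shardsNeeded = numShards * replicationFactor;
    int capacity = liveNodes.size() * maxShardsPerNode;
    return shardsNeeded <= capacity;
  }
}
{code}

Even when canCreate returns true, a node can die between this check and the 
actual shard creation, so the requirement can still end up violated - exactly 
the gap described above, and why the guarantee cannot be made absolute.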

Therefore my idea about the whole thing is aimed more at just having all the 
shards created, and then moving them around later. I know this is not possible 
for now, but I do expect that we (at least my project) will add support for 
(manual and/or automatic) migration of shards from one server to another. This 
feature is needed to achieve nice elasticity (moving shards/load onto new 
servers as they join the cluster), but also to do re-balancing after e.g. a 
Solr was down (and a shard that should have been placed on that server was 
temporarily created to run on another server).

Well, as I said, I will consider the best (small patch :-) ) solution tomorrow. 
But if I can't come up with a better small-patch solution we can certainly do 
the maxShardsPerNode thing - no problemo. It just isn't going to be 100% 
guaranteed.

                
> Collection API: Allow multiple shards from one collection on the same Solr 
> server
> ---------------------------------------------------------------------------------
>
>                 Key: SOLR-4114
>                 URL: https://issues.apache.org/jira/browse/SOLR-4114
>             Project: Solr
>          Issue Type: New Feature
>          Components: multicore, SolrCloud
>    Affects Versions: 4.0
>         Environment: Solr 4.0.0 release
>            Reporter: Per Steffensen
>            Assignee: Per Steffensen
>              Labels: collection-api, multicore, shard, shard-allocation
>         Attachments: SOLR-4114.patch
>
>
> We should support running multiple shards from one collection on the same 
> Solr server - e.g. running a collection with 8 shards on a 4-Solr-server 
> cluster (each Solr server running 2 shards).
> Performance tests on our side have shown that this is a good idea, and it is 
> also a good idea for easy elasticity later on - it is much easier to move an 
> entire existing shard from one Solr server to another one that just joined 
> the cluster than it is to split an existing shard between the Solrs that used 
> to run it and the new Solr.
> See dev mailing list discussion "Multiple shards for one collection on the 
> same Solr server"
