Would anyone be able to confirm that this is indeed a Solr bug in the
interaction between restore and autoscaling?

I have done some local testing and made the following observations.

When a replica count autoscaling policy such as {"replica": "<2","shard":
"#EACH","node": "#ANY"} is in place, restoring from an existing backup
breaks: for some reason the Solr restore code needs room for double the
replica count before the restore will succeed.

If 1 replica exists per node in the backup, restore with autoscaling
requires a rule that allows 2 replicas on any node.

If 2 replicas exist per node in the backup, restore with autoscaling
requires a rule that allows 4 replicas on any node.

If 3 replicas exist per node in the backup, restore with autoscaling
requires a rule that allows 6 replicas on any node.

NOTE: When given the doubled headroom, the collection comes back exactly as
it was before the backup, so nothing is actually duplicated after the
restore. The restore code simply appears to demand more room than it ends
up using whenever a replica count autoscaling policy is present.



Rajeswari posted a separate reply to this thread that brought me to another
discovery.
https://lucene.472066.n3.nabble.com/Re-CAUTION-Re-Solr-7-7-restore-issue-tp4450714.html

In it, they reference the legacy rule based replica placement documentation:
https://lucene.apache.org/solr/guide/7_6/rule-based-replica-placement.html

After doing some more local testing, I found that adding the same replica
count constraint as a legacy rule on the collection allows Solr restore to
work as intended. Example below:

collection rule: replica:<2,node:*
autoscaling policy: {"replica": "<2","node": "#ANY"}

When both are in place, restore functionality finally works.
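
For reference, the two settings above can be applied via the Collections
and Autoscaling APIs, e.g. (hostname and collection name here are from the
tutorial setup; adjust for your environment):

curl 'http://localhost:8983/solr/admin/collections?action=MODIFYCOLLECTION&collection=gettingstarted&rule=replica:<2,node:*'

curl -X POST 'http://localhost:8983/solr/admin/autoscaling' --data-binary \
'{"set-cluster-policy": [{"replica": "<2","node": "#ANY"}]}'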



Ideally, we should not have to do anything beyond setting the original
autoscaling replica count policy. As of right now, though, the two
workarounds appear to be either removing the cluster policy for the
duration of the restore, or adding a legacy collection rule alongside the
autoscaling policy.
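
For the first workaround, a sketch of clearing and then re-instating the
cluster policy around the restore (setting an empty list removes the
policy; policy values here match my test setup):

curl -X POST 'http://localhost:8983/solr/admin/autoscaling' --data-binary \
'{"set-cluster-policy": []}'
(run the RESTORE call here)
curl -X POST 'http://localhost:8983/solr/admin/autoscaling' --data-binary \
'{"set-cluster-policy": [{"replica": "<2","shard": "#EACH","node": "#ANY"}]}'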

Please let me know if I am missing something crucial; otherwise I hope the
above helps in tracking down the actual bug. Thanks.



Repeatable steps if you want to test locally using Solr tutorial:
./bin/solr stop -all ; rm -Rf example/cloud/
./bin/solr start -e cloud
(choose 1 node for gettingstarted with 1 shard 1 replica)

curl -X POST 'http://localhost:8983/solr/admin/autoscaling' --data-binary \
'{"set-cluster-policy": [{"replica": "<2","shard": "#EACH","node": "#ANY"}]}'

curl 'http://localhost:8983/solr/admin/collections?action=BACKUP&name=myBackupName&collection=gettingstarted&location=/choose/location/'
curl 'http://localhost:8983/solr/admin/collections?action=DELETE&name=gettingstarted'
curl 'http://localhost:8983/solr/admin/collections?action=RESTORE&name=myBackupName&location=/choose/location/&collection=gettingstarted'


(apply this before taking the backup; the restore then works)
curl 'http://localhost:8983/solr/admin/collections?action=MODIFYCOLLECTION&collection=gettingstarted&rule=shard:*,replica:<2,node:*'





Koen De Groote wrote
> I also ran into this while researching cluster policies. Solr 7.6
> 
> Except same situation: introduce a rule to control placement of
> collections. Backup. Delete. Restore. Solr complains it can't do it.
> 
> I don't need them just yet, so I stopped there, but reading this is quite
> disturbing.
> 
> Does deleting the rule, restore and then immediately re-instating the rule
> work?
> 
> 
> 
> On Wed, Oct 9, 2019 at 6:33 AM Natarajan, Rajeswari <rajeswari.natarajan@> wrote:
> 
>> I am also facing the same issue. With Solr 7.6 restore fails with below
>> rule. Would like to place one replica per node by below rule
>>
>>  with the rule to place one replica per node
>> "set-cluster-policy": [{
>>         "replica": "<2",
>>         "shard": "#EACH",
>>         "node": "#ANY"
>>     }]
>>
>> Without the rule the restore works. But we need this rule. Any
>> suggestions
>> to overcome this issue.
>>
>> Thanks,
>> Rajeswari
>>
>> On 7/12/19, 11:00 AM, "Mark Thill" <mark.thill@> wrote:
>>
>>     I have a 4 node cluster.  My goal is to have 2 shards with two
>> replicas
>>     each and only allowing 1 core on each node.  I have a cluster policy
>> set to:
>>
>>     [{"replica":"2", "shard": "#EACH", "collection":"test",
>>     "port":"8983"},{"cores":"1", "node":"#ANY"}]
>>
>>     I then manually create a collection with:
>>
>>     name: test
>>     config set: test
>>     numShards: 2
>>     replicationFact: 2
>>
>>     This works and I get a collection that looks like what I expect.  I
>> then
>>     backup this collection.  But when I try to restore the collection it
>> fails
>>     and says
>>
>>     "Error getting replica locations : No node can satisfy the rules"
>>     [{"replica":"2", "shard": "#EACH", "collection":"test",
>>     "port":"8983"},{"cores":"1", "node":"#ANY"}]
>>
>>     If I set my cluster-policy rules back to [] and try to restore it
>> then
>>     successfully restores my collection exactly how I expect it to be. 
>> It
>>     appears that having any cluster-policy rules in place is affecting my
>>     restore, but the "error getting replica locations" is strange.
>>
>>     Any suggestions?
>>
>>     mark <mark.thill@>
>>
>>
>>





--
Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html
