[ 
https://issues.apache.org/jira/browse/SOLR-14909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Thacker updated SOLR-14909:
---------------------------------
    Fix Version/s: 9.0
                   main (10.0)

> Add replica is very slow on a large cluster
> -------------------------------------------
>
>                 Key: SOLR-14909
>                 URL: https://issues.apache.org/jira/browse/SOLR-14909
>             Project: Solr
>          Issue Type: Task
>    Affects Versions: 7.6, 7.7, 8.0, 8.1, 8.2, 8.3, 8.4, 8.5, 8.6, 8.7, 8.8, 
> 8.9, 8.10, 8.11
>            Reporter: Varun Thacker
>            Priority: Major
>             Fix For: 9.0, main (10.0)
>
>         Attachments: skipAssignBuildReplicaPos.patch
>
>
> We create ~100 collections every day for new incoming data.
> We first issue a create-collection request for all the collections (4 shards 
> and createNodeSet=empty). This creates collections with no replicas.
> We then issue async add-replica calls for all the shards, creating 1 replica 
> each. 100 collections x 4 shards = 400 add-replica calls. All the add-replica 
> calls pass the node parameter telling Solr where the replica should be created.
> The cluster currently has 190 nodes, and when we upgraded to Solr 7.7.3 we 
> noticed that the add-replica calls took 2 hours and 45 minutes to complete! 
> Clearly something was wrong, as the same cluster running Solr 7.3.1 had 
> taken only a few minutes.
> A thread dump of the overseer showed 100 threads stuck here (why 100? 
> That's the Solr default thread pool size set by MAX_PARALLEL_TASKS in 
> OverseerTaskProcessor):
>  
> {code:java}
> "OverseerThreadFactory-13-thread-1226-processing-n:10.128.18.69:8983_solr" 
> #11163 prio=5 os_prio=0 cpu=0.69ms elapsed=987.97s tid=0x00007f01f8051000 
> nid=0xd7a waiting for monitor entry [0x00007f01c1121000]
>  java.lang.Thread.State: BLOCKED (on object monitor)
>  at java.lang.Object.wait(java.base@11.0.5/Native Method)
>  - waiting on <no object reference available>
>  at 
> org.apache.solr.client.solrj.cloud.autoscaling.PolicyHelper$SessionRef.get(PolicyHelper.java:449)
>  - waiting to re-lock in wait() <0x00000007259e6a98> (a java.lang.Object)
>  at 
> org.apache.solr.client.solrj.cloud.autoscaling.PolicyHelper.getSession(PolicyHelper.java:493)
>  at 
> org.apache.solr.client.solrj.cloud.autoscaling.PolicyHelper.getReplicaLocations(PolicyHelper.java:121)
>  at 
> org.apache.solr.cloud.api.collections.Assign.getPositionsUsingPolicy(Assign.java:382)
>  at 
> org.apache.solr.cloud.api.collections.Assign$PolicyBasedAssignStrategy.assign(Assign.java:630)
>  at 
> org.apache.solr.cloud.api.collections.Assign.getNodesForNewReplicas(Assign.java:368)
>  at 
> org.apache.solr.cloud.api.collections.AddReplicaCmd.buildReplicaPositions(AddReplicaCmd.java:360)
>  at 
> org.apache.solr.cloud.api.collections.AddReplicaCmd.addReplica(AddReplicaCmd.java:146)
>  at 
> org.apache.solr.cloud.api.collections.AddReplicaCmd.call(AddReplicaCmd.java:91)
>  at 
> org.apache.solr.cloud.api.collections.OverseerCollectionMessageHandler.processMessage(OverseerCollectionMessageHandler.java:294)
> {code}
>  
>  
> It's strange because each add-replica API call creates a single replica 
> and specifies which node it must be created on.
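For reference, one add-replica request of the kind described above could be assembled roughly as in the sketch below. The base URL, collection name, shard name, and async ID are made-up placeholders; `action=ADDREPLICA`, `collection`, `shard`, `node`, and `async` are the Collections API parameter names.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class AddReplicaUrl {
    // Builds a Collections API ADDREPLICA URL that names the target node explicitly.
    static String build(String baseUrl, String collection, String shard,
                        String node, String asyncId) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("action", "ADDREPLICA");
        params.put("collection", collection);
        params.put("shard", shard);
        params.put("node", node);     // node the caller chose for the new replica
        params.put("async", asyncId); // request ID for the asynchronous call
        String query = params.entrySet().stream()
                .map(e -> enc(e.getKey()) + "=" + enc(e.getValue()))
                .collect(Collectors.joining("&"));
        return baseUrl + "/admin/collections?" + query;
    }

    // URL-encodes one query-string component as UTF-8.
    private static String enc(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Placeholder values; only the node name is taken from the thread dump.
        System.out.println(build("http://localhost:8983/solr", "logs_2020_10_01",
                "shard1", "10.128.18.69:8983_solr", "add-replica-1"));
    }
}
```

Since the caller already names the node, there is no placement decision left for the overseer to make for such a request.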
>  
> The slowdown was in Assign.getNodesForNewReplicas, and we noticed the 
> SKIP_NODE_ASSIGNMENT flag ( 
> https://github.com/apache/lucene-solr/commit/17cb1b17172926d0d9aed3dfd3b9adb90cf65e0f#diff-ee29887eff6e474e58fcf3c02077f179R355
>  ), which the overseer reads and which could have kept that method from being called.
> So we started passing SKIP_NODE_ASSIGNMENT=true and still no luck! The 
> replicas took just as long to create. It turned out that the Collections 
> Handler wasn't passing the SKIP_NODE_ASSIGNMENT parameter to the overseer.
> The add replica call only passes a specific set of params to the overseer 
> https://github.com/apache/lucene-solr/blob/releases/lucene-solr/7.7.3/solr/core/src/java/org/apache/solr/handler/admin/CollectionsHandler.java#L823
>  . We changed this to also pass SKIP_NODE_ASSIGNMENT.
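The effect described above can be illustrated with a simplified model of allow-list parameter forwarding. This is a sketch, not the actual CollectionsHandler code: `copyAllowed` and both parameter lists are hypothetical, and `skipNodeAssignment` is assumed to be the request-parameter name behind SKIP_NODE_ASSIGNMENT.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ParamForwarding {
    // Copies only the allow-listed parameters from the request into the
    // message handed on to the overseer; everything else is silently dropped.
    static Map<String, String> copyAllowed(Map<String, String> request,
                                           List<String> allowed) {
        Map<String, String> message = new LinkedHashMap<>();
        for (String name : allowed) {
            String value = request.get(name);
            if (value != null) {
                message.put(name, value);
            }
        }
        return message;
    }

    public static void main(String[] args) {
        Map<String, String> request = new LinkedHashMap<>();
        request.put("collection", "logs_2020_10_01"); // placeholder name
        request.put("shard", "shard1");
        request.put("node", "10.128.18.69:8983_solr");
        request.put("skipNodeAssignment", "true");    // assumed parameter name

        // Hypothetical allow-list without the flag: the flag never reaches the overseer.
        List<String> before = List.of("collection", "shard", "node");
        // Hypothetical allow-list after the change: the flag is forwarded.
        List<String> after = List.of("collection", "shard", "node", "skipNodeAssignment");

        System.out.println(copyAllowed(request, before).containsKey("skipNodeAssignment"));
        System.out.println(copyAllowed(request, after).containsKey("skipNodeAssignment"));
    }
}
```

With the first list the flag is lost without any error, which matches the observation that passing SKIP_NODE_ASSIGNMENT=true initially made no difference.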
> Now when we create the replicas it takes approximately 4 minutes, versus the 2 
> hours 45 minutes it was taking previously.
> Only master passes that param to the overseer ( 
> https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/handler/admin/CollectionsHandler.java#L938
>  ) . However it doesn't matter in master because the autoscaling framework is 
> gone ( https://github.com/apache/lucene-solr/commit/cc0c111/ )
> I believe this affects all versions from Solr 7.6 ( 
> https://issues.apache.org/jira/browse/SOLR-12739 ) through every 8.x release.
> Lastly, I manually tried adding a replica with and without the flag. Without 
> the flag it took 20 seconds, and with the flag 2 seconds.
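As a rough cross-check of the figures reported above, the sketch below computes the average wall-clock time per add-replica call for the 400 calls. It is only back-of-the-envelope arithmetic on the quoted numbers; the class and method names are made up, and the interpretation that follows is an inference, not something measured.

```java
public class AddReplicaTiming {
    // Average wall-clock seconds per call for a batch of calls.
    static double secondsPerCall(double totalSeconds, int calls) {
        return totalSeconds / calls;
    }

    public static void main(String[] args) {
        int calls = 100 * 4;                        // 100 collections x 4 shards
        double withoutFlag = (2 * 60 + 45) * 60.0;  // 2 hours 45 minutes, in seconds
        double withFlag = 4 * 60.0;                 // approximately 4 minutes, in seconds

        System.out.println(secondsPerCall(withoutFlag, calls)); // average without the flag
        System.out.println(secondsPerCall(withFlag, calls));    // average with the flag
    }
}
```

Without the flag the average is 9900 / 400 = 24.75 seconds per call, close to the 20 seconds measured for a single manual call, which is what one would expect if the calls effectively ran one at a time. With the flag the average is 240 / 400 = 0.6 seconds per call, below the 2 seconds of a single manual call, which suggests the calls overlapped.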



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
