patsonluk opened a new pull request, #1794:
URL: https://github.com/apache/solr/pull/1794

   https://issues.apache.org/jira/browse/SOLR-16871
   
   # Description
   
   PR https://github.com/apache/solr/pull/1762 fixes various race conditions for 
the coordinator node. One of the fixes restricts the name of the synthetic core 
to ensure that at most 1 core is created per coordinator node.
   
   Unfortunately, unit tests still failed because 2 cores could still be added 
for the first coordinator node (the 1st from collection creation, and the 2nd 
from the addReplica call).
   
   For example, this is the list of replicas from a failed run of 
`TestCoordinatorRole#testConcurrentAccess`, which is supposed to create only 4 
cores, one for each of the 4 coordinator nodes. Here `core_node2` and 
`core_node3` are both on `127.0.0.1:49656_solr`:
   
   ```
   core_node2:{
     "core":".sys.COORDINATOR-COLL-conf_shard1_replica_n1",
     "leader":"true",
     "node_name":"127.0.0.1:49656_solr",
     "base_url":"https://127.0.0.1:49656/solr",
     "state":"active",
     "type":"NRT",
     "force_set_state":"false"}, 
   core_node3:{
     "node_name":"127.0.0.1:49656_solr",
     "base_url":"https://127.0.0.1:49656/solr",
     "core":".sys.COORDINATOR-COLL-conf_127.0.0.1_49656_solr",
     "state":"active",
     "type":"NRT",
     "force_set_state":"false"}, 
   core_node4:{
     "node_name":"127.0.0.1:49647_solr",
     "base_url":"https://127.0.0.1:49647/solr",
     "core":".sys.COORDINATOR-COLL-conf_127.0.0.1_49647_solr",
     "state":"active",
     "type":"NRT",
     "force_set_state":"false"}, 
   core_node5:{
     "node_name":"127.0.0.1:49650_solr",
     "base_url":"https://127.0.0.1:49650/solr",
     "core":".sys.COORDINATOR-COLL-conf_127.0.0.1_49650_solr",
     "state":"active",
     "type":"NRT",
     "force_set_state":"false"}, 
   core_node6:{
     "node_name":"127.0.0.1:49653_solr",
     "base_url":"https://127.0.0.1:49653/solr",
     "core":".sys.COORDINATOR-COLL-conf_127.0.0.1_49653_solr",
     "state":"active",
     "type":"NRT",
     "force_set_state":"false"}]
   ```
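
   The duplicate in the listing above can be spotted mechanically by counting 
replicas per `node_name`. The following is only an illustrative sketch: 
`ReplicaCheck` and its methods are hypothetical helpers, not part of Solr or of 
the test, and the node names are copied by hand from the failed run.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not Solr code): finds nodes hosting more than one replica. */
class ReplicaCheck {

  /** Counts how many replicas report each node_name, preserving first-seen order. */
  static Map<String, Integer> coresPerNode(List<String> nodeNames) {
    Map<String, Integer> counts = new LinkedHashMap<>();
    for (String nodeName : nodeNames) {
      counts.merge(nodeName, 1, Integer::sum);
    }
    return counts;
  }

  /** Returns the nodes whose replica count exceeds the expected single core. */
  static List<String> nodesWithExtraCores(Map<String, Integer> counts) {
    List<String> extra = new ArrayList<>();
    for (Map.Entry<String, Integer> entry : counts.entrySet()) {
      if (entry.getValue() > 1) {
        extra.add(entry.getKey());
      }
    }
    return extra;
  }

  public static void main(String[] args) {
    // node_name values of core_node2 .. core_node6 from the failed run
    List<String> nodeNames = List.of(
        "127.0.0.1:49656_solr",
        "127.0.0.1:49656_solr",
        "127.0.0.1:49647_solr",
        "127.0.0.1:49650_solr",
        "127.0.0.1:49653_solr");
    // prints [127.0.0.1:49656_solr]
    System.out.println(nodesWithExtraCores(coresPerNode(nodeNames)));
  }
}
```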
   
   
   
   # Solution
   
   Instead of restricting the core name, which is hard to get right, perhaps 
we can synchronize the block that adds the replica. This block should rarely be 
called (only the very first time a coordinator node encounters a new config), 
so I think it's probably better to simply synchronize it.
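
   As a rough illustration of the idea (not the actual change in this PR), the 
sketch below puts the "does a core already exist?" check and the replica 
creation under one lock, so concurrent first requests for the same config 
trigger only one creation. `SyntheticCoreGuard`, `ensureSyntheticCore` and the 
in-memory `addReplica` stub are hypothetical names standing in for the real 
coordinator code and the real addReplica call.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for the coordinator's synthetic-core setup; not Solr's API. */
class SyntheticCoreGuard {
  private final Object lock = new Object();
  private final Set<String> configsWithCore = new HashSet<>();
  private final AtomicInteger addReplicaCalls = new AtomicInteger();

  /** Creates the synthetic core for a config at most once on this node. */
  void ensureSyntheticCore(String configName) {
    synchronized (lock) {
      // Check and create under the same lock, so two threads cannot both see "missing".
      if (configsWithCore.contains(configName)) {
        return;
      }
      addReplica(configName);
      configsWithCore.add(configName);
    }
  }

  /** Stub for the real addReplica call (assumption): only counts invocations. */
  private void addReplica(String configName) {
    addReplicaCalls.incrementAndGet();
  }

  int addReplicaCalls() {
    return addReplicaCalls.get();
  }

  /** Hits one guard from many threads at once and returns how many replicas were added. */
  static int runConcurrently(int threads, String configName) {
    SyntheticCoreGuard guard = new SyntheticCoreGuard();
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    CountDownLatch start = new CountDownLatch(1);
    for (int i = 0; i < threads; i++) {
      pool.execute(() -> {
        try {
          start.await(); // release all workers together to maximize contention
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          return;
        }
        guard.ensureSyntheticCore(configName);
      });
    }
    start.countDown();
    pool.shutdown();
    try {
      if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
        throw new IllegalStateException("workers did not finish in time");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting for workers", e);
    }
    return guard.addReplicaCalls();
  }

  public static void main(String[] args) {
    // prints "addReplica calls: 1"
    System.out.println("addReplica calls: " + runConcurrently(8, "conf"));
  }
}
```

   A single lock serializes creation for all configs on the node, which is 
coarse but matches the expectation that this path runs rarely. Note that a 
JVM-level lock like this only protects against races within one node's process.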
   
   # Tests
   
   Re-ran the test case 10 times and ensured that all runs passed:
   `./gradlew :solr:core:beast -Ptests.dups=10 --tests 
"org.apache.solr.search.TestCoordinatorRole.testConcurrentAccess"`
   
   # Checklist
   
   Please review the following and check all that apply:
   
   - [x] I have reviewed the guidelines for [How to 
Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms 
to the standards described there to the best of my ability.
   - [x] I have created a Jira issue and added the issue ID to my pull request 
title.
   - [ ] I have given Solr maintainers 
[access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork)
 to contribute to my PR branch. (optional but recommended)
   - [x] I have developed this patch against the `main` branch.
   - [ ] I have run `./gradlew check`.
   - [ ] I have added tests for my changes.
   - [ ] I have added documentation for the [Reference 
Guide](https://github.com/apache/solr/tree/main/solr/solr-ref-guide)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

