patsonluk opened a new pull request, #1794: URL: https://github.com/apache/solr/pull/1794
https://issues.apache.org/jira/browse/SOLR-16871

# Description

PR https://github.com/apache/solr/pull/1762 fixes various race conditions for the coordinator node. One of the fixes restricts the name of the synthetic core to ensure that at most 1 core is created per coordinator node.

Unfortunately, unit test cases failed because it could still add 2 cores for the first coordinator node (the 1st from collection creation, and then the 2nd from the addReplica call).

For example, this is the list of replicas on a failed run of `TestCoordinatorRole#testConcurrentAccess`, which is supposed to create only 4 cores, one for each of the 4 coordinator nodes. Note that `127.0.0.1:49656_solr` hosts both `core_node2` and `core_node3`:

```
core_node2:{
  "core":".sys.COORDINATOR-COLL-conf_shard1_replica_n1",
  "leader":"true",
  "node_name":"127.0.0.1:49656_solr",
  "base_url":"https://127.0.0.1:49656/solr",
  "state":"active",
  "type":"NRT",
  "force_set_state":"false"},
core_node3:{
  "node_name":"127.0.0.1:49656_solr",
  "base_url":"https://127.0.0.1:49656/solr",
  "core":".sys.COORDINATOR-COLL-conf_127.0.0.1_49656_solr",
  "state":"active",
  "type":"NRT",
  "force_set_state":"false"},
core_node4:{
  "node_name":"127.0.0.1:49647_solr",
  "base_url":"https://127.0.0.1:49647/solr",
  "core":".sys.COORDINATOR-COLL-conf_127.0.0.1_49647_solr",
  "state":"active",
  "type":"NRT",
  "force_set_state":"false"},
core_node5:{
  "node_name":"127.0.0.1:49650_solr",
  "base_url":"https://127.0.0.1:49650/solr",
  "core":".sys.COORDINATOR-COLL-conf_127.0.0.1_49650_solr",
  "state":"active",
  "type":"NRT",
  "force_set_state":"false"},
core_node6:{
  "node_name":"127.0.0.1:49653_solr",
  "base_url":"https://127.0.0.1:49653/solr",
  "core":".sys.COORDINATOR-COLL-conf_127.0.0.1_49653_solr",
  "state":"active",
  "type":"NRT",
  "force_set_state":"false"}]
```

# Solution

Instead of restricting the core name, which is hard to get right, perhaps we can synchronize the block that adds the replica. This block should rarely be called: only the very first time a coordinator node encounters a new config.
So I think it's probably better to simply synchronize the block.

# Tests

Re-ran the test cases 10 times and ensured that they all passed:

`./gradlew :solr:core:beast -Ptests.dups=10 --tests "org.apache.solr.search.TestCoordinatorRole.testConcurrentAccess"`

# Checklist

Please review the following and check all that apply:

- [x] I have reviewed the guidelines for [How to Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms to the standards described there to the best of my ability.
- [x] I have created a Jira issue and added the issue ID to my pull request title.
- [ ] I have given Solr maintainers [access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork) to contribute to my PR branch. (optional but recommended)
- [x] I have developed this patch against the `main` branch.
- [ ] I have run `./gradlew check`.
- [ ] I have added tests for my changes.
- [ ] I have added documentation for the [Reference Guide](https://github.com/apache/solr/tree/main/solr/solr-ref-guide)

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
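
For readers who want a concrete picture of the approach described under Solution, here is a minimal sketch of a synchronized check-then-create block. It is not the code from this PR or from Solr: the class `SyntheticReplicaSketch`, its methods, and the in-memory list standing in for the collection's replica list are all hypothetical, chosen only to illustrate why holding one lock around "check whether this node already has a replica, then add one" prevents a second core from appearing on the same node.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical illustration only; none of these names exist in Solr. */
final class SyntheticReplicaSketch {
  // Stand-in for the replica list of the synthetic collection (assumption).
  private final List<String> replicaNodes = new ArrayList<>();
  private final Object lock = new Object();

  /** Returns true if this call added the replica, false if the node already had one. */
  boolean ensureReplica(String nodeName) {
    // The check and the add happen under the same lock, so two concurrent
    // requests for the same node cannot both observe "no replica yet".
    synchronized (lock) {
      if (replicaNodes.contains(nodeName)) {
        return false;
      }
      replicaNodes.add(nodeName);
      return true;
    }
  }

  int replicaCount() {
    synchronized (lock) {
      return replicaNodes.size();
    }
  }

  /** Fires `threads` concurrent requests for one node and returns the resulting replica count. */
  static int runConcurrently(int threads, String nodeName) {
    SyntheticReplicaSketch sketch = new SyntheticReplicaSketch();
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    CountDownLatch start = new CountDownLatch(1);
    for (int i = 0; i < threads; i++) {
      pool.execute(() -> {
        try {
          start.await(); // release all workers at roughly the same moment
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          return;
        }
        sketch.ensureReplica(nodeName);
      });
    }
    start.countDown();
    pool.shutdown();
    try {
      if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
        throw new IllegalStateException("workers did not finish in time");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return sketch.replicaCount();
  }
}
```

With this sketch, `SyntheticReplicaSketch.runConcurrently(8, "127.0.0.1:49656_solr")` returns 1 no matter how the threads interleave. The trade-off is that all callers serialize on the lock, which seems acceptable given the description's point that the block is only reached the first time a coordinator node encounters a new config.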