patsonluk opened a new pull request, #1477: URL: https://github.com/apache/solr/pull/1477
https://issues.apache.org/jira/browse/SOLR-16712 # Description The current implementation of PRS requires an extra param to the DocCollection, the `PrsSupplier`, when `get` is called, would fetch the PRS states from ZK. The implementation of such supplier `LazyPrsSupplier` would only fetch the state on first call. While this flow does work properly, this flow might introduce some unnecessary complexity: 1. PRS entry fetching from ZK is done either during or after the `DocCollection` construction, this could be a bit inconsistent with existing non PRS `DocCollection` design which `DocCollection` is simply a immutable container that does not fetch data after its instantiation 2. The lazy fetching could introduce some uncertainties as to when exactly the fetching happens (and if any Zookeeper IO exceptions arises) My guess was that the lazy loading was introduced in https://issues.apache.org/jira/browse/SOLR-16580 as to avoid fetching the PRS states multiple times in the ctor of `DocCollection`, however, if we only fetch the `PerReplicaStates` once on update before calling the `DocCollection` ctor, and pass the `PerReplicaStates` object to the `DocCollection` instead, it can probably achieve similar result but with reduced uncertainty after `DocCollection` construction. There's another branch which experimented with making DocCollection, Slice and Replica immutable as well for PRS enabled collection [https://github.com/cowpaths/fullstory-solr/pull/84] but is beyond the discussion of this Jira ticket # Solution Changed `DocCollector` last param from `PrsSupplier` to `PerReplicaStates`, which means in order to create a PRS enabled `DocCollection` , the `PerReplicaStates` object need to be provided upfront instead of getting fetched later on, which used to happen in `DocCollection ctor -> Slice#setPrsSupplier -> Slice#findLeader() -> Replica#isLeader() -> PrsSuppler#get` In order to minimize the impact, a new builder method `buildDocCollection` is added to `DocCollection` and accept the identical signature as the `DocCollection` before this change (which accepts `PrsSupplier`) Take note that `DocCollection#copyWith(PerReplicaStates)` has also been renamed to `DocCollection#setReplicaStates(PerReplicaStates)`, this is to better describes that such method simply set the `PerReplicaStates` but does not "copy" (ie creates new `DocCollection` instance). # Tests No new test cases are introduced, all the existing unit test cases should provide sufficient coverage # Checklist Please review the following and check all that apply: - [x] I have reviewed the guidelines for [How to Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms to the standards described there to the best of my ability. - [x] I have created a Jira issue and added the issue ID to my pull request title. - [ ] I have given Solr maintainers [access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork) to contribute to my PR branch. (optional but recommended) - [x] I have developed this patch against the `main` branch. - [ ] I have run `./gradlew check`. - [ ] I have added tests for my changes. - [ ] I have added documentation for the [Reference Guide](https://github.com/apache/solr/tree/main/solr/solr-ref-guide) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org