patsonluk opened a new pull request, #1477:
URL: https://github.com/apache/solr/pull/1477

   https://issues.apache.org/jira/browse/SOLR-16712
   
   # Description
   
   The current implementation of PRS requires an extra param to the 
DocCollection, the `PrsSupplier`, when `get` is called, would fetch the PRS 
states from ZK. The implementation of such supplier `LazyPrsSupplier` would 
only fetch the state on first call. 
   
   While this flow does work properly, this flow might introduce some 
unnecessary complexity:
   1. PRS entry fetching from ZK is done either during or after the 
`DocCollection` construction, this could be a bit inconsistent with existing 
non PRS `DocCollection` design which `DocCollection` is simply a immutable 
container that does not fetch data after its instantiation
   2. The lazy fetching could introduce some uncertainties as to when exactly 
the fetching happens (and if any Zookeeper IO exceptions arises)
    
   My guess was that the lazy loading was introduced in 
https://issues.apache.org/jira/browse/SOLR-16580 as to avoid fetching the PRS 
states multiple times in the ctor of `DocCollection`, however, if we only fetch 
the `PerReplicaStates` once on update before calling the `DocCollection` ctor, 
and pass the `PerReplicaStates` object to the `DocCollection` instead, it can 
probably achieve similar result but with reduced uncertainty after 
`DocCollection` construction.
   
   There's another branch which experimented with making DocCollection, Slice 
and Replica immutable as well for PRS enabled collection 
[https://github.com/cowpaths/fullstory-solr/pull/84] but is beyond the 
discussion of this Jira ticket
   
   
   # Solution
   
   Changed `DocCollector` last param from `PrsSupplier` to `PerReplicaStates`, 
which means in order to create a PRS enabled `DocCollection` , the 
`PerReplicaStates` object need to be provided upfront instead of getting 
fetched later on, which used to happen in `DocCollection ctor -> 
Slice#setPrsSupplier -> Slice#findLeader() -> Replica#isLeader() -> 
PrsSuppler#get`
   
   In order to minimize the impact, a new builder method `buildDocCollection` 
is added to `DocCollection` and accept the identical signature as the 
`DocCollection` before this change (which accepts `PrsSupplier`)
   
   Take note that `DocCollection#copyWith(PerReplicaStates)` has also been 
renamed to `DocCollection#setReplicaStates(PerReplicaStates)`, this is to 
better describes that such method simply set the `PerReplicaStates` but does 
not "copy" (ie creates new `DocCollection` instance).
   
   # Tests
   
   No new test cases are introduced, all the existing unit test cases should 
provide sufficient coverage
   
   # Checklist
   
   Please review the following and check all that apply:
   
   - [x] I have reviewed the guidelines for [How to 
Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms 
to the standards described there to the best of my ability.
   - [x] I have created a Jira issue and added the issue ID to my pull request 
title.
   - [ ] I have given Solr maintainers 
[access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork)
 to contribute to my PR branch. (optional but recommended)
   - [x] I have developed this patch against the `main` branch.
   - [ ] I have run `./gradlew check`.
   - [ ] I have added tests for my changes.
   - [ ] I have added documentation for the [Reference 
Guide](https://github.com/apache/solr/tree/main/solr/solr-ref-guide)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org

Reply via email to