tomglk opened a new pull request #166:
URL: https://github.com/apache/solr/pull/166


   # Description
   This PR refers to the issue 
[SOLR-15450](https://issues.apache.org/jira/browse/SOLR-15450).
   
   It introduces a new `PrefetchingFieldValueFeature` which can be used for 
stored fields without docValues.
   Each such feature prefetches the stored fields used by itself and by all 
other `PrefetchingFieldValueFeatures`, so that they are available in the cache.
   This improves performance because the index only needs to be accessed once, 
when all prefetch fields are loaded together, instead of reading the fields 
used by the individual features separately.
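
The effect described above can be illustrated with a toy model. This is only a sketch: `CountingDocStore` is a hypothetical stand-in with a simple field cache and is not Solr's document fetcher. It assumes that one index access can load several missing fields together and that cached fields need no further index access.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model of a document store with a field cache; hypothetical, not Solr's API.
class CountingDocStore {
  private final Map<String, String> storedFields; // stored fields of one document
  private final Map<String, String> cache = new HashMap<>();
  private int indexReads = 0;

  CountingDocStore(Map<String, String> storedFields) {
    this.storedFields = storedFields;
  }

  // Returns the requested fields, reading from the "index" only when
  // at least one requested field is not cached yet.
  Map<String, String> doc(Set<String> fields) {
    Set<String> missing = new HashSet<>(fields);
    missing.removeAll(cache.keySet());
    if (!missing.isEmpty()) {
      indexReads++; // one index access loads all missing fields together
      for (String field : missing) {
        cache.put(field, storedFields.get(field));
      }
    }
    Map<String, String> result = new HashMap<>();
    for (String field : fields) {
      result.put(field, cache.get(field));
    }
    return result;
  }

  int indexReads() {
    return indexReads;
  }
}
```

In this model, three features that each request only their own field cause three index accesses, while three features that all request the same combined set of fields cause only one.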
   
   # Solution
   
   The `PrefetchingFieldValueFeature` extends the `FieldValueFeature`.
   It is only used to get the values of fields that are stored and have no 
docValues. For all other fields it delegates the work to the 
`FieldValueFeature`, because in those cases we want to make use of the docValues.
   The `PrefetchingFieldValueFeature` keeps a list of all fields that should be 
fetched and uses these fields when requesting the document from the docFetcher.
   ```java
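   // All stored fields that should be fetched together
   private Set<String> prefetchFields;
   
   // Request the document with all prefetch fields in a single call
   final Document document = docFetcher.doc(context.docBase + itr.docID(), prefetchFields);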
   ```
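The delegation rule described above can be sketched as a small decision function. The names `FieldProps`, `ValueSource` and `ValueSourceChooser` are hypothetical and only illustrate the rule; they are not part of this PR or of Solr.

```java
// Hypothetical stand-in for the relevant properties of a schema field.
final class FieldProps {
  final String name;
  final boolean stored;
  final boolean docValues;

  FieldProps(String name, boolean stored, boolean docValues) {
    this.name = name;
    this.stored = stored;
    this.docValues = docValues;
  }
}

// Where the feature value comes from in this sketch.
enum ValueSource {
  PREFETCHED_STORED_FIELD,
  DELEGATE_TO_FIELD_VALUE_FEATURE
}

final class ValueSourceChooser {
  private ValueSourceChooser() {}

  // Prefetching applies only to stored fields without docValues;
  // every other combination is left to the parent feature.
  static ValueSource choose(FieldProps field) {
    if (field.stored && !field.docValues) {
      return ValueSource.PREFETCHED_STORED_FIELD;
    }
    return ValueSource.DELEGATE_TO_FIELD_VALUE_FEATURE;
  }
}
```

Under this assumption, a field that is both stored and has docValues is still delegated, so the docValues are used whenever they exist.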
   
   The `prefetchFields` are updated after all features have been loaded. To do 
so, the code keeps track of the feature stores that were updated, iterates over 
all `PrefetchingFieldValueFeatures` of each store, collects their fields, and 
then sets the combined set of fields on every one of these features.
   ```java
   // Names of the feature stores that were updated
   Set<String> updatedFeatureStores = new HashSet<>();
   
   preparePrefetchingFieldValueFeatures(updatedFeatureStores);
   
   private void preparePrefetchingFieldValueFeatures(Set<String> updatedFeatureStores) {
     for (String featureStoreName : updatedFeatureStores) {
       final Set<PrefetchingFieldValueFeature> prefetchingFieldValueFeatures =
           getFeatureStore(featureStoreName).getFeatures().stream()
               .filter(feature -> feature instanceof PrefetchingFieldValueFeature)
               .map(feature -> (PrefetchingFieldValueFeature) feature)
               .collect(Collectors.toSet());
   
       // Collect the fields used by all prefetching features of this store
       final Set<String> prefetchFields = new HashSet<>();
       for (PrefetchingFieldValueFeature feature : prefetchingFieldValueFeatures) {
         prefetchFields.add(feature.getField());
       }
       // Give every prefetching feature the complete set of fields
       for (PrefetchingFieldValueFeature feature : prefetchingFieldValueFeatures) {
         feature.setPrefetchFields(prefetchFields);
       }
     }
   }
   ```
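For readers who want to try the collect-and-distribute step in isolation, here is a minimal self-contained sketch. `SketchFeature`, `SketchPrefetchingFeature` and `PrefetchFieldCollector` are hypothetical stand-ins for the real LTR classes, and the map of store name to feature list is an assumption made for illustration, not how the feature stores are actually held.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical minimal feature types; the real classes live in Solr's LTR module.
class SketchFeature {
  final String name;

  SketchFeature(String name) {
    this.name = name;
  }
}

class SketchPrefetchingFeature extends SketchFeature {
  final String field;
  Set<String> prefetchFields = Collections.emptySet();

  SketchPrefetchingFeature(String name, String field) {
    super(name);
    this.field = field;
  }
}

class PrefetchFieldCollector {
  // For each updated store: pick the prefetching features, build the union
  // of their fields, and hand that union to every one of them.
  static void prepare(Map<String, List<SketchFeature>> stores, Set<String> updatedStores) {
    for (String storeName : updatedStores) {
      List<SketchPrefetchingFeature> prefetching =
          stores.getOrDefault(storeName, Collections.emptyList()).stream()
              .filter(SketchPrefetchingFeature.class::isInstance)
              .map(SketchPrefetchingFeature.class::cast)
              .collect(Collectors.toList());

      Set<String> union =
          prefetching.stream().map(feature -> feature.field).collect(Collectors.toSet());
      // Shared read-only view, so no feature can change the set for the others
      Set<String> shared = Collections.unmodifiableSet(union);

      for (SketchPrefetchingFeature feature : prefetching) {
        feature.prefetchFields = shared;
      }
    }
  }
}
```

Features in stores that are not listed as updated keep their previous `prefetchFields`, and features of other types are ignored, which mirrors the filtering in the snippet above.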
   
   # Tests
   **TestPrefetchingFieldValueFeature** tests that
   * the `prefetchFields` are only updated after all features were loaded and 
the basic setting works
   * adding fields to an existing feature works
   
   **TestLTROnSolrCloudWithPrefetchingFieldValueFeature** tests that
   * the basic reRanking works
   * the `PrefetchingFieldValueFeatureScorers` correctly prefetch the fields 
(using assertions on the *hasBeenLoaded*-property of LazyFields and checking 
the StoredFields of the Document)
   
   
   # Checklist
   
   Please review the following and check all that apply:
   
   - [X] I have reviewed the guidelines for [How to 
Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms 
to the standards described there to the best of my ability.
   - [X] I have created a Jira issue and added the issue ID to my pull request 
title.
   - [X] I have given Solr maintainers 
[access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork)
 to contribute to my PR branch. (optional but recommended)
   - [X] I have developed this patch against the `main` branch.
   - [X] I have run `./gradlew check -x test`.
   - [ ] I have run `./gradlew check`.
   - [X] I have run the tests in `org.apache.solr.ltr.test`.
   - [X] I have added tests for my changes.
   - [X] I have added documentation for the [Reference 
Guide](https://github.com/apache/solr/tree/main/solr/solr-ref-guide) **NOTE:** 
only sparsely, as I was not sure how much detail I should provide

