tomglk opened a new pull request #123:
URL: https://github.com/apache/solr/pull/123


   
   
   # Description
   
   This PR addresses 
[SOLR-12697](https://issues.apache.org/jira/browse/SOLR-12697).
   The problem is that the current FieldValueFeature only works for stored 
fields.
   This is not optimal: using DocValues is faster for this use case, and 
having to store fields only so that they can be used for LTR increases the 
index size.
   
   # Solution
   
   **Note:** This PR is based on the work of Stanislav Livotov and Christine 
Poerschke, which can be seen in the Jira ticket.
   
   It uses the latest patch (17th May 2019) from the Jira ticket as its base. I 
combined that with suggestions from Christine Poerschke and my own approach to 
the problem.
   
   This PR adds the DocValuesFieldValueFeatureScorer as a new Scorer used by 
the FieldValueFeatureWeight.
   The new scorer is used whenever a field has docValues and is not stored. 
It therefore does not affect the current functionality and is only applied to 
fields that could not be used before.
   The new scorer checks which type of docValues a field has and handles the 
NUMERIC and SORTED types. For NUMERIC fields it simply uses the value; a SORTED 
value is parsed as a number or a boolean flag.
   
   # Tests
   
   New fields with docValues=true were added to the schema.xml so that 
TestLTROnSolrCloud can verify that feature requests also return values for 
these fields.
   The TestLTRReRankingPipeline was changed from a SolrTestCase to a 
SolrTestCaseJ4 in order to improve readability.
   
   I ran all tests in the package `org.apache.solr.ltr`.
   
   # Please note
   
   I am aware that the structure of the FieldValueFeature is now quite hard to 
read and that the new Scorer is a bit hidden. I decided to add another nested 
class to the FieldValueFeatureWeight to avoid duplicating a lot of code just to 
change the inner functionality.
   
   Unit tests for handleBytesRef are still missing. I plan to add them, but 
wanted to open the PR now so that the general approach can be reviewed and 
discussed.
   
   # Checklist
   
   Please review the following and check all that apply:
   
   - [X] I have reviewed the guidelines for [How to 
Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms 
to the standards described there to the best of my ability.
   - [X] I have created a Jira issue and added the issue ID to my pull request 
title. (**Issue was already present**)
   - [X] I have given Solr maintainers 
[access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork)
 to contribute to my PR branch. (optional but recommended)
   - [X] I have developed this patch against the `main` branch.
   - [ ] I have run `./gradlew check`.
   - [X] I have run `./gradlew check -x test`.
   - [X] I have added tests for my changes.
   - [ ] I have added documentation for the [Reference 
Guide](https://github.com/apache/solr/tree/main/solr/solr-ref-guide)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


