[GitHub] [lucene-solr] cpoerschke commented on a change in pull request #1571: SOLR-14560: Interleaving for Learning To Rank

GitBox Wed, 04 Nov 2020 09:56:11 -0800


cpoerschke commented on a change in pull request #1571:
URL: https://github.com/apache/lucene-solr/pull/1571#discussion_r517514900




##########
File path: solr/contrib/ltr/src/java/org/apache/solr/ltr/LTRScoringQuery.java
##########
@@ -73,6 +74,8 @@
   final private Map<String,String[]> efi;
   // Original solr query used to fetch matching documents
   private Query originalQuery;
+  // Model was picked for this Docs
+  private Set<Integer> pickedInterleavingDocIds;

Review comment:
       4.1/n The addition of a new member here which is only sometimes 
applicable got me wondering about possibly having `LTRInterleavingScoringQuery 
extends LTRScoringQuery` inheritance.

##########
File path: 
solr/contrib/ltr/src/java/org/apache/solr/ltr/search/LTRQParserPlugin.java
##########
@@ -146,93 +149,114 @@ public LTRQParser(String qstr, SolrParams localParams, 
SolrParams params,
     @Override
     public Query parse() throws SyntaxError {
       // ReRanking Model
-      final String modelName = localParams.get(LTRQParserPlugin.MODEL);
-      if ((modelName == null) || modelName.isEmpty()) {
+      final String[] modelNames = 
localParams.getParams(LTRQParserPlugin.MODEL);
+      if ((modelNames == null) || modelNames.length==0 || 
modelNames[0].isEmpty()) {
         throw new SolrException(SolrException.ErrorCode.BAD_REQUEST,
             "Must provide model in the request");
       }
-
-      final LTRScoringModel ltrScoringModel = mr.getModel(modelName);
-      if (ltrScoringModel == null) {
-        throw new SolrException(SolrException.ErrorCode.BAD_REQUEST,
-            "cannot find " + LTRQParserPlugin.MODEL + " " + modelName);
-      }
-
-      final String modelFeatureStoreName = 
ltrScoringModel.getFeatureStoreName();
-      final boolean extractFeatures = 
SolrQueryRequestContextUtils.isExtractingFeatures(req);
-      final String fvStoreName = 
SolrQueryRequestContextUtils.getFvStoreName(req);
-      // Check if features are requested and if the model feature store and 
feature-transform feature store are the same
-      final boolean featuresRequestedFromSameStore = 
(modelFeatureStoreName.equals(fvStoreName) || fvStoreName == null) ? 
extractFeatures:false;
-      if (threadManager != null) {
-        
threadManager.setExecutor(req.getCore().getCoreContainer().getUpdateShardHandler().getUpdateExecutor());
-      }
-      final LTRScoringQuery scoringQuery = new LTRScoringQuery(ltrScoringModel,
-          extractEFIParams(localParams),
-          featuresRequestedFromSameStore, threadManager);
-
-      // Enable the feature vector caching if we are extracting features, and 
the features
-      // we requested are the same ones we are reranking with
-      if (featuresRequestedFromSameStore) {
-        scoringQuery.setFeatureLogger( 
SolrQueryRequestContextUtils.getFeatureLogger(req) );
+     
+      LTRScoringQuery[] rerankingQueries = new 
LTRScoringQuery[modelNames.length];
+      for (int i = 0; i < modelNames.length; i++) {
+        final LTRScoringQuery rerankingQuery;
+        if (!ORIGINAL_RANKING.equals(modelNames[i])) {
+          final LTRScoringModel ltrScoringModel = mr.getModel(modelNames[i]);
+          if (ltrScoringModel == null) {
+            throw new SolrException(SolrException.ErrorCode.BAD_REQUEST,
+                "cannot find " + LTRQParserPlugin.MODEL + " " + modelNames[i]);
+          }
+          final String modelFeatureStoreName = 
ltrScoringModel.getFeatureStoreName();
+          final boolean extractFeatures = 
SolrQueryRequestContextUtils.isExtractingFeatures(req);
+          final String fvStoreName = 
SolrQueryRequestContextUtils.getFvStoreName(req);        // Check if features 
are requested and if the model feature store and feature-transform feature 
store are the same
+          final boolean featuresRequestedFromSameStore = 
(modelFeatureStoreName.equals(fvStoreName) || fvStoreName == null) ? 
extractFeatures : false;
+          if (threadManager != null) {
+            
threadManager.setExecutor(req.getCore().getCoreContainer().getUpdateShardHandler().getUpdateExecutor());
+          }
+          rerankingQuery = new LTRScoringQuery(ltrScoringModel,
+              extractEFIParams(localParams),
+              featuresRequestedFromSameStore, threadManager);
+
+          // Enable the feature vector caching if we are extracting features, 
and the features
+          // we requested are the same ones we are reranking with
+          if (featuresRequestedFromSameStore) {
+            rerankingQuery.setFeatureLogger( 
SolrQueryRequestContextUtils.getFeatureLogger(req) );
+          }
+        }else{
+          rerankingQuery = new LTRScoringQuery(null);
+        }
+
+        // External features
+        rerankingQuery.setRequest(req);
+        rerankingQueries[i] = rerankingQuery;
       }
-      SolrQueryRequestContextUtils.setScoringQuery(req, scoringQuery);
 
+      SolrQueryRequestContextUtils.setScoringQuery(req, rerankingQueries);
       int reRankDocs = localParams.getInt(RERANK_DOCS, DEFAULT_RERANK_DOCS);
       if (reRankDocs <= 0) {
         throw new SolrException(SolrException.ErrorCode.BAD_REQUEST,
-          "Must rerank at least 1 document");
+            "Must rerank at least 1 document");
+      }
+      if (rerankingQueries.length == 1) {
+        return new LTRQuery(rerankingQueries[0], reRankDocs);
+      } else {
+        return new LTRQuery(rerankingQueries, reRankDocs);
       }
-
-      // External features
-      scoringQuery.setRequest(req);
-
-      return new LTRQuery(scoringQuery, reRankDocs);
     }
   }
+  
+  public static boolean isOriginalRanking(LTRScoringQuery rerankingQuery){
+    return rerankingQuery.getScoringModel() == null;
+  }

Review comment:
       1/n Instead of having a static `isOriginalRanking(LTRScoringQuery 
rerankingQuery)` utility method here we could have a 
`OriginalRankingLTRScoringQuery extends LTRScoringQuery` class and replace 
`LTRQParserPlugin.isOriginalRanking(query)` calls with `query instanceof 
OriginalRankingLTRScoringQuery` expressions.
   
   
https://github.com/cpoerschke/lucene-solr/commit/6204ead58f1b6378f26d014dbba062423257f04d
 explores that.
   
   In isolation the change appears to be somewhat stylistic only but it would 
later  (2/n and 4/n) allow for other possibilities too.

##########
File path: 
solr/contrib/ltr/src/java/org/apache/solr/ltr/response/transform/LTRInterleavingTransformerFactory.java
##########
@@ -0,0 +1,120 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.ltr.response.transform;
+
+import java.io.IOException;
+import org.apache.solr.common.SolrDocument;
+import org.apache.solr.common.params.SolrParams;
+import org.apache.solr.common.util.NamedList;
+import org.apache.solr.ltr.LTRScoringQuery;
+import org.apache.solr.ltr.SolrQueryRequestContextUtils;
+import org.apache.solr.request.SolrQueryRequest;
+import org.apache.solr.response.ResultContext;
+import org.apache.solr.response.transform.DocTransformer;
+import org.apache.solr.response.transform.TransformerFactory;
+import org.apache.solr.util.SolrPluginUtils;
+
+import static org.apache.solr.ltr.search.LTRQParserPlugin.ORIGINAL_RANKING;
+import static org.apache.solr.ltr.search.LTRQParserPlugin.isOriginalRanking;

Review comment:
       2/n 
https://github.com/cpoerschke/lucene-solr/commit/be20b3a5a94fece4c8387d41831620a0f9497d39
 explores making `LTRQParserPlugin.ORIGINAL_RANKING` private. A stylistic thing 
to some extent though it would in future also make it easier to pass in their 
own alternative to the special `"_OriginalRanking_"` value as a parameter.

##########
File path: 
solr/contrib/ltr/src/java/org/apache/solr/ltr/response/transform/LTRFeatureLoggerTransformerFactory.java
##########
@@ -271,17 +287,23 @@ public void transform(SolrDocument doc, int docid)
 
     private void implTransform(SolrDocument doc, int docid, Float score)
         throws IOException {
-      Object fv = featureLogger.getFeatureVector(docid, scoringQuery, 
searcher);
-      if (fv == null) { // FV for this document was not in the cache
-        fv = featureLogger.makeFeatureVector(
-            LTRRescorer.extractFeaturesInfo(
-                modelWeight,
-                docid,
-                (docsWereNotReranked ? score : null),
-                leafContexts));
+      LTRScoringQuery rerankingQuery = rerankingQueries[0];
+      LTRScoringQuery.ModelWeight rerankingModelWeight = modelWeights[0];
+      if (rerankingQueries.length > 1 && 
rerankingQueries[1].getPickedInterleavingDocIds().contains(docid)) {
+        rerankingQuery = rerankingQueries[1];

Review comment:
       5/n When doing the 4/n stuff a question occurred to me here: 
`rerankingQueries[0]` and `modelWeights[0]` are the initial values. If the 
query is changed to `rerankingQueries[1]` then should the weight also be 
changed to `modelWeights[1]`? (Open question, haven't analysed it further yet 
nor tried it out.)

##########
File path: 
solr/contrib/ltr/src/java/org/apache/solr/ltr/search/LTRQParserPlugin.java
##########
@@ -146,93 +149,114 @@ public LTRQParser(String qstr, SolrParams localParams, 
SolrParams params,
     @Override
     public Query parse() throws SyntaxError {
       // ReRanking Model
-      final String modelName = localParams.get(LTRQParserPlugin.MODEL);
-      if ((modelName == null) || modelName.isEmpty()) {
+      final String[] modelNames = 
localParams.getParams(LTRQParserPlugin.MODEL);
+      if ((modelNames == null) || modelNames.length==0 || 
modelNames[0].isEmpty()) {
         throw new SolrException(SolrException.ErrorCode.BAD_REQUEST,
             "Must provide model in the request");
       }
-
-      final LTRScoringModel ltrScoringModel = mr.getModel(modelName);
-      if (ltrScoringModel == null) {
-        throw new SolrException(SolrException.ErrorCode.BAD_REQUEST,
-            "cannot find " + LTRQParserPlugin.MODEL + " " + modelName);
-      }
-
-      final String modelFeatureStoreName = 
ltrScoringModel.getFeatureStoreName();
-      final boolean extractFeatures = 
SolrQueryRequestContextUtils.isExtractingFeatures(req);
-      final String fvStoreName = 
SolrQueryRequestContextUtils.getFvStoreName(req);
-      // Check if features are requested and if the model feature store and 
feature-transform feature store are the same
-      final boolean featuresRequestedFromSameStore = 
(modelFeatureStoreName.equals(fvStoreName) || fvStoreName == null) ? 
extractFeatures:false;
-      if (threadManager != null) {
-        
threadManager.setExecutor(req.getCore().getCoreContainer().getUpdateShardHandler().getUpdateExecutor());
-      }
-      final LTRScoringQuery scoringQuery = new LTRScoringQuery(ltrScoringModel,
-          extractEFIParams(localParams),
-          featuresRequestedFromSameStore, threadManager);
-
-      // Enable the feature vector caching if we are extracting features, and 
the features
-      // we requested are the same ones we are reranking with
-      if (featuresRequestedFromSameStore) {
-        scoringQuery.setFeatureLogger( 
SolrQueryRequestContextUtils.getFeatureLogger(req) );
+     
+      LTRScoringQuery[] rerankingQueries = new 
LTRScoringQuery[modelNames.length];
+      for (int i = 0; i < modelNames.length; i++) {
+        final LTRScoringQuery rerankingQuery;
+        if (!ORIGINAL_RANKING.equals(modelNames[i])) {
+          final LTRScoringModel ltrScoringModel = mr.getModel(modelNames[i]);
+          if (ltrScoringModel == null) {
+            throw new SolrException(SolrException.ErrorCode.BAD_REQUEST,
+                "cannot find " + LTRQParserPlugin.MODEL + " " + modelNames[i]);
+          }
+          final String modelFeatureStoreName = 
ltrScoringModel.getFeatureStoreName();
+          final boolean extractFeatures = 
SolrQueryRequestContextUtils.isExtractingFeatures(req);
+          final String fvStoreName = 
SolrQueryRequestContextUtils.getFvStoreName(req);        // Check if features 
are requested and if the model feature store and feature-transform feature 
store are the same
+          final boolean featuresRequestedFromSameStore = 
(modelFeatureStoreName.equals(fvStoreName) || fvStoreName == null) ? 
extractFeatures : false;
+          if (threadManager != null) {
+            
threadManager.setExecutor(req.getCore().getCoreContainer().getUpdateShardHandler().getUpdateExecutor());
+          }
+          rerankingQuery = new LTRScoringQuery(ltrScoringModel,
+              extractEFIParams(localParams),
+              featuresRequestedFromSameStore, threadManager);
+
+          // Enable the feature vector caching if we are extracting features, 
and the features
+          // we requested are the same ones we are reranking with
+          if (featuresRequestedFromSameStore) {
+            rerankingQuery.setFeatureLogger( 
SolrQueryRequestContextUtils.getFeatureLogger(req) );
+          }
+        }else{
+          rerankingQuery = new LTRScoringQuery(null);
+        }
+
+        // External features
+        rerankingQuery.setRequest(req);
+        rerankingQueries[i] = rerankingQuery;
       }
-      SolrQueryRequestContextUtils.setScoringQuery(req, scoringQuery);
 
+      SolrQueryRequestContextUtils.setScoringQuery(req, rerankingQueries);
       int reRankDocs = localParams.getInt(RERANK_DOCS, DEFAULT_RERANK_DOCS);
       if (reRankDocs <= 0) {
         throw new SolrException(SolrException.ErrorCode.BAD_REQUEST,
-          "Must rerank at least 1 document");
+            "Must rerank at least 1 document");
+      }
+      if (rerankingQueries.length == 1) {
+        return new LTRQuery(rerankingQueries[0], reRankDocs);
+      } else {
+        return new LTRQuery(rerankingQueries, reRankDocs);
       }
-
-      // External features
-      scoringQuery.setRequest(req);
-
-      return new LTRQuery(scoringQuery, reRankDocs);
     }
   }
+  
+  public static boolean isOriginalRanking(LTRScoringQuery rerankingQuery){
+    return rerankingQuery.getScoringModel() == null;
+  }
 
   /**
    * A learning to rank Query, will incapsulate a learning to rank model, and 
delegate to it the rescoring
    * of the documents.
    **/
   public class LTRQuery extends AbstractReRankQuery {
-    private final LTRScoringQuery scoringQuery;
+    private final LTRScoringQuery[] rerankingQueries;
 
-    public LTRQuery(LTRScoringQuery scoringQuery, int reRankDocs) {
-      super(defaultQuery, reRankDocs, new LTRRescorer(scoringQuery));
-      this.scoringQuery = scoringQuery;
+    public LTRQuery(LTRScoringQuery[] rerankingQueries, int rerankDocs) {
+      super(defaultQuery, rerankDocs, new 
LTRInterleavingRescorer(rerankingQueries));
+      this.rerankingQueries = rerankingQueries;
+    }
+
+    public LTRQuery(LTRScoringQuery rerankingQuery, int rerankDocs) {
+      super(defaultQuery, rerankDocs, new LTRRescorer(rerankingQuery));
+      this.rerankingQueries = new LTRScoringQuery[]{rerankingQuery};

Review comment:
       4.2/n The differentation of behaviour here based on which constructor 
variant was called got me wondering about possibly having `LTRInterleavingQuery 
extends LTRQuery` inheritance.
   
   
https://github.com/cpoerschke/lucene-solr/commit/ff2cfe77538ee90cc7e10741ad06bf7347754d4a
 explores factoring out of `LTRInterleaving[Scoring]Query` classes.

##########
File path: 
solr/contrib/ltr/src/java/org/apache/solr/ltr/SolrQueryRequestContextUtils.java
##########
@@ -47,12 +47,12 @@ public static FeatureLogger 
getFeatureLogger(SolrQueryRequest req) {
 
   /** scoring query accessors **/
 
-  public static void setScoringQuery(SolrQueryRequest req, LTRScoringQuery 
scoringQuery) {
-    req.getContext().put(SCORING_QUERY, scoringQuery);
+  public static void setScoringQuery(SolrQueryRequest req, LTRScoringQuery[] 
scoringQueries) {

Review comment:
       3/n `getScoringQuery` already is renamed to `getScoringQueries` to 
reflect the changed signature, how about renaming `setScoringQuery` to 
`setScoringQueries` also?  
https://github.com/cpoerschke/lucene-solr/commit/2579aee9f2e76321a08754d89630f635ae6899b7
 does that plus `SCORING_QUERY` to `SCORING_QUERIES` too.

##########
File path: 
solr/contrib/ltr/src/java/org/apache/solr/ltr/search/LTRQParserPlugin.java
##########
@@ -146,93 +149,114 @@ public LTRQParser(String qstr, SolrParams localParams, 
SolrParams params,
     @Override
     public Query parse() throws SyntaxError {
       // ReRanking Model
-      final String modelName = localParams.get(LTRQParserPlugin.MODEL);
-      if ((modelName == null) || modelName.isEmpty()) {
+      final String[] modelNames = 
localParams.getParams(LTRQParserPlugin.MODEL);
+      if ((modelNames == null) || modelNames.length==0 || 
modelNames[0].isEmpty()) {
         throw new SolrException(SolrException.ErrorCode.BAD_REQUEST,
             "Must provide model in the request");
       }
-
-      final LTRScoringModel ltrScoringModel = mr.getModel(modelName);
-      if (ltrScoringModel == null) {
-        throw new SolrException(SolrException.ErrorCode.BAD_REQUEST,
-            "cannot find " + LTRQParserPlugin.MODEL + " " + modelName);
-      }
-
-      final String modelFeatureStoreName = 
ltrScoringModel.getFeatureStoreName();
-      final boolean extractFeatures = 
SolrQueryRequestContextUtils.isExtractingFeatures(req);
-      final String fvStoreName = 
SolrQueryRequestContextUtils.getFvStoreName(req);
-      // Check if features are requested and if the model feature store and 
feature-transform feature store are the same
-      final boolean featuresRequestedFromSameStore = 
(modelFeatureStoreName.equals(fvStoreName) || fvStoreName == null) ? 
extractFeatures:false;
-      if (threadManager != null) {
-        
threadManager.setExecutor(req.getCore().getCoreContainer().getUpdateShardHandler().getUpdateExecutor());
-      }
-      final LTRScoringQuery scoringQuery = new LTRScoringQuery(ltrScoringModel,
-          extractEFIParams(localParams),
-          featuresRequestedFromSameStore, threadManager);
-
-      // Enable the feature vector caching if we are extracting features, and 
the features
-      // we requested are the same ones we are reranking with
-      if (featuresRequestedFromSameStore) {
-        scoringQuery.setFeatureLogger( 
SolrQueryRequestContextUtils.getFeatureLogger(req) );
+     
+      LTRScoringQuery[] rerankingQueries = new 
LTRScoringQuery[modelNames.length];
+      for (int i = 0; i < modelNames.length; i++) {
+        final LTRScoringQuery rerankingQuery;
+        if (!ORIGINAL_RANKING.equals(modelNames[i])) {
+          final LTRScoringModel ltrScoringModel = mr.getModel(modelNames[i]);
+          if (ltrScoringModel == null) {
+            throw new SolrException(SolrException.ErrorCode.BAD_REQUEST,
+                "cannot find " + LTRQParserPlugin.MODEL + " " + modelNames[i]);
+          }
+          final String modelFeatureStoreName = 
ltrScoringModel.getFeatureStoreName();
+          final boolean extractFeatures = 
SolrQueryRequestContextUtils.isExtractingFeatures(req);
+          final String fvStoreName = 
SolrQueryRequestContextUtils.getFvStoreName(req);        // Check if features 
are requested and if the model feature store and feature-transform feature 
store are the same
+          final boolean featuresRequestedFromSameStore = 
(modelFeatureStoreName.equals(fvStoreName) || fvStoreName == null) ? 
extractFeatures : false;
+          if (threadManager != null) {
+            
threadManager.setExecutor(req.getCore().getCoreContainer().getUpdateShardHandler().getUpdateExecutor());
+          }

Review comment:
       6/n Also as part of 4/n noticed that here this if-then-set likely need 
not be inside the for-loop? The loop runs only once or twice so efficiency wise 
no concerns but from code reading perspective it not being inside the loop 
would be clearer.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[GitHub] [lucene-solr] cpoerschke commented on a change in pull request #1571: SOLR-14560: Interleaving for Learning To Rank

Reply via email to