tomglk commented on a change in pull request #166:
URL: https://github.com/apache/solr/pull/166#discussion_r650532471



##########
File path: 
solr/contrib/ltr/src/java/org/apache/solr/ltr/feature/PrefetchingFieldValueFeature.java
##########
@@ -0,0 +1,134 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.ltr.feature;
+
+import com.google.common.annotations.VisibleForTesting;
+import org.apache.lucene.document.Document;
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.search.DocIdSetIterator;
+import org.apache.lucene.search.IndexSearcher;
+import org.apache.lucene.search.Query;
+import org.apache.solr.common.util.StrUtils;
+import org.apache.solr.request.SolrQueryRequest;
+import org.apache.solr.search.SolrDocumentFetcher;
+
+import java.io.IOException;
+import java.util.Map;
+import java.util.Set;
+
+/**
+ * This feature returns the value of a field in the current document.
+ * The field must have stored="true" or docValues="true" properties.
+ * Example configuration:
+ * <pre>{
+  "name":  "rawHits",
+  "class": "org.apache.solr.ltr.feature.PrefetchingFieldValueFeature",
+  "params": {
+      "field": "hits"
+  }
+}</pre>
+ */
+public class PrefetchingFieldValueFeature extends FieldValueFeature {
+  // used to store all fields from all PrefetchingFieldValueFeatures
+  private Set<String> prefetchFields;
+  // can be used for debugging to only fetch the field this features uses
+  public static final String DISABLE_PREFETCHING_FIELD_VALUE_FEATURE = 
"disablePrefetchingFieldValueFeature";
+
+  public void setPrefetchFields(Set<String> fields) {
+    prefetchFields = fields;
+  }
+
+  @VisibleForTesting
+  public Set<String> getPrefetchFields(){
+    return prefetchFields;
+  }
+
+  public PrefetchingFieldValueFeature(String name, Map<String,Object> params) {
+    super(name, params);
+  }
+
+  @Override
+  public FeatureWeight createWeight(IndexSearcher searcher, boolean 
needsScores,
+      SolrQueryRequest request, Query originalQuery, Map<String,String[]> efi)
+          throws IOException {
+    return new PrefetchingFieldValueFeatureWeight(searcher, request, 
originalQuery, efi);
+  }
+
+  public class PrefetchingFieldValueFeatureWeight extends 
FieldValueFeatureWeight {
+    private final SolrDocumentFetcher docFetcher;
+    private final Boolean disablePrefetching;
+
+    public PrefetchingFieldValueFeatureWeight(IndexSearcher searcher,
+        SolrQueryRequest request, Query originalQuery, Map<String,String[]> 
efi) {
+      super(searcher, request, originalQuery, efi);
+
+      disablePrefetching = 
request.getParams().getBool(DISABLE_PREFETCHING_FIELD_VALUE_FEATURE, false);

Review comment:
       Regarding `ltr.PrefetchingFieldValueFeature.exclusions`:
   > I don't know if practically additional prefetch fields could work out as 
costly enough to justify the operational efforts of the outlined 
ltr.PrefetchingFieldValueFeature.exclusions parameter use
   
   I was surprised by your point that multiple models may be using the same 
store. I just assumed that people would use one feature-store per model.
   This could really lead to a lot of unnecessary fetching, especially is there 
is not much overlap between the features of the models.
   **However**, after taking a step back and writing the test ( 4d84293 ), this 
`costly enough to justify` thought of you stayed in my mind.
   The exclusion is not necessary to handle problems with only indexed or not 
existing fields.
   Also, to put it very bluntly, I am wondering how much of user's 
misconfiguration we should try to handle in our code.
   If there are two models that barely use the same features, which really 
would result in a lot of unnecessary fetching to the point where performance 
worsens, I would expect the user to use a separate feature-store per model.
   Maybe his could even be a hint / recommendation in the documentation.
   
   This statement from you:
   > index corruption is (hopefully) a rare source of errors, and in such a 
scenario rethrowing the exception with the details would seem most useful.
   
   made me want to remove the `disablePrefetchingFieldValueFeature` completely. 
While implementing it I felt like there would be more value to it, but now it 
seems like this param would only be useful in a situation where something is 
really wrong with the index (assuming that 'useful' means that there are quite 
some `Unable to extract feature...`-exceptions). 
   In that case the failing fetch of the PFVF seems to be the smallest problem 
besides like being unable to properly search anymore.
   
   So I am thinking about:
   ```java
   protected Document fetchDocument() throws FeatureException {
     try { 
       return docFetcher.doc(context.docBase + itr.docID(), prefetchFields);
     } catch (final IOException e) {
       final String prefetchedFields = StrUtils.join(prefetchFields, ',');
       throw new FeatureException(...);
     }
   }
   ```
   What do you think about the above change?
   We could also provide some fail-safety with your suggested try-catch. (Good 
structure in the method / comments could make up for the complexity. And maybe 
log an error instead of a warning?)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org

Reply via email to