alessandrobenedetti commented on code in PR #3433:
URL: https://github.com/apache/solr/pull/3433#discussion_r2231064740
##########
solr/modules/ltr/src/java/org/apache/solr/ltr/LTRScoringQuery.java:
##########
@@ -551,16 +554,117 @@ public DocIdSetIterator iterator() {
return featureTraversalScorer.iterator();
}
- private class SparseModelScorer extends Scorer {
- private final DisiPriorityQueue subScorers;
- private final ScoringQuerySparseIterator itr;
+ public void setIsLogging(boolean isLogging) {
+ this.isLogging = isLogging;
+ }
- private int targetDoc = -1;
- private int activeDoc = -1;
+ abstract class FeatureTraversalScorer extends Scorer {
+ protected int targetDoc = -1;
+ protected int activeDoc = -1;
+ protected LeafReaderContext leafContext;
- private SparseModelScorer(
- Weight weight, List<Feature.FeatureWeight.FeatureScorer>
featureScorers) {
+ protected FeatureTraversalScorer(Weight weight, LeafReaderContext
leafContext) {
super(weight);
+ this.leafContext = leafContext;
+ }
+
+ @Override
+ public float score() throws IOException {
+ reset();
+ fillFeaturesInfo();
+ return makeNormalizedFeaturesAndScore();
+ }
+
+ @Override
+ public float getMaxScore(int upTo) throws IOException {
+ return Float.POSITIVE_INFINITY;
+ }
+
+ private void fillFeaturesInfo() throws IOException {
Review Comment:
not sure this is a 'scorer' responsibility, it feels more like
'FeatureExtractor' kind of responsibility, please investigate a new class that
can be used by both the scorer and logger
##########
solr/modules/ltr/src/java/org/apache/solr/ltr/LTRScoringQuery.java:
##########
@@ -551,16 +554,117 @@ public DocIdSetIterator iterator() {
return featureTraversalScorer.iterator();
}
- private class SparseModelScorer extends Scorer {
- private final DisiPriorityQueue subScorers;
- private final ScoringQuerySparseIterator itr;
+ public void setIsLogging(boolean isLogging) {
+ this.isLogging = isLogging;
+ }
- private int targetDoc = -1;
- private int activeDoc = -1;
+ abstract class FeatureTraversalScorer extends Scorer {
+ protected int targetDoc = -1;
+ protected int activeDoc = -1;
+ protected LeafReaderContext leafContext;
- private SparseModelScorer(
- Weight weight, List<Feature.FeatureWeight.FeatureScorer>
featureScorers) {
+ protected FeatureTraversalScorer(Weight weight, LeafReaderContext
leafContext) {
super(weight);
+ this.leafContext = leafContext;
+ }
+
+ @Override
+ public float score() throws IOException {
+ reset();
+ fillFeaturesInfo();
+ return makeNormalizedFeaturesAndScore();
+ }
+
+ @Override
+ public float getMaxScore(int upTo) throws IOException {
+ return Float.POSITIVE_INFINITY;
+ }
+
+ private void fillFeaturesInfo() throws IOException {
+ if (activeDoc == targetDoc) {
+ SolrCache<Integer, float[]> featureVectorCache = null;
+ float[] featureVector;
+
+ // Check added otherwise
org.apache.solr.ltr.TestLTRScoringQuery.testLTRScoringQuery
+ // and
org.apache.solr.ltr.TestSelectiveWeightCreation.testScoringQueryWeightCreation
+ // fail
Review Comment:
definitely need to elaborate here
##########
solr/modules/ltr/src/test-files/solr/collection1/conf/solrconfig-ltr.xml:
##########
@@ -11,7 +11,7 @@
language governing permissions and limitations under the License. -->
<config>
- <luceneMatchVersion>${tests.luceneMatchVersion:LATEST}</luceneMatchVersion>
+ <luceneMatchVersion>${tests.luceneMatchVersion:LATEST}</luceneMatchVersion>
Review Comment:
do we need this change?
##########
solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java:
##########
@@ -716,6 +727,11 @@ public <K, V> boolean regenerateItem(
});
}
+ if (solrConfig.featureVectorCacheConfig != null
+ && solrConfig.featureVectorCacheConfig.getRegenerator() == null) {
+ solrConfig.featureVectorCacheConfig.setRegenerator(new
NoOpRegenerator());
Review Comment:
With noOpRegenerator, we don't have autowarming, do we?
Any motivation?
##########
solr/modules/ltr/src/java/org/apache/solr/ltr/response/transform/LTRFeatureLoggerTransformerFactory.java:
##########
@@ -423,16 +420,13 @@ private void implTransform(SolrDocument doc, int docid,
DocIterationInfo docInfo
}
}
if (!(rerankingQuery instanceof OriginalRankingLTRScoringQuery) ||
hasExplicitFeatureStore) {
- Object featureVector = featureLogger.getFeatureVector(docid,
rerankingQuery, searcher);
- if (featureVector == null) { // FV for this document was not in the
cache
- featureVector =
- featureLogger.makeFeatureVector(
- LTRRescorer.extractFeaturesInfo(
- rerankingModelWeight,
- docid,
- (!docsWereReranked && docsHaveScores) ? docInfo.score()
: null,
- leafContexts));
- }
+ String featureVector =
+ featureLogger.printFeatureVector(
+ LTRRescorer.extractFeaturesInfo(
Review Comment:
mmmm extractFeatureInfo could actually just read from cache?
maybe we need to change the name then to something like
'fetchCacheableFeatureVector' or something similar
##########
solr/modules/ltr/src/java/org/apache/solr/ltr/response/transform/LTRFeatureLoggerTransformerFactory.java:
##########
@@ -75,11 +75,10 @@ public class LTRFeatureLoggerTransformerFactory extends
TransformerFactory {
// used inside fl to specify to log (all|model only) features
private static final String FV_LOG_ALL = "logAll";
- private static final String DEFAULT_LOGGING_MODEL_NAME = "logging-model";
+ public static final String DEFAULT_LOGGING_MODEL_NAME = "logging-model";
Review Comment:
do we need this public? maybe package level? maybe we need to refactor the
packages a bit
##########
solr/modules/ltr/src/java/org/apache/solr/ltr/LTRScoringQuery.java:
##########
@@ -368,17 +376,7 @@ public class ModelWeight extends Weight {
private final float[] modelFeatureValuesNormalized;
private final Feature.FeatureWeight[] extractedFeatureWeights;
- // List of all the feature names, values - used for both scoring and
logging
- /*
- * What is the advantage of using a hashmap here instead of an array of
objects?
- * A set of arrays was used earlier and the elements were accessed
using the featureId.
- * With the updated logic to create weights selectively,
- * the number of elements in the array can be fewer than the total
number of features.
- * When [features] are not requested, only the model features are
extracted.
- * In this case, the indexing by featureId, fails. For this reason,
- * we need a map which holds just the features that were triggered by
the documents in the result set.
- *
- */
+ // All the features
Review Comment:
too vague, all the features across stores? Within one feature store? Within
one model?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]