[ 
https://issues.apache.org/jira/browse/SOLR-12697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17352666#comment-17352666
 ] 

Christine Poerschke commented on SOLR-12697:
--------------------------------------------

Hi [~TomGilke], thanks for the questions on the idea of connecting the stored 
fields!

Yes, the idea was to achieve connectivity via a document cache, and I agree with 
you that the example configuration is error-prone, e.g. because of repeated 
information.

So there are two "connecting" things that are needed:
 # field values: When computing scores we want the second and subsequent 
features to benefit from the fetch work done by the first feature. 
{{FieldValueFeature}} has access to a/the {{SolrIndexSearcher}}, which has a 
{{SolrDocumentFetcher}}, which in turn has a {{DocumentCache}}, i.e. from 
looking (without trying) there at least _appears_ to be a way to connect things 
if the first feature adds to that document cache. Whether using that document 
fetcher and adding to that document cache is actually a good idea, or whether 
some alternative special-purpose document fetcher instance and corresponding 
document cache should be used instead, is of course a different question, one 
to explore further if one had a use case and were to start implementing a 
{{PrefetchingFieldValueFeature}} class (outside the scope of this JIRA task 
here).
 ** edit: I note that [~slivotov]'s original patch obtains a fetcher from a 
searcher and the searcher from the request i.e. that's an alternative to 
IndexSearcher-to-SolrIndexSearcher casting
 # field names: When fetching fields the first feature must "look ahead" and 
ask for not only the field that it itself needs but also the fields that other 
features will subsequently need. It could do so at scoring time, i.e. "ask 
around" what other fields its fellow features need, or it could do so at 
construction time. Both approaches have pros and cons of course, but if it 
happens at construction time then our feature object would have a "state" which 
conceptually looks like
{code:java}
  private String field;
  private Set<String> fieldAsSet;
+ // own field plus other PrefetchingFieldValueFeature objects' fields
+ private Set<String> prefetchFields;
{code}
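
To make the two "connecting" things a bit more concrete, here is a minimal 
self-contained sketch. Everything in it is hypothetical: 
{{CachingFetcherSketch}} is just a map-backed stand-in (it is not the real 
{{SolrDocumentFetcher}} or its document cache) and {{PrefetchingFeatureSketch}} 
is an illustrative name only. It assumes each feature is simply handed its 
{{prefetchFields}}, and shows nothing more than the first feature's fetch 
pulling in the joint set of fields so that later features hit the cache.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a stored-field fetcher with a per-document cache.
// This is NOT the real SolrDocumentFetcher API, just a map-backed illustration.
class CachingFetcherSketch {
  private final Map<Integer, Map<String, Object>> storedDocs; // fake "index"
  private final Map<Integer, Map<String, Object>> cache = new HashMap<>();
  int storedFieldReads = 0; // counts the (expensive) stored-field reads

  CachingFetcherSketch(Map<Integer, Map<String, Object>> storedDocs) {
    this.storedDocs = storedDocs;
  }

  Object fetch(int docId, String field, Set<String> prefetchFields) {
    Map<String, Object> cached = cache.computeIfAbsent(docId, id -> new HashMap<>());
    if (!cached.containsKey(field)) {
      // Cache miss: one read loads the requested field plus the prefetch set.
      storedFieldReads++;
      Set<String> wanted = new HashSet<>(prefetchFields);
      wanted.add(field);
      Map<String, Object> doc = storedDocs.getOrDefault(docId, Map.of());
      for (String f : wanted) {
        cached.put(f, doc.get(f)); // null is cached too if the field is absent
      }
    }
    return cached.get(field);
  }
}

// Hypothetical feature: knows its own field plus the joint prefetch set.
class PrefetchingFeatureSketch {
  private final String field;
  private final Set<String> prefetchFields;

  PrefetchingFeatureSketch(String field, Set<String> prefetchFields) {
    this.field = field;
    this.prefetchFields = prefetchFields;
  }

  Object value(CachingFetcherSketch fetcher, int docId) {
    return fetcher.fetch(docId, field, prefetchFields);
  }
}

class PrefetchDemo {
  public static void main(String[] args) {
    Map<Integer, Map<String, Object>> docs = new HashMap<>();
    docs.put(7, Map.of("aaa", 1.0, "bbb", 2.0, "ccc", 3.0));
    CachingFetcherSketch fetcher = new CachingFetcherSketch(docs);

    Set<String> joint = Set.of("aaa", "bbb", "ccc");
    PrefetchingFeatureSketch a = new PrefetchingFeatureSketch("aaa", joint);
    PrefetchingFeatureSketch b = new PrefetchingFeatureSketch("bbb", joint);
    PrefetchingFeatureSketch c = new PrefetchingFeatureSketch("ccc", joint);

    a.value(fetcher, 7); // miss: reads aaa, bbb and ccc in one go
    b.value(fetcher, 7); // hit
    c.value(fetcher, 7); // hit
    System.out.println(fetcher.storedFieldReads); // prints 1
  }
}
```

In the sketch the set of fields is passed in via the constructor purely for 
brevity.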

So that then leads to the question of "where does prefetchFields come from?", 
given that having it come from configuration is error-prone and impractical.

At present feature construction is a single step e.g. 
[https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.8.2/solr/contrib/ltr/src/java/org/apache/solr/ltr/store/rest/ManagedFeatureStore.java#L121]
 but one might imagine a two-pass approach, e.g. after all the features in a 
store are constructed, in a second pass we (a) determine which features are 
prefetching and (b) tell all the prefetching features what the joint set of 
fields is.
 * There probably is a proper name for that sort of approach but I don't know 
what it is, sorry!
 * Model construction already has a "sometimes two pass" approach, e.g. note the 
similarity between 
[Feature.getInstance|https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.8.2/solr/contrib/ltr/src/java/org/apache/solr/ltr/store/rest/ManagedFeatureStore.java#L209]
 and 
[LTRScoringModel.getInstance|https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.8.2/solr/contrib/ltr/src/java/org/apache/solr/ltr/store/rest/ManagedModelStore.java#L239]
 but also the difference of [if (ltrScoringModel instanceof AdapterModel) 
initAdapterModel(solrResourceLoader, (AdapterModel)ltrScoringModel, 
managedFeatureStore)|https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.8.2/solr/contrib/ltr/src/java/org/apache/solr/ltr/store/rest/ManagedModelStore.java#L248-L250]
 i.e. for most models initialisation is completed in the constructor but for 
adapter models initialisation is completed only via an init method call.

Oops, slightly more words above than anticipated! Perhaps one short pseudocode 
bit to try to sum it up:
{code:java}
// configuration illustration
[ ...
  { "name": "a", "class": "PrefetchingFieldValueFeature", "params": { "field" : "aaa" } },
  ...
  { "name": "b", "class": "PrefetchingFieldValueFeature", "params": { "field" : "bbb" } },
  ...
  { "name": "c", "class": "PrefetchingFieldValueFeature", "params": { "field" : "ccc" } },
  ...
  { "name": "d", "class": "FieldValueFeature", "params": { "field" : "xyz" } },
  ... ]

// code fragment 1
class PrefetchingFieldValueFeature extends FieldValueFeature {
  private Set<String> prefetchFields;
  public void setPrefetchFields(Set<String> prefetchFields) {
    this.prefetchFields = prefetchFields;
  }
  ...
}

// code fragment 2 (needs java.util.HashSet and java.util.Set)
  // first pass: collect the fields of all prefetching features
  Set<String> prefetchFields = new HashSet<>();
  for (Feature feature : featureStore.getFeatures()) {
    if (feature instanceof PrefetchingFieldValueFeature) {
      prefetchFields.add(((PrefetchingFieldValueFeature)feature).getField());
    }
  }
  // prefetchFields contains "aaa" and "bbb" and "ccc" at this point
  // second pass: tell each prefetching feature about the joint set
  for (Feature feature : featureStore.getFeatures()) {
    if (feature instanceof PrefetchingFieldValueFeature) {
      ((PrefetchingFieldValueFeature)feature).setPrefetchFields(prefetchFields);
    }
  }
{code}
{{</endOfMeThinkingOutAloud>}} :) What do you think?

> pure DocValues support for FieldValueFeature
> --------------------------------------------
>
>                 Key: SOLR-12697
>                 URL: https://issues.apache.org/jira/browse/SOLR-12697
>             Project: Solr
>          Issue Type: Sub-task
>          Components: contrib - LTR
>            Reporter: Stanislav Livotov
>            Priority: Major
>         Attachments: SOLR-12697.patch, SOLR-12697.patch, SOLR-12697.patch, 
> SOLR-12697.patch, SOLR-12697.patch
>
>          Time Spent: 8h
>  Remaining Estimate: 0h
>
> [~slivotov] wrote in SOLR-12688:
> bq. ... FieldValueFeature doesn't support pure DocValues fields (Stored 
> false). Please also note that for fields which are both stored and DocValues 
> it is working not optimal because it is extracting just one field from the 
> stored document. DocValues are obviously faster for such usecases. ...
> (Please see SOLR-12688 description for overall context and analysis results.)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

