Hello, Christine and thank you for your help!
So, we've investigated further based on your suggestions and have the
following things to note:
Reproducibility: We can reproduce the same queries on multiple runs, with
the same error.
Data as a factor: Our setup is single-sharded, so we can't investigate
further on this.
Feature vs. Model: We've also tried a dummy LinearModel with only two
features and the problem still occurs.
Identification of the troublesome feature(s): We've narrowed our model to
only two features and the problem always occurs (for some queries, not all)
when we have a feature with a mm=1 and a feature with a mm>=3. The problem
also occurs when we only do feature extraction and the problem seems to
always occur on the feature with the bigger mm. The errors seem to be
related to the size of the head DisiPriorityQueue created here:
https://github.com/apache/lucene-solr/blob/branch_8_6/lucene/core/src/java/org/apache/lucene/search/MinShouldMatchSumScorer.java#L107
as the error changes as we change the mm for the second feature:
1 feature with mm=1 and one with mm=3 -> Index 4 out of bounds for length 4
1 feature with mm=1 and one with mm=5 -> Index 2 out of bounds for length 2
You can find below the dummy feature-store.
[
{
"store": "dummystore",
"name": "similarity_name_mm_1",
"class": "org.apache.solr.ltr.feature.SolrFeature",
"params": {
"q": "{!dismax qf=name mm=1}${term}"
}
},
{
"store": "dummystore",
"name": "similarity_names_mm_3",
"class": "org.apache.solr.ltr.feature.SolrFeature",
"params": {
"q": "{!dismax qf=name mm=3}${term}"
}
}
]
The problem starts occuring in Solr 8.6.0, as we tried multiple versions <
8.6 and >= 8.6 and the problem started on 8.6.0 and we tend to believe it's
because of the following changes:
https://issues.apache.org/jira/browse/SOLR-14364 as they're the only major
changes related to LTR which were introduced in Solr 8.6.0.
I've created a Solr JIRA bug/issue ticket here:
https://issues.apache.org/jira/browse/SOLR-15071
Thank you for your help!
În mar., 5 ian. 2021 la 19:40, Christine Poerschke (BLOOMBERG/ LONDON) <
cpoersc...@bloomberg.net> a scris:
> Hello Florin Babes,
>
> Thanks for this detailed report! I agree you experiencing
> ArrayIndexOutOfBoundsException during SolrFeature computation sounds like a
> bug, would you like to open a SOLR JIRA issue for it?
>
> Here's some investigative ideas I would have, in no particular order:
>
> Reproducibility: if a failed query is run again, does it also fail second
> time around (when some caches may be used)?
>
> Data as a factor: is your setup single-sharded or multi-sharded? in a
> multi-sharded setup if the same query fails on some shards but succeeds on
> others (and all shards have some documents that match the query) then this
> could support a theory that a certain combination of data and features
> leads to the exception.
>
> Feature vs. Model: you mention use of a MultipleAdditiveTrees model, if
> the same features are used in a LinearModel instead, do the same errors
> happen? or if no model is used but only feature extraction is done, does
> that give errors?
>
> Identification of the troublesome feature(s): narrowing down to a single
> feature or a small combination of features could make it easier to figure
> out the problem. assuming the existing logging doesn't identify the
> features, replacing the org.apache.solr.ltr.feature.SolrFeature with a
> com.mycompany.solr.ltr.feature.MySolrFeature containing instrumentation
> could provide insights e.g. the existing code [2] logs feature names for
> UnsupportedOperationException and if it also caught
> ArrayIndexOutOfBoundsException then it could log the feature name before
> rethrowing the exception.
>
> Based on your detail below and this [3] conditional in the code probably
> at least two features will be necessary to hit the issue, but for
> investigative purposes two features could still be simplified potentially
> to effectively one feature e.g. if one feature is a SolrFeature and the
> other is a ValueFeature or if featureA and featureB are both SolrFeature
> features with _identical_ parameters but different names.
>
> Hope that helps.
>
> Regards,
>
> Christine
>
> [1]
> https://lucene.apache.org/solr/guide/8_6/learning-to-rank.html#extracting-features
> [2]
> https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.6.3/solr/contrib/ltr/src/java/org/apache/solr/ltr/feature/SolrFeature.java#L243
> [3]
> https://g