Anna created SOLR-16596:
---------------------------

             Summary: LTR MultipleAdditiveTreeModel do not support missing 
features' value
                 Key: SOLR-16596
                 URL: https://issues.apache.org/jira/browse/SOLR-16596
             Project: Solr
          Issue Type: Improvement
      Security Level: Public (Default Security Level. Issues are Public)
            Reporter: Anna


The current MultipleAdditiveTree model does not support missing feature values.
When a feature value is not passed, the model silently treats it as zero.

Other LTR model libraries, such as xgboost, can distinguish missing values 
from all other values, including zero. They learn how to treat missing values 
at training time and add a "missing" branch to each split in the tree, pointing 
in the direction learned to be best in that situation.

It would be nice to support this in Solr MultipleAdditiveTree models as well. 
An additional "missing" parameter should be added to the RegressionTreeNode. 
It will determine which branch to take when the feature value is missing.

This change will allow us to differentiate between zero-valued and missing 
features. 
For example, if the feature is "hotel_avg_review" (a rating between zero and 
five stars), we would like the model to behave differently when the hotel has 
no reviews (we do not know whether it is good) than when its average review is 
zero stars (the hotel is bad).
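
To make the proposal concrete, here is a minimal sketch of how a per-node 
"missing" direction could work. It is not Solr's actual code: the class 
MissingAwareTreeSketch, the Node type, the missingGoesLeft flag, and the use of 
Float.NaN to mark a missing feature are all assumptions made for illustration. 
The "go left when value <= threshold" rule is likewise assumed.

```java
public class MissingAwareTreeSketch {

    /** Hypothetical tree node; a stand-in for RegressionTreeNode, not its real API. */
    static final class Node {
        final int featureIndex;        // index into the feature vector (unused for leaves)
        final float threshold;         // split threshold (unused for leaves)
        final boolean missingGoesLeft; // the proposed "missing" direction (assumed name)
        final Node left;
        final Node right;
        final float value;             // leaf score (unused for split nodes)

        private Node(int featureIndex, float threshold, boolean missingGoesLeft,
                     Node left, Node right, float value) {
            this.featureIndex = featureIndex;
            this.threshold = threshold;
            this.missingGoesLeft = missingGoesLeft;
            this.left = left;
            this.right = right;
            this.value = value;
        }

        static Node leaf(float value) {
            return new Node(-1, 0f, false, null, null, value);
        }

        static Node split(int featureIndex, float threshold, boolean missingGoesLeft,
                          Node left, Node right) {
            return new Node(featureIndex, threshold, missingGoesLeft, left, right, 0f);
        }

        boolean isLeaf() {
            return left == null && right == null;
        }
    }

    /** Walks one tree. A NaN feature value is treated as "missing" (assumption). */
    static float score(Node node, float[] features) {
        while (!node.isLeaf()) {
            float v = features[node.featureIndex];
            boolean goLeft = Float.isNaN(v) ? node.missingGoesLeft : v <= node.threshold;
            node = goLeft ? node.left : node.right;
        }
        return node.value;
    }

    public static void main(String[] args) {
        // Feature 0 plays the role of "hotel_avg_review"; missing values go right.
        Node tree = Node.split(0, 2.5f, false, Node.leaf(-1.0f), Node.leaf(0.5f));

        System.out.println(score(tree, new float[] {0.0f}));      // zero stars  -> -1.0
        System.out.println(score(tree, new float[] {Float.NaN})); // no reviews  -> 0.5
    }
}
```

With the current behavior, both hotels would reach the same leaf, because the 
missing value would be read as zero. In the sketch, the hotel without reviews 
follows the "missing" direction and gets a different score than the zero-star 
hotel.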



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

