[ https://issues.apache.org/jira/browse/SOLR-8542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15163105#comment-15163105 ]
Christine Poerschke commented on SOLR-8542:
-------------------------------------------

Question related to [Feature Engineering|https://en.wikipedia.org/wiki/Feature_engineering] - is that the right term? - and feature extraction.

[LTRQParserPlugin.java#L117|https://github.com/bloomberg/lucene-solr/blob/master-ltr-plugin-rfc/solr/contrib/ltr/src/java/org/apache/solr/ltr/ranking/LTRQParserPlugin.java#L117] mentions
bq. For training a new model offline you need feature vectors, but dont yet have a model.

and [README.txt#L280|https://github.com/bloomberg/lucene-solr/blob/master-ltr-plugin-rfc/solr/contrib/ltr/README.txt#L280] mentions using a dummy model for now, e.g.
bq. fv=true&fl=*,score,\[features\]&rq={!ltr model=dummyModel reRankDocs=25}

to extract features. If it is already known, could you outline what the replacement for the above fv/fl/dummyModel combination is likely to look like?

Semi-related to that:
* Would the {{efi.*}} parameters then move out of the {{rq}}, since candidate features to be returned in the response might reference external feature info?
* Might it be useful to have optional version and/or comment string elements in the feature and model JSON format?

Illustration:
{code}
{
  "type": "org.apache.solr.ltr.feature.impl.SolrFeature",
  "name": "documentRecency",
  "comment": "Initial version, we may have to tweak the recip function arguments later.",
  "params": {
    "q": "{!func}recip( ms(NOW,publish_date), 3.16e-11, 1, 1)"
  }
}
...
{
  "type":"org.apache.solr.ltr.ranking.RankSVMModel",
  "name":"myModelName",
  "version": "1.0",
  "comment": "features and parameters determined using XYZ with ABC data, ticket reference: 12345",
  "features":[ ... ],
  "params":{ ... }
}
{code}

> Integrate Learning to Rank into Solr
> ------------------------------------
>
> Key: SOLR-8542
> URL: https://issues.apache.org/jira/browse/SOLR-8542
> Project: Solr
> Issue Type: New Feature
> Reporter: Joshua Pantony
> Assignee: Christine Poerschke
> Priority: Minor
> Attachments: README.md, README.md, SOLR-8542-branch_5x.patch, SOLR-8542-trunk.patch
>
>
> This is a ticket to integrate learning to rank machine learning models into Solr. Solr Learning to Rank (LTR) provides a way for you to extract features directly inside Solr for use in training a machine learned model. You can then deploy that model to Solr and use it to rerank your top X search results. This concept was previously presented by the authors at Lucene/Solr Revolution 2015 ( http://www.slideshare.net/lucidworks/learning-to-rank-in-solr-presented-by-michael-nilsson-diego-ceccarelli-bloomberg-lp ).
> The attached code was jointly worked on by Joshua Pantony, Michael Nilsson, David Grohmann and Diego Ceccarelli.
> Any chance this could make it into a 5x release? We've also attached documentation as a github MD file, but are happy to convert to a desired format.
> h3. Test the plugin with solr/example/techproducts in 6 steps
> Solr provides some simple example indices. To test the plugin with the techproducts example, please follow these steps.
> h4. 1. compile solr and the examples
> cd solr
> ant dist
> ant example
> h4. 2. run the example
> ./bin/solr -e techproducts
> h4. 3. stop it and install the plugin:
>
> ./bin/solr stop
> mkdir example/techproducts/solr/techproducts/lib
> cp build/contrib/ltr/lucene-ltr-6.0.0-SNAPSHOT.jar example/techproducts/solr/techproducts/lib/
> cp contrib/ltr/example/solrconfig.xml example/techproducts/solr/techproducts/conf/
> h4. 4. run the example again
>
> ./bin/solr -e techproducts
> h4. 5. index some features and a model
> curl -XPUT 'http://localhost:8983/solr/techproducts/schema/fstore' --data-binary "@./contrib/ltr/example/techproducts-features.json" -H 'Content-type:application/json'
> curl -XPUT 'http://localhost:8983/solr/techproducts/schema/mstore' --data-binary "@./contrib/ltr/example/techproducts-model.json" -H 'Content-type:application/json'
> h4. 6. have fun !
> *access to the default feature store*
> http://localhost:8983/solr/techproducts/schema/fstore/_DEFAULT_
> *access to the model store*
> http://localhost:8983/solr/techproducts/schema/mstore
> *perform a query using the model, and retrieve the features*
> http://localhost:8983/solr/techproducts/query?indent=on&q=test&wt=json&rq={!ltr%20model=svm%20reRankDocs=25%20efi.query=%27test%27}&fl=*,[features],price,score,name&fv=true
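
For anyone scripting the feature-extraction request in step 6, the URL can be assembled with Python's standard library so that the braces, spaces, and quotes in {{rq}} are percent-encoded consistently. This is only a minimal sketch: the helper name build_ltr_query_url and its defaults are hypothetical, the parameter names (rq, fl, fv, efi.query), host, and core are copied from the example URL in the description, and the request format on this branch may still change.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_ltr_query_url(base_url, query, model, rerank_docs=25, efi=None):
    """Build a query URL that reranks with an LTR model and returns features.

    Hypothetical helper: it only mirrors the parameters shown in the
    example URL of the issue description, nothing more.
    """
    # External feature info (efi.*) is passed inside the rq local params,
    # as in the example URL. Values are wrapped in single quotes and are
    # NOT escaped here, so they must not contain quotes themselves.
    efi_parts = "".join(
        f" efi.{key}='{value}'" for key, value in sorted((efi or {}).items())
    )
    rq = f"{{!ltr model={model} reRankDocs={rerank_docs}{efi_parts}}}"

    params = [
        ("indent", "on"),
        ("q", query),
        ("wt", "json"),
        ("rq", rq),
        ("fl", "*,[features],price,score,name"),
        ("fv", "true"),
    ]
    # urlencode percent-encodes reserved characters and writes spaces as '+'.
    return f"{base_url}?{urlencode(params)}"


url = build_ltr_query_url(
    "http://localhost:8983/solr/techproducts/query",
    query="test",
    model="svm",
    efi={"query": "test"},
)
print(url)

# Round-trip check: decoding the query string recovers the raw rq value.
print(parse_qs(urlsplit(url).query)["rq"][0])
```

If the {{efi.*}} parameters do move out of the {{rq}}, as asked in the comment, only the construction of efi_parts and rq in this sketch would need to change.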