[ https://issues.apache.org/jira/browse/SOLR-6248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189236#comment-14189236 ]
ASF subversion and git services commented on SOLR-6248:
-------------------------------------------------------

Commit 1635329 from [~anshumg] in branch 'dev/trunk'
[ https://svn.apache.org/r1635329 ]

SOLR-6248: Changing the format of mlt query parser

> MoreLikeThis Query Parser
> -------------------------
>
>                 Key: SOLR-6248
>                 URL: https://issues.apache.org/jira/browse/SOLR-6248
>             Project: Solr
>          Issue Type: New Feature
>          Components: query parsers
>            Reporter: Anshum Gupta
>            Assignee: Anshum Gupta
>             Fix For: 5.0
>
>         Attachments: SOLR-6248.patch, SOLR-6248.patch, SOLR-6248.patch,
>                      SOLR-6248.patch, SOLR-6248.patch
>
>
> The MLT Component doesn't let people highlight/paginate, and the handler
> comes with the cost of maintaining another piece in the config. Also, any
> changes to the defaults of the /select handler (number of results to be
> fetched etc.) need to be copied/synced to this handler too.
> Having an MLT QParser would let users get back docs based on a query, which
> they can then paginate, highlight etc. It would also give them the
> flexibility to use this anywhere, i.e. q, fq, bq etc.
>
> A bit of history about MLT (thanks to Hoss)
> The MLT Handler pre-dates the existence of QParsers and was meant to take an
> arbitrary query as input, find docs that match that query, club them
> together to find interesting terms, and then use those terms as if they
> were the main query to generate a main result set.
> This result would then be used as the set to facet, highlight etc.
> The flow: Query -> DocList(m) -> Bag (terms) -> Query -> DocList(y)
>
> The MLT Component, on the other hand, served a very different purpose:
> augmenting the main result set. It is used to get similar docs for each
> doc in the main result set.
> DocSet(n) -> n * Bag (terms) -> n * (Query) -> n * DocList(m)
>
> The new approach:
> All of this can be done better and cleaner (and makes more sense too) using
> an MLT QParser.
> An important thing to handle here is the case where the user doesn't have
> TermVectors, in which case it falls back to what happens right now, i.e.
> parsing stored fields.
> Also, in case the user doesn't have the field (to be used for MLT) indexed,
> the field would need to be a TextField with an index analyzer defined. This
> analyzer will then be used to extract terms for MLT.
> In SolrCloud mode, '/get-termvectors' can be used after looking at the
> schema (if TermVectors are enabled for the field). If not, a /get call can
> be used to fetch the field and parse it.
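
For readers unfamiliar with the handler flow quoted above (Query -> DocList -> Bag (terms) -> Query -> DocList), here is a minimal, self-contained Python sketch of the idea. It is not Solr's implementation: the in-memory DOCS index, the "term_vector" key, the tokenizer, and the TF-IDF-style scoring are all assumptions made for illustration. The terms_for helper mirrors the fallback described in the issue: use term vectors when a document has them, otherwise analyze the stored text.

```python
import math
import re
from collections import Counter

# Hypothetical in-memory index: id -> document. "term_vector" is optional and
# stands in for a field that has TermVectors enabled.
DOCS = {
    "1": {"text": "apache solr search server"},
    "2": {"text": "apache solr query parser"},
    "3": {"text": "cooking pasta with tomato sauce"},
    "4": {
        "text": "solr query parser plugin",
        "term_vector": {"solr": 1, "query": 1, "parser": 1, "plugin": 1},
    },
}


def analyze(text):
    """Toy stand-in for an index analyzer: lowercase and split on non-alphanumerics."""
    return re.findall(r"[a-z0-9]+", text.lower())


def terms_for(doc):
    """Use the term vector if present, else fall back to analyzing the stored text."""
    term_vector = doc.get("term_vector")
    if term_vector is not None:
        return Counter(term_vector)
    return Counter(analyze(doc["text"]))


def interesting_terms(seed_ids, docs, top_n=3):
    """DocList -> Bag (terms): pool the seed docs' terms and keep the top scorers."""
    doc_freq = Counter()
    for doc in docs.values():
        doc_freq.update(terms_for(doc).keys())
    bag = Counter()
    for doc_id in seed_ids:
        bag.update(terms_for(docs[doc_id]))
    n_docs = len(docs)
    scores = {t: tf * math.log(1 + n_docs / doc_freq[t]) for t, tf in bag.items()}
    # Ties are broken alphabetically so the result is deterministic.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_n]


def more_like_this(seed_ids, docs, top_n=3):
    """Bag (terms) -> Query -> DocList: rank the other docs by term overlap."""
    terms = interesting_terms(seed_ids, docs, top_n)
    hits = []
    for doc_id, doc in docs.items():
        if doc_id in seed_ids:
            continue
        doc_terms = terms_for(doc)
        overlap = sum(doc_terms[t] for t in terms)
        if overlap:
            hits.append((overlap, doc_id))
    hits.sort(key=lambda h: (-h[0], h[1]))
    return [doc_id for _, doc_id in hits]


if __name__ == "__main__":
    print(interesting_terms(["2"], DOCS))
    print(more_like_this(["2"], DOCS))
```

In this sketch the seed list plays the role of DocList(m) from the first query; in the proposed QParser the final term query would be an ordinary query, which is what makes pagination, highlighting, and use in q/fq/bq possible.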