[jira] [Commented] (SOLR-6248) MoreLikeThis Query Parser

ASF subversion and git services (JIRA) Wed, 29 Oct 2014 16:10:50 -0700

    [ https://issues.apache.org/jira/browse/SOLR-6248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189236#comment-14189236 ]


ASF subversion and git services commented on SOLR-6248:
-------------------------------------------------------

Commit 1635329 from [~anshumg] in branch 'dev/trunk'
[ https://svn.apache.org/r1635329 ]

SOLR-6248: Changing the format of mlt query parser

> MoreLikeThis Query Parser
> -------------------------
>
>                 Key: SOLR-6248
>                 URL: https://issues.apache.org/jira/browse/SOLR-6248
>             Project: Solr
>          Issue Type: New Feature
>          Components: query parsers
>            Reporter: Anshum Gupta
>            Assignee: Anshum Gupta
>             Fix For: 5.0
>
>         Attachments: SOLR-6248.patch, SOLR-6248.patch, SOLR-6248.patch, 
> SOLR-6248.patch, SOLR-6248.patch
>
>
> The MLT Component doesn't let people highlight/paginate, and the MLT handler 
> comes with the cost of maintaining another piece in the config. Also, any 
> changes to the defaults of the /select handler (number of results to be 
> fetched, etc.) need to be copied/synced to this handler too.
> Having an MLT QParser would let users get back docs based on a query, which 
> they can then paginate, highlight, etc. It would also give them the 
> flexibility to use it anywhere, e.g. in q, fq, bq, etc.
> A bit of history about MLT (thanks to Hoss)
> The MLT Handler pre-dates the existence of QParsers and was meant to take an 
> arbitrary query as input, find docs that match that 
> query, club them together to find interesting terms, and then use those 
> terms as if they were the main query to generate a main result set.
> This result would then be used as the set to facet, highlight, etc.
> The flow: Query -> DocList(m) -> Bag (terms) -> Query -> DocList(y)
> The MLT component, on the other hand, serves a very different purpose: 
> augmenting the main result set. It is used to get similar docs for each of 
> the docs in the main result set.
> The flow: DocSet(n) -> n * Bag (terms) -> n * (Query) -> n * DocList(m)
> The new approach:
> All of this can be done better and cleaner (and makes more sense too) using 
> an MLT QParser.
> An important thing to handle here is the case where the user doesn't have 
> TermVectors, in which case the parser falls back to what happens right now, 
> i.e. parsing stored fields.
> Also, if the field to be used for MLT is not indexed, it would need to be a 
> TextField with an index analyzer defined. This analyzer will then be used to 
> extract terms for MLT.
> In SolrCloud mode, '/get-termvectors' can be used after checking the schema 
> (if TermVectors are enabled for the field). If not, a /get call can be used 
> to fetch the field and parse it.
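
The handler-style flow described above (Query -> DocList(m) -> Bag (terms) -> Query -> DocList(y)) can be illustrated with a toy in-memory sketch. This is not Solr or Lucene code: the documents, the stopword list, and the function names (tokenize, search, interesting_terms, more_like_this) are all hypothetical, and raw term frequency stands in for whatever term-selection scoring the real implementation uses. Excluding the seed docs from the final result is also an assumption made for the illustration.

```python
from collections import Counter

# Toy in-memory "index": doc id -> text. Purely illustrative data.
DOCS = {
    "1": "solr query parser for more like this",
    "2": "more like this handler in solr",
    "3": "faceting and highlighting results",
    "4": "query parser plugins for solr search",
}

# Hypothetical stopword list so that filler words do not dominate the bag.
STOPWORDS = {"for", "in", "and", "this", "like", "more"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return [t for t in text.lower().split() if t not in STOPWORDS]


def search(terms, exclude=()):
    """Rank docs by how many of the given terms they contain."""
    wanted = set(terms)
    scores = {}
    for doc_id, text in DOCS.items():
        if doc_id in exclude:
            continue
        score = len(wanted & set(tokenize(text)))
        if score:
            scores[doc_id] = score
    # Highest score first; ties broken by doc id for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))


def interesting_terms(doc_ids, max_terms=3):
    """Club the docs together into one bag and keep the most frequent terms."""
    bag = Counter()
    for doc_id in doc_ids:
        bag.update(tokenize(DOCS[doc_id]))
    return [term for term, _ in bag.most_common(max_terms)]


def more_like_this(query_terms, max_terms=3):
    seed = search(query_terms)                   # Query -> DocList(m)
    terms = interesting_terms(seed, max_terms)   # DocList(m) -> Bag (terms)
    return search(terms, exclude=seed)           # Bag -> Query -> DocList(y)


if __name__ == "__main__":
    print(more_like_this(["handler"]))
```

Because the final step is just another query, its result list can be paginated, filtered, or highlighted like any other, which is the property the description wants from a QParser.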



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

