[ 
https://issues.apache.org/jira/browse/SOLR-295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12511332
 ] 

Ryan McKinley commented on SOLR-295:
------------------------------------

I haven't looked at the patch yet, but everything sounds reasonable.  I am a bit 
reluctant to glom MLT onto the dismax request handler because we keep seeing 
the need to glom on more and more.  Recent discussions have pointed towards a 
'search component' framework: something that defines a chain of steps that 
could typically happen in a query (dismax+mlt+faceting+faceting on 
mlt+collapse+highlighting+...).  SOLR-281 is a quick/crude implementation.
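
To make the 'chain' idea concrete, here is a minimal sketch of what such a 
framework might look like.  All of the names in it (Component, step, handle) 
are hypothetical and chosen for illustration; this is not the SOLR-281 code or 
any existing Solr API, and the "results" are placeholder strings.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ComponentChainSketch {

    /** Hypothetical unit of work in the chain (not an actual Solr interface). */
    interface Component {
        void process(Map<String, String> params, Map<String, Object> response);
    }

    /**
     * Builds a placeholder step. It runs when alwaysOn is set (the main query)
     * or when the request carries "<name>=true" (optional steps such as mlt).
     */
    static Component step(String name, boolean alwaysOn) {
        return (params, response) -> {
            if (alwaysOn || "true".equals(params.get(name))) {
                // A real step would add its section of the response here.
                response.put(name, "<" + name + " results>");
            }
        };
    }

    /** Runs every step in order; each one decides for itself whether to act. */
    static Map<String, Object> handle(List<Component> chain, Map<String, String> params) {
        Map<String, Object> response = new LinkedHashMap<>();
        for (Component c : chain) {
            c.process(params, response);
        }
        return response;
    }

    public static void main(String[] args) {
        List<Component> chain = List.of(
                step("dismax", true),
                step("mlt", false),
                step("facet", false),
                step("hl", false));
        // Only the main query and MLT contribute to this response.
        System.out.println(handle(chain, Map.of("mlt", "true")).keySet());
    }
}
```

With something along these lines, adding MLT to dismax would mean adding a 
step to the chain rather than copying code between handlers.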

Something to think about...

> Implementing MoreLikeThis support in DismaxRequestHandler
> ---------------------------------------------------------
>
>                 Key: SOLR-295
>                 URL: https://issues.apache.org/jira/browse/SOLR-295
>             Project: Solr
>          Issue Type: Improvement
>          Components: search
>    Affects Versions: 1.3
>            Reporter: Pieter Berkel
>            Priority: Minor
>         Attachments: MoreLikeThis-DismaxRequestHandler_SOLR-295.patch
>
>
> There's nothing too clever about this initial patch, to be uploaded shortly; I 
> have simply extracted the MLT code from the StandardRequestHandler and 
> inserted it into the DismaxRequestHandler.  However, there are some broader 
> MLT issues that I'd also like to address in the near future:
> 1) (trivial) No "This response format is experimental" warning when MLT is 
> used with StandardRequestHandler (or DismaxRequestHandler).  Not really a big 
> deal but at least makes developers aware of the possibility of future changes.
> 2) (trivial) "org.apache.solr.common.util.MoreLikeThisParams" should perhaps 
> be moved to the more appropriate package "org.apache.solr.common.params".
> 3) (non-trivial) The ability to specify the list of fields that should be 
> returned when MLT is invoked from an external handler (e.g. 
> StandardRequestHandler).  Currently the field list (FL) parameter is 
> inherited from the main query but I can envisage cases where it would be 
> desirable to specify more or fewer return fields in the MLT query than the 
> main query.  One complication is that "mlt.fl" is already used to specify the 
> fields used for similarity.  Perhaps "mlt.fl" is not the best name for this 
> parameter and should be renamed to avoid potential conflict / confusion?
> 4) (fairly-trivial) On a similar note to 3, there is currently no way to 
> specify a "start" value for the rows returned when MLT is invoked from an 
> external handler (e.g. StandardRequestHandler); it is hard-coded to 0 (i.e. 
> the first "mlt.count" documents matched).  While I can see the logic in 
> naming the parameter "mlt.count", it does seem a little inconsistent and 
> perhaps it would be better to rename (or at least alias) it to "mlt.rows" to 
> be consistent with the CommonQueryParameters.  Note that "mlt.start" is 
> fundamentally different to the "mlt.match.offset" parameter as the latter 
> deals with documents *matching* the initial MLT query while the former deals 
> with documents *returned* by the MLT query (hope that makes sense).
> I have created a patch that implemented "mlt.start" (to specify the start 
> doc) and added "mlt.rows" that could be used interchangeably with "mlt.count" 
> (but I would prefer to remove "mlt.count" altogether), but since it involves 
> changing the method definition of MoreLikeThisHelper.getMoreLikeThese(), I 
> wanted to get some opinions before submitting it.
> 5) (non-trivial) Interesting Terms - the ability to return interesting term 
> information using the "mlt.interestingTerms" parameter when MLT is invoked 
> from an external handler.  This is perhaps the most useful feature I am 
> looking to implement; I can see great benefit in being able to provide a list 
> of interesting terms or "keywords" for each document returned in a standard 
> or dismax query.  Currently this is only available from the MLT request handler 
> so perhaps the best approach would be to re-factor the "interestingTerms" 
> code in MoreLikeThisHandler class and put it somewhere in MoreLikeThisHelper 
> so it is available to all handlers?  Again, I would appreciate any comments 
> or suggestions.
> I've also noted the MLT features suggested by Tristan [ 
> http://www.nabble.com/MoreLikeThis-with-DisMax-boost-query---functions-tf4047187.html
>  ] which could quite possibly be rolled together with the above points -- I'm 
> not sure whether it is better to have a single ticket tracking several 
> related issues or create individual tickets for each issue, however will be 
> happy to comply with the Solr issue tracking policy on advice from the core 
> developers.
> regards,
> Pieter
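
For reference, the "mlt.start" / "mlt.rows" behaviour proposed in point 4 of 
the description could be sketched roughly as follows.  This is only an 
illustration under stated assumptions: the class and method names are 
hypothetical, the default of 5 rows is assumed for the example, and the real 
change would live in MoreLikeThisHelper.getMoreLikeThese(), whose signature is 
not reproduced here.

```java
import java.util.List;
import java.util.Map;

class MltPagingSketch {

    // Assumed default number of similar documents, for illustration only.
    static final int DEFAULT_ROWS = 5;

    /** Reads an integer request parameter, or returns the fallback if absent. */
    static int intParam(Map<String, String> params, String key, int fallback) {
        String value = params.get(key);
        return value == null ? fallback : Integer.parseInt(value);
    }

    /** "mlt.rows" wins; otherwise the older "mlt.count"; otherwise the default. */
    static int rows(Map<String, String> params) {
        return intParam(params, "mlt.rows", intParam(params, "mlt.count", DEFAULT_ROWS));
    }

    /** Offset into the documents *returned* by the MLT query (0 if not given). */
    static int start(Map<String, String> params) {
        return intParam(params, "mlt.start", 0);
    }

    /** Returns the requested window of similar documents, clamped to the list. */
    static <T> List<T> page(List<T> similar, Map<String, String> params) {
        int from = Math.min(Math.max(start(params), 0), similar.size());
        int count = Math.min(Math.max(rows(params), 0), similar.size() - from);
        return similar.subList(from, from + count);
    }
}
```

Treating "mlt.count" as a fallback keeps existing requests working while 
"mlt.rows" lines up with the naming in CommonQueryParameters.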

