Do you want to skip indexing when something similar exists, or only when an
exact duplicate exists? If you only want to avoid indexing exact duplicates, I
would look into a hash code of the document. A similarity check, though, I
think has to be based off a document that is already in the index.
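
Here is a rough sketch of the hash idea in Python, just to illustrate the
exact-duplicate case. The field names come from your mail; the normalization
rules, the choice of SHA-1, and the in-memory `seen` set are assumptions on my
part (in practice you would probably store the fingerprint in a field of the
index and look it up there before adding).

```python
import hashlib
import re


def doc_fingerprint(medianame, headline, body):
    """Return a hex digest that is equal for exact (normalized) duplicates."""
    def norm(text):
        # Collapse all whitespace runs and lowercase, so trivial formatting
        # differences do not change the hash (an assumed normalization).
        return re.sub(r"\s+", " ", text).strip().lower()

    # A newline cannot occur inside normalized text, so it is a safe separator.
    payload = "\n".join(norm(f) for f in (medianame, headline, body))
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()


def should_index(seen, medianame, headline, body):
    """Return True and record the fingerprint if the doc was not seen before."""
    fp = doc_fingerprint(medianame, headline, body)
    if fp in seen:
        return False
    seen.add(fp)
    return True
```

Because medianame is part of the hash, the same article from two different
media would still be indexed twice. This only catches exact matches after
normalization; it does nothing for documents that are merely similar.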

On Apr 4, 2011, at 5:16, Frederico Azeiteiro <frederico.azeite...@cision.com> 
wrote:

> Hi,
> 
> 
> 
> I would like to hear your opinion about the MLT feature and if it's a
> good solution to what I need to implement.
> 
> 
> 
> My index has fields like: headline, body and medianame.
> 
> What I need to do is, before adding a new doc, verify if a similar doc
> exists for this media.
> 
> 
> 
> My idea is to use the MoreLikeThisHandler
> (http://wiki.apache.org/solr/MoreLikeThisHandler) in the following way:
> 
> 
> 
> For each new doc, perform a MLT search with q= medianame and
> stream.body=headline+bodytext.
> 
> If no similar docs are found then I can safely add the doc.
> 
> 
> 
> Is this feasible using the MLT handler? Is it a good approach? Is there
> a better way to perform this comparison?
> 
> 
> 
> Thank you for your help.
> 
> 
> 
> Best regards,
> 
> ____________________________________________
> 
> Frederico Azeiteiro
> 
> 
> 
