Hi,

The idea is to not index a doc if something similar (headline+bodytext) already exists for the exact same medianame.
Do you mean I would need to index the doc first (maybe in a temp index) and then use the MLT feature to find similar docs before adding it to the final index?

Thanks,
Frederico

-----Original Message-----
From: Chris Fauerbach [mailto:chris.fauerb...@gmail.com]
Sent: Monday, 4 April 2011 10:22
To: solr-user@lucene.apache.org
Subject: Re: Using MLT feature

Do you want to not index if something similar exists? Or not index if it is an exact match? I would look into a hash code of the document if you don't want to index exact duplicates. Similar, though, I think has to be based off a document in the index.

On Apr 4, 2011, at 5:16, Frederico Azeiteiro <frederico.azeite...@cision.com> wrote:

> Hi,
>
> I would like to hear your opinion about the MLT feature and whether it's a
> good solution for what I need to implement.
>
> My index has fields like: headline, body and medianame.
>
> What I need to do is, before adding a new doc, verify whether a similar doc
> exists for this media.
>
> My idea is to use the MoreLikeThisHandler
> (http://wiki.apache.org/solr/MoreLikeThisHandler) in the following way:
>
> For each new doc, perform an MLT search with q=medianame and
> stream.body=headline+bodytext.
>
> If no similar docs are found then I can safely add the doc.
>
> Is this feasible using the MLT handler? Is it a good approach? Is there
> a better way to perform this comparison?
>
> Thank you for your help.
>
> Best regards,
> ____________________________________________
> Frederico Azeiteiro
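
For the exact-match case raised in the reply above (a hash code of the document), here is a minimal Python sketch of what such a check could look like on the client side, before the doc is sent to Solr. It is only an illustration under stated assumptions: the names `doc_signature` and `should_index` are hypothetical, the lowercase/whitespace normalization is one possible choice, and the in-memory `seen` set stands in for wherever the signatures would really be stored. It does not detect merely similar docs.

```python
import hashlib


def doc_signature(medianame, headline, body):
    """Return a hex digest for a (medianame, headline, body) triple.

    Hypothetical helper: the normalization below is an assumption,
    not something prescribed by Solr.
    """
    def norm(text):
        # Lowercase and collapse whitespace so trivial formatting
        # differences do not produce a different hash.
        return " ".join(text.lower().split())

    # Join with a control character so field boundaries stay unambiguous.
    joined = "\x1f".join(norm(f) for f in (medianame, headline, body))
    return hashlib.sha1(joined.encode("utf-8")).hexdigest()


def should_index(seen, medianame, headline, body):
    """Return True if no doc with the same signature was seen before.

    `seen` is a plain set of signatures (an assumption for this sketch).
    """
    sig = doc_signature(medianame, headline, body)
    if sig in seen:
        return False
    seen.add(sig)
    return True


if __name__ == "__main__":
    seen = set()
    print(should_index(seen, "Daily News", "Some headline", "Some body text"))
    print(should_index(seen, "Daily News", "Some headline", "Some body text"))
```

Because medianame is part of the signature, the same headline and body under a different medianame would still be indexed, which matches the "same exact medianame" requirement.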