Following up on Mikhail's good insights,
I would probably recommend using the More Like This Query Parser followed
by grouping/field collapsing on a field.
It should solve your problem!
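
As a rough sketch of what such a request could look like (not a tested configuration): the snippet below only builds the parameters for a standard /select request. The field names `description` and `title_str` and the document id `book-123` are hypothetical; the collapse field is assumed to be a single-valued string field holding the title.

```python
from urllib.parse import urlencode


def build_mlt_collapse_params(doc_id, mlt_field, collapse_field, rows=10):
    """Build Solr /select parameters: MLT query parser plus collapsing.

    doc_id is the uniqueKey value of the source book; mlt_field and
    collapse_field are hypothetical field names, adjust to your schema.
    """
    # MLT query parser: find documents similar to doc_id on mlt_field.
    mlt_query = "{!mlt qf=%s mintf=1 mindf=1}%s" % (mlt_field, doc_id)
    # Collapsing query parser: keep one document per distinct title.
    collapse_filter = "{!collapse field=%s}" % collapse_field
    return {"q": mlt_query, "fq": collapse_filter, "rows": rows}


params = build_mlt_collapse_params("book-123", "description", "title_str")
query_string = urlencode(params)  # append to .../solr/<collection>/select?
```

Note that the other editions of the source book itself share its title, so they would probably still show up as one collapsed result; an extra negative filter on the source title may be needed to drop them.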

If your requirements are more advanced, feel free to let us know!

Cheers
--------------------------
*Alessandro Benedetti*
Director @ Sease Ltd.
*Apache Lucene/Solr Committer*
*Apache Solr PMC Member*

e-mail: [email protected]




On Wed, 12 Apr 2023 at 13:15, Mikhail Khludnev <[email protected]> wrote:

> Hello Tom.
> It's not clear which kind of MLT you are referring to: handler, query parser
> or component.
> Generally there are two options for deduplication:
> - query time: field grouping or field collapsing
> - index time:
>   - the MLT query might be limited to parents with titles, while children
> carry editions with dates and so on
>   - or the MLT query can be filtered to the most recent edition of each
> title; a recent-flag would be set during indexing and then used by the
> filter.
>
> On Wed, Apr 12, 2023 at 1:22 PM Tom Tailor <[email protected]> wrote:
>
> > Hi all
> >
> >
> >
> > I want to build a recommender using Solr MoreLikeThis. I work on
> > bibliographic data, i.e. books. I have multiple records of different
> > editions of the same book. For a given book, MLT returns all the different
> > editions of the book; this is not new content from the user's point of
> > view.
> > I cannot deduplicate the records because the different editions are
> > relevant for other applications.
> >
> >
> >
> > Is it possible to circumvent this? I could use the book's title, which is
> > the same across all editions, to filter duplicates from the MLT results.
> >
> >
> >
> > Thanks for your help
> >
>
>
> --
> Sincerely yours
> Mikhail Khludnev
> https://t.me/MUST_SEARCH
> A caveat: Cyrillic!
>
