Following up on Mikhail's good insights, I would recommend using the More Like This Query Parser, followed by grouping or field collapsing on a field that is shared by all editions (the title, in your case). It should solve your problem!
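To make the suggestion concrete, here is a minimal sketch of the request parameters, assuming a collection called `books`, a uniqueKey value `book-123`, MLT fields `title` and `description`, and a single-valued string field `title_str` that holds the title. All of these names are hypothetical; adapt them to your schema. The `{!mlt}` and `{!collapse}` local-params syntax is the standard one for the MLT Query Parser and the Collapsing Query Parser.

```python
from urllib.parse import urlencode


def build_mlt_collapse_params(doc_id, mlt_fields, collapse_field, rows=10):
    """Build Solr request parameters: MLT query parser plus result collapsing.

    doc_id         -- uniqueKey of the source document (hypothetical value below)
    mlt_fields     -- fields MLT should use to compute similarity
    collapse_field -- single-valued field shared by all editions of a book
    """
    qf = ",".join(mlt_fields)
    return {
        # MLT Query Parser: find documents similar to the one with this uniqueKey.
        "q": f"{{!mlt qf={qf} mintf=1 mindf=1}}{doc_id}",
        # Collapsing Query Parser: keep one document per distinct collapse_field value.
        "fq": f"{{!collapse field={collapse_field}}}",
        "rows": rows,
    }


# Field, collection, and document names here are assumptions for illustration.
params = build_mlt_collapse_params("book-123", ["title", "description"], "title_str")
print("/solr/books/select?" + urlencode(params))
```

Note that collapsing only reduces each title to a single result. Other editions of the source book itself would still show up as one entry, so you may also want a negative filter on the source title if those should disappear entirely. Result grouping (`group=true&group.field=title_str`) is the alternative if you need the grouped editions returned alongside the top hit.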
If your requirements are more advanced, feel free to let us know!

Cheers
--------------------------
*Alessandro Benedetti*
Director @ Sease Ltd.
*Apache Lucene/Solr Committer*
*Apache Solr PMC Member*

e-mail: [email protected]

*Sease* - Information Retrieval Applied
Consulting | Training | Open Source

Website: Sease.io <http://sease.io/>
LinkedIn <https://linkedin.com/company/sease-ltd> | Twitter <https://twitter.com/seaseltd> | Youtube <https://www.youtube.com/channel/UCDx86ZKLYNpI3gzMercM7BQ> | Github <https://github.com/seaseltd>

On Wed, 12 Apr 2023 at 13:15, Mikhail Khludnev <[email protected]> wrote:

> Hello Tom.
> It's not clear which kind of MLT you are referring to: handler, query parser
> or component.
> Generally there are two options for deduplication:
> - query time: field grouping or field collapsing
> - index time:
>   - the MLT query might be limited to parents with titles, and children might
>     carry editions with dates and so on
>   - or the MLT query can be filtered to the most recent edition only for every
>     title; a recent-flag should then be set during indexing and used by the
>     filter.
>
> On Wed, Apr 12, 2023 at 1:22 PM Tom Tailor <[email protected]> wrote:
>
> > Hi all
> >
> > I want to build a recommender using Solr MoreLikeThis. I work on
> > bibliographic data, i.e. books. I have multiple records of different
> > editions of the same book. For a given book, MLT returns all the different
> > editions of the book; this is not new content from the user's point of view.
> > I cannot deduplicate the records because the different editions are
> > relevant for other applications.
> >
> > Is it possible to circumvent this? I could use the book's title, which is the
> > same across all editions, to filter duplicates from the MLT results.
> >
> > Thanks for your help
>
> --
> Sincerely yours
> Mikhail Khludnev
> https://t.me/MUST_SEARCH
> A caveat: Cyrillic!
