: I index 1000 docs, 5 of them are 95% the same (for example: copy-pasted
: blog articles from different sources, with slight changes (author name,
: etc.)).
: But they have differences.
: *Now I would like to see 1 doc in my result set, and the other 4 should be
: marked as similar.*
Do you actually w
Hello folks,
I have questions about MLT and Deduplication and which would be the best
choice in my case.
Case:
I index 1000 docs, 5 of them are 95% the same (for example: copy-pasted
blog articles from different sources, with slight changes (author name,
etc.)).
But they have differences.
*Now I