Hello,

I am using Lucene for plagiarism detection.

The goal is this: when I receive a new document, I want to check the Solr
index for any document that shares some chunks with it.

To compute the similarity between the query (the suspicious document) and a
source document, I would like to use this formula:

Score(suspicious document, source document) = number of chunks common to
the source document and the suspicious document / total number of chunks in
the suspicious document.
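
To make the formula concrete, here is a minimal sketch in plain Java of the
score I have in mind. It does not use any Lucene or Solr API. The class name
ChunkOverlapScore and the method score are only illustrative, and I am
assuming each document has already been reduced to a set of distinct chunk
strings that are compared exactly.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Illustrative only: this class is hypothetical and not part of Lucene or Solr.
class ChunkOverlapScore {

    // Returns (number of common chunks) / (number of chunks in the suspicious document).
    // Assumption: chunks are exact strings and each distinct chunk is counted once.
    static double score(Set<String> suspiciousChunks, Set<String> sourceChunks) {
        if (suspiciousChunks.isEmpty()) {
            return 0.0; // avoid division by zero for an empty suspicious document
        }
        Set<String> common = new HashSet<>(suspiciousChunks);
        common.retainAll(sourceChunks); // keep only chunks present in both documents
        return (double) common.size() / suspiciousChunks.size();
    }

    public static void main(String[] args) {
        Set<String> suspicious = new HashSet<>(Arrays.asList("c1", "c2", "c3", "c4"));
        Set<String> source = new HashSet<>(Arrays.asList("c2", "c4", "c9"));
        // 2 common chunks out of 4 chunks in the suspicious document
        System.out.println(score(suspicious, source));
    }
}
```

Note that this score depends only on the size of the suspicious document, not
on the size of the source document, so it is not symmetric.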

So I believe I have to change the scoring formula in the Similarity class.

How can I change the scoring formula? Is customizing the Similarity class
enough, or do I also need a custom Scorer?

Do you have an example of this use case?

Thanks for your help.
