When I run a fuzzy search inside a BooleanQuery, the default behavior appears to be to score each fuzzy match separately and then sum those scores into an aggregate. I need it to score on the maximum over the distinct matched terms instead of the sum, so that documents containing several different near-matches don't get inflated scores.
For example, consider the query "Bstn~2" and four documents containing "Boston", "Basin", "Boston Basin", and "Boston Boston Basin". The query might score them as roughly 1, 1, 2, and 3 respectively (the exact values depend on the scorer in use). I need 1, 1, 1, and 2 instead, since that is the count of the single most frequent fuzzy-matched term in each document.

Ideally I'd use a built-in mechanism for this. Failing that, I'd be happy to extend BooleanQuery, BooleanWeight, and/or BooleanScorer so that they use slightly different scoring logic but otherwise behave exactly the same. As near as I can tell, though, all of those classes are either final or have no public constructor, which makes it effectively impossible to reuse their logic directly.

If anyone has ideas on how to approach this, it would be very helpful.

Thanks,
Kainoa
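
To make the desired behavior concrete, here is a minimal plain-Java sketch (no Lucene involved) of the two aggregation rules applied to the example above. Everything in it is made up for illustration: the class name FuzzyScoreSketch, the helpers editDistance, matchCounts, sumScore and maxScore, the whitespace tokenization, the lowercasing, and the use of raw term counts as a stand-in for real scores. It also uses plain Levenshtein distance, whereas Lucene's fuzzy matching may treat transpositions differently.

```java
import java.util.HashMap;
import java.util.Map;

// Illustration only: raw term counts stand in for real Lucene scores.
public class FuzzyScoreSketch {

    // Plain Levenshtein distance (insert, delete, substitute), two-row version.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Count occurrences of each distinct token within maxEdits of the query term.
    static Map<String, Integer> matchCounts(String doc, String query, int maxEdits) {
        Map<String, Integer> counts = new HashMap<>();
        String q = query.toLowerCase();
        for (String token : doc.toLowerCase().split("\\s+")) {
            if (editDistance(token, q) <= maxEdits) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    // What I seem to get today: contributions of all matched terms are summed.
    static int sumScore(String doc, String query, int maxEdits) {
        return matchCounts(doc, query, maxEdits).values().stream()
                .mapToInt(Integer::intValue).sum();
    }

    // What I want: only the most frequent distinct matched term counts.
    static int maxScore(String doc, String query, int maxEdits) {
        return matchCounts(doc, query, maxEdits).values().stream()
                .mapToInt(Integer::intValue).max().orElse(0);
    }

    public static void main(String[] args) {
        String[] docs = {"Boston", "Basin", "Boston Basin", "Boston Boston Basin"};
        for (String doc : docs) {
            System.out.println(doc + ": sum=" + sumScore(doc, "Bstn", 2)
                    + ", max=" + maxScore(doc, "Bstn", 2));
        }
    }
}
```

With the four documents above, sumScore gives 1, 1, 2, 3 and maxScore gives 1, 1, 1, 2. The second set is the behavior I'm after, just computed by the real scorer rather than by counting tokens.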