Re: [I] IBSimilarity causes test failure in TestSynonymQuery [lucene]

via GitHub Thu, 28 May 2026 13:09:10 -0700


iprithv commented on issue #16137:
URL: https://github.com/apache/lucene/issues/16137#issuecomment-4567796246


   I tested the expm1 approach and it turns out to be worse, not better. I ran 
it across realistic lambda values (df/N for all df=1..N-1, N=2..1000):
   
   - Original formula + guards: 0 negative scores in the entire search space
   - Expm1 reformulation: 142,948 negative scores, all at tfn=0
   
   These aren't tiny doubles that vanish on narrowing; they survive the float 
cast in BasicSimScorer.score() (-1.1e-16 as float is still -1.1e-16), so 
BaseSimilarityTestCase's assertTrue(score >= 0) would fail.
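   
   To illustrate the float-cast point, here is a minimal standalone sketch. 
The class and method names are hypothetical and not part of Lucene; it only 
shows how Java narrows a small negative double to float.

```java
// Hypothetical demo class, not Lucene code.
class FloatCastDemo {
    /** Narrows a double score to float, the way a plain (float) cast does. */
    static float narrow(double score) {
        return (float) score;
    }

    public static void main(String[] args) {
        // -1.1e-16 is far above the float underflow threshold (~1.4e-45),
        // so the cast keeps it as a small negative float.
        System.out.println(narrow(-1.1e-16) < 0f);

        // Negative zero still satisfies a ">= 0" check under IEEE 754 comparison.
        System.out.println(narrow(-0.0) >= 0f);
    }
}
```

   So a score of -0.0 would pass a `score >= 0` assertion, while -1.1e-16 
would not.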
   
   I also looked at the `log1p` reformulation, `-log1p((pow - 1) / (1 - 
lambda))`, which is mathematically equivalent. It produces the same -0.0 for 
the edge cases, its results differ by ~1e-15 for normal inputs due to 
different rounding paths, and it still needs the pow == lambda guard and 
clamping on top. So it doesn't actually solve the problem; it just shifts the 
rounding error around.
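   
   A rough sketch of that comparison follows. It assumes the log form is 
`-log((pow - lambda) / (1 - lambda))` with `pow = lambda^(tfn / (tfn + 1))`, 
which is what the `log1p` expression above implies algebraically; the class 
and method names are hypothetical, and the guards are deliberately left out.

```java
// Hypothetical comparison of the two algebraically equivalent forms.
// Assumes 0 < lambda < 1 and tfn >= 0; no guards or clamping applied.
class Log1pComparison {
    static double pow(double tfn, double lambda) {
        // Assumed exponent: tfn / (tfn + 1)
        return Math.pow(lambda, tfn / (tfn + 1));
    }

    static double logForm(double tfn, double lambda) {
        return -Math.log((pow(tfn, lambda) - lambda) / (1 - lambda));
    }

    static double log1pForm(double tfn, double lambda) {
        return -Math.log1p((pow(tfn, lambda) - 1) / (1 - lambda));
    }

    public static void main(String[] args) {
        // At tfn = 0, pow is exactly 1.0, so both forms negate a positive zero.
        System.out.println(logForm(0, 0.5));
        System.out.println(log1pForm(0, 0.5));
        // For an ordinary input the two forms agree up to rounding.
        System.out.println(Math.abs(logForm(1, 0.5) - log1pForm(1, 0.5)));
    }
}
```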
   
   I also tried the `if (fraction >= 1.0) return 0.0` approach, which is 
essentially the same as the clamp and solves the issue.
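   
   As a minimal sketch of that early-return guard (not the actual patch): 
the formula, the exponent, and the simplified pow == lambda nudge below are 
assumptions, and the class name is hypothetical. It assumes 0 < lambda < 1.

```java
// Hypothetical sketch of the early-return guard; not the Lucene implementation.
class FractionGuardSketch {
    static double score(double tfn, double lambda) {
        // Assumed exponent: tfn / (tfn + 1), written as 1 - 1 / (tfn + 1).
        double q = 1 - 1 / (tfn + 1);
        double pow = Math.pow(lambda, q);
        if (pow == lambda) {
            // Rounding can collapse pow onto lambda, which would make the
            // fraction 0 and the log infinite; nudge it up (valid for lambda < 1).
            pow = Math.nextUp(lambda);
        }
        double fraction = (pow - lambda) / (1 - lambda);
        if (fraction >= 1.0) {
            // Covers tfn = 0 and any rounding overshoot: return +0.0
            // instead of -0.0 or a tiny negative value.
            return 0.0;
        }
        return -Math.log(fraction);
    }

    public static void main(String[] args) {
        System.out.println(score(0, 0.5)); // guard path
        System.out.println(score(1, 0.5)); // regular path
    }
}
```

   The early return plays the same role as clamping the final score at zero; 
it just applies the check before taking the log.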


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

