Re: [I] IBSimilarity causes test failure in TestSynonymQuery [lucene]

via GitHub Thu, 28 May 2026 12:15:40 -0700


iprithv commented on issue #16137:
URL: https://github.com/apache/lucene/issues/16137#issuecomment-4567414301


   @romseygeek ah got it. sharing what I found while digging into this.
   
   the negative score is actually just `-0.0`, coming from `-Math.log(1.0)`. this happens when `(pow - lambda) / (1 - lambda)` becomes exactly `1.0`.
   one common case is when `tfn = 0`. then `q = 0`, so `pow = lambda^0 = 1.0`, and the fraction becomes `(1 - lambda) / (1 - lambda) = 1.0`, so we end up with `-log(1.0) = -0.0`. the `nextUp` case can also hit this: if lambda is very close to 1, it can push `pow` to exactly `1.0` and the same thing happens.
   the bigger issue is the `pow - lambda` subtraction. when the two values are close, that subtraction loses precision (catastrophic cancellation), so the result gets unstable. the current `nextUp`/`nextDown` guards are a workaround but don't fully cover it.
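
A minimal sketch of the `tfn = 0` case described above, to make the `-0.0` concrete. It is not the Lucene source: the class and method names are hypothetical, the `nextUp`/`nextDown` guards are left out, and `q = 1 - 1/(tfn + 1)` is an assumption chosen to match the statement that `tfn = 0` gives `q = 0`.

```java
final class SplNegativeZeroDemo {
  // Naive form of the expression discussed above:
  //   -log((lambda^q - lambda) / (1 - lambda))
  // ASSUMPTION: q = 1 - 1/(tfn + 1), so that tfn = 0 gives q = 0.
  static double naiveScore(double tfn, double lambda) {
    double q = 1 - 1 / (tfn + 1);
    double pow = Math.pow(lambda, q); // lambda^0 == 1.0 exactly
    return -Math.log((pow - lambda) / (1 - lambda));
  }

  // -0.0 == 0.0 is true in Java, so compare the raw bit patterns instead.
  static boolean isNegativeZero(double x) {
    return Double.doubleToRawLongBits(x) == Double.doubleToRawLongBits(-0.0d);
  }

  public static void main(String[] args) {
    double s = naiveScore(0.0, 0.5);
    System.out.println("score = " + s);
    System.out.println("negative zero: " + isNegativeZero(s));
    System.out.println("s < 0: " + (s < 0));
    System.out.println("Double.compare(s, 0.0): " + Double.compare(s, 0.0));
  }
}
```

Note that `s < 0` is false for `-0.0`, while `Double.compare(s, 0.0)` is negative, so whether the value counts as "negative" depends on how the check is written.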
   
   I tried a few options:
   1. rewriting using `Math.expm1` to avoid that subtraction. since `lambda^q - lambda = lambda * expm1((q - 1) * log(lambda))`, the score becomes `-log(lambda) - log(|expm1(-r * log(lambda))|) + log(|1 - lambda|)`, where `r = 1 - q`. this makes it much more stable and removes the need for the guards. worked well in most cases, though I still saw tiny negatives (~1e-15) in some edge cases
   2. just clamping to 0. `Axiomatic` already does `Math.max(0, score)` for the same kind of reason (its gamma component can produce negatives). simple, but doesn't fix the underlying instability
   3. combining both: use the stable formula and still clamp small negatives as a safety net
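
A rough sketch of the combined option (3), written from the formulas quoted in the list above rather than from the PR itself. The class and method names are hypothetical, and `r = 1/(tfn + 1)` (that is, `r = 1 - q` with `q = 1 - 1/(tfn + 1)`) is an assumption made so that the `expm1` identity lines up with the score expression.

```java
final class SplStableScoreSketch {
  // Naive form, kept only for comparison:
  //   -log((lambda^q - lambda) / (1 - lambda)),  ASSUMING q = 1 - 1/(tfn + 1)
  static double naive(double tfn, double lambda) {
    double q = 1 - 1 / (tfn + 1);
    return -Math.log((Math.pow(lambda, q) - lambda) / (1 - lambda));
  }

  // expm1-based form plus clamp:
  //   lambda^q - lambda = lambda * expm1((q - 1) * log(lambda))
  //   score = -log(lambda) - log(|expm1(-r * log(lambda))|) + log(|1 - lambda|)
  // ASSUMPTION: r = 1 - q = 1/(tfn + 1). Requires lambda > 0 and lambda != 1.
  static double stableClamped(double tfn, double lambda) {
    double r = 1 / (tfn + 1);
    double logLambda = Math.log(lambda);
    double score =
        -logLambda
            - Math.log(Math.abs(Math.expm1(-r * logLambda)))
            + Math.log(Math.abs(1 - lambda));
    // Safety net: Math.max orders -0.0 below 0.0, so this also turns -0.0 into +0.0.
    return Math.max(0.0, score);
  }

  public static void main(String[] args) {
    System.out.println(naive(3.0, 0.3) + " vs " + stableClamped(3.0, 0.3));
    System.out.println(naive(0.0, 0.5) + " vs " + stableClamped(0.0, 0.5));
  }
}
```

Away from the edge cases the two forms should agree to within rounding; the clamp only matters for `-0.0` and for the tiny negatives mentioned in option 1.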
   
   I now think the combined approach is probably the safest. will wait for feedback and update the PR accordingly :)


