krickert commented on PR #1003:
URL: https://github.com/apache/opennlp/pull/1003#issuecomment-4267147245

   @mawiesne @rzo1 — pushed two review-follow-up commits, branch is now at 
`26ef5152`:
   
   1. **`OPENNLP-1816: Drop strong Thread reference in 
LastResultOwnerOrThreadLocal`** (`7e442322`)
      - Track ownership via `Thread#threadId()` (long) instead of a `volatile 
Thread` reference, matching what `OwnerOrPerThreadState` already does. Holding 
a strong `Thread` reference in a long-lived component would pin the worker 
thread's context classloader in container environments — exactly the leak this 
class was designed to avoid. Worst-case under id-based tracking is that a 
recycled-id thread sees a stale `ownerValue` instead of `null`, which is no 
worse than the documented `get()` contract (callers `set()` before `get()` on 
every thread).
      - Added focused unit tests for both `LastResultOwnerOrThreadLocal` and 
`OwnerOrPerThreadState`: owner-fast-path, second-thread isolation, owner 
reclaim after `clearForCurrentThread()`, one-way multi-threaded transition, 
16-thread concurrent stress, scoped clear semantics. 12 new tests, all passing. 
Previously these helpers were only exercised indirectly through 
`ThreadSafetyBenchmarkIT`.
      - Class Javadoc on `LastResultOwnerOrThreadLocal` now documents the three 
thread-safety strategies side-by-side (owner-fast-path / volatile-swap / plain 
`ThreadLocal`) and which ME classes use which, so the reasoning isn't spread 
across seven files.
   
   2. **`OPENNLP-1816: BeamSearch tempScores reuse and thread-safety Javadoc 
polish`** (`26ef5152`)
      - `BeamSearch.CacheState` now stashes a per-thread `tempScores` buffer of 
length `numOutcomes` instead of allocating a fresh `double[]` inside the inner 
beam loop on every iteration. Functionally equivalent; removes one allocation 
per beam step per token on the cache hit/miss hot path.
      - `BeamSearch.close()` Javadoc spells out the per-thread cleanup 
contract: a single `close()` only releases the calling thread's `CacheState`, 
not every per-thread slot held by long-lived `BeamSearch` instances shared 
across pool threads.
      - `POSTaggerME` class Javadoc clarifies that sharing one tagger saves 
both memory **and model load time** (the dominant startup cost).
      - `CachedFeatureGenerator` class Javadoc calls out that the "is this 
still the same sentence?" check uses reference identity (`tokens == 
prevTokens`) — passing a freshly allocated `String[]` with the same contents 
triggers a cache miss + clear.
   
   Full local `verify` is green: **685 / 685 unit tests** + **19 / 19 
integration tests** (including all 8 `ThreadSafetyBenchmarkIT` cases), 
checkstyle clean.
   
   @rzo1 — would you mind re-triggering 
[eval-tests-configurable](https://ci-builds.apache.org/job/OpenNLP/job/eval-tests-configurable/)
 against the new tip `26ef5152`?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to