rzo1 commented on PR #1003:
URL: https://github.com/apache/opennlp/pull/1003#issuecomment-4156851451
Hi,
Thanks for the contribution! Overall, I like the idea of looking into
built-in thread safety rather than relying on ThreadLocal-based wrappers, which
have known issues in Jakarta EE and other long-lived thread environments.
A few concerns I'd like to discuss before this can move forward (imho):
1. Benchmarks (no JMH)
The benchmarks are hand-rolled System.nanoTime() + ExecutorService loops.
Without JMH, the results are susceptible to JIT warmup, GC pauses, and profile
pollution, i.e. there's no fork isolation, no warmup iterations, and no
statistical variance reporting. For a change that removes multiple
caching layers and claims "performance within noise," JMH would be
preferable. We already have a profile for JMH benchmarks.
2. Caches removed without replacement
Three layers of caching were removed as a shortcut to thread safety:
- `CachedFeatureGenerator`: the class is now a pass-through that caches
nothing, despite its name. During beam search, the same token position may be
feature-generated up to k times (beam width) with identical inputs. This cache
was saving real work.
- `DefaultPOSContextGenerator` / `ConfigurablePOSContextGenerator`:
per-sentence context caches removed entirely.
- `BeamSearch.contextsCache: you note this was buggy (stale references
from the shared probs[] buffer). That may be valid, but removing it rather than
fixing it (e.g., storing copies) conflates a bug fix with the thread-safety
change.
The regression benchmark reports "performance within noise," but without
JMH-level statistical rigor that's hard to verify. More importantly, the
benchmark uses a small set of short sentences: a benchmark against a real-world
dataset (e.g., from the eval/test corpora:
https://nightlies.apache.org/opennlp/) would be far more convincing,
particularly for POS tagging where the feature generation cache had the most
impact under larger workloads. A thread-safe alternative would be making the
caches method-local rather than removing them entirely.
3. Thread-safety tests are not robust
- No contention forcing: there's no CyclicBarrier or CountDownLatch to
ensure threads hit the critical section simultaneously. Threads free-run, which
reduces the probability of surfacing races.
- LemmatizerME was patched but has no thread-safety test.
- probs() under concurrent access is not tested, despite being preserved
as volatile for backward compatibility.
- The test could pass on a 2-core CI machine and fail on a 64-core box. I
think, that we sohuld at minimum set higher iteration counts with
barrier-synchronized thread starts.
4. Missing coverage
- Only 4 of 7 ME classes are addressed (ChunkerME, NameFinderME,
LanguageDetectorME are untouched). This is fine as a scoped PR, but worth
noting, so the existing ThreadSafe*ME wrappers can't be deprecated yet.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]