krickert commented on PR #1084:
URL: https://github.com/apache/opennlp/pull/1084#issuecomment-4702648323

   These are all great points.   It's never easy to make things thread safe :) 
So thank you for your patience.
   
   I just updated to addresses items 1 & 2 and the API/close notes.
   
   ### 1. Chunking consistency (`DocumentCategorizerDL` / `NameFinderDL`)
   
   Took your preferred path: extracted shared chunking into 
`AbstractDL.chunkRanges(...)` and wired **both** `tokenize()` implementations 
through it.
   
   For `len=10, split=8, overlap=4`, both now produce `[0,8)` and `[4,10)` — no 
redundant trailing `[8,10)` chunk. `NameFinderDL` is updated to match the 
corrected tail behavior rather than leaving the siblings on different loops.
   
   Added `ChunkRangesTest` with explicit coverage for that case (plus a few 
adjacent scenarios).
   
   ### 2. Comment drift
   
   Updated both `tokenize()` comments to describe overlapping chunks driven by 
`InferenceOptions` (`documentSplitSize` / `splitOverlapSize`), and removed the 
stale hardcoded "200 word / 50 overlapping" wording.
   
   ### 3. API surface / `static`
   
   Reverted `loadVocab(...)` and `createTokenizer(...)` to **instance** methods 
(`public` / `protected`), so there’s no `public static` break for external 
callers. The actual implementation lives in package-private static helpers 
(`loadVocabFile`, `createBertTokenizer`, `createWordpieceTokenizer`) used 
internally and by in-module tests.
   
   On the implicit no-arg `AbstractDL` constructor removal: unchanged in this 
update — in-repo subclasses/tests already use the required-args constructor. 
Happy to call that out in release notes if you think it’s worth documenting for 
downstream extenders.
   
   ### Minor: terminal `close()`
   
   `close()` now treats the first attempt as terminal: `closed` is flipped via 
`compareAndSet` **before** `session.close()`, and we no longer reset it on 
failure. A failed native close won’t be retried into an “already closed” state.
   
   ### Verification
   
   ```
   ./mvnw -pl opennlp-core/opennlp-ml/opennlp-dl test
   # Tests run: 41, Failures: 0, Errors: 0, Skipped: 0 — BUILD SUCCESS
   ```
   
   Let me know if you’d like anything else adjusted.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to