krickert commented on PR #1105:
URL: https://github.com/apache/opennlp/pull/1105#issuecomment-4803069606

   @rzo1 Thanks — the backward-compat surface is now written up, and the three 
minor items are addressed (tip `ed5c7777`).
   
   **Migration / behavior-change note.** Added a "Behavior changes in this 
release" section to the `opennlp-dl` README covering all four impacts you 
listed:
   - removal of public `NameFinderDL.I_PER` / `B_PER` (source + binary break 
for external referrers);
   - `find()` reports coordinates of the joined, possibly-normalized input — 
original-text coordinates come from `findInOriginal()`, and the two differ only 
under a length-changing dash fold;
   - `DocumentCategorizerDL.categorize()` now rejects null or empty input, as 
well as a document that contains no non-whitespace token;
   - chunking moved from `split("\\s+")` to the full Unicode `White_Space` set, 
which affects all DL callers (not just opt-in users) and can shift chunk 
boundaries on non-ASCII whitespace.
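
   To make the last bullet concrete, here is a minimal sketch of how the two splitting rules can diverge on the same input. The class and method names are hypothetical, and `\p{IsWhite_Space}` is only an assumed way of expressing the Unicode `White_Space` set in a Java regex; it is not taken from the PR.

```java
import java.util.Arrays;

public class WhitespaceSplitSketch {

    // Previous rule as described above. Without UNICODE_CHARACTER_CLASS,
    // Java's \s matches only the ASCII set [ \t\n\x0B\f\r].
    static String[] splitAscii(String text) {
        return text.split("\\s+");
    }

    // Assumed stand-in for the new rule: the Unicode White_Space binary
    // property, which also covers code points such as U+00A0 and U+2003.
    static String[] splitUnicode(String text) {
        return text.split("\\p{IsWhite_Space}+");
    }

    public static void main(String[] args) {
        // U+00A0 (NO-BREAK SPACE) sits between "New" and "York".
        String text = "New\u00A0York City";
        System.out.println(Arrays.toString(splitAscii(text)));
        System.out.println(Arrays.toString(splitUnicode(text)));
    }
}
```

   With this input the ASCII rule yields two tokens and the Unicode rule yields three, which is the kind of boundary shift an existing caller could observe.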
   
   **Message capitalization.** Capitalized all the new exception messages to 
match the surrounding ones, including the parameter-led validation messages 
(`The documentSplitSize must be greater than zero.`, 
`The splitOverlapSize must…`, `The tokenCount must…`, 
`The strings argument must…`).
   
   **`mergeOverlappingSpans` returning the input by reference.** Now returns 
`new ArrayList<>(spans)` on the `size < 2` path so the caller always owns the 
result (matching the merging path); added a test asserting the trivial-input 
result is a distinct list.
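
   The ownership contract can be sketched as follows. This is a simplified illustration, not the PR's code: spans are modeled as `{start, end}` int pairs with an exclusive end, and the class name and merge rule are assumptions.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class MergeSpansSketch {

    // Hypothetical simplification: each span is an int pair {start, end}.
    static List<int[]> mergeOverlappingSpans(List<int[]> spans) {
        if (spans.size() < 2) {
            // Trivial input: return a copy so the caller owns the result,
            // just as on the merging path below.
            return new ArrayList<>(spans);
        }
        List<int[]> sorted = new ArrayList<>(spans);
        sorted.sort(Comparator.comparingInt((int[] s) -> s[0]));

        List<int[]> merged = new ArrayList<>();
        int[] current = sorted.get(0).clone();
        for (int i = 1; i < sorted.size(); i++) {
            int[] next = sorted.get(i);
            if (next[0] < current[1]) {
                // Overlap: extend the span being built.
                current[1] = Math.max(current[1], next[1]);
            } else {
                merged.add(current);
                current = next.clone();
            }
        }
        merged.add(current);
        return merged;
    }

    public static void main(String[] args) {
        List<int[]> single = new ArrayList<>();
        single.add(new int[] {0, 5});

        List<int[]> result = mergeOverlappingSpans(single);
        result.clear(); // mutating the result must not touch the input
        System.out.println(single.size());
    }
}
```

   Note that the copy on the trivial path is shallow: the list is new, but the elements are shared with the input.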
   
   **Untested guards.** Extracted both into package-private helpers so they're 
unit-testable without a live ONNX session (mirroring the existing 
`softmax`/`tokenIds` seams): `logitsFromOutput(Object)` covers the `infer()` 
null-output and unexpected-shape paths, and 
`requireMatchingCategoryCount(double[], int)` covers the 
`distribution.length != categories.size()` mismatch. Both now have direct tests.
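
   For illustration, a guard of the second kind might look like the sketch below. Only the helper name and parameter types come from the comment above; the `void` return type, the exception type, the message wording, and the enclosing class are assumptions.

```java
public class CategoryCountGuardSketch {

    // Package-private so a test in the same package can call it directly,
    // without constructing a model or an ONNX session.
    static void requireMatchingCategoryCount(double[] distribution, int categoryCount) {
        if (distribution.length != categoryCount) {
            // Exception type and message are assumed for this sketch.
            throw new IllegalStateException("The model returned " + distribution.length
                + " scores but " + categoryCount + " categories are configured.");
        }
    }

    public static void main(String[] args) {
        // Matching sizes pass silently.
        requireMatchingCategoryCount(new double[] {0.2, 0.8}, 2);

        // A mismatch is reported instead of surfacing later as an index error.
        try {
            requireMatchingCategoryCount(new double[] {0.2, 0.8}, 3);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

   Keeping the check in a static helper with plain-array inputs is what lets a unit test exercise the mismatch path directly.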
   

