rzo1 commented on PR #1106:
URL: https://github.com/apache/opennlp/pull/1106#issuecomment-4762682210
Thx for the PR. Here are some suggestions:
- Declare xmlns:xlink explicitly on the chapter roots of normalizer.xml and
tokenizer.xml. Both use xlink:href (normalizer.xml:457, tokenizer.xml:461) but
rely on the DocBook 5.0 DTD's #FIXED default to bind the prefix. The Maven
build resolves the DTD so it works, but it breaks under any non-validating
namespace-aware tool (IDE linters, xmllint --nonet), and every other chapter
declares it explicitly:
normalizer.xml: <chapter xml:id="tools.normalizer"
xmlns:xlink="http://www.w3.org/1999/xlink">
tokenizer.xml: <chapter xml:id="tools.tokenizer"
xmlns:xlink="http://www.w3.org/1999/xlink">
Note that tokenizer.xml newly introduces xlink usage, so this is the first
chapter to add it there.
Otherwise the content is accurate. I cross-checked every class, method,
constructor, builder chain, enum, and stated default against the code in
#1103/#1104/#1105 and they all match, including the "off by default"
InferenceOptions claims, the findInOriginal guidance, and the lossy
matching-only warnings for confusable and accent folding. Chapter registration
in opennlp.xml, xml:id uniqueness, and xref targets all resolve.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]