rzo1 opened a new pull request, #1118:
URL: https://github.com/apache/opennlp/pull/1118

   Backport of #1117 to `opennlp-2.x`.
   
   ## Problem
   
   Release/CI builds of `opennlp-docs` fail non-deterministically while reading 
the DocBook sources:
   
   ```
   Failed to transform opennlp.xml.: Failure reading .../opennlp.xml:
   Remote host terminated the handshake: SSL peer shut down incorrectly
   ```
   
   ## Root cause
   
   The docbkx-maven-plugin already loads an **offline** XML catalog from the 
bundled `net.sf.docbook:docbook-xml:5.0-all:resources` dependency. That catalog 
maps:
   
   - public id `-//OASIS//DTD DocBook XML 5.0//EN`
   - system ids `http://www.oasis-open.org/docbook/xml/5.0/dtd/docbook.dtd` and 
`http://docbook.org/xml/5.0/dtd/docbook.dtd`
   
   But the manual sources declared a DOCTYPE that matched **neither**:
   
   - public id `-//OASIS//DTD DocBook XML V5.0//EN` (extra `V`)
   - system id `https://cdn.docbook.org/schema/5.0/dtd/docbook.dtd` (different 
host, `cdn.docbook.org`)
   
   With no catalog match, the parser falls back to fetching the DTD from 
`cdn.docbook.org` over the network at build time. That host intermittently 
resets the TLS handshake, so the build fails at random. The "SSL handshake" 
error is a symptom, not the cause.
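
The lookup described above can be modelled roughly as follows. This is a simplified sketch, not the resolver the docbkx-maven-plugin actually uses: real catalog resolution has more rules, and `LOCAL_DTD` is a hypothetical path standing in for the DTD bundled in the `docbook-xml` artifact. The identifiers are the ones quoted in this description.

```python
# Simplified model of XML catalog resolution; not the plugin's real resolver.
# LOCAL_DTD is a hypothetical location for the bundled DTD.
LOCAL_DTD = "docbook-xml/dtd/docbook.dtd"

CATALOG_PUBLIC = {"-//OASIS//DTD DocBook XML 5.0//EN": LOCAL_DTD}
CATALOG_SYSTEM = {
    "http://www.oasis-open.org/docbook/xml/5.0/dtd/docbook.dtd": LOCAL_DTD,
    "http://docbook.org/xml/5.0/dtd/docbook.dtd": LOCAL_DTD,
}


def resolve(public_id, system_id):
    """Return (location, needs_network) for a DOCTYPE's external identifiers."""
    local = CATALOG_PUBLIC.get(public_id) or CATALOG_SYSTEM.get(system_id)
    if local is not None:
        return local, False
    # No catalog entry: the parser dereferences the system id directly.
    return system_id, True


# DOCTYPE as previously declared in the manual sources: misses the catalog.
old = resolve("-//OASIS//DTD DocBook XML V5.0//EN",
              "https://cdn.docbook.org/schema/5.0/dtd/docbook.dtd")
# DOCTYPE aligned with the catalog: resolved locally.
new = resolve("-//OASIS//DTD DocBook XML 5.0//EN",
              "http://docbook.org/xml/5.0/dtd/docbook.dtd")
print(old)
print(new)
```

In this model the old identifiers come back flagged as needing the network, while the aligned ones map to the local copy.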
   
   ## Fix
   
   Align the DOCTYPE public/system identifiers in all 
`opennlp-docs/src/docbkx/*.xml` files with the bundled catalog, so the DTD 
resolves locally. The DTD is **retained** (PDF/FO generation depends on it).
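
As an illustration only, a rewrite of this kind could be scripted along these lines. The sketch assumes each file has a single `<!DOCTYPE ... PUBLIC "..." "...">` declaration with double-quoted identifiers, and it picks `http://docbook.org/xml/5.0/dtd/docbook.dtd` as the system id; this description does not say which of the two catalog system ids the change actually uses. The helper names `align_doctype` and `align_directory` are hypothetical.

```python
import re
from pathlib import Path

# Identifiers taken from the bundled catalog as quoted above.
# Assumption: which of the two catalog system ids to use is not stated here.
PUBLIC_ID = "-//OASIS//DTD DocBook XML 5.0//EN"
SYSTEM_ID = "http://docbook.org/xml/5.0/dtd/docbook.dtd"

# Matches: <!DOCTYPE name PUBLIC "public id" "system id"
# Anything after the system id (e.g. an internal subset) is left untouched.
DOCTYPE_RE = re.compile(r'(<!DOCTYPE\s+\S+\s+PUBLIC\s+)"[^"]*"(\s+)"[^"]*"')


def align_doctype(xml_text):
    """Replace the first DOCTYPE's public/system ids with the catalog's."""
    return DOCTYPE_RE.sub(
        lambda m: f'{m.group(1)}"{PUBLIC_ID}"{m.group(2)}"{SYSTEM_ID}"',
        xml_text,
        count=1,
    )


def align_directory(directory):
    """Rewrite every *.xml file in `directory` in place; return changed names."""
    changed = []
    for path in sorted(Path(directory).glob("*.xml")):
        original = path.read_text(encoding="utf-8")
        updated = align_doctype(original)
        if updated != original:
            path.write_text(updated, encoding="utf-8")
            changed.append(path.name)
    return changed
```

Calling `align_directory("opennlp-docs/src/docbkx")` would then report which files were touched; running it a second time should report none.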
   
   ## Verification
   
   Ran the `opennlp-docs` `package` build (HTML + PDF) on this branch in 
**offline mode behind a dead HTTP/HTTPS proxy**, so any network access would 
fail instantly. The build ends in `BUILD SUCCESS` and generates both 
`opennlp.html` and `opennlp.pdf`.


