ppkarwasz opened a new pull request, #5: URL: https://github.com/apache/commons-xml/pull/5
## Summary

Two related changes to DOM (`DocumentBuilderFactory`) hardening.

### 1. Capability-driven hardening

Implements @elharo's suggestion in the [`xml-commons-dev@xerces` thread](https://lists.apache.org/thread/2yct4mkq9v0jf5xx3om1nbwbqqtgtbnm):

> Why not set all relevant features and properties on each and check that you succeeded in configuring a minimal set to provide security?

Replaces the per-implementation class-name dispatch for DOM with a single capability-driven recipe:

- Secure processing (FSP) is required; the external-DTD subset is skipped where supported.
- Whether the JAXP 1.5 `accessExternal*` properties are honoured then decides whether the bare factory is already safe or needs a deny-all resolver wrapper. This is the only point where the stock JDK and the external Xerces distribution diverge.
- Android stays untouched (its parser exposes no hardening surface).
- An implementation is no longer rejected for being unrecognised: any parser that accepts secure processing and either the access properties or the resolver wrapper satisfies the contract. Only a parser that refuses secure processing now fails.
- Adds a test that an `xsi:schemaLocation` hint is not fetched during DOM-side XSD validation (gated on parsers that honour `accessExternalSchema`).

### 2. Minimal shading entry point

Adds a public `DocumentBuilderHardener.newInstance()` so consumers that only need a hardened `DocumentBuilderFactory` can shade the library and copy a minimal set of classes.

- `jdependency` (the engine `maven-shade-plugin`'s `minimizeJar` uses) works at class granularity, so the shaded set is the transitive closure of this one class.
- A deny-all `EntityResolver` is installed as a local lambda instead of reusing `Resolvers`, which drops the whole `Resolvers` nested-class tree from the closure (12 -> 7 class files).
- `XmlFactories.newDocumentBuilderFactory()` is routed through the new entry point.
- `ShadingFootprintTest` uses `jdependency` to pin the reachable set to those 7 classes, so the footprint cannot silently grow back toward the full library.
- Javadoc on both methods documents how to enable XInclude: it is held off by the deny-all external-fetch behavior, not the awareness flag, so enabling it also requires a custom `EntityResolver`.

## Testing

`mvn -o test` (all 7 JAXP-combination executions) passes with 0 failures and 0 errors. The 2 skips are `SchemaLocationDomTest` under external Xerces, which does not honour `accessExternalSchema`.

## Follow-ups (separate PRs)

- A deeply-nested-document (element-depth) test, to cover the limit FSP leaves unbounded on JDK 8-21.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
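
For readers who have not seen the diff, the capability-driven recipe from section 1 can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: `HardeningSketch`, `Hardened`, and `harden` are hypothetical names, the Xerces `load-external-dtd` feature is assumed to be how the external-DTD subset is skipped, and only the `accessExternalDTD` and `accessExternalSchema` properties are probed.

```java
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

// Hypothetical sketch; not the PR's DocumentBuilderHardener.
final class HardeningSketch {

    // Assumption: the Xerces feature used to skip the external DTD subset.
    private static final String LOAD_EXTERNAL_DTD =
            "http://apache.org/xml/features/nonvalidating/load-external-dtd";

    /** A configured factory plus a flag saying whether builders need a deny-all resolver. */
    static final class Hardened {
        final DocumentBuilderFactory factory;
        final boolean needsResolver;

        Hardened(DocumentBuilderFactory factory, boolean needsResolver) {
            this.factory = factory;
            this.needsResolver = needsResolver;
        }

        DocumentBuilder newBuilder() throws ParserConfigurationException {
            DocumentBuilder builder = factory.newDocumentBuilder();
            if (needsResolver) {
                // Deny-all resolver as a local lambda: every external fetch is refused.
                builder.setEntityResolver((publicId, systemId) -> {
                    throw new SAXException("External entity denied: " + systemId);
                });
            }
            return builder;
        }
    }

    static Hardened harden(DocumentBuilderFactory factory) throws ParserConfigurationException {
        // Required capability: a parser that refuses secure processing fails here.
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);

        // Optional capability: skip the external DTD subset where supported.
        try {
            factory.setFeature(LOAD_EXTERNAL_DTD, false);
        } catch (ParserConfigurationException e) {
            // Not supported by this parser; carry on.
        }

        // Probe the JAXP 1.5 access properties; unsupported ones raise IllegalArgumentException.
        boolean accessPropertiesHonoured;
        try {
            factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
            factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "");
            accessPropertiesHonoured = true;
        } catch (IllegalArgumentException e) {
            accessPropertiesHonoured = false;
        }
        return new Hardened(factory, !accessPropertiesHonoured);
    }

    public static void main(String[] args) throws Exception {
        Hardened hardened = harden(DocumentBuilderFactory.newInstance());
        System.out.println("Deny-all resolver needed: " + hardened.needsResolver);
    }
}
```

The point of the shape is that no implementation class name is inspected: the only hard failure is a refused `FEATURE_SECURE_PROCESSING`, and the access-property probe alone decides whether the resolver fallback is installed.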
