This is an automated email from the ASF dual-hosted git repository.
ppkarwasz pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/commons-xml.git
The following commit(s) were added to refs/heads/main by this push:
new bfa4a5a COMMONSXML-1: DocumentBuilderFactory: capability-driven
hardening (#5)
bfa4a5a is described below
commit bfa4a5a1155575dc7531e54f32a5dccbe1957a91
Author: Piotr P. Karwasz <[email protected]>
AuthorDate: Sat Jun 27 19:54:27 2026 +0200
COMMONSXML-1: DocumentBuilderFactory: capability-driven hardening (#5)
* Base DocumentBuilderFactory hardening on feature support
This change replaces the per-implementation class-name dispatch for DOM
with a single capability-driven recipe. Secure processing is required and the
external-DTD subset is skipped where supported; whether the JAXP 1.5
accessExternal properties are honoured then decides whether the bare factory is
already safe or needs a deny-all resolver wrapper, which is the only point
where the stock JDK and the external Xerces distribution diverge. Android stays
untouched because its parser expose [...]
An implementation is no longer rejected for being unrecognised. Any parser
that accepts secure processing and either the access properties or the resolver
wrapper satisfies the contract on its own, so only a parser that refuses secure
processing now fails.
A new test covers the previously unverified guarantee that an
xsi:schemaLocation hint is not fetched during DOM-side XSD validation, gated on
parsers that honour accessExternalSchema.
Assisted-By: Claude Opus 4.8 <[email protected]>
* Add minimal-footprint DocumentBuilderHardener.newInstance() entry point
Expose DocumentBuilderHardener with a public newInstance() so consumers that
only need a hardened DocumentBuilderFactory can shade the library and copy a
minimal set of classes. jdependency (the engine maven-shade-plugin's
minimizeJar uses) works at class granularity, so the shaded set is the
transitive closure of this one class.
To keep that closure small, install a deny-all EntityResolver as a lambda
local to DocumentBuilderHardener instead of reusing
Resolvers.DenyAll.ENTITY2.
EntityResolver's resolveEntity(publicId, systemId) hook is consulted for the
external DTD subset and every external entity, so it blocks all external
fetches on the DOM path just as well, while dropping the whole Resolvers
nested-class tree from the closure (12 -> 7 class files).
Route XmlFactories.newDocumentBuilderFactory() through the new entry point,
and
add ShadingFootprintTest, which uses jdependency to pin the reachable set to
exactly those 7 classes so the footprint cannot silently grow back toward
the
full library.
Assisted-By: Claude Opus 4.8 (1M context) <[email protected]>
* fix: keep `DelegatingDocumentBuilderFactory` a top-level class
* Fix Android test compilation
The androidTest source set compiles all of ../src/test/java against
android.jar, which broke on two tests new to this branch:
- ShadingFootprintTest needs org.vafer.jdependency (a Maven-only test
dependency) and pins the shading footprint, which is meaningless on
Android. Exclude it from the Android test compilation.
- SchemaLocationDomTest referenced XMLConstants.ACCESS_EXTERNAL_SCHEMA,
which android.jar does not expose. Inline the constant's value
(verified equal to the JDK constant); the test still skips at runtime
on Android, since the parser does not honour that property.
Verified on the api33 (Pixel 6a) managed device: 77 tests, 0 failures,
0 errors, 17 assumption-gated skips.
Assisted-By: Claude Opus 4.8 (1M context) <[email protected]>
* Gate SchemaLocationDomTest on schema-language support, not
accessExternalSchema
The test exercises JAXP 1.2 XSD validation, which requires the
SCHEMA_LANGUAGE
property; that, not accessExternalSchema, is the capability a parser must
have
to run it (Android lacks it). Gate both methods on supportsSchemaLanguage().
The hardened assertion also assumed the block was attributed to
accessExternalSchema, which only holds on the stock JDK. External Xerces
does
not honour that property and instead blocks the schema fetch through the
deny-all resolver, with a different message. Assert only that the failure
references the external schema, not the mechanism.
Two effects:
- The test now runs and passes on external Xerces too (covering the
resolver-based schema-block path), where it previously skipped.
- It no longer references XMLConstants.ACCESS_EXTERNAL_SCHEMA, so the
android.jar compile gap is resolved by this redesign rather than by
inlining the constant.
Verified: mvn -o test green across all executions; SchemaLocationDomTest
runs
2/2 with 0 skips on both stock JDK and external Xerces.
Assisted-By: Claude Opus 4.8 (1M context) <[email protected]>
* Remove the minimal-footprint shading entry point from this PR
Split per review: this PR (COMMONSXML-1) keeps only the capability-driven
DocumentBuilderFactory hardening. The minimal shading entry point (public
DocumentBuilderHardener.newInstance(), the inlined deny-all resolver,
ShadingFootprintTest and the jdependency test dependency) will be filed as
its own issue.
Reverses 68ad1d1 and 90ef0ae: DocumentBuilderHardener is package-private
again and harden() reuses Resolvers.DenyAll.ENTITY2;
XmlFactories.newDocumentBuilderFactory() calls harden() directly.
Assisted-By: Claude Opus 4.8 (1M context) <[email protected]>
* fix: apply review suggestions
---
android-tests/build.gradle.kts | 1 +
.../org/apache/commons/xml/AndroidProvider.java | 4 -
.../commons/xml/DocumentBuilderHardener.java | 103 +++++++++++++++++++
.../java/org/apache/commons/xml/JaxpSetters.java | 22 ++++
src/main/java/org/apache/commons/xml/Limits.java | 30 +++++-
.../org/apache/commons/xml/StockJdkProvider.java | 14 ---
.../org/apache/commons/xml/XercesProvider.java | 53 +---------
.../java/org/apache/commons/xml/XmlFactories.java | 24 +----
.../apache/commons/xml/SchemaLocationDomTest.java | 112 +++++++++++++++++++++
.../xml/UnsupportedXmlImplementationTest.java | 14 +--
.../resources/leaked/schema-location-instance.xml | 4 +
src/test/resources/leaked/schema-location.xsd | 10 ++
12 files changed, 294 insertions(+), 97 deletions(-)
diff --git a/android-tests/build.gradle.kts b/android-tests/build.gradle.kts
index 31e4161..a8938b9 100644
--- a/android-tests/build.gradle.kts
+++ b/android-tests/build.gradle.kts
@@ -16,6 +16,7 @@
*/
import com.android.build.api.dsl.ManagedVirtualDevice
+import org.gradle.api.tasks.compile.JavaCompile
plugins {
id("com.android.library") version "8.6.1"
diff --git a/src/main/java/org/apache/commons/xml/AndroidProvider.java
b/src/main/java/org/apache/commons/xml/AndroidProvider.java
index c71aacb..e32f1ee 100644
--- a/src/main/java/org/apache/commons/xml/AndroidProvider.java
+++ b/src/main/java/org/apache/commons/xml/AndroidProvider.java
@@ -172,10 +172,6 @@ public void setFeature(final String name, final boolean
value) throws SAXNotReco
private static final String NAMESPACE_PREFIXES_FEATURE =
"http://xml.org/sax/features/namespace-prefixes";
- static DocumentBuilderFactory configure(final DocumentBuilderFactory
factory) {
- return factory;
- }
-
static SAXParserFactory configure(final SAXParserFactory factory) {
return new GuardedSAXParserFactory(factory);
}
diff --git a/src/main/java/org/apache/commons/xml/DocumentBuilderHardener.java
b/src/main/java/org/apache/commons/xml/DocumentBuilderHardener.java
new file mode 100644
index 0000000..218e771
--- /dev/null
+++ b/src/main/java/org/apache/commons/xml/DocumentBuilderHardener.java
@@ -0,0 +1,103 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * https://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.commons.xml;
+
+import static org.apache.commons.xml.JaxpSetters.setFeature;
+import static org.apache.commons.xml.JaxpSetters.setOptionalFeature;
+import static org.apache.commons.xml.JaxpSetters.trySetAttribute;
+
+import javax.xml.XMLConstants;
+import javax.xml.parsers.DocumentBuilder;
+import javax.xml.parsers.DocumentBuilderFactory;
+import javax.xml.parsers.ParserConfigurationException;
+
+import org.xml.sax.EntityResolver;
+
+/**
+ * Capability-driven hardening for any {@link DocumentBuilderFactory} on the
classpath.
+ *
+ * <p>Rather than branching on the implementation class, {@link
#harden(DocumentBuilderFactory)} probes what the factory supports and
adapts:</p>
+ * <ul>
+ * <li><strong>Android</strong> (Harmony / KXmlParser): recognised by
class name and left untouched. It exposes no {@link
XMLConstants#FEATURE_SECURE_PROCESSING
+ * FSP}, no JAXP 1.5 {@code ACCESS_EXTERNAL_*} and no attribute API at
all, while KXmlParser silently drops user-defined entities, so there is nothing
to
+ * apply.</li>
+ * <li><strong>FSP</strong>: required. It switches on the implementation's
built-in security manager, which is what carries the processing limits.</li>
+ * <li><strong>{@code XERCES_LOAD_EXTERNAL_DTD}</strong>: optional. Where
supported, it skips the external DTD subset on non-validating parsers so a
+ * DOCTYPE-only document parses without a fetch attempt. If not
supported, the fetch will throw instead, due to the following settings.</li>
+ * <li><strong>Limits</strong>: applied best-effort by {@link
Limits#tryApply(DocumentBuilderFactory)}, which adapts to the JDK attribute
limits or Xerces'
+ * {@code SecurityManager} as appropriate.</li>
+ * <li><strong>{@code ACCESS_EXTERNAL_DTD}</strong>: the dividing
capability. Implementations that honour it (the JDK-internal Xerces) block
external fetches
+ * through the JAXP 1.5 properties and are returned as-is.
Implementations that reject it (the external Xerces distribution) are wrapped
so a deny-all
+ * {@link EntityResolver} is installed on every {@link
DocumentBuilder} produced.</li>
+ * </ul>
+ */
+final class DocumentBuilderHardener {
+
+ /**
+ * Wrapper that sets a deny-all {@link EntityResolver} on every {@link
DocumentBuilder} produced.
+ *
+ * <p>Required for implementations that do not honour JAXP 1.5 {@code
ACCESS_EXTERNAL_*} (the external Xerces distribution): the factory carries no
resolver
+ * of its own, so it has to be set on each builder.</p>
+ */
+ private static final class HardeningDocumentBuilderFactory extends
DelegatingDocumentBuilderFactory {
+
+ private final EntityResolver resolver;
+
+ HardeningDocumentBuilderFactory(final DocumentBuilderFactory delegate,
final EntityResolver resolver) {
+ super(delegate);
+ this.resolver = resolver;
+ }
+
+ @Override
+ public DocumentBuilder newDocumentBuilder() throws
ParserConfigurationException {
+ final DocumentBuilder builder = super.newDocumentBuilder();
+ builder.setEntityResolver(resolver);
+ return builder;
+ }
+ }
+
+ /** Class name of Android's Harmony-based {@link DocumentBuilderFactory},
which exposes no hardening surface. */
+ private static final String ANDROID_DOCUMENT_BUILDER_FACTORY =
"org.apache.harmony.xml.parsers.DocumentBuilderFactoryImpl";
+
+ /** Xerces feature: load the external DTD subset for non-validating
parsers. */
+ private static final String XERCES_LOAD_EXTERNAL_DTD =
"http://apache.org/xml/features/nonvalidating/load-external-dtd";
+
+ static DocumentBuilderFactory harden(final DocumentBuilderFactory factory)
{
+ // Android exposes no FSP, ACCESS_EXTERNAL_* or attribute API, and
KXmlParser drops user-defined entities; nothing to apply.
+ if
(ANDROID_DOCUMENT_BUILDER_FACTORY.equals(factory.getClass().getName())) {
+ return factory;
+ }
+ // Required: enables the implementation's security manager, which
carries the limits.
+ setFeature(factory, XMLConstants.FEATURE_SECURE_PROCESSING, true);
+ // Optional, implementation-based: JDK attribute limits or Xerces'
SecurityManager.
+ Limits.tryApply(factory);
+ // Optional: skip the external DTD subset on non-validating parsers so
DOCTYPE-only documents parse without a blocked fetch attempt.
+ setOptionalFeature(factory, XERCES_LOAD_EXTERNAL_DTD, false);
+ // ACCESS_EXTERNAL_* support is the dividing capability between JAXP
1.5 implementations and older ones.
+ if (trySetAttribute(factory, XMLConstants.ACCESS_EXTERNAL_DTD, "")
+ && trySetAttribute(factory,
XMLConstants.ACCESS_EXTERNAL_SCHEMA, "")) {
+ // Honoured: the JAXP 1.5 properties block external fetches, so
the bare factory is already hardened.
+ return factory;
+ }
+ // Rejected: external Xerces ignores ACCESS_EXTERNAL_*; install a
deny-all resolver on every DocumentBuilder.
+ return new HardeningDocumentBuilderFactory(factory,
Resolvers.DenyAll.ENTITY2);
+ }
+
+ private DocumentBuilderHardener() {
+ }
+}
diff --git a/src/main/java/org/apache/commons/xml/JaxpSetters.java
b/src/main/java/org/apache/commons/xml/JaxpSetters.java
index 55c7f62..e4ffa05 100644
--- a/src/main/java/org/apache/commons/xml/JaxpSetters.java
+++ b/src/main/java/org/apache/commons/xml/JaxpSetters.java
@@ -55,6 +55,20 @@ static void setAttribute(final DocumentBuilderFactory
factory, final String attr
apply(factory, "attribute", attribute, () ->
factory.setAttribute(attribute, value));
}
+ /** @return {@code true} if the attribute was applied, {@code false} if
the implementation rejected it. */
+ static boolean trySetAttribute(final DocumentBuilderFactory factory, final
String attribute, final Object value) {
+ try {
+ factory.setAttribute(attribute, value);
+ return true;
+ } catch (final Exception e) {
+ return false;
+ }
+ }
+
+ static void setOptionalAttribute(final DocumentBuilderFactory factory,
final String attribute, final Object value) {
+ trySetAttribute(factory, attribute, value);
+ }
+
static void setAttribute(final TransformerFactory factory, final String
attribute, final Object value) {
apply(factory, "attribute", attribute, () ->
factory.setAttribute(attribute, value));
}
@@ -63,6 +77,14 @@ static void setFeature(final DocumentBuilderFactory factory,
final String featur
apply(factory, "feature", feature, () -> factory.setFeature(feature,
value));
}
+ static void setOptionalFeature(final DocumentBuilderFactory factory, final
String feature, final boolean value) {
+ try {
+ factory.setFeature(feature, value);
+ } catch (final Exception e) {
+ // Ignored: the implementation does not recognise this feature.
+ }
+ }
+
static void setFeature(final SAXParserFactory factory, final String
feature, final boolean value) {
apply(factory, "feature", feature, () -> factory.setFeature(feature,
value));
}
diff --git a/src/main/java/org/apache/commons/xml/Limits.java
b/src/main/java/org/apache/commons/xml/Limits.java
index c1f9a8f..03ff250 100644
--- a/src/main/java/org/apache/commons/xml/Limits.java
+++ b/src/main/java/org/apache/commons/xml/Limits.java
@@ -17,6 +17,7 @@
package org.apache.commons.xml;
import static org.apache.commons.xml.JaxpSetters.setAttribute;
+import static org.apache.commons.xml.JaxpSetters.setOptionalAttribute;
import static org.apache.commons.xml.JaxpSetters.setProperty;
import java.util.Collections;
@@ -216,6 +217,10 @@ final class Limits {
* Woodstox property: maximum number of entity expansions in a single
parse.
*/
private static final String WSTX_MAX_ENTITY_COUNT =
"com.ctc.wstx.maxEntityCount";
+ /**
+ * Class name of the external Apache Xerces {@link
DocumentBuilderFactory}, whose limits live on a {@code SecurityManager} rather
than JDK attributes.
+ */
+ private static final String EXTERNAL_XERCES_DOCUMENT_BUILDER_FACTORY =
"org.apache.xerces.jaxp.DocumentBuilderFactoryImpl";
static {
final Map<String, IntSupplier> map = new LinkedHashMap<>();
@@ -231,12 +236,23 @@ final class Limits {
}
/**
- * Sets every JDK-supported limit on a stock JDK {@link
DocumentBuilderFactory}.
+ * Best-effort application of the processing limits to a {@link
DocumentBuilderFactory}, dispatched on the implementation.
+ *
+ * <p>External Xerces carries its limits on an {@code
org.apache.xerces.util.SecurityManager} instance. Every other implementation
(the stock JDK and any
+ * future attribute-based parser) takes the JDK limit attributes. Neither
path throws if the implementation declines a limit.</p>
*
* @param factory The target factory to modify.
*/
- static void applyToJdkDom(final DocumentBuilderFactory factory) {
- JDK_LIMITS.forEach((name, supplier) -> setAttribute(factory, name,
Integer.toString(supplier.getAsInt())));
+ static void tryApply(final DocumentBuilderFactory factory) {
+ if
(EXTERNAL_XERCES_DOCUMENT_BUILDER_FACTORY.equals(factory.getClass().getName()))
{
+ // Install a fresh SecurityManager pinned to JDK 25 limits,
replacing Xerces' built-in caps which are looser than even JDK 8.
+ final Object securityManager = newSecurityManager();
+ applyToXerces(securityManager);
+ setAttribute(factory,
XercesProvider.XERCES_SECURITY_MANAGER_PROPERTY, securityManager);
+ return;
+ }
+ // Pin the JDK attribute limits to JDK 25 secure values; skip silently
any attribute the implementation does not recognise.
+ JDK_LIMITS.forEach((name, supplier) -> setOptionalAttribute(factory,
name, Integer.toString(supplier.getAsInt())));
}
/**
@@ -304,6 +320,14 @@ static void applyToXerces(final Object securityManager) {
}
}
+ private static Object newSecurityManager() {
+ try {
+ return
Class.forName("org.apache.xerces.util.SecurityManager").getDeclaredConstructor().newInstance();
+ } catch (final ReflectiveOperationException e) {
+ throw new HardeningException("Failed to instantiate
org.apache.xerces.util.SecurityManager; expected Xerces to be on the
classpath", e);
+ }
+ }
+
private static int getElementAttributeLimit() {
return read(SP_ELEMENT_ATTRIBUTE_LIMIT,
DEFAULT_ELEMENT_ATTRIBUTE_LIMIT);
}
diff --git a/src/main/java/org/apache/commons/xml/StockJdkProvider.java
b/src/main/java/org/apache/commons/xml/StockJdkProvider.java
index fa31557..99a7d8d 100644
--- a/src/main/java/org/apache/commons/xml/StockJdkProvider.java
+++ b/src/main/java/org/apache/commons/xml/StockJdkProvider.java
@@ -22,7 +22,6 @@
import static org.apache.commons.xml.JaxpSetters.setProperty;
import javax.xml.XMLConstants;
-import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.stream.XMLInputFactory;
import javax.xml.transform.TransformerFactory;
@@ -72,19 +71,6 @@ final class StockJdkProvider {
*/
private static final String ZEPHYR_IGNORE_EXTERNAL_DTD =
"http://java.sun.com/xml/stream/properties/ignore-external-dtd";
- static DocumentBuilderFactory configure(final DocumentBuilderFactory
factory) {
- // Required: enables the JDK XMLSecurityManager limits.
- setFeature(factory, XMLConstants.FEATURE_SECURE_PROCESSING, true);
- // Let DOCTYPE-only documents parse silently without SSRF: skip the
external DTD subset on non-validating parsers.
- setFeature(factory, XERCES_LOAD_EXTERNAL_DTD, false);
- // Defense-in-depth: pin to JDK 25 limits so older JDKs do not fall
back to looser secure values.
- Limits.applyToJdkDom(factory);
- // Defense-in-depth: already FSP-secure defaults, set explicitly so
they are not relaxed via system property.
- setAttribute(factory, XMLConstants.ACCESS_EXTERNAL_DTD, "");
- setAttribute(factory, XMLConstants.ACCESS_EXTERNAL_SCHEMA, "");
- return factory;
- }
-
static SAXParserFactory configure(final SAXParserFactory factory) {
// Required: enables the JDK XMLSecurityManager limits.
setFeature(factory, XMLConstants.FEATURE_SECURE_PROCESSING, true);
diff --git a/src/main/java/org/apache/commons/xml/XercesProvider.java
b/src/main/java/org/apache/commons/xml/XercesProvider.java
index 7b89be1..4634c9f 100644
--- a/src/main/java/org/apache/commons/xml/XercesProvider.java
+++ b/src/main/java/org/apache/commons/xml/XercesProvider.java
@@ -20,9 +20,6 @@
import static org.apache.commons.xml.JaxpSetters.setFeature;
import javax.xml.XMLConstants;
-import javax.xml.parsers.DocumentBuilder;
-import javax.xml.parsers.DocumentBuilderFactory;
-import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.validation.Schema;
@@ -30,7 +27,6 @@
import javax.xml.validation.Validator;
import javax.xml.validation.ValidatorHandler;
-import org.xml.sax.EntityResolver;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.XMLReader;
@@ -39,7 +35,7 @@
* Hardening recipes for the external Apache Xerces distribution (the {@code
xerces:xercesImpl} artifact).
*
* <p>Factory classes live in the {@code org.apache.xerces.*} package.
External Xerces does not ship a {@code TransformerFactory}, {@code
XMLInputFactory} or
- * {@code XPathFactory}, so this class only handles DOM, SAX and Schema
factories.</p>
+ * {@code XPathFactory}, so this class only handles SAX and Schema factories.
DOM hardening lives in {@link DocumentBuilderHardener}.</p>
*
* <p>Hardening recipe applied to every factory below uses the same building
blocks:</p>
* <ul>
@@ -53,8 +49,7 @@
* {@code ACCESS_EXTERNAL_*} properties, so an explicit resolver
installed on every parser/validator is the best way to block external
* entity, DTD and schema fetching, without disabling those features
altogether. The wrappers exist for two reasons:</p>
* <ol>
- * <li>{@link DocumentBuilderFactory} / {@link SAXParserFactory}
carry no resolver, so it has to be set on each
- * {@link DocumentBuilder} / {@link SAXParser} produced;</li>
+ * <li>{@link SAXParserFactory} carries no resolver, so it has to
be set on each {@link SAXParser} produced.</li>
* <li>Xerces' {@link Schema} does not propagate the {@link
SchemaFactory}'s resolver or security manager to its
* {@link Validator} / {@link ValidatorHandler} products, so the
wrapper re-installs both on every product.</li>
* </ol>
@@ -63,29 +58,6 @@
*/
final class XercesProvider {
- /**
- * Hardened Xerces {@link DocumentBuilderFactory} wrapper.
- *
- * <p>Sets the deny-all {@link EntityResolver} on every {@link
DocumentBuilder} produced; required because {@link DocumentBuilderFactory}
carries no
- * resolver of its own and Xerces does not honour JAXP 1.5 {@code
ACCESS_EXTERNAL_*}.</p>
- */
- private static final class HardeningDocumentBuilderFactory extends
DelegatingDocumentBuilderFactory {
-
- private final EntityResolver resolver;
-
- HardeningDocumentBuilderFactory(final DocumentBuilderFactory delegate,
final EntityResolver resolver) {
- super(delegate);
- this.resolver = resolver;
- }
-
- @Override
- public DocumentBuilder newDocumentBuilder() throws
ParserConfigurationException {
- final DocumentBuilder builder = super.newDocumentBuilder();
- builder.setEntityResolver(resolver);
- return builder;
- }
- }
-
private static Validator hardenValidator(final Validator validator) {
try {
Limits.applyToXerces(validator.getProperty(XERCES_SECURITY_MANAGER_PROPERTY));
@@ -114,27 +86,6 @@ private static ValidatorHandler
hardenValidatorHandler(final ValidatorHandler ha
/** Xerces feature: load the external DTD subset for non-validating
parsers. */
private static final String XERCES_LOAD_EXTERNAL_DTD =
"http://apache.org/xml/features/nonvalidating/load-external-dtd";
- private static Object newSecurityManager() {
- try {
- return
Class.forName("org.apache.xerces.util.SecurityManager").getDeclaredConstructor().newInstance();
- } catch (final ReflectiveOperationException e) {
- throw new HardeningException("Failed to instantiate
org.apache.xerces.util.SecurityManager; expected Xerces to be on the
classpath", e);
- }
- }
-
- static DocumentBuilderFactory configure(final DocumentBuilderFactory
factory) {
- // Required: enables Xerces' built-in SecurityManager (which is what
carries the limits).
- setFeature(factory, XMLConstants.FEATURE_SECURE_PROCESSING, true);
- // Let DOCTYPE-only documents parse silently without SSRF: skip the
external DTD subset on non-validating parsers.
- setFeature(factory, XERCES_LOAD_EXTERNAL_DTD, false);
- // Defense-in-depth: install a fresh SecurityManager pinned to JDK 25
limits, replacing Xerces' built-in caps which are looser than even JDK 8.
- final Object securityManager = newSecurityManager();
- Limits.applyToXerces(securityManager);
- factory.setAttribute(XERCES_SECURITY_MANAGER_PROPERTY,
securityManager);
- // Required: Xerces does not honour JAXP 1.5 ACCESS_EXTERNAL_*; the
wrapper installs a deny-all resolver on every DocumentBuilder.
- return new HardeningDocumentBuilderFactory(factory,
Resolvers.DenyAll.ENTITY2);
- }
-
static SAXParserFactory configure(final SAXParserFactory factory) {
// Required: enables Xerces' built-in SecurityManager (which is what
carries the limits).
setFeature(factory, XMLConstants.FEATURE_SECURE_PROCESSING, true);
diff --git a/src/main/java/org/apache/commons/xml/XmlFactories.java
b/src/main/java/org/apache/commons/xml/XmlFactories.java
index b97c59f..6228256 100644
--- a/src/main/java/org/apache/commons/xml/XmlFactories.java
+++ b/src/main/java/org/apache/commons/xml/XmlFactories.java
@@ -73,19 +73,6 @@
*/
public final class XmlFactories {
- static DocumentBuilderFactory dispatch(final DocumentBuilderFactory
factory) {
- switch (factory.getClass().getName()) {
- case
"com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl":
- return StockJdkProvider.configure(factory);
- case "org.apache.harmony.xml.parsers.DocumentBuilderFactoryImpl":
- return AndroidProvider.configure(factory);
- case "org.apache.xerces.jaxp.DocumentBuilderFactoryImpl":
- return XercesProvider.configure(factory);
- default:
- throw noProvider(factory);
- }
- }
-
private static SAXParserFactory dispatch(final SAXParserFactory factory) {
switch (factory.getClass().getName()) {
case
"com.sun.org.apache.xerces.internal.jaxp.SAXParserFactoryImpl":
@@ -197,16 +184,15 @@ public static XMLReader harden(final XMLReader reader) {
/**
* Returns a fresh, hardened {@link DocumentBuilderFactory}.
*
- * <p>Beyond the three universal guarantees on {@link XmlFactories},
XInclude resolution is disabled. Calling
- * {@link DocumentBuilderFactory#setXIncludeAware(boolean)
setXIncludeAware(true)} on the returned factory does not re-enable resolution;
a parse that
- * encounters an {@code xi:include} element fails.</p>
+ * <p><strong>Enabling XInclude:</strong> {@link
DocumentBuilderFactory#setXIncludeAware(boolean) setXIncludeAware(true)} on its
own does not make XInclude
+ * usable, because an included resource is fetched like any other external
resource and is therefore blocked, failing the parse. A caller that genuinely
+ * wants XInclude must, in addition to enabling awareness, install a
custom {@link org.xml.sax.EntityResolver} that permits those specific
lookups.</p>
*
* @return a hardened factory.
- * @throws IllegalStateException if the underlying JAXP implementation is
not recognised by any bundled hardening recipe, or if the matching recipe cannot
- * apply its settings to it.
+ * @throws IllegalStateException if a required hardening setting cannot be
applied to the underlying implementation.
*/
public static DocumentBuilderFactory newDocumentBuilderFactory() {
- return dispatch(DocumentBuilderFactory.newInstance());
+ return
DocumentBuilderHardener.harden(DocumentBuilderFactory.newInstance());
}
/**
diff --git a/src/test/java/org/apache/commons/xml/SchemaLocationDomTest.java
b/src/test/java/org/apache/commons/xml/SchemaLocationDomTest.java
new file mode 100644
index 0000000..8b873e3
--- /dev/null
+++ b/src/test/java/org/apache/commons/xml/SchemaLocationDomTest.java
@@ -0,0 +1,112 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * https://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.commons.xml;
+
+import static org.apache.commons.xml.AttackTestSupport.LEAKED_MARKER;
+import static org.apache.commons.xml.AttackTestSupport.resourceUrl;
+import static org.junit.jupiter.api.Assertions.assertEquals;
+import static org.junit.jupiter.api.Assertions.assertThrows;
+import static org.junit.jupiter.api.Assertions.assertTrue;
+import static org.junit.jupiter.api.Assumptions.assumeTrue;
+
+import javax.xml.XMLConstants;
+import javax.xml.parsers.DocumentBuilder;
+import javax.xml.parsers.DocumentBuilderFactory;
+
+import org.junit.jupiter.api.Tag;
+import org.junit.jupiter.api.Test;
+import org.w3c.dom.Document;
+import org.xml.sax.InputSource;
+import org.xml.sax.SAXException;
+import org.xml.sax.SAXParseException;
+import org.xml.sax.helpers.DefaultHandler;
+
+/**
+ * Checks that a hardened {@link DocumentBuilderFactory} performing JAXP 1.2
XSD validation does not fetch an external schema named by an
+ * {@code xsi:noNamespaceSchemaLocation} hint in the instance document.
+ *
+ * <p>The instance is empty {@code <root/>}; the referenced schema declares a
default {@code leak} attribute carrying {@link
AttackTestSupport#LEAKED_MARKER}. A
+ * parser that fetches the schema inlines that default into the DOM (the
permissive control), while a hardened parser refuses the fetch and fails the
parse. The
+ * attribution differs by implementation (the stock JDK reports an {@code
accessExternalSchema} / {@code schema_reference} error; external Xerces'
deny-all
+ * resolver reports a forbidden-fetch error), so the test only asserts the
fetch was blocked, not how.</p>
+ *
+ * <p>The test runs only where the implementation supports JAXP 1.2
schema-language XSD validation (the stock JDK and external Xerces do; Android
does not), so
+ * it skips on parsers without it.</p>
+ */
+@Tag("dom")
+class SchemaLocationDomTest {
+
+ /** JAXP 1.2 property selecting the schema language used by {@link
DocumentBuilderFactory#setValidating(boolean)}. */
+ private static final String SCHEMA_LANGUAGE =
"http://java.sun.com/xml/jaxp/properties/schemaLanguage";
+
+ private static final String INSTANCE = "schema-location-instance.xml";
+
+ /** Name of the external schema the instance points at; both block
mechanisms name it in the failure message. */
+ private static final String SCHEMA = "schema-location.xsd";
+
+ @Test
+ void hardenedBlocksExternalSchemaFetch() {
+ assumeTrue(supportsSchemaLanguage(), "parser does not support JAXP 1.2
schema-language XSD validation");
+ final DocumentBuilderFactory factory =
enableXsdValidation(XmlFactories.newDocumentBuilderFactory());
+ // The schemaLocation fetch is denied and surfaced as a fatal error
rather than a silent fetch. The attribution is implementation-specific, so
assert
+ // only that the failure names the external schema, not the mechanism
(accessExternalSchema on the JDK, the deny-all resolver on external Xerces).
+ final SAXException thrown = assertThrows(SAXException.class, () ->
parse(factory));
+ assertTrue(thrown.getMessage() != null &&
thrown.getMessage().contains(SCHEMA),
+ "Block must reference the external schema, got: " +
thrown.getMessage());
+ }
+
+ @Test
+ void unconfiguredFetchesExternalSchema() throws Exception {
+ assumeTrue(supportsSchemaLanguage(), "parser does not support JAXP 1.2
schema-language XSD validation");
+ final DocumentBuilderFactory factory =
DocumentBuilderFactory.newInstance();
+ factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, false);
+ // Positive control: without hardening the external schema is fetched
and its default attribute is inlined into the DOM.
+ final Document document = parse(enableXsdValidation(factory));
+ assertEquals(LEAKED_MARKER,
document.getDocumentElement().getAttribute("leak"),
+ "Permissive parse should have fetched the external schema and
inlined its default attribute.");
+ }
+
+ private static DocumentBuilderFactory enableXsdValidation(final
DocumentBuilderFactory factory) {
+ factory.setNamespaceAware(true);
+ factory.setValidating(true);
+ factory.setAttribute(SCHEMA_LANGUAGE,
XMLConstants.W3C_XML_SCHEMA_NS_URI);
+ return factory;
+ }
+
+ private static Document parse(final DocumentBuilderFactory factory) throws
Exception {
+ final DocumentBuilder builder = factory.newDocumentBuilder();
+ builder.setErrorHandler(new DefaultHandler() {
+ @Override
+ public void error(final SAXParseException exception) throws
SAXException {
+ throw exception;
+ }
+ });
+ return builder.parse(new
InputSource(resourceUrl(INSTANCE).toString()));
+ }
+
+ private static boolean supportsSchemaLanguage() {
+ try {
+ final DocumentBuilderFactory factory =
DocumentBuilderFactory.newInstance();
+ factory.setValidating(true);
+ factory.setAttribute(SCHEMA_LANGUAGE,
XMLConstants.W3C_XML_SCHEMA_NS_URI);
+ return true;
+ } catch (final Exception e) {
+ return false;
+ }
+ }
+}
diff --git
a/src/test/java/org/apache/commons/xml/UnsupportedXmlImplementationTest.java
b/src/test/java/org/apache/commons/xml/UnsupportedXmlImplementationTest.java
index c0f304a..863fdf4 100644
--- a/src/test/java/org/apache/commons/xml/UnsupportedXmlImplementationTest.java
+++ b/src/test/java/org/apache/commons/xml/UnsupportedXmlImplementationTest.java
@@ -23,16 +23,18 @@
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
+import javax.xml.parsers.ParserConfigurationException;
import org.junit.jupiter.api.Test;
/**
- * Verifies that an unknown factory class surfaces {@link
IllegalStateException} with a message naming the class.
+ * Verifies that an implementation which does not honour the required
secure-processing feature surfaces {@link IllegalStateException} with a message
naming the
+ * class.
*/
class UnsupportedXmlImplementationTest {
/**
- * A stand-in factory class whose fully qualified name is not matched by
any bundled hardening recipe.
+ * A stand-in factory that rejects the secure-processing feature, like a
JAXP implementation that does not recognise it.
*/
public static final class FakeDocumentBuilderFactory extends
DocumentBuilderFactory {
@@ -52,8 +54,8 @@ public Object getAttribute(final String name) {
}
@Override
- public void setFeature(final String name, final boolean value) {
- // no-op
+ public void setFeature(final String name, final boolean value) throws
ParserConfigurationException {
+ throw new ParserConfigurationException("feature not recognised: "
+ name);
}
@Override
@@ -63,10 +65,10 @@ public boolean getFeature(final String name) {
}
@Test
- void dispatchRejectsUnknownFactory() {
+ void hardenRejectsUnsecurableFactory() {
final IllegalStateException thrown = assertThrows(
IllegalStateException.class,
- () -> XmlFactories.dispatch(new FakeDocumentBuilderFactory()));
+ () -> DocumentBuilderHardener.harden(new
FakeDocumentBuilderFactory()));
assertNotNull(thrown.getMessage());
assertTrue(thrown.getMessage().contains(FakeDocumentBuilderFactory.class.getName()),
"Exception message must name the unsupported class: " +
thrown.getMessage());
diff --git a/src/test/resources/leaked/schema-location-instance.xml
b/src/test/resources/leaked/schema-location-instance.xml
new file mode 100644
index 0000000..3caf442
--- /dev/null
+++ b/src/test/resources/leaked/schema-location-instance.xml
@@ -0,0 +1,4 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!-- SPDX-License-Identifier: Apache-2.0 -->
+<root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+ xsi:noNamespaceSchemaLocation="schema-location.xsd"/>
diff --git a/src/test/resources/leaked/schema-location.xsd
b/src/test/resources/leaked/schema-location.xsd
new file mode 100644
index 0000000..e40ca40
--- /dev/null
+++ b/src/test/resources/leaked/schema-location.xsd
@@ -0,0 +1,10 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!-- SPDX-License-Identifier: Apache-2.0 -->
+<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
+ <xs:element name="root">
+ <xs:complexType>
+ <!-- Default attribute value; a schema-validating parser inlines it into
the DOM only if it fetched this external schema. -->
+ <xs:attribute name="leak" type="xs:string" default="All your base are
belong to us"/>
+ </xs:complexType>
+ </xs:element>
+</xs:schema>