[ https://issues.apache.org/jira/browse/GROOVY-11979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul King updated GROOVY-11979:
-------------------------------
    Description: 
h2. Part 1: FactorySupport hardened factories (secure by default)

*What's included:*
* {{FactorySupport.java}}: every factory method now returns a hardened factory 
by default
** {{createDocumentBuilderFactory()}} — delegates to 
{{createDocumentBuilderFactory(false)}}; previously returned a bare JDK factory
** {{createSaxParserFactory()}} — delegates to 
{{createSaxParserFactory(false)}}; previously returned a bare JDK factory
** {{createDocumentBuilderFactory(boolean allowDocTypeDeclaration)}} *(new)*
** {{createSaxParserFactory(boolean allowDocTypeDeclaration)}} *(new)*
** {{createXMLInputFactory()}} *(new)*
** {{createXMLInputFactory(boolean allowDocTypeDeclaration)}} *(new)*
** {{createTransformerFactory(boolean allowDocTypeDeclaration, boolean 
allowExternalResources)}} *(new)*
** {{createSchemaFactory(String schemaLanguage)}} *(new)*
** {{createXPathFactory()}} *(new)*
* {{FactorySupport.java}}: added private quiet helpers for the new factory 
types (SchemaFactory, XPathFactory, TransformerFactory {{setAttribute}}, 
XMLInputFactory {{setProperty}}).
* {{FactorySupportTest.java}}: 14 new tests covering hardening defaults on both 
the zero-arg and boolean overloads, plus relax-flag round-trips for each 
factory type.

*Hardening recipes applied:*
|| Factory || Settings ||
| DocumentBuilderFactory | {{FEATURE_SECURE_PROCESSING=true}}, {{disallow-doctype-decl=!allow}}, {{XIncludeAware=false}}, {{ExpandEntityReferences=false}} |
| SAXParserFactory | {{FEATURE_SECURE_PROCESSING=true}}, {{disallow-doctype-decl=!allow}} |
| XMLInputFactory | {{SUPPORT_DTD=allow}}, {{IS_SUPPORTING_EXTERNAL_ENTITIES=false}} |
| TransformerFactory | {{FEATURE_SECURE_PROCESSING=true}}, {{disallow-doctype-decl=!allow}}, {{ACCESS_EXTERNAL_DTD}} and {{ACCESS_EXTERNAL_STYLESHEET}} = {{"all"}} or {{""}} per {{allowExternalResources}} |
| SchemaFactory | {{FEATURE_SECURE_PROCESSING=true}} (no {{ACCESS_EXTERNAL_*}} restrictions, to preserve legitimate {{<xs:import>}}) |
| XPathFactory | {{FEATURE_SECURE_PROCESSING=true}} |
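
The DocumentBuilderFactory row can be sketched with plain JDK APIs. This is a minimal illustration and not the actual {{FactorySupport}} code: the class {{HardenedDomSketch}} and its method names are hypothetical, and it applies only the four settings listed in the table.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical sketch of the DocumentBuilderFactory recipe; not Groovy's FactorySupport.
final class HardenedDomSketch {
    static final String DISALLOW_DOCTYPE =
            "http://apache.org/xml/features/disallow-doctype-decl";

    // Applies the four settings from the recipe table.
    static DocumentBuilderFactory hardened(boolean allowDocTypeDeclaration)
            throws ParserConfigurationException {
        DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
        f.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        f.setFeature(DISALLOW_DOCTYPE, !allowDocTypeDeclaration);
        f.setXIncludeAware(false);
        f.setExpandEntityReferences(false);
        return f;
    }

    // Returns true if the input parses, false if the parser rejects it.
    static boolean parses(String xml, boolean allowDocTypeDeclaration) {
        try {
            Document d = hardened(allowDocTypeDeclaration)
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return d.getDocumentElement() != null;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String withDoctype = "<!DOCTYPE root [<!ENTITY e \"x\">]><root>&e;</root>";
        System.out.println("plain, hardened:   " + parses("<root/>", false));
        System.out.println("DOCTYPE, hardened: " + parses(withDoctype, false));
        System.out.println("DOCTYPE, relaxed:  " + parses(withDoctype, true));
    }
}
```

The boolean parameter mirrors the {{allowDocTypeDeclaration}} relax flag: the default path rejects any DOCTYPE outright, while the relaxed path keeps the remaining settings in place.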

*Compat notes:*
* Direct callers of the zero-arg {{createDocumentBuilderFactory()}} / 
{{createSaxParserFactory()}} now receive hardened factories. Callers that 
previously parsed DOCTYPE-bearing input through those factories must switch to 
the {{(true)}} overload. Internal Groovy callers ({{XmlParser}}, 
{{XmlSlurper}}, {{XmlUtil}}, {{DOMBuilder}}) overlay their own settings on top 
of the default and are unaffected.
* No public method signatures changed. No methods deprecated.

h2. Part 2: Route internal XML factory sites through FactorySupport; tighten partial hardening

*Routing changes (refactor — no user-visible behaviour change):*
* {{XmlParser}}, {{XmlSlurper}}: replaced inline factory + 
{{setFeatureQuietly}} hardening with 
{{FactorySupport.createSaxParserFactory(allowDocTypeDeclaration)}}. Removed 
unused {{XMLConstants}} / {{setFeatureQuietly}} imports.
* {{XmlUtil.newFactoryInstance}}: routed through 
{{FactorySupport.createSaxParserFactory}}; method now declares {{throws 
ParserConfigurationException}} (existing callers already declared it).
* {{XmlUtil}} schema validation paths (3 sites): routed through 
{{FactorySupport.createSchemaFactory}}.
* {{DOMBuilder.parse}} static methods: routed through 
{{FactorySupport.createDocumentBuilderFactory}}.
* {{StreamingDOMBuilder}}: replaced raw 
{{DocumentBuilderFactory.newInstance()}} with 
{{FactorySupport.createDocumentBuilderFactory()}}.
* {{DomToGroovy}}: routed through 
{{FactorySupport.createDocumentBuilderFactory()}}; dropped redundant inline 
{{setFeature}}.
* {{DOMCategory}} XPath (2 sites): routed through 
{{FactorySupport.createXPathFactory()}}.

*New API:*
* {{DOMBuilder.newInstance(boolean validating, boolean namespaceAware, boolean 
allowDocTypeDeclaration)}} — relax knob for the instance/DSL parsing path. 
Existing zero-arg and 2-arg overloads default to hardened.
* {{SerializeOptions.allowExternalResources}} ({{boolean}}, default {{false}}) 
with getter/setter — relax knob for {{XmlUtil.serialize}} when transforming 
XSLT that legitimately imports external resources.

*Tightening (one user-visible default change):*
|| Site || Was || Now ||
| {{XmlUtil.serialize}} TransformerFactory | only {{disallow-doctype-decl=!allow}} | {{FEATURE_SECURE_PROCESSING=true}}, {{disallow-doctype-decl=!allow}}, {{ACCESS_EXTERNAL_DTD}} and {{ACCESS_EXTERNAL_STYLESHEET}} = {{""}} (or {{"all"}} per {{allowExternalResources}}) |
| {{XmlUtil}} SchemaFactory (3 sites) | nothing | {{FEATURE_SECURE_PROCESSING=true}} (no {{ACCESS_EXTERNAL_*}} restrictions, to preserve legitimate {{<xs:import>}}) |
| {{DOMCategory}} XPathFactory | nothing | {{FEATURE_SECURE_PROCESSING=true}} |
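
The TransformerFactory and XPathFactory rows might look roughly like the following when written against the JDK alone. The class {{TransformerHardeningSketch}} and its helpers are hypothetical names for illustration; the "quiet" setters assume that an implementation may not recognise every feature or attribute, so failures are swallowed rather than propagated.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical sketch of the tightened recipes; not the actual Groovy implementation.
final class TransformerHardeningSketch {
    static final String DISALLOW_DOCTYPE =
            "http://apache.org/xml/features/disallow-doctype-decl";

    // Quiet helpers: ignore settings the underlying implementation does not support.
    static void setFeatureQuietly(TransformerFactory f, String name, boolean value) {
        try {
            f.setFeature(name, value);
        } catch (TransformerConfigurationException ignored) {
            // feature not recognised by this implementation
        }
    }

    static void setAttributeQuietly(TransformerFactory f, String name, Object value) {
        try {
            f.setAttribute(name, value);
        } catch (IllegalArgumentException ignored) {
            // attribute not recognised by this implementation
        }
    }

    static TransformerFactory hardened(boolean allowDocTypeDeclaration,
                                       boolean allowExternalResources) {
        TransformerFactory f = TransformerFactory.newInstance();
        setFeatureQuietly(f, XMLConstants.FEATURE_SECURE_PROCESSING, true);
        setFeatureQuietly(f, DISALLOW_DOCTYPE, !allowDocTypeDeclaration);
        // "" blocks every protocol; "all" permits every protocol.
        String access = allowExternalResources ? "all" : "";
        setAttributeQuietly(f, XMLConstants.ACCESS_EXTERNAL_DTD, access);
        setAttributeQuietly(f, XMLConstants.ACCESS_EXTERNAL_STYLESHEET, access);
        return f;
    }

    // Evaluates an XPath expression with secure processing switched on.
    static String evaluate(String expression, String xml) {
        try {
            XPathFactory f = XPathFactory.newInstance();
            f.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            return f.newXPath().evaluate(expression, new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        TransformerFactory strict = hardened(false, false);
        System.out.println("external DTD access: '"
                + strict.getAttribute(XMLConstants.ACCESS_EXTERNAL_DTD) + "'");
        System.out.println(evaluate("count(/a/b)", "<a><b/><b/></a>"));
    }
}
```

Ordinary XPath evaluation is unaffected by secure processing, which is consistent with the "no observable user impact" expectation for the {{DOMCategory}} change.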

*Defence-in-depth (no observable user impact):*
* {{StreamingDOMBuilder}} factory hardening (factory used only for 
{{newDocument()}}, not for parsing input).
* {{GrapeIvy}} Ivy descriptor parsing now hardened — inlined recipe (FSP + 
{{disallow-doctype-decl}} + {{XIncludeAware=false}} + 
{{ExpandEntityReferences=false}}) since {{groovy-grape-ivy}} does not depend on 
{{groovy-xml}}.
* {{DomToGroovy}} now also has {{FEATURE_SECURE_PROCESSING=true}} (previously 
only had {{disallow-doctype-decl}}).

*Tests:*
* New {{XmlSecurityTest.groovy}} with 14 cases pinning the secure-by-default 
contract: XXE, billion-laughs and external-DTD blocked by default in 
{{XmlParser}}/{{XmlSlurper}}/{{DOMBuilder.parse}}; 
{{allowDocTypeDeclaration=true}} relax knob still functional; 
{{XmlUtil.serialize}} blocks external XSLT imports by default and respects 
{{allowExternalResources=true}}; schema validation still resolves with FSP 
enabled; {{DOMCategory}} XPath still evaluates.
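
For readers unfamiliar with the payloads these tests pin, here is a small JDK-only sketch of the kind of check involved. It is not taken from {{XmlSecurityTest.groovy}}: the class {{SaxXxeCheckSketch}}, the payload strings and the file URL are illustrative assumptions, and the factory is configured inline with the SAXParserFactory recipe rather than through {{FactorySupport}}.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: checks that a hardened SAX parser refuses DOCTYPE-based attacks.
final class SaxXxeCheckSketch {
    static final String DISALLOW_DOCTYPE =
            "http://apache.org/xml/features/disallow-doctype-decl";

    // XXE probe: an external entity pointing at a local file (illustrative path).
    static final String XXE =
            "<?xml version=\"1.0\"?>"
            + "<!DOCTYPE r [<!ENTITY xxe SYSTEM \"file:///etc/passwd\">]>"
            + "<r>&xxe;</r>";

    // Shortened entity-expansion ("billion laughs" style) probe.
    static final String LAUGHS =
            "<?xml version=\"1.0\"?>"
            + "<!DOCTYPE r [<!ENTITY a \"ha\"><!ENTITY b \"&a;&a;&a;&a;\">"
            + "<!ENTITY c \"&b;&b;&b;&b;\">]>"
            + "<r>&c;</r>";

    // Returns true if the hardened parser rejects the document.
    static boolean rejected(String xml) {
        try {
            SAXParserFactory f = SAXParserFactory.newInstance();
            f.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            f.setFeature(DISALLOW_DOCTYPE, true);
            f.newSAXParser().parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return false;
        } catch (Exception e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("XXE rejected:    " + rejected(XXE));
        System.out.println("laughs rejected: " + rejected(LAUGHS));
        System.out.println("plain rejected:  " + rejected("<r>ok</r>"));
    }
}
```

Because the DOCTYPE itself is disallowed, both payloads fail before any entity is resolved or expanded.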

*Verification:*
* {{:groovy-xml:test}} — 237 tests pass (223 existing + 14 new).
* Full {{:compileJava :compileGroovy}} — clean.

*Compat notes:*
* Single user-visible default flip: {{XmlUtil.serialize}} blocks external XSLT 
imports unless {{SerializeOptions.allowExternalResources=true}}. Affects an 
uncommon usage pattern (passing XSLT stylesheets through the serialize path); 
documented in the new test cases and in the {{SerializeOptions}} javadoc.
* All other changes are internal refactor or defence-in-depth on paths that 
don't parse external input.

h2. Part 3: StAX streaming conveniences and XML security documentation

*New API ({{XmlUtil}} + new package-private {{groovy.xml.StAXSupport}}):*
* {{XmlUtil.events(Reader)}} / {{events(Reader, boolean 
allowDocTypeDeclaration)}} — returns {{Stream<XMLEvent>}} over a hardened 
{{XMLInputFactory}}; closes underlying reader on stream close.
* {{XmlUtil.streamElements(Reader, String localName)}} / 3-arg / 4-arg 
overloads — returns {{Stream<org.w3c.dom.Node>}} of matching subtrees, 
namespace-qualified or local-name only. Pulls each subtree as a small DOM tree; 
suitable for streaming multi-gigabyte feeds without loading them into memory.
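
The general shape of an event stream over a hardened {{XMLInputFactory}} can be sketched with JDK classes only. This is an assumption-laden illustration, not the {{XmlUtil.events}} source: the class {{StaxEventStreamSketch}} is a hypothetical name, and checked exceptions are simply wrapped in unchecked ones.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical sketch of a Stream<XMLEvent> over a hardened StAX factory.
final class StaxEventStreamSketch {
    static Stream<XMLEvent> events(Reader in, boolean allowDocTypeDeclaration) {
        XMLInputFactory f = XMLInputFactory.newInstance();
        f.setProperty(XMLInputFactory.SUPPORT_DTD, allowDocTypeDeclaration);
        f.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);

        final XMLEventReader events;
        try {
            events = f.createXMLEventReader(in);
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }

        // Adapt the event reader to a typed iterator, wrapping checked exceptions.
        Iterator<XMLEvent> it = new Iterator<XMLEvent>() {
            @Override public boolean hasNext() { return events.hasNext(); }
            @Override public XMLEvent next() {
                try {
                    return events.nextEvent();
                } catch (XMLStreamException e) {
                    throw new IllegalStateException(e);
                }
            }
        };

        Spliterator<XMLEvent> sp = Spliterators.spliteratorUnknownSize(
                it, Spliterator.ORDERED | Spliterator.NONNULL);

        // XMLEventReader.close() does not close the input source, so close both.
        return StreamSupport.stream(sp, false).onClose(() -> {
            try {
                events.close();
            } catch (XMLStreamException ignored) {
                // still attempt to close the reader below
            }
            try {
                in.close();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
    }

    public static void main(String[] args) {
        Reader in = new java.io.StringReader("<a><b/></a>");
        try (Stream<XMLEvent> s = events(in, false)) {
            System.out.println(s.filter(XMLEvent::isStartElement).count());
        }
    }
}
```

Wrapping the call in try-with-resources is what triggers the close handler; a stream that is merely consumed without being closed leaves the reader open.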

*Implementation:*
* New {{groovy.xml.StAXSupport}} (package-private) — holds the spliterator and 
per-match DOM-build logic (attributes, namespaces, characters, CDATA, comments, 
PIs).
* Both paths flow through 
{{FactorySupport.createXMLInputFactory(allowDocTypeDeclaration)}} so hardening 
is consistent.
* Stream's {{onClose}} closes both the {{XMLEventReader}} and the underlying 
{{Reader}} (the JDK's {{XMLEventReader.close()}} does not propagate).
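
The per-match DOM-build idea can be illustrated with a simplified sketch. It is not the {{StAXSupport}} implementation: {{SubtreeSketch}} is a hypothetical class, it pushes each subtree to a callback instead of returning a lazy {{Stream}}, and it handles only elements, attributes and character data while ignoring namespaces, comments and processing instructions.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.Iterator;
import java.util.function.Consumer;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.Attribute;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical, simplified sketch of building one small DOM tree per matching element.
final class SubtreeSketch {
    // Creates a DOM element (local name only) and copies the attributes.
    static Element toElement(Document doc, StartElement start) {
        Element el = doc.createElement(start.getName().getLocalPart());
        for (Iterator<?> it = start.getAttributes(); it.hasNext();) {
            Attribute a = (Attribute) it.next();
            el.setAttribute(a.getName().getLocalPart(), a.getValue());
        }
        return el;
    }

    // Consumes events until the element opened by 'start' is closed.
    static Element build(Document doc, StartElement start, XMLEventReader events)
            throws XMLStreamException {
        Element root = toElement(doc, start);
        Node current = root;
        int depth = 1;
        while (depth > 0) {
            XMLEvent e = events.nextEvent();
            if (e.isStartElement()) {
                Element child = toElement(doc, e.asStartElement());
                current.appendChild(child);
                current = child;
                depth++;
            } else if (e.isEndElement()) {
                current = current.getParentNode();
                depth--;
            } else if (e.isCharacters()) {
                current.appendChild(doc.createTextNode(e.asCharacters().getData()));
            }
        }
        return root;
    }

    // Hands each outermost matching subtree to 'action', one at a time.
    static void forEachMatch(Reader in, String localName, Consumer<Element> action) {
        try {
            XMLInputFactory f = XMLInputFactory.newInstance();
            f.setProperty(XMLInputFactory.SUPPORT_DTD, false);
            f.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);
            XMLEventReader events = f.createXMLEventReader(in);
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            while (events.hasNext()) {
                XMLEvent e = events.nextEvent();
                if (e.isStartElement()
                        && e.asStartElement().getName().getLocalPart().equals(localName)) {
                    action.accept(build(builder.newDocument(), e.asStartElement(), events));
                }
            }
            events.close();
        } catch (Exception ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        String xml = "<r><item id='1'><item id='2'/></item><item id='3'>x</item></r>";
        forEachMatch(new StringReader(xml), "item",
                el -> System.out.println(el.getAttribute("id")));
    }
}
```

Nested same-name elements are consumed while the outer subtree is being built, so only the outermost match reaches the callback, and at most one subtree is held in memory at a time.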

*Docs:*
* New {{_xml-security.adoc}} covering defaults table, attack vectors, three 
relax knobs ({{allowDocTypeDeclaration}}, 
{{EntityResolver}}/{{setEntityBaseUrl}}, 
{{SerializeOptions.allowExternalResources}}), {{FactorySupport}} entry points, 
large-document streaming with virtual threads, advanced notes.
* {{xml-userguide.adoc}} — added a TIP after the parser options table linking 
to {{<<xml-security>>}}; included {{_xml-security.adoc}} via {{leveloffset=+1}} 
so it flows into the user guide.
* {{_dom-builder.adoc}} — note pointing at the new {{newInstance(validating, 
namespaceAware, allowDocTypeDeclaration)}} overload.
* {{_stax-builder.adoc}} — new "Reading XML with StAX" section pointing at 
{{FactorySupport.createXMLInputFactory()}}, {{XmlUtil.events}} and 
{{XmlUtil.streamElements}}.

*Tests (new {{XmlStreamingTest.groovy}}, 10 cases):*
* {{events}} yields the expected event-type sequence.
* {{streamElements}} matches by local name and by namespace; preserves 
attributes and nested content; emits outer-only when same-name elements are 
nested; closes the underlying reader on stream close; respects 
{{allowDocTypeDeclaration=true}}; rejects null {{localName}}; scales 
(5,000-record synthetic feed sums correctly).

*Verification:*
* {{:groovy-xml:test}} — 247 tests pass (237 from PR1+PR2, 10 new streaming).
* {{./gradlew asciidoctor}} — full doc build succeeds; rendered HTML contains 
the new {{id="xml-security"}} anchor.

*Compat notes:*
* Pure-additive PR. No public method signatures changed. No defaults flipped.


  was:
h3. Part 1: FactorySupport hardened factories (secure by default)

*What's included:*
* {{FactorySupport.java}}: every factory method now returns a hardened factory 
by default
** {{createDocumentBuilderFactory()}} — delegates to 
{{createDocumentBuilderFactory(false)}}; previously returned a bare JDK factory
** {{createSaxParserFactory()}} — delegates to 
{{createSaxParserFactory(false)}}; previously returned a bare JDK factory
** {{createDocumentBuilderFactory(boolean allowDocTypeDeclaration)}} *(new)*
** {{createSaxParserFactory(boolean allowDocTypeDeclaration)}} *(new)*
** {{createXMLInputFactory()}} *(new)*
** {{createXMLInputFactory(boolean allowDocTypeDeclaration)}} *(new)*
** {{createTransformerFactory(boolean allowDocTypeDeclaration, boolean 
allowExternalResources)}} *(new)*
** {{createSchemaFactory(String schemaLanguage)}} *(new)*
** {{createXPathFactory()}} *(new)*
* {{FactorySupport.java}}: added private quiet helpers for the new factory 
types (SchemaFactory, XPathFactory, TransformerFactory {{setAttribute}}, 
XMLInputFactory {{setProperty}}).
* {{FactorySupportTest.java}}: 14 new tests covering hardening defaults on both 
the zero-arg and boolean overloads, relax-flag round-trips for each factory 
type.

*Hardening recipes applied:*
|| Factory || Settings ||
| DocumentBuilderFactory | {{FEATURE_SECURE_PROCESSING=true}}, 
{{disallow-doctype-decl=!allow}}, {{XIncludeAware=false}}, 
{{ExpandEntityReferences=false}} |
| SAXParserFactory | {{FEATURE_SECURE_PROCESSING=true}}, 
{{disallow-doctype-decl=!allow}} |
| XMLInputFactory | {{SUPPORT_DTD=allow}}, 
{{IS_SUPPORTING_EXTERNAL_ENTITIES=false}} |
| TransformerFactory | {{FEATURE_SECURE_PROCESSING=true}}, 
{{disallow-doctype-decl=!allow}}, {{ACCESS_EXTERNAL_DTD}} and 
{{ACCESS_EXTERNAL_STYLESHEET}} = {{"all"}} or {{""}} per 
{{allowExternalResources}} |
| SchemaFactory | {{FEATURE_SECURE_PROCESSING=true}} (no {{ACCESS_EXTERNAL_*}} 
— preserves legitimate {{<xs:import>}}) |
| XPathFactory | {{FEATURE_SECURE_PROCESSING=true}} |

*Compat notes:*
* Direct callers of the zero-arg {{createDocumentBuilderFactory()}} / 
{{createSaxParserFactory()}} now receive hardened factories. Callers that 
previously parsed DOCTYPE-bearing input through those factories must switch to 
the {{(true)}} overload. Internal Groovy callers ({{XmlParser}}, 
{{XmlSlurper}}, {{XmlUtil}}, {{DOMBuilder}}) overlay their own settings on top 
of the default and are unaffected.
* No public method signatures changed. No methods deprecated.

h3. Part 2: Route internal XML factory sites through FactorySupport; tighten 
partial hardening

*Routing changes (refactor — no user-visible behaviour change):*
* {{XmlParser}}, {{XmlSlurper}}: replaced inline factory + 
{{setFeatureQuietly}} hardening with 
{{FactorySupport.createSaxParserFactory(allowDocTypeDeclaration)}}. Removed 
unused {{XMLConstants}} / {{setFeatureQuietly}} imports.
* {{XmlUtil.newFactoryInstance}}: routed through 
{{FactorySupport.createSaxParserFactory}}; method now declares {{throws 
ParserConfigurationException}} (existing callers already declared it).
* {{XmlUtil}} schema validation paths (3 sites): routed through 
{{FactorySupport.createSchemaFactory}}.
* {{DOMBuilder.parse}} static methods: routed through 
{{FactorySupport.createDocumentBuilderFactory}}.
* {{StreamingDOMBuilder}}: replaced raw 
{{DocumentBuilderFactory.newInstance()}} with 
{{FactorySupport.createDocumentBuilderFactory()}}.
* {{DomToGroovy}}: routed through 
{{FactorySupport.createDocumentBuilderFactory()}}; dropped redundant inline 
{{setFeature}}.
* {{DOMCategory}} XPath (2 sites): routed through 
{{FactorySupport.createXPathFactory()}}.

*New API:*
* {{DOMBuilder.newInstance(boolean validating, boolean namespaceAware, boolean 
allowDocTypeDeclaration)}} — relax knob for the instance/DSL parsing path. 
Existing zero-arg and 2-arg overloads default to hardened.
* {{SerializeOptions.allowExternalResources}} ({{boolean}}, default {{false}}) 
with getter/setter — relax knob for {{XmlUtil.serialize}} when transforming 
XSLT that legitimately imports external resources.

*Tightening (one user-visible default change):*
|| Site || Was || Now ||
| {{XmlUtil.serialize}} TransformerFactory | only 
{{disallow-doctype-decl=!allow}} | {{FEATURE_SECURE_PROCESSING=true}}, 
{{disallow-doctype-decl=!allow}}, {{ACCESS_EXTERNAL_DTD}} and 
{{ACCESS_EXTERNAL_STYLESHEET}} = {{""}} (or {{"all"}} per 
{{allowExternalResources}}) |
| {{XmlUtil}} SchemaFactory (3 sites) | nothing | 
{{FEATURE_SECURE_PROCESSING=true}} (no {{ACCESS_EXTERNAL_*}} — preserves 
legitimate {{<xs:import>}}) |
| {{DOMCategory}} XPathFactory | nothing | {{FEATURE_SECURE_PROCESSING=true}} |

*Defence-in-depth (no observable user impact):*
* {{StreamingDOMBuilder}} factory hardening (factory used only for 
{{newDocument()}}, not for parsing input).
* {{GrapeIvy}} Ivy descriptor parsing now hardened — inlined recipe (FSP + 
{{disallow-doctype-decl}} + {{XIncludeAware=false}} + 
{{ExpandEntityReferences=false}}) since {{groovy-grape-ivy}} does not depend on 
{{groovy-xml}}.
* {{DomToGroovy}} now also has {{FEATURE_SECURE_PROCESSING=true}} (previously 
only had {{disallow-doctype-decl}}).

*Tests:*
* New {{XmlSecurityTest.groovy}} with 14 cases pinning the secure-by-default 
contract: XXE, billion-laughs and external-DTD blocked by default in 
{{XmlParser}}/{{XmlSlurper}}/{{DOMBuilder.parse}}; 
{{allowDocTypeDeclaration=true}} relax knob still functional; 
{{XmlUtil.serialize}} blocks external XSLT imports by default and respects 
{{allowExternalResources=true}}; schema validation still resolves with FSP 
enabled; {{DOMCategory}} XPath still evaluates.

*Verification:*
* {{:groovy-xml:test}} — 237 tests pass (223 existing + 14 new).
* Full {{:compileJava :compileGroovy}} — clean.

*Compat notes:*
* Single user-visible default flip: {{XmlUtil.serialize}} blocks external XSLT 
imports unless {{SerializeOptions.allowExternalResources=true}}. Affects an 
uncommon usage pattern (passing XSLT stylesheets through the serialize path); 
documented in the new test cases and in the {{SerializeOptions}} javadoc.
* All other changes are internal refactor or defence-in-depth on paths that 
don't parse external input.


> Consolidate XML factory hardening and document secure-by-default parsing
> ------------------------------------------------------------------------
>
>                 Key: GROOVY-11979
>                 URL: https://issues.apache.org/jira/browse/GROOVY-11979
>             Project: Groovy
>          Issue Type: Improvement
>            Reporter: Paul King
>            Assignee: Paul King
>            Priority: Major
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
