Hi Shruthi,
The following XSLT 1.0 transformation, run using Xalan-J 2.7.3,
resolves the issue you've described.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE xsl:stylesheet [
   <!ENTITY eacute "Hello">
]>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                         version="1.0">

       <xsl:output method="html" encoding="UTF-8" indent="yes"/>

       <xsl:template match="/">
            <html>
              <head>
                  <title>Openjdk bug report test</title>
              </head>
              <body>
                  <h2>&eacute;</h2>
               </body>
            </html>
        </xsl:template>

</xsl:stylesheet>

This resolution should work with Xalan-J bundled with OpenJDK as well.
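
As a rough illustration of why the entity declaration matters to an XML
parser, here is a minimal Java sketch. It is not part of the test above:
the class name EntityDeclDemo and the helper textOrNull are made up for
illustration, the entity is mapped to &#233; rather than "Hello", and it
assumes the JDK's default DocumentBuilderFactory settings (DOCTYPE
declarations allowed).

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not taken from the original report.
class EntityDeclDemo {

    // Returns the root element's text content, or null if parsing fails
    // (for example because an entity is referenced but not declared).
    static String textOrNull(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing
            // the "[Fatal Error]" line to stderr.
            builder.setErrorHandler(new DefaultHandler());
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // No declaration for eacute: the XML parser rejects the reference.
        String undeclared = "<h2>&eacute;</h2>";
        // Internal DTD subset declares eacute (assumed mapping: U+00E9).
        String declared =
            "<!DOCTYPE h2 [<!ENTITY eacute \"&#233;\">]><h2>&eacute;</h2>";

        System.out.println(textOrNull(undeclared) == null);
        System.out.println("\u00e9".equals(textOrNull(declared)));
    }
}
```

The point of the sketch is only that an XML parser accepts &eacute; when
the document it reads carries a declaration for it, and rejects it
otherwise.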

HTH

On Thu, Apr 9, 2026 at 11:04 AM Shruthi . <[email protected]> wrote:

> I would like to seek clarification on a behavior observed when performing an 
> XSL transformation followed by XML parsing.
>
> Problem Description:
> A SAXParseException is encountered when parsing the result of a Java XSL 
> transformation that uses HTML output and contains accented characters 
> represented as named entities.
>
> Scenario:
> We perform an XSL transformation using `Transformer`, and then attempt to 
> parse the resulting output using `DocumentBuilder`.
>
> When the XSLT uses:
> <xsl:output method="html" encoding="UTF-8" indent="yes"/>
>
> the transformation succeeds, but parsing the result fails with the following 
> error:
>
> [Fatal Error] :4:98: The entity "eacute" was referenced, but not declared.
> org.xml.sax.SAXParseException; lineNumber: 4; columnNumber: 98; The entity 
> "eacute" was referenced, but not declared.
>         at 
> com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:257)
>         at 
> com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:338)
>         at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:121)
>         at HTMLEntityParsingTest.main(HTMLEntityParsingTest.java:40)
>
>
> However, when we change the XSLT output method to the below, the issue does 
> not occur.
> <xsl:output method="xml" encoding="UTF-8" indent="yes"/>
>
> Observation:
> It appears that the HTML output contains named entities such as `&eacute;`, 
> which are not recognized by the XML parser.
>
> Could you please confirm whether this behavior is expected, or if this could 
> be considered a bug or limitation in the current implementation?
>
> Releases:
> The issue is consistent across all OpenJDK versions (JDK 8 and above).



-- 
Regards,
Mukul Gandhi
