tsmock commented on a change in pull request #175:
URL: https://github.com/apache/ant/pull/175#discussion_r790853748
##########
File path:
src/main/org/apache/tools/ant/taskdefs/optional/junitlauncher/LegacyXmlResultFormatter.java
##########
@@ -298,8 +298,9 @@ private void writeFailed(final XMLStreamWriter writer,
final TestIdentifier test
writeAttribute(writer, ATTR_MESSAGE, message);
}
writeAttribute(writer, ATTR_TYPE, t.getClass().getName());
- // write out the stacktrace
- writer.writeCData(StringUtils.getStackTrace(t));
+ // write out the stacktrace, replacing any CDATA endings by
splitting them into multiple CDATA segments
+ // See https://bz.apache.org/bugzilla/show_bug.cgi?id=65833
+
writer.writeCData(StringUtils.getStackTrace(t).replaceAll("]]>",
"]]]]><![CDATA[>"));
Review comment:
The `encode` method currently only handles special characters (generic
XML). This handles a special _sequence_ for a specific data section. `]]>`.
From [1], "Within a CDATA section, only the CDEnd string is recognized as
markup, so that left angle brackets and ampersands may occur in their literal
form; they need not (and cannot) be escaped using " < " and " & ". CDATA
sections cannot nest." (CDEnd === `]]>`)
This means that replacing `>` with `>` would not work, as CDATA parsers
shouldn't treat `> === >`. They should treat it as a _literal_ `>` (since
the & is literal), which can break many usages of the stacktrace. Sure, people
could have regexes to fix the stack trace, but that is duplicate work for many
people.
With all that said, `writeCData` should properly handle text with `]]>` in
it, but since people have had to work around it, it may no longer be possible
to fix the issue in the actual implementing libraries without a lot of work.
How would it know if you were working around this issue, and _not_ escape the
cdata sequence again? (`]]>` -> `]]]]><!CDATA[>` ->
`]]]]]]><!CDATA[><!CDATA[>`).
Example pitfall: A test failure outputs XML with a CDATA section, which is
then written to CDATA in LegacyXmlResultFormatter. We want to keep the XML
output as is, which means escaping the interior CDATA endings. If this then
happens again in the `writeCData` method, then the xml output shown to the user
will most likely be wrong. From the above example, instead of `]]>`, the user
would see `]]]]><!CDATA[>` unless the CDATA parser recursively parses the data
(unlikely IMO).
[1] https://www.w3.org/TR/REC-xml/#dt-cdsection
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]