[GitHub] [ant] tsmock commented on a change in pull request #175: junitlauncher - Fix an issue in LegacyXmlResultFormatter with ]]> in stacktraces

GitBox Mon, 24 Jan 2022 07:26:52 -0800


tsmock commented on a change in pull request #175:
URL: https://github.com/apache/ant/pull/175#discussion_r790853748




##########
File path: 
src/main/org/apache/tools/ant/taskdefs/optional/junitlauncher/LegacyXmlResultFormatter.java
##########
@@ -298,8 +298,9 @@ private void writeFailed(final XMLStreamWriter writer, 
final TestIdentifier test
                     writeAttribute(writer, ATTR_MESSAGE, message);
                 }
                 writeAttribute(writer, ATTR_TYPE, t.getClass().getName());
-                // write out the stacktrace
-                writer.writeCData(StringUtils.getStackTrace(t));
+                // write out the stacktrace, replacing any CDATA endings by 
splitting them into multiple CDATA segments
+                // See https://bz.apache.org/bugzilla/show_bug.cgi?id=65833
+                
writer.writeCData(StringUtils.getStackTrace(t).replaceAll("]]>", 
"]]]]><![CDATA[>"));

Review comment:
       The `encode` method currently only handles special characters (generic 
XML). This handles a special _sequence_ for a specific data section. `]]>`.
   
   From [1], "Within a CDATA section, only the CDEnd string is recognized as 
markup, so that left angle brackets and ampersands may occur in their literal 
form; they need not (and cannot) be escaped using " &lt; " and " &amp; ". CDATA 
sections cannot nest." (CDEnd === `]]>`)
   
   This means that replacing `>` with `&gt;` would not work, as CDATA parsers 
shouldn't treat `&gt; === >`. They should treat it as a _literal_ `&gt;` (since 
the & is literal), which can break many usages of the stacktrace. Sure, people 
could have regexes to fix the stack trace, but that is duplicate work for many 
people.
   
   With all that said, `writeCData` should properly handle text with `]]>` in 
it, but since people have had to work around it, it may no longer be possible 
to fix the issue in the actual implementing libraries without a lot of work. 
How would it know if you were working around this issue, and _not_ escape the 
cdata sequence again? (`]]>` -> `]]]]><!CDATA[>` -> 
`]]]]]]><!CDATA[><!CDATA[>`).
   Example pitfall: A test failure outputs XML with a CDATA section, which is 
then written to CDATA in LegacyXmlResultFormatter. We want to keep the XML 
output as is, which means escaping the interior CDATA endings. If this then 
happens again in the `writeCData` method, then the xml output shown to the user 
will most likely be wrong. From the above example, instead of `]]>`, the user 
would see `]]]]><!CDATA[>` unless the CDATA parser recursively parses the data 
(unlikely IMO).
   
   [1] https://www.w3.org/TR/REC-xml/#dt-cdsection




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[GitHub] [ant] tsmock commented on a change in pull request #175: junitlauncher - Fix an issue in LegacyXmlResultFormatter with ]]> in stacktraces

Reply via email to