[ 
https://issues.apache.org/jira/browse/MINVOKER-351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17847248#comment-17847248
 ] 

ASF GitHub Bot commented on MINVOKER-351:
-----------------------------------------

elharo commented on code in PR #242:
URL: 
https://github.com/apache/maven-invoker-plugin/pull/242#discussion_r1604812464


##########
src/main/java/org/apache/maven/plugins/invoker/AbstractInvokerMojo.java:
##########
@@ -1779,7 +1781,7 @@ private void writeJunitReport(BuildJob buildJob, String 
safeFileName) throws Moj
         if (buildLogFile != null && buildLogFile.exists()) {
             getLog().debug("fileLogger:" + buildLogFile);
             try {
-                systemOut.setValue(FileUtils.fileRead(buildLogFile));
+                
systemOut.setValue(StringEscapeUtils.escapeXml10(FileUtils.fileRead(buildLogFile)));

Review Comment:
   One thing I don't yet understand is what the "value" being set here is in 
Xpp3Dom. I *think* (could be wrong) it's supposed to be the raw text to be 
output, which means Xpp3Dom is in fact not a document object model at all. The 
confusion between object model and serialized form is a real problem with 
Xpp3Dom.  
   
   
   



##########
src/main/java/org/apache/maven/plugins/invoker/AbstractInvokerMojo.java:
##########
@@ -1640,7 +1642,7 @@ private void runBuild(
             }
         } catch (RunFailureException e) {
             buildJob.setResult(e.getType());
-            buildJob.setFailureMessage(e.getMessage());
+            
buildJob.setFailureMessage(StringEscapeUtils.escapeXml10(e.getMessage()));

Review Comment:
   I haven't followed through the complete code yet, but the failure message 
should generally only be escaped once, and that should be left to the XML 
writer. That is, don't escape a string that appears in PCDATA or an attribute 
value. Let the writer do that.
   
   This all assumes that the writer does correctly escape everything. Some 
libraries have borked this up in one way or another. Looking at it now 
Xpp3DomWriter is a mess and doesn't have a clean model of when and what to 
escape. It seems to confuse the unescaped model strings with the serialized 
escaped strings, and switches back and forth between escaped or not depending 
on switches.
   
   I'm not sure exactly how to fix this issue. Really what needs to happen is 
either fix XPP3DomWriter and likely XPPDom, while breaking the API and all 
existing dependents, or switch to a better designed XML library here. Short of 
that maybe you can make this work with enough unit tests. 
   





> Prevent XML-prohibited characters from entering JUnit report
> ------------------------------------------------------------
>
>                 Key: MINVOKER-351
>                 URL: https://issues.apache.org/jira/browse/MINVOKER-351
>             Project: Maven Invoker Plugin
>          Issue Type: Bug
>            Reporter: Mikkel Kjeldsen
>            Assignee: Slawomir Jaranowski
>            Priority: Major
>             Fix For: 3.7.0
>
>         Attachments: minvoker-351.tar.gz
>
>
> Neither the Maven Invoker plugin's implementation of {{<writeJunitReport>}} 
> nor the underlying XML infrastructure directly protect against the presence 
> of character literals prohibited by the XML specification, meaning such 
> literals can appear in the JUnit report and render it unreadable. *I would 
> appreciate if the Maven Invoker plugin could learn to strip prohibited 
> literals to protect its users from creative plugins.* I argue that this is a 
> safe and expected transformation that is not materially lossy.
> ----
> h2. Background
> MINVOKER-196 added the {{<writeJunitReport>}} option [back in 
> maven-invoker-plugin-3.2.1|https://github.com/apache/maven-invoker-plugin/blob/maven-invoker-plugin-3.2.1/src/main/java/org/apache/maven/plugins/invoker/AbstractInvokerMojo.java#L1878-L1946].
>  As of [maven-invoker-plugin-3.6.0 the effective implementation of the JUnit 
> report remains effectively 
> unchanged|https://github.com/apache/maven-invoker-plugin/blob/maven-invoker-plugin-3.6.0/src/main/java/org/apache/maven/plugins/invoker/AbstractInvokerMojo.java#L1695-L1754].
> The JUnit report includes a {{<system-out>}} element ([example 
> documentation|https://github.com/testmoapp/junitxml]) whose value Maven 
> Invoker populates with the raw build log contents. I've observed that this 
> value is XML-escaped, which I imagine is well understood in the 
> implementation, although I can't immediately find documentation to support 
> that.
> However, escaping notwithstanding, a number of character literals are 
> outright prohibited by the XML specifications. These literals cannot be 
> escaped, and their presence renders an XML document not well formed. The 
> exact set of prohibited characters varies by XML version; the report produced 
> by the Maven Invoker plugin is XML version 1.0. When the Maven Invoker plugin 
> reads in the build log it does not strip these character literals and neither 
> does the XML writer the Maven Invoker plugin relies on. Consequently, if a 
> build log ends up including a prohibited character the resulting JUnit report 
> will not be well formed.
> The set of prohibited characters is the complement of [the XML 
> specification's definition of {{Char}}|https://www.w3.org/TR/xml/#NT-Char].
> h2. Example
> Among the literals prohibited by XML version 1.0 is {{^H}} (backspace). When 
> [pitest runs via Maven|https://pitest.org/quickstart/maven/] it prints a 
> spinner to standard out, and the implementation uses backspace to render the 
> spinner in place. I have used the Maven Invoker plugin with 
> {{<writeJunitReport>}} to verify a pitest configuration, whereby I discovered 
> this limitation.
> h2. Remediation
> h3. Blame plugins
> Perhaps pitest should not behave this way but we can't change pitest, and 
> even if pitest could be changed that offers no protection against any other 
> plugin, so blaming plugins is an ineffective course of action.
> h3. Work-arounds
> The user can manually clean the build log in-place via 
> {{<postBuildHookScript>}}. This is technically fairly easy to do, and makes 
> the transformation very explicit, but it requires considerable local work to 
> address an issue many would find obscure and the transformation is 
> permanently lossy unless the user also backs up the raw log to another file 
> name.
> h3. Strip prohibited literals inside Maven Invoker plugin
> If the Maven Invoker plugin learns to strip offending character literals 
> in-between reading the build log and writing to the {{<system-out>}} value 
> then {{<writeJunitReport>}} will Just Work™, which I assert is what a user 
> will typically expect. Although the {{<system-out>}} value will no longer 
> exactly match the build log contents, this lossy translation is acceptable: 
> the prohibited characters are overwhelmingly unprintable to begin with and 
> therefore cannot be meaningfully rendered in a static context, and the raw 
> build log remains unchanged in the event that the user needs to investigate 
> or assert against the raw output.
> This change would be backwards compatible, because any existing user that 
> would be affected by it would already have unparseable JUnit reports.
> * I _believe_ that Java's {{j.u.r.Pattern}} can trivially express the 
> complement of allowed characters but there may exist more efficient solutions.
> * Consider also applying this transformation to the 2 uses of 
> {{buildJob.getFailureMessage()}}.
> h4. Replace prohibited literals inside Maven Invoker plugin
> As a variation of stripping prohibited character literals, the Maven Invoker 
> plugin could substitute sentinel values for prohibited character literals. 
> This approach has the downside that it requires additional decision making 
> for determining suitable substitution(s) but is otherwise comparable.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to