[ 
https://jira.codehaus.org/browse/MPH-87?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Scholte updated MPH-87:
------------------------------

    Component/s: effective-pom
    Description: 
As stated in http://www.w3.org/TR/REC-xml/#sec-guessing-no-ext-info, XML files 
without a BOM and without an XML encoding declaration should be read as 
UTF-8. 

{{help:effective-pom}} uses the platform encoding when writing the 
effective POM and does not emit a matching XML encoding declaration in the 
resulting XML file.
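
The effect can be reproduced outside Maven. The following is a minimal sketch, not the plugin's code: the class and method names are made up for illustration, and ISO-8859-1 merely stands in for a non-UTF-8 platform encoding. It encodes the developer name with one charset and decodes it as UTF-8, which is what an XML parser has to assume when there is neither a BOM nor an encoding declaration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class PlatformEncodingDemo {

    // Hypothetical helper: encode with the writer's charset, then decode as
    // UTF-8, the encoding a parser assumes without BOM or declaration.
    static String writeThenReadAsUtf8(String text, Charset writerCharset) {
        byte[] bytes = text.getBytes(writerCharset);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "Joerg" with o-umlaut, written as an escape so this source file
        // does not depend on its own encoding.
        String name = "J\u00f6rg";

        // Writer and reader agree: the name survives.
        System.out.println(
            writeThenReadAsUtf8(name, StandardCharsets.UTF_8).equals(name)); // true

        // Writer used a single-byte encoding (assumption: ISO-8859-1 as the
        // platform default): the byte 0xF6 is not valid UTF-8, so the name
        // no longer round-trips.
        System.out.println(
            writeThenReadAsUtf8(name, StandardCharsets.ISO_8859_1).equals(name)); // false
    }
}
```

With a UTF-8 platform encoding the problem does not show up, so whether the bug is visible depends on the machine's default encoding.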

I have created a small sample project (available at 
https://github.com/mfriedenhagen/invalidpom, attached as ZIP) which will 
reproduce the issue.

While the parent pom 
(https://raw.github.com/mfriedenhagen/invalidpom/master/pom.xml) has an XML 
encoding declaration, 
https://raw.github.com/mfriedenhagen/invalidpom/master/child-invalid/pom.xml 
has none.

Now running:
{code}
mvn -s settings.xml -gs settings.xml clean validate
{code}

will produce an invalid character for the developer name "Jörg" in 
{{child-invalid}}. 

Three workarounds are:
* to include an XML encoding declaration as done in {{child-valid}}. 
* to use {{JAVA_TOOL_OPTIONS}} on Windows as stated in 
http://stackoverflow.com/a/623036/49132
* to use {{MAVEN_OPTS=-Dfile.encoding=utf-8 mvn -s settings.xml -gs 
settings.xml clean validate}}.
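
All of these workarounds make the bytes on disk agree with what a reader will assume. A writer-side fix could follow the same principle. The sketch below is only an illustration under that assumption, not the plugin's actual implementation: {{ExplicitEncodingWriter}} and {{render}} are hypothetical names. It picks the charset explicitly and emits a declaration naming the same charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ExplicitEncodingWriter {

    // Hypothetical helper: prepend an XML declaration that names the charset
    // and encode the whole document with that same charset, so the output
    // never depends on the platform default.
    static byte[] render(String body, Charset charset) {
        String xml = "<?xml version=\"1.0\" encoding=\"" + charset.name() + "\"?>\n" + body;
        return xml.getBytes(charset);
    }

    public static void main(String[] args) {
        String body = "<developer><name>J\u00f6rg</name></developer>";
        byte[] bytes = render(body, StandardCharsets.UTF_8);

        // Decoding with the declared charset gives back the original text.
        String decoded = new String(bytes, StandardCharsets.UTF_8);
        System.out.println(decoded.contains("J\u00f6rg"));
    }
}
```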

Nonetheless I consider this a Major bug, as it clearly violates the 
recommendations of the W3C.


    
> help:effective-pom uses platform encoding and garbles non-ascii characters, 
> emits invalid XML
> ---------------------------------------------------------------------------------------------
>
>                 Key: MPH-87
>                 URL: https://jira.codehaus.org/browse/MPH-87
>             Project: Maven 2.x Help Plugin
>          Issue Type: Bug
>          Components: effective-pom
>    Affects Versions: 2.1.1
>         Environment: Windows, MacOSX, Linux, Maven 3.0.4
>            Reporter: Mirko Friedenhagen
>         Attachments: mfriedenhagen-invalidpom-MPH-87-0-g42a5c31.zip


