[ https://issues.apache.org/jira/browse/MCHANGELOG-142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16999159#comment-16999159 ]
Elliotte Rusty Harold commented on MCHANGELOG-142:
--------------------------------------------------

should be investigated

> UTF-8 Encoding doubled
> ----------------------
>
>                 Key: MCHANGELOG-142
>                 URL: https://issues.apache.org/jira/browse/MCHANGELOG-142
>             Project: Maven Changelog Plugin
>          Issue Type: Bug
>    Affects Versions: 2.3
>            Reporter: Jukka Harkki
>            Priority: Major
>              Labels: sample-project-missing
>
> Creating the changelog.xml file doubles the UTF-8 encoding if the git comment
> information is already in UTF-8 format. For example, if the property
> outputEncoding is set to ISO-8859-1, the output is (shown as an od dump):
> {code}
> 0004060 7375 7420 696f 696d 616d 6e61 6d20 c379
>           u   s       t   o   i   m   i   m   a   a   n       m   y   ├
> 0004100 73b6 6c20 7369 a4c3 6b79 6573 7373 a4c3
>           Â   s       l   i   s   ├   ñ   y   k   s   e   s   s   ├   ñ
> {code}
> And when it is set to UTF-8, the output is:
> {code}
> 0004060 6d69 6d69 6161 206e 796d 83c3 b6c2 2073
>           i   m   i   m   a   a   n       m   y   ├   â   ┬   Â   s
> {code}
> The result of the UTF-8 encoding is that Scandinavian umlauts are garbled.
> The byte sequence C3 B6 is the correct UTF-8 encoding of the letter "ö".
> The ISO-8859-1 format would do for the site documentation, but since the
> changelog.xml header says ISO-8859-1 encoding, the rest of the process fails
> to process umlauts.
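
The two byte patterns in the dumps (C3 B6 with ISO-8859-1, C3 83 C2 B6 with UTF-8) are what one would see if the UTF-8 commit text had already been decoded as ISO-8859-1 before it reached the writer. That is only one possible explanation and has not been checked against the plugin source. A minimal sketch reproducing the bytes, where the class and method names (DoubleEncodingDemo, misdecodeAsLatin1, hex) are hypothetical and not part of the plugin:

```java
import java.nio.charset.StandardCharsets;

public class DoubleEncodingDemo {

    // Assumption: simulates reading UTF-8 bytes with the wrong charset (ISO-8859-1).
    public static String misdecodeAsLatin1(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Formats bytes as lowercase hex pairs separated by spaces.
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String garbled = misdecodeAsLatin1("\u00f6"); // "ö" becomes the two chars U+00C3 U+00B6

        // Writing the garbled string as ISO-8859-1 maps each char back to one byte,
        // so the file happens to contain valid UTF-8 again.
        System.out.println(hex(garbled.getBytes(StandardCharsets.ISO_8859_1))); // c3 b6

        // Writing it as UTF-8 encodes each of the two chars separately: double encoding.
        System.out.println(hex(garbled.getBytes(StandardCharsets.UTF_8))); // c3 83 c2 b6
    }
}
```

If this is what happens, the place to look would be wherever the git output is turned into a String, rather than the writer itself.
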
> I modified the writeChangelogXml() method of class ChangeLogReport by
> commenting out the writer change from issue MCHANGELOG-86:
> {code}
> PrintWriter pw = new PrintWriter(new BufferedOutputStream(new FileOutputStream(outputXML)));
> pw.write(changelogXml.toString());
> pw.flush();
> pw.close();
> // MCHANGELOG-86
> // Writer writer = WriterFactory.newWriter( new BufferedOutputStream( new FileOutputStream( outputXML ) ),
> //                                         getOutputEncoding() );
> // writer.write(changelogXml.toString());
> // writer.flush();
> // writer.close();
> {code}
> It might be that there is double escaping in the Writer, since a couple of
> lines above, the change set is created with encoding information:
> {code}
> String changeset = changelogSet.toXML(getOutputEncoding());
> {code}
> However, this is just a wild guess, since I did not check the implementation
> of changelogSet.toXML() or writer.write(). It could also be something
> different in the version control access, since MCHANGELOG-86 was an SVN issue
> and here we are dealing with Git.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)