[ https://issues.apache.org/jira/browse/HADOOP-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14195110#comment-14195110 ]
Stephen Chu commented on HADOOP-11165:
--------------------------------------

This is related to http://stackoverflow.com/questions/25404373/java-8-utf-8-encoding-issue-java-bug:

"It is a property of the “Modified UTF-8” encoding to store surrogate pairs (or even unpaired chars of that range) like individual characters. And it’s an error if a decoder claiming to use standard UTF-8 uses “Modified UTF-8”. This seems to have been fixed with Java 8."

So when running the test in Java 8, we'll get mismatching Strings because {{new String(UTF8.getBytes(before), "UTF-8")}} will not decode using "Modified UTF-8" anymore.

> TestUTF8 fails when run against java 8
> --------------------------------------
>
> Key: HADOOP-11165
> URL: https://issues.apache.org/jira/browse/HADOOP-11165
> Project: Hadoop Common
> Issue Type: Test
> Reporter: Ted Yu
> Assignee: Stephen Chu
> Priority: Minor
>
> Using jdk1.8.0_20 , I got:
> {code}
> testGetBytes(org.apache.hadoop.io.TestUTF8) Time elapsed: 0.007 sec <<<
> FAILURE!
> junit.framework.ComparisonFailure:
> expected:<쑼ь⣄鬘㟻햫紖燺[?炀⑰풸낓⨵ἲꬌホ쭷㛕曬䟊⁍䴥䳠領蟭뱻宭竕昚鍳튇ꊕ혶齲쏈㠮胨䩦隼䍻킿喝벁ࢼ듿饭玳Մ剌䒤?䳛슟녚沖᯳?訨
> 牙⍖?䎠旘薑春觀葝礫⁑ﻱ⣽゚굿뒦ݦ︀偆?]古絥萟浐> but
> was:<쑼ь⣄鬘㟻햫紖燺[�炀⑰풸낓⨵ἲꬌホ쭷㛕曬䟊⁍䴥䳠領蟭뱻宭竕昚鍳튇ꊕ혶齲쏈㠮胨䩦隼䍻킿喝벁ࢼ듿饭玳Մ剌䒤�䳛슟녚᯳�訨牙⍖�䎠旘薑春觀葝礫⁑ﻱ⣽゚굿뒦ݦ︀偆�]古絥萟浐>
> at junit.framework.Assert.assertEquals(Assert.java:100)
> at junit.framework.Assert.assertEquals(Assert.java:107)
> at junit.framework.TestCase.assertEquals(TestCase.java:269)
> at org.apache.hadoop.io.TestUTF8.testGetBytes(TestUTF8.java:58)
> testIO(org.apache.hadoop.io.TestUTF8) Time elapsed: 0.002 sec <<< FAILURE!
> junit.framework.ComparisonFailure:
> expected:<...ᨍ⁖粩⧬车﹂脖朷䝄懒댵突疼資⍣眠畠忁[?]䪐ゑ鬍鍅遻ꈸ釡> but
> was:<...ᨍ⁖粩⧬车﹂脖朷䝄懒댵突疼資⍣眠畠忁[�]䪐ゑ鬍鍅遻>ꈸ釡>
> at junit.framework.Assert.assertEquals(Assert.java:100)
> at junit.framework.Assert.assertEquals(Assert.java:107)
> at junit.framework.TestCase.assertEquals(TestCase.java:269)
> at org.apache.hadoop.io.TestUTF8.testIO(TestUTF8.java:86)
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
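
For anyone who wants to reproduce the mismatch described in the comment without a Hadoop checkout, here is a minimal sketch. It is an assumption-laden stand-in, not the TestUTF8 code: it uses the JDK's DataOutputStream.writeUTF to produce Modified UTF-8 bytes in place of Hadoop's UTF8.getBytes, and the class name ModifiedUtf8Demo and its helper methods are hypothetical. The input contains one unpaired surrogate, which is the kind of character the random test strings appear to trip over.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; not part of Hadoop.
final class ModifiedUtf8Demo {

    // Encode with the JDK's Modified UTF-8 writer and drop the
    // 2-byte length prefix that writeUTF puts in front of the payload.
    static byte[] encodeModifiedUtf8(String s) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeUTF(s);
            out.flush();
            byte[] framed = buf.toByteArray();
            return Arrays.copyOfRange(framed, 2, framed.length);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Decode with the matching Modified UTF-8 reader by restoring
    // the length prefix that readUTF expects.
    static String decodeModifiedUtf8(byte[] payload) {
        try {
            ByteArrayOutputStream framed = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(framed);
            out.writeShort(payload.length);
            out.write(payload);
            out.flush();
            DataInputStream in =
                new DataInputStream(new ByteArrayInputStream(framed.toByteArray()));
            return in.readUTF();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // 'a', an unpaired high surrogate (U+D800), 'b'.
        String before = "a" + (char) 0xD800 + "b";
        byte[] bytes = encodeModifiedUtf8(before);

        // Standard UTF-8 decoding, analogous to new String(..., "UTF-8") in the test.
        String strict = new String(bytes, StandardCharsets.UTF_8);
        // Modified UTF-8 decoding, which accepts the 3-byte surrogate form.
        String modified = decodeModifiedUtf8(bytes);

        System.out.println("standard UTF-8 round trip equal: " + before.equals(strict));
        System.out.println("modified UTF-8 round trip equal: " + before.equals(modified));
    }
}
```

On Java 8 and later the standard decoder should substitute U+FFFD for the surrogate bytes, so the first comparison is expected to fail while the second succeeds. That matches the [?] versus [�] differences in the ComparisonFailure output quoted above.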