On Wed, Sep 1, 2010 at 5:43 AM, Deven You <devyo...@gmail.com> wrote:
> I have run the test on Linux and got the same error. It seems to be caused
> by our UTF-8 decoder. I will do more debugging to narrow down the root
> cause. Is anyone familiar with UTF-8? I hope I can get some help.

Looks like the problem is in UTF_8's decodeLoop, where it does:

    cArr[outIndex++] = (char) jchar;

and similarly in the non-array case, where it does:

    out.put((char) jchar);

In this case, jchar holds the correct value of my code point (0x1d11e), but it is being truncated to 'char'. Instead, it needs to be split into surrogates.

-- 
Robert Muir
rcm...@gmail.com
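
For illustration, the surrogate split described above might look roughly like the sketch below. This is not the actual Harmony code: the class name SurrogateSketch and the helper putCodePoint are hypothetical, and only the variable names jchar, cArr and outIndex are borrowed from the decodeLoop lines quoted above. It assumes jchar already holds a valid decoded code point.

```java
public class SurrogateSketch {

    // Hypothetical helper (not part of Harmony): writes the code point
    // 'jchar' into cArr starting at outIndex and returns the new outIndex.
    static int putCodePoint(int jchar, char[] cArr, int outIndex) {
        if (jchar < Character.MIN_SUPPLEMENTARY_CODE_POINT) {
            // BMP code point: fits in a single char, the plain cast is safe.
            cArr[outIndex++] = (char) jchar;
        } else {
            // Supplementary code point: subtract 0x10000, then spread the
            // remaining 20 bits across a high and a low surrogate.
            int offset = jchar - 0x10000;
            cArr[outIndex++] = (char) (0xD800 + (offset >> 10));
            cArr[outIndex++] = (char) (0xDC00 + (offset & 0x3FF));
        }
        return outIndex;
    }

    public static void main(String[] args) {
        char[] cArr = new char[2];
        int n = putCodePoint(0x1D11E, cArr, 0);
        // Show how many chars were written and their values.
        System.out.printf("%d chars: U+%04X U+%04X%n",
                n, (int) cArr[0], (int) cArr[1]);
        // For comparison, the value the bare (char) cast would have stored.
        System.out.printf("truncated: U+%04X%n", (int) (char) 0x1D11E);
    }
}
```

A real fix would also need to check that the output has room for two chars before writing the pair, and report overflow otherwise. The standard Character.toChars(int) performs the same split and could be used instead of the manual arithmetic.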