Philippe de Rochambeau wrote:

> On the other hand, if I store the previous "go" character plus an
> unusual CJK ideogram whose Unicode equivalent is \u5439 (E5 90 B9 in
> UTF-8) in the DB and retrieve the data, JRun 3.1 will only display the
> first character in my form's textarea, plus a few invisible characters,
> and the database will contain the following hex values:
>
> E8 AA 9E E5 3F B9 20 20 20 20 20 20 0D 0A 0A
>
> As you can see, "go" is still there, but the following character
> (E5 3F B9) is not \u5439 (E5 90 B9). I cannot figure out how to fix
> this problem.
>
> Any help with this problem would be much appreciated.

I see what the problem is. As usual, it's all the fault of Bill Gate$. :-)

If you interpret <E5, 90, B9> according to Windows-1252, you see that E5 is "å" and B9 is "¹", but 90 is an unassigned slot! Unassigned characters are normally turned into a question mark, and the code for "?" is (guess what) 3F...

<E8, AA, 9E> survives only by chance, because all three bytes are valid Windows-1252 characters: "è", "ª", and "ž", respectively.

I guess that the problem starts when you try to fool the system into thinking that the text is ISO 8859-1:

    byte[] byt = (newQfLibelleArray[i]).getBytes( "ISO8859_1" );
    String tempUtf16 = new String( byt );

The second line names no encoding, so the bytes are presumably decoded with the platform default, which on Windows is normally Windows-1252.

But, sorry, I can't help with a fix, because I don't know the Java APIs well enough. Can't you do something like <.getBytes("UTF-8")>? Or, even better, doesn't (newQfLibelleArray[i]) have a method to return a <String> object directly?

_ Marco
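
The mechanism described above can be reproduced with a small, self-contained sketch. It is an illustration only, not the original application code: the class name CharsetRoundTrip and its helper methods are hypothetical, and it assumes the JVM provides a charset named "windows-1252" (most do, but the Java specification does not guarantee it). Each byte sequence is decoded into a String and then encoded again with the same charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the original application.
public class CharsetRoundTrip {

    // Decode the raw bytes with the given charset, then encode the
    // resulting String back to bytes with the same charset.
    public static byte[] roundTrip(byte[] raw, Charset cs) {
        return new String(raw, cs).getBytes(cs);
    }

    // Format bytes as space-separated upper-case hex, e.g. "E5 90 B9".
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] go  = {(byte) 0xE8, (byte) 0xAA, (byte) 0x9E}; // UTF-8 for U+8A9E
        byte[] sui = {(byte) 0xE5, (byte) 0x90, (byte) 0xB9}; // UTF-8 for U+5439

        // Assumption: this charset name is available on the running JVM.
        Charset cp1252 = Charset.forName("windows-1252");

        // All three bytes are assigned in Windows-1252, so they survive.
        System.out.println(hex(roundTrip(go, cp1252)));   // E8 AA 9E

        // 0x90 is unassigned, so it comes back as '?' (0x3F).
        System.out.println(hex(roundTrip(sui, cp1252)));  // E5 3F B9

        // Naming UTF-8 explicitly keeps the sequence intact.
        System.out.println(hex(roundTrip(sui, StandardCharsets.UTF_8))); // E5 90 B9
    }
}
```

If this matches what happens in the application, one common approach is to name the charset in the String constructor, for example new String(byt, "UTF-8"), rather than relying on the platform default. Whether that suits the JRun setup in question is untested here.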