The quick fix of specifying UTF-8 sounds good to me. As you say, it's better than unspecified behavior.
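
For concreteness, here is a rough sketch of what the quick fix and a stricter CharsetDecoder variant might look like. The class and method names (ResponseDecoding, decodeLenient, decodeStrict) are made up for illustration and are not taken from the fetchJson code; only the java.nio.charset calls are real JDK API. How to pick the charset, and what to do on failure, remain open as discussed below.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;

// Hypothetical helper; not part of the existing fetchJson implementation.
final class ResponseDecoding {
    private static final Charset UTF8 = Charset.forName("UTF-8");

    // Quick fix: always decode as UTF-8 instead of the platform default.
    // new String(byte[], Charset) has defined behavior: malformed input
    // is replaced with the charset's replacement character (U+FFFD).
    static String decodeLenient(byte[] body) {
        return new String(body, UTF8);
    }

    // Stricter variant: a CharsetDecoder configured to report errors,
    // so the caller can decide what to do when decoding fails.
    static String decodeStrict(byte[] body) throws CharacterCodingException {
        CharsetDecoder decoder = UTF8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        return decoder.decode(ByteBuffer.wrap(body)).toString();
    }
}
```

The lenient version never throws, which suits a stopgap; the strict version surfaces bad bytes as a CharacterCodingException, which may be the better base once the failure policy is settled.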
--John

On Fri, Feb 1, 2008 at 12:41 PM, Brian Eaton <[EMAIL PROTECTED]> wrote:
> The current fetchJson implementation uses "new
> String(results.getByteArray())" to convert the response bytes to a
> string for inclusion in the JSON reply to the gadget. The behavior of
> new String(byte[]) is undefined "when the given bytes are not valid in
> the default charset".
>
> The default charset could be anything, and the returned bytes from the
> remote server could also be anything. This is likely to cause
> problems (data corruption) for gadgets fetching data from non-english
> web sites.
>
> I'll open up a JIRA issue for this, but I wanted to see whether anyone
> had proposals for a solution. The fix will probably involve using
> CharsetDecoder, so we at least have well-defined behavior. How we
> pick the CharsetDecoder to use is an open question. What to do when
> the CharsetDecoding fails is another issue. I'm tempted to put in a
> quick fix that specifies UTF-8 for the character set. That will
> prevent anyone from depending on the current undefined behavior while
> we work out what should happen.
>
> Cheers,
> Brian
>

