The quick fix of specifying UTF-8 sounds good to me. As you say, it's better
than unspecified behavior.

--John

On Fri, Feb 1, 2008 at 12:41 PM, Brian Eaton <[EMAIL PROTECTED]> wrote:

> The current fetchJson implementation uses "new
> String(results.getByteArray())" to convert the response bytes to a
> string for inclusion in the JSON reply to the gadget.  The behavior of
> new String(byte[]) is undefined "when the given bytes are not valid in
> the default charset".
>
> The default charset could be anything, and the returned bytes from the
> remote server could also be anything.  This is likely to cause
> problems (data corruption) for gadgets fetching data from non-English
> web sites.
>
> I'll open up a JIRA issue for this, but I wanted to see whether anyone
> had proposals for a solution.  The fix will probably involve using
> CharsetDecoder, so we at least have well-defined behavior.  How we
> pick the CharsetDecoder to use is an open question.  What to do when
> the CharsetDecoding fails is another issue.  I'm tempted to put in a
> quick fix that specifies UTF-8 for the character set.  That will
> prevent anyone from depending on the current undefined behavior while
> we work out what should happen.
>
> Cheers,
> Brian
>
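
For reference, here is a rough sketch of the two options discussed above:
the quick fix that always decodes as UTF-8, and a stricter CharsetDecoder
variant that reports bad input instead of hiding it. The class and method
names (ResponseDecoding, decodeLenient, decodeStrict) are made up for
illustration and are not part of the fetchJson code, and
StandardCharsets assumes Java 7 or later.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not the actual fetchJson implementation.
class ResponseDecoding {

    // Quick fix: always decode as UTF-8, independent of the platform default.
    // Malformed byte sequences are replaced with U+FFFD rather than
    // producing unspecified results.
    static String decodeLenient(byte[] body) {
        return new String(body, StandardCharsets.UTF_8);
    }

    // Stricter variant: a CharsetDecoder configured to REPORT errors throws
    // a CharacterCodingException when the bytes are not valid UTF-8, so the
    // caller can decide what to do on failure.
    static String decodeStrict(byte[] body) throws CharacterCodingException {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        return decoder.decode(ByteBuffer.wrap(body)).toString();
    }
}
```

Neither version answers the open question of which charset to pick for a
given response; both simply assume UTF-8 so that the behavior is at least
well defined.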
