Re: Template parsing vulnerable to whims of default charset

Kalle Korhonen Wed, 07 Dec 2011 13:47:33 -0800

On Wed, Dec 7, 2011 at 1:09 PM, Martin Strand
<do.not.eat.yellow.s...@gmail.com> wrote:
> On Wed, 07 Dec 2011 19:08:19 +0100, Kalle Korhonen
> <kalle.o.korho...@gmail.com> wrote:
>> 2011/12/6 Robert Coie <r...@intrigue.com>:
>>> On Tue, Dec 06, 2011 at 04:49:42PM -0800, Kalle Korhonen wrote:
>>>> What's your JVM's file.encoding set to? (e.g. -Dfile.encoding=UTF-8).
>>>> The default for most JVMs is *not* UTF-8. Tapestry assumes UTF-8
>>>> throughout.
>>> I believe it's US-ASCII, as checked by Charset.defaultCharset(),
>>> although I have seen some other reports indicating that that may not be
>>> reliable due to caching. It's not "my" JVM in the sense that I can't
>>> change the settings - it's at the mercy of Google App Engine.
>> Ah, you are using GAE. Should have said that in the beginning. See
>> http://gaelyk.appspot.com/tutorial/setup for example and set the JVM
>> encoding to UTF-8.
> The point is still valid - Tapestry should not depend on the default
> charset.
> Robert, since you already found the problem perhaps you could open a jira
> ticket and submit a patch?
> https://issues.apache.org/jira/browse/TAP5


What kind of patch are you envisioning? You could do it for
XMLTokenStream, but I wonder if this creates more problems than it
solves. There are plenty of other read operations, and they would all
then have to specify UTF-8 explicitly; otherwise you get mixed
results. What if Tapestry or your own code depends on a library that
doesn't specify the encoding but uses the platform default? Perhaps a
better, more generic solution is just to document that the JVM's
default encoding should be set to UTF-8 (-Dfile.encoding=UTF-8).
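
To illustrate what "explicitly specify UTF-8" means for a read
operation, here is a rough sketch. CharsetDemo and decode are made-up
names for illustration only, not Tapestry code, and US-ASCII stands in
for the platform default reported on GAE earlier in this thread.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetDemo {

    // Hypothetical helper: decodes bytes through a Reader that is given
    // an explicit charset instead of relying on the platform default.
    static String decode(byte[] bytes, Charset charset) {
        try (Reader reader =
                new InputStreamReader(new ByteArrayInputStream(bytes), charset)) {
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[1024];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String original = "caf\u00e9";  // contains one non-ASCII character
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);

        // What new InputStreamReader(in) would silently fall back to.
        System.out.println("platform default: " + Charset.defaultCharset());

        // Explicit UTF-8 round-trips regardless of -Dfile.encoding.
        System.out.println("UTF-8 matches: "
                + original.equals(decode(utf8, StandardCharsets.UTF_8)));

        // Decoding the same bytes as US-ASCII does not give the text back.
        System.out.println("US-ASCII matches: "
                + original.equals(decode(utf8, StandardCharsets.US_ASCII)));
    }
}
```

The one-argument InputStreamReader constructor uses the platform
default charset, which is why every read path, including those inside
third-party libraries, would need the same treatment for the results
to be consistent.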

Kalle

