[ https://issues.apache.org/jira/browse/STORM-437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rick Kellogg updated STORM-437:
-------------------------------
    Component/s: storm-multilang

> multilang JsonSerializer does not enforce inputstream UTF-8 encoding
> --------------------------------------------------------------------
>
>                 Key: STORM-437
>                 URL: https://issues.apache.org/jira/browse/STORM-437
>             Project: Apache Storm
>          Issue Type: Bug
>          Components: storm-multilang
>    Affects Versions: 0.9.2-incubating
>         Environment: AWS ubuntu 12.04 oracle java7
>            Reporter: Itai Frenkel
>            Assignee: Itai Frenkel
>             Fix For: 0.9.3
>
>
> On some machines UTF-8 gets corrupted over the multilang protocol. Analysis
> of the problem points to JsonSerializer's use of InputStreamReader when
> reading from stdin.
> InputStreamReader uses the JVM default charset, which is usually UTF-8 but
> not always.
>
> Temporary Workaround:
> Edit storm/conf/storm.yaml and enforce the default JVM charset as follows:
> worker.childopts: "-Xmx768m -Dfile.encoding=UTF-8"
>
> Required Fix in JsonSerializer:
> Pass the string "UTF-8" to the InputStreamReader constructor as the second
> argument.
>
> Notes:
> The implementation already enforces UTF-8 when writing to stdout, so no
> other fix is needed there.
> python simplejson and ruby json gem use UTF-8 as the default.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
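
The required fix described in the issue could look roughly like the sketch below. It is not the actual JsonSerializer code: the class name Utf8ReaderSketch and the helper utf8Reader are hypothetical, and an in-memory byte array stands in for the subprocess stream. It only illustrates passing "UTF-8" as the second argument to InputStreamReader so that decoding no longer depends on -Dfile.encoding.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;

public class Utf8ReaderSketch {

    // Hypothetical helper (not Storm's API): wraps a stream in a reader that
    // always decodes UTF-8 instead of falling back to the JVM default charset.
    static BufferedReader utf8Reader(InputStream in) throws IOException {
        // The second argument is the charset name, as the issue proposes.
        return new BufferedReader(new InputStreamReader(in, "UTF-8"));
    }

    public static void main(String[] args) throws IOException {
        // A JSON line containing a non-ASCII character, encoded as UTF-8 bytes.
        String original = "{\"msg\": \"caf\u00e9\"}";
        byte[] payload = (original + "\n").getBytes("UTF-8");

        // Stand-in for the stream the serializer reads from.
        BufferedReader reader = utf8Reader(new ByteArrayInputStream(payload));
        String line = reader.readLine();

        // The decoded line should match the original regardless of file.encoding.
        System.out.println("round trip intact: " + original.equals(line));
    }
}
```

On Java 7 and later, java.nio.charset.StandardCharsets.UTF_8 can be passed instead of the string, which avoids the checked UnsupportedEncodingException.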