Setting Charset in getBytes() call.

David Medinets Sun, 28 Oct 2012 14:50:52 -0700

https://issues.apache.org/jira/browse/ACCUMULO-241?focusedCommentId=13449680&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13449680


In this comment, John mentioned that all getBytes() method calls
should be changed to use UTF8. There are about 1,800 getBytes() calls
and not all of them involve String objects. I am working on ways to
identify a subset of these calls to change.

I have created https://issues.apache.org/jira/browse/ACCUMULO-836 to
track this issue.

Should we create one static Charset object?

  import java.nio.charset.Charset;

  class AccumuloDefaultCharset {
    public static final Charset UTF8 = Charset.forName("UTF8");
  }

Should we use a static constant?

  public static final String UTF8 = "UTF8";
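
To make the two options concrete, here is a rough sketch of how each
might look at a call site. The class and method names are hypothetical
and are not part of Accumulo, and the sketch uses the canonical name
"UTF-8". The practical difference is that getBytes(String) declares the
checked UnsupportedEncodingException, while getBytes(Charset), available
since Java 6, does not.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;

// Hypothetical class, for illustration only; not part of Accumulo.
final class CharsetOptionsSketch {
  // Option 1: one shared Charset object.
  static final Charset UTF8 = Charset.forName("UTF-8");

  // Option 2: one shared charset name.
  static final String UTF8_NAME = "UTF-8";

  // With a Charset object there is no checked exception to handle.
  static byte[] encodeWithCharset(String s) {
    return s.getBytes(UTF8);
  }

  // With a charset name the caller must handle UnsupportedEncodingException.
  static byte[] encodeWithName(String s) {
    try {
      return s.getBytes(UTF8_NAME);
    } catch (UnsupportedEncodingException e) {
      // Every Java platform is required to support UTF-8,
      // so this branch should be unreachable in practice.
      throw new IllegalStateException(e);
    }
  }
}
```

Both methods should return the same bytes for any input; the Charset
variant just avoids the try/catch at each call site.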

I have found one instance of getBytes() in InputFormatBase:

  protected static byte[] getPassword(Configuration conf) {
    return Base64.decodeBase64(conf.get(PASSWORD, "").getBytes());
  }

Are there any reasons why I can't start specifying the charset? Is
UTF8 the right Charset to use? I am not an expert in non-English
charsets, so guidance would be welcome.
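
For discussion, a minimal sketch of what the getPassword() change might
look like with an explicit charset. To stay self-contained it makes two
substitutions that are assumptions, not the existing code: it uses
java.util.Base64 (Java 8 and later) in place of the Base64 class used in
InputFormatBase, and it takes the encoded value as a plain String in
place of conf.get(PASSWORD, ""). The class name is hypothetical.

```java
import java.nio.charset.Charset;
import java.util.Base64;

// Hypothetical stand-in for InputFormatBase.getPassword(); not Accumulo code.
final class PasswordDecodeSketch {
  static final Charset UTF8 = Charset.forName("UTF-8");

  // encodedPassword plays the role of conf.get(PASSWORD, "").
  static byte[] getPassword(String encodedPassword) {
    // The charset is named explicitly, so the bytes handed to the
    // decoder no longer depend on the platform default charset.
    return Base64.getDecoder().decode(encodedPassword.getBytes(UTF8));
  }
}
```

Base64 text consists only of ASCII characters, which UTF-8 encodes as
the same single bytes, so on platforms whose default charset is
ASCII-compatible this change should not alter the decoded result.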
