Since this only affects `String`s, I'm even more inclined to leave the option at the JVM level. Most of our methods that accept a `CharSequence` or `String` end up creating a `Text` object from it, which encodes it as UTF-8. I'd much rather make it our convention to always convert `String`s to `Text` objects when we need to deal with them in a textual way; otherwise we're just dealing with `byte[]` when serializing keys and values.
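
To make the concern concrete, here is a rough sketch of the difference between a bare `getBytes()` call and an explicit UTF-8 encoding. It uses only the JDK (`StandardCharsets` needs Java 7 or later) rather than `Text`, so it stands alone; the class and method names are made up for illustration and are not from our code base.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration class; not part of any existing code base.
public class EncodingSketch {

    // Encodes with an explicit charset, so the result does not depend on
    // the JVM default (e.g. the -Dfile.encoding setting).
    static byte[] utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Encodes with whatever the JVM default charset happens to be.
    static byte[] defaultBytes(String s) {
        return s.getBytes();
    }

    public static void main(String[] args) {
        String row = "caf\u00e9"; // one non-ASCII character
        System.out.println("default charset: " + Charset.defaultCharset());
        System.out.println("explicit UTF-8:  " + Arrays.toString(utf8Bytes(row)));
        System.out.println("JVM default:     " + Arrays.toString(defaultBytes(row)));
    }
}
```

The two byte arrays only match when the JVM default is UTF-8 (or the input is plain ASCII), which is why I would rather funnel `String`s through one explicit conversion point.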

Now, it's another story if Thrift is serializing `String`s with the JVM setting...

On Mon, Oct 29, 2012 at 1:00 PM, David Medinets <[email protected]> wrote:

> > David, can you give some sort of feel for the usages of the getBytes()
> > calls? Since most of the API deals with things in terms of Text and byte[]
> > (Key and Value decomposed), are most of the usages configuration/user-input
> > based as your initial snippet from InputFormatBase showed?
>
> I will post a list of the files that I have changed before I commit. I
> will post the file list as a response in this thread.
