So perhaps we should have ISO-8859-1 as the standard. Mike, do you see any reason to use something besides ISO-8859-1 for the encodings?

John

On Mon, Oct 29, 2012 at 3:14 PM, Michael Flester <[email protected]> wrote:
> > UTF-8 should always be present (according to the JLS), and as a multi-byte
> > format should be able to encode any character that you would need to.
>
> UTF-8 cannot encode arbitrary data. All data that we store in accumulo
> is not characters. A safe encoding to use as a pass through when you
> don't know if you are dealing with characters is ISO-8859-1 since we know
> that we can make the round trip from bytes to string to bytes without loss.
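
For anyone who wants to check the round-trip claim, here is a minimal sketch in plain Java. It is not taken from the Accumulo code base; the class name RoundTrip and the helpers roundTrips and allByteValues are made up for illustration. It decodes a byte array to a String with a given charset, encodes it back, and compares the result with the input.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of Accumulo.
public class RoundTrip {

    // Decode the bytes to a String with the given charset, encode the
    // String back to bytes, and report whether the bytes are unchanged.
    static boolean roundTrips(byte[] data, Charset cs) {
        String s = new String(data, cs);
        return Arrays.equals(data, s.getBytes(cs));
    }

    // Build an array holding every possible byte value, 0x00 through 0xFF.
    static byte[] allByteValues() {
        byte[] all = new byte[256];
        for (int i = 0; i < all.length; i++) {
            all[i] = (byte) i;
        }
        return all;
    }

    public static void main(String[] args) {
        byte[] all = allByteValues();
        // ISO-8859-1 maps each byte to exactly one char, so nothing is lost.
        System.out.println("ISO-8859-1: " + roundTrips(all, StandardCharsets.ISO_8859_1));
        // Bytes that are not valid UTF-8 are replaced with U+FFFD when
        // decoding, so the original bytes cannot be recovered.
        System.out.println("UTF-8:      " + roundTrips(all, StandardCharsets.UTF_8));
    }
}
```

The UTF-8 case fails because new String(byte[], Charset) substitutes the replacement character for malformed input rather than throwing, so the loss is silent. Data that happens to be valid UTF-8 (plain ASCII, for example) still round-trips, which is why the problem is easy to miss in testing.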
