Re: UTF-16

2015-12-31 Thread Randy Abernethy
Hey David, Apache Thrift has a "string" type in its IDL and that type is a language native string in the generated code but is UTF-8 on the wire when using binary, compact or JSON protocols by default. I think Jens is posing the question (correct me if I'm wrong Jens): Should we also support UTF-
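To make the "UTF-8 on the wire" point concrete, here is a minimal sketch of how a string field value is laid out by the binary protocol: a 4-byte big-endian length followed by the UTF-8 bytes. This is an illustration only and does not use the Thrift library; the helper names `encode_string_field` and `decode_string_field` are hypothetical.

```python
import struct


def encode_string_field(text: str) -> bytes:
    """Illustrative only: length-prefixed UTF-8, as the binary protocol lays out a string value."""
    payload = text.encode("utf-8")          # language-native string -> UTF-8 bytes
    return struct.pack(">i", len(payload)) + payload  # 4-byte big-endian byte count, then payload


def decode_string_field(data: bytes) -> str:
    """Inverse of encode_string_field (hypothetical helper)."""
    (length,) = struct.unpack(">i", data[:4])
    return data[4:4 + length].decode("utf-8")


if __name__ == "__main__":
    wire = encode_string_field("héllo")
    print(wire.hex())                 # length prefix counts bytes, not characters
    print(decode_string_field(wire))
```

Note that the length prefix counts encoded bytes, so "héllo" (5 characters) takes 6 payload bytes. Supporting a second encoding would mean every reader has to know which one a given peer used.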

RE: UTF-16

2015-12-31 Thread David Bennett
>>> while UTF-8 is great, especially on Windows platforms UTF-16 is more common,
>>> because the OS uses it heavily internally. Since Win2k it also supports
>>> surrogates and supplementary characters. So there’s OS support for it. What
>>> I don’t know is, how universally is UTF-16 (or a subset of

Re: UTF-16

2015-12-31 Thread Randy Abernethy
Hey Jens, I would vote to keep Thrift simple and standardized on UTF-8 alone. The simple part is the main thing for me. -Randy TL;DR In my experience many lament the 16 bit choice once made. Originally 16 bit Unicode (UCS-2) had no surrogates (as you mention), it was thought all of the impo

UTF-16

2015-12-31 Thread Jens Geyer
Hi all, while UTF-8 is great, especially on Windows platforms UTF-16 is more common, because the OS uses it heavily internally. Since Win2k it also supports surrogates and supplementary characters. So there’s OS support for it. What I don’t know is, how universally is UTF-16 (or a subset of it)
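For readers less familiar with surrogates, a small sketch of what they look like in practice: a supplementary character (one outside the Basic Multilingual Plane) takes two 16-bit code units in UTF-16. The helper name `utf16_code_units` is made up for this illustration and uses only the Python standard library.

```python
def utf16_code_units(text: str) -> list:
    """Return the 16-bit UTF-16 code units for text (illustrative helper)."""
    raw = text.encode("utf-16-be")  # big-endian, no byte order mark
    return [int.from_bytes(raw[i:i + 2], "big") for i in range(0, len(raw), 2)]


if __name__ == "__main__":
    for ch in ("A", "\u20ac", "\U0001F600"):
        units = [hex(u) for u in utf16_code_units(ch)]
        print(f"U+{ord(ch):04X}: UTF-16 units={units}, UTF-8 bytes={len(ch.encode('utf-8'))}")
```

U+1F600 comes out as the surrogate pair 0xD83D 0xDE00, so code that assumes one 16-bit unit per character (the old UCS-2 view) miscounts it.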