Re: UTF-16

2016-01-02 Thread Jens Geyer
Ok, thanks. The more I think about it, the less beneficial it looks to me. JensG - Original Message - From: Randy Abernethy Sent: 01.01.2016 18:18 To: dev@thrift.apache.org Subject: Re: UTF-16 Right now the "string" type in Apache Thrift is abstract. I like th

Re: UTF-16

2016-01-01 Thread Randy Abernethy
Right now the "string" type in Apache Thrift is abstract. I like that. If you are using Java or C# then Apache Thrift strings are UTF-16. If you are using Python or Go, then Apache Thrift strings are UTF-8. So the protocols are already serializing between the language native string type

Re: UTF-16

2016-01-01 Thread Ben Craig
I don't like the idea of adding a new utf-16 string type to the wire protocol, but I think it would be fine to add a utf-16 string type to the language bindings. UTF-8 would be sent over the wire, and then converted from the network buffer into the user's desired string type. A lot o
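
The approach described in this message (UTF-8 on the wire, conversion to the user's preferred string type in the binding) can be illustrated with a minimal sketch. This is not Apache Thrift code; the function names `to_wire` and `from_wire_as_utf16` are hypothetical and only show the two conversion steps using Python's built-in codecs.

```python
def to_wire(text: str) -> bytes:
    """Encode a native string as UTF-8, the encoding sent over the wire."""
    return text.encode("utf-8")


def from_wire_as_utf16(payload: bytes) -> bytes:
    """Decode UTF-8 wire bytes, then re-encode as UTF-16 (little-endian, no BOM).

    Hypothetical helper: stands in for a binding that hands the user a
    UTF-16 string instead of a UTF-8 one.
    """
    return payload.decode("utf-8").encode("utf-16-le")


wire = to_wire("Grüße")
utf16 = from_wire_as_utf16(wire)
# The wire format stays UTF-8; only the in-memory representation changes.
print(len(wire), "bytes on the wire,", len(utf16), "bytes as UTF-16")
```

The point of the sketch is that the wire format does not need a second string type: the conversion is local to the receiving side.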

Re: UTF-16

2016-01-01 Thread Jens Geyer
well ... Happy new year! From: Randy Abernethy Sent: 01.01.2016 02:56 To: dev@thrift.apache.org Subject: Re: UTF-16 Hey David, Apache Thrift has a "string" type in its IDL and that type is a language native string in the generated code but is UTF-8 o

Re: UTF-16

2015-12-31 Thread Randy Abernethy
lso support UTF-16 string encoding on the wire with binary, compact and JSON protocols. -Randy On Thu, Dec 31, 2015 at 5:09 PM, David Bennett wrote: > >>>while UTF-8 is great, especially on Windows platforms UTF-16 is more > common, because the OS uses it heavily internally.

RE: UTF-16

2015-12-31 Thread David Bennett
>>>while UTF-8 is great, especially on Windows platforms UTF-16 is more common, >>>because the OS uses it heavily internally. Since Win2k it also supports >>>surrogates and supplementary characters. So there’s OS support for it. What >>>I don’t know is, how u

Re: UTF-16

2015-12-31 Thread Randy Abernethy
parsed and sorted, can be IDed without BOM, etc. Everything about UTF-16 is equal or worse, with the exception of the fact that some chars are 16 bits in UTF-16 and 24 bits in UTF-8. It is hard to know which format is used the most. Microsoft unsurprisingly says UTF-16 (the standard built into
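
The size comparison mentioned here (some characters take 16 bits in UTF-16 but 24 bits in UTF-8) can be checked with a short sketch. The helper name `encoded_sizes` and the sample characters are chosen for illustration and do not come from the thread.

```python
from typing import Tuple


def encoded_sizes(ch: str) -> Tuple[int, int]:
    """Return (UTF-8 byte count, UTF-16 byte count) for a single character."""
    return len(ch.encode("utf-8")), len(ch.encode("utf-16-le"))


# ASCII, Latin-1 supplement, a BMP symbol, and a supplementary-plane character.
for ch in ("A", "\u00e9", "\u20ac", "\U0001F600"):
    u8, u16 = encoded_sizes(ch)
    print(f"U+{ord(ch):04X}: UTF-8 {u8 * 8} bits, UTF-16 {u16 * 8} bits")
```

Only code points from U+0800 through U+FFFF are smaller in UTF-16; ASCII is smaller in UTF-8, and supplementary characters take 32 bits in both.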

UTF-16

2015-12-31 Thread Jens Geyer
Hi all, while UTF-8 is great, especially on Windows platforms UTF-16 is more common, because the OS uses it heavily internally. Since Win2k it also supports surrogates and supplementary characters. So there’s OS support for it. What I don’t know is, how universally is UTF-16 (or a subset of it

[jira] [Commented] (THRIFT-3404) JSON String reader doesn't recognize UTF-16 surrogate pair

2015-10-30 Thread Hudson (JIRA)
ttps://builds.apache.org/job/Thrift/1703/]) THRIFT-3404 Fixed JSON String reader doesn't recognize UTF-16 surrogate (jensg: rev a6509f7b378ed6591d550134fdda18e4a436fe77) * lib/delphi/test/TestClient.pas * lib/delphi/src/Thrift.Protocol.JSON.pas > JSON String reader doesn't recognize UTF-

[jira] [Resolved] (THRIFT-3404) JSON String reader doesn't recognize UTF-16 surrogate pair

2015-10-30 Thread Jens Geyer (JIRA)
n't recognize UTF-16 surrogate pair > -- > > Key: THRIFT-3404 > URL: https://issues.apache.org/jira/browse/THRIFT-3404 > Project: Thrift > Issue Type: Bug >

[jira] [Updated] (THRIFT-3404) JSON String reader doesn't recognize UTF-16 surrogate pair

2015-10-30 Thread Jens Geyer (JIRA)
[ https://issues.apache.org/jira/browse/THRIFT-3404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jens Geyer updated THRIFT-3404: --- Assignee: Phongphan Phuttha > JSON String reader doesn't recognize UTF-16 surrog

[jira] [Commented] (THRIFT-3404) JSON String reader doesn't recognize UTF-16 surrogate pair

2015-10-30 Thread ASF GitHub Bot (JIRA)
pull request at: https://github.com/apache/thrift/pull/671 > JSON String reader doesn't recognize UTF-16 surrogate pair > -- > > Key: THRIFT-3404 > URL: https://issues.apache.org

[jira] [Commented] (THRIFT-3404) JSON String reader doesn't recognize UTF-16 surrogate pair

2015-10-30 Thread ASF GitHub Bot (JIRA)
pull request: https://github.com/apache/thrift/pull/671 THRIFT-3404: Fixed JSON String reader doesn't recognize UTF-16 surrogate pairs Hi, This should fix THRIFT-3404 to allow Delphi to consume non-BMP Unicode characters. I have modified the test case a bit by

[jira] [Created] (THRIFT-3404) JSON String reader doesn't recognize UTF-16 surrogate pair

2015-10-30 Thread Phongphan Phuttha (JIRA)
Phongphan Phuttha created THRIFT-3404: - Summary: JSON String reader doesn't recognize UTF-16 surrogate pair Key: THRIFT-3404 URL: https://issues.apache.org/jira/browse/THRIFT-3404 Project: T

[jira] [Commented] (THRIFT-3403) JSON String reader doesn't recognize UTF-16 surrogate pairs

2015-10-29 Thread Hudson (JIRA)
ttps://builds.apache.org/job/Thrift/1702/]) THRIFT-3403 Fixed JSON string reader doesn't recognize UTF-16 surrogate (jensg: rev 11b515cd29292358305ace4ce20d7e626c7e7f42) * lib/csharp/test/JSON/Program.cs * lib/csharp/src/Protocol/TJSONProtocol.cs > JSON String reader doesn't recognize UTF-1

[jira] [Resolved] (THRIFT-3403) JSON String reader doesn't recognize UTF-16 surrogate pairs

2015-10-29 Thread Jens Geyer (JIRA)
Committed, thanks! > JSON String reader doesn't recognize UTF-16 surrogate pairs > --- > > Key: THRIFT-3403 > URL: https://issues.apache.org/jira/browse/THRIFT-3403 >

[jira] [Commented] (THRIFT-3403) JSON String reader doesn't recognize UTF-16 surrogate pairs

2015-10-29 Thread ASF GitHub Bot (JIRA)
pull request at: https://github.com/apache/thrift/pull/668 > JSON String reader doesn't recognize UTF-16 surrogate pairs > --- > > Key: THRIFT-3403 > URL: https://issues.apache.org

[jira] [Assigned] (THRIFT-3403) JSON String reader doesn't recognize UTF-16 surrogate pairs

2015-10-29 Thread Jens Geyer (JIRA)
[ https://issues.apache.org/jira/browse/THRIFT-3403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jens Geyer reassigned THRIFT-3403: -- Assignee: Jens Geyer > JSON String reader doesn't recognize UTF-16 surroga

[jira] [Commented] (THRIFT-3403) JSON String reader doesn't recognize UTF-16 surrogate pairs

2015-10-29 Thread ASF GitHub Bot (JIRA)
pull request: https://github.com/apache/thrift/pull/668 THRIFT-3403: Fixed JSON string reader doesn't recognize UTF-16 surrogate pairs Hi This should fix THRIFT-3403 by correctly decoding surrogate pairs to a single code point. You can merge this pull request into a Git repo
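
For readers unfamiliar with the term, decoding a surrogate pair to a single code point follows the standard UTF-16 arithmetic. The sketch below shows that arithmetic only; it is not the code from the pull request, and the function name `combine_surrogates` is hypothetical.

```python
def combine_surrogates(high: int, low: int) -> int:
    """Combine a UTF-16 high/low surrogate pair into one Unicode code point."""
    if not 0xD800 <= high <= 0xDBFF:
        raise ValueError(f"not a high surrogate: {high:#06x}")
    if not 0xDC00 <= low <= 0xDFFF:
        raise ValueError(f"not a low surrogate: {low:#06x}")
    # Each surrogate carries 10 bits; the result is offset by 0x10000.
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


# A JSON string such as "\uD83D\uDE00" carries the pair as two escapes.
code_point = combine_surrogates(0xD83D, 0xDE00)
print(f"U+{code_point:04X}")
```

A JSON reader that treats each `\uXXXX` escape as an independent character would emit two unpaired surrogates here instead of one supplementary character, which is the kind of failure the issue title describes.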

[jira] [Created] (THRIFT-3403) JSON String reader doesn't recognize UTF-16 surrogate pairs

2015-10-29 Thread Phongphan Phuttha (JIRA)
Phongphan Phuttha created THRIFT-3403: - Summary: JSON String reader doesn't recognize UTF-16 surrogate pairs Key: THRIFT-3403 URL: https://issues.apache.org/jira/browse/THRIFT-3403 Pr