[ https://issues.apache.org/jira/browse/THRIFT-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13885577#comment-13885577 ]
Alexander Steshenko commented on THRIFT-2336:
---------------------------------------------

TJsonProtocol does seem to know how to handle Unicode-escaped characters, but it expects exactly two zeros after {{\u}}. From {{private TByteArrayOutputStream readJSONString(boolean skipContext)}}:

{code}
if (ch == ESCSEQ[0]) {
  ch = reader_.read();
  if (ch == ESCSEQ[1]) {
    readJSONSyntaxChar(ZERO);
    readJSONSyntaxChar(ZERO);
    trans_.readAll(tmpbuf_, 0, 2);
{code}

> UTF-8 sent by PHP as JSON is not understood by JAVA's TJsonProtocol
> -------------------------------------------------------------------
>
>                 Key: THRIFT-2336
>                 URL: https://issues.apache.org/jira/browse/THRIFT-2336
>             Project: Thrift
>          Issue Type: Bug
>            Reporter: Alexander Steshenko
>
> This is similar to THRIFT-2285.
> Whenever I have our Thrift-For-Php send non-Latin UTF-8 characters, e.g.
> "Русское Название" (Russian), I get this:
> {noformat}
> {"3":{"str":"\u0420\u0443\u0441\u0441\u043a\u043e\u0435 \u041d\u0430\u0437\u0432\u0430\u043d\u0438\u0435"},"6":{"tf":0}}
> {noformat}
> which is perfectly valid JSON, and I don't mind it being encoded like that.
> Java fails with
> {noformat}
> Caused by: ! org.apache.thrift.protocol.TProtocolException: Unexpected character:4
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
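One reading of the readJSONString excerpt quoted in the comment: for an escape such as \u0420, the first readJSONSyntaxChar(ZERO) would consume the 0 and the second would meet 4, which is consistent with the reported "Unexpected character:4". As an illustration only, here is a minimal sketch of decoding that reads all four hex digits instead of requiring two leading zeros. The class JsonUnicodeUnescape and its methods are hypothetical names and not part of Thrift; the sketch works on a String rather than on Thrift's transport and reader, and it leaves every escape other than \u untouched.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Thrift: decodes JSON backslash-u escapes
// without assuming that the first two hex digits are zero.
final class JsonUnicodeUnescape {

    // Replaces each backslash-u escape with the UTF-16 code unit it names.
    // Other escapes (such as backslash-n) are copied through unchanged.
    static String unescape(String json) {
        StringBuilder out = new StringBuilder(json.length());
        int i = 0;
        while (i < json.length()) {
            char ch = json.charAt(i);
            if (ch == '\\' && i + 1 < json.length()) {
                char next = json.charAt(i + 1);
                if (next == 'u') {
                    if (i + 6 > json.length()) {
                        throw new IllegalArgumentException("Truncated unicode escape at index " + i);
                    }
                    // Read all four hex digits, whatever their values.
                    int codeUnit = 0;
                    for (int k = i + 2; k < i + 6; k++) {
                        int digit = Character.digit(json.charAt(k), 16);
                        if (digit < 0) {
                            throw new IllegalArgumentException("Bad hex digit at index " + k);
                        }
                        codeUnit = (codeUnit << 4) | digit;
                    }
                    // Surrogate halves are appended as-is; two consecutive
                    // escapes then form a valid pair in the Java String.
                    out.append((char) codeUnit);
                    i += 6;
                } else {
                    // Some other escape: copy both characters and skip them,
                    // so an escaped backslash followed by 'u' is not misread.
                    out.append(ch).append(next);
                    i += 2;
                }
            } else {
                out.append(ch);
                i++;
            }
        }
        return out.toString();
    }

    // Convenience wrapper: decoded text as UTF-8 bytes.
    static byte[] toUtf8(String json) {
        return unescape(json).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The first word of the string from the report, as PHP escaped it.
        String escaped = "\\u0420\\u0443\\u0441\\u0441\\u043a\\u043e\\u0435";
        System.out.println(unescape(escaped));
        System.out.println(toUtf8(escaped).length + " UTF-8 bytes");
    }
}
```

Parsing the full four-digit value means Cyrillic code units such as 0420 go through the same path as the 00xx range. Whether a fix inside TJsonProtocol should take this shape is a separate question for the Thrift maintainers.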