[ 
https://issues.apache.org/jira/browse/THRIFT-4207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16223556#comment-16223556
 ] 

ASF GitHub Bot commented on THRIFT-4207:
----------------------------------------

Github user nsuke commented on a diff in the pull request:

    https://github.com/apache/thrift/pull/1274#discussion_r147556105
  
    --- Diff: lib/py/src/ext/protocol.tcc ---
    @@ -419,18 +419,30 @@ bool ProtocolBase<Impl>::encodeValue(PyObject* value, TType type, PyObject* type
     
       case T_STRING: {
         ScopedPyObject nval;
    +    Py_ssize_t len;
    +    char *str;
     
         if (PyUnicode_Check(value)) {
           nval.reset(PyUnicode_AsUTF8String(value));
           if (!nval) {
             return false;
           }
         } else {
    +      if (isUtf8(typeargs)) {
    +        if (PyBytes_AsStringAndSize(value, &str, &len) < 0) {
    +          return false;
    +        }
    +        // Check that input is a valid UTF-8 string.
    +        nval.reset(PyUnicode_DecodeUTF8(str, len, 0));
    +        if (!nval) {
    +          return false;
    +        }
    +      }
    --- End diff --
    
    Doesn't this hurt performance for every user who passes relatively large UTF-8-encoded `bytes`?
    The problem might be that we're not rejecting `bytes` in the first place, although "fixing" that wouldn't be backward compatible.
    What do you think?
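
    One possible way to keep the validation without building a throwaway unicode object is a validate-only scan over the raw buffer. The following is a minimal sketch under stated assumptions: `isValidUtf8` is a hypothetical helper (not part of Thrift or the CPython API) that would take the `str`/`len` pair obtained from `PyBytes_AsStringAndSize` and check well-formedness per RFC 3629. Whether it is actually faster than `PyUnicode_DecodeUTF8` has not been measured here.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper, for illustration only (not part of Thrift).
// Returns true if buf[0..len) is well-formed UTF-8 per RFC 3629.
// It allocates nothing and touches each byte once.
bool isValidUtf8(const char* buf, std::size_t len) {
  const unsigned char* p = reinterpret_cast<const unsigned char*>(buf);
  std::size_t i = 0;
  while (i < len) {
    const unsigned char lead = p[i];
    if (lead < 0x80) {  // ASCII fast path
      ++i;
      continue;
    }
    std::size_t extra;   // number of continuation bytes expected
    std::uint32_t cp;    // code point being decoded
    std::uint32_t min;   // smallest code point legal for this length
    if ((lead & 0xE0) == 0xC0) {
      extra = 1; cp = lead & 0x1F; min = 0x80;
    } else if ((lead & 0xF0) == 0xE0) {
      extra = 2; cp = lead & 0x0F; min = 0x800;
    } else if ((lead & 0xF8) == 0xF0) {
      extra = 3; cp = lead & 0x07; min = 0x10000;
    } else {
      return false;  // stray continuation byte or invalid lead byte
    }
    if (len - i <= extra) {
      return false;  // sequence is truncated at the end of the buffer
    }
    for (std::size_t k = 1; k <= extra; ++k) {
      const unsigned char cont = p[i + k];
      if ((cont & 0xC0) != 0x80) {
        return false;  // not a continuation byte
      }
      cp = (cp << 6) | (cont & 0x3F);
    }
    if (cp < min) return false;                      // overlong encoding
    if (cp > 0x10FFFF) return false;                 // beyond Unicode range
    if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // UTF-16 surrogate
    i += extra + 1;
  }
  return true;
}
```

    Unlike `PyUnicode_DecodeUTF8`, this scan does not set a Python exception, so a caller would need to raise one itself (for example with `PyErr_SetString`) before returning `false`.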


> Accelerated version of TBinaryProtocol allows invalid input to string fields.
> -----------------------------------------------------------------------------
>
>                 Key: THRIFT-4207
>                 URL: https://issues.apache.org/jira/browse/THRIFT-4207
>             Project: Thrift
>          Issue Type: Bug
>          Components: Python - Library
>    Affects Versions: 0.10.0
>            Reporter: Elvis Pranskevichus
>            Assignee: James E. King, III
>             Fix For: 0.11.0
>
>
> {{TBinaryProtocolAccelerated}} and {{TCompactProtocolAccelerated}} currently 
> accept arbitrary bytes as input to string fields even when {{py:utf8strings}} 
> is on.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
