Ryan Blue created PARQUET-180:
---------------------------------

             Summary: Parquet-thrift compile issue with 0.9.2.
                 Key: PARQUET-180
                 URL: https://issues.apache.org/jira/browse/PARQUET-180
             Project: Parquet
          Issue Type: Bug
            Reporter: Ryan Blue


Thrift 0.9.2 removed 
[{{setReadLength}}|https://github.com/apache/thrift/commit/2ca9c2028593782621c8876817d8772aa5f46ac7].
This breaks compilation of parquet-thrift, which calls the method on 
{{TBinaryProtocol}}. The call is defensive: a size is read from the data and 
then that many bytes are read, so setting a maximum read length causes an 
exception up front rather than a strange failure (such as an OOM) later on. 
The comment in the code also indicates that it is okay when the method can't 
be used.

{code}
    /* Reduce the chance of OOM when data is corrupted. When readBinary is
       called on TBinaryProtocol, it reads the length of the binary first,
       so if the data is corrupted, it could read a big integer as the length
       of the binary and therefore causes OOM to happen.
       Currently this fix only applies to TBinaryProtocol which has the
       setReadLength defined.
     */
    if (protocol instanceof TBinaryProtocol) {
      ((TBinaryProtocol)protocol).setReadLength(record.getLength());
    }
{code}

I think the fix is to remove the section above.
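
For reference, the kind of guard that {{setReadLength}} provided can be sketched without Thrift at all. The example below is only an illustration, not the parquet-thrift or Thrift implementation: {{BoundedBinaryReader}}, {{readBinary}}, and {{maxLength}} are hypothetical names, and the input is assumed to be a 4-byte big-endian length followed by that many bytes. It shows a corrupted length being rejected before any buffer is allocated.

```java
import java.io.DataInputStream;
import java.io.IOException;

/** Hypothetical helper for illustration; not part of parquet-thrift or Thrift. */
final class BoundedBinaryReader {
  private BoundedBinaryReader() {}

  /**
   * Reads a 4-byte big-endian length prefix, then that many bytes.
   * A negative length, or one larger than maxLength, is treated as
   * corruption and rejected before the buffer is allocated.
   */
  static byte[] readBinary(DataInputStream in, int maxLength) throws IOException {
    int length = in.readInt();
    if (length < 0 || length > maxLength) {
      throw new IOException(
          "Invalid binary length " + length + " (limit " + maxLength + ")");
    }
    byte[] buf = new byte[length];
    in.readFully(buf); // throws EOFException if fewer bytes remain
    return buf;
  }
}
```

Passing the record length as {{maxLength}} would mirror what the removed call did: a length prefix larger than the record itself cannot be valid, so it fails fast with an exception instead of attempting a huge allocation.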



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
