[ https://issues.apache.org/jira/browse/PARQUET-180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14307775#comment-14307775 ]

Ryan Blue commented on PARQUET-180:
-----------------------------------

That works? I didn't think linking was done lazily like that. Either way, we 
can definitely do this through reflection if it's valuable.
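
For reference, a minimal sketch of what the reflection approach could look like. 
This is only an illustration, not the parquet-thrift code: ReadLengthGuard, 
OldProtocol, NewProtocol, and trySetReadLength are hypothetical names, and the 
two nested classes stand in for TBinaryProtocol before and after the method was 
removed. The idea is to look up setReadLength(int) at runtime and skip the 
defensive check when the method is not there.

```java
import java.lang.reflect.Method;

public class ReadLengthGuard {

  /** Hypothetical stand-in for a protocol class that still has setReadLength. */
  public static class OldProtocol {
    public int readLength = -1;

    public void setReadLength(int readLength) {
      this.readLength = readLength;
    }
  }

  /** Hypothetical stand-in for a protocol class where the method was removed. */
  public static class NewProtocol {
  }

  /**
   * Calls protocol.setReadLength(length) if a public method with that
   * signature exists on the object's class.
   *
   * @return true if the limit was applied, false if the method is missing
   *         or could not be invoked
   */
  public static boolean trySetReadLength(Object protocol, int length) {
    try {
      // Resolved by name at runtime, so this compiles against either version.
      Method setter = protocol.getClass().getMethod("setReadLength", int.class);
      setter.invoke(protocol, length);
      return true;
    } catch (ReflectiveOperationException e) {
      // Method not present (or not callable): skip the defensive limit.
      return false;
    }
  }
}
```

In the real code the call site would presumably keep the existing instanceof 
check and pass record.getLength() as the limit. In production the Method lookup 
could be cached rather than repeated for every record.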

> Parquet-thrift compile issue with 0.9.2.
> ----------------------------------------
>
>                 Key: PARQUET-180
>                 URL: https://issues.apache.org/jira/browse/PARQUET-180
>             Project: Parquet
>          Issue Type: Bug
>            Reporter: Ryan Blue
>
> Thrift 0.9.2 removed 
> [{{setReadLength}}|https://github.com/apache/thrift/commit/2ca9c2028593782621c8876817d8772aa5f46ac7].
> This causes parquet-thrift to fail to compile, because it calls that method on TBinaryProtocol. 
> The reason we use it is defensive: a size is read from the data and then that 
> many bytes are read, so using this method sets a maximum and causes an 
> exception rather than a strange failure later on. The code also has a comment 
> that says it is okay when it can't be used.
> {code}
>     /* Reduce the chance of OOM when data is corrupted. When readBinary is
>        called on TBinaryProtocol, it reads the length of the binary first,
>        so if the data is corrupted, it could read a big integer as the length
>        of the binary and therefore causes OOM to happen.
>        Currently this fix only applies to TBinaryProtocol which has the
>        setReadLength defined.
>      */
>     if (protocol instanceof TBinaryProtocol) {
>       ((TBinaryProtocol)protocol).setReadLength(record.getLength());
>     }
> {code}
> I think the fix is to remove the section above.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
