[
https://issues.apache.org/jira/browse/TEZ-500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13974689#comment-13974689
]
Gopal V commented on TEZ-500:
-----------------------------
Is this still a problem?
I turned on RLE today to test out my theory and did a few ETL loads with it.
> RLE in IFile does not seem to work correctly
> --------------------------------------------
>
> Key: TEZ-500
> URL: https://issues.apache.org/jira/browse/TEZ-500
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Siddharth Seth
>
> The compressed length reported by the writer is typically larger than the
> uncompressed length. The size of the generated output file matches the
> uncompressed length.
> The Shuffle fetchers allocate buffers based on the compressed length, and
> pull that much data. As a result the entire contents are not pulled in.
> Also, even if the entire content is pulled in, nextRawKey and nextRawValue
> end up failing the moment a repeated key is hit.
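
To illustrate the kind of invariant the description above relies on, here is a minimal sketch of run-length encoding for repeated keys. It is not Tez's IFile code: the record layout, the RLE_MARKER and EOF_MARKER sentinels, and the function names are all hypothetical. It only shows that the length handed to a reader has to match the bytes actually written, and that the reader has to recognise the repeated-key marker.

```python
import io
import struct

# Hypothetical sentinels for this sketch only (not taken from Tez).
RLE_MARKER = -2   # "key is the same as in the previous record"
EOF_MARKER = -1   # end of stream


def write_records(records, rle=True):
    """Serialize (key, value) byte pairs.

    Returns (payload, raw_length), where raw_length is the size the
    stream would have had if every key were written in full.
    """
    out = io.BytesIO()
    raw_length = 0
    prev_key = None
    for key, value in records:
        raw_length += 8 + len(key) + len(value)
        if rle and key == prev_key:
            # Repeated key: write the marker instead of the key bytes.
            out.write(struct.pack(">ii", RLE_MARKER, len(value)))
        else:
            out.write(struct.pack(">ii", len(key), len(value)))
            out.write(key)
        out.write(value)
        prev_key = key
    out.write(struct.pack(">ii", EOF_MARKER, EOF_MARKER))
    raw_length += 8
    return out.getvalue(), raw_length


def read_records(payload):
    """Decode a payload produced by write_records.

    Raises EOFError if the payload was cut short, e.g. because the
    caller sized its buffer from the wrong length.
    """
    buf = io.BytesIO(payload)
    prev_key = None
    records = []
    while True:
        header = buf.read(8)
        if len(header) < 8:
            raise EOFError("truncated stream: missing record header")
        key_len, val_len = struct.unpack(">ii", header)
        if key_len == EOF_MARKER:
            return records
        if key_len == RLE_MARKER:
            key = prev_key          # reuse the previous key
        else:
            key = buf.read(key_len)
            if len(key) < key_len:
                raise EOFError("truncated stream: incomplete key")
        value = buf.read(val_len)
        if len(value) < val_len:
            raise EOFError("truncated stream: incomplete value")
        records.append((key, value))
        prev_key = key
```

In this sketch, a consumer that allocates its buffer from any length other than len(payload) either reads too little and hits EOFError, or reserves more space than the stream needs.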