[ https://issues.apache.org/jira/browse/HIVE-7231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14034434#comment-14034434 ]
Gopal V commented on HIVE-7231:
-------------------------------
This approach results in stray writes across the stripe boundaries.
I think it needs to be revisited so that the ORC stripe size is decoupled
from the HDFS block size.
The stripe size still needs to be a factor of the HDFS block size, but the
fraction should not stay fixed at 0.5x.
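
To illustrate why the 0.5x fraction matters, here is a rough sketch of the
padding arithmetic. It is a toy layout model, not the ORC writer's actual
logic: the function name padding_overhead, the rule "pad to the next block
whenever a stripe would straddle a boundary", and the sizes (in MB) are all
assumptions made for the example.

```python
def padding_overhead(stripe_sizes, block_size):
    """Return (padding, file_length) for a simplified layout model.

    Assumption: a stripe that does not fit in the space left in the
    current block is pushed to the next block boundary, and the gap is
    filled with padding. Stripes larger than a block are never padded.
    """
    offset = 0
    padding = 0
    for size in stripe_sizes:
        remaining = block_size - (offset % block_size)
        if remaining < size <= block_size:
            # Stripe would straddle the block boundary: pad up to it.
            padding += remaining
            offset += remaining
        offset += size
    return padding, offset


# Stripes nominally 0.5x of a 256 MB block that come out slightly large.
half = padding_overhead([130] * 4, 256)
# Stripes nominally 0.25x of the same block, also slightly large.
quarter = padding_overhead([66] * 8, 256)

print(half, quarter)
```

In this model, stripes that overshoot a 0.5x target by a small amount leave
almost half of every block as padding, while a smaller fraction of the same
block size loses much less space for a similar amount of data.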
> Improve ORC padding
> -------------------
>
> Key: HIVE-7231
> URL: https://issues.apache.org/jira/browse/HIVE-7231
> Project: Hive
> Issue Type: Improvement
> Components: File Formats
> Affects Versions: 0.14.0
> Reporter: Prasanth J
> Assignee: Prasanth J
> Labels: orcfile
> Attachments: HIVE-7231.1.patch, HIVE-7231.2.patch
>
>
> Current ORC padding is not optimal because stripe sizes are fixed within a
> block. The padding overhead will be significant in some cases. Also, the
> padding percentage relative to the stripe size is not configurable.