[
https://issues.apache.org/jira/browse/HIVE-4123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14090193#comment-14090193
]
Lefty Leverenz commented on HIVE-4123:
--------------------------------------
Doc questions: Would it be okay to restore part of the original description
for *hive.exec.orc.write.format* in the wiki (and later in HiveConf.java)?
* current description is just "Define the version of the file to write", which
gives no idea of the possible values (the default is null) and does not make
clear that "version of the file" means a Hive release version
* original description was "use 0.11 version of RLE encoding. if this conf is
not defined or any other value specified, ORC will use the new RLE encoding"
So I'd like to add "Possible values are 0.11, 0.12, etc. If this parameter is
not defined, ORC will use the RLE encoding introduced in Hive 0.12. Any value
other than 0.11 results in the 0.12 encoding."
Is that accurate? Can releases be specified as "0.12.0" or "0.13.1"?
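To make the question concrete, here is a minimal sketch of the rule exactly as
the proposed wording states it. It is not taken from the Hive source:
resolve_rle_encoding is a hypothetical helper name, and the exact string
comparison against "0.11" is an assumption about how the value would be
matched.

```python
from typing import Optional


def resolve_rle_encoding(write_format: Optional[str]) -> str:
    """Hypothetical helper, not Hive code.

    Maps a hive.exec.orc.write.format value to the RLE encoding that the
    proposed wording describes: only the exact value "0.11" selects the old
    encoding; an undefined value or any other value selects the 0.12 one.
    """
    if write_format == "0.11":
        return "0.11"
    return "0.12"


# Undefined, "0.12", and release-style strings all fall through to 0.12
# under this literal reading of the proposed description.
for value in (None, "0.11", "0.12", "0.12.0", "0.13.1"):
    print(repr(value), "->", resolve_rle_encoding(value))
```

Under this literal reading, "0.12.0" and "0.13.1" would be accepted but would
simply select the 0.12 encoding, which is why the question about release-style
values matters for the documentation.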
> The RLE encoding for ORC can be improved
> ----------------------------------------
>
> Key: HIVE-4123
> URL: https://issues.apache.org/jira/browse/HIVE-4123
> Project: Hive
> Issue Type: New Feature
> Components: File Formats
> Affects Versions: 0.12.0
> Reporter: Owen O'Malley
> Assignee: Prasanth J
> Labels: TODOC12, orcfile
> Fix For: 0.12.0
>
> Attachments: HIVE-4123-8.patch, HIVE-4123.1.git.patch.txt,
> HIVE-4123.2.git.patch.txt, HIVE-4123.3.patch.txt, HIVE-4123.4.patch.txt,
> HIVE-4123.5.txt, HIVE-4123.6.txt, HIVE-4123.7.txt, HIVE-4123.8.txt,
> HIVE-4123.8.txt, HIVE-4123.patch.txt, ORC-Compression-Ratio-Comparison.xlsx
>
>
> The run length encoding of integers can be improved:
> * tighter bit packing
> * allow delta encoding
> * allow longer runs
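
As a rough illustration of why delta encoding and tighter bit packing help, the
toy sketch below stores a sorted integer sequence as a base value plus small
differences. It is not the ORC encoding from the patch: the function names are
hypothetical, and it assumes a non-empty, non-decreasing input so that every
delta is non-negative.

```python
from typing import List, Tuple


def delta_encode(values: List[int]) -> Tuple[int, List[int]]:
    """Return the first value and the successive differences."""
    if not values:
        raise ValueError("values must be non-empty")
    deltas = [b - a for a, b in zip(values, values[1:])]
    return values[0], deltas


def bits_needed(numbers: List[int]) -> int:
    """Smallest fixed bit width that holds every non-negative number."""
    return max((n.bit_length() for n in numbers), default=0)


def delta_decode(base: int, deltas: List[int]) -> List[int]:
    """Rebuild the original sequence from the base and the differences."""
    out = [base]
    for d in deltas:
        out.append(out[-1] + d)
    return out


values = [1000, 1002, 1005, 1006, 1010]
base, deltas = delta_encode(values)

# Packing the raw values needs as many bits as the largest value requires;
# packing the deltas needs only as many bits as the largest difference.
print("bits per raw value:", bits_needed(values))
print("bits per delta:", bits_needed(deltas))
assert delta_decode(base, deltas) == values
```

The point of the sketch is only that small differences fit in far fewer bits
than the original values; the actual encoding in the attached patches chooses
among several strategies and is more involved.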