[ https://issues.apache.org/jira/browse/HIVE-7219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14035067#comment-14035067 ]
Lefty Leverenz commented on HIVE-7219:
--------------------------------------

This adds *hive.exec.orc.encoding.strategy* to HiveConf.java and hive-default.xml.template, so it needs to be added to Configuration Properties in the wiki for 0.14.0.

I've added a comment to HIVE-6586 so the parameter's definition will be included in the new version of HiveConf.java after HIVE-6037 gets committed.

Question: Should the following information from [~prasanth_j] also be documented, and if so, does it belong in the ORC wikidoc or with the parameter description in Configuration Properties?

{quote}
For integers, this patch will improve only very specific cases. If the encoding uses SHORT_REPEAT, DELTA (esp. fixed delta), PATCHED_BLOB then this patch will NOT have any effect, as these encodings do not use bit packing. The bit packed encodings like DIRECT, DELTA (variable delta) will see improvements.
{quote}

* [ORC Files | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+ORC]
* [Configuration Properties | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties]
** [put hive.exec.orc.encoding.strategy after hive.exec.orc.zerocopy | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.exec.orc.zerocopy]

> Improve performance of serialization utils in ORC
> -------------------------------------------------
>
>                 Key: HIVE-7219
>                 URL: https://issues.apache.org/jira/browse/HIVE-7219
>             Project: Hive
>          Issue Type: Improvement
>          Components: File Formats
>    Affects Versions: 0.14.0
>            Reporter: Prasanth J
>            Assignee: Prasanth J
>              Labels: TODOC14
>             Fix For: 0.14.0
>
>         Attachments: HIVE-7219.1.patch, HIVE-7219.2.patch, HIVE-7219.3.patch,
> HIVE-7219.4.patch, orc-read-perf-jmh-benchmark.png
>
>
> ORC uses serialization utils heavily for reading and writing data. The
> bitpacking and unpacking code in writeInts() and readInts() can be unrolled
> for better performance. Also double reader/writer performance can be improved
> by bulk reading/writing from/to byte array.
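
For readers unfamiliar with the second point in the description, the idea of bulk reading/writing doubles through a byte array can be sketched roughly as follows. This is only an illustration, not the code from the HIVE-7219 patches: the class and method names (BulkDoubleIo, writeDoubleByteWise, writeDoublesBulk, readDoublesBulk) are hypothetical, and the little-endian 8-byte layout is an assumption made for the example.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration only; names are not taken from the ORC code base.
final class BulkDoubleIo {

    // Baseline: emit one double as 8 little-endian bytes, one write() call per byte.
    public static void writeDoubleByteWise(ByteArrayOutputStream out, double value) {
        long bits = Double.doubleToLongBits(value);
        for (int i = 0; i < Double.BYTES; i++) {
            out.write((int) ((bits >>> (8 * i)) & 0xff));
        }
    }

    // Bulk variant: stage every value in a single byte array so the caller
    // can hand the whole array to the underlying stream in one call.
    public static byte[] writeDoublesBulk(double[] values) {
        ByteBuffer buf = ByteBuffer.allocate(values.length * Double.BYTES)
                .order(ByteOrder.LITTLE_ENDIAN);
        for (double v : values) {
            buf.putDouble(v);
        }
        return buf.array();
    }

    // Bulk read: decode all complete 8-byte groups from the array at once.
    public static double[] readDoublesBulk(byte[] bytes) {
        double[] result = new double[bytes.length / Double.BYTES];
        ByteBuffer.wrap(bytes)
                .order(ByteOrder.LITTLE_ENDIAN)
                .asDoubleBuffer()
                .get(result);
        return result;
    }

    public static void main(String[] args) {
        double[] values = {1.5, -2.25, 0.0, 1e300};
        byte[] encoded = writeDoublesBulk(values);
        double[] decoded = readDoublesBulk(encoded);
        System.out.println(encoded.length + " bytes, " + decoded.length + " values decoded");
    }
}
```

Both write paths produce the same bytes for ordinary (non-NaN) values; the bulk path simply replaces many single-byte stream calls with one array-sized transfer, which is where the improvement described above would be expected to come from.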