[ 
https://issues.apache.org/jira/browse/HIVE-7219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14037771#comment-14037771
 ] 

Prasanth J commented on HIVE-7219:
----------------------------------

bq. Question: Should the following information from Prasanth J also be 
documented, and if so does it belong in the ORC wikidoc or with the parameter 
description in Configuration Properties?
bq. For integers, this patch will improve only very specific cases. If the 
encoding is SHORT_REPEAT, DELTA (esp. fixed delta), or PATCHED_BASE, this 
patch will NOT have any effect, as these encodings do not use bit packing. 
Bit-packed encodings like DIRECT and DELTA (variable delta) will see 
improvements.

I think these details are too specific to be put into user documentation.

> Improve performance of serialization utils in ORC
> -------------------------------------------------
>
>                 Key: HIVE-7219
>                 URL: https://issues.apache.org/jira/browse/HIVE-7219
>             Project: Hive
>          Issue Type: Improvement
>          Components: File Formats
>    Affects Versions: 0.14.0
>            Reporter: Prasanth J
>            Assignee: Prasanth J
>              Labels: TODOC14
>             Fix For: 0.14.0
>
>         Attachments: HIVE-7219.1.patch, HIVE-7219.2.patch, HIVE-7219.3.patch, 
> HIVE-7219.4.patch, orc-read-perf-jmh-benchmark.png
>
>
> ORC relies heavily on its serialization utils for reading and writing data. The 
> bit packing and unpacking code in writeInts() and readInts() can be unrolled 
> for better performance. The double reader/writer can also be made faster by 
> reading and writing in bulk from/to a byte array.
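
For readers unfamiliar with the two optimizations named in the issue description, here is a minimal sketch of the general idea. It is not taken from the attached patches: the class and method names (BitUnpackSketch, unpackGeneric, unpack4, readDoubles) are hypothetical, and the sketch assumes values are bit packed most significant bit first and that doubles are stored as 8 little-endian bytes.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration only; not the code from HIVE-7219.
final class BitUnpackSketch {

    // Generic path: assemble each value one bit at a time (MSB first).
    // Works for any bit width but pays for an inner loop per value.
    static void unpackGeneric(byte[] in, int bitWidth, long[] out, int count) {
        int bitPos = 0;
        for (int i = 0; i < count; i++) {
            long v = 0;
            for (int b = 0; b < bitWidth; b++, bitPos++) {
                int bit = (in[bitPos >>> 3] >>> (7 - (bitPos & 7))) & 1;
                v = (v << 1) | bit;
            }
            out[i] = v;
        }
    }

    // Unrolled path for a fixed width of 4 bits: each input byte yields
    // exactly two values, so the per-bit inner loop disappears.
    static void unpack4(byte[] in, long[] out, int count) {
        int i = 0;
        int p = 0;
        for (; i + 1 < count; i += 2, p++) {
            int b = in[p] & 0xff;
            out[i] = b >>> 4;       // high nibble
            out[i + 1] = b & 0x0f;  // low nibble
        }
        if (i < count) {
            // Odd count: only the high nibble of the last byte is used.
            out[i] = (in[p] & 0xff) >>> 4;
        }
    }

    // Bulk double decode: view the byte array as little-endian doubles and
    // copy them out in one call instead of reading byte by byte.
    static void readDoubles(byte[] in, double[] out, int count) {
        ByteBuffer.wrap(in)
                  .order(ByteOrder.LITTLE_ENDIAN)
                  .asDoubleBuffer()
                  .get(out, 0, count);
    }
}
```

Both unpack methods produce the same output for 4-bit input; the unrolled variant simply trades generality for fewer branches and shifts, which is the kind of gain the description refers to for bit-packed encodings.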



--
This message was sent by Atlassian JIRA
(v6.2#6252)
