[ 
https://issues.apache.org/jira/browse/HIVE-9660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15296784#comment-15296784
 ] 

Sergey Shelukhin commented on HIVE-9660:
----------------------------------------

[~owen.omalley] lots of ORC tests failed, and the failures may be related to 
the patch. It also looks like all the Tez tests got stuck; I'm not sure if 
that's related or just HiveQA (they didn't get stuck in other JIRAs, though).

> store end offset of compressed data for RG in RowIndex in ORC
> -------------------------------------------------------------
>
>                 Key: HIVE-9660
>                 URL: https://issues.apache.org/jira/browse/HIVE-9660
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>         Attachments: HIVE-9660.01.patch, HIVE-9660.02.patch, 
> HIVE-9660.03.patch, HIVE-9660.04.patch, HIVE-9660.05.patch, 
> HIVE-9660.06.patch, HIVE-9660.07.patch, HIVE-9660.07.patch, 
> HIVE-9660.08.patch, HIVE-9660.09.patch, HIVE-9660.10.patch, 
> HIVE-9660.10.patch, HIVE-9660.11.patch, HIVE-9660.patch, HIVE-9660.patch, 
> HIVE-9660.patch, owen-hive-9660.patch
>
>
> Right now the end offset is estimated, which in some cases results in tons of 
> extra data being read.
> We can add a separate array to RowIndex (positions_v2?) that stores the number 
> of compressed buffers for each RG, or the end offset, or something similar, to 
> remove this estimation magic.
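
To illustrate the difference the description is pointing at, here is a minimal Java sketch. It is not the ORC reader code or the patch attached to this issue. The class, the method names, the slop formula, and all the numbers are assumptions made up for the example. It only contrasts a conservative end-offset guess with an exact end offset stored per row group (RG).

```java
public class RowGroupEndOffsets {
    // Assumed size of a compressed-chunk header, in bytes (illustrative only).
    static final long HEADER_SIZE = 3;

    // Estimated end: the next RG's start offset plus one worst-case compressed
    // buffer, clamped to the stream length. The last RG reads to end of stream.
    static long estimatedEnd(long[] starts, int rg, long bufferSize, long streamLength) {
        if (rg == starts.length - 1) {
            return streamLength;
        }
        long guess = starts[rg + 1] + HEADER_SIZE + bufferSize;
        return Math.min(guess, streamLength);
    }

    // Exact end: looked up in a hypothetical per-RG array stored with the index.
    static long storedEnd(long[] ends, int rg) {
        return ends[rg];
    }

    public static void main(String[] args) {
        long[] starts = {0L, 100_000L, 200_000L};       // made-up RG start offsets
        long[] ends = {130_000L, 215_000L, 1_000_000L}; // made-up exact end offsets
        long bufferSize = 262_144L;                     // assumed compression buffer size
        long streamLength = 1_000_000L;

        for (int rg = 0; rg < starts.length; rg++) {
            long est = estimatedEnd(starts, rg, bufferSize, streamLength);
            long exact = storedEnd(ends, rg);
            System.out.println("RG " + rg + ": estimated end=" + est
                + ", stored end=" + exact + ", extra bytes=" + (est - exact));
        }
    }
}
```

In this sketch the estimate has to assume a full-size buffer after the next RG's start, so it reads past the real end whenever the actual buffer is smaller. A stored end offset removes that guess at the cost of one extra value per RG in the index.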



