[ 
https://issues.apache.org/jira/browse/HIVE-25453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ádám Szita resolved HIVE-25453.
-------------------------------
    Fix Version/s: 4.0.0
       Resolution: Fixed

Committed to master. Thanks for the review [~pvary]!

> Add LLAP IO support for Iceberg ORC tables
> ------------------------------------------
>
>                 Key: HIVE-25453
>                 URL: https://issues.apache.org/jira/browse/HIVE-25453
>             Project: Hive
>          Issue Type: New Feature
>            Reporter: Ádám Szita
>            Assignee: Ádám Szita
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>          Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> Adding support for reading Iceberg ORC tables via LLAP.
> The easy part is swapping out the plain VectorizedOrcRecordReader for 
> LlapRecordReader.
> The hard part is maintaining correctness even after a series of schema 
> changes that are normally allowed for Iceberg/ORC tables, but were not 
> allowed for plain ORC tables and therefore not for LLAP either. To make it 
> all work, LLAP had to be made to support broader schema evolution.
> Before this change, LLAP made the simple assumption that the reader and file 
> schemas match on all columns. Now separate physical and logical read schemas 
> and corresponding include lists are used instead. Also added 
> logicalOrderedColumnIds, which holds indices from the reader schema, but 
> in file schema order. This is a necessary tool for mapping the results 
> produced by LLAP, as LLAP always reads columns in the order in which they 
> are written out in the file.
> Also added a new CLI driver class for testing the cached reads from 
> Iceberg/ORC tables via LLAP.
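
As a rough illustration of the logicalOrderedColumnIds idea described above (reader schema indices, listed in file schema order), here is a minimal Java sketch. It is not the code committed for this issue: the class name, the method signature, and the choice to match columns by integer field ID are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not the HIVE-25453 implementation.
public class LogicalOrderSketch {

  // Returns indices into the reader schema, listed in file schema order.
  // Assumption: columns are matched by integer field ID.
  public static List<Integer> logicalOrderedColumnIds(int[] fileFieldIds, int[] readerFieldIds) {
    // Map each field ID to its position in the reader (logical) schema.
    Map<Integer, Integer> readerIndexByFieldId = new HashMap<>();
    for (int i = 0; i < readerFieldIds.length; i++) {
      readerIndexByFieldId.put(readerFieldIds[i], i);
    }
    // Walk the file (physical) schema in its written order.
    List<Integer> result = new ArrayList<>();
    for (int fieldId : fileFieldIds) {
      Integer readerIndex = readerIndexByFieldId.get(fieldId);
      if (readerIndex != null) {
        result.add(readerIndex); // column exists in both schemas
      }
      // Columns dropped from the reader schema are skipped here.
    }
    return result;
  }

  public static void main(String[] args) {
    int[] file = {1, 2, 3};   // file was written with columns a, b, c
    int[] reader = {3, 1, 4}; // table now reads c, a, d (b dropped, d added)
    System.out.println(logicalOrderedColumnIds(file, reader)); // prints [1, 0]
  }
}
```

In this sketch, the result [1, 0] says that the first column coming back in file order (a) belongs at reader position 1 and the second (c) at reader position 0. Reader column d has no counterpart in the file, so a caller would have to fill it separately, for example with nulls.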


