[ 
https://issues.apache.org/jira/browse/DRILL-3938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14965248#comment-14965248
 ] 

ASF GitHub Bot commented on DRILL-3938:
---------------------------------------

GitHub user vkorukanti opened a pull request:

    https://github.com/apache/drill/pull/211

    DRILL-3938: Support reading from Hive tables that have schema altered after 
the creation

    When a table schema is altered after a partition is created, Hive creates a 
converter that maps the partition schema to the table schema. The mapping 
behaves as follows:
     - if a column doesn't exist in the partition schema, its value is treated 
as null
     - if a column's type doesn't match the required type, the value is 
converted using the convert methods available in Hive.
    
    Currently we have to rely on the Hive converters, because Drill doesn't 
have the same convert methods that Hive has [1]. 
    
    [1] 
https://github.com/apache/hive/blob/branch-1.0/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/PrimitiveObjectInspectorUtils.java
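
The two mapping rules above can be sketched roughly as follows. This is an illustration only, not code from the patch and not Hive's API: the `CONVERTERS` table, the `to_table_row` function, and the type names are hypothetical stand-ins for the converters that Hive provides in [1].

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Schema = List[Tuple[str, str]]  # ordered (column name, type name) pairs

# Hypothetical stand-in for Hive's primitive converters, keyed by
# (partition column type, table column type). Hive's real set is far larger.
CONVERTERS: Dict[Tuple[str, str], Callable[[Any], Any]] = {
    ("int", "string"): str,
    ("int", "double"): float,
}


def to_table_row(partition_row: Dict[str, Any],
                 partition_schema: Schema,
                 table_schema: Schema) -> List[Optional[Any]]:
    """Map one row read with the partition schema onto the table schema."""
    partition_types = dict(partition_schema)
    result: List[Optional[Any]] = []
    for name, table_type in table_schema:
        # Rule 1: a column missing from the partition schema reads as null.
        if name not in partition_types:
            result.append(None)
            continue
        value = partition_row.get(name)
        partition_type = partition_types[name]
        if value is None or partition_type == table_type:
            # Nulls and matching types pass through unchanged.
            result.append(value)
            continue
        # Rule 2: a type mismatch goes through a converter.
        converter = CONVERTERS.get((partition_type, table_type))
        if converter is None:
            raise ValueError(
                f"no converter from {partition_type} to {table_type}")
        result.append(converter(value))
    return result


# A partition written before `newcol` was added to the table.
old_partition_schema = [("key", "int"), ("value", "string")]
current_table_schema = [("key", "int"), ("value", "string"),
                        ("newcol", "string")]
row = to_table_row({"key": 1, "value": "val_1"},
                   old_partition_schema, current_table_schema)
print(row)
```

Under these assumptions, a row from a partition created before `ALTER TABLE ... ADD COLUMNS` keeps its existing values and gets a null for the added column, rather than failing the read.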
    
    Also:
    + Remove "redoRecord" logic which is not needed after "automatic 
reallocation" (DRILL-1960) changes.
    + Remove HiveTestRecordReader. Its implementation is incomplete and it is 
not used anywhere. It is currently just
      a burden to maintain as its superclass HiveRecordReader changes

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/vkorukanti/drill DRILL-3938

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/211.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #211
    
----
commit 21c4325abcb25ce22846217e61eee23816f73571
Author: vkorukanti <venki.koruka...@gmail.com>
Date:   2015-10-19T18:35:09Z

    DRILL-3938: Support reading from Hive tables that have schema altered after 
the creation
    
    Also:
    + Remove "redoRecord" logic which is not needed after "automatic 
reallocation" (DRILL-1960) changes.
    + Remove HiveTestRecordReader. This is incomplete in implementation and not 
used anywhere. It is currently just
      a burden to maintain with changes in its superclass HiveRecordReader

----


> Hive: Failure reading from a partition when a new column is added to the 
> table after the partition creation
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: DRILL-3938
>                 URL: https://issues.apache.org/jira/browse/DRILL-3938
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Hive
>    Affects Versions: 0.4.0
>            Reporter: Venki Korukanti
>            Assignee: Venki Korukanti
>             Fix For: 1.3.0
>
>
> Repro:
> From Hive:
> {code}
> CREATE TABLE kv(key INT, value STRING);
> LOAD DATA LOCAL INPATH 
> '/Users/hadoop/apache-repos/hive-install/apache-hive-1.0.0-bin/examples/files/kv1.txt'
>  INTO TABLE kv;
> CREATE TABLE kv_p(key INT, value STRING, part1 STRING);
> set hive.exec.dynamic.partition.mode=nonstrict;
> set hive.exec.max.dynamic.partitions=10000;
> set hive.exec.max.dynamic.partitions.pernode=10000;
> INSERT INTO TABLE kv_p PARTITION (part1) SELECT key, value, value as s FROM 
> kv;
> ALTER TABLE kv_p ADD COLUMNS (newcol STRING);
> {code}
> From Drill:
> {code}
> USE hive;
> DESCRIBE kv_p;
> SELECT newcol FROM kv_p;
> -- Throws a column 'newcol' not found error in HiveRecordReader while 
> -- selecting only the projected columns.
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
