[jira] [Commented] (DRILL-3938) Hive: Failure reading from a partition when a new column is added to the table after the partition creation

ASF GitHub Bot (JIRA) Mon, 30 Nov 2015 17:52:18 -0800

    [ https://issues.apache.org/jira/browse/DRILL-3938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15032878#comment-15032878 ]


ASF GitHub Bot commented on DRILL-3938:
---------------------------------------

Github user mehant commented on a diff in the pull request:

    https://github.com/apache/drill/pull/211#discussion_r46231527
  
    --- Diff: contrib/storage-hive/core/src/main/java/org/apache/drill/exec/planner/sql/logical/ConvertHiveParquetScanToDrillParquetScan.java ---
    @@ -101,12 +101,24 @@ public boolean matches(RelOptRuleCall call) {
           return true;
         }
     
    +    final List<FieldSchema> tableSchema = hiveTable.getSd().getCols();
         // Make sure all partitions have the same input format as the table input format
         for (HivePartition partition : partitions) {
    -      Class<? extends InputFormat> inputFormat = getInputFormatFromSD(hiveTable, partition.getPartition().getSd());
    +      final StorageDescriptor partitionSD = partition.getPartition().getSd();
    +      Class<? extends InputFormat> inputFormat = getInputFormatFromSD(hiveTable, partitionSD);
           if (inputFormat == null || !inputFormat.equals(tableInputFormat)) {
             return false;
           }
    +
    +      // Make sure the schema of the table and schema of the partition matches. If not return false. Currently native
    --- End diff --
    
    Could you add a brief comment noting that the schemas of a partition and 
its table can diverge because of "alter table" statements, and that we would 
need the converter functions if the type of a column has been changed via 
alter table?
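
To illustrate the check under discussion, here is a minimal, self-contained sketch of comparing a table's column list against a partition's column list. It is not the code from the pull request: `SchemaCheck`, `Column`, and `schemasMatch` are hypothetical names, and `Column` is only a stand-in for Hive's `FieldSchema` (a column name plus a type string). The sketch assumes the schemas must agree column by column, in order, in both name and type.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

public class SchemaCheck {

    /** Hypothetical stand-in for Hive's FieldSchema: a column name and a type string. */
    static final class Column {
        final String name;
        final String type;

        Column(String name, String type) {
            this.name = name;
            this.type = type;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Column)) {
                return false;
            }
            Column other = (Column) o;
            return name.equals(other.name) && type.equals(other.type);
        }

        @Override
        public int hashCode() {
            return Objects.hash(name, type);
        }
    }

    /**
     * Returns true only if both schemas have the same columns, in the same
     * order, with the same types. An added column ("alter table ... add
     * columns") or a changed column type makes this return false.
     */
    static boolean schemasMatch(List<Column> tableCols, List<Column> partitionCols) {
        // List.equals compares sizes and then elements pairwise, in order.
        return tableCols.equals(partitionCols);
    }

    public static void main(String[] args) {
        // Table schema after "ALTER TABLE kv_p ADD COLUMNS (newcol STRING)".
        List<Column> table = Arrays.asList(
            new Column("key", "int"),
            new Column("value", "string"),
            new Column("newcol", "string"));

        // A partition created before the ALTER still carries the old schema.
        List<Column> oldPartition = Arrays.asList(
            new Column("key", "int"),
            new Column("value", "string"));

        System.out.println(schemasMatch(table, table));        // same schema
        System.out.println(schemasMatch(table, oldPartition)); // column added later
    }
}
```

With the repro from this issue, the partition written before the `ALTER TABLE` has two columns while the table has three, so a check of this shape would report a mismatch and the rule's `matches` method could return false for that scan.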


> Hive: Failure reading from a partition when a new column is added to the 
> table after the partition creation
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: DRILL-3938
>                 URL: https://issues.apache.org/jira/browse/DRILL-3938
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Hive
>    Affects Versions: 0.4.0
>            Reporter: Venki Korukanti
>            Assignee: Venki Korukanti
>             Fix For: 1.4.0
>
>
> Repro:
> From Hive:
> {code}
> CREATE TABLE kv(key INT, value STRING);
> LOAD DATA LOCAL INPATH '/Users/hadoop/apache-repos/hive-install/apache-hive-1.0.0-bin/examples/files/kv1.txt' INTO TABLE kv;
> CREATE TABLE kv_p(key INT, value STRING, part1 STRING);
> set hive.exec.dynamic.partition.mode=nonstrict;
> set hive.exec.max.dynamic.partitions=10000;
> set hive.exec.max.dynamic.partitions.pernode=10000;
> INSERT INTO TABLE kv_p PARTITION (part1) SELECT key, value, value as s FROM kv;
> ALTER TABLE kv_p ADD COLUMNS (newcol STRING);
> {code}
> From Drill:
> {code}
> USE hive;
> DESCRIBE kv_p;
> SELECT newcol FROM kv_p;
> -- Fails with a "column 'newcol' not found" error in HiveRecordReader while 
> -- selecting only the projected columns.
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
