[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

Anthony Hsu (JIRA) Fri, 18 Apr 2014 09:14:14 -0700

    [ https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13974192#comment-13974192 ]


Anthony Hsu commented on HIVE-6835:
-----------------------------------

I'm guessing the schema was specified in the SERDEPROPERTIES to work around 
HIVE-3953.  However, one issue with storing the schema in TBLPROPERTIES instead 
is that for partitioned tables, when you do a {{describe \[extended] 
<table_name> partition(...);}}, you get
{code}
error_error_error_error_error_error_error       string                  from deserializer   
cannot_determine_schema string                  from deserializer   
check                   string                  from deserializer   
schema                  string                  from deserializer   
url                     string                  from deserializer   
and                     string                  from deserializer   
literal                 string                  from deserializer
{code}
because the AvroSerDe cannot find "avro.schema.literal" or "avro.schema.url".  
If you store the schema in SERDEPROPERTIES, you don't get this issue, since the 
SERDEPROPERTIES get copied to the partition when it is created.
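
To make the difference concrete, here is a minimal Java sketch that only models the behaviour described above. It is not Hive code: the class PartitionPropsSketch and its methods are hypothetical names made up for illustration, and plain maps stand in for the metastore. Only the two property keys come from the AvroSerDe.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model for illustration only; none of this is Hive code.
final class PartitionPropsSketch {
    // Property keys the AvroSerDe looks for, as named in the comment above.
    static final String SCHEMA_LITERAL = "avro.schema.literal";
    static final String SCHEMA_URL = "avro.schema.url";

    // Models partition creation: the partition receives a copy of the
    // table's SERDEPROPERTIES. TBLPROPERTIES are deliberately not passed in,
    // mirroring the behaviour described above.
    static Map<String, String> newPartitionProps(Map<String, String> tableSerdeProps) {
        return new HashMap<>(tableSerdeProps);
    }

    // True if either schema property is visible in the given properties.
    static boolean canDetermineSchema(Map<String, String> props) {
        return props.containsKey(SCHEMA_LITERAL) || props.containsKey(SCHEMA_URL);
    }

    public static void main(String[] args) {
        Map<String, String> serdeProps = new HashMap<>();
        Map<String, String> tblProps = new HashMap<>();

        // Case 1: schema stored in TBLPROPERTIES only, so the partition
        // created from the (empty) serde properties cannot see it.
        tblProps.put(SCHEMA_LITERAL, "{\"type\":\"record\",\"name\":\"avroarray\",\"fields\":[]}");
        System.out.println(canDetermineSchema(newPartitionProps(serdeProps))); // false

        // Case 2: schema stored in SERDEPROPERTIES, so it is copied along.
        serdeProps.put(SCHEMA_LITERAL, tblProps.get(SCHEMA_LITERAL));
        System.out.println(canDetermineSchema(newPartitionProps(serdeProps))); // true
    }
}
```

In this model the first case corresponds to the "cannot_determine_schema" output shown above, and the second to the working SERDEPROPERTIES setup.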

I do think it is useful to make both the table-level properties and the 
partition-level properties available separately to the SerDe when it's doing 
its .initialize().  The SerDe should be able to decide which set of properties 
it wants to use. From this point of view, I think my change is still useful and 
valid.
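
As a rough sketch of what "the SerDe decides" could look like, assuming it is handed both property sets separately: the class SchemaPropertyResolver, the Preference enum and the resolve() method below are hypothetical and are not taken from the attached patches or from Hive's SerDe interface. Plain maps stand in for the table-level and partition-level properties.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper for illustration only; not part of Hive or the patch.
final class SchemaPropertyResolver {
    // Which property set the SerDe consults first.
    enum Preference { TABLE_FIRST, PARTITION_FIRST }

    static final String SCHEMA_LITERAL = "avro.schema.literal";

    // Looks up the schema literal in the preferred property set and falls
    // back to the other set when the preferred one does not define it.
    static Optional<String> resolve(Map<String, String> tableProps,
                                    Map<String, String> partitionProps,
                                    Preference preference) {
        boolean tableFirst = preference == Preference.TABLE_FIRST;
        Map<String, String> first = tableFirst ? tableProps : partitionProps;
        Map<String, String> second = tableFirst ? partitionProps : tableProps;

        String value = first.get(SCHEMA_LITERAL);
        if (value == null) {
            value = second.get(SCHEMA_LITERAL);
        }
        return Optional.ofNullable(value);
    }

    public static void main(String[] args) {
        Map<String, String> tableProps = new HashMap<>();
        Map<String, String> partitionProps = new HashMap<>();
        tableProps.put(SCHEMA_LITERAL, "table-level schema");
        partitionProps.put(SCHEMA_LITERAL, "partition-level schema");

        // The same inputs give different answers depending on the policy.
        System.out.println(resolve(tableProps, partitionProps, Preference.TABLE_FIRST).get());
        System.out.println(resolve(tableProps, partitionProps, Preference.PARTITION_FIRST).get());
    }
}
```

The point of keeping the two maps separate, rather than merging them before the SerDe sees them, is that the choice of precedence stays with the SerDe.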

> Reading of partitioned Avro data fails if partition schema does not match 
> table schema
> --------------------------------------------------------------------------------------
>
>                 Key: HIVE-6835
>                 URL: https://issues.apache.org/jira/browse/HIVE-6835
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.12.0
>            Reporter: Anthony Hsu
>            Assignee: Anthony Hsu
>         Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch
>
>
> To reproduce:
> {code}
> create table testarray (a array<string>);
> load data local inpath '/home/ahsu/test/array.txt' into table testarray;
> # create partitioned Avro table with one array column
> create table avroarray partitioned by (y string) row format serde 
> 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties 
> ('avro.schema.literal'='{"namespace":"test","name":"avroarray","type": 
> "record", "fields": [ { "name":"a", "type":{"type":"array","items":"string"} 
> } ] }')  STORED as INPUTFORMAT  
> 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'  OUTPUTFORMAT  
> 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';
> insert into table avroarray partition(y=1) select * from testarray;
> # add an int column with a default value of 0
> alter table avroarray set serde 
> 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with 
> serdeproperties('avro.schema.literal'='{"namespace":"test","name":"avroarray","type":
>  "record", "fields": [ {"name":"intfield","type":"int","default":0},{ 
> "name":"a", "type":{"type":"array","items":"string"} } ] }');
> # fails with ClassCastException
> select * from avroarray;
> {code}
> The select * fails with:
> {code}
> Failed with exception java.io.IOException:java.lang.ClassCastException: 
> org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector 
> cannot be cast to 
> org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)
