[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13980802#comment-13980802
 ] 

Hive QA commented on HIVE-6835:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12641763/HIVE-6835.5.patch

{color:red}ERROR:{color} -1 due to 40 failed/errored test(s), 5418 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join32
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_filter_numeric
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby2_map_skew
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_skew_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_infer_bucket_sort_list_bucket
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mapjoin_test_outer
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_nullgroup3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_createas1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_join4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_select_dummy_source
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_partscan_1_23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_symlink_text_input_format
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_truncate_column_list_bucket
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_current_database
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_10
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_14
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_19
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_20
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_21
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_22
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_24
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_9
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucketizedhiveinputformat
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_dynamic_partitions_with_whitelist
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_stats_partialscan_autogether
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/34/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/34/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 40 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12641763

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch, 
 HIVE-6835.4.patch, HIVE-6835.5.patch


 To reproduce:
 {code}
 create table testarray (a arraystring);
 load data local inpath '/home/ahsu/test/array.txt' into table testarray;
 # create partitioned Avro table with one array column
 create table avroarray partitioned by (y string) row format serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties 
 

[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-25 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13981229#comment-13981229
 ] 

Xuefu Zhang commented on HIVE-6835:
---

[~erwaman] If you can confirm that these test failures are unrelated to your 
patch, I can commit it in a few hours.

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch, 
 HIVE-6835.4.patch, HIVE-6835.5.patch


 To reproduce:
 {code}
 create table testarray (a arraystring);
 load data local inpath '/home/ahsu/test/array.txt' into table testarray;
 # create partitioned Avro table with one array column
 create table avroarray partitioned by (y string) row format serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties 
 ('avro.schema.literal'='{namespace:test,name:avroarray,type: 
 record, fields: [ { name:a, type:{type:array,items:string} 
 } ] }')  STORED as INPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'  OUTPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';
 insert into table avroarray partition(y=1) select * from testarray;
 # add an int column with a default value of 0
 alter table avroarray set serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with 
 serdeproperties('avro.schema.literal'='{namespace:test,name:avroarray,type:
  record, fields: [ {name:intfield,type:int,default:0},{ 
 name:a, type:{type:array,items:string} } ] }');
 # fails with ClassCastException
 select * from avroarray;
 {code}
 The select * fails with:
 {code}
 Failed with exception java.io.IOException:java.lang.ClassCastException: 
 org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector 
 cannot be cast to 
 org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-25 Thread Anthony Hsu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13981278#comment-13981278
 ] 

Anthony Hsu commented on HIVE-6835:
---

I will do some local testing soon and let you know.

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch, 
 HIVE-6835.4.patch, HIVE-6835.5.patch


 To reproduce:
 {code}
 create table testarray (a arraystring);
 load data local inpath '/home/ahsu/test/array.txt' into table testarray;
 # create partitioned Avro table with one array column
 create table avroarray partitioned by (y string) row format serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties 
 ('avro.schema.literal'='{namespace:test,name:avroarray,type: 
 record, fields: [ { name:a, type:{type:array,items:string} 
 } ] }')  STORED as INPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'  OUTPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';
 insert into table avroarray partition(y=1) select * from testarray;
 # add an int column with a default value of 0
 alter table avroarray set serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with 
 serdeproperties('avro.schema.literal'='{namespace:test,name:avroarray,type:
  record, fields: [ {name:intfield,type:int,default:0},{ 
 name:a, type:{type:array,items:string} } ] }');
 # fails with ClassCastException
 select * from avroarray;
 {code}
 The select * fails with:
 {code}
 Failed with exception java.io.IOException:java.lang.ClassCastException: 
 org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector 
 cannot be cast to 
 org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-25 Thread Anthony Hsu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13981425#comment-13981425
 ] 

Anthony Hsu commented on HIVE-6835:
---

I tried all the failed union_remove TestCliDriver tests locally and they all 
passed.  Looking at some of the previous precommit builds, several of them also 
have the same test failures, so I believe these test failures are unrelated to 
my changes.

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch, 
 HIVE-6835.4.patch, HIVE-6835.5.patch


 To reproduce:
 {code}
 create table testarray (a arraystring);
 load data local inpath '/home/ahsu/test/array.txt' into table testarray;
 # create partitioned Avro table with one array column
 create table avroarray partitioned by (y string) row format serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties 
 ('avro.schema.literal'='{namespace:test,name:avroarray,type: 
 record, fields: [ { name:a, type:{type:array,items:string} 
 } ] }')  STORED as INPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'  OUTPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';
 insert into table avroarray partition(y=1) select * from testarray;
 # add an int column with a default value of 0
 alter table avroarray set serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with 
 serdeproperties('avro.schema.literal'='{namespace:test,name:avroarray,type:
  record, fields: [ {name:intfield,type:int,default:0},{ 
 name:a, type:{type:array,items:string} } ] }');
 # fails with ClassCastException
 select * from avroarray;
 {code}
 The select * fails with:
 {code}
 Failed with exception java.io.IOException:java.lang.ClassCastException: 
 org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector 
 cannot be cast to 
 org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-25 Thread Anthony Hsu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13981431#comment-13981431
 ] 

Anthony Hsu commented on HIVE-6835:
---

BTW, I have been doing all my development and testing against Hadoop 1.2.1 
(-Phadoop-1).

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch, 
 HIVE-6835.4.patch, HIVE-6835.5.patch


 To reproduce:
 {code}
 create table testarray (a arraystring);
 load data local inpath '/home/ahsu/test/array.txt' into table testarray;
 # create partitioned Avro table with one array column
 create table avroarray partitioned by (y string) row format serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties 
 ('avro.schema.literal'='{namespace:test,name:avroarray,type: 
 record, fields: [ { name:a, type:{type:array,items:string} 
 } ] }')  STORED as INPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'  OUTPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';
 insert into table avroarray partition(y=1) select * from testarray;
 # add an int column with a default value of 0
 alter table avroarray set serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with 
 serdeproperties('avro.schema.literal'='{namespace:test,name:avroarray,type:
  record, fields: [ {name:intfield,type:int,default:0},{ 
 name:a, type:{type:array,items:string} } ] }');
 # fails with ClassCastException
 select * from avroarray;
 {code}
 The select * fails with:
 {code}
 Failed with exception java.io.IOException:java.lang.ClassCastException: 
 org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector 
 cannot be cast to 
 org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-25 Thread Anthony Hsu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13981677#comment-13981677
 ] 

Anthony Hsu commented on HIVE-6835:
---

Thanks, [~xuefuz], for all your help and guidance.

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Fix For: 0.14.0

 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch, 
 HIVE-6835.4.patch, HIVE-6835.5.patch


 To reproduce:
 {code}
 create table testarray (a arraystring);
 load data local inpath '/home/ahsu/test/array.txt' into table testarray;
 # create partitioned Avro table with one array column
 create table avroarray partitioned by (y string) row format serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties 
 ('avro.schema.literal'='{namespace:test,name:avroarray,type: 
 record, fields: [ { name:a, type:{type:array,items:string} 
 } ] }')  STORED as INPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'  OUTPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';
 insert into table avroarray partition(y=1) select * from testarray;
 # add an int column with a default value of 0
 alter table avroarray set serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with 
 serdeproperties('avro.schema.literal'='{namespace:test,name:avroarray,type:
  record, fields: [ {name:intfield,type:int,default:0},{ 
 name:a, type:{type:array,items:string} } ] }');
 # fails with ClassCastException
 select * from avroarray;
 {code}
 The select * fails with:
 {code}
 Failed with exception java.io.IOException:java.lang.ClassCastException: 
 org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector 
 cannot be cast to 
 org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-24 Thread Anthony Hsu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13980100#comment-13980100
 ] 

Anthony Hsu commented on HIVE-6835:
---

Also updated [the RB|https://reviews.apache.org/r/20096/].

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch, 
 HIVE-6835.4.patch, HIVE-6835.5.patch


 To reproduce:
 {code}
 create table testarray (a arraystring);
 load data local inpath '/home/ahsu/test/array.txt' into table testarray;
 # create partitioned Avro table with one array column
 create table avroarray partitioned by (y string) row format serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties 
 ('avro.schema.literal'='{namespace:test,name:avroarray,type: 
 record, fields: [ { name:a, type:{type:array,items:string} 
 } ] }')  STORED as INPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'  OUTPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';
 insert into table avroarray partition(y=1) select * from testarray;
 # add an int column with a default value of 0
 alter table avroarray set serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with 
 serdeproperties('avro.schema.literal'='{namespace:test,name:avroarray,type:
  record, fields: [ {name:intfield,type:int,default:0},{ 
 name:a, type:{type:array,items:string} } ] }');
 # fails with ClassCastException
 select * from avroarray;
 {code}
 The select * fails with:
 {code}
 Failed with exception java.io.IOException:java.lang.ClassCastException: 
 org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector 
 cannot be cast to 
 org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-24 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13980129#comment-13980129
 ] 

Xuefu Zhang commented on HIVE-6835:
---

Latest patch looks good to me. +1, pending on test result.

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch, 
 HIVE-6835.4.patch, HIVE-6835.5.patch


 To reproduce:
 {code}
 create table testarray (a arraystring);
 load data local inpath '/home/ahsu/test/array.txt' into table testarray;
 # create partitioned Avro table with one array column
 create table avroarray partitioned by (y string) row format serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties 
 ('avro.schema.literal'='{namespace:test,name:avroarray,type: 
 record, fields: [ { name:a, type:{type:array,items:string} 
 } ] }')  STORED as INPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'  OUTPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';
 insert into table avroarray partition(y=1) select * from testarray;
 # add an int column with a default value of 0
 alter table avroarray set serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with 
 serdeproperties('avro.schema.literal'='{namespace:test,name:avroarray,type:
  record, fields: [ {name:intfield,type:int,default:0},{ 
 name:a, type:{type:array,items:string} } ] }');
 # fails with ClassCastException
 select * from avroarray;
 {code}
 The select * fails with:
 {code}
 Failed with exception java.io.IOException:java.lang.ClassCastException: 
 org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector 
 cannot be cast to 
 org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-23 Thread Anthony Hsu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13979175#comment-13979175
 ] 

Anthony Hsu commented on HIVE-6835:
---

P.S.: I also updated my Review Board request: 
https://reviews.apache.org/r/20096/

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch, 
 HIVE-6835.4.patch


 To reproduce:
 {code}
 create table testarray (a arraystring);
 load data local inpath '/home/ahsu/test/array.txt' into table testarray;
 # create partitioned Avro table with one array column
 create table avroarray partitioned by (y string) row format serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties 
 ('avro.schema.literal'='{namespace:test,name:avroarray,type: 
 record, fields: [ { name:a, type:{type:array,items:string} 
 } ] }')  STORED as INPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'  OUTPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';
 insert into table avroarray partition(y=1) select * from testarray;
 # add an int column with a default value of 0
 alter table avroarray set serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with 
 serdeproperties('avro.schema.literal'='{namespace:test,name:avroarray,type:
  record, fields: [ {name:intfield,type:int,default:0},{ 
 name:a, type:{type:array,items:string} } ] }');
 # fails with ClassCastException
 select * from avroarray;
 {code}
 The select * fails with:
 {code}
 Failed with exception java.io.IOException:java.lang.ClassCastException: 
 org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector 
 cannot be cast to 
 org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-23 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13979241#comment-13979241
 ] 

Xuefu Zhang commented on HIVE-6835:
---

Thanks for the patch, which looks good. I have some comments on RB.

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch, 
 HIVE-6835.4.patch


 To reproduce:
 {code}
 create table testarray (a arraystring);
 load data local inpath '/home/ahsu/test/array.txt' into table testarray;
 # create partitioned Avro table with one array column
 create table avroarray partitioned by (y string) row format serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties 
 ('avro.schema.literal'='{namespace:test,name:avroarray,type: 
 record, fields: [ { name:a, type:{type:array,items:string} 
 } ] }')  STORED as INPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'  OUTPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';
 insert into table avroarray partition(y=1) select * from testarray;
 # add an int column with a default value of 0
 alter table avroarray set serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with 
 serdeproperties('avro.schema.literal'='{namespace:test,name:avroarray,type:
  record, fields: [ {name:intfield,type:int,default:0},{ 
 name:a, type:{type:array,items:string} } ] }');
 # fails with ClassCastException
 select * from avroarray;
 {code}
 The select * fails with:
 {code}
 Failed with exception java.io.IOException:java.lang.ClassCastException: 
 org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector 
 cannot be cast to 
 org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-22 Thread Anthony Hsu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13977337#comment-13977337
 ] 

Anthony Hsu commented on HIVE-6835:
---

I started looking into this alternative and encountered an issue.  Most calls 
to serde.initialize() are treating serde as a Deserializer (interface).  I 
would either have to change the interface (and change all the implementations) 
or cast the Deserializer as an AbstractSerDe (whenever I want to use the new 
initialize() method), neither of which seems like a great solution. So I am 
back to supporting my original table. prefix approach. Any thoughts on this?

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch


 To reproduce:
 {code}
 create table testarray (a arraystring);
 load data local inpath '/home/ahsu/test/array.txt' into table testarray;
 # create partitioned Avro table with one array column
 create table avroarray partitioned by (y string) row format serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties 
 ('avro.schema.literal'='{namespace:test,name:avroarray,type: 
 record, fields: [ { name:a, type:{type:array,items:string} 
 } ] }')  STORED as INPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'  OUTPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';
 insert into table avroarray partition(y=1) select * from testarray;
 # add an int column with a default value of 0
 alter table avroarray set serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with 
 serdeproperties('avro.schema.literal'='{namespace:test,name:avroarray,type:
  record, fields: [ {name:intfield,type:int,default:0},{ 
 name:a, type:{type:array,items:string} } ] }');
 # fails with ClassCastException
 select * from avroarray;
 {code}
 The select * fails with:
 {code}
 Failed with exception java.io.IOException:java.lang.ClassCastException: 
 org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector 
 cannot be cast to 
 org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-22 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13977616#comment-13977616
 ] 

Xuefu Zhang commented on HIVE-6835:
---

Not sure if I understand your problem correctly, but I do understand that the 
scope of the proposed change has got bigger than your original approach. For 
any caller of serde initialization, we should be able to find whether serde 
instance extends AbstractSerde. If so, we cast the serde instance to 
AbstractSerde and call initialize(arg1, arg2, arg3). Otherwise, call 
serde.initialize(arg1, arg2). Does this solve the problem?

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch


 To reproduce:
 {code}
 create table testarray (a arraystring);
 load data local inpath '/home/ahsu/test/array.txt' into table testarray;
 # create partitioned Avro table with one array column
 create table avroarray partitioned by (y string) row format serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties 
 ('avro.schema.literal'='{namespace:test,name:avroarray,type: 
 record, fields: [ { name:a, type:{type:array,items:string} 
 } ] }')  STORED as INPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'  OUTPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';
 insert into table avroarray partition(y=1) select * from testarray;
 # add an int column with a default value of 0
 alter table avroarray set serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with 
 serdeproperties('avro.schema.literal'='{namespace:test,name:avroarray,type:
  record, fields: [ {name:intfield,type:int,default:0},{ 
 name:a, type:{type:array,items:string} } ] }');
 # fails with ClassCastException
 select * from avroarray;
 {code}
 The select * fails with:
 {code}
 Failed with exception java.io.IOException:java.lang.ClassCastException: 
 org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector 
 cannot be cast to 
 org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-22 Thread Anthony Hsu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13977710#comment-13977710
 ] 

Anthony Hsu commented on HIVE-6835:
---

Yes, this is possible, but I would have to add these instanceof AbstractSerde 
checks and then cast the Deserializer as an AbstractSerde before I can use the 
new initialize() method.  There are dozens of usages of .initialize() and 
adding all this type checking/casting code in so many places just for this new 
method doesn't seem very clean to me.

Also, if we add the new initialize() method, what should we do for table-level 
serde initialization?  When dealing with the table, there are no partition 
properties, so are we supposed to pass the table properties for both the 
tblProps and partProps arguments? If we leave partProps null, then the default 
new initialize() method implementation will just pass null to the old 
initialize() method.

There doesn't seem to be a very clean way of adding a new initialize() method 
without creating a lot of redundant boilerplate code and creating confusion 
which initialize() method to use and what values to pass in.  Given these 
concerns, I feel that prepending table. might be a cleaner and less confusing 
approach.  What are your thoughts on this?

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch


 To reproduce:
 {code}
 create table testarray (a arraystring);
 load data local inpath '/home/ahsu/test/array.txt' into table testarray;
 # create partitioned Avro table with one array column
 create table avroarray partitioned by (y string) row format serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties 
 ('avro.schema.literal'='{namespace:test,name:avroarray,type: 
 record, fields: [ { name:a, type:{type:array,items:string} 
 } ] }')  STORED as INPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'  OUTPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';
 insert into table avroarray partition(y=1) select * from testarray;
 # add an int column with a default value of 0
 alter table avroarray set serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with 
 serdeproperties('avro.schema.literal'='{namespace:test,name:avroarray,type:
  record, fields: [ {name:intfield,type:int,default:0},{ 
 name:a, type:{type:array,items:string} } ] }');
 # fails with ClassCastException
 select * from avroarray;
 {code}
 The select * fails with:
 {code}
 Failed with exception java.io.IOException:java.lang.ClassCastException: 
 org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector 
 cannot be cast to 
 org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-22 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13977718#comment-13977718
 ] 

Xuefu Zhang commented on HIVE-6835:
---

bq. I feel that prepending table. might be a cleaner and less confusing 
approach. What are your thoughts on this?

This doesn't seem to be a viable approach due to its hacky/problematic nature.

bq.  many places just for this new method doesn't seem very clean to me.

You could have a utility method somewhere so that you need to call instanceof 
only once. Something like this:

{code}
  public static void initializeSerde(SerDe serde, Properties tblProps, 
Properties partProps) {
if (serde instanceof AbstractSerde) {
   ...
} else {
  ...
}
  }
{code}
Then, each caller just needs to switch to this method.

bq.  If we leave partProps null, then the default new initialize() method 
implementation will just pass null to the old initialize() method.

This sounds good to me.

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch


 To reproduce:
 {code}
 create table testarray (a arraystring);
 load data local inpath '/home/ahsu/test/array.txt' into table testarray;
 # create partitioned Avro table with one array column
 create table avroarray partitioned by (y string) row format serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties 
 ('avro.schema.literal'='{namespace:test,name:avroarray,type: 
 record, fields: [ { name:a, type:{type:array,items:string} 
 } ] }')  STORED as INPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'  OUTPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';
 insert into table avroarray partition(y=1) select * from testarray;
 # add an int column with a default value of 0
 alter table avroarray set serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with 
 serdeproperties('avro.schema.literal'='{namespace:test,name:avroarray,type:
  record, fields: [ {name:intfield,type:int,default:0},{ 
 name:a, type:{type:array,items:string} } ] }');
 # fails with ClassCastException
 select * from avroarray;
 {code}
 The select * fails with:
 {code}
 Failed with exception java.io.IOException:java.lang.ClassCastException: 
 org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector 
 cannot be cast to 
 org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-21 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13975824#comment-13975824
 ] 

Xuefu Zhang commented on HIVE-6835:
---

I think that's pretty much what you need to do. While #2 may touch many files, 
it's fairly safe as #1 guarantees that the same code will be exercised. There 
isn't much API change. You add one with default implementation and deprecate 
the old one. In #3, you have both property sets and do whatever you need for 
Avro.

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch


 To reproduce:
 {code}
 create table testarray (a arraystring);
 load data local inpath '/home/ahsu/test/array.txt' into table testarray;
 # create partitioned Avro table with one array column
 create table avroarray partitioned by (y string) row format serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties 
 ('avro.schema.literal'='{namespace:test,name:avroarray,type: 
 record, fields: [ { name:a, type:{type:array,items:string} 
 } ] }')  STORED as INPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'  OUTPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';
 insert into table avroarray partition(y=1) select * from testarray;
 # add an int column with a default value of 0
 alter table avroarray set serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with 
 serdeproperties('avro.schema.literal'='{namespace:test,name:avroarray,type:
  record, fields: [ {name:intfield,type:int,default:0},{ 
 name:a, type:{type:array,items:string} } ] }');
 # fails with ClassCastException
 select * from avroarray;
 {code}
 The select * fails with:
 {code}
 Failed with exception java.io.IOException:java.lang.ClassCastException: 
 org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector 
 cannot be cast to 
 org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-21 Thread Anthony Hsu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13976144#comment-13976144
 ] 

Anthony Hsu commented on HIVE-6835:
---

Great, sounds like we're on the same page. I'll implement this new approach and 
upload a new patch soon.

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch


 To reproduce:
 {code}
 create table testarray (a arraystring);
 load data local inpath '/home/ahsu/test/array.txt' into table testarray;
 # create partitioned Avro table with one array column
 create table avroarray partitioned by (y string) row format serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties 
 ('avro.schema.literal'='{namespace:test,name:avroarray,type: 
 record, fields: [ { name:a, type:{type:array,items:string} 
 } ] }')  STORED as INPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'  OUTPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';
 insert into table avroarray partition(y=1) select * from testarray;
 # add an int column with a default value of 0
 alter table avroarray set serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with 
 serdeproperties('avro.schema.literal'='{namespace:test,name:avroarray,type:
  record, fields: [ {name:intfield,type:int,default:0},{ 
 name:a, type:{type:array,items:string} } ] }');
 # fails with ClassCastException
 select * from avroarray;
 {code}
 The select * fails with:
 {code}
 Failed with exception java.io.IOException:java.lang.ClassCastException: 
 org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector 
 cannot be cast to 
 org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-18 Thread Anthony Hsu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13974192#comment-13974192
 ] 

Anthony Hsu commented on HIVE-6835:
---

I'm guessing the schema was specified in the SERDEPROPERTIES to work around 
HIVE-3953.  However, one issue with storing the schema in TBLPROPERTIES instead 
is that for partitioned tables, when you do a {{describe \[extended] 
table_name partition(...);}}, you get
{code}
error_error_error_error_error_error_error   string  from 
deserializer   
cannot_determine_schema string  from deserializer   
check   string  from deserializer   
schema  string  from deserializer   
url string  from deserializer   
and string  from deserializer   
literal string  from deserializer
{code}
because the AvroSerDe cannot find avro.schema.literal or avro.schema.url.  
If you store the schema in SERDEPROPERTIES, you don't get this issue, since the 
SERDEPROPERTIES get copied to the partition when it is created.

I do think it is useful to make both the table-level properties and the 
partition-level properties available separately to the SerDe when it's doing 
its .initalize().  The SerDe should be able to decide which set of properties 
it wants to use. From this point of view, I think my change is still useful and 
valid.

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch


 To reproduce:
 {code}
 create table testarray (a arraystring);
 load data local inpath '/home/ahsu/test/array.txt' into table testarray;
 # create partitioned Avro table with one array column
 create table avroarray partitioned by (y string) row format serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties 
 ('avro.schema.literal'='{namespace:test,name:avroarray,type: 
 record, fields: [ { name:a, type:{type:array,items:string} 
 } ] }')  STORED as INPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'  OUTPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';
 insert into table avroarray partition(y=1) select * from testarray;
 # add an int column with a default value of 0
 alter table avroarray set serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with 
 serdeproperties('avro.schema.literal'='{namespace:test,name:avroarray,type:
  record, fields: [ {name:intfield,type:int,default:0},{ 
 name:a, type:{type:array,items:string} } ] }');
 # fails with ClassCastException
 select * from avroarray;
 {code}
 The select * fails with:
 {code}
 Failed with exception java.io.IOException:java.lang.ClassCastException: 
 org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector 
 cannot be cast to 
 org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-18 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13974287#comment-13974287
 ] 

Xuefu Zhang commented on HIVE-6835:
---

 If TBLPROPERTIES are NOT copied to the partition, We should probably fix that 
problem instead. In Hive, partition holds a snapshot of the table schema when 
the partition is created. This should be applicable to AvroSerde as well.

Making table properties and partition properties available for a serde seems a 
good idea to me in general. However, what I questioned is the way the problem 
in AvroSerde is fixed. Especially, we don't want to fix the problem in a 
workaround solution while avoiding the original problem.

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch


 To reproduce:
 {code}
 create table testarray (a arraystring);
 load data local inpath '/home/ahsu/test/array.txt' into table testarray;
 # create partitioned Avro table with one array column
 create table avroarray partitioned by (y string) row format serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties 
 ('avro.schema.literal'='{namespace:test,name:avroarray,type: 
 record, fields: [ { name:a, type:{type:array,items:string} 
 } ] }')  STORED as INPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'  OUTPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';
 insert into table avroarray partition(y=1) select * from testarray;
 # add an int column with a default value of 0
 alter table avroarray set serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with 
 serdeproperties('avro.schema.literal'='{namespace:test,name:avroarray,type:
  record, fields: [ {name:intfield,type:int,default:0},{ 
 name:a, type:{type:array,items:string} } ] }');
 # fails with ClassCastException
 select * from avroarray;
 {code}
 The select * fails with:
 {code}
 Failed with exception java.io.IOException:java.lang.ClassCastException: 
 org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector 
 cannot be cast to 
 org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-18 Thread Anthony Hsu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13974342#comment-13974342
 ] 

Anthony Hsu commented on HIVE-6835:
---

If TBLPROPERTIES were copied to the partition, then you still might have the 
problem of the table-level Avro schema and the partition-level Avro schema 
getting out of sync, which might lead to ClassCastExceptions.  The Avro schema 
should always use the latest table-level schema, whether it is stored in 
TBLPROPERTIES or SERDEPROPERTIES.

The root of the problem is if an Avro schema somehow ends up in the partition 
properties, these could get out of sync with the table-level properties.  The 
Avro SerDe should always be using the table-level schema, and that's why my 
change was to (1) make the table-level properties available to the serde, and 
(2) change the Avro SerDe to use the table-level properties when present.

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch


 To reproduce:
 {code}
 create table testarray (a arraystring);
 load data local inpath '/home/ahsu/test/array.txt' into table testarray;
 # create partitioned Avro table with one array column
 create table avroarray partitioned by (y string) row format serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties 
 ('avro.schema.literal'='{namespace:test,name:avroarray,type: 
 record, fields: [ { name:a, type:{type:array,items:string} 
 } ] }')  STORED as INPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'  OUTPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';
 insert into table avroarray partition(y=1) select * from testarray;
 # add an int column with a default value of 0
 alter table avroarray set serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with 
 serdeproperties('avro.schema.literal'='{namespace:test,name:avroarray,type:
  record, fields: [ {name:intfield,type:int,default:0},{ 
 name:a, type:{type:array,items:string} } ] }');
 # fails with ClassCastException
 select * from avroarray;
 {code}
 The select * fails with:
 {code}
 Failed with exception java.io.IOException:java.lang.ClassCastException: 
 org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector 
 cannot be cast to 
 org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-18 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13974366#comment-13974366
 ] 

Xuefu Zhang commented on HIVE-6835:
---

Points taken. However, I'm a little concerned that the solution here is a 
little bit hacky, as a serde might have a property prefixed by table.. In 
that case, this may mess up. Instead, I'm thinking if it makes more sense to 
define a new serde initialize API, and deprecate the old one.
{code}
public void initialize(Configuration configuration, Properties tableProperties, 
Properties partitionProperties) throws SerDeException;
{code}
With that, a serde is free to do whatever they need .


 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch


 To reproduce:
 {code}
 create table testarray (a arraystring);
 load data local inpath '/home/ahsu/test/array.txt' into table testarray;
 # create partitioned Avro table with one array column
 create table avroarray partitioned by (y string) row format serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties 
 ('avro.schema.literal'='{namespace:test,name:avroarray,type: 
 record, fields: [ { name:a, type:{type:array,items:string} 
 } ] }')  STORED as INPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'  OUTPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';
 insert into table avroarray partition(y=1) select * from testarray;
 # add an int column with a default value of 0
 alter table avroarray set serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with 
 serdeproperties('avro.schema.literal'='{namespace:test,name:avroarray,type:
  record, fields: [ {name:intfield,type:int,default:0},{ 
 name:a, type:{type:array,items:string} } ] }');
 # fails with ClassCastException
 select * from avroarray;
 {code}
 The select * fails with:
 {code}
 Failed with exception java.io.IOException:java.lang.ClassCastException: 
 org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector 
 cannot be cast to 
 org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-18 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13974464#comment-13974464
 ] 

Ashutosh Chauhan commented on HIVE-6835:


I agree with [~xuefuz]. Also, now that we have AbstractSerDe (which is an 
abstract class as oppose to interface Serde) adding new methods is easier from 
backward-compatibility point of view. Polluting one map object from various 
sources sounds like inviting trouble to me.

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch


 To reproduce:
 {code}
 create table testarray (a arraystring);
 load data local inpath '/home/ahsu/test/array.txt' into table testarray;
 # create partitioned Avro table with one array column
 create table avroarray partitioned by (y string) row format serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties 
 ('avro.schema.literal'='{namespace:test,name:avroarray,type: 
 record, fields: [ { name:a, type:{type:array,items:string} 
 } ] }')  STORED as INPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'  OUTPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';
 insert into table avroarray partition(y=1) select * from testarray;
 # add an int column with a default value of 0
 alter table avroarray set serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with 
 serdeproperties('avro.schema.literal'='{namespace:test,name:avroarray,type:
  record, fields: [ {name:intfield,type:int,default:0},{ 
 name:a, type:{type:array,items:string} } ] }');
 # fails with ClassCastException
 select * from avroarray;
 {code}
 The select * fails with:
 {code}
 Failed with exception java.io.IOException:java.lang.ClassCastException: 
 org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector 
 cannot be cast to 
 org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-18 Thread Anthony Hsu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13974670#comment-13974670
 ] 

Anthony Hsu commented on HIVE-6835:
---

[~xuefuz] and [~ashutoshc], just to clarify, is this the alternative solution 
you're proposing?:
# Add
{code}
public void initialize(Configuration configuration, Properties tableProperties, 
Properties partitionProperties) throws SerDeException;
{code}
to AbstractSerDe and provide a default implementation that just calls 
{{initialize(configuration, partitionProperties)}}
# Change all calls of {{partitionSerde.initialize(conf, partProps)}} to 
{{partitionSerde.initialize(conf, tblProps, partProps)}}
# Add
{code}
@Override
public void initialize(Configuration configuration, Properties tableProperties, 
Properties partitionProperties) throws SerDeException;
{code}
to AvroSerDe and provide an implementation that just uses the tableProperties

I am okay with taking this approach, though it involves a lot more code changes 
and will change the public AbstractSerDe API.  Let me know what your thoughts 
on this approach are.

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch


 To reproduce:
 {code}
 create table testarray (a arraystring);
 load data local inpath '/home/ahsu/test/array.txt' into table testarray;
 # create partitioned Avro table with one array column
 create table avroarray partitioned by (y string) row format serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties 
 ('avro.schema.literal'='{namespace:test,name:avroarray,type: 
 record, fields: [ { name:a, type:{type:array,items:string} 
 } ] }')  STORED as INPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'  OUTPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';
 insert into table avroarray partition(y=1) select * from testarray;
 # add an int column with a default value of 0
 alter table avroarray set serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with 
 serdeproperties('avro.schema.literal'='{namespace:test,name:avroarray,type:
  record, fields: [ {name:intfield,type:int,default:0},{ 
 name:a, type:{type:array,items:string} } ] }');
 # fails with ClassCastException
 select * from avroarray;
 {code}
 The select * fails with:
 {code}
 Failed with exception java.io.IOException:java.lang.ClassCastException: 
 org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector 
 cannot be cast to 
 org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-17 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13973141#comment-13973141
 ] 

Xuefu Zhang commented on HIVE-6835:
---

Just curious. If the avro serde is initialized with the table schema (which is 
the latest), is there a problem for it to read the old data, that is, data that 
conforms to the partition level metadata? I have seen so many JIRAs about 
schema evolution, and isn't quite sure what is possible and what is not.

The example given here is adding a new column in the beginning. What about 
other cases, such as adding it at the end, or changing data type, etc?

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch


 To reproduce:
 {code}
 create table testarray (a arraystring);
 load data local inpath '/home/ahsu/test/array.txt' into table testarray;
 # create partitioned Avro table with one array column
 create table avroarray partitioned by (y string) row format serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties 
 ('avro.schema.literal'='{namespace:test,name:avroarray,type: 
 record, fields: [ { name:a, type:{type:array,items:string} 
 } ] }')  STORED as INPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'  OUTPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';
 insert into table avroarray partition(y=1) select * from testarray;
 # add an int column with a default value of 0
 alter table avroarray set serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with 
 serdeproperties('avro.schema.literal'='{namespace:test,name:avroarray,type:
  record, fields: [ {name:intfield,type:int,default:0},{ 
 name:a, type:{type:array,items:string} } ] }');
 # fails with ClassCastException
 select * from avroarray;
 {code}
 The select * fails with:
 {code}
 Failed with exception java.io.IOException:java.lang.ClassCastException: 
 org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector 
 cannot be cast to 
 org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-17 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13973166#comment-13973166
 ] 

Ashutosh Chauhan commented on HIVE-6835:


I would also like to know answer for Xuefu's questions. It will be good to 
document what kind of schema evolution is supported by Avro Serde and more 
importantly what kinds are *not* supported.

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch


 To reproduce:
 {code}
 create table testarray (a arraystring);
 load data local inpath '/home/ahsu/test/array.txt' into table testarray;
 # create partitioned Avro table with one array column
 create table avroarray partitioned by (y string) row format serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties 
 ('avro.schema.literal'='{namespace:test,name:avroarray,type: 
 record, fields: [ { name:a, type:{type:array,items:string} 
 } ] }')  STORED as INPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'  OUTPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';
 insert into table avroarray partition(y=1) select * from testarray;
 # add an int column with a default value of 0
 alter table avroarray set serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with 
 serdeproperties('avro.schema.literal'='{namespace:test,name:avroarray,type:
  record, fields: [ {name:intfield,type:int,default:0},{ 
 name:a, type:{type:array,items:string} } ] }');
 # fails with ClassCastException
 select * from avroarray;
 {code}
 The select * fails with:
 {code}
 Failed with exception java.io.IOException:java.lang.ClassCastException: 
 org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector 
 cannot be cast to 
 org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-17 Thread Anthony Hsu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13973279#comment-13973279
 ] 

Anthony Hsu commented on HIVE-6835:
---

The AvroSerDe handles schema evolution as described in 
http://avro.apache.org/docs/current/spec.html#Schema+Resolution.  However, in 
the Hive code, the AvroSerDe needs to always be initialized with the latest 
schema so that ObjectInspectorConverters.getConvertedOI() (in 
FetchOperator:getRecordReader()) will work.  When the AvroSerDe actually reads 
the Avro file, it will then compare the latest schema to the actual schema 
stored in the Avro file and do schema resolution/evolution.

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch


 To reproduce:
 {code}
 create table testarray (a arraystring);
 load data local inpath '/home/ahsu/test/array.txt' into table testarray;
 # create partitioned Avro table with one array column
 create table avroarray partitioned by (y string) row format serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties 
 ('avro.schema.literal'='{namespace:test,name:avroarray,type: 
 record, fields: [ { name:a, type:{type:array,items:string} 
 } ] }')  STORED as INPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'  OUTPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';
 insert into table avroarray partition(y=1) select * from testarray;
 # add an int column with a default value of 0
 alter table avroarray set serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with 
 serdeproperties('avro.schema.literal'='{namespace:test,name:avroarray,type:
  record, fields: [ {name:intfield,type:int,default:0},{ 
 name:a, type:{type:array,items:string} } ] }');
 # fails with ClassCastException
 select * from avroarray;
 {code}
 The select * fails with:
 {code}
 Failed with exception java.io.IOException:java.lang.ClassCastException: 
 org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector 
 cannot be cast to 
 org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-17 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13973327#comment-13973327
 ] 

Xuefu Zhang commented on HIVE-6835:
---

{quote}
 in the Hive code, the AvroSerDe needs to always be initialized with the latest 
schema so that ObjectInspectorConverters.getConvertedOI() (in 
FetchOperator:getRecordReader()) will work.
{quote}

[~erwaman] I guess I don't quite follow this. The exception stack shows that 
casting error happens when reading old data with partition schema which is old 
schema. If the schema matches the data, I'm not sure why we'd have this casting 
error? On the other hand, if we use the new schema and read old data, would it 
be possible that error might arise?

Anyway, I'm not fully understanding the real cause of the problem and how the 
change will address all other possible scenarios.

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch


 To reproduce:
 {code}
 create table testarray (a arraystring);
 load data local inpath '/home/ahsu/test/array.txt' into table testarray;
 # create partitioned Avro table with one array column
 create table avroarray partitioned by (y string) row format serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties 
 ('avro.schema.literal'='{namespace:test,name:avroarray,type: 
 record, fields: [ { name:a, type:{type:array,items:string} 
 } ] }')  STORED as INPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'  OUTPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';
 insert into table avroarray partition(y=1) select * from testarray;
 # add an int column with a default value of 0
 alter table avroarray set serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with 
 serdeproperties('avro.schema.literal'='{namespace:test,name:avroarray,type:
  record, fields: [ {name:intfield,type:int,default:0},{ 
 name:a, type:{type:array,items:string} } ] }');
 # fails with ClassCastException
 select * from avroarray;
 {code}
 The select * fails with:
 {code}
 Failed with exception java.io.IOException:java.lang.ClassCastException: 
 org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector 
 cannot be cast to 
 org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-17 Thread Anthony Hsu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13973615#comment-13973615
 ] 

Anthony Hsu commented on HIVE-6835:
---

What happens is Hive tries to build ObjectInspectorConverters from the 
partition schema to the table schema.  If the partition schema is different 
from the table schema, you may get a ClassCastException like above.

When you add new columns at the end, this is not a problem because these new 
columns are chopped off.  See ObjectInspectorConverters:StructConverter:
{code}
int minFields = Math.min(inputFields.size(), outputFields.size());
fieldConverters = new ArrayListConverter(minFields);
{code}
It's only when you insert new columns at the beginning or in the middle that 
you might run into ClassCastExceptions.

For the AvroSerDe, if it always uses the latest schema (which should be the 
table-level schema), Hive will not get confused when constructing its 
ObjectInspectorConverters.  Then, later, when the AvroSerDe actually goes to 
read the Avro files, it can compare the latest schema with the (possibly old) 
schemas stored in the Avro data files themselves, and do the proper schema 
resolution, omitting fields or substituting default values, following the 
[schema resolution 
rules|http://avro.apache.org/docs/current/spec.html#Schema+Resolution].

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch


 To reproduce:
 {code}
 create table testarray (a arraystring);
 load data local inpath '/home/ahsu/test/array.txt' into table testarray;
 # create partitioned Avro table with one array column
 create table avroarray partitioned by (y string) row format serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties 
 ('avro.schema.literal'='{namespace:test,name:avroarray,type: 
 record, fields: [ { name:a, type:{type:array,items:string} 
 } ] }')  STORED as INPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'  OUTPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';
 insert into table avroarray partition(y=1) select * from testarray;
 # add an int column with a default value of 0
 alter table avroarray set serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with 
 serdeproperties('avro.schema.literal'='{namespace:test,name:avroarray,type:
  record, fields: [ {name:intfield,type:int,default:0},{ 
 name:a, type:{type:array,items:string} } ] }');
 # fails with ClassCastException
 select * from avroarray;
 {code}
 The select * fails with:
 {code}
 Failed with exception java.io.IOException:java.lang.ClassCastException: 
 org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector 
 cannot be cast to 
 org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-17 Thread Anthony Hsu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13973654#comment-13973654
 ] 

Anthony Hsu commented on HIVE-6835:
---

On a side note: If you create an Avro table and store the schema in the 
TBLPROPERTIES -
{code}
CREATE TABLE ... TBLPROPERTIES ('avro.schema.literal'='...');
{code}
\- everything works fine with partitions because TBLPROPERTIES are NOT copied 
to the partition, so the partition will end using the TBLPROPERTIES for 
initializing the Avro SerDe.

It's only when you store the schema in the SERDEPROPERTIES -
{code}
CREATE TABLE ... WITH SERDEPROPERTIES ('avro.schema.literal'='...');
{code}
\- that problems arise.  SERDEPROPERTIES DO get copied to the partitions, so if 
you then end up changing the SERDEPROPERTIES stored at the table level, the 
SERDEPROPERTIES in the table and the partitions get out of sync and this 
sometimes leads to ClassCastExceptions with the AvroSerDe.

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch


 To reproduce:
 {code}
 create table testarray (a arraystring);
 load data local inpath '/home/ahsu/test/array.txt' into table testarray;
 # create partitioned Avro table with one array column
 create table avroarray partitioned by (y string) row format serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties 
 ('avro.schema.literal'='{namespace:test,name:avroarray,type: 
 record, fields: [ { name:a, type:{type:array,items:string} 
 } ] }')  STORED as INPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'  OUTPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';
 insert into table avroarray partition(y=1) select * from testarray;
 # add an int column with a default value of 0
 alter table avroarray set serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with 
 serdeproperties('avro.schema.literal'='{namespace:test,name:avroarray,type:
  record, fields: [ {name:intfield,type:int,default:0},{ 
 name:a, type:{type:array,items:string} } ] }');
 # fails with ClassCastException
 select * from avroarray;
 {code}
 The select * fails with:
 {code}
 Failed with exception java.io.IOException:java.lang.ClassCastException: 
 org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector 
 cannot be cast to 
 org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-17 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13973722#comment-13973722
 ] 

Xuefu Zhang commented on HIVE-6835:
---

[~erwaman] Thanks for the explanation. Now I see where the problem is. 
SERDEPROPERTIES and TBLPROPERTIES are for different purpose. I'm curious why 
user would put avro.schema.literal in the serde properties, as this is table 
specific and it should be put in TBLPROPERTIES. SERDEPROPERTIES, on the other 
hand, is used to control serde behavior (plugin level instead of table level), 
such as field delimiter which doesn't necessary vary from table to table. If 
you check AvroSerde documentation, schema is specified in TBLPROPERTIES. 
https://cwiki.apache.org/confluence/display/Hive/AvroSerDe. Thus, it seems that 
this fix is for an invalid use case. What's your thought on this?

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch


 To reproduce:
 {code}
 create table testarray (a arraystring);
 load data local inpath '/home/ahsu/test/array.txt' into table testarray;
 # create partitioned Avro table with one array column
 create table avroarray partitioned by (y string) row format serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties 
 ('avro.schema.literal'='{namespace:test,name:avroarray,type: 
 record, fields: [ { name:a, type:{type:array,items:string} 
 } ] }')  STORED as INPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'  OUTPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';
 insert into table avroarray partition(y=1) select * from testarray;
 # add an int column with a default value of 0
 alter table avroarray set serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with 
 serdeproperties('avro.schema.literal'='{namespace:test,name:avroarray,type:
  record, fields: [ {name:intfield,type:int,default:0},{ 
 name:a, type:{type:array,items:string} } ] }');
 # fails with ClassCastException
 select * from avroarray;
 {code}
 The select * fails with:
 {code}
 Failed with exception java.io.IOException:java.lang.ClassCastException: 
 org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector 
 cannot be cast to 
 org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-16 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13970596#comment-13970596
 ] 

Hive QA commented on HIVE-6835:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12640124/HIVE-6835.2.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5402 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_auto_sortmerge_join_16
org.apache.hive.service.cli.thrift.TestThriftHttpCLIService.testExecuteStatementAsync
{noformat}

Test results: http://bigtop01.cloudera.org:8080/job/precommit-hive/5/testReport
Console output: http://bigtop01.cloudera.org:8080/job/precommit-hive/5/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12640124

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch


 To reproduce:
 {code}
 create table testarray (a arraystring);
 load data local inpath '/home/ahsu/test/array.txt' into table testarray;
 # create partitioned Avro table with one array column
 create table avroarray partitioned by (y string) row format serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties 
 ('avro.schema.literal'='{namespace:test,name:avroarray,type: 
 record, fields: [ { name:a, type:{type:array,items:string} 
 } ] }')  STORED as INPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'  OUTPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';
 insert into table avroarray partition(y=1) select * from testarray;
 # add an int column with a default value of 0
 alter table avroarray set serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with 
 serdeproperties('avro.schema.literal'='{namespace:test,name:avroarray,type:
  record, fields: [ {name:intfield,type:int,default:0},{ 
 name:a, type:{type:array,items:string} } ] }');
 # fails with ClassCastException
 select * from avroarray;
 {code}
 The select * fails with:
 {code}
 Failed with exception java.io.IOException:java.lang.ClassCastException: 
 org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector 
 cannot be cast to 
 org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-16 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13972241#comment-13972241
 ] 

Carl Steinbach commented on HIVE-6835:
--

[~ashutoshc]: Thanks for catching the Thrift codegen problem.

[~erwaman]: Updated patch looks good. +1

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch


 To reproduce:
 {code}
 create table testarray (a arraystring);
 load data local inpath '/home/ahsu/test/array.txt' into table testarray;
 # create partitioned Avro table with one array column
 create table avroarray partitioned by (y string) row format serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties 
 ('avro.schema.literal'='{namespace:test,name:avroarray,type: 
 record, fields: [ { name:a, type:{type:array,items:string} 
 } ] }')  STORED as INPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'  OUTPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';
 insert into table avroarray partition(y=1) select * from testarray;
 # add an int column with a default value of 0
 alter table avroarray set serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with 
 serdeproperties('avro.schema.literal'='{namespace:test,name:avroarray,type:
  record, fields: [ {name:intfield,type:int,default:0},{ 
 name:a, type:{type:array,items:string} } ] }');
 # fails with ClassCastException
 select * from avroarray;
 {code}
 The select * fails with:
 {code}
 Failed with exception java.io.IOException:java.lang.ClassCastException: 
 org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector 
 cannot be cast to 
 org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-15 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13969281#comment-13969281
 ] 

Carl Steinbach commented on HIVE-6835:
--

+1

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch


 To reproduce:
 {code}
 create table testarray (a arraystring);
 load data local inpath '/home/ahsu/test/array.txt' into table testarray;
 # create partitioned Avro table with one array column
 create table avroarray partitioned by (y string) row format serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties 
 ('avro.schema.literal'='{namespace:test,name:avroarray,type: 
 record, fields: [ { name:a, type:{type:array,items:string} 
 } ] }')  STORED as INPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'  OUTPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';
 insert into table avroarray partition(y=1) select * from testarray;
 # add an int column with a default value of 0
 alter table avroarray set serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with 
 serdeproperties('avro.schema.literal'='{namespace:test,name:avroarray,type:
  record, fields: [ {name:intfield,type:int,default:0},{ 
 name:a, type:{type:array,items:string} } ] }');
 # fails with ClassCastException
 select * from avroarray;
 {code}
 The select * fails with:
 {code}
 Failed with exception java.io.IOException:java.lang.ClassCastException: 
 org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector 
 cannot be cast to 
 org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13962471#comment-13962471
 ] 

Hive QA commented on HIVE-6835:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12639043/HIVE-6835.1.patch

{color:green}SUCCESS:{color} +1 5550 tests passed

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2167/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2167/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12639043

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch


 To reproduce:
 {code}
 create table testarray (a arraystring);
 load data local inpath '/home/ahsu/test/array.txt' into table testarray;
 # create partitioned Avro table with one array column
 create table avroarray partitioned by (y string) row format serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties 
 ('avro.schema.literal'='{namespace:test,name:avroarray,type: 
 record, fields: [ { name:a, type:{type:array,items:string} 
 } ] }')  STORED as INPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'  OUTPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';
 insert into table avroarray partition(y=1) select * from testarray;
 # add an int column with a default value of 0
 alter table avroarray set serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with 
 serdeproperties('avro.schema.literal'='{namespace:test,name:avroarray,type:
  record, fields: [ {name:intfield,type:int,default:0},{ 
 name:a, type:{type:array,items:string} } ] }');
 # fails with ClassCastException
 select * from avroarray;
 {code}
 The select * fails with:
 {code}
 Failed with exception java.io.IOException:java.lang.ClassCastException: 
 org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector 
 cannot be cast to 
 org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-03 Thread Anthony Hsu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13958989#comment-13958989
 ] 

Anthony Hsu commented on HIVE-6835:
---

Right now, when AvroSerDe.initialize() is called, the Properties it is passed 
include both table and partition properties, with the partition properties 
*overriding* the table properties.  The AvroSerDe needs the *latest* schema 
(which should be stored in the table properties) for proper initialization and 
to prevent the ClassCastException.  My proposal is to pass both the table and 
partition properties to SerDe.initialize() by prepending the table properties 
with table., and let the SerDe decide which set of properties to use.

BTW, here's the full stack trace when you do the select *:
{code}
Failed with exception java.io.IOException:java.lang.ClassCastException: 
org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector 
cannot be cast to 
org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
14/04/03 10:11:02 ERROR CliDriver: Failed with exception 
java.io.IOException:java.lang.ClassCastException: 
org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector 
cannot be cast to 
org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
java.io.IOException: java.lang.ClassCastException: 
org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector 
cannot be cast to 
org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
at 
org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:551)
at 
org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:489)
at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:136)
at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1471)
at 
org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:272)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:217)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:414)
at 
org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:782)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:676)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:615)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
Caused by: java.lang.ClassCastException: 
org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector 
cannot be cast to 
org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
at 
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters.getConverter(ObjectInspectorConverters.java:148)
at 
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters$StructConverter.init(ObjectInspectorConverters.java:304)
at 
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters.getConverter(ObjectInspectorConverters.java:150)
at 
org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:407)
at 
org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:515)
... 14 more
{code}

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu

 To reproduce:
 {code}
 create table testarray (a arraystring);
 load data local inpath '/home/ahsu/test/array.txt' into table testarray;
 # create partitioned Avro table with one array column
 create table avroarray (a arraystring) partitioned by (y string) row format 
 serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties 
 ('avro.schema.literal'='{namespace:test,name:avroarray,type: 
 record, fields: [ { name:a, type:{type:array,items:string} 
 } ] }')  STORED as INPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'  OUTPUTFORMAT  
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';
 insert into table avroarray partition(y=1) select * from testarray;
 # add an int column with a default value of 0
 alter table avroarray set serde 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with 
 serdeproperties('avro.schema.literal'='{namespace:test,name:avroarray,type:
  record, fields: [