[ https://issues.apache.org/jira/browse/HIVE-16370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alice Fan reassigned HIVE-16370:
--------------------------------

    Assignee:     (was: Alice Fan)

> Avro data type null not supported on partitioned tables
> -------------------------------------------------------
>
>                 Key: HIVE-16370
>                 URL: https://issues.apache.org/jira/browse/HIVE-16370
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 1.1.0, 2.1.1
>            Reporter: rui miranda
>            Priority: Minor
>         Attachments: HIVE-16370.01-branch-1.patch, HIVE-16370.2.patch
>
>
> I was attempting to create hive tables over some partitioned Avro files. It
> seems the void data type (Avro null) is not supported on partitioned tables
> (i could not replicate the bug on an un-partitioned table).
> ---------------
> i managed to replicate the bug on two different hive versions.
> Hive 1.1.0-cdh5.10.0
> Hive 2.1.1-amzn-0
> ----------------
> how to replicate (avro tools are required to create the avro files):
> $ wget http://mirror.serversupportforum.de/apache/avro/avro-1.8.1/java/avro-tools-1.8.1.jar
> $ mkdir /tmp/avro
> $ mkdir /tmp/avro/null
> $ echo "{ \
> \"type\" : \"record\", \
> \"name\" : \"null_failure\", \
> \"namespace\" : \"org.apache.avro.null_failure\", \
> \"doc\":\"the purpose of this schema is to replicate the hive avro null failure\", \
> \"fields\" : [{\"name\":\"one\", \"type\":\"null\",\"default\":null}] \
> } " > /tmp/avro/null/schema.avsc
> $ echo "{\"one\":null}" > /tmp/avro/null/data.json
> $ java -jar avro-tools-1.8.1.jar fromjson --schema-file /tmp/avro/null/schema.avsc /tmp/avro/null/data.json > /tmp/avro/null/data.avro
> $ hdfs dfs -mkdir /tmp/avro
> $ hdfs dfs -mkdir /tmp/avro/null
> $ hdfs dfs -mkdir /tmp/avro/null/schema
> $ hdfs dfs -mkdir /tmp/avro/null/data
> $ hdfs dfs -mkdir /tmp/avro/null/data/foo=bar
> $ hdfs dfs -copyFromLocal /tmp/avro/null/schema.avsc /tmp/avro/null/schema/schema.avsc
> $ hdfs dfs -copyFromLocal /tmp/avro/null/data.avro /tmp/avro/null/data/foo=bar/data.avro
> $ hive
> hive> CREATE EXTERNAL TABLE avro_null
> PARTITIONED BY (foo string)
> ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
> STORED as INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
> OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
> LOCATION '/tmp/avro/null/data/'
> TBLPROPERTIES (
> 'avro.schema.url'='/tmp/avro/null/schema/schema.avsc')
> ;
> OK
> Time taken: 3.127 seconds
> hive> msck repair table avro_null;
> OK
> Partitions not in metastore: avro_null:foo=bar
> Repair: Added partition to metastore avro_null:foo=bar
> Time taken: 0.712 seconds, Fetched: 2 row(s)
> hive> select * from avro_null;
> FAILED: RuntimeException org.apache.hadoop.hive.ql.metadata.HiveException:
> Failed with exception Hive internal error inside
> isAssignableFromSettablePrimitiveOI void not supported
> yet.java.lang.RuntimeException: Hive internal error inside
> isAssignableFromSettablePrimitiveOI void not supported yet.
> hive> select foo, count(1) from avro_null group by foo;
> OK
> bar 1
> Time taken: 29.806 seconds, Fetched: 1 row(s)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)