Tom Snee created HIVE-9312:
------------------------------
Summary: Literal string "\n" confuses Avro SerDe
Key: HIVE-9312
URL: https://issues.apache.org/jira/browse/HIVE-9312
Project: Hive
Issue Type: Bug
Components: Serializers/Deserializers
Affects Versions: 0.13.0
Environment: Hortonworks Data Platform 2.1.2.1 on CentOS 6.5
Reporter: Tom Snee
Avro files with string fields that contain a literal backslash followed by 'n'
(the two characters, not a newline) confuse the Avro SerDe: a query with a
WHERE clause returns garbled rows.
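
To make the distinction concrete, here is a minimal Python sketch of a string that holds a literal backslash-n rather than a newline. The field name "note" and its value are hypothetical and are not taken from the attached nested.avsc or example.json. The final step only illustrates how an unescaping step somewhere in a pipeline could turn one logical line into several; this report does not establish that Hive does this.

```python
import json

# Hypothetical record: the field name "note" and its value are assumptions,
# not the contents of the attached example.json.
raw_json = r'{"note": "line one\\nstill line one"}'

record = json.loads(raw_json)
value = record["note"]

# The decoded string holds a backslash and an 'n', not a newline character.
has_literal_backslash_n = "\\n" in value
has_real_newline = "\n" in value
print(has_literal_backslash_n, has_real_newline)

# Treated correctly, the value is a single line.
print(len(value.splitlines()))

# Illustration only: if some layer replaced the two-character sequence with a
# real newline, the same value would split into two lines.
unescaped = value.replace("\\n", "\n")
print(len(unescaped.splitlines()))
```

A check like this against the real example.json could confirm which fields carry the literal sequence before the file is converted to Avro.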
Steps to recreate:
1. Put attached schema nested.avsc into HDFS under /user/someone.
2. Convert attached JSON file example.json into Avro with avro-tools, like so:
java -jar avro-tools-1.7.7.jar fromjson --schema-file nested.avsc example.json > example.avro
3. Put example.avro into HDFS under /user/someone/avro-files.
4. Create a Hive table with this statement:
CREATE EXTERNAL TABLE avro_table
ROW FORMAT SERDE
'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
LOCATION
'/user/someone/avro-files/'
TBLPROPERTIES (
'avro.schema.url'='hdfs:///user/someone/nested.avsc'
);
5. Observe that "select * from avro_table;" returns one row, as expected.
6. Observe that "select * from avro_table where
mastersubjectnumber='A12B3CDE-FGH4-5I67-89J0-KLMN1OPQ23R4';" returns 13 garbled
rows, although the table holds only one row.