I created a Parquet file and exposed it to Hive through an external table, but selects from the table always return NULL.
To show the symptom, I created the following data set. Each record has
only two fields, __PRIMARY_KEY__ and nullableInt. The schema, represented
in Avro, is the following (I converted the data to Parquet through the
avro-parquet converter):
{"type":"record","name":"mytest","namespace":"yy.com","doc":"","fields":[{"name":"__PRIMARY_KEY__","type":"string","doc":""},{"name":"nullableInt","type":["int","null"],"doc":""}],"version":"1424373511441"}
The following is the Parquet Hive table definition. I have also attached
the sample Parquet file.
Thanks!
yang
drop table mytest;
CREATE EXTERNAL TABLE IF NOT EXISTS mytest
(
PRIMARY_KEY String,
nullableInt int
)
STORED AS PARQUET
LOCATION '/user/myusername/camus/topics/mytest/hourly/2015/02/19/11/'
;
select * from mytest limit 10;
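One thing that may be worth checking is whether the Hive column names line up with the Parquet field names: the table above declares PRIMARY_KEY, while the field in the schema is __PRIMARY_KEY__. Below is a minimal sketch of a definition that copies the field names verbatim from the Avro schema. It assumes a Hive version that accepts backquoted column identifiers, and the table name mytest_exact is hypothetical, chosen only so the original table stays untouched. Whether this changes the NULL results is not confirmed here.

```sql
-- Hypothetical comparison table; mytest_exact is not part of the original setup.
DROP TABLE IF EXISTS mytest_exact;

-- Column names are copied verbatim from the Avro schema.
-- Backquotes are used because the first identifier starts with underscores
-- (assumes quoted column identifiers are enabled in this Hive version).
CREATE EXTERNAL TABLE IF NOT EXISTS mytest_exact
(
  `__PRIMARY_KEY__` STRING,
  nullableInt INT
)
STORED AS PARQUET
LOCATION '/user/myusername/camus/topics/mytest/hourly/2015/02/19/11/'
;

-- Show the column names as Hive actually stores them.
DESCRIBE mytest_exact;

SELECT * FROM mytest_exact LIMIT 10;
```

Because the table is EXTERNAL, dropping it only removes the metadata and leaves the Parquet files under the location in place.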
Attachment: mytest.1.0.4.8.1424372400000.parquet (binary data)
