How to convert .avpr/.avdl file to .avsc file.
Hi All,
In my application I receive events in Avro-serialized format. The data is serialized in Java using a schema defined in an .avdl file, and I have to parse those events in Hive. According to the web tutorial, Hive understands the .avsc format:
https://cwiki.apache.org/Hive/avroserde-working-with-avro-from-hive.html
Is there any way to convert .avpr to .avsc? Alternatively, can I use .avpr/.avdl directly in Hive? Please provide an example.
Thanks in advance.
Sourabh
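One possible way to do the conversion, as a rough sketch rather than a definitive answer: an .avpr file is a JSON protocol document whose "types" array holds the named schemas, so each entry can be written out as its own .avsc file. The function names and the one-file-per-type layout below are illustrative choices, not part of Avro. The sketch only copies the protocol-level namespace onto each type; a type that refers to another named type from the same protocol would still need that definition inlined by hand. For an .avdl file, the `idl` command in the avro-tools jar can, as far as I know, produce the .avpr first.

```python
import json
from pathlib import Path


def protocol_to_schemas(avpr_text):
    """Return {type name: schema dict} for each named type in an .avpr JSON document."""
    protocol = json.loads(avpr_text)
    namespace = protocol.get("namespace")
    schemas = {}
    for type_def in protocol.get("types", []):
        schema = dict(type_def)  # shallow copy so the parsed protocol is not modified
        # Types inherit the protocol namespace unless they declare their own
        # or already use a dotted full name.
        if namespace and "namespace" not in schema and "." not in schema["name"]:
            schema["namespace"] = namespace
        schemas[schema["name"]] = schema
    return schemas


def write_avsc_files(avpr_path, out_dir):
    """Write one <TypeName>.avsc file per named type; return the written paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for name, schema in protocol_to_schemas(Path(avpr_path).read_text()).items():
        target = out / f"{name}.avsc"
        target.write_text(json.dumps(schema, indent=2))
        written.append(target)
    return written
```

Calling `write_avsc_files("events.avpr", "schemas")` (both names hypothetical) would leave one .avsc per record, enum, or fixed type in the `schemas` directory, and any of those could then be referenced from the table's schema.url property.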
Re: How to convert .avpr/.avdl file to .avsc file.
Hi All,
In my application I receive Avro events that I have to process in Hive. I created a Hive table from the Avro schema, but I am not able to load the Avro events into that table (which was created from the same schema).
I am using:
    load data inpath '/user/test/xyz.avro' into table xyz;
When I then execute:
    select * from xyz;
it fails with:
    Failed with exception java.io.IOException:java.io.IOException: Not a data file.
Please advise what I should do.
Thanks in advance.
Sourabh
How to load avro data into hive
Hi All,
In my application I receive Avro events that I have to process in Hive. I created a Hive table from the Avro schema, but I am not able to load the Avro events into that table (which was created from the same schema).
I am using:
    load data inpath '/user/test/xyz.avro' into table xyz;
When I then execute:
    select * from xyz;
it fails with:
    Failed with exception java.io.IOException:java.io.IOException: Not a data file.
Create table script:
    CREATE TABLE xyz
    ROW FORMAT SERDE 'com.linkedin.haivvreo.AvroSerDe'
    WITH SERDEPROPERTIES ('schema.url'='file:/home/test/tmp/xyz.avsc')
    STORED AS
    INPUTFORMAT 'com.linkedin.haivvreo.AvroContainerInputFormat'
    OUTPUTFORMAT 'com.linkedin.haivvreo.AvroContainerOutputFormat';
Please advise what I should do.
Thanks in advance.
Sourabh
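A quick diagnostic that might help here, offered as a sketch under assumptions: as far as I can tell, "Not a data file" comes from Avro's data file reader when the input does not begin with the four magic bytes of an Avro object container file (`Obj` followed by the byte 1). That would be the case if the events were written as bare binary-encoded records rather than through a container file writer. The check below assumes the file has first been copied out of HDFS to a local path; the function names are illustrative.

```python
AVRO_MAGIC = b"Obj\x01"  # first four bytes of every Avro object container file


def has_avro_magic(header: bytes) -> bool:
    """True if the given leading bytes start with the Avro container magic."""
    return header[:4] == AVRO_MAGIC


def is_avro_container_file(path: str) -> bool:
    """Read the first four bytes of a local file and compare them to the magic."""
    with open(path, "rb") as f:
        return has_avro_magic(f.read(4))
```

If `is_avro_container_file("xyz.avro")` returns False for a local copy of the file, the data would need to be rewritten as a container file (with the schema embedded in its header) before a container input format can read it.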
How to process different types of avro schema
Hi All,
In my application I receive a stream of Avro events. The stream contains different types of events, each belonging to a different schema, and I am wondering what the right way is to process this data and run analytics on top of it. Can I use Hive?
I have looked at the Avro SerDe, which can decode Avro data, and I’m thinking I need to transform the input stream into (multiple) entries belonging to different tables. For this I’m considering a mapper job that extracts the events type by type, so that we can then use Hive on top of the separate schemas. Has anyone dealt with such a scenario before, and would this approach work with decent performance?
The alternative is to implement all the analytics logic we need directly in M-R code.
Please advise.
Thanks in advance.
Sourabh
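The "extract the events type by type" idea can be sketched in a few lines, purely as an illustration of the grouping step and not as a MapReduce job. It assumes the events have already been decoded into dicts and that some caller-supplied function can tell which schema an event belongs to; the `schema` key used in the example is hypothetical.

```python
from collections import defaultdict


def split_by_type(events, type_of):
    """Group decoded events by schema name so each group can back its own Hive table."""
    groups = defaultdict(list)
    for event in events:
        groups[type_of(event)].append(event)
    return dict(groups)


# Example with made-up events that carry their schema name in a "schema" key.
events = [
    {"schema": "Click", "id": 1},
    {"schema": "Purchase", "id": 2},
    {"schema": "Click", "id": 3},
]
by_type = split_by_type(events, lambda e: e["schema"])
```

In a real job the same routing would happen inside the mapper, with each group written to its own output directory as Avro container files and one Hive table defined per directory and schema.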