How to convert .avpr/.avdl file to .avsc file.

2013-03-20 Thread sourabh chaki
Hi All,

In my application I am receiving events in Avro serialized format. This data
is serialized in Java using an .avdl file.

I have to parse those events in Hive. According to the tutorial below, Hive
understands the .avsc format.

https://cwiki.apache.org/Hive/avroserde-working-with-avro-from-hive.html

Is there any way to convert .avpr to .avsc?

Alternatively, can I use .avpr/.avdl directly in Hive? Please provide an example.

Thanks in advance.

Sourabh
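
One possible route, sketched under assumptions rather than as a definitive answer: an .avpr file is a JSON protocol document whose "types" array holds the named schemas, so each entry can be written out as its own .avsc file. An .avdl file would first have to be compiled to JSON; the avro-tools jar ships `idl` and `idl2schemata` commands for that. The function name `split_avpr` and the sample protocol below are hypothetical. Known limitation: a type that refers to another named type only by name is copied as-is, so the resulting .avsc may not be self-contained.

```python
import json
import os

# Hypothetical sample protocol, standing in for the contents of a real .avpr file.
SAMPLE_AVPR = """
{
  "protocol": "Events",
  "namespace": "com.example",
  "types": [
    {"type": "record", "name": "Click",
     "fields": [{"name": "url", "type": "string"}]},
    {"type": "enum", "name": "Kind", "symbols": ["A", "B"]}
  ],
  "messages": {}
}
"""


def split_avpr(avpr_text, out_dir):
    """Write each named type of a JSON protocol to <out_dir>/<name>.avsc.

    Returns the list of file paths written, in protocol order.
    """
    protocol = json.loads(avpr_text)
    default_ns = protocol.get("namespace")
    written = []
    for schema in protocol.get("types", []):
        schema = dict(schema)  # shallow copy so the parsed protocol is untouched
        # A type without its own namespace inherits the protocol's namespace.
        if default_ns and "namespace" not in schema and "." not in schema["name"]:
            schema["namespace"] = default_ns
        path = os.path.join(out_dir, schema["name"] + ".avsc")
        with open(path, "w") as f:
            json.dump(schema, f, indent=2)
        written.append(path)
    return written
```

Calling `split_avpr(open("events.avpr").read(), "schemas")` (both names are placeholders, and the output directory must already exist) would leave one .avsc per type in that directory, and the table's schema URL can then point at the record needed.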


Re: How to convert .avpr/.avdl file to .avsc file.

2013-03-20 Thread sourabh chaki
Hi All,

In my application I am receiving Avro events that I have to process in
Hive. I created a Hive table from the Avro schema, but I am not able to
load the events into that table (which was created from the same schema).

I am using: load data inpath '/user/test/xyz.avro' into table xyz;

When I execute: select * from xyz;
Failed with exception java.io.IOException:java.io.IOException: Not a data
file.

Please advise what I should do.

Thanks in advance.

Sourabh
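
A note on the error, offered as a likely cause rather than a confirmed diagnosis: Avro's file reader reports "Not a data file" when the input does not begin with the object container file magic, the three bytes `Obj` followed by 0x01. That typically means the events are raw binary-encoded records, not container files written with a schema header. A minimal sketch for checking a local copy of the file follows; `is_avro_container` is a hypothetical helper name.

```python
# Magic bytes that open every Avro object container file: 'O', 'b', 'j', 0x01.
AVRO_MAGIC = b"Obj\x01"


def is_avro_container(path):
    """Return True if the file at `path` starts with the Avro container magic."""
    with open(path, "rb") as f:
        return f.read(len(AVRO_MAGIC)) == AVRO_MAGIC
```

If `is_avro_container("xyz.avro")` returns False for a copy fetched from HDFS, the events would need to be rewritten as container files before the table can read them.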


How to load avro data to hive

2013-03-20 Thread sourabh chaki
Hi All,


 In my application I am receiving Avro events that I have to process in
 Hive. I created a Hive table from the Avro schema, but I am not able to
 load the events into that table (which was created from the same schema).

 I am using: load data inpath '/user/test/xyz.avro' into table xyz;

 When I execute: select * from xyz;
 Failed with exception java.io.IOException:java.io.IOException: Not a data
 file.


Create table script:
CREATE TABLE xyz
  ROW FORMAT SERDE
  'com.linkedin.haivvreo.AvroSerDe'
  WITH SERDEPROPERTIES (
    'schema.url'='file:/home/test/tmp/xyz.avsc')
  STORED AS INPUTFORMAT
  'com.linkedin.haivvreo.AvroContainerInputFormat'
  OUTPUTFORMAT
  'com.linkedin.haivvreo.AvroContainerOutputFormat';


 Please advise what I should do.

 Thanks in advance.

 Sourabh



How to process different types of avro schema

2013-03-18 Thread sourabh chaki
Hi All,

In my application I am receiving a stream of Avro events. The stream
contains different types of events belonging to different schemas. I am
wondering what the right way is to process this data and run analytics
on top of it. Can I use Hive? I have studied the Avro SerDe that can be
used to decode Avro data, and I think I need to transform the input
stream into (multiple) sets of entries belonging to different tables. For
this I’m considering a mapper job that would extract the events type by
type, so that we could then use Hive on top of the separate schemas. Has
anyone dealt with such a scenario before, and would this approach work
with decent performance?

The alternative is to put all the logic for the analytics we want to run on
this data into M-R code. Please advise.

Thanks in advance.

Sourabh
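
The split step described above could look roughly like the following, as a sketch only. It assumes each event has already been decoded into a dict and carries its schema's full name under a hypothetical key `schema_name`; in a real job the mapper would take that name from the record's schema and write each group to its own output directory, one per Hive table. The function name `split_by_schema` and the sample events are illustrative.

```python
from collections import defaultdict


def split_by_schema(events, key="schema_name"):
    """Group decoded events by schema name, preserving arrival order per group."""
    groups = defaultdict(list)
    for event in events:
        groups[event[key]].append(event)
    return dict(groups)


# Hypothetical mixed stream with two event types.
SAMPLE_EVENTS = [
    {"schema_name": "com.example.Click", "url": "/home"},
    {"schema_name": "com.example.Login", "user": "a"},
    {"schema_name": "com.example.Click", "url": "/cart"},
]
```

Each key of the returned dict then corresponds to one table, so events of different schemas never share an input path.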