It is a common case, since the Event listed here is the generic avro_event 
object used when serializing to HDFS.

We had someone simply change Event from body[byte[]] to body[String] when 
serializing, which has the unfortunate side effect of altering the data if it 
is not valid UTF-8.
It did, however, solve the Hive issue quickly.
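
To illustrate the side effect described above, here is a minimal sketch in plain Java. It assumes the String body is produced by decoding the bytes as UTF-8 and later re-encoded the same way; the class and method names are hypothetical and are not part of Flume or Hive.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class Utf8RoundTrip {
    // Decode the body as UTF-8 text and encode it back to bytes,
    // which is roughly what happens when a byte[] body is stored as a String.
    static byte[] roundTrip(byte[] body) {
        String asText = new String(body, StandardCharsets.UTF_8);
        return asText.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] ascii = "hello".getBytes(StandardCharsets.US_ASCII);
        // 0xFF and 0xFE are never valid in UTF-8, so each is replaced by U+FFFD.
        byte[] binary = {(byte) 0xFF, (byte) 0xFE, 0x41};

        System.out.println(Arrays.equals(ascii, roundTrip(ascii)));   // prints true
        System.out.println(Arrays.equals(binary, roundTrip(binary))); // prints false
    }
}
```

Text bodies survive the round trip, but arbitrary binary payloads do not, because every malformed byte sequence is replaced during decoding.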


-Charles

On Nov 14, 2013, at 10:22 AM, Nitin Pawar wrote:

Concat support is there, but only for string datatypes, not for tinyints. Not 
sure it's such a common use case.
If you want to build it, you can contribute it back to Hive.


On Thu, Nov 14, 2013 at 11:48 PM, Deepak Subhramanian wrote:
Thanks Nitin. A UDF is a good solution. I was wondering whether Hive had 
built-in support, since it is the default Flume format for the Flume Avro sink.

Thanks, Deepak


On Wed, Nov 13, 2013 at 1:15 PM, Nitin Pawar wrote:
Sorry, hit send too soon.

Correction: rather than just changing your table definition.


On Wed, Nov 13, 2013 at 6:45 PM, Nitin Pawar wrote:
Not really sure there is a direct way to concat anything other than strings in 
Hive, unless you typecast them to string.

So you may want to keep the datatype of the array elements as string and try 
that. Otherwise you may want to build your own UDF to do it, which looks like a 
more elegant way than just typecasting it.
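
As a rough sketch of the conversion such a UDF would perform, the plain Java below turns a list of signed bytes into a string. It assumes the array<tinyint> column reaches the code as a List<Byte> and that the body is UTF-8 text; the class and method names are hypothetical, and the Hive UDF wrapper itself is left out.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;

final class TinyintArrayToString {
    // Core of a possible UDF: copy the signed bytes into a byte[] and decode them.
    // Assumption: the body holds UTF-8 text; other encodings need a different charset.
    static String decode(List<Byte> body) {
        if (body == null) {
            return null; // keep SQL NULL semantics
        }
        byte[] raw = new byte[body.size()];
        for (int i = 0; i < raw.length; i++) {
            Byte b = body.get(i);
            raw[i] = (b == null) ? 0 : b; // a null element becomes a zero byte
        }
        return new String(raw, StandardCharsets.UTF_8);
    }
}
```

Decoding the whole array at once, rather than casting and concatenating element by element, keeps multi-byte characters intact.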


On Wed, Nov 13, 2013 at 5:18 PM, Deepak Subhramanian wrote:


Hi,

Has anyone tried reading the default Avro output from Flume in Hive?

I am using Flume to generate events in the default Flume Avro output format. 
Bytes in the Avro schema are stored as array<tinyint> in Hive when I use the 
AvroSerDe. How do I convert array<tinyint> to string to read the Flume body 
data? I am using Hive version 0.10.

CREATE EXTERNAL TABLE flume_avro_test
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS
INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
LOCATION '/testlogs/2013/11/08/17'
TBLPROPERTIES ('avro.schema.literal'='{"type":"record","name":"Event","fields":[{"name":"headers","type":{"type":"map","values":"string"}},{"name":"body","type":"bytes"}]}');


describe flume_avro_test;
OK
headers map<string,string> from deserializer
body array<tinyint> from deserializer

Thanks,
Deepak Subhramanian


