Hi Yong,

I followed your second suggestion. My data format is nested (a list of maps), so I created the .avsc below.

{
  "namespace": "test.avro",
  "type": "record",
  "name": "Session",
  "fields": [
    {"name": "VisitCommon", "type": {"type": "map", "values": "string"}},
    {"name": "events", "type": {"type": "array", "items": {"name": "Event", "type": "map", "values": "string"}}}
  ]
}

I then tried generating the corresponding classes, both with the Avro tool and with the plugin, but the generated Java code has a few errors. What could be the issue?

1) Error: The method deepCopy(Schema, List<Map<CharSequence,CharSequence>>) is undefined for the type GenericData

2) I also noticed some deprecated code in the generated class:

   @Deprecated public java.util.Map<java.lang.CharSequence,java.lang.CharSequence> VisitCommon;

I used the Eclipse plugin as described here:
http://avro.apache.org/docs/1.7.6/mr.html

Thanks & Regards,
B Anil Kumar.

On Fri, Jan 31, 2014 at 8:27 AM, AnilKumar B <akumarb2...@gmail.com> wrote:

> Thanks Yong.
>
> Thanks & Regards,
> B Anil Kumar.
>
>
> On Fri, Jan 31, 2014 at 12:44 AM, java8964 <java8...@hotmail.com> wrote:
>
>> In avro, you need to think about a schema to match your data. Avro's
>> schema is very flexible and should be able to store all kinds of data.
>>
>> If you have a Json string, you have 2 options to generate the Avro schema
>> for it:
>>
>> 1) Use "type: string" to store the whole Json string into Avro. This will
>> be easiest, but you have to parse the data later when you use it.
>> 2) Use Avro schema to match your json data, using matching structure from
>> avro for your data, like 'record, array, map' etc.
>>
>> Yong
>>
>> ------------------------------
>> Date: Fri, 31 Jan 2014 00:13:59 +0530
>> Subject: shifting sequenceFileOutput format to Avro format
>> From: akumarb2...@gmail.com
>> To: user@hadoop.apache.org
>>
>>
>> Hi,
>>
>> As of now in my jobs, I am using SequenceFileOutputFormat and I am
>> emitting custom java objects as MR output.
>>
>> Now I am planning to emit it in avro format, I went through few blogs
>> but still have following doubts.
>>
>> 1) My current custom Writable objects have a nested json format as
>> toString(), so when I shift to avro format, should I just emit the json string
>> in avro format, instead of the custom Writable object?
>>
>> 2) If so, how can I create the schema? My json string is nested and will have
>> random key/value pairs.
>>
>> 3) Or can I still emit custom objects?
>>
>>
>>
>> Thanks & Regards,
>> B Anil Kumar.
>>
>
>
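
For reference, here is a rough sketch of the data shape the Session schema describes: one map with string values, plus a list of such maps. It uses plain Java collections only and does not touch the Avro API. The class name SessionShape and the helper toStringMap are hypothetical, made up for illustration. The sketch assumes that the "random key/value pairs" from the original question can be stringified, since the schema declares "values": "string".

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Avro: mirrors the shape of the Session
// schema (one map<string> plus an array of map<string>) with plain collections.
final class SessionShape {
    final Map<String, String> visitCommon;
    final List<Map<String, String>> events;

    SessionShape(Map<String, ?> visitCommon, List<? extends Map<String, ?>> events) {
        this.visitCommon = toStringMap(visitCommon);
        this.events = new ArrayList<>();
        for (Map<String, ?> event : events) {
            this.events.add(toStringMap(event));
        }
    }

    // The schema declares "values": "string", so every value is converted to
    // its string form. Null values are skipped (an assumption of this sketch),
    // because a plain "string" type does not admit null.
    static Map<String, String> toStringMap(Map<String, ?> raw) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, ?> e : raw.entrySet()) {
            if (e.getValue() != null) {
                out.put(e.getKey(), String.valueOf(e.getValue()));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> common = new LinkedHashMap<>();
        common.put("visitId", 42);
        common.put("referrer", null); // dropped by toStringMap

        Map<String, Object> click = new LinkedHashMap<>();
        click.put("type", "click");
        click.put("x", 10.5);

        List<Map<String, Object>> events = new ArrayList<>();
        events.add(click);

        SessionShape session = new SessionShape(common, events);
        System.out.println(session.visitCommon + " " + session.events);
    }
}
```

Normalising values this way before handing them to an Avro writer keeps the data consistent with a map-of-string schema. If some keys need to keep their numeric or nested types, the schema itself would have to change (for example to a union), which this sketch does not cover.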