Hi,
You can create a record that has the integer and the original record as its
fields, and use that as the record for the job. Your schema would look
something like the following.
{
  "name" : "augmentedRecord",
  "type" : "record",
  "fields" : [{
    "name" : "index",
    "type" : "int"
  },{
    "name" : "actualRecord",
    "type" : <<your original schema>>
  }]
}
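Since your Record schema is only read in at run time, one option is to splice
its JSON text into the wrapper before parsing. Below is a rough sketch of that
idea, not something tested against your job. The class and method names are
made up for illustration, and it assumes you already have the original schema
as a JSON string.

```java
// Hypothetical helper (not part of Avro): builds the wrapper schema as JSON
// text by embedding the original schema as the type of the second field.
final class AugmentedSchema {

    // originalSchemaJson is assumed to be the complete JSON text of the
    // existing Record schema, e.g. read from a file or the job configuration.
    static String wrap(String originalSchemaJson) {
        return "{"
            + "\"name\": \"augmentedRecord\", "
            + "\"type\": \"record\", "
            + "\"fields\": ["
            + "{\"name\": \"index\", \"type\": \"int\"}, "
            + "{\"name\": \"actualRecord\", \"type\": " + originalSchemaJson + "}"
            + "]}";
    }
}
```

The resulting string can then be handed to Avro's schema parser (for example
Schema.parse in the Java API) in the same way as a schema read from a file.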
- Sudhan S
On Tue, May 17, 2011 at 3:40 AM, W.P. McNeill <[email protected]> wrote:
> I am writing a Hadoop application whose values are objects called Records
> which are serialized using Avro. (I specify a Serialization class for the
> Records via the io.serializations property.)
>
> I now need to expand my application so that instead of just a Record I need
> to have a more complicated data structure, call it an Augmented Record. Say
> that an Augmented Record contains integer N in addition to the record, so
> now the value looks like (N, Record). Adding an integer field to the Record
> schema just to support this one Hadoop process would be a hack, but I also
> can't create a Writable (WritableInt, Record) object because Record uses its
> own Avro serialization scheme and so is not Writable. What I want to do is
> basically create a new schema of the form [Integer: N, Record: R], where the
> Record schema is read in dynamically. Can I dynamically nest schema in this
> manner? If not, what is the best approach to serializing an Augmented
> Record?
>
> Thanks.
>
>