[
https://issues.apache.org/jira/browse/AVRO-1699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Paul Mazak updated AVRO-1699:
-----------------------------
Status: Patch Available (was: Open)
> AutoMap field values between Avro objects with different schemas
> ----------------------------------------------------------------
>
> Key: AVRO-1699
> URL: https://issues.apache.org/jira/browse/AVRO-1699
> Project: Avro
> Issue Type: New Feature
> Components: java
> Affects Versions: 1.7.6
> Reporter: Paul Mazak
> Attachments: AVRO-1699.patch
>
>
> There are a few use cases for this:
> *Various Avro input data to one common output*
> You want to pick up Avro files in different schemas and normalize them into one.
> You might wish to transform to the superset of the input schemas.
> *Aggregating Raw Data*
> You want to rewrite data grouped by some fields and aggregated. The output
> Avro in this case would be a subset of the input Avro, where at least the
> group-by fields are in both the input and output schemas.
> *Alternate Views*
> You have Avro data that you want to trim in different ways to create subsets
> that would be useful for views in Hive or exports for SQL tables.
> *Schema Migration*
> You've added fields to a schema and you are storing data in both the old and
> new schemas. You can't process Avro data in the old schema together with
> Avro data in the new schema (using Pig or Java MapReduce). AutoMapping would
> up-convert your old data by setting the newly added fields to null, so that
> all data ends up in the new schema. This was
> [asked about|http://stackoverflow.com/questions/27131942/is-it-possible-to-retrieve-schema-from-avro-data-and-use-them-in-mapreduce]
> on Stack Overflow.
> _Considerations:_
> * Loop over the source schema fields available to automap and return
> any that could not be mapped.
> * Allow mappings between compatible types. For example going from integers
> to longs, floats to strings, etc.
> * Field names are matched case-sensitively.
> * Make use of aliases in the schema when considering fields to automap.
> * Deep copy nested structures like arrays and maps.
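
The considerations above can be illustrated with a small sketch. This is not the code in AVRO-1699.patch and it does not use the Avro API: records are modeled as plain `Map<String, Object>` values, and `FieldSpec`, `AutoMapSketch`, `autoMap`, `convert`, and `deepCopy` are hypothetical names chosen for illustration. It assumes aliases are declared on the target field and name the source fields it may be filled from, and it only covers two of the possible type promotions (int to long, number to string).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a target schema field; this is not Avro's Schema.Field.
final class FieldSpec {
    final String name;
    final Class<?> type;
    final List<String> aliases;

    FieldSpec(String name, Class<?> type, String... aliases) {
        this.name = name;
        this.type = type;
        this.aliases = Arrays.asList(aliases);
    }
}

final class AutoMapSketch {
    /** Fills target from source and returns the source field names that could not be mapped. */
    static List<String> autoMap(Map<String, Object> source,
                                List<FieldSpec> targetFields,
                                Map<String, Object> target) {
        // Start every target field at null so fields absent from the source stay null.
        for (FieldSpec f : targetFields) {
            target.put(f.name, null);
        }
        List<String> unmapped = new ArrayList<>();
        for (Map.Entry<String, Object> e : source.entrySet()) {
            FieldSpec match = find(targetFields, e.getKey());
            Object converted = (match == null) ? null : convert(e.getValue(), match.type);
            if (match == null || (converted == null && e.getValue() != null)) {
                unmapped.add(e.getKey()); // no matching field, or incompatible types
            } else {
                target.put(match.name, converted);
            }
        }
        return unmapped;
    }

    // Case-sensitive match on the field name first, then on the target field's aliases.
    static FieldSpec find(List<FieldSpec> fields, String sourceName) {
        for (FieldSpec f : fields) {
            if (f.name.equals(sourceName) || f.aliases.contains(sourceName)) {
                return f;
            }
        }
        return null;
    }

    // Returns the converted value, or null when the types are not compatible.
    static Object convert(Object value, Class<?> type) {
        if (value == null) return null;
        if (type.isInstance(value)) return deepCopy(value);
        if (type == Long.class && value instanceof Integer) return ((Integer) value).longValue();
        if (type == String.class && value instanceof Number) return value.toString();
        return null;
    }

    // Recursively copies lists and maps; other values are treated as immutable scalars.
    static Object deepCopy(Object value) {
        if (value instanceof List) {
            List<Object> copy = new ArrayList<>();
            for (Object item : (List<?>) value) copy.add(deepCopy(item));
            return copy;
        }
        if (value instanceof Map) {
            Map<Object, Object> copy = new LinkedHashMap<>();
            for (Map.Entry<?, ?> e : ((Map<?, ?>) value).entrySet()) {
                copy.put(e.getKey(), deepCopy(e.getValue()));
            }
            return copy;
        }
        return value;
    }
}
```

In this sketch, a source field `uid` would land in a target field `userId` declared with the alias `uid`, a source field `Extra` would be reported as unmapped against a target field `extra` because matching is case-sensitive, and a target field with no source counterpart would stay null, which corresponds to the schema-migration case described earlier.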