[
https://issues.apache.org/jira/browse/PIG-4326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael Prim updated PIG-4326:
------------------------------
Attachment: supportForMapsOfArraysOfRecords.patch
I applied your previous comment incorrectly, my mistake. Your attached patch
does work with the test. Still, I think we should not change the existing
behavior for maps of records.
Previously, for maps of records, AvroStorageSchemaConversionUtilities created:
{code}
map[ MyRecord: (fielda: int, ...., fieldz: int) ]
{code}
which I think is what we want: the record should be a single tuple, and the
tuple preserves a possible alias. Your fix removes this tuple, so the schema
looks like
{code}
map[ fielda: int, ...., fieldz: int ]
{code}
So I uploaded a new patch proposal that keeps the original behavior for
maps of records, whereas for maps of maps and maps of arrays it removes the
additional nesting tuple, resulting in, e.g.,
{code}
map[ array: { MyRecord: (fielda: int, ...., fieldz: int) } ]
{code}
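To make the intended shapes concrete, here is a minimal sketch in Java. It uses a hypothetical toy schema model (the names MapValueSchemaSketch, Node, Kind, render and the factory helpers are all made up for illustration); it is not the Avro or Pig API and not the code in the attached patch. It only shows the idea that a map value is rendered recursively, that a record stays one named tuple, and that an array value gets no extra tuple around it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical toy model for illustration only; NOT the Avro or Pig classes.
final class MapValueSchemaSketch {

    enum Kind { INT, RECORD, ARRAY, MAP }

    static final class Node {
        final Kind kind;
        final String name;          // alias; null when the node has none
        final List<Node> children;  // record fields, or the single element/value type

        Node(Kind kind, String name, Node... children) {
            this.kind = kind;
            this.name = name;
            this.children = Arrays.asList(children);
        }
    }

    static Node intField(String name) { return new Node(Kind.INT, name); }
    static Node record(String name, Node... fields) { return new Node(Kind.RECORD, name, fields); }
    static Node array(String name, Node element) { return new Node(Kind.ARRAY, name, element); }
    static Node map(Node value) { return new Node(Kind.MAP, null, value); }

    /** Renders a node in Pig-like notation, recursing into nested types. */
    static String render(Node n) {
        String label = (n.name == null) ? "" : n.name + ": ";
        switch (n.kind) {
            case INT:
                return label + "int";
            case RECORD: {
                // A record stays one named tuple, so its alias is preserved.
                List<String> parts = new ArrayList<>();
                for (Node field : n.children) {
                    parts.add(render(field));
                }
                return label + "(" + String.join(", ", parts) + ")";
            }
            case ARRAY:
                // A bag around the element type; no additional tuple is inserted.
                return label + "{ " + render(n.children.get(0)) + " }";
            case MAP:
                // The value type is rendered recursively instead of being dropped.
                return label + "map[ " + render(n.children.get(0)) + " ]";
            default:
                throw new IllegalArgumentException("unknown kind: " + n.kind);
        }
    }

    public static void main(String[] args) {
        Node rec = record("MyRecord", intField("fielda"), intField("fieldz"));
        // prints: map[ MyRecord: (fielda: int, fieldz: int) ]
        System.out.println(render(map(rec)));
        // prints: map[ array: { MyRecord: (fielda: int, fieldz: int) } ]
        System.out.println(render(map(array("array", rec))));
    }
}
```

The two printed lines correspond to the map-of-records and map-of-arrays-of-records shapes shown above, with the elided fields reduced to fielda and fieldz.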
> AvroStorageSchemaConversionUtilities does not properly convert schema for
> maps of arrays of records
> ---------------------------------------------------------------------------------------------------
>
> Key: PIG-4326
> URL: https://issues.apache.org/jira/browse/PIG-4326
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: 0.12.0, 0.13.0
> Reporter: Michael Prim
> Assignee: Michael Prim
> Fix For: 0.15.0
>
> Attachments: PIG-4326-0.patch, mapsOfArraysOfRecords.patch,
> supportForMapsOfArraysOfRecords.patch
>
>
> I tried to convert the Avro schema of a map of arrays of records into the
> proper Pig schema and always got empty map schemas in Pig.
> The reason is that AvroStorageSchemaConversionUtilities assumes that a map
> contains only records or primitive types. However, the value of a map of
> arrays, or of a map of maps, can have a schema of its own, so the conversion
> must recurse to derive the full schema.
> I wrote a unit test for maps of arrays of records, which fails with
> every Pig release since AvroStorage was rewritten (I think this was in
> 0.12); there have been no changes since then in trunk.
> The attached patch also contains the (rather simple) fix that makes the
> schema conversion utilities succeed.
> I would appreciate further comments and would like to know whether this can
> be included upstream.