[
https://issues.apache.org/jira/browse/PIG-1371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Olga Natkovich reassigned PIG-1371:
-----------------------------------
Assignee: Richard Ding
Richard, can you take a look and see how feasible this is for 0.8.0, thanks
> Pig should handle deep casting of complex types
> ------------------------------------------------
>
> Key: PIG-1371
> URL: https://issues.apache.org/jira/browse/PIG-1371
> Project: Pig
> Issue Type: Bug
> Reporter: Pradeep Kamath
> Assignee: Richard Ding
> Fix For: 0.8.0
>
> Attachments: PIG-1371-partial.patch
>
>
> Consider input data in BinStorage format which has a field of bag type -
> bg:{t:(i:int)}. In the load statement if the schema specified has the type
> for this field specified as bg:{t:(c:chararray)}, the current behavior is
> that Pig thinks of the field to be of type specified in the load statement
> (bg:{t:(c:chararray)}) but no deep cast from bag of int (the real data) to
> bag of chararray (the user specified schema) is made.
> There are two issues currently:
> 1) The TypeCastInserter only considers the byte 'type' between the loader
> presented schema and user specified schema to decided whether to introduce a
> cast or not. In the above case since both schema have the type "bag" no cast
> is inserted. This check has to be extended to consider the full FieldSchema
> (with inner subschema) in order to decide whether a cast is needed.
> 2) POCast should be changed to handle casting a complex type to the type
> specified the user supplied FieldSchema. Here is there is one issue to be
> considered - if the user specified the cast type to be bg:{t:(i:int, j:int)}
> and the real data had only one field what should the result of the cast be:
> * A bag with two fields - the int field and a null? - In this approach pig
> is assuming the lone field in the data is the first field which might be
> incorrect if it in fact is the second field.
> * A null bag to indicate that the bag is of unknown value - this is the one
> I personally prefer
> * The cast throws an IncompatibleCastException
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.