[ https://issues.apache.org/jira/browse/PIG-1371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alan Gates reassigned PIG-1371:
-------------------------------

    Assignee: Alan Gates  (was: Richard Ding)

> Pig should handle deep casting of complex types
> ------------------------------------------------
>
>                 Key: PIG-1371
>                 URL: https://issues.apache.org/jira/browse/PIG-1371
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Pradeep Kamath
>            Assignee: Alan Gates
>         Attachments: PIG-1371-partial.patch
>
>
> Consider input data in BinStorage format which has a field of bag type,
> bg:{t:(i:int)}. If the schema in the load statement declares this field as
> bg:{t:(c:chararray)}, Pig currently treats the field as having the declared
> type (bg:{t:(c:chararray)}), but no deep cast from bag of int (the real
> data) to bag of chararray (the user-specified schema) is made.
> There are two issues currently:
> 1) The TypeCastInserter only compares the byte 'type' of the loader-presented
> schema and the user-specified schema to decide whether to introduce a cast.
> In the above case, since both schemas have the type "bag", no cast is
> inserted. This check has to be extended to consider the full FieldSchema
> (with inner subschema) in order to decide whether a cast is needed.
> 2) POCast should be changed to handle casting a complex type to the type
> specified in the user-supplied FieldSchema. Here there is one issue to be
> considered: if the user specified the cast type to be bg:{t:(i:int, j:int)}
> and the real data had only one field, what should the result of the cast be?
>  * A bag with two fields, the int field and a null? In this approach Pig
> assumes the lone field in the data is the first field, which might be
> incorrect if it is in fact the second field.
>  * A null bag to indicate that the bag is of unknown value. This is the one
> I personally prefer.
>  * The cast throws an IncompatibleCastException.

--
This message is automatically generated by JIRA.
- You can reply to this email to add a comment to the issue online.
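
The two issues in the description can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not Pig's implementation: the `Field` class is a hypothetical stand-in for Pig's FieldSchema, `needsCast`, `castTuple`, `castBag`, and `castValue` are invented names, and bags and tuples are modeled as plain lists. `needsCast` compares the full nested schema rather than only the top-level type (issue 1), and `castBag` returns a null bag when a tuple's field count does not match the declared schema (the second option under issue 2).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DeepCastSketch {
    // Hypothetical stand-in for Pig's FieldSchema: a type name plus inner fields.
    public static final class Field {
        public final String type;        // e.g. "bag", "tuple", "int", "chararray"
        public final List<Field> inner;  // empty for simple types

        public Field(String type, Field... inner) {
            this.type = type;
            this.inner = Arrays.asList(inner);
        }
    }

    // Issue 1: decide on a cast by walking the whole schema, not just the top-level type.
    public static boolean needsCast(Field loaded, Field declared) {
        if (!loaded.type.equals(declared.type)) return true;
        if (loaded.inner.size() != declared.inner.size()) return true;
        for (int i = 0; i < loaded.inner.size(); i++) {
            if (needsCast(loaded.inner.get(i), declared.inner.get(i))) return true;
        }
        return false;
    }

    // Casts a single simple value; only two target types are modeled here.
    public static Object castValue(Object value, String targetType) {
        if (value == null) return null;
        switch (targetType) {
            case "chararray": return String.valueOf(value);
            case "int":       return Integer.valueOf(value.toString());
            default: throw new IllegalArgumentException("unsupported type: " + targetType);
        }
    }

    // Casts a tuple field by field; returns null when the field counts differ.
    public static List<Object> castTuple(List<Object> tuple, Field declaredTuple) {
        if (tuple == null || tuple.size() != declaredTuple.inner.size()) return null;
        List<Object> out = new ArrayList<>();
        for (int i = 0; i < tuple.size(); i++) {
            out.add(castValue(tuple.get(i), declaredTuple.inner.get(i).type));
        }
        return out;
    }

    // Issue 2, second option: any tuple that cannot be cast makes the whole bag null.
    public static List<List<Object>> castBag(List<List<Object>> bag, Field declaredBag) {
        if (bag == null) return null;
        Field declaredTuple = declaredBag.inner.get(0);
        List<List<Object>> out = new ArrayList<>();
        for (List<Object> tuple : bag) {
            List<Object> cast = castTuple(tuple, declaredTuple);
            if (cast == null) return null;
            out.add(cast);
        }
        return out;
    }
}
```

With the schemas from the description, `needsCast` reports that bg:{t:(i:int)} and bg:{t:(c:chararray)} differ even though both are bags. Switching to the third option would mean throwing an exception in `castTuple` instead of returning null.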