[ 
https://issues.apache.org/jira/browse/PIG-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786709#comment-13786709
 ] 

Aniket Mokashi commented on PIG-3082:
-------------------------------------

But shouldn't we document this as an incompatible change so that there are 
no surprises?

> outputSchema of a UDF allows two usages when describing a Tuple schema
> ----------------------------------------------------------------------
>
>                 Key: PIG-3082
>                 URL: https://issues.apache.org/jira/browse/PIG-3082
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Julien Le Dem
>            Assignee: Jonathan Coveney
>             Fix For: 0.12.0
>
>         Attachments: PIG-3082-0.patch, PIG-3082-1.patch
>
>
> When defining an EvalFunc that returns a Tuple, there are two ways you can 
> implement outputSchema().
> - The right way: return a schema that contains a single Field, which 
> carries the type and schema of the UDF's return type.
> - The unreliable way: return a schema that contains more than one field; 
> it will be understood as a tuple schema even though no type (which lives 
> in the Field class) says so. This is particularly deceptive when the 
> output schema is derived from the input schema and the output Tuple 
> sometimes contains only one field. In such cases Pig understands the 
> output schema as a tuple only if there is more than one field, so 
> sometimes it works and sometimes it does not.
> We should at least issue a warning (for backward compatibility), if not 
> throw an exception outright, when the output schema contains more than 
> one Field.
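
To make the ambiguity in the description above concrete, here is a minimal toy model in Java. It does not use Pig's real Schema or FieldSchema classes: the Type, Field, readAsTuple, unwrapped, and wrapped names are all hypothetical stand-ins, and readAsTuple only imitates the "more than one field means tuple" behaviour as the issue describes it. It is a sketch of the problem, not of Pig's actual implementation.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class OutputSchemaSketch {
    // Hypothetical stand-ins; these are NOT Pig's Schema/FieldSchema classes.
    public enum Type { INT, CHARARRAY, TUPLE }

    public static final class Field {
        final String alias;
        final Type type;
        final List<Field> inner; // non-empty only for TUPLE fields

        Field(String alias, Type type, List<Field> inner) {
            this.alias = alias;
            this.type = type;
            this.inner = inner;
        }
    }

    public static Field scalar(String alias, Type type) {
        return new Field(alias, type, Collections.<Field>emptyList());
    }

    // "The unreliable way": return the tuple's fields directly, with no
    // enclosing field that declares the TUPLE type.
    public static List<Field> unwrapped(List<Field> fields) {
        return fields;
    }

    // "The right way": one field whose type is TUPLE and whose inner
    // schema lists the tuple's fields.
    public static List<Field> wrapped(List<Field> fields) {
        return Collections.singletonList(new Field("t", Type.TUPLE, fields));
    }

    // Assumed imitation of the behaviour described in the issue: a schema
    // with several fields is taken to be a tuple; a schema with a single
    // field is a tuple only if that field says so.
    public static boolean readAsTuple(List<Field> outputSchema) {
        if (outputSchema.size() == 1) {
            return outputSchema.get(0).type == Type.TUPLE;
        }
        return outputSchema.size() > 1;
    }

    public static void main(String[] args) {
        List<Field> two = Arrays.asList(scalar("a", Type.INT), scalar("b", Type.CHARARRAY));
        List<Field> one = Collections.singletonList(scalar("a", Type.INT));

        System.out.println(readAsTuple(unwrapped(two))); // true
        System.out.println(readAsTuple(unwrapped(one))); // false: the tuple is lost
        System.out.println(readAsTuple(wrapped(two)));   // true
        System.out.println(readAsTuple(wrapped(one)));   // true
    }
}
```

In this model the unwrapped form changes meaning as soon as the derived schema shrinks to one field, while the wrapped form is read the same way regardless of field count.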



--
This message was sent by Atlassian JIRA
(v6.1#6144)
