[jira] [Commented] (PIG-2375) Incorrect outputSchema is invoked when overloading UDF in 0.9.1

Prashant Kommireddi (Commented) (JIRA) Mon, 19 Dec 2011 22:36:02 -0800

    [ https://issues.apache.org/jira/browse/PIG-2375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13172981#comment-13172981 ]


Prashant Kommireddi commented on PIG-2375:
------------------------------------------

On further thought: when getArgToFuncMapping() is used, the input schema is 
already checked against that mapping in order to pick the EvalFunc to invoke. 
It does not make much sense to use outputSchema(Schema inputSchema) to verify 
the input schema a second time.

The role of outputSchema (as the name suggests) should be to specify the output 
schema for the UDF, and NOT necessarily to verify the input schema. This might 
not be obvious to someone new to writing Pig UDFs, particularly when 
overloading them.

To summarize: 

1. When a UDF is not overloaded, it is OK to use outputSchema to verify the 
input schema.
2. When a UDF is overloaded, it does not make sense to use outputSchema to 
verify the input schema, because getArgToFuncMapping already finds a matching 
spec based on the input schema.
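
The behaviour expected in point 2 can be illustrated with a small, Pig-free 
model. This is only a sketch: Type, Spec, Udf, resolve() and outputSchemaFor() 
are hypothetical stand-ins, not Pig's Schema, FuncSpec or EvalFunc API. The 
root class is assumed to be LogFieldValue (going by the attachment names), and 
the "chararray"/"tuple" output schemas are invented for illustration. Matching 
is exact-match only, which is a simplification.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified model of UDF overloading. Every type here is a hypothetical
// stand-in and NOT part of Pig's API.
class OverloadDispatchSketch {

    enum Type { TUPLE, CHARARRAY }

    // Stand-in for a FuncSpec: a class name plus the argument types it accepts.
    static final class Spec {
        final String className;
        final List<Type> argTypes;

        Spec(String className, Type... argTypes) {
            this.className = className;
            this.argTypes = Arrays.asList(argTypes);
        }
    }

    // Stand-in for an EvalFunc; only the schema-describing part is modelled.
    interface Udf {
        String outputSchema(List<Type> input);
    }

    // Assumed root UDF, registered for (tuple, chararray).
    static final class LogFieldValue implements Udf {
        public String outputSchema(List<Type> input) {
            // No input re-validation: the mapping has already matched the input.
            return "chararray"; // invented output schema
        }
    }

    // Overload registered for (tuple, tuple).
    static final class LogFieldValues implements Udf {
        public String outputSchema(List<Type> input) {
            return "tuple"; // invented output schema
        }
    }

    // Mirrors the two entries of the getArgToFuncMapping() in the report.
    static List<Spec> argToFuncMapping() {
        List<Spec> specs = new ArrayList<Spec>();
        specs.add(new Spec("LogFieldValue", Type.TUPLE, Type.CHARARRAY));
        specs.add(new Spec("LogFieldValues", Type.TUPLE, Type.TUPLE));
        return specs;
    }

    // Picks the spec whose argument types equal the input types exactly.
    static Spec resolve(List<Type> input) {
        for (Spec spec : argToFuncMapping()) {
            if (spec.argTypes.equals(input)) {
                return spec;
            }
        }
        throw new IllegalArgumentException("No matching spec for " + input);
    }

    static Udf instantiate(Spec spec) {
        if (spec.className.equals("LogFieldValues")) {
            return new LogFieldValues();
        }
        if (spec.className.equals("LogFieldValue")) {
            return new LogFieldValue();
        }
        throw new IllegalStateException("Unknown class " + spec.className);
    }

    // Expected behaviour: outputSchema() is taken from the resolved class,
    // not unconditionally from the root UDF.
    static String outputSchemaFor(List<Type> input) {
        return instantiate(resolve(input)).outputSchema(input);
    }

    public static void main(String[] args) {
        // prints "chararray"
        System.out.println(outputSchemaFor(Arrays.asList(Type.TUPLE, Type.CHARARRAY)));
        // prints "tuple"
        System.out.println(outputSchemaFor(Arrays.asList(Type.TUPLE, Type.TUPLE)));
    }
}
```

In this model the input types are checked once, in resolve(), so neither 
outputSchema() needs to look at them again. The behaviour reported in the 
issue corresponds to always calling the root class's outputSchema(), whichever 
spec was resolved.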
                
> Incorrect outputSchema is invoked when overloading UDF in 0.9.1
> ---------------------------------------------------------------
>
>                 Key: PIG-2375
>                 URL: https://issues.apache.org/jira/browse/PIG-2375
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.9.1
>            Reporter: Prashant Kommireddi
>            Assignee: Prashant Kommireddi
>             Fix For: 0.9.1
>
>         Attachments: LogFieldValue.java, LogFieldValues.java
>
>
> When overloading a UDF with getArgToFuncMapping(), the parent/root UDF's 
> outputSchema() is called instead of the overloaded UDF's. 
> {code}
>     @Override
>     public List<FuncSpec> getArgToFuncMapping() throws FrontendException {
>         List<FuncSpec> funcList = new ArrayList<FuncSpec>();
>         // (tuple, chararray) is handled by this (root) UDF.
>         Schema s = new Schema();
>         s.add(new Schema.FieldSchema(null, DataType.TUPLE));
>         s.add(new Schema.FieldSchema(null, DataType.CHARARRAY));
>         funcList.add(new FuncSpec(this.getClass().getName(), s));
>         // (tuple, tuple) is delegated to LogFieldValues.
>         Schema s1 = new Schema();
>         s1.add(new Schema.FieldSchema(null, DataType.TUPLE));
>         s1.add(new Schema.FieldSchema(null, DataType.TUPLE));
>         funcList.add(new FuncSpec(LogFieldValues.class.getName(), s1));
>         return funcList;
>     }
> {code}
> In the above function, "LogFieldValues" is used when the input is (tuple, 
> tuple) but the outputSchema() is invoked from the root UDF.

