[ https://issues.apache.org/jira/browse/PIG-2375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13172003#comment-13172003 ]
Prashant Kommireddi commented on PIG-2375:
------------------------------------------
The problem does not seem to be that the wrong outputSchema() is invoked.
Rather, the root/parent UDF is always instantiated before the actual
overriding UDF is invoked.
getFieldSchema() (from UserFuncExpression) is invoked on the root UDF before
it is called on the overriding UDF.
{code}
@Override
public LogicalSchema.LogicalFieldSchema getFieldSchema() throws FrontendException {
    if (fieldSchema != null)
        return fieldSchema;
    LogicalSchema inputSchema = new LogicalSchema();
    List<Operator> succs = plan.getSuccessors(this);
    if (succs != null) {
        for (Operator lo : succs) {
            if (((LogicalExpression) lo).getFieldSchema() == null) {
                inputSchema = null;
                break;
            }
            inputSchema.addField(((LogicalExpression) lo).getFieldSchema());
        }
    }
    // Since ef only set one time, we never change its value, so we can
    // optimize it by instantiate only once.
    // This significantly optimize the performance of frontend (PIG-1738)
    if (ef == null)
        ef = (EvalFunc<?>) PigContext.instantiateFuncFromSpec(mFuncSpec);
    ef.setUDFContextSignature(signature);
    Properties props = UDFContext.getUDFContext().getUDFProperties(ef.getClass());
    if (Util.translateSchema(inputSchema) != null)
        props.put("pig.evalfunc.inputschema." + signature,
                Util.translateSchema(inputSchema));
    // Store inputSchema into the UDF context
    ef.setInputSchema(Util.translateSchema(inputSchema));
    // WHY DOES THIS NEED TO BE CALLED ON THE EVALFUNC THAT IS NOT USED
    Schema udfSchema = ef.outputSchema(Util.translateSchema(inputSchema));
    if (udfSchema != null) {
        Schema.FieldSchema fs;
        if (udfSchema.size() == 0) {
.
.
.
.
{code}
Why would getFieldSchema() need to be invoked on the root UDF when exec()
actually needs to be invoked on an overriding EvalFunc?
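To make the call order concrete, here is a minimal plain-Java sketch of the behavior I am describing. It does not use any Pig classes: SketchUdf, RootUdf, OverloadUdf, FrontEndSketch and the string-keyed argToFuncMapping() are hypothetical stand-ins, and the order of steps inside plan() is my assumption based on what I observed, not a copy of the front-end code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a UDF; this is NOT Pig's EvalFunc API.
abstract class SketchUdf {
    // Records constructor and outputSchema() calls in the order they happen.
    static final List<String> TRACE = new ArrayList<>();

    SketchUdf() {
        TRACE.add("new " + getClass().getSimpleName());
    }

    String outputSchema(String inputTypes) {
        TRACE.add(getClass().getSimpleName() + ".outputSchema");
        return getClass().getSimpleName() + "-schema";
    }

    // Plays the role of getArgToFuncMapping(): input types -> handling class.
    Map<String, Class<? extends SketchUdf>> argToFuncMapping() {
        return new HashMap<>();
    }
}

// The UDF named in the script; it declares the overloads.
class RootUdf extends SketchUdf {
    @Override
    Map<String, Class<? extends SketchUdf>> argToFuncMapping() {
        Map<String, Class<? extends SketchUdf>> m = new HashMap<>();
        m.put("tuple,chararray", RootUdf.class);
        m.put("tuple,tuple", OverloadUdf.class);
        return m;
    }
}

// The UDF that should actually handle (tuple, tuple) input.
class OverloadUdf extends SketchUdf {
}

class FrontEndSketch {
    // Assumed order: the schema is first computed on the UDF named in the
    // script, and only afterwards is the overload resolved and instantiated.
    static List<String> plan(String inputTypes) {
        SketchUdf.TRACE.clear();
        SketchUdf root = new RootUdf();
        root.outputSchema(inputTypes);
        Class<? extends SketchUdf> chosen = root.argToFuncMapping().get(inputTypes);
        if (chosen != null && chosen != root.getClass()) {
            try {
                SketchUdf overload = chosen.getDeclaredConstructor().newInstance();
                overload.outputSchema(inputTypes);
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(e);
            }
        }
        return new ArrayList<>(SketchUdf.TRACE);
    }

    public static void main(String[] args) {
        System.out.println(plan("tuple,tuple"));
    }
}
```

In this model, plan("tuple,tuple") records RootUdf being constructed and asked for its schema before OverloadUdf is touched at all, which is the same symptom: outputSchema() runs on a UDF whose exec() will never be used for that input.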
> Incorrect outputSchema is invoked when overloading UDF in 0.9.1
> ---------------------------------------------------------------
>
> Key: PIG-2375
> URL: https://issues.apache.org/jira/browse/PIG-2375
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.9.1
> Reporter: Prashant Kommireddi
> Assignee: Prashant Kommireddi
> Fix For: 0.9.1
>
> Attachments: LogFieldValue.java, LogFieldValues.java
>
>
> When overloading a UDF with getArgToFuncMapping(), the parent/root UDF's
> outputSchema() is being called.
> {code}
> @Override
> public List<FuncSpec> getArgToFuncMapping() throws FrontendException {
>     List<FuncSpec> funcList = new ArrayList<FuncSpec>();
>     Schema s = new Schema();
>     s.add(new Schema.FieldSchema(null, DataType.TUPLE));
>     s.add(new Schema.FieldSchema(null, DataType.CHARARRAY));
>     funcList.add(new FuncSpec(this.getClass().getName(), s));
>     Schema s1 = new Schema();
>     s1.add(new Schema.FieldSchema(null, DataType.TUPLE));
>     s1.add(new Schema.FieldSchema(null, DataType.TUPLE));
>     funcList.add(new FuncSpec(LogFieldValues.class.getName(), s1));
>     return funcList;
> }
> {code}
> In the above function, "LogFieldValues" is used when the input is (tuple,
> tuple) but the outputSchema() is invoked from the root UDF.