[
https://issues.apache.org/jira/browse/PIG-1616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910646#action_12910646
]
Thejas M Nair commented on PIG-1616:
------------------------------------
Example to demonstrate the problem -
{code}
grunt> run /Users/tejas/pig_unions_udf/trunk/uudf.pig
grunt> l1 = load '/tmp/bag.txt' as (a, b : bag { t : tuple (i : int) } );
grunt> f1 = foreach l1 generate a, MAX(b.i) as mx;
grunt> describe f1;
f1: {a: bytearray,mx: int}
grunt> l2 = load '/tmp/bag.txt' as (a, b : bag { t : tuple (i : float) } );
grunt> f2 = foreach l2 generate a, COUNT(b.i) as mx;
grunt> describe f2;
f2: {a: bytearray,mx: long}
grunt> u = union onschema f1, f2;
grunt> describe u;
u: {a: bytearray,mx: double}
-- it should be u: {a: bytearray,mx: long}
{code}
> 'union onschema' does not create output with correct schema when udfs are
> involved
> --------------------------------------------------------------------------------------
>
> Key: PIG-1616
> URL: https://issues.apache.org/jira/browse/PIG-1616
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.8.0
> Reporter: Thejas M Nair
> Assignee: Thejas M Nair
> Fix For: 0.8.0
>
>
> 'union onschema' creates a merged schema based on the input schemas. It does
> that in the queryparser, and at that stage the udf return type used is the
> default return type. The actual return type for the udf is determined later
> in the TypeCheckingVisitor using EvalFunc.getArgToFuncMapping().
> 'union onschema' should use the final type for its input relation to create
> the merged schema.
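
The effect described above can be illustrated with a small toy model. This is only a sketch, not Pig's actual code: the names merge_types, merge_schemas and WIDENING_ORDER are hypothetical, and the assumption that MAX's default return type is double is inferred from the 'describe u' output in the example. It shows how merging with the parser-stage type yields double, while merging with the final type yields the expected long.

```python
# Toy model of 'union onschema' type merging. NOT Pig's implementation;
# all names here are hypothetical and exist only for illustration.
WIDENING_ORDER = ["int", "long", "float", "double"]


def merge_types(left, right):
    """Return the wider of two type names; identical types merge to themselves."""
    if left == right:
        return left
    return max(left, right, key=WIDENING_ORDER.index)


def merge_schemas(left, right):
    """Merge two {field: type} schemas by field name."""
    merged = dict(left)
    for name, ftype in right.items():
        merged[name] = merge_types(merged[name], ftype) if name in merged else ftype
    return merged


# Schema of f1 as seen at parser time (assumed default return type of MAX)
# versus after type checking has resolved MAX over an int column.
f1_at_parser = {"a": "bytearray", "mx": "double"}
f1_final = {"a": "bytearray", "mx": "int"}
f2 = {"a": "bytearray", "mx": "long"}

buggy = merge_schemas(f1_at_parser, f2)   # mx merges double with long
expected = merge_schemas(f1_final, f2)    # mx merges int with long
print(buggy)
print(expected)
```

In this model the only difference between the two results is which type is recorded for f1's mx column when the merge runs, which is the ordering problem the description points at.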