[jira] Commented: (PIG-1616) 'union onschema' does not create output with correct schema when udfs are involved

Thejas M Nair (JIRA) Fri, 17 Sep 2010 09:11:00 -0700

    [ https://issues.apache.org/jira/browse/PIG-1616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910646#action_12910646 ]


Thejas M Nair commented on PIG-1616:
------------------------------------

Example demonstrating the problem:
{code}
grunt> run /Users/tejas/pig_unions_udf/trunk/uudf.pig
grunt> l1 = load '/tmp/bag.txt' as (a, b : bag { t : tuple (i : int) } );   
grunt> f1 = foreach l1 generate a, MAX(b.i) as mx;  
grunt> describe f1;
f1: {a: bytearray,mx: int}

grunt> l2 = load '/tmp/bag.txt' as (a, b : bag { t : tuple (i : float) } );
grunt> f2 = foreach l2 generate a, COUNT(b.i) as mx;
grunt> describe f2;
f2: {a: bytearray,mx: long}

grunt> u = union onschema f1, f2;
grunt> describe u;
u: {a: bytearray,mx: double}
-- expected: u: {a: bytearray,mx: long} (merging mx: int from f1 with mx: long from f2)
{code}

> 'union onschema' does not create output with correct schema when udfs are 
> involved
> --------------------------------------------------------------------------------------
>
>                 Key: PIG-1616
>                 URL: https://issues.apache.org/jira/browse/PIG-1616
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.8.0
>            Reporter: Thejas M Nair
>            Assignee: Thejas M Nair
>             Fix For: 0.8.0
>
>
> 'union onschema' creates a merged schema based on the input schemas. It does 
> that in the query parser, and at that stage the return type used for a UDF is 
> its default return type. The actual return type of the UDF is determined later 
> in the TypeCheckingVisitor, using EvalFunc.getArgToFuncMapping().
> 'union onschema' should use the final types of its input relations to create 
> the merged schema.
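
The effect described above can be illustrated with a small toy model. This is only a sketch in Python, not Pig's implementation: the widening order (int < long < float < double), the function names merge_types and merge_schemas, and the dict representation of a schema are all assumptions made for illustration. That MAX's default return type is double is also an assumption, inferred from the 'describe u' output in the example.

```python
# Toy model of schema merging for 'union onschema'. NOT Pig's actual code.
# Assumption: numeric types widen in this order when two fields are merged.
WIDENING_ORDER = ["int", "long", "float", "double"]


def merge_types(left, right):
    """Return the merged type of two fields (hypothetical helper)."""
    if left == right:
        return left  # identical types (e.g. bytearray) merge to themselves
    # Otherwise pick the wider numeric type according to the assumed order.
    return max(left, right, key=WIDENING_ORDER.index)


def merge_schemas(s1, s2):
    """Merge two schemas (field name -> type name) field by field."""
    merged = dict(s1)
    for name, typ in s2.items():
        merged[name] = merge_types(merged[name], typ) if name in merged else typ
    return merged


# Schema of f1 as seen at parse time: MAX still has its default return type
# (assumed to be double here).
f1_parse_time = {"a": "bytearray", "mx": "double"}
# Schema of f1 after type checking: MAX over int resolves to int,
# matching the 'describe f1' output.
f1_resolved = {"a": "bytearray", "mx": "int"}
# Schema of f2: COUNT returns long, matching the 'describe f2' output.
f2 = {"a": "bytearray", "mx": "long"}

early = merge_schemas(f1_parse_time, f2)  # merge done too early
late = merge_schemas(f1_resolved, f2)     # merge done with final types

print("merged at parse time:   ", early)
print("merged with final types:", late)
```

In this model, merging at parse time gives mx: double (the reported behaviour), while merging with the resolved types gives mx: long (the expected behaviour).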

