[ 
https://issues.apache.org/jira/browse/PIG-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

JArod Wen updated PIG-2224:
---------------------------

    Status: Patch Available  (was: Open)

Patch attached.

> Incorrect arity test in AstValidator.g with ALL and column-based grouping 
> condition together in cogroup
> -------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-2224
>                 URL: https://issues.apache.org/jira/browse/PIG-2224
>             Project: Pig
>          Issue Type: Bug
>          Components: grunt
>    Affects Versions: 0.9.0
>         Environment: Suse Linux 9/MacOS(10.7)
>            Reporter: JArod Wen
>              Labels: ALL, arity, astvalidator, cogroup, grunt
>             Fix For: 0.9.1
>
>
> When ALL and a column-based grouping condition are used together in COGROUP, 
> the arity test in AstValidator.g (line 242) computes the wrong arity and 
> throws an exception. For example, assume we have the following two relations:
> a = load 'A' as (col_a_0, col_a_1);
> b = load 'B' as (col_b_0, col_b_1);
> The following statement then fails with a validation error:
> c = cogroup a by col_a_0, b ALL;
> This happens because processing a:col_a_0 sets the arity to 1; then, when 
> processing b:ALL, join_group_by_clause is null, so the second relation gets 
> arity 0 and the arity test fails. 
> Reversing the two relations works around the error:
> c = cogroup b ALL, a by col_a_0;
> However, this only works by luck: when processing b:ALL, 
> join_group_by_clause is null, so the arity stays 0; then, when processing 
> a:col_a_0, the arity is still uninitialized, so it is simply set and no 
> arity test is performed (so the statement passes).
> The root cause is that the arity test does not take the ALL keyword into 
> account. I attached a patch that fixes this by handling the arity test 
> separately for join_group_by_clause and ALL. I tested the patch locally and 
> it works.
> {code}
> Index: src/org/apache/pig/parser/AstValidator.g
> ===================================================================
> --- src/org/apache/pig/parser/AstValidator.g  (revision 1158481)
> +++ src/org/apache/pig/parser/AstValidator.g  (working copy)
> @@ -242,7 +242,7 @@
>  ;
>  
>  group_item
> - : rel ( join_group_by_clause | ALL | ANY ) ( INNER | OUTER )?
> + : rel ( join_group_by_clause 
>     {
>         if( $group_clause::arity == 0 ) {
>             // For the first input
> @@ -252,6 +252,19 @@
>                 "The arity of the group by columns do not match." );
>         }
>     }
> +   | ALL 
> +   {
> +       if($group_clause::arity == 0){
> +           $group_clause::arity = 1;
> +       } else {
> +           if($group_clause::arity != 1){
> +               throw new ParserValidationException( input, new SourceLocation( (PigParserNode)$group_item.start ),
> +                   "The arity of the group by columns do not match." );
> +           }
> +       }
> +   } 
> +   | ANY ) ( INNER | OUTER )?
> +
>  ;
>  
>  rel : alias {  validateAliasRef( aliases, $alias.node, $alias.name ); }
> {code}
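
For readers who want to see the order dependence in isolation, here is a minimal Java sketch of the arity check described in the issue. It is a simplified model, not the actual AstValidator.g code: the class name, the method names, and the ALL marker are all hypothetical, and each COGROUP input is reduced to its number of BY columns. It assumes, as the report describes, that the original check sees arity 0 for an ALL input, and that the patch treats ALL as arity 1.

```java
import java.util.Arrays;
import java.util.List;

// Simplified model of the COGROUP arity check (hypothetical names throughout).
final class CogroupAritySketch {
    // Hypothetical marker for an input grouped with ALL instead of BY columns.
    static final int ALL = -1;

    // Models the behaviour described in the report: an ALL input contributes
    // arity 0, which is also the "not initialized yet" value of the scope.
    static boolean passesOriginalCheck(List<Integer> inputs) {
        int arity = 0;
        for (int input : inputs) {
            int inputArity = (input == ALL) ? 0 : input;
            if (arity == 0) {
                arity = inputArity;          // first input (or still uninitialized)
            } else if (arity != inputArity) {
                return false;                // "arity ... do not match"
            }
        }
        return true;
    }

    // Models the patched behaviour: an ALL input is treated as arity 1.
    static boolean passesPatchedCheck(List<Integer> inputs) {
        int arity = 0;
        for (int input : inputs) {
            int inputArity = (input == ALL) ? 1 : input;
            if (arity == 0) {
                arity = inputArity;
            } else if (arity != inputArity) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Integer> byThenAll = Arrays.asList(1, ALL); // cogroup a by col_a_0, b ALL
        List<Integer> allThenBy = Arrays.asList(ALL, 1); // cogroup b ALL, a by col_a_0
        System.out.println("original, BY then ALL: " + passesOriginalCheck(byThenAll));
        System.out.println("original, ALL then BY: " + passesOriginalCheck(allThenBy));
        System.out.println("patched,  BY then ALL: " + passesPatchedCheck(byThenAll));
        System.out.println("patched,  ALL then BY: " + passesPatchedCheck(allThenBy));
    }
}
```

In this model the original check rejects the first ordering and accepts the second, while the patched check accepts both. Note that under the patched rule, ALL combined with a two-column BY clause is rejected, which follows from the `arity != 1` test in the patch.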

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
