[ https://issues.apache.org/jira/browse/PIG-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
JArod Wen updated PIG-2224:
---------------------------

    Status: Patch Available  (was: Open)

Patch attached.

> Incorrect arity test in AstValidator.g with ALL and column-based grouping condition together in cogroup
> -------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-2224
>                 URL: https://issues.apache.org/jira/browse/PIG-2224
>             Project: Pig
>          Issue Type: Bug
>          Components: grunt
>    Affects Versions: 0.9.0
>         Environment: Suse Linux 9/MacOS(10.7)
>            Reporter: JArod Wen
>              Labels: ALL, arity, astvalidator, cogroup, grunt
>             Fix For: 0.9.1
>
>
> When ALL and a column-based grouping condition are used together in COGROUP,
> the arity test in AstValidator.g (line 242) sets the arity incorrectly and
> causes an exception. For example, assume we have the following two relations:
> a = load 'A' as (col_a_0, col_a_1);
> b = load 'B' as (col_b_0, col_b_1);
> The following statement throws a validation error:
> c = cogroup a by col_a_0, b ALL;
> This is because the arity is set to 1 when a:col_a_0 is processed; then, when
> b:ALL is processed, the null join_group_by_clause yields arity 0 for the
> second relation, and the arity test fails.
> Reversing the two relations is a workaround for this error:
> c = cogroup b ALL, a by col_a_0;
> However, this only works by luck: when b:ALL is processed,
> join_group_by_clause is null, so the arity is still 0; then, when a:col_a_0 is
> processed, the arity is initialized rather than compared, so no arity test is
> performed and validation passes.
> The root cause is that the arity test does not take the ALL keyword into
> account. I attached a patch that fixes this by testing the arity separately
> for join_group_by_clause and for ALL. The patch has been tested locally and
> it works.
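
The arity rule described in the report can be modelled outside the grammar. The following is a minimal Java sketch, not code from Pig: the class name, the `ALL` marker value, and both method names are hypothetical, and each COGROUP input is reduced to the number of its BY columns. It assumes, as the report describes, that before the patch an ALL input is compared as arity 0, and that after the patch ALL is treated as arity 1.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative model of the COGROUP arity check; not code from Pig itself. */
public class ArityCheckSketch {

    /** Hypothetical marker for an input grouped with ALL (no BY columns). */
    static final int ALL = -1;

    /**
     * Behaviour before the patch, as described in the report: an ALL input
     * contributes arity 0, which is compared against the shared arity.
     */
    static boolean validateBeforePatch(List<Integer> items) {
        int arity = 0; // 0 means "not initialised yet"
        for (int item : items) {
            int columns = (item == ALL) ? 0 : item;
            if (arity == 0) {
                arity = columns;      // first input sets the arity
            } else if (arity != columns) {
                return false;         // arity mismatch: validation fails
            }
        }
        return true;
    }

    /** Behaviour after the patch: ALL is handled separately and counts as arity 1. */
    static boolean validateAfterPatch(List<Integer> items) {
        int arity = 0; // 0 means "not initialised yet"
        for (int item : items) {
            int columns = (item == ALL) ? 1 : item;
            if (arity == 0) {
                arity = columns;
            } else if (arity != columns) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Integer> aThenAll = Arrays.asList(1, ALL); // cogroup a by col_a_0, b ALL
        List<Integer> allThenA = Arrays.asList(ALL, 1); // cogroup b ALL, a by col_a_0

        System.out.println(validateBeforePatch(aThenAll)); // false: the reported failure
        System.out.println(validateBeforePatch(allThenA)); // true: the workaround
        System.out.println(validateAfterPatch(aThenAll));  // true
        System.out.println(validateAfterPatch(allThenA));  // true
    }
}
```

Under the same assumptions, ALL combined with a multi-column BY clause is still rejected after the patch, because the patched ALL branch requires the shared arity to be exactly 1.
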
> {code}
> Index: src/org/apache/pig/parser/AstValidator.g
> ===================================================================
> --- src/org/apache/pig/parser/AstValidator.g	(revision 1158481)
> +++ src/org/apache/pig/parser/AstValidator.g	(working copy)
> @@ -242,7 +242,7 @@
>  ;
>  
>  group_item
> - : rel ( join_group_by_clause | ALL | ANY ) ( INNER | OUTER )?
> + : rel ( join_group_by_clause
>     {
>         if( $group_clause::arity == 0 ) {
>             // For the first input
> @@ -252,6 +252,19 @@
>                     "The arity of the group by columns do not match." );
>             }
>         }
> +     | ALL
> +     {
> +         if($group_clause::arity == 0){
> +             $group_clause::arity = 1;
> +         } else {
> +             if($group_clause::arity != 1){
> +                 throw new ParserValidationException( input, new SourceLocation( (PigParserNode)$group_item.start ),
> +                     "The arity of the group by columns do not match." );
> +             }
> +         }
> +     }
> +     | ANY ) ( INNER | OUTER )?
> +
>  ;
>  
>  rel : alias { validateAliasRef( aliases, $alias.node, $alias.name ); }
> {code}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira