[ 
https://issues.apache.org/jira/browse/PIG-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13003615#comment-13003615
 ] 

Thejas M Nair commented on PIG-1618:
------------------------------------

bq. does this only change describe output, or does it have other non-backward-compatible changes?
It changes the schema of the statement, so any statement that relies on a 
non-null schema of a foreach statement will no longer work.
But I think the new behavior is correct.

For example -
{code}
grunt> describe g;
g: {group: bytearray,a: {(null)}}
grunt> f = foreach g generate $0 , flatten(a);
grunt> f2 = foreach f generate group;
-- this would give an error - Projected field [group] does not exist in schema: null.
{code}
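
A script that hits this could be adjusted roughly as follows. This is only a sketch, assuming the input can be given a schema at load time; the path 'data' and the field names k and v are hypothetical and not taken from any real script.
{code}
-- hypothetical input path and field names, for illustration only
a = load 'data' as (k, v);
g = group a by k;
-- the bag a now has a known schema, so the flattened output does too
f = foreach g generate $0, flatten(a);
-- works, because the schema of f is not null
f2 = foreach f generate group;
{code}
If no schema can be declared, referring to the field by position ($0 instead of group) should also avoid the error, since positional references do not need a schema.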

It also affects lineage tracing in the case of cogroup where the inputs don't 
have a schema.

{code}
a = load 'a' using PigStorage('a') ;
b = load 'a' using PigStorage('b') ;
c = cogroup a by $0, b by $0 ;
d = foreach c generate group, flatten(a), flatten(b)  ;
e = foreach d generate $1 + 1, $2 + 1  ;
-- in 0.8 the load func spec of a would have been associated with $1, and that
-- of b with $2.
{code}

This also means that some valid statements that would not work with 0.8 will 
now work -

{code}
a = load 'a' using PigStorage('a') ;
b = group a by $0;
c = foreach b generate group, flatten(a);
d = foreach c generate $2; 
-- in 0.8 this would have given an error:
-- Error during parsing. Out of bound access. Trying to access non-existent column: 2. Schema {group: bytearray,bytearray} has 2 column(s).
-- in trunk this will work as it should
{code}



> Switch to new parser generator technology
> -----------------------------------------
>
>                 Key: PIG-1618
>                 URL: https://issues.apache.org/jira/browse/PIG-1618
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.8.0
>            Reporter: Alan Gates
>            Assignee: Xuefu Zhang
>             Fix For: 0.9.0
>
>         Attachments: NewParser-1.patch, NewParser-10.patch, 
> NewParser-11.patch, NewParser-12.patch, NewParser-13.2.patch, 
> NewParser-13.patch, NewParser-14.patch, NewParser-15.patch, 
> NewParser-18.patch, NewParser-19.3.patch, NewParser-19.patch, 
> NewParser-2.patch, NewParser-3.patch, NewParser-3.patch, NewParser-4.patch, 
> NewParser-5.patch, NewParser-6.patch, NewParser-7.patch, NewParser-8.patches, 
> NewParser-9.patch, antlr-3.2.jar, javadoc.patch
>
>
> There are many bugs in Pig related to the parser, particularly to bad error 
> messages.  After review of JavaCC we feel these will be difficult to address 
> using that tool.  Also, the .jjt files used by JavaCC are hard to understand 
> and maintain.  
> ANTLR is being reviewed as the most likely choice to move to, but other 
> parsers will be reviewed as well.
> This JIRA will act as an umbrella issue for other parser issues.

