[ 
https://issues.apache.org/jira/browse/PIG-3114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13549272#comment-13549272
 ] 

Chetan Nadgire commented on PIG-3114:
-------------------------------------

Here's my investigation:

I am using assertOutput(aliasInput, input, alias, expected) to verify output. 

1. 'PigTest::assertOutput(String aliasInput, String[] input, String alias, 
String[] expected)' calls  'PigTest::registerScript()' to parse input query and 
to substitute aliases with input/output relations and load/store commands and 
generate query to be submitted to pigserver.

It copies input (from pigtest) to destination file and calls 
'PigTest::assertOutput(String alias, String[] expected)'

2. PigTest::assertOutput(String alias, String[] expected) again calls 
PigTest::registerScript() and submits generated query to pigserver again. Note 
that PigTest uses same instance on PigServer

It then appends newline to expected output and calls 
'assertEquals(StringUtils.join(expected, "\n"), 
StringUtils.join(getAlias(alias), "\n"));'

3. PigTest::getAlias() also calls registerScript()


So PigServer::registerScript() it called 3 times with modification to same 
query. PigServer::Graph::scriptCache is used to store query and since same 
PigServer is used for three calls, same query gets appended in scriptCache.
When this query is parse it errors out due to multiple definitions of macro.

Here's flow: 

PigTest::assertOutput(String aliasInput, String[] input, String alias, String[] 
expected)
 -> PigTest::registerScript()
    ->  PigServer::registerScript()

query submitted to pigserver (scriptCache):


DEFINE my_macro_1 (QUERY, A) RETURNS C {
    $C = ORDER $QUERY BY total DESC, $A;
} 

data =  LOAD 'input' AS (query:CHARARRAY);
queries_group = GROUP data BY query;
queries_count = FOREACH queries_group GENERATE group AS query, COUNT(data) AS 
total;
queries_ordered = my_macro_1(queries_count, query);
queries_limit = LIMIT queries_ordered 2;
STORE queries_limit INTO 'output';

----------------------------------------------------------------------------------------

PigTest::assertOutput(String alias, String[] expected)
 -> PigTest::registerScript()
    ->  PigServer::registerScript()

query submitted to pigserver (scriptCache):

DEFINE my_macro_1 (QUERY, A) RETURNS C {
    $C = ORDER $QUERY BY total DESC, $A;
} 

data =  LOAD 'input' AS (query:CHARARRAY);
queries_group = GROUP data BY query;
queries_count = FOREACH queries_group GENERATE group AS query, COUNT(data) AS 
total;
queries_ordered = my_macro_1(queries_count, query);
queries_limit = LIMIT queries_ordered 2;
STORE queries_limit INTO 'output';
DEFINE my_macro_1 (QUERY, A) RETURNS C {
    $C = ORDER $QUERY BY total DESC, $A;
} 

data =  LOAD 'input' AS (query:CHARARRAY);
queries_group = GROUP data BY query;
queries_count = FOREACH queries_group GENERATE group AS query, COUNT(data) AS 
total;
queries_ordered = my_macro_1(queries_count, query);
queries_limit = LIMIT queries_ordered 2;
STORE queries_limit INTO 'output';

----------------------------------------------------------------------------------------

PigTest::assertEquals(String expected, String current)
 -> PigTest::registerScript()
    ->  PigServer::registerScript()

query submitted to pigserver (scriptCache):

DEFINE my_macro_1 (QUERY, A) RETURNS C {
    $C = ORDER $QUERY BY total DESC, $A;
} 

data =  LOAD 'input' AS (query:CHARARRAY);
queries_group = GROUP data BY query;
queries_count = FOREACH queries_group GENERATE group AS query, COUNT(data) AS 
total;
queries_ordered = my_macro_1(queries_count, query);
queries_limit = LIMIT queries_ordered 2;
STORE queries_limit INTO 'output';
DEFINE my_macro_1 (QUERY, A) RETURNS C {
    $C = ORDER $QUERY BY total DESC, $A;
} 

data =  LOAD 'input' AS (query:CHARARRAY);
queries_group = GROUP data BY query;
queries_count = FOREACH queries_group GENERATE group AS query, COUNT(data) AS 
total;
queries_ordered = my_macro_1(queries_count, query);
queries_limit = LIMIT queries_ordered 2;
STORE queries_limit INTO 'output';
DEFINE my_macro_1 (QUERY, A) RETURNS C {
    $C = ORDER $QUERY BY total DESC, $A;
} 

data =  LOAD 'input' AS (query:CHARARRAY);
queries_group = GROUP data BY query;
queries_count = FOREACH queries_group GENERATE group AS query, COUNT(data) AS 
total;
queries_ordered = my_macro_1(queries_count, query);
queries_limit = LIMIT queries_ordered 2;
STORE queries_limit INTO 'output';

                
> Duplicated macro name error when using pigunit
> ----------------------------------------------
>
>                 Key: PIG-3114
>                 URL: https://issues.apache.org/jira/browse/PIG-3114
>             Project: Pig
>          Issue Type: Bug
>          Components: parser
>            Reporter: Chetan Nadgire
>
> I'm using PigUnit to test a pig script within which a macro is defined.
> Pig runs fine on cluster but getting parsing error with pigunit.
> So I tried very basic pig script with macro and getting similar error.
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1000: Error during 
> parsing. <line 9> null. Reason: Duplicated macro name 'my_macro_1'
>       at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1607)
>       at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1546)
>       at org.apache.pig.PigServer.registerQuery(PigServer.java:516)
>       at 
> org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:988)
>       at 
> org.apache.pig.pigunit.pig.GruntParser.processPig(GruntParser.java:61)
>       at 
> org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:412)
>       at 
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:194)
>       at 
> org.apache.pig.pigunit.pig.PigServer.registerScript(PigServer.java:56)
>       at org.apache.pig.pigunit.PigTest.registerScript(PigTest.java:160)
>       at org.apache.pig.pigunit.PigTest.assertOutput(PigTest.java:231)
>       at org.apache.pig.pigunit.PigTest.assertOutput(PigTest.java:261)
>       at FirstPigTest.MyPigTest.testTop2Queries(MyPigTest.java:32)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>       at java.lang.reflect.Method.invoke(Method.java:597)
>       at junit.framework.TestCase.runTest(TestCase.java:176)
>       at junit.framework.TestCase.runBare(TestCase.java:141)
>       at junit.framework.TestResult$1.protect(TestResult.java:122)
>       at junit.framework.TestResult.runProtected(TestResult.java:142)
>       at junit.framework.TestResult.run(TestResult.java:125)
>       at junit.framework.TestCase.run(TestCase.java:129)
>       at junit.framework.TestSuite.runTest(TestSuite.java:255)
>       at junit.framework.TestSuite.run(TestSuite.java:250)
>       at 
> org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:84)
>       at 
> org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
>       at 
> org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
>       at 
> org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
>       at 
> org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
>       at 
> org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
>       at 
> org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
> Caused by: Failed to parse: <line 9> null. Reason: Duplicated macro name 
> 'my_macro_1'
>       at 
> org.apache.pig.parser.QueryParserDriver.makeMacroDef(QueryParserDriver.java:406)
>       at 
> org.apache.pig.parser.QueryParserDriver.expandMacro(QueryParserDriver.java:277)
>       at 
> org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:178)
>       at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1599)
>       ... 30 more
>  
> Pig script which is failing :
> {code:title=test.pig|borderStyle=solid}
> DEFINE my_macro_1 (QUERY, A) RETURNS C {
>     $C = ORDER $QUERY BY total DESC, $A;
> } ;
> data =  LOAD 'input' AS (query:CHARARRAY);
> queries_group = GROUP data BY query;
> queries_count = FOREACH queries_group GENERATE group AS query, COUNT(data) AS 
> total;
> queries_ordered = my_macro_1(queries_count, query);
> queries_limit = LIMIT queries_ordered 2;
> STORE queries_limit INTO 'output';
> {code}
> If I remove macro pigunit works fine. Even just defining macro without using 
> it results in parsing error.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to