Re: Pig and Storm
I've added a wiki page for a "Pig on Storm Proposal" at
https://cwiki.apache.org/confluence/display/PIG/Pig+on+Storm+Proposal

I've included a primer on Storm (and Trident) as well as some of the
challenges I foresee. Please read through my proposal and let me know what
your thoughts are.

On Wed, Jul 24, 2013 at 2:36 PM, Alan Gates wrote:

> This sounds exciting. The next question is how do you plan to do it?
> Would a physical plan be translated to a Storm job (or jobs)? Would it
> need a different physical plan? Or would you just have the connection at
> the language layer and all the planning separate? Do you envision needing
> extensions/changes to the language to support Storm? Feel free to add a
> page to Pig's wiki with your thoughts on an approach.
>
> Alan.
>
> On Jul 23, 2013, at 9:52 AM, Pradeep Gollakota wrote:
>
> > Hi Pig Developers,
> >
> > I wanted to reach out to you all and ask for your opinion on something.
> >
> > As a Pig user, I have come to love Pig as a framework. Pig provides a
> > great set of abstractions that make working with large datasets easy.
> > Currently Pig is only backed by Hadoop. However, with the new rise of
> > Twitter Storm as a distributed real time processing engine, Pig users
> > are missing out on a great opportunity to be able to work with Pig in
> > Storm. As a user of Pig, Hadoop and Storm, and keeping with the Pig
> > philosophy of "Pigs live anywhere," I'd like to get your thoughts on
> > starting the implementation of a Pig backend for Storm.
> >
> > Thanks
> > Pradeep
[jira] Subscription: PIG patch available
Issue Subscription
Filter: PIG patch available (15 issues)
Subscriber: pigdaily

Key         Summary
PIG-3389    "Set job.name" does not work with dump command
            https://issues.apache.org/jira/browse/PIG-3389
PIG-3374    CASE and IN fail when expression includes dereferencing operator
            https://issues.apache.org/jira/browse/PIG-3374
PIG-3359    Register Statements and Param Substitution in Macros
            https://issues.apache.org/jira/browse/PIG-3359
PIG-3346    New property that controls the number of combined splits
            https://issues.apache.org/jira/browse/PIG-3346
PIG-        Fix remaining Windows core unit test failures
            https://issues.apache.org/jira/browse/PIG-
PIG-3295    Casting from bytearray failing after Union (even when each field is from a single Loader)
            https://issues.apache.org/jira/browse/PIG-3295
PIG-3292    Logical plan invalid state: duplicate uid in schema during self-join to get cross product
            https://issues.apache.org/jira/browse/PIG-3292
PIG-3257    Add unique identifier UDF
            https://issues.apache.org/jira/browse/PIG-3257
PIG-3210    Pig fails to start when it cannot write log to log files
            https://issues.apache.org/jira/browse/PIG-3210
PIG-3199    Expose LogicalPlan via PigServer API
            https://issues.apache.org/jira/browse/PIG-3199
PIG-3166    Update eclipse .classpath according to ivy library.properties
            https://issues.apache.org/jira/browse/PIG-3166
PIG-3123    Simplify Logical Plans By Removing Unneccessary Identity Projections
            https://issues.apache.org/jira/browse/PIG-3123
PIG-3088    Add a builtin udf which removes prefixes
            https://issues.apache.org/jira/browse/PIG-3088
PIG-3021    Split results missing records when there is null values in the column comparison
            https://issues.apache.org/jira/browse/PIG-3021
PIG-1914    Support load/store JSON data in Pig
            https://issues.apache.org/jira/browse/PIG-1914

You may edit this subscription at:
https://issues.apache.org/jira/secure/FilterSubscription!default.jspa?subId=13225&filterId=12322384
[jira] [Updated] (PIG-3114) Duplicated macro name error when using pigunit
[ https://issues.apache.org/jira/browse/PIG-3114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sajid Raza updated PIG-3114:
    Attachment: PatchedPigTest.java

Temporary workaround in end-user code to get PigTest to work.

> Duplicated macro name error when using pigunit
> ----------------------------------------------
>
>                 Key: PIG-3114
>                 URL: https://issues.apache.org/jira/browse/PIG-3114
>             Project: Pig
>          Issue Type: Bug
>          Components: parser
>   Affects Versions: 0.11
>            Reporter: Chetan Nadgire
>            Assignee: Chetan Nadgire
>             Fix For: 0.12
>
>         Attachments: PatchedPigTest.java, PIG-3114.patch, PIG-3114.patch
>
>
> I'm using PigUnit to test a pig script within which a macro is defined.
> Pig runs fine on cluster but getting parsing error with pigunit.
> So I tried very basic pig script with macro and getting similar error.
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1000: Error during parsing. null. Reason: Duplicated macro name 'my_macro_1'
>     at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1607)
>     at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1546)
>     at org.apache.pig.PigServer.registerQuery(PigServer.java:516)
>     at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:988)
>     at org.apache.pig.pigunit.pig.GruntParser.processPig(GruntParser.java:61)
>     at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:412)
>     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:194)
>     at org.apache.pig.pigunit.pig.PigServer.registerScript(PigServer.java:56)
>     at org.apache.pig.pigunit.PigTest.registerScript(PigTest.java:160)
>     at org.apache.pig.pigunit.PigTest.assertOutput(PigTest.java:231)
>     at org.apache.pig.pigunit.PigTest.assertOutput(PigTest.java:261)
>     at FirstPigTest.MyPigTest.testTop2Queries(MyPigTest.java:32)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at junit.framework.TestCase.runTest(TestCase.java:176)
>     at junit.framework.TestCase.runBare(TestCase.java:141)
>     at junit.framework.TestResult$1.protect(TestResult.java:122)
>     at junit.framework.TestResult.runProtected(TestResult.java:142)
>     at junit.framework.TestResult.run(TestResult.java:125)
>     at junit.framework.TestCase.run(TestCase.java:129)
>     at junit.framework.TestSuite.runTest(TestSuite.java:255)
>     at junit.framework.TestSuite.run(TestSuite.java:250)
>     at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:84)
>     at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
>     at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
>     at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
>     at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
>     at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
>     at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
> Caused by: Failed to parse: null. Reason: Duplicated macro name 'my_macro_1'
>     at org.apache.pig.parser.QueryParserDriver.makeMacroDef(QueryParserDriver.java:406)
>     at org.apache.pig.parser.QueryParserDriver.expandMacro(QueryParserDriver.java:277)
>     at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:178)
>     at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1599)
>     ... 30 more
>
> Pig script which is failing :
> {code:title=test.pig|borderStyle=solid}
> DEFINE my_macro_1 (QUERY, A) RETURNS C {
>     $C = ORDER $QUERY BY total DESC, $A;
> } ;
> data = LOAD 'input' AS (query:CHARARRAY);
> queries_group = GROUP data BY query;
> queries_count = FOREACH queries_group GENERATE group AS query, COUNT(data) AS total;
> queries_ordered = my_macro_1(queries_count, query);
> queries_limit = LIMIT queries_ordered 2;
> STORE queries_limit INTO 'output';
> {code}
> If I remove macro pigunit works fine. Even just defining macro without using
> it results in parsing error.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (PIG-3114) Duplicated macro name error when using pigunit
[ https://issues.apache.org/jira/browse/PIG-3114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13719060#comment-13719060 ]

Sajid Raza commented on PIG-3114:

Attached the workaround to this JIRA.

> Duplicated macro name error when using pigunit
> ----------------------------------------------
>
>                 Key: PIG-3114
>                 URL: https://issues.apache.org/jira/browse/PIG-3114
>             Project: Pig
>          Issue Type: Bug
>          Components: parser
>   Affects Versions: 0.11
>            Reporter: Chetan Nadgire
>            Assignee: Chetan Nadgire
>             Fix For: 0.12
>
>         Attachments: PatchedPigTest.java, PIG-3114.patch, PIG-3114.patch
[jira] [Commented] (PIG-3389) "Set job.name" does not work with dump command
[ https://issues.apache.org/jira/browse/PIG-3389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13718904#comment-13718904 ]

Alan Gates commented on PIG-3389:

+1

> "Set job.name" does not work with dump command
> ----------------------------------------------
>
>                 Key: PIG-3389
>                 URL: https://issues.apache.org/jira/browse/PIG-3389
>             Project: Pig
>          Issue Type: Bug
>          Components: grunt
>            Reporter: Cheolsoo Park
>            Assignee: Cheolsoo Park
>            Priority: Minor
>             Fix For: 0.12
>
>         Attachments: PIG-3389.patch
>
>
> The "job.name" property can be used to overwrite the default job name in Pig,
> but the dump command does not honor it.
> To reproduce the issue, run the following commands in Grunt shell in MR mode:
> {code}
> SET job.name 'FOO';
> a = LOAD '/foo';
> DUMP a;
> {code}
> You will see the job name is not 'FOO' in the JT UI. However, using store
> instead of dump sets the job name correctly.
[jira] [Updated] (PIG-2248) Pig parser does not detect when a macro name masks a UDF name
[ https://issues.apache.org/jira/browse/PIG-2248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alan Gates updated PIG-2248:
    Status: Open  (was: Patch Available)

Canceling patch as discussion is still on-going as to best approach

> Pig parser does not detect when a macro name masks a UDF name
> -------------------------------------------------------------
>
>                 Key: PIG-2248
>                 URL: https://issues.apache.org/jira/browse/PIG-2248
>             Project: Pig
>          Issue Type: Bug
>          Components: parser
>   Affects Versions: 0.9.0
>            Reporter: Alan Gates
>            Assignee: Johnny Zhang
>            Priority: Minor
>         Attachments: PIG-2248.patch.txt, PIG-2248.patch.txt, PIG-2248.patch.txt, PIG-2248.patch.txt
>
>
> Pig accepts a macro like:
> {code}
> define COUNT(in_relation, min_gpa) returns c {
>     b = filter $in_relation by gpa >= $min_gpa;
>     $c = foreach b generate age, name;
> }
> {code}
> This should produce a warning that it is masking a UDF.
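The check requested in PIG-2248 can be pictured with a small standalone sketch. It is not taken from any of the attached patches, and the thread notes that the best approach was still under discussion. The class name `MacroMaskSketch`, the method `maskedNames`, the hard-coded UDF name set, and the exact-match comparison are all assumptions made for illustration; a real check would have to use whatever function registry the parser actually consults.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: report macro names that collide with known UDF names.
// This does not use or reflect Pig's parser classes.
final class MacroMaskSketch {

    // Returns the macro names that exactly match a known UDF name,
    // preserving the order in which the macros were defined.
    static List<String> maskedNames(List<String> macroNames, Set<String> udfNames) {
        List<String> masked = new ArrayList<>();
        for (String name : macroNames) {
            if (udfNames.contains(name)) {
                masked.add(name);
            }
        }
        return masked;
    }

    public static void main(String[] args) {
        // Assumed stand-in for the set of registered UDF names.
        Set<String> udfNames = new HashSet<>(Arrays.asList("COUNT", "SUM", "AVG"));
        List<String> macroNames = Arrays.asList("COUNT", "my_macro_1");

        for (String name : maskedNames(macroNames, udfNames)) {
            System.out.println("WARN: macro '" + name + "' masks a UDF of the same name");
        }
    }
}
```

With the macro from the issue, only `COUNT` would be reported; `my_macro_1` does not collide with anything in the assumed set.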
[jira] [Commented] (PIG-3182) Pig currently lacks functions to trim the whitespace only on one side
[ https://issues.apache.org/jira/browse/PIG-3182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13718879#comment-13718879 ]

Kousuke Saruta commented on PIG-3182:

Thank you for updating the title and description of this JIRA, Cheolsoo!

> Pig currently lacks functions to trim the whitespace only on one side
> ---------------------------------------------------------------------
>
>                 Key: PIG-3182
>                 URL: https://issues.apache.org/jira/browse/PIG-3182
>             Project: Pig
>          Issue Type: New Feature
>          Components: internal-udfs
>            Reporter: Padma Ravindran
>            Assignee: Kousuke Saruta
>            Priority: Minor
>              Labels: patch
>             Fix For: 0.12
>
>         Attachments: LTrim.java.patch, PIG-3182.patch, PIG-3182.patch, PIG-3182.patch
>
>
> Pig currently lacks function to trim the whitespace only on the left hand
> side of a given word
> ltrim(' lorem ') = 'lorem '
Re: Pig and Storm
This sounds exciting. The next question is how do you plan to do it? Would a
physical plan be translated to a Storm job (or jobs)? Would it need a
different physical plan? Or would you just have the connection at the
language layer and all the planning separate? Do you envision needing
extensions/changes to the language to support Storm? Feel free to add a page
to Pig's wiki with your thoughts on an approach.

Alan.

On Jul 23, 2013, at 9:52 AM, Pradeep Gollakota wrote:

> Hi Pig Developers,
>
> I wanted to reach out to you all and ask for your opinion on something.
>
> As a Pig user, I have come to love Pig as a framework. Pig provides a great
> set of abstractions that make working with large datasets easy. Currently
> Pig is only backed by Hadoop. However, with the new rise of Twitter Storm
> as a distributed real time processing engine, Pig users are missing out on
> a great opportunity to be able to work with Pig in Storm. As a user of Pig,
> Hadoop and Storm, and keeping with the Pig philosophy of "Pigs live
> anywhere," I'd like to get your thoughts on starting the implementation of
> a Pig backend for Storm.
>
> Thanks
> Pradeep
[jira] [Created] (PIG-3393) STARTSWITH udf doesn't override outputSchema method
Cheolsoo Park created PIG-3393:

             Summary: STARTSWITH udf doesn't override outputSchema method
                 Key: PIG-3393
                 URL: https://issues.apache.org/jira/browse/PIG-3393
             Project: Pig
          Issue Type: Bug
          Components: internal-udfs
            Reporter: Cheolsoo Park
            Assignee: Sriram Krishnan
             Fix For: 0.12

It appears that a wrong patch was committed in PIG-2879. Looking at the code
in trunk, the comments in the jira were not addressed, yet the patch was
committed:
# outputSchema() method should be overridden.
# exceptions should be handled better.
[jira] [Updated] (PIG-3163) Pig current releases lack a UDF endsWith.This UDF tests if a given string ends with the specified suffix.
[ https://issues.apache.org/jira/browse/PIG-3163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cheolsoo Park updated PIG-3163:
    Release Note: Pig now includes an ENDSWITH built-in UDF that checks for presence of a given suffix in a chararray.

> Pig current releases lack a UDF endsWith.This UDF tests if a given string ends with the specified suffix.
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-3163
>                 URL: https://issues.apache.org/jira/browse/PIG-3163
>             Project: Pig
>          Issue Type: New Feature
>          Components: piggybank
>   Affects Versions: 0.10.0
>            Reporter: Anuroopa George
>            Assignee: Sriram Krishnan
>             Fix For: 0.12
>
>         Attachments: pig-3163.patch
>
>
> Pig current releases lack a UDF endsWith. This UDF tests if a given string
> ends with the specified suffix. This UDF returns true if the character
> sequence represented by the string argument given as a suffix is a suffix of
> the character sequence represented by the given string; false otherwise. Also
> true will be returned if the given suffix is an empty string or is equal to
> the given String.
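The semantics described in PIG-3163 match those of `java.lang.String.endsWith`, which also returns true for an empty suffix and for a suffix equal to the whole string. The following standalone sketch only demonstrates that behaviour; it is not the code from pig-3163.patch. The class name `EndsWithSketch` and the choice to return null for null arguments are assumptions made here.

```java
// Illustrative sketch of the suffix test described in PIG-3163.
// Not the Pig UDF itself; it only shows the String.endsWith semantics.
final class EndsWithSketch {

    // Returns null when either argument is null (an assumption of this sketch),
    // otherwise whether `value` ends with `suffix`.
    static Boolean endsWith(String value, String suffix) {
        if (value == null || suffix == null) {
            return null;
        }
        return value.endsWith(suffix);
    }

    public static void main(String[] args) {
        System.out.println(endsWith("foobar", "bar"));    // prints "true"
        System.out.println(endsWith("foobar", "foo"));    // prints "false"
        System.out.println(endsWith("foobar", ""));       // prints "true" (empty suffix)
        System.out.println(endsWith("foobar", "foobar")); // prints "true" (suffix equals string)
    }
}
```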
[jira] [Updated] (PIG-3392) Document STARTSWITH and ENDSWITH UDFs
[ https://issues.apache.org/jira/browse/PIG-3392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cheolsoo Park updated PIG-3392:
    Description: PIG-2879 and PIG-3163 added new built-in udfs "STARTSWITH" and "ENDSWITH", but documentation is missing.  (was: PIG-3163 added a new built-in udf "ENDSWITH", but documentation is missing.)
        Summary: Document STARTSWITH and ENDSWITH UDFs  (was: Document ENDSWITH UDF)

> Document STARTSWITH and ENDSWITH UDFs
> -------------------------------------
>
>                 Key: PIG-3392
>                 URL: https://issues.apache.org/jira/browse/PIG-3392
>             Project: Pig
>          Issue Type: Improvement
>          Components: documentation
>            Reporter: Cheolsoo Park
>            Assignee: Sriram Krishnan
>             Fix For: 0.12
>
>
> PIG-2879 and PIG-3163 added new built-in udfs "STARTSWITH" and "ENDSWITH",
> but documentation is missing.
[jira] [Created] (PIG-3392) Document ENDSWITH UDF
Cheolsoo Park created PIG-3392:

             Summary: Document ENDSWITH UDF
                 Key: PIG-3392
                 URL: https://issues.apache.org/jira/browse/PIG-3392
             Project: Pig
          Issue Type: Improvement
          Components: documentation
            Reporter: Cheolsoo Park
            Assignee: Sriram Krishnan
             Fix For: 0.12

PIG-3163 added a new built-in udf "ENDSWITH", but documentation is missing.
[jira] [Updated] (PIG-3182) Pig currently lacks functions to trim the whitespace only on one side
[ https://issues.apache.org/jira/browse/PIG-3182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cheolsoo Park updated PIG-3182:
      Resolution: Fixed
    Release Note: LTRIM and RTRIM are new built-in UDFs that trim the whitespace of a string on the left and right hand side respectively.  (was: Patch available)
          Status: Resolved  (was: Patch Available)

Committed to trunk.

> Pig currently lacks functions to trim the whitespace only on one side
> ---------------------------------------------------------------------
>
>                 Key: PIG-3182
>                 URL: https://issues.apache.org/jira/browse/PIG-3182
>             Project: Pig
>          Issue Type: New Feature
>          Components: internal-udfs
>            Reporter: Padma Ravindran
>            Assignee: Kousuke Saruta
>            Priority: Minor
>              Labels: patch
>             Fix For: 0.12
>
>         Attachments: LTrim.java.patch, PIG-3182.patch, PIG-3182.patch, PIG-3182.patch
>
>
> Pig currently lacks function to trim the whitespace only on the left hand
> side of a given word
> ltrim(' lorem ') = 'lorem '
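The one-sided trimming described in the release note can be illustrated with a short standalone sketch. It is not the code from LTrim.java.patch or PIG-3182.patch. The class name `TrimSketch`, the use of `Character.isWhitespace` to decide what counts as whitespace, and the null handling are assumptions made for this example.

```java
// Illustrative sketch of left-only and right-only trimming.
// Not the committed LTRIM/RTRIM UDFs.
final class TrimSketch {

    // Removes leading whitespace and leaves trailing whitespace untouched.
    static String ltrim(String s) {
        if (s == null) {
            return null; // assumption: null in, null out
        }
        int start = 0;
        while (start < s.length() && Character.isWhitespace(s.charAt(start))) {
            start++;
        }
        return s.substring(start);
    }

    // Removes trailing whitespace and leaves leading whitespace untouched.
    static String rtrim(String s) {
        if (s == null) {
            return null; // assumption: null in, null out
        }
        int end = s.length();
        while (end > 0 && Character.isWhitespace(s.charAt(end - 1))) {
            end--;
        }
        return s.substring(0, end);
    }

    public static void main(String[] args) {
        // Brackets make the remaining whitespace visible.
        System.out.println("[" + ltrim(" lorem ") + "]"); // prints "[lorem ]"
        System.out.println("[" + rtrim(" lorem ") + "]"); // prints "[ lorem]"
    }
}
```

The first call reproduces the example from the issue description, `ltrim(' lorem ') = 'lorem '`.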
[jira] [Updated] (PIG-3182) Pig currently lacks functions to trim the whitespace only on one side
[ https://issues.apache.org/jira/browse/PIG-3182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cheolsoo Park updated PIG-3182:
    Component/s: internal-udfs
        Summary: Pig currently lacks functions to trim the whitespace only on one side  (was: Pig currently lacks function to trim the whitespace only on the left hand side)

+1. Thank you Kousuke! Since you added both RTrim and LTrim, I updated the
title of the jira properly.

> Pig currently lacks functions to trim the whitespace only on one side
> ---------------------------------------------------------------------
>
>                 Key: PIG-3182
>                 URL: https://issues.apache.org/jira/browse/PIG-3182
>             Project: Pig
>          Issue Type: New Feature
>          Components: internal-udfs
>            Reporter: Padma Ravindran
>            Assignee: Kousuke Saruta
>            Priority: Minor
>              Labels: patch
>             Fix For: 0.12
>
>         Attachments: LTrim.java.patch, PIG-3182.patch, PIG-3182.patch, PIG-3182.patch
>
>
> Pig currently lacks function to trim the whitespace only on the left hand
> side of a given word
> ltrim(' lorem ') = 'lorem '
[jira] [Commented] (PIG-3114) Duplicated macro name error when using pigunit
[ https://issues.apache.org/jira/browse/PIG-3114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13718179#comment-13718179 ]

Ruslan Al-Fakikh commented on PIG-3114:

Sajid Raza: Can you please paste the code for your workaround?

> Duplicated macro name error when using pigunit
> ----------------------------------------------
>
>                 Key: PIG-3114
>                 URL: https://issues.apache.org/jira/browse/PIG-3114
>             Project: Pig
>          Issue Type: Bug
>          Components: parser
>   Affects Versions: 0.11
>            Reporter: Chetan Nadgire
>            Assignee: Chetan Nadgire
>             Fix For: 0.12
>
>         Attachments: PIG-3114.patch, PIG-3114.patch