[jira] Subscription: PIG patch available
Issue Subscription
Filter: PIG patch available (19 issues)
Subscriber: pigdaily

Key         Summary
PIG-3894    Datetime function AddDuration, SubtractDuration and all Between functions don't check for null values in the input tuple.
            https://issues.apache.org/jira/browse/PIG-3894
PIG-3877    Getting Geo Latitude/Longitude from Address Lines
            https://issues.apache.org/jira/browse/PIG-3877
PIG-3874    FileLocalizer temp path can sometimes be non-unique
            https://issues.apache.org/jira/browse/PIG-3874
PIG-3873    Geo distance calculation using Haversine
            https://issues.apache.org/jira/browse/PIG-3873
PIG-3867    Added hadoop home to build classpath for build pig with unit test on windows
            https://issues.apache.org/jira/browse/PIG-3867
PIG-3866    Create ThreadLocal classloader per PigContext
            https://issues.apache.org/jira/browse/PIG-3866
PIG-3865    Remodel the XMLLoader to work to be faster and more maintainable
            https://issues.apache.org/jira/browse/PIG-3865
PIG-3861    duplicate jars get added to distributed cache
            https://issues.apache.org/jira/browse/PIG-3861
PIG-3825    Stats collection needs to be changed for hadoop2 (with auto local mode)
            https://issues.apache.org/jira/browse/PIG-3825
PIG-3772    Syntax error when casting an inner schema of a bag and line break involved
            https://issues.apache.org/jira/browse/PIG-3772
PIG-3771    Piggybank Avrostorage makes a lot of namenode calls in the backend
            https://issues.apache.org/jira/browse/PIG-3771
PIG-3737    Bundle dependent jars in distribution in %PIG_HOME%/lib folder
            https://issues.apache.org/jira/browse/PIG-3737
PIG-3735    UDF to data cleanse the dirty data with expected pattern
            https://issues.apache.org/jira/browse/PIG-3735
PIG-3668    COR built-in function when atleast one of the coefficient values is NaN
            https://issues.apache.org/jira/browse/PIG-3668
PIG-3635    Fix e2e tests for Hadoop 2.X on Windows
            https://issues.apache.org/jira/browse/PIG-3635
PIG-3613    UDF for SimilarityMatching between strings with matching scores
            https://issues.apache.org/jira/browse/PIG-3613
PIG-3587    add functionality for rolling over dates
            https://issues.apache.org/jira/browse/PIG-3587
PIG-3441    Allow Pig to use default resources from Configuration objects
            https://issues.apache.org/jira/browse/PIG-3441
PIG-3373    XMLLoader returns non-matching nodes when a tag name spans through the block boundary
            https://issues.apache.org/jira/browse/PIG-3373

You may edit this subscription at:
https://issues.apache.org/jira/secure/FilterSubscription!default.jspa?subId=13225&filterId=12322384
[jira] [Commented] (PIG-3771) Piggybank Avrostorage makes a lot of namenode calls in the backend
[ https://issues.apache.org/jira/browse/PIG-3771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13970103#comment-13970103 ]

Cheolsoo Park commented on PIG-3771:
------------------------------------

I was wondering whether it's possible to change the key type of schemaToMergedSchemaMap from Path to URI to [de]serialize it directly, but it seems to require quite a few changes.

+1.

> Piggybank Avrostorage makes a lot of namenode calls in the backend
> ------------------------------------------------------------------
>
> Key: PIG-3771
> URL: https://issues.apache.org/jira/browse/PIG-3771
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.11.1
> Reporter: Rohini Palaniswamy
> Assignee: Rohini Palaniswamy
> Fix For: 0.13.0
>
> Attachments: PIG-3771-1.patch
>
>
> The amount of list status calls it makes in setLocation if combined with wildcards can really slow down the namenode.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Commented] (PIG-3892) Pig distribution for hadoop 2
[ https://issues.apache.org/jira/browse/PIG-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13970098#comment-13970098 ]

Daniel Dai commented on PIG-3892:
---------------------------------

It should be automatic. Do you have a use case where the user needs to explicitly pass the version number?

> Pig distribution for hadoop 2
> -----------------------------
>
> Key: PIG-3892
> URL: https://issues.apache.org/jira/browse/PIG-3892
> Project: Pig
> Issue Type: Bug
> Components: build
> Reporter: Daniel Dai
> Assignee: Daniel Dai
> Fix For: 0.13.0
>
>
> Currently the Pig distribution only bundles pig.jar for Hadoop 1. Hadoop 2 users need to compile again using the -Dhadoopversion=23 flag. That is quite a confusing process. We need to make Pig work with Hadoop 2 out of the box. I am thinking of two approaches:
> 1. Bundle both pig-h1.jar and pig-h2.jar in the distribution, and bin/pig will choose the right pig.jar to run
> 2. Make two Pig distributions for Hadoop 1 and Hadoop
> Any opinion?

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Commented] (PIG-3892) Pig distribution for hadoop 2
[ https://issues.apache.org/jira/browse/PIG-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13970018#comment-13970018 ]

Prashant Kommireddi commented on PIG-3892:
------------------------------------------

+1 for 1.

[~daijy] - would the way to invoke a certain version be passed as an argument to bin/pig, an env variable, both, something else?

> Pig distribution for hadoop 2
> -----------------------------
>
> Key: PIG-3892
> URL: https://issues.apache.org/jira/browse/PIG-3892
> Project: Pig
> Issue Type: Bug
> Components: build
> Reporter: Daniel Dai
> Assignee: Daniel Dai
> Fix For: 0.13.0
>
>
> Currently the Pig distribution only bundles pig.jar for Hadoop 1. Hadoop 2 users need to compile again using the -Dhadoopversion=23 flag. That is quite a confusing process. We need to make Pig work with Hadoop 2 out of the box. I am thinking of two approaches:
> 1. Bundle both pig-h1.jar and pig-h2.jar in the distribution, and bin/pig will choose the right pig.jar to run
> 2. Make two Pig distributions for Hadoop 1 and Hadoop
> Any opinion?

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Commented] (PIG-3892) Pig distribution for hadoop 2
[ https://issues.apache.org/jira/browse/PIG-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13970001#comment-13970001 ]

Alan Gates commented on PIG-3892:
---------------------------------

+1 for 1. IIRC bin/hadoop has a -version option, so we don't even need to depend on magic jars being present, we can just ask hadoop.

> Pig distribution for hadoop 2
> -----------------------------
>
> Key: PIG-3892
> URL: https://issues.apache.org/jira/browse/PIG-3892
> Project: Pig
> Issue Type: Bug
> Components: build
> Reporter: Daniel Dai
> Assignee: Daniel Dai
> Fix For: 0.13.0
>
>
> Currently the Pig distribution only bundles pig.jar for Hadoop 1. Hadoop 2 users need to compile again using the -Dhadoopversion=23 flag. That is quite a confusing process. We need to make Pig work with Hadoop 2 out of the box. I am thinking of two approaches:
> 1. Bundle both pig-h1.jar and pig-h2.jar in the distribution, and bin/pig will choose the right pig.jar to run
> 2. Make two Pig distributions for Hadoop 1 and Hadoop
> Any opinion?

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Commented] (PIG-3880) After compiling trunk, I am seeing ClassLoaderObjectInputStream ClassNotFoundException.
[ https://issues.apache.org/jira/browse/PIG-3880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13969951#comment-13969951 ]

David Medinets commented on PIG-3880:
-------------------------------------

Good point. Perhaps my version of hadoop is too old?

Hadoop 0.20.203.0
Subversion http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20-security-203 -r 1099333
Compiled by oom on Wed May 4 07:57:50 PDT 2011

> After compiling trunk, I am seeing ClassLoaderObjectInputStream ClassNotFoundException.
> ---------------------------------------------------------------------------------------
>
> Key: PIG-3880
> URL: https://issues.apache.org/jira/browse/PIG-3880
> Project: Pig
> Issue Type: Bug
> Components: grunt
> Affects Versions: 0.13.0
> Reporter: David Medinets
>
> I pulled trunk from subversion using the following commands:
> mkdir pig
> cd pig
> svn co http://svn.apache.org/repos/asf/pig/trunk
> cd trunk
> ant
> export PATH=$PATH:$HOME/pig/trunk/bin
> export ACCUMULO_HOME=/opt/accumulo
> export HADOOP_HOME=/opt/hadoop
> export PIG_HOME=$HOME/pig/trunk
> export PIG_CLASSPATH="$HOME/pig/trunk/build/ivy/lib/Pig/*"
> export PIG_CLASSPATH="$ACCUMULO_HOME/lib/*:$PIG_CLASSPATH"
> cd ~
> pig
> Then I ran into this error:
> java.lang.NoClassDefFoundError: org/apache/commons/io/input/ClassLoaderObjectInputStream
>     at org.apache.pig.Main.run(Main.java:399)
> When I change PIG_JAR to use the fat jar, I was able to run the pig command without getting the exception.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Updated] (PIG-3456) Reduce threadlocal conf access in backend for each record
[ https://issues.apache.org/jira/browse/PIG-3456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy updated PIG-3456:
------------------------------------

Status: Open (was: Patch Available)

Cancelling patch as it needs to be rebased after PIG-3591

> Reduce threadlocal conf access in backend for each record
> ---------------------------------------------------------
>
> Key: PIG-3456
> URL: https://issues.apache.org/jira/browse/PIG-3456
> Project: Pig
> Issue Type: Improvement
> Affects Versions: 0.11.1
> Reporter: Rohini Palaniswamy
> Assignee: Rohini Palaniswamy
> Fix For: 0.13.0
>
> Attachments: PIG-3456-1-no-whitespace.patch, PIG-3456-1.patch
>
>
> Noticed few things while browsing code
> 1) DefaultTuple has a protected boolean isNull = false; which is never used. Removing this gives ~3-5% improvement for big jobs
> 2) Config checking with ThreadLocal conf is repeatedly done for each record. For eg: createDataBag in POCombinerPackage. But initialized only for first time in other places like POPackage, POJoinPackage, etc.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
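Point 2 of the PIG-3456 description (a configuration check repeated for each record versus one done only the first time) can be illustrated with a small sketch. Everything named here is hypothetical: `CachedConfLookup`, the `hypothetical.bag.flag` key, and the plain `ThreadLocal` map standing in for the thread-local configuration are assumptions for illustration, not Pig's actual classes or the attached patch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an operator that needs a config flag for every record.
final class CachedConfLookup {
    // Stand-in for the thread-local job configuration (an assumption, not Pig's API).
    static final ThreadLocal<Map<String, String>> CONF =
            ThreadLocal.withInitial(HashMap::new);

    // Counts how often the thread-local configuration is consulted.
    static int confReads = 0;

    // Cached flag; stays null until the first record is processed.
    private Boolean cachedFlag;

    // Costly variant: consults the thread-local configuration on every call.
    static boolean readFlag() {
        confReads++;
        return Boolean.parseBoolean(CONF.get().get("hypothetical.bag.flag"));
    }

    // Cheaper variant: reads the flag once and reuses it for all later records.
    boolean flagForRecord() {
        if (cachedFlag == null) {
            cachedFlag = readFlag();
        }
        return cachedFlag;
    }
}
```

Calling `flagForRecord()` in the per-record loop touches the thread-local map only once per operator instance, which is the kind of saving the issue is after.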
[jira] [Commented] (PIG-3771) Piggybank Avrostorage makes a lot of namenode calls in the backend
[ https://issues.apache.org/jira/browse/PIG-3771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13969938#comment-13969938 ]

Rohini Palaniswamy commented on PIG-3771:
-----------------------------------------

Can someone review this?

> Piggybank Avrostorage makes a lot of namenode calls in the backend
> ------------------------------------------------------------------
>
> Key: PIG-3771
> URL: https://issues.apache.org/jira/browse/PIG-3771
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.11.1
> Reporter: Rohini Palaniswamy
> Assignee: Rohini Palaniswamy
> Fix For: 0.13.0
>
> Attachments: PIG-3771-1.patch
>
>
> The amount of list status calls it makes in setLocation if combined with wildcards can really slow down the namenode.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Commented] (PIG-3874) FileLocalizer temp path can sometimes be non-unique
[ https://issues.apache.org/jira/browse/PIG-3874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13969889#comment-13969889 ]

Rohini Palaniswamy commented on PIG-3874:
-----------------------------------------

Can you simplify
{code}
String tempPath = FileLocalizer.getTemporaryPath(pigContext).toString();
Path path = new Path(tempPath);
URI uri = path.toUri();
String prefix = "";
if (uri.getScheme() != null) {
    prefix = uri.getScheme() + ":";
}
assertTrue(tempPath.startsWith(prefix + pigTempDir.getPath()));
{code}
to
{code}
String tempPath = FileLocalizer.getTemporaryPath(pigContext).toString();
Path path = new Path(tempPath);
// String.startsWith takes a String, so the URI is converted explicitly
assertTrue(tempPath.startsWith(pigTempDir.toURI().toString()));
{code}

> FileLocalizer temp path can sometimes be non-unique
> ---------------------------------------------------
>
> Key: PIG-3874
> URL: https://issues.apache.org/jira/browse/PIG-3874
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.13.0
> Reporter: Mona Chitnis
> Assignee: Mona Chitnis
> Fix For: 0.13.0
>
> Attachments: PIG-3874-1.patch, PIG-3874.patch
>
>
> In some rare corner cases, more than one process can arrive at the same randomly generated temporary path to localize task files. This needs to be handled with a check to see if location already exists and to get a unique path.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
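The PIG-3874 description asks for a check that the randomly generated location does not already exist. A minimal sketch of that idea follows, under stated assumptions: it uses the local filesystem through `java.nio.file` rather than Hadoop's `FileSystem`, and `UniqueTempPath`, its parameters, and the retry limit are hypothetical names, not what the attached patches do.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Random;

// Hypothetical helper: claims a unique temporary directory under `parent`.
final class UniqueTempPath {
    static Path create(Path parent, String prefix, Random random, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            Path candidate = parent.resolve(prefix + random.nextInt(Integer.MAX_VALUE));
            try {
                // createDirectory checks for existence and creates in one atomic step,
                // so two processes cannot both claim the same candidate.
                return Files.createDirectory(candidate);
            } catch (FileAlreadyExistsException collision) {
                // Another process (or an earlier call) owns this name; draw a new one.
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        throw new IllegalStateException(
                "no unique temp path found after " + maxAttempts + " attempts");
    }
}
```

Creating the directory instead of only testing for its existence avoids the window between the check and the first use in which a second process could pick the same name.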
Re: [ANNOUNCE] Apache Pig 0.12.1 released
Thanks Prashant!

On Tue, Apr 15, 2014 at 10:58 AM, Cheolsoo Park wrote:

> Thank you Prashant for your hard work!
>
>
> On Mon, Apr 14, 2014 at 5:37 PM, Daniel Dai wrote:
>
>> Thanks Prashant!
>>
>> On Mon, Apr 14, 2014 at 5:30 PM, Prashant Kommireddi wrote:
>> > The Pig team is happy to announce the Pig 0.12.1 release.
>> >
>> > Apache Pig provides a high-level data-flow language and execution framework
>> > for parallel computation on Hadoop clusters.
>> >
>> > More details about Pig can be found at http://pig.apache.org/.
>> >
>> > This is a maintenance release of Pig 0.12 and contains several bug fixes
>> > and improvements. The details of the release can be found at
>> > http://pig.apache.org/releases.html.
>> >
>> > You can download the release here
>> > http://www.apache.org/dyn/closer.cgi/pig
>> >
>> > The released maven artifacts have been made available on
>> > repository.apache.org
>> >
>> > We would like to thank all contributors that made this release possible.
>> >
>> > Thanks,
>> > Prashant Kommireddi
[jira] [Commented] (PIG-3772) Syntax error when casting an inner schema of a bag and line break involved
[ https://issues.apache.org/jira/browse/PIG-3772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13969848#comment-13969848 ]

Daniel Dai commented on PIG-3772:
---------------------------------

Thanks [~ssvinarchukhorton], the patch works for me. There is another occurrence of the same pattern in " MORE", shall we change it as well?

> Syntax error when casting an inner schema of a bag and line break involved
> --------------------------------------------------------------------------
>
> Key: PIG-3772
> URL: https://issues.apache.org/jira/browse/PIG-3772
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.11.1
> Reporter: Haishan Liu
> Assignee: Sergey Svinarchuk
> Fix For: 0.13.0
>
> Attachments: PIG-3772.patch
>
>
> Hi,
> The following script fails with syntax error
> {code}
> A = load 'data' as (a:{(x:chararray, y:float)}, b:chararray);
> B = foreach A generate
>     b,
>     (bag{tuple(long)}) a.x as ax:{(x:long)};
> {code}
> where the cast statement is on its own line.
> The script fails with the following exception:
> {code}
> 19-02-2014 17:30:22 PST bug_script ERROR - org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1000: Error during parsing. Syntax error, unexpected symbol at or near 'bag'
> 19-02-2014 17:30:22 PST bug_script ERROR -     at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1607)
> 19-02-2014 17:30:22 PST bug_script ERROR -     at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1546)
> 19-02-2014 17:30:22 PST bug_script ERROR -     at org.apache.pig.PigServer.registerQuery(PigServer.java:516)
> 19-02-2014 17:30:22 PST bug_script ERROR -     at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:988)
> 19-02-2014 17:30:22 PST bug_script ERROR -     at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:412)
> 19-02-2014 17:30:22 PST bug_script ERROR -     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:194)
> 19-02-2014 17:30:22 PST bug_script ERROR -     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:170)
> 19-02-2014 17:30:22 PST bug_script ERROR -     at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
> 19-02-2014 17:30:22 PST bug_script ERROR -     at org.apache.pig.Main.run(Main.java:604)
> 19-02-2014 17:30:22 PST bug_script ERROR -     at org.apache.pig.PigRunner.run(PigRunner.java:49)
> 19-02-2014 17:30:22 PST bug_script ERROR -     at azkaban.jobtype.HadoopSecurePigWrapper.runPigJob(HadoopSecurePigWrapper.java:116)
> 19-02-2014 17:30:22 PST bug_script ERROR -     at azkaban.jobtype.HadoopSecurePigWrapper$1.run(HadoopSecurePigWrapper.java:106)
> 19-02-2014 17:30:22 PST bug_script ERROR -     at azkaban.jobtype.HadoopSecurePigWrapper$1.run(HadoopSecurePigWrapper.java:103)
> 19-02-2014 17:30:22 PST bug_script ERROR -     at java.security.AccessController.doPrivileged(Native Method)
> 19-02-2014 17:30:22 PST bug_script ERROR -     at javax.security.auth.Subject.doAs(Subject.java:396)
> 19-02-2014 17:30:22 PST bug_script ERROR -     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
> 19-02-2014 17:30:22 PST bug_script ERROR -     at azkaban.jobtype.HadoopSecurePigWrapper.main(HadoopSecurePigWrapper.java:103)
> 19-02-2014 17:30:22 PST bug_script ERROR - Caused by: Failed to parse: Syntax error, unexpected symbol at or near 'bag'
> 19-02-2014 17:30:22 PST bug_script ERROR -     at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:235)
> 19-02-2014 17:30:22 PST bug_script ERROR -     at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:177)
> 19-02-2014 17:30:22 PST bug_script ERROR -     at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1599)
> 19-02-2014 17:30:22 PST bug_script ERROR -     ... 16 more
> {code}
> The script succeeds if the foreach statement is written in one line:
> {code}
> B = foreach A generate b, (bag{tuple(long)}) a.x as ax:{(x:long)};
> {code}
> This problem happens only in batch mode.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
Re: [ANNOUNCE] Apache Pig 0.12.1 released
Thank you Prashant for your hard work!

On Mon, Apr 14, 2014 at 5:37 PM, Daniel Dai wrote:

> Thanks Prashant!
>
> On Mon, Apr 14, 2014 at 5:30 PM, Prashant Kommireddi wrote:
> > The Pig team is happy to announce the Pig 0.12.1 release.
> >
> > Apache Pig provides a high-level data-flow language and execution framework
> > for parallel computation on Hadoop clusters.
> >
> > More details about Pig can be found at http://pig.apache.org/.
> >
> > This is a maintenance release of Pig 0.12 and contains several bug fixes
> > and improvements. The details of the release can be found at
> > http://pig.apache.org/releases.html.
> >
> > You can download the release here
> > http://www.apache.org/dyn/closer.cgi/pig
> >
> > The released maven artifacts have been made available on
> > repository.apache.org
> >
> > We would like to thank all contributors that made this release possible.
> >
> > Thanks,
> > Prashant Kommireddi
[jira] [Commented] (PIG-3880) After compiling trunk, I am seeing ClassLoaderObjectInputStream ClassNotFoundException.
[ https://issues.apache.org/jira/browse/PIG-3880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13969808#comment-13969808 ]

Josh Elser commented on PIG-3880:
---------------------------------

What version of Hadoop are you using, [~medined]? I recall situations on other projects where the dependency management expected certain artifacts to be provided by hadoop when the user's version didn't actually provide that jar. I believe commons-io was one of these artifacts that I was bitten by too.

This seems to be a plausible explanation for what you're seeing. The jarwithhadoop would contain the dependencies and thus you wouldn't have the issues if your local hadoop install was missing necessary jars.

> After compiling trunk, I am seeing ClassLoaderObjectInputStream ClassNotFoundException.
> ---------------------------------------------------------------------------------------
>
> Key: PIG-3880
> URL: https://issues.apache.org/jira/browse/PIG-3880
> Project: Pig
> Issue Type: Bug
> Components: grunt
> Affects Versions: 0.13.0
> Reporter: David Medinets
>
> I pulled trunk from subversion using the following commands:
> mkdir pig
> cd pig
> svn co http://svn.apache.org/repos/asf/pig/trunk
> cd trunk
> ant
> export PATH=$PATH:$HOME/pig/trunk/bin
> export ACCUMULO_HOME=/opt/accumulo
> export HADOOP_HOME=/opt/hadoop
> export PIG_HOME=$HOME/pig/trunk
> export PIG_CLASSPATH="$HOME/pig/trunk/build/ivy/lib/Pig/*"
> export PIG_CLASSPATH="$ACCUMULO_HOME/lib/*:$PIG_CLASSPATH"
> cd ~
> pig
> Then I ran into this error:
> java.lang.NoClassDefFoundError: org/apache/commons/io/input/ClassLoaderObjectInputStream
>     at org.apache.pig.Main.run(Main.java:399)
> When I change PIG_JAR to use the fat jar, I was able to run the pig command without getting the exception.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Commented] (PIG-3737) Bundle dependent jars in distribution in %PIG_HOME%/lib folder
[ https://issues.apache.org/jira/browse/PIG-3737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13969793#comment-13969793 ]

Cheolsoo Park commented on PIG-3737:
------------------------------------

Thank you Daniel. That looks like a good list to me.

> Bundle dependent jars in distribution in %PIG_HOME%/lib folder
> --------------------------------------------------------------
>
> Key: PIG-3737
> URL: https://issues.apache.org/jira/browse/PIG-3737
> Project: Pig
> Issue Type: Bug
> Reporter: Shuaishuai Nie
> Assignee: Daniel Dai
> Attachments: PIG-3737.1.patch
>
>
> Pig should bundle with dependencies like avro.jar and json-simple.jar

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Commented] (PIG-3889) Direct fetch doesn't set job submission timestamps
[ https://issues.apache.org/jira/browse/PIG-3889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13969779#comment-13969779 ]

Lorand Bendig commented on PIG-3889:
------------------------------------

Cheolsoo, thanks for committing it!

> Direct fetch doesn't set job submission timestamps
> --------------------------------------------------
>
> Key: PIG-3889
> URL: https://issues.apache.org/jira/browse/PIG-3889
> Project: Pig
> Issue Type: Bug
> Reporter: Lorand Bendig
> Assignee: Lorand Bendig
> Fix For: 0.13.0
>
> Attachments: PIG-3889-2.patch, PIG-3889.patch
>
>
> The following query fails in fetch mode:
> {code}
> A = load 'data' as (a:chararray);
> B = FOREACH A generate 'a', CurrentTime();
> dump B;
> {code}
> Reason: CurrentTime() throws an exception if {{pig.job.submitted.timestamp}} is not set.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Updated] (PIG-3889) Direct fetch doesn't set job submission timestamps
[ https://issues.apache.org/jira/browse/PIG-3889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cheolsoo Park updated PIG-3889:
-------------------------------

Resolution: Fixed
Fix Version/s: 0.13.0
Status: Resolved (was: Patch Available)

Committed to trunk. Thank you Lorand!

> Direct fetch doesn't set job submission timestamps
> --------------------------------------------------
>
> Key: PIG-3889
> URL: https://issues.apache.org/jira/browse/PIG-3889
> Project: Pig
> Issue Type: Bug
> Reporter: Lorand Bendig
> Assignee: Lorand Bendig
> Fix For: 0.13.0
>
> Attachments: PIG-3889-2.patch, PIG-3889.patch
>
>
> The following query fails in fetch mode:
> {code}
> A = load 'data' as (a:chararray);
> B = FOREACH A generate 'a', CurrentTime();
> dump B;
> {code}
> Reason: CurrentTime() throws an exception if {{pig.job.submitted.timestamp}} is not set.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
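The failure mode in the PIG-3889 description, a function that requires `pig.job.submitted.timestamp` while one execution path never sets it, can be sketched as follows. Only the property name comes from the issue; `SubmissionTimestamp`, its methods, and the use of `java.util.Properties` in place of the job configuration are assumptions for illustration, not the committed fix.

```java
import java.util.Properties;

// Hypothetical illustration of a required property and a guard that sets it.
final class SubmissionTimestamp {
    static final String KEY = "pig.job.submitted.timestamp";

    // Mirrors the reported behaviour: reading fails when the property is missing.
    static long read(Properties conf) {
        String value = conf.getProperty(KEY);
        if (value == null) {
            throw new IllegalStateException(KEY + " is not set");
        }
        return Long.parseLong(value);
    }

    // What an execution path would need to do before running the plan:
    // record the submission time unless one is already present.
    static void ensureSet(Properties conf, long nowMillis) {
        if (conf.getProperty(KEY) == null) {
            conf.setProperty(KEY, Long.toString(nowMillis));
        }
    }
}
```

Calling `ensureSet` once before execution keeps any timestamp already recorded and only fills the gap when a path such as direct fetch skipped it.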
[jira] [Commented] (PIG-3890) Global sort is not working (order by) Pig over Tez
[ https://issues.apache.org/jira/browse/PIG-3890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13969556#comment-13969556 ]

Nagamallikarjuna commented on PIG-3890:
---------------------------------------

Hi all,

I am preparing a document for this exception that includes the commit/branch of Pig and Tez, the Pig script, and the AM/container logs. I will update as soon as possible.

> Global sort is not working (order by) Pig over Tez
> --------------------------------------------------
>
> Key: PIG-3890
> URL: https://issues.apache.org/jira/browse/PIG-3890
> Project: Pig
> Issue Type: Sub-task
> Environment: Linux
> Reporter: Nagamallikarjuna
> Priority: Minor
> Labels: Global, pig, sort, tez
>
> I tried to run pig scripts on top of Apache Tez. I am getting the following exception while running global sort (order by operator).
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias orddata
>     at org.apache.pig.PigServer.openIterator(PigServer.java:880)
>     at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:774)
>     at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:372)
>     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:198)
>     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:173)
>     at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
>     at org.apache.pig.Main.run(Main.java:541)
>     at org.apache.pig.Main.main(Main.java:156)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:601)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> Caused by: java.io.IOException: Couldn't retrieve job.
>     at org.apache.pig.PigServer.store(PigServer.java:944)
>     at org.apache.pig.PigServer.openIterator(PigServer.java:855)
>     ... 12 more

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Commented] (PIG-3892) Pig distribution for hadoop 2
[ https://issues.apache.org/jira/browse/PIG-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13969544#comment-13969544 ]

Rohini Palaniswamy commented on PIG-3892:
-----------------------------------------

We should go with 1. It is easy to do, and one installation works for both Hadoop 1 and 2 depending upon what HADOOP_HOME or HADOOP_PREFIX points to. We just check for the presence of hadoop-core*.jar in the hadoop classpath: if it is present, put pig-h1.jar in the classpath, else put pig-h2.jar in the classpath.

> Pig distribution for hadoop 2
> -----------------------------
>
> Key: PIG-3892
> URL: https://issues.apache.org/jira/browse/PIG-3892
> Project: Pig
> Issue Type: Bug
> Components: build
> Reporter: Daniel Dai
> Assignee: Daniel Dai
> Fix For: 0.13.0
>
>
> Currently the Pig distribution only bundles pig.jar for Hadoop 1. Hadoop 2 users need to compile again using the -Dhadoopversion=23 flag. That is quite a confusing process. We need to make Pig work with Hadoop 2 out of the box. I am thinking of two approaches:
> 1. Bundle both pig-h1.jar and pig-h2.jar in the distribution, and bin/pig will choose the right pig.jar to run
> 2. Make two Pig distributions for Hadoop 1 and Hadoop
> Any opinion?

--
This message was sent by Atlassian JIRA
(v6.2#6252)
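The selection rule proposed in this comment (pig-h1.jar if a hadoop-core*.jar is on the Hadoop classpath, pig-h2.jar otherwise) is small enough to sketch. This is only an illustration of the decision logic: the real check would live in the bin/pig shell script, and `PigJarChooser` and its method are hypothetical names. The sketch also assumes the classpath lists jar files explicitly and does not expand wildcard entries such as `/opt/hadoop/*`.

```java
import java.io.File;
import java.util.regex.Pattern;

// Hypothetical helper mirroring the rule described in the comment above.
final class PigJarChooser {
    // Matches jar file names such as hadoop-core-1.2.1.jar.
    private static final Pattern HADOOP_CORE = Pattern.compile("hadoop-core.*\\.jar");

    // Returns the Pig jar to use for the given Hadoop classpath string.
    static String choosePigJar(String hadoopClasspath) {
        for (String entry : hadoopClasspath.split(Pattern.quote(File.pathSeparator))) {
            String fileName = new File(entry).getName();
            if (HADOOP_CORE.matcher(fileName).matches()) {
                return "pig-h1.jar"; // hadoop-core is present: treat as Hadoop 1
            }
        }
        return "pig-h2.jar"; // no hadoop-core jar found: treat as Hadoop 2
    }
}
```

With this rule one installation serves both Hadoop lines, and the result depends only on which Hadoop install the classpath was built from.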
Re: Review Request 20320: PIG:3855 Turn on UnionOptimizer by default and add new e2e tests for union
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20320/
-----------------------------------------------------------

(Updated April 15, 2014, 1:37 p.m.)


Review request for pig, Cheolsoo Park and Daniel Dai.


Changes
-------

Updated patch
- fixes a case in UnionOptimizer where it was failing when POSplit had a shared successor and was just writing to POValueOutputTez
- Set parallelism to Math.min(sum of predecessors, 20) till we have ARP patch from Daniel.
- Changed isFRJoin to connectedToPackage to generalize it. Included that in the copy constructor and clone else UnionOptimizer will have issue.


Bugs: PIG:3855
    https://issues.apache.org/jira/browse/PIG:3855


Repository: pig


Description
-------

Changes done:
  Created a new input in TEZ-1003 and used that so that we can turn on UnionOptimizer by default. Without that seeing lot of performance degradation in production scripts.
  Added lot of e2e tests for UnionOptimizer and fixed code based on the issues found.
  Fixed couple of other minor issues like default parallelism not honored
  Serializing full store was causing problems with some UDFs on deserialize for checkOutputSpecs.

This patch depends on TEZ-1003. So will check in once that is available as part of tez snapshot in maven.


Diffs (updated)
-----

  http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POLocalRearrange.java 1587343
  http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/MultiQueryOptimizerTez.java 1587343
  http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/POFRJoinTez.java 1587343
  http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/POIdentityInOutTez.java 1587343
  http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/POLocalRearrangeTez.java 1587343
  http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/POShuffleTezLoad.java 1587343
  http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/POValueInputTez.java 1587343
  http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/POValueOutputTez.java 1587343
  http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java 1587343
  http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java 1587343
  http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezLauncher.java 1587343
  http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezOperator.java 1587343
  http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/optimizers/UnionOptimizer.java 1587343
  http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/tools/pigstats/tez/TezTaskStats.java 1587343
  http://svn.apache.org/repos/asf/pig/branches/tez/test/e2e/pig/tests/nightly.conf 1587343
  http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-MQ-2-OPTOFF.gld 1587343
  http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-MQ-2.gld 1587343
  http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-10-OPTOFF.gld PRE-CREATION
  http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-10.gld PRE-CREATION
  http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-2.gld 1587343
  http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-6-OPTOFF.gld 1587343
  http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-6.gld 1587343
  http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-7-OPTOFF.gld 1587343
  http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-7.gld 1587343
  http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-9-OPTOFF.gld PRE-CREATION
  http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-9.gld PRE-CREATION
  http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/tez/TestTezCompiler.java 1587343

Diff: https://reviews.apache.o
[jira] [Updated] (PIG-3772) Syntax error when casting an inner schema of a bag and line break involved
[ https://issues.apache.org/jira/browse/PIG-3772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Svinarchuk updated PIG-3772:
-----------------------------------
    Status: Patch Available (was: Reopened)

I attached a patch that fixes this issue. Please review it.

> Syntax error when casting an inner schema of a bag and line break involved
> --------------------------------------------------------------------------
>
>                 Key: PIG-3772
>                 URL: https://issues.apache.org/jira/browse/PIG-3772
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.11.1
>            Reporter: Haishan Liu
>            Assignee: Sergey Svinarchuk
>             Fix For: 0.13.0
>
>         Attachments: PIG-3772.patch
>
>
> Hi,
> The following script fails with syntax error
> {code}
> A = load 'data' as (a:{(x:chararray, y:float)}, b:chararray);
> B = foreach A generate
>     b,
>     (bag{tuple(long)}) a.x as ax:{(x:long)};
> {code}
> where the cast statement is on its own line.
> The script fails with the following exception:
> {code}
> 19-02-2014 17:30:22 PST bug_script ERROR - org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1000: Error during parsing. Syntax error, unexpected symbol at or near 'bag'
> 19-02-2014 17:30:22 PST bug_script ERROR - at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1607)
> 19-02-2014 17:30:22 PST bug_script ERROR - at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1546)
> 19-02-2014 17:30:22 PST bug_script ERROR - at org.apache.pig.PigServer.registerQuery(PigServer.java:516)
> 19-02-2014 17:30:22 PST bug_script ERROR - at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:988)
> 19-02-2014 17:30:22 PST bug_script ERROR - at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:412)
> 19-02-2014 17:30:22 PST bug_script ERROR - at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:194)
> 19-02-2014 17:30:22 PST bug_script ERROR - at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:170)
> 19-02-2014 17:30:22 PST bug_script ERROR - at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
> 19-02-2014 17:30:22 PST bug_script ERROR - at org.apache.pig.Main.run(Main.java:604)
> 19-02-2014 17:30:22 PST bug_script ERROR - at org.apache.pig.PigRunner.run(PigRunner.java:49)
> 19-02-2014 17:30:22 PST bug_script ERROR - at azkaban.jobtype.HadoopSecurePigWrapper.runPigJob(HadoopSecurePigWrapper.java:116)
> 19-02-2014 17:30:22 PST bug_script ERROR - at azkaban.jobtype.HadoopSecurePigWrapper$1.run(HadoopSecurePigWrapper.java:106)
> 19-02-2014 17:30:22 PST bug_script ERROR - at azkaban.jobtype.HadoopSecurePigWrapper$1.run(HadoopSecurePigWrapper.java:103)
> 19-02-2014 17:30:22 PST bug_script ERROR - at java.security.AccessController.doPrivileged(Native Method)
> 19-02-2014 17:30:22 PST bug_script ERROR - at javax.security.auth.Subject.doAs(Subject.java:396)
> 19-02-2014 17:30:22 PST bug_script ERROR - at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
> 19-02-2014 17:30:22 PST bug_script ERROR - at azkaban.jobtype.HadoopSecurePigWrapper.main(HadoopSecurePigWrapper.java:103)
> 19-02-2014 17:30:22 PST bug_script ERROR - Caused by: Failed to parse: Syntax error, unexpected symbol at or near 'bag'
> 19-02-2014 17:30:22 PST bug_script ERROR - at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:235)
> 19-02-2014 17:30:22 PST bug_script ERROR - at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:177)
> 19-02-2014 17:30:22 PST bug_script ERROR - at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1599)
> 19-02-2014 17:30:22 PST bug_script ERROR - ... 16 more
> {code}
> The script succeeds if the foreach statement is written in one line:
> {code}
> B = foreach A generate b, (bag{tuple(long)}) a.x as ax:{(x:long)};
> {code}
> This problem happens only in batch mode.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Updated] (PIG-3772) Syntax error when casting an inner schema of a bag and line break involved
[ https://issues.apache.org/jira/browse/PIG-3772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Svinarchuk updated PIG-3772:
-----------------------------------
    Attachment: PIG-3772.patch
[jira] [Reopened] (PIG-3772) Syntax error when casting an inner schema of a bag and line break involved
[ https://issues.apache.org/jira/browse/PIG-3772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Svinarchuk reopened PIG-3772:
------------------------------------
[jira] [Commented] (PIG-3772) Syntax error when casting an inner schema of a bag and line break involved
[ https://issues.apache.org/jira/browse/PIG-3772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13969449#comment-13969449 ]

Sergey Svinarchuk commented on PIG-3772:
----------------------------------------

I reproduced this issue, but I got a different exception:

2014-04-15 13:16:19,047 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: mismatched input ';' expecting RIGHT_PAREN

The issue reproduces in both batch mode and interactive mode, but it does not reproduce if a test is created with this script. The problem is in how Pig scripts are read and parsed: in GruntParser.processPig(String cmd), the input string for the second command will be
{noformat}
B = foreach A generate b, (bag{tuple(long)}
{noformat}
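The comment on PIG-3772 reports that the second command reaches GruntParser.processPig as `B = foreach A generate b, (bag{tuple(long)}`, that is, cut off before the closing parenthesis of the cast. The sketch below is only an illustration of that failure mode and is not Pig's actual script-reading code: `split_statements` and `unclosed_parens` are hypothetical helpers, written in Python for brevity. It shows a splitter that ends a statement only at a `;` outside quotes and brackets, and it shows that the truncated string carries one unclosed `(`, which would be consistent with the parser "expecting RIGHT_PAREN".

```python
def split_statements(script):
    """Split a script into statements at ';' outside quotes and brackets.

    Hypothetical helper for illustration only; it ignores comments and
    escaped quotes, which a real script reader would have to handle.
    """
    statements = []
    current = []
    depth = 0          # nesting depth of (), {} and []
    in_quote = False   # True while inside a single-quoted string literal
    for ch in script:
        current.append(ch)
        if in_quote:
            if ch == "'":
                in_quote = False
        elif ch == "'":
            in_quote = True
        elif ch in "({[":
            depth += 1
        elif ch in ")}]":
            depth -= 1
        elif ch == ";" and depth == 0:
            # Collapse line breaks and runs of spaces into single spaces.
            statements.append(" ".join("".join(current).split()))
            current = []
    return statements


def unclosed_parens(statement):
    """Count '(' without a matching ')' (quotes are ignored for brevity)."""
    return statement.count("(") - statement.count(")")


# The script from the issue description, with the cast on its own line.
SCRIPT = """A = load 'data' as (a:{(x:chararray, y:float)}, b:chararray);
B = foreach A generate
    b,
    (bag{tuple(long)}) a.x as ax:{(x:long)};
"""

# The truncated command string quoted in the comment.
TRUNCATED = "B = foreach A generate b, (bag{tuple(long)}"

if __name__ == "__main__":
    for stmt in split_statements(SCRIPT):
        print(stmt)
    print("unclosed '(' in truncated command:", unclosed_parens(TRUNCATED))
```

With this splitting rule the multi-line `foreach` comes out as a single statement that matches the one-line form reported to work, whereas the truncated string from the comment is left with an open parenthesis.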