[jira] Subscription: PIG patch available

2014-04-15 Thread jira
Issue Subscription
Filter: PIG patch available (19 issues)

Subscriber: pigdaily

Key       Summary
PIG-3894  Datetime function AddDuration, SubtractDuration and all Between functions don't check for null values in the input tuple.
https://issues.apache.org/jira/browse/PIG-3894
PIG-3877  Getting Geo Latitude/Longitude from Address Lines
https://issues.apache.org/jira/browse/PIG-3877
PIG-3874  FileLocalizer temp path can sometimes be non-unique
https://issues.apache.org/jira/browse/PIG-3874
PIG-3873  Geo distance calculation using Haversine
https://issues.apache.org/jira/browse/PIG-3873
PIG-3867  Added hadoop home to build classpath for build pig with unit test on windows
https://issues.apache.org/jira/browse/PIG-3867
PIG-3866  Create ThreadLocal classloader per PigContext
https://issues.apache.org/jira/browse/PIG-3866
PIG-3865  Remodel the XMLLoader to work to be faster and more maintainable
https://issues.apache.org/jira/browse/PIG-3865
PIG-3861  duplicate jars get added to distributed cache
https://issues.apache.org/jira/browse/PIG-3861
PIG-3825  Stats collection needs to be changed for hadoop2 (with auto local mode)
https://issues.apache.org/jira/browse/PIG-3825
PIG-3772  Syntax error when casting an inner schema of a bag and line break involved
https://issues.apache.org/jira/browse/PIG-3772
PIG-3771  Piggybank Avrostorage makes a lot of namenode calls in the backend
https://issues.apache.org/jira/browse/PIG-3771
PIG-3737  Bundle dependent jars in distribution in %PIG_HOME%/lib folder
https://issues.apache.org/jira/browse/PIG-3737
PIG-3735  UDF to data cleanse the dirty data with expected pattern
https://issues.apache.org/jira/browse/PIG-3735
PIG-3668  COR built-in function when atleast one of the coefficient values is NaN
https://issues.apache.org/jira/browse/PIG-3668
PIG-3635  Fix e2e tests for Hadoop 2.X on Windows
https://issues.apache.org/jira/browse/PIG-3635
PIG-3613  UDF for SimilarityMatching between strings with matching scores
https://issues.apache.org/jira/browse/PIG-3613
PIG-3587  add functionality for rolling over dates
https://issues.apache.org/jira/browse/PIG-3587
PIG-3441  Allow Pig to use default resources from Configuration objects
https://issues.apache.org/jira/browse/PIG-3441
PIG-3373  XMLLoader returns non-matching nodes when a tag name spans through the block boundary
https://issues.apache.org/jira/browse/PIG-3373



[jira] [Commented] (PIG-3771) Piggybank Avrostorage makes a lot of namenode calls in the backend

2014-04-15 Thread Cheolsoo Park (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13970103#comment-13970103
 ] 

Cheolsoo Park commented on PIG-3771:


I was wondering whether it's possible to change the key type of 
schemaToMergedSchemaMap from Path to URI to [de]serialize it directly, but it 
seems to require quite a few changes.

+1.

> Piggybank Avrostorage makes a lot of namenode calls in the backend
> --
>
> Key: PIG-3771
> URL: https://issues.apache.org/jira/browse/PIG-3771
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.11.1
> Reporter: Rohini Palaniswamy
> Assignee: Rohini Palaniswamy
> Fix For: 0.13.0
>
> Attachments: PIG-3771-1.patch
>
>
> The number of list status calls it makes in setLocation, when combined with 
> wildcards, can really slow down the namenode.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (PIG-3892) Pig distribution for hadoop 2

2014-04-15 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13970098#comment-13970098
 ] 

Daniel Dai commented on PIG-3892:
-

It should be automatic. Do you have a use case where the user needs to pass the 
version number explicitly?

> Pig distribution for hadoop 2
> -
>
> Key: PIG-3892
> URL: https://issues.apache.org/jira/browse/PIG-3892
> Project: Pig
> Issue Type: Bug
> Components: build
> Reporter: Daniel Dai
> Assignee: Daniel Dai
> Fix For: 0.13.0
>
>
> Currently the Pig distribution only bundles pig.jar for Hadoop 1. Hadoop 2 
> users need to compile again using the -Dhadoopversion=23 flag, which is 
> quite a confusing process. We need to make Pig work with Hadoop 2 out of the 
> box. I am thinking of two approaches:
> 1. Bundle both pig-h1.jar and pig-h2.jar in the distribution, and bin/pig will 
> choose the right pig.jar to run
> 2. Make two Pig distributions, one for Hadoop 1 and one for Hadoop 2
> Any opinions?





[jira] [Commented] (PIG-3892) Pig distribution for hadoop 2

2014-04-15 Thread Prashant Kommireddi (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13970018#comment-13970018
 ] 

Prashant Kommireddi commented on PIG-3892:
--

+1 for 1. 

[~daijy] - would the way to invoke a certain version be passed as an argument 
to bin/pig, an env variable, both, something else?

> Pig distribution for hadoop 2
> -
>
> Key: PIG-3892
> URL: https://issues.apache.org/jira/browse/PIG-3892
> Project: Pig
> Issue Type: Bug
> Components: build
> Reporter: Daniel Dai
> Assignee: Daniel Dai
> Fix For: 0.13.0
>
>
> Currently the Pig distribution only bundles pig.jar for Hadoop 1. Hadoop 2 
> users need to compile again using the -Dhadoopversion=23 flag, which is 
> quite a confusing process. We need to make Pig work with Hadoop 2 out of the 
> box. I am thinking of two approaches:
> 1. Bundle both pig-h1.jar and pig-h2.jar in the distribution, and bin/pig will 
> choose the right pig.jar to run
> 2. Make two Pig distributions, one for Hadoop 1 and one for Hadoop 2
> Any opinions?





[jira] [Commented] (PIG-3892) Pig distribution for hadoop 2

2014-04-15 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13970001#comment-13970001
 ] 

Alan Gates commented on PIG-3892:
-

+1 for 1.  IIRC bin/hadoop has a -version option, so we don't even need to 
depend on magic jars being present, we can just ask hadoop.

> Pig distribution for hadoop 2
> -
>
> Key: PIG-3892
> URL: https://issues.apache.org/jira/browse/PIG-3892
> Project: Pig
> Issue Type: Bug
> Components: build
> Reporter: Daniel Dai
> Assignee: Daniel Dai
> Fix For: 0.13.0
>
>
> Currently the Pig distribution only bundles pig.jar for Hadoop 1. Hadoop 2 
> users need to compile again using the -Dhadoopversion=23 flag, which is 
> quite a confusing process. We need to make Pig work with Hadoop 2 out of the 
> box. I am thinking of two approaches:
> 1. Bundle both pig-h1.jar and pig-h2.jar in the distribution, and bin/pig will 
> choose the right pig.jar to run
> 2. Make two Pig distributions, one for Hadoop 1 and one for Hadoop 2
> Any opinions?





[jira] [Commented] (PIG-3880) After compiling trunk, I am seeing ClassLoaderObjectInputStream ClassNotFoundException.

2014-04-15 Thread David Medinets (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13969951#comment-13969951
 ] 

David Medinets commented on PIG-3880:
-

Good point. Perhaps my version of hadoop is too old?

Hadoop 0.20.203.0
Subversion http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20-security-203 -r 1099333
Compiled by oom on Wed May  4 07:57:50 PDT 2011

> After compiling trunk, I am seeing ClassLoaderObjectInputStream 
> ClassNotFoundException.
> ---
>
> Key: PIG-3880
> URL: https://issues.apache.org/jira/browse/PIG-3880
> Project: Pig
> Issue Type: Bug
> Components: grunt
> Affects Versions: 0.13.0
> Reporter: David Medinets
>
> I pulled trunk from subversion using the following commands:
> mkdir pig
> cd pig
> svn co http://svn.apache.org/repos/asf/pig/trunk
> cd trunk
> ant
> export PATH=$PATH:$HOME/pig/trunk/bin
> export ACCUMULO_HOME=/opt/accumulo
> export HADOOP_HOME=/opt/hadoop
> export PIG_HOME=$HOME/pig/trunk
> export PIG_CLASSPATH="$HOME/pig/trunk/build/ivy/lib/Pig/*"
> export PIG_CLASSPATH="$ACCUMULO_HOME/lib/*:$PIG_CLASSPATH"
> cd ~
> pig
> Then I ran into this error:
> java.lang.NoClassDefFoundError: 
> org/apache/commons/io/input/ClassLoaderObjectInputStream
>   at org.apache.pig.Main.run(Main.java:399)
> When I change PIG_JAR to use the fat jar, I was able to run the pig command 
> without getting the exception.





[jira] [Updated] (PIG-3456) Reduce threadlocal conf access in backend for each record

2014-04-15 Thread Rohini Palaniswamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohini Palaniswamy updated PIG-3456:


Status: Open  (was: Patch Available)

Cancelling patch as it needs to be rebased after PIG-3591

> Reduce threadlocal conf access in backend for each record
> -
>
> Key: PIG-3456
> URL: https://issues.apache.org/jira/browse/PIG-3456
> Project: Pig
> Issue Type: Improvement
> Affects Versions: 0.11.1
> Reporter: Rohini Palaniswamy
> Assignee: Rohini Palaniswamy
> Fix For: 0.13.0
>
> Attachments: PIG-3456-1-no-whitespace.patch, PIG-3456-1.patch
>
>
> Noticed a few things while browsing the code:
> 1) DefaultTuple has a protected boolean isNull = false; which is never used. 
> Removing this gives ~3-5% improvement for big jobs.
> 2) Config checking with the ThreadLocal conf is repeated for each record, 
> for example in createDataBag in POCombinerPackage. In other places, like 
> POPackage, POJoinPackage, etc., it is initialized only the first time.
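
The second point (read the setting once, then reuse it for every record) can be sketched roughly as follows. This is only an illustration, not code from the patch: CachedConfFlag, the Supplier standing in for the ThreadLocal conf access, and the key name used in any caller are all hypothetical. The sketch is not thread-safe; it assumes one instance per operator.

```java
import java.util.Properties;
import java.util.function.Supplier;

// Hypothetical helper, not a Pig class: caches one boolean setting so the
// conf lookup happens once instead of once per record.
final class CachedConfFlag {
    private final Supplier<Properties> confLookup; // stands in for the ThreadLocal conf access
    private final String key;
    private Boolean cached;                        // null until the first call to get()
    int lookups = 0;                               // how many times the conf was consulted

    CachedConfFlag(Supplier<Properties> confLookup, String key) {
        this.confLookup = confLookup;
        this.key = key;
    }

    boolean get() {
        if (cached == null) {
            // First use: consult the configuration and remember the answer.
            lookups++;
            cached = Boolean.parseBoolean(confLookup.get().getProperty(key, "false"));
        }
        return cached;
    }
}
```

Calling get() for every record then costs a null check after the first call, which is the behaviour the description attributes to POPackage and POJoinPackage.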





[jira] [Commented] (PIG-3771) Piggybank Avrostorage makes a lot of namenode calls in the backend

2014-04-15 Thread Rohini Palaniswamy (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13969938#comment-13969938
 ] 

Rohini Palaniswamy commented on PIG-3771:
-

Can someone review this?

> Piggybank Avrostorage makes a lot of namenode calls in the backend
> --
>
> Key: PIG-3771
> URL: https://issues.apache.org/jira/browse/PIG-3771
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.11.1
> Reporter: Rohini Palaniswamy
> Assignee: Rohini Palaniswamy
> Fix For: 0.13.0
>
> Attachments: PIG-3771-1.patch
>
>
> The number of list status calls it makes in setLocation, when combined with 
> wildcards, can really slow down the namenode.





[jira] [Commented] (PIG-3874) FileLocalizer temp path can sometimes be non-unique

2014-04-15 Thread Rohini Palaniswamy (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13969889#comment-13969889
 ] 

Rohini Palaniswamy commented on PIG-3874:
-

Can you simplify

{code}
String tempPath= FileLocalizer.getTemporaryPath(pigContext).toString();
Path path = new Path(tempPath);
URI uri = path.toUri();
String prefix = "";
if (uri.getScheme() != null) {
prefix = uri.getScheme() + ":";
}
assertTrue(tempPath.startsWith(prefix + pigTempDir.getPath()));
{code}

to 

{code}
String tempPath = FileLocalizer.getTemporaryPath(pigContext).toString();
Path path = new Path(tempPath);
assertTrue(tempPath.startsWith(pigTempDir.toURI().toString()));
{code}

> FileLocalizer temp path can sometimes be non-unique
> ---
>
> Key: PIG-3874
> URL: https://issues.apache.org/jira/browse/PIG-3874
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.13.0
> Reporter: Mona Chitnis
> Assignee: Mona Chitnis
> Fix For: 0.13.0
>
> Attachments: PIG-3874-1.patch, PIG-3874.patch
>
>
> In some rare corner cases, more than one process can arrive at the same 
> randomly generated temporary path to localize task files. This needs to be 
> handled with a check to see if location already exists and to get a unique 
> path.
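
One way to get the "check that the location exists and pick another" behaviour without a race is to let directory creation itself be the check. The sketch below is an assumption-laden illustration, not the attached patch: it uses java.nio on a local filesystem rather than the Hadoop FileSystem API that FileLocalizer works with, and UniqueTempPath, createUnique and the "temp" prefix are hypothetical names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Random;

// Hypothetical helper, not a Pig class.
final class UniqueTempPath {
    // Tries random names under parent until one can be created.
    static Path createUnique(Path parent, Random random, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            Path candidate = parent.resolve("temp" + random.nextInt(Integer.MAX_VALUE));
            try {
                // createDirectory fails if the name is already taken, so the
                // existence check and the creation are a single step.
                return Files.createDirectory(candidate);
            } catch (FileAlreadyExistsException e) {
                // Someone else already claimed this name; try the next one.
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        throw new IllegalStateException(
                "No unique temp path under " + parent + " after " + maxAttempts + " attempts");
    }
}
```

Two callers that happen to draw the same random number end up with different directories, because the second one sees the collision and retries.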





Re: [ANNOUNCE] Apache Pig 0.12.1 released

2014-04-15 Thread Thejas Nair
Thanks Prashant!


On Tue, Apr 15, 2014 at 10:58 AM, Cheolsoo Park  wrote:
> Thank you Prashant for your hard work!
>
>
> On Mon, Apr 14, 2014 at 5:37 PM, Daniel Dai  wrote:
>
>> Thanks Prashant!
>>
>> On Mon, Apr 14, 2014 at 5:30 PM, Prashant Kommireddi
>>  wrote:
>> > The Pig team is happy to announce the Pig 0.12.1 release.
>> >
>> > Apache Pig provides a high-level data-flow language and execution
>> framework
>> > for parallel computation on Hadoop clusters.
>> >
>> > More details about Pig can be found at http://pig.apache.org/.
>> >
>> > This is a maintenance release of Pig 0.12 and contains several bug fixes
>> > and improvements. The details of the release can be found at
>> > http://pig.apache.org/releases.html.
>> >
>> > You can download the release here
>> > http://www.apache.org/dyn/closer.cgi/pig
>> >
>> > The released maven artifacts have been made available on
>> > repository.apache.org
>> >
>> > We would like to thank all contributors that made this release possible.
>> >
>> > Thanks,
>> > Prashant Kommireddi
>>
>> --
>> CONFIDENTIALITY NOTICE
>> NOTICE: This message is intended for the use of the individual or entity to
>> which it is addressed and may contain information that is confidential,
>> privileged and exempt from disclosure under applicable law. If the reader
>> of this message is not the intended recipient, you are hereby notified that
>> any printing, copying, dissemination, distribution, disclosure or
>> forwarding of this communication is strictly prohibited. If you have
>> received this communication in error, please contact the sender immediately
>> and delete it from your system. Thank You.
>>



[jira] [Commented] (PIG-3772) Syntax error when casting an inner schema of a bag and line break involved

2014-04-15 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13969848#comment-13969848
 ] 

Daniel Dai commented on PIG-3772:
-

Thanks [~ssvinarchukhorton], the patch works for me. There is another 
occurrence of the same pattern in " MORE", shall we change it as well?

> Syntax error when casting an inner schema of a bag and line break involved
> --
>
> Key: PIG-3772
> URL: https://issues.apache.org/jira/browse/PIG-3772
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.11.1
> Reporter: Haishan Liu
> Assignee: Sergey Svinarchuk
> Fix For: 0.13.0
>
> Attachments: PIG-3772.patch
>
>
> Hi,
> The following script fails with syntax error
> {code}
> A = load 'data' as (a:{(x:chararray, y:float)}, b:chararray);
> B = foreach A generate
> b,
> (bag{tuple(long)}) a.x as ax:{(x:long)};
> {code}
> where the cast statement is on its own line.
> The script fails with the following exception:
> {code}
> 19-02-2014 17:30:22 PST bug_script ERROR - org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1000: Error during parsing.   Syntax error, unexpected symbol at or near 'bag'
> 19-02-2014 17:30:22 PST bug_script ERROR -   at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1607)
> 19-02-2014 17:30:22 PST bug_script ERROR -   at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1546)
> 19-02-2014 17:30:22 PST bug_script ERROR -   at org.apache.pig.PigServer.registerQuery(PigServer.java:516)
> 19-02-2014 17:30:22 PST bug_script ERROR -   at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:988)
> 19-02-2014 17:30:22 PST bug_script ERROR -   at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:412)
> 19-02-2014 17:30:22 PST bug_script ERROR -   at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:194)
> 19-02-2014 17:30:22 PST bug_script ERROR -   at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:170)
> 19-02-2014 17:30:22 PST bug_script ERROR -   at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
> 19-02-2014 17:30:22 PST bug_script ERROR -   at org.apache.pig.Main.run(Main.java:604)
> 19-02-2014 17:30:22 PST bug_script ERROR -   at org.apache.pig.PigRunner.run(PigRunner.java:49)
> 19-02-2014 17:30:22 PST bug_script ERROR -   at azkaban.jobtype.HadoopSecurePigWrapper.runPigJob(HadoopSecurePigWrapper.java:116)
> 19-02-2014 17:30:22 PST bug_script ERROR -   at azkaban.jobtype.HadoopSecurePigWrapper$1.run(HadoopSecurePigWrapper.java:106)
> 19-02-2014 17:30:22 PST bug_script ERROR -   at azkaban.jobtype.HadoopSecurePigWrapper$1.run(HadoopSecurePigWrapper.java:103)
> 19-02-2014 17:30:22 PST bug_script ERROR -   at java.security.AccessController.doPrivileged(Native Method)
> 19-02-2014 17:30:22 PST bug_script ERROR -   at javax.security.auth.Subject.doAs(Subject.java:396)
> 19-02-2014 17:30:22 PST bug_script ERROR -   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
> 19-02-2014 17:30:22 PST bug_script ERROR -   at azkaban.jobtype.HadoopSecurePigWrapper.main(HadoopSecurePigWrapper.java:103)
> 19-02-2014 17:30:22 PST bug_script ERROR - Caused by: Failed to parse:   Syntax error, unexpected symbol at or near 'bag'
> 19-02-2014 17:30:22 PST bug_script ERROR -   at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:235)
> 19-02-2014 17:30:22 PST bug_script ERROR -   at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:177)
> 19-02-2014 17:30:22 PST bug_script ERROR -   at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1599)
> 19-02-2014 17:30:22 PST bug_script ERROR -   ... 16 more
> {code}
> The script succeeds if the foreach statement is written in one line:
> {code}
> B = foreach A generate b, (bag{tuple(long)}) a.x as ax:{(x:long)};
> {code}
> This problem happens only in batch mode.





Re: [ANNOUNCE] Apache Pig 0.12.1 released

2014-04-15 Thread Cheolsoo Park
Thank you Prashant for your hard work!


On Mon, Apr 14, 2014 at 5:37 PM, Daniel Dai  wrote:

> Thanks Prashant!
>
> On Mon, Apr 14, 2014 at 5:30 PM, Prashant Kommireddi
>  wrote:
> > The Pig team is happy to announce the Pig 0.12.1 release.
> >
> > Apache Pig provides a high-level data-flow language and execution
> framework
> > for parallel computation on Hadoop clusters.
> >
> > More details about Pig can be found at http://pig.apache.org/.
> >
> > This is a maintenance release of Pig 0.12 and contains several bug fixes
> > and improvements. The details of the release can be found at
> > http://pig.apache.org/releases.html.
> >
> > You can download the release here
> > http://www.apache.org/dyn/closer.cgi/pig
> >
> > The released maven artifacts have been made available on
> > repository.apache.org
> >
> > We would like to thank all contributors that made this release possible.
> >
> > Thanks,
> > Prashant Kommireddi
>


[jira] [Commented] (PIG-3880) After compiling trunk, I am seeing ClassLoaderObjectInputStream ClassNotFoundException.

2014-04-15 Thread Josh Elser (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13969808#comment-13969808
 ] 

Josh Elser commented on PIG-3880:
-

What version of Hadoop are you using, [~medined]? I recall situations on other 
projects where the dependency management expected certain artifacts to be 
provided by hadoop when the user's version didn't actually provide that jar. I 
believe commons-io was one of the artifacts that I was bitten by too.

This seems to be a plausible explanation for what you're seeing. The 
jarwithhadoop would contain the dependencies, so you wouldn't see these issues 
even if your local hadoop install was missing the necessary jars.
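
A quick way to confirm this kind of diagnosis is to probe whether the class is visible on the classpath in question. The snippet below is only a sketch; ClasspathProbe and isLoadable are hypothetical names, and the class name to probe would be the one from the reported error, org.apache.commons.io.input.ClassLoaderObjectInputStream.

```java
// Hypothetical diagnostic helper, not part of Pig or Hadoop.
final class ClasspathProbe {
    // Returns true if the named class can be found by this class's loader.
    static boolean isLoadable(String className) {
        try {
            // initialize=false: only locate and load the class, do not run static initializers.
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }
}
```

Running the probe with the same classpath that bin/pig builds would show whether the commons-io jar picked up from the local Hadoop install actually contains the class.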

> After compiling trunk, I am seeing ClassLoaderObjectInputStream 
> ClassNotFoundException.
> ---
>
> Key: PIG-3880
> URL: https://issues.apache.org/jira/browse/PIG-3880
> Project: Pig
> Issue Type: Bug
> Components: grunt
> Affects Versions: 0.13.0
> Reporter: David Medinets
>
> I pulled trunk from subversion using the following commands:
> mkdir pig
> cd pig
> svn co http://svn.apache.org/repos/asf/pig/trunk
> cd trunk
> ant
> export PATH=$PATH:$HOME/pig/trunk/bin
> export ACCUMULO_HOME=/opt/accumulo
> export HADOOP_HOME=/opt/hadoop
> export PIG_HOME=$HOME/pig/trunk
> export PIG_CLASSPATH="$HOME/pig/trunk/build/ivy/lib/Pig/*"
> export PIG_CLASSPATH="$ACCUMULO_HOME/lib/*:$PIG_CLASSPATH"
> cd ~
> pig
> Then I ran into this error:
> java.lang.NoClassDefFoundError: 
> org/apache/commons/io/input/ClassLoaderObjectInputStream
>   at org.apache.pig.Main.run(Main.java:399)
> When I change PIG_JAR to use the fat jar, I was able to run the pig command 
> without getting the exception.





[jira] [Commented] (PIG-3737) Bundle dependent jars in distribution in %PIG_HOME%/lib folder

2014-04-15 Thread Cheolsoo Park (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13969793#comment-13969793
 ] 

Cheolsoo Park commented on PIG-3737:


Thank you Daniel. That looks like a good list to me.

> Bundle dependent jars in distribution in %PIG_HOME%/lib folder
> --
>
> Key: PIG-3737
> URL: https://issues.apache.org/jira/browse/PIG-3737
> Project: Pig
> Issue Type: Bug
> Reporter: Shuaishuai Nie
> Assignee: Daniel Dai
> Attachments: PIG-3737.1.patch
>
>
> Pig should bundle with dependencies like avro.jar and json-simple.jar





[jira] [Commented] (PIG-3889) Direct fetch doesn't set job submission timestamps

2014-04-15 Thread Lorand Bendig (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13969779#comment-13969779
 ] 

Lorand Bendig commented on PIG-3889:


Cheolsoo, thanks for committing it!

> Direct fetch doesn't set job submission timestamps
> --
>
> Key: PIG-3889
> URL: https://issues.apache.org/jira/browse/PIG-3889
> Project: Pig
> Issue Type: Bug
> Reporter: Lorand Bendig
> Assignee: Lorand Bendig
> Fix For: 0.13.0
>
> Attachments: PIG-3889-2.patch, PIG-3889.patch
>
>
> The following query fails in fetch mode:
> {code}
> A = load 'data' as (a:chararray);   
> B = FOREACH A generate 'a', CurrentTime(); 
> dump B;
> {code}
> Reason: CurrentTime() throws an exception if {{pig.job.submitted.timestamp}} 
> is not set.
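
The failure mode and one possible remedy can be sketched as below. This is not the committed patch: it models the job configuration with java.util.Properties, and SubmissionTimestamp, setIfAbsent and read are hypothetical names. Only the property key comes from the issue description.

```java
import java.util.Properties;

// Hypothetical illustration, not Pig code.
final class SubmissionTimestamp {
    static final String KEY = "pig.job.submitted.timestamp";

    // What a fetch-mode launcher could do before running the plan.
    static void setIfAbsent(Properties conf, long nowMillis) {
        if (conf.getProperty(KEY) == null) {
            conf.setProperty(KEY, Long.toString(nowMillis));
        }
    }

    // Mirrors the reported behaviour: reading fails when the key was never set.
    static long read(Properties conf) {
        String value = conf.getProperty(KEY);
        if (value == null) {
            throw new IllegalStateException(KEY + " is not set");
        }
        return Long.parseLong(value);
    }
}
```

Setting the value only when it is absent keeps an already recorded submission time stable across repeated calls.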





[jira] [Updated] (PIG-3889) Direct fetch doesn't set job submission timestamps

2014-04-15 Thread Cheolsoo Park (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cheolsoo Park updated PIG-3889:
---

   Resolution: Fixed
Fix Version/s: 0.13.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thank you Lorand!

> Direct fetch doesn't set job submission timestamps
> --
>
> Key: PIG-3889
> URL: https://issues.apache.org/jira/browse/PIG-3889
> Project: Pig
> Issue Type: Bug
> Reporter: Lorand Bendig
> Assignee: Lorand Bendig
> Fix For: 0.13.0
>
> Attachments: PIG-3889-2.patch, PIG-3889.patch
>
>
> The following query fails in fetch mode:
> {code}
> A = load 'data' as (a:chararray);   
> B = FOREACH A generate 'a', CurrentTime(); 
> dump B;
> {code}
> Reason: CurrentTime() throws an exception if {{pig.job.submitted.timestamp}} 
> is not set.





[jira] [Commented] (PIG-3890) Global sort is not working (order by) Pig over Tez

2014-04-15 Thread Nagamallikarjuna (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13969556#comment-13969556
 ] 

Nagamallikarjuna commented on PIG-3890:
---

Hi all, I am preparing a document for this exception that includes the 
commit/branch of Pig and Tez, the Pig script, and the AM/container logs. I will 
update as soon as possible.

> Global sort is not working (order by) Pig over Tez
> --
>
> Key: PIG-3890
> URL: https://issues.apache.org/jira/browse/PIG-3890
> Project: Pig
> Issue Type: Sub-task
> Environment: Linux
> Reporter: Nagamallikarjuna
> Priority: Minor
> Labels: Global, pig, sort, tez
>
> I tried to run pig scripts on top of Apache Tez. I am getting the following 
> exception while running global sort (order by operator).
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias orddata
> at org.apache.pig.PigServer.openIterator(PigServer.java:880)
> at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:774)
> at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:372)
> at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:198)
> at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:173)
> at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
> at org.apache.pig.Main.run(Main.java:541)
> at org.apache.pig.Main.main(Main.java:156)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:601)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> Caused by: java.io.IOException: Couldn't retrieve job.
> at org.apache.pig.PigServer.store(PigServer.java:944)
> at org.apache.pig.PigServer.openIterator(PigServer.java:855)
> ... 12 more





[jira] [Commented] (PIG-3892) Pig distribution for hadoop 2

2014-04-15 Thread Rohini Palaniswamy (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13969544#comment-13969544
 ] 

Rohini Palaniswamy commented on PIG-3892:
-

We should go with 1. It is easy to do and one installation works for both 
Hadoop 1 and 2 depending upon what HADOOP_HOME or HADOOP_PREFIX points to. We 
just check for presence of hadoop-core*.jar in hadoop classpath and if present 
put pig-h1.jar in classpath else put pig-h2.jar in classpath. 
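
The selection rule described here is small enough to sketch. In practice it would live in the bin/pig shell script; the Java below only illustrates the decision, and PigJarChooser and choose are hypothetical names. The jar names pig-h1.jar and pig-h2.jar and the hadoop-core*.jar pattern come from this thread.

```java
import java.util.List;

// Hypothetical illustration of the rule, not the bin/pig implementation.
final class PigJarChooser {
    // Picks the Pig jar based on the jars found on the hadoop classpath.
    static String choose(List<String> hadoopClasspathEntries) {
        for (String entry : hadoopClasspathEntries) {
            // Look at the file name only, ignoring the directory part.
            String name = entry.substring(entry.lastIndexOf('/') + 1);
            if (name.startsWith("hadoop-core") && name.endsWith(".jar")) {
                return "pig-h1.jar"; // hadoop-core*.jar present: treat as Hadoop 1
            }
        }
        return "pig-h2.jar";         // otherwise assume Hadoop 2
    }
}
```

For example, a classpath containing /opt/hadoop/hadoop-core-1.2.1.jar selects pig-h1.jar, while one with only hadoop-common and similar jars falls through to pig-h2.jar.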

> Pig distribution for hadoop 2
> -
>
> Key: PIG-3892
> URL: https://issues.apache.org/jira/browse/PIG-3892
> Project: Pig
> Issue Type: Bug
> Components: build
> Reporter: Daniel Dai
> Assignee: Daniel Dai
> Fix For: 0.13.0
>
>
> Currently the Pig distribution only bundles pig.jar for Hadoop 1. Hadoop 2 
> users need to compile again using the -Dhadoopversion=23 flag, which is 
> quite a confusing process. We need to make Pig work with Hadoop 2 out of the 
> box. I am thinking of two approaches:
> 1. Bundle both pig-h1.jar and pig-h2.jar in the distribution, and bin/pig will 
> choose the right pig.jar to run
> 2. Make two Pig distributions, one for Hadoop 1 and one for Hadoop 2
> Any opinions?





Re: Review Request 20320: PIG:3855 Turn on UnionOptimizer by default and add new e2e tests for union

2014-04-15 Thread Rohini Palaniswamy

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20320/
---

(Updated April 15, 2014, 1:37 p.m.)


Review request for pig, Cheolsoo Park and Daniel Dai.


Changes
---

Updated patch
   - fixes a case in UnionOptimizer where it was failing when POSplit had a 
shared successor and was just writing to POValueOutputTez
   - Set parallelism to Math.min(sum of predecessors, 20) till we have ARP 
patch from Daniel.
   - Changed isFRJoin to connectedToPackage to generalize it. Included that in 
the copy constructor and clone else UnionOptimizer will have issue. 
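
The interim parallelism rule in the second item amounts to a capped sum. A minimal sketch, with the hypothetical names UnionParallelism and estimate (the cap of 20 is the value stated above):

```java
// Hypothetical illustration, not code from the patch.
final class UnionParallelism {
    static final int CAP = 20; // upper bound stated in the change notes

    // Sum of the predecessors' parallelism, limited to CAP.
    static int estimate(int... predecessorParallelism) {
        int sum = 0;
        for (int p : predecessorParallelism) {
            sum += p;
        }
        return Math.min(sum, CAP);
    }
}
```

With predecessors of parallelism 3 and 4 the estimate is 7; with 15 and 10 the sum of 25 is capped at 20.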


Bugs: PIG:3855
https://issues.apache.org/jira/browse/PIG:3855


Repository: pig


Description
---

Changes done:
   - Created a new input in TEZ-1003 and used that so that we can turn on 
UnionOptimizer by default. Without that, we were seeing a lot of performance 
degradation in production scripts.
   - Added a lot of e2e tests for UnionOptimizer and fixed code based on the 
issues found.
   - Fixed a couple of other minor issues: default parallelism was not honored, 
and serializing the full store was causing problems with some UDFs on 
deserialize for checkOutputSpecs.
This patch depends on TEZ-1003, so I will check it in once that is available as 
part of the tez snapshot in maven.


Diffs (updated)
-

  
  http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POLocalRearrange.java 1587343
  http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/MultiQueryOptimizerTez.java 1587343
  http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/POFRJoinTez.java 1587343
  http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/POIdentityInOutTez.java 1587343
  http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/POLocalRearrangeTez.java 1587343
  http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/POShuffleTezLoad.java 1587343
  http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/POValueInputTez.java 1587343
  http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/POValueOutputTez.java 1587343
  http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java 1587343
  http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java 1587343
  http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezLauncher.java 1587343
  http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezOperator.java 1587343
  http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/optimizers/UnionOptimizer.java 1587343
  http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/tools/pigstats/tez/TezTaskStats.java 1587343
  http://svn.apache.org/repos/asf/pig/branches/tez/test/e2e/pig/tests/nightly.conf 1587343
  http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-MQ-2-OPTOFF.gld 1587343
  http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-MQ-2.gld 1587343
  http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-10-OPTOFF.gld PRE-CREATION
  http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-10.gld PRE-CREATION
  http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-2.gld 1587343
  http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-6-OPTOFF.gld 1587343
  http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-6.gld 1587343
  http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-7-OPTOFF.gld 1587343
  http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-7.gld 1587343
  http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-9-OPTOFF.gld PRE-CREATION
  http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-9.gld PRE-CREATION
  http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/tez/TestTezCompiler.java 1587343

Diff: https://reviews.apache.o

[jira] [Updated] (PIG-3772) Syntax error when casting an inner schema of a bag and line break involved

2014-04-15 Thread Sergey Svinarchuk (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Svinarchuk updated PIG-3772:
---

Status: Patch Available  (was: Reopened)

I attached a patch that fixes this issue. Please review it.

> Syntax error when casting an inner schema of a bag and line break involved
> --
>
> Key: PIG-3772
> URL: https://issues.apache.org/jira/browse/PIG-3772
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.11.1
> Reporter: Haishan Liu
> Assignee: Sergey Svinarchuk
> Fix For: 0.13.0
>
> Attachments: PIG-3772.patch
>
>
> Hi,
> The following script fails with a syntax error:
> {code}
> A = load 'data' as (a:{(x:chararray, y:float)}, b:chararray);
> B = foreach A generate
> b,
> (bag{tuple(long)}) a.x as ax:{(x:long)};
> {code}
> where the cast statement is on its own line.
> The script fails with the following exception:
> {code}
> 19-02-2014 17:30:22 PST bug_script ERROR - org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1000: Error during parsing.   Syntax error, unexpected symbol at or near 'bag'
> 19-02-2014 17:30:22 PST bug_script ERROR -     at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1607)
> 19-02-2014 17:30:22 PST bug_script ERROR -     at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1546)
> 19-02-2014 17:30:22 PST bug_script ERROR -     at org.apache.pig.PigServer.registerQuery(PigServer.java:516)
> 19-02-2014 17:30:22 PST bug_script ERROR -     at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:988)
> 19-02-2014 17:30:22 PST bug_script ERROR -     at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:412)
> 19-02-2014 17:30:22 PST bug_script ERROR -     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:194)
> 19-02-2014 17:30:22 PST bug_script ERROR -     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:170)
> 19-02-2014 17:30:22 PST bug_script ERROR -     at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
> 19-02-2014 17:30:22 PST bug_script ERROR -     at org.apache.pig.Main.run(Main.java:604)
> 19-02-2014 17:30:22 PST bug_script ERROR -     at org.apache.pig.PigRunner.run(PigRunner.java:49)
> 19-02-2014 17:30:22 PST bug_script ERROR -     at azkaban.jobtype.HadoopSecurePigWrapper.runPigJob(HadoopSecurePigWrapper.java:116)
> 19-02-2014 17:30:22 PST bug_script ERROR -     at azkaban.jobtype.HadoopSecurePigWrapper$1.run(HadoopSecurePigWrapper.java:106)
> 19-02-2014 17:30:22 PST bug_script ERROR -     at azkaban.jobtype.HadoopSecurePigWrapper$1.run(HadoopSecurePigWrapper.java:103)
> 19-02-2014 17:30:22 PST bug_script ERROR -     at java.security.AccessController.doPrivileged(Native Method)
> 19-02-2014 17:30:22 PST bug_script ERROR -     at javax.security.auth.Subject.doAs(Subject.java:396)
> 19-02-2014 17:30:22 PST bug_script ERROR -     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
> 19-02-2014 17:30:22 PST bug_script ERROR -     at azkaban.jobtype.HadoopSecurePigWrapper.main(HadoopSecurePigWrapper.java:103)
> 19-02-2014 17:30:22 PST bug_script ERROR - Caused by: Failed to parse:   Syntax error, unexpected symbol at or near 'bag'
> 19-02-2014 17:30:22 PST bug_script ERROR -     at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:235)
> 19-02-2014 17:30:22 PST bug_script ERROR -     at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:177)
> 19-02-2014 17:30:22 PST bug_script ERROR -     at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1599)
> 19-02-2014 17:30:22 PST bug_script ERROR -     ... 16 more
> {code}
> The script succeeds if the foreach statement is written in one line:
> {code}
> B = foreach A generate b, (bag{tuple(long)}) a.x as ax:{(x:long)};
> {code}
> This problem happens only in batch mode.
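
The workaround reported above (writing the foreach statement on one line) can be applied mechanically before a script is submitted. The following is a minimal Python sketch of that idea, not part of Pig or of the attached patch. The helper name `collapse_statement` is hypothetical, and the approach is deliberately naive: it collapses every run of whitespace, so it would also change whitespace inside quoted string literals, and it does not handle `--` line comments.

```python
def collapse_statement(statement: str) -> str:
    """Join a multi-line statement into a single line.

    Hypothetical helper for illustration only. It collapses every run of
    whitespace (including newlines) into one space, so it is not safe for
    statements containing quoted strings or '--' line comments.
    """
    return " ".join(statement.split())


# The failing multi-line statement from the issue description.
multi_line = """B = foreach A generate
b,
(bag{tuple(long)}) a.x as ax:{(x:long)};"""

one_line = collapse_statement(multi_line)
print(one_line)
```

For the statement in this report, the result is the same one-line form that the reporter found to parse successfully.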



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (PIG-3772) Syntax error when casting an inner schema of a bag and line break involved

2014-04-15 Thread Sergey Svinarchuk (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Svinarchuk updated PIG-3772:
---

Attachment: PIG-3772.patch



[jira] [Reopened] (PIG-3772) Syntax error when casting an inner schema of a bag and line break involved

2014-04-15 Thread Sergey Svinarchuk (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Svinarchuk reopened PIG-3772:





[jira] [Commented] (PIG-3772) Syntax error when casting an inner schema of a bag and line break involved

2014-04-15 Thread Sergey Svinarchuk (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13969449#comment-13969449
 ] 

Sergey Svinarchuk commented on PIG-3772:


I reproduced this issue, but I got a different exception:
2014-04-15 13:16:19,047 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200:   mismatched input ';' expecting RIGHT_PAREN
The issue reproduces in both batch mode and interactive mode.
However, it does not reproduce when I create a test with this script.
The problem is in how Pig scripts are read and parsed: in
GruntParser.processPig(String cmd), the input string for the second command is
{noformat}
B = foreach A generate
b,
(bag{tuple(long)}
{noformat}
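
To make the truncation concrete, here is a small Python sketch (not Pig's parser, and not taken from the patch) that reports which opening delimiters are still unclosed in a command string. The function name `unclosed_delimiters` is hypothetical, and the check ignores quoted strings and comments. Applied to the truncated string above, it shows one unclosed `(`, which is consistent with the parser expecting a RIGHT_PAREN.

```python
# Map each closing delimiter to the opener it must match.
PAIRS = {")": "(", "}": "{", "]": "["}
OPENERS = set(PAIRS.values())


def unclosed_delimiters(command: str) -> list:
    """Return the opening delimiters in `command` that are never closed.

    Toy check for illustration: quoted strings and comments are not
    treated specially.
    """
    stack = []
    for ch in command:
        if ch in OPENERS:
            stack.append(ch)
        elif ch in PAIRS:
            if not stack or stack[-1] != PAIRS[ch]:
                raise ValueError(f"unexpected closing delimiter {ch!r}")
            stack.pop()
    return stack


# The string that processPig receives, as quoted in the comment above.
truncated = "B = foreach A generate\nb,\n(bag{tuple(long)}"
# The complete statement from the issue description.
full = "B = foreach A generate\nb,\n(bag{tuple(long)}) a.x as ax:{(x:long)};"

print(unclosed_delimiters(truncated))
print(unclosed_delimiters(full))
```

The complete statement balances, so the imbalance comes from where the command string is cut, not from the script itself.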
