Re: Moving to new e2e harness for end-to-end testing

2011-09-01 Thread Alan Gates
One of my goals is to find a way to run public nightly builds for Pig.  Since 
it needs a cluster it's not clear to me Apache's infrastructure is the best 
choice.  I'm also investigating OSU's supercell (http://supercell.osuosl.org/) 
facilities.  Other recommendations are welcome.  I'd also like to get it set up 
so that there is a patch build, where contributors can submit a patch and a 
list of tests they think should be run, and have it run for them.

For now I am running the tests nightly on a local Jenkins and will report any 
errors when I see them.  If you don't have a cluster, you can run your own 
tests on EC2.  But this isn't a viable long-term solution.

Alan.

On Sep 1, 2011, at 6:49 PM, Dmitriy Ryaboy wrote:

> Alan,
> Great work.
> Any plans for hooking this up to the apache Jenkins instance?
> 
> D
> 
> On Thu, Sep 1, 2011 at 5:50 PM, Alan Gates  wrote:
> 
>> I have gotten the end-to-end test harness to the point where it runs
>> basically all the existing tests and where it can be run from ant.  It can
>> be run on either an existing cluster or in Amazon's EC2.  There are
>> instructions on how to run it in both settings at
>> https://cwiki.apache.org/confluence/display/PIG/HowToTest.  Please try it
>> out to see if it works in your environment.
>> 
>> As proposed in
>> https://cwiki.apache.org/confluence/display/PIG/PigTestProposal, I would
>> like to start migrating the junit tests that are really end-to-end tests (i.e.,
>> the ones that use MiniCluster) to the e2e harness.  I'll start with the very
>> long-running ones and work down to the shorter ones.
>> 
>> Also, as we contribute new patches these should include end-to-end tests
>> that will run in the new harness rather than in junit.  True unit tests
>> should still, of course, be done in junit.
>> 
>> Alan.



Re: Moving to new e2e harness for end-to-end testing

2011-09-01 Thread Dmitriy Ryaboy
Alan,
Great work.
Any plans for hooking this up to the apache Jenkins instance?

D

On Thu, Sep 1, 2011 at 5:50 PM, Alan Gates  wrote:

> I have gotten the end-to-end test harness to the point where it runs
> basically all the existing tests and where it can be run from ant.  It can
> be run on either an existing cluster or in Amazon's EC2.  There are
> instructions on how to run it in both settings at
> https://cwiki.apache.org/confluence/display/PIG/HowToTest.  Please try it
> out to see if it works in your environment.
>
> As proposed in
> https://cwiki.apache.org/confluence/display/PIG/PigTestProposal, I would
> like to start migrating the junit tests that are really end-to-end tests (i.e.,
> the ones that use MiniCluster) to the e2e harness.  I'll start with the very
> long-running ones and work down to the shorter ones.
>
> Also, as we contribute new patches these should include end-to-end tests
> that will run in the new harness rather than in junit.  True unit tests
> should still, of course, be done in junit.
>
> Alan.


Moving to new e2e harness for end-to-end testing

2011-09-01 Thread Alan Gates
I have gotten the end-to-end test harness to the point where it runs basically 
all the existing tests and where it can be run from ant.  It can be run on 
either an existing cluster or in Amazon's EC2.  There are instructions on how 
to run it in both settings at 
https://cwiki.apache.org/confluence/display/PIG/HowToTest.  Please try it out 
to see if it works in your environment.

As proposed in https://cwiki.apache.org/confluence/display/PIG/PigTestProposal, 
I would like to start migrating the junit tests that are really end-to-end 
tests (i.e., the ones that use MiniCluster) to the e2e harness.  I'll start with 
the very long-running ones and work down to the shorter ones.
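
A rough sketch of the ordering described above, in Python. The test names, 
source snippets, and runtimes are made up for illustration, and treating any 
source that mentions "MiniCluster" as an end-to-end test is an assumed 
heuristic, not necessarily how the migration candidates are actually chosen.

```python
def uses_minicluster(source_text):
    # Assumed heuristic: a junit test counts as end-to-end if its source
    # mentions MiniCluster anywhere.
    return "MiniCluster" in source_text


def migration_order(tests):
    """Return names of MiniCluster-based tests, longest-running first.

    tests is an iterable of (name, source_text, runtime_seconds) tuples.
    """
    candidates = [(name, secs) for name, src, secs in tests if uses_minicluster(src)]
    candidates.sort(key=lambda item: item[1], reverse=True)
    return [name for name, _ in candidates]


# Hypothetical inputs; real names and runtimes would come from the junit reports.
sample = [
    ("TestPureUnit", "// plain unit test, no cluster", 2.0),
    ("TestShortE2E", "// uses MiniCluster", 40.0),
    ("TestLongE2E", "// uses MiniCluster", 900.0),
]
order = migration_order(sample)
```

True unit tests drop out of the list entirely, which matches the intent that 
they stay in junit.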

Also, as we contribute new patches these should include end-to-end tests that 
will run in the new harness rather than in junit.  True unit tests should 
still, of course, be done in junit.

Alan.

[jira] [Commented] (PIG-2263) Different error messages in grunt mode and file mode

2011-09-01 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13095690#comment-13095690
 ] 

Ashutosh Chauhan commented on PIG-2263:
---

{code}
a = load 'default.boolean_table' using org.apache.hcatalog.pig.HCatLoader();
b = foreach a generate org.apache.hcatalog.utils.HCatTypeCheck('boolean+int', 
*);
store b into 'myout';
{code}

This is supposed to fail. In grunt mode:
{code}
ERROR 1115: HCatalog column type 'BOOLEAN' is not supported in Pig as a column 
type

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2245: 
 Cannot get schema from loadFunc 
org.apache.hcatalog.pig.HCatLoader
at 
org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:154)
at 
org.apache.pig.newplan.logical.relational.LOLoad.getSchema(LOLoad.java:109)
at 
org.apache.pig.newplan.logical.visitor.LineageFindRelVisitor.visit(LineageFindRelVisitor.java:100)
at 
org.apache.pig.newplan.logical.relational.LOLoad.accept(LOLoad.java:218)
at 
org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
at 
org.apache.pig.newplan.logical.visitor.CastLineageSetter.<init>(CastLineageSetter.java:57)
at org.apache.pig.PigServer$Graph.compile(PigServer.java:1677)
at org.apache.pig.PigServer$Graph.validateQuery(PigServer.java:1609)
at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1581)
at org.apache.pig.PigServer.registerQuery(PigServer.java:583)
at 
org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:942)
at 
org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:386)
at 
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:188)
at 
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:164)
at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:67)
at org.apache.pig.Main.run(Main.java:487)
at org.apache.pig.Main.main(Main.java:108)
Caused by: org.apache.pig.PigException: ERROR 1115: HCatalog column type 
'BOOLEAN' is not supported in Pig as a column type
at org.apache.hcatalog.pig.PigHCatUtil.getPigType(PigHCatUtil.java:281)
at org.apache.hcatalog.pig.PigHCatUtil.getPigType(PigHCatUtil.java:240)
at 
org.apache.hcatalog.pig.PigHCatUtil.getResourceSchemaFromFieldSchema(PigHCatUtil.java:189)
at 
org.apache.hcatalog.pig.PigHCatUtil.getResourceSchema(PigHCatUtil.java:163)
at org.apache.hcatalog.pig.HCatLoader.getSchema(HCatLoader.java:163)
at 
org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:150)
{code}

In file mode:
{code}
---
ERROR 1200: Pig script failed to parse: 
 pig script failed to validate: 
org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2245: 
 Cannot get schema from loadFunc 
org.apache.hcatalog.pig.HCatLoader

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1000: Error during 
parsing. Pig script failed to parse: 
 pig script failed to validate: 
org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2245: 
 Cannot get schema from loadFunc 
org.apache.hcatalog.pig.HCatLoader
at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1638)
at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1583)
at org.apache.pig.PigServer.registerQuery(PigServer.java:583)
at 
org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:942)
at 
org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:386)
at 
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:188)
at 
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:164)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
at org.apache.pig.Main.run(Main.java:553)
at org.apache.pig.Main.main(Main.java:108)
Caused by: Failed to parse: Pig script failed to parse: 
 pig script failed to validate: 
org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2245: 
 Cannot get schema from loadFunc 
org.apache.hcatalog.pig.HCatLoader
at 
org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:178)
at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1630)
... 9 more
Caused by: 
 pig script failed to validate: 
org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2245: 
 Cannot get schema from loadFunc 
org.apache.hcatalog.pig.HCatLoader
at 
org.apache.pig.parser.LogicalPlanBuilder.buildForeachOp(LogicalPlanBuilder.java:430)
at 
org.apache.pig.parser.LogicalPlanGenerator.foreach_clause(LogicalPlanGenerator.java:10825)
at 
org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:127

[jira] [Created] (PIG-2263) Different error messages in grunt mode and file mode

2011-09-01 Thread Ashutosh Chauhan (JIRA)
Different error messages in grunt mode and file mode


 Key: PIG-2263
 URL: https://issues.apache.org/jira/browse/PIG-2263
 Project: Pig
  Issue Type: Bug
  Components: parser
Affects Versions: 0.9.0
Reporter: Ashutosh Chauhan


Grunt parses a script statement by statement, whereas file mode considers the 
whole script at once, so the two modes can produce different error messages. 
The problem here is that in file mode the most important error message is 
swallowed: it is not printed to stdout/stderr or to the log file. In 0.8 it was 
correctly printed to stderr.
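
To make the symptom concrete, here is a minimal Python sketch (not Pig's 
code) of reporting the innermost cause of a wrapped exception alongside the 
outer message. The exception classes and helper names are hypothetical 
stand-ins; only the two error strings are taken from the traces in the 
comment on this issue.

```python
def root_cause(exc):
    """Follow the chain of explicit causes to the innermost exception."""
    seen = set()
    while exc.__cause__ is not None and id(exc) not in seen:
        seen.add(id(exc))  # guard against a cyclic cause chain
        exc = exc.__cause__
    return exc


def summarize(exc):
    """Return the outer message, plus the root-cause message if it differs."""
    root = root_cause(exc)
    lines = [str(exc)]
    if root is not exc:
        lines.append("Caused by: " + str(root))
    return "\n".join(lines)


# Hypothetical stand-ins for the loader-level and front-end-level errors.
class LoaderError(Exception):
    pass


class FrontendError(Exception):
    pass


def load_schema():
    raise LoaderError(
        "ERROR 1115: HCatalog column type 'BOOLEAN' is not supported in Pig as a column type"
    )


def parse_script():
    try:
        load_schema()
    except LoaderError as err:
        # Wrapping keeps the original error reachable through __cause__.
        raise FrontendError(
            "ERROR 2245: Cannot get schema from loadFunc org.apache.hcatalog.pig.HCatLoader"
        ) from err


try:
    parse_script()
except FrontendError as err:
    report = summarize(err)
```

Printing only the outer message would show ERROR 2245 and hide ERROR 1115, 
which is the behavior reported for file mode.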

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (PIG-2262) AvroStorage dependencies are missing from the release tarball

2011-09-01 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated PIG-2262:
---

Attachment: PIG-2262.patch

This patch fixes the problem.

> AvroStorage dependencies are missing from the release tarball
> -
>
> Key: PIG-2262
> URL: https://issues.apache.org/jira/browse/PIG-2262
> Project: Pig
>  Issue Type: Bug
>  Components: build, piggybank
>Reporter: Tom White
>Assignee: Tom White
> Attachments: PIG-2262.patch
>
>
> This makes AvroStorage hard to use, since users have to download the 
> dependencies manually, or build Pig themselves.





[jira] [Created] (PIG-2262) AvroStorage dependencies are missing from the release tarball

2011-09-01 Thread Tom White (JIRA)
AvroStorage dependencies are missing from the release tarball
-

 Key: PIG-2262
 URL: https://issues.apache.org/jira/browse/PIG-2262
 Project: Pig
  Issue Type: Bug
  Components: build, piggybank
Reporter: Tom White
Assignee: Tom White


This makes AvroStorage hard to use, since users have to download the 
dependencies manually, or build Pig themselves.





[jira] [Updated] (PIG-2249) Enable pig e2e testing on EC2

2011-09-01 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated PIG-2249:


Resolution: Fixed
Status: Resolved  (was: Patch Available)

Patch 2 checked in.

> Enable pig e2e testing on EC2
> -
>
> Key: PIG-2249
> URL: https://issues.apache.org/jira/browse/PIG-2249
> Project: Pig
>  Issue Type: New Feature
>  Components: tools
>Affects Versions: 0.10
>Reporter: Alan Gates
>Assignee: Alan Gates
> Fix For: 0.10
>
> Attachments: PIG-2249.patch, PIG-2249_2.patch
>
>
> We need to enable users to test Pig on actual Hadoop clusters.  We can use 
> Whirr to allow users to easily spin up instances on EC2 and run the 
> end-to-end tests.





[jira] [Updated] (PIG-2249) Enable pig e2e testing on EC2

2011-09-01 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated PIG-2249:


Attachment: PIG-2249_2.patch

Changed the patch slightly to download from Apache archives instead of a 
specific mirror.

> Enable pig e2e testing on EC2
> -
>
> Key: PIG-2249
> URL: https://issues.apache.org/jira/browse/PIG-2249
> Project: Pig
>  Issue Type: New Feature
>  Components: tools
>Affects Versions: 0.10
>Reporter: Alan Gates
>Assignee: Alan Gates
> Fix For: 0.10
>
> Attachments: PIG-2249.patch, PIG-2249_2.patch
>
>
> We need to enable users to test Pig on actual Hadoop clusters.  We can use 
> Whirr to allow users to easily spin up instances on EC2 and run the 
> end-to-end tests.





[jira] [Commented] (PIG-2239) Pig should use "bin/hadoop jar pig-withouthadoop.jar" in bin/pig instead of forming java command itself

2011-09-01 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13095401#comment-13095401
 ] 

Daniel Dai commented on PIG-2239:
-

Actually 4 is not true. We do not go through GenericOptionsParser and hadoop 
will not take any command line options. So the command line parsing will be 
totally up to Pig as before.

> Pig should use "bin/hadoop jar pig-withouthadoop.jar" in bin/pig instead of 
> forming java command itself  
> -
>
> Key: PIG-2239
> URL: https://issues.apache.org/jira/browse/PIG-2239
> Project: Pig
>  Issue Type: Improvement
>Reporter: Ashutosh Chauhan
> Attachments: PIG-2239-0.patch
>
>
> This will eliminate many of the classpath, Hadoop version, and path problems 
> that have plagued bin/pig and Pig in general.





Jenkins build is back to normal : Pig-trunk #1080

2011-09-01 Thread Apache Jenkins Server
See 




[jira] [Updated] (PIG-2261) Restore support for parenthesis in Pig 0.9

2011-09-01 Thread Richard Ding (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Ding updated PIG-2261:
--

Summary: Restore support for parenthesis in Pig 0.9  (was: Restor support 
for parenthesis in Pig 0.9)

> Restore support for parenthesis in Pig 0.9
> --
>
> Key: PIG-2261
> URL: https://issues.apache.org/jira/browse/PIG-2261
> Project: Pig
>  Issue Type: Bug
>  Components: impl
>Affects Versions: 0.9.0
>Reporter: Richard Ding
> Fix For: 0.9.1
>
>
> Pig 0.8 and earlier versions used to support syntax such as 
>  
> {code}
> A =(load )
> {code}
> This was removed as "useless" in 0.9 when the grammar was redone. It turns 
> out that some user is relying on this for ease of code generation, so we 
> want to restore it.
> Just to clarify, Pig 0.9 continues to support composite statements such as
> {code}
> B = filter (load 'data' as (a, b)) by a > 0;
> {code}
> It just removed "useless" parentheses and doesn't support statements like
> {code}
> A = (load 'data' as (a, b));
> {code}
>  





[jira] [Created] (PIG-2261) Restor support for parenthesis in Pig 0.9

2011-09-01 Thread Richard Ding (JIRA)
Restor support for parenthesis in Pig 0.9
-

 Key: PIG-2261
 URL: https://issues.apache.org/jira/browse/PIG-2261
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.9.0
Reporter: Richard Ding
 Fix For: 0.9.1


Pig 0.8 and earlier versions used to support syntax such as 
 
{code}
A =(load )
{code}

This was removed as "useless" in 0.9 when the grammar was redone. It turns out 
that some user is relying on this for ease of code generation, so we want to 
restore it.

Just to clarify, Pig 0.9 continues to support composite statements such as

{code}
B = filter (load 'data' as (a, b)) by a > 0;
{code}

It just removed "useless" parentheses and doesn't support statements like

{code}
A = (load 'data' as (a, b));
{code}
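
Until the syntax is restored, a code generator could in principle drop the 
redundant outer parentheses itself. The Python sketch below is only an 
illustration of that idea, not part of Pig; the function name is 
hypothetical, and it assumes single-line statements and does not handle 
escaped quotes inside string literals.

```python
def strip_outer_parens(statement):
    """Rewrite 'A = (expr);' as 'A = expr;' when one pair of parentheses
    wraps the whole right-hand side; otherwise return the input unchanged."""
    stmt = statement.strip()
    if not stmt.endswith(";") or "=" not in stmt:
        return statement
    alias, rhs = stmt[:-1].split("=", 1)
    rhs = rhs.strip()
    if not (rhs.startswith("(") and rhs.endswith(")")):
        return statement
    depth = 0
    in_quote = False
    for i, ch in enumerate(rhs):
        if ch == "'":
            in_quote = not in_quote  # parentheses inside '...' are ignored
        elif in_quote:
            continue
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            # The first parenthesis closed before the end, so it does not
            # wrap the whole expression.
            if depth == 0 and i != len(rhs) - 1:
                return statement
    if depth != 0 or in_quote:
        return statement  # unbalanced input: leave it alone
    return f"{alias.strip()} = {rhs[1:-1].strip()};"
```

For example, `strip_outer_parens("A = (load 'data' as (a, b));")` yields 
`A = load 'data' as (a, b);`, while the composite filter statement above is 
returned unchanged.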
 
